git-annex uses FilePath (String) extensively. That's a slow data type.
Converting to ByteString, and RawFilePath, should speed it up
significantly, according to [[/profiling]].

I've made a test branch, `bs`, to see what kind of performance improvement
to expect.

Benchmarking `git-annex find`, speedups range from 28% to 66%. The files fly by
much more snappily. Other commands likely also speed up, but they do more work
than find, so the improvement is not as large.

The `bs` branch is in a mergeable state now, but still needs work:

* Eliminate all the fromRawFilePath, toRawFilePath, encodeBS,
  decodeBS conversions. Or at least most of them. There are likely
  quite a few places where a value is converted back and forth several times.

  As a first step, profile and look for the hot spots. Known hot spots:

  * keyFile uses fromRawFilePath, and that adds around 3% overhead in
    `git-annex find`. Converting it to a RawFilePath needs a version
    of `` for RawFilePaths.
  * getJournalFileStale uses fromRawFilePath, and adds 3-5% overhead in
    `git-annex whereis`. Converting it to RawFilePath needs a version
    of `` for RawFilePaths. It also needs a ByteString.readFile
    for RawFilePath.

* System.FilePath is not available for RawFilePath, and many of the
  conversions are to get a FilePath in order to use that library.

  It should be entirely straightforward to make a version of System.FilePath
  that can operate on RawFilePath, except possibly there could be some
  complications due to Windows.

* Use versions of IO actions like getFileStatus that take a RawFilePath,
  avoiding a conversion. Note that these are only available on Unix, not
  Windows, so a compatibility shim will be needed.
  (I can't seem to find any library that provides one.)
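
To make the System.FilePath item more concrete, here is a minimal sketch of joining two RawFilePath values directly on ByteStrings, with no round trip through FilePath. It is only an illustration, not code from the `bs` branch: `combineRaw` is a hypothetical name, its behavior is assumed to loosely follow the path-joining operator in System.FilePath, and it handles POSIX separators only, so the possible Windows complications are not addressed. RawFilePath is declared locally (mirroring the synonym in the unix package) so the sketch depends only on bytestring.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- Local copy of the synonym that the unix package exports from
-- System.Posix.ByteString, so this sketch needs only bytestring.
type RawFilePath = ByteString

-- POSIX path separator as a ByteString.
sep :: ByteString
sep = B8.singleton '/'

-- Hypothetical helper (not part of git-annex or any library):
-- join two raw paths without converting either one to a String.
-- Assumption: POSIX-style paths only; no drive letters or backslashes.
combineRaw :: RawFilePath -> RawFilePath -> RawFilePath
combineRaw a b
    | sep `B.isPrefixOf` b = b        -- an absolute second path replaces the first
    | B.null a = b                    -- nothing to prepend
    | B.null b = a                    -- nothing to append
    | sep `B.isSuffixOf` a = a <> b   -- avoid doubling the separator
    | otherwise = a <> sep <> b
```

With this, `combineRaw (B8.pack "dir") (B8.pack "file")` produces `dir/file`, and the same bytes can then be handed to an IO action that accepts a RawFilePath, so no conversion is needed on the hot path.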