git-annex uses FilePath (String) extensively. That's a slow data type.
Converting to ByteString, and RawFilePath, should speed it up
significantly, according to [[/profiling]].

I've made a test branch, `bs`, to see what kind of performance improvement
to expect.

Benchmarking `git-annex find`, speedups range from 28% to 66%. The files fly
by much more snappily. Other commands likely also speed up, but they do more
work than find, so the improvement is not as large.

The `bs` branch is in a mergeable state now, but still needs work:

* There's a bug impacting WORM keys with `/` in the keyname. The files
  stored in the git-annex branch used to have the `/` changed to `_`,
  but on the `bs` branch that does not happen. git also outputs a message
  about "Ignoring" the file.

    Test case:

        git config annex.backend WORM
        git annex addurl http://localhost/~joey/index.html

    Hmm, that prints out the Ignoring message, and the file does not get
    written to the git-annex branch. But in my big repo, I saw the message
    and saw a file in the branch, with `/` in its keyname. Earlier in the
    branch, the same key used `_`.
    (Look for "36bfe385607b32c4d5150404c0" to find it again.)

* Profile various commands and look for hot spots.

* Eliminate all the fromRawFilePath, toRawFilePath, encodeBS,
  decodeBS conversions. Or at least most of them. There are likely
  some places where a value is converted back and forth several times.

* Use versions of IO actions like getFileStatus that take a RawFilePath,
  avoiding a conversion. Note that these are only available on unix, not
  windows, so a compatibility shim will be needed.
  (I can't seem to find any library that provides one.)
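
One possible shape for the compatibility shim mentioned in the last item, as a rough sketch only. The names `getFileStatusRaw` and `decodeFilePath` are hypothetical and do not come from git-annex. The Windows branch assumes the unix-compat package is available and decodes the path with GHC's filesystem encoding, which may not be the right choice for every path.

```haskell
{-# LANGUAGE CPP #-}
-- Sketch of a RawFilePath shim. getFileStatusRaw and decodeFilePath are
-- hypothetical names used for illustration only.

import qualified Data.ByteString.Char8 as B8

#ifndef mingw32_HOST_OS
-- Unix: the unix package provides a ByteString variant of getFileStatus,
-- so the path is passed through without any conversion.
import System.Posix.ByteString.FilePath (RawFilePath)
import System.Posix.Files.ByteString (FileStatus, getFileStatus, isDirectory)

getFileStatusRaw :: RawFilePath -> IO FileStatus
getFileStatusRaw = getFileStatus
#else
-- Windows: assumes the unix-compat package, which only takes a FilePath.
-- The path is decoded once, at the boundary.
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (getFileSystemEncoding)
import System.PosixCompat.Files (FileStatus, getFileStatus, isDirectory)

type RawFilePath = B8.ByteString

-- Decode using the filesystem encoding (an assumption of this sketch).
decodeFilePath :: RawFilePath -> IO FilePath
decodeFilePath b = do
    enc <- getFileSystemEncoding
    B8.useAsCStringLen b (GHC.peekCStringLen enc)

getFileStatusRaw :: RawFilePath -> IO FileStatus
getFileStatusRaw f = getFileStatus =<< decodeFilePath f
#endif

main :: IO ()
main = do
    -- Stat the current directory through the shim.
    st <- getFileStatusRaw (B8.pack ".")
    print (isDirectory st)
```

Callers would then use `getFileStatusRaw` everywhere and never see which branch was compiled in, so the conversion cost is paid only on the platform that has no ByteString variant.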