diff --git a/doc/todo/optimize_by_converting_String_to_ByteString.mdwn b/doc/todo/optimize_by_converting_String_to_ByteString.mdwn
index 13d29603fc..7ac7efe382 100644
--- a/doc/todo/optimize_by_converting_String_to_ByteString.mdwn
+++ b/doc/todo/optimize_by_converting_String_to_ByteString.mdwn
@@ -3,19 +3,26 @@ Converting to ByteString, and RawFilePath, should speed it up
 significantly, according to [[/profiling]].
 
 I've made a test branch, `bs`, to see what kind of performance improvement
-to expect. Most commands don't built yet in that branch, but `git annex
-find` does. Speedups range from 28-66%. The files fly by much more
-snappily.
+to expect.
 
-As well as adding back all the code that was disabled to get it to build,
-the `bs` branch has quite a lot of things still needing work, including:
+Benchmarking `git-annex find`, speedups range from 28-66%. The files fly by
+much more snappily. Other commands likely also speed up, but do more work
+than find so the improvement is not as large.
+
+The `bs` branch is in a mergeable state now, but still needs work:
 
 * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS,
   decodeBS conversions. Or at least most of them. There are likely
   quite a few places where a value is converted back and forth several times.
 
-  It would be good to instrument them with Debug.Trace and find out which
-  are the hot ones that get called, and focus on those.
+  As a first step, profile and look for the hot spots. Known hot spots:
+
+  * keyFile uses fromRawFilePath and that adds around 3% overhead in `git-annex find`.
+    Converting it to a RawFilePath needs a version of `` for RawFilePaths.
+  * getJournalFileStale uses fromRawFilePath, and adds 3-5% overhead in
+    `git-annex whereis`. Converting it to RawFilePath needs a version
+    of `` for RawFilePaths. It also needs a ByteString.readFile
+    for RawFilePath.
 
 * System.FilePath is not available for RawFilePath, and many of the
   conversions are to get a FilePath in order to use that library.
@@ -28,7 +35,3 @@ the `bs` branch has quite a lot of things still needing work, including:
   avoiding a conversion. Note that these are only available on unix, not
   windows, so a compatability shim will be needed.
   (I can't seem to find any library that provides one.)
-
-* Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy.
-
-* Use ByteString for parsing git config to speed up startup.