update re state of bs branch

Joey Hess 2019-12-06 15:13:36 -04:00
parent 4265344bc8
commit 3d936e4343
GPG key ID: DB12DB0FF05F8F38

@@ -3,19 +3,26 @@ Converting to ByteString, and RawFilePath, should speed it up
 significantly, according to [[/profiling]].
 I've made a test branch, `bs`, to see what kind of performance improvement
-to expect. Most commands don't built yet in that branch, but `git annex
-find` does. Speedups range from 28-66%. The files fly by much more
-snappily.
+to expect.
-As well as adding back all the code that was disabled to get it to build,
-the `bs` branch has quite a lot of things still needing work, including:
+Benchmarking `git-annex find`, speedups range from 28-66%. The files fly by
+much more snappily. Other commands likely also speed up, but do more work
+than find so the improvement is not as large.
+The `bs` branch is in a mergeable state now, but still needs work:
 * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS,
 decodeBS conversions. Or at least most of them. There are likely
 quite a few places where a value is converted back and forth several times.
-It would be good to instrument them with Debug.Trace and find out which
-are the hot ones that get called, and focus on those.
+As a first step, profile and look for the hot spots. Known hot spots:
+* keyFile uses fromRawFilePath and that adds around 3% overhead in `git-annex find`.
+Converting it to a RawFilePath needs a version of `</>` for RawFilePaths.
+* getJournalFileStale uses fromRawFilePath, and adds 3-5% overhead in
+`git-annex whereis`. Converting it to RawFilePath needs a version
+of `</>` for RawFilePaths. It also needs a ByteString.readFile
+for RawFilePath.
 * System.FilePath is not available for RawFilePath, and many of the
 conversions are to get a FilePath in order to use that library.
@@ -28,7 +35,3 @@ the `bs` branch has quite a lot of things still needing work, including:
 avoiding a conversion. Note that these are only available on unix, not
 windows, so a compatability shim will be needed.
 (I can't seem to find any library that provides one.)
-* Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy.
-* Use ByteString for parsing git config to speed up startup.
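
The hot-spot notes in the first hunk say that converting keyFile and getJournalFileStale needs a version of `</>` for RawFilePaths. Below is a minimal sketch of what such a helper might look like. It is not code from the `bs` branch: the name `combineRaw` is hypothetical, the local `RawFilePath` alias stands in for the one exported by the unix package, and only Unix-style `/` separators are handled (no Windows drives or backslashes).

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- Assumption: local alias so the sketch only depends on bytestring.
-- The unix package defines RawFilePath as a strict ByteString as well.
type RawFilePath = B.ByteString

-- Hypothetical helper: join two raw paths without converting to String.
-- Roughly follows the POSIX behaviour of System.FilePath's (</>):
-- an absolute second path replaces the first, and at most one
-- separator is inserted between the two parts.
combineRaw :: RawFilePath -> RawFilePath -> RawFilePath
combineRaw a b
  | B.null a = b
  | B.null b = a
  | B8.head b == '/' = b          -- second path is absolute
  | B8.last a == '/' = a <> b     -- separator already present
  | otherwise = a <> B8.cons '/' b
```

Because both arguments stay ByteStrings, a caller such as keyFile would not need the fromRawFilePath round trip just to join path components. Whether this removes the measured overhead would still need profiling.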