git-annex uses FilePath (String) extensively. That's a slow data type.
Converting to ByteString, and RawFilePath, should speed it up
significantly, according to [[/profiling]].

I've made a test branch, `bs`, to see what kind of performance improvement
to expect.

Benchmarking `git-annex find`, speedups range from 28-66%. The files fly by
much more snappily. Other commands likely also speed up, but do more work
than find so the improvement is not as large.

The `bs` branch is in a mergeable state now, but still needs work:

* There's a bug impacting WORM keys with `/` in the keyname.
  The files stored in the git-annex branch used to have the `/` changed
  to `_`, but on the `bs` branch that does not happen. git also outputs
  a message about "Ignoring" the file.

  Test case:

      git config annex.backend WORM
      git annex addurl http://localhost/~joey/index.html

  Hmm, that prints out the Ignoring message, and the file does not get
  written to the git-annex branch. But in my big repo, I saw the message
  and saw a file in the branch, with `/` in its keyname. Earlier in the
  branch, the same key used `_`. (Look for "36bfe385607b32c4d5150404c0" to
  find it again.)

* Profile various commands and look for hot spots.

* Eliminate all, or at least most, of the `fromRawFilePath`, `toRawFilePath`,
  `encodeBS`, and `decodeBS` conversions. There are likely some places where
  a value is converted back and forth several times.

* Use versions of IO actions like `getFileStatus` that take a RawFilePath,
  avoiding a conversion. Note that these are only available on Unix, not
  Windows, so a compatibility shim will be needed.
  (I can't seem to find any library that provides one.)

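
To make the last item more concrete, here is a rough sketch of what such a shim might look like. It is not code from the `bs` branch. The names `getFileStatusRaw` and `decodePath` are made up for illustration. The Windows side assumes the unix-compat package's `System.PosixCompat.Files`, and it assumes the path bytes are in GHC's filesystem encoding, which may not hold everywhere. On Unix the wrapper passes the ByteString straight through, so no conversion happens; only the Windows side pays for a decode.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (getFileSystemEncoding)

#ifdef mingw32_HOST_OS
-- Assumption: unix-compat supplies a FilePath based getFileStatus on Windows.
import qualified System.PosixCompat.Files as F
#else
-- The unix package has a ByteString based getFileStatus.
import qualified System.Posix.Files.ByteString as F
#endif

-- Same definition the unix package uses for RawFilePath.
type RawFilePath = B.ByteString

-- Hypothetical helper: decode a raw path with the GHC filesystem encoding.
decodePath :: RawFilePath -> IO FilePath
decodePath p = do
    enc <- getFileSystemEncoding
    B.useAsCStringLen p (GHC.peekCStringLen enc)

-- Hypothetical shim: one name that takes a RawFilePath on every platform.
getFileStatusRaw :: RawFilePath -> IO F.FileStatus
#ifdef mingw32_HOST_OS
getFileStatusRaw p = decodePath p >>= F.getFileStatus
#else
getFileStatusRaw = F.getFileStatus
#endif

main :: IO ()
main = do
    let path = B8.pack "."
    shown <- decodePath path
    st <- getFileStatusRaw path
    putStrLn (shown ++ (if F.isDirectory st then ": directory" else ": not a directory"))
```

Callers would use `getFileStatusRaw` everywhere and never see the CPP, so the `fromRawFilePath` style conversions stay confined to the Windows branch of the shim.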