Joey Hess 2019-01-14 19:00:38 -04:00
parent f289663611
commit d79ac08532
GPG key ID: DB12DB0FF05F8F38
2 changed files with 27 additions and 1 deletions


@@ -14,7 +14,7 @@ git-annex (7.20181212) UNRELEASED; urgency=medium
* importfeed: Better error message when downloading the feed fails.
* Some optimisations, including a 10x faster timestamp parser,
a 7x faster key parser, and improved parsing and serialization of
- git-annex branch data. Some commands will run up to 15% faster.
+ git-annex branch data. Many commands will run 5-15% faster.
* Stricter parser for keys doesn't allow doubled fields or out of order fields.
* The benchmark command, which only had some old benchmarking of the sqlite
databases before, now allows benchmarking any other git-annex commands.


@@ -0,0 +1,26 @@
I said I was going to stop with the ByteString conversion, but then I
looked at [[/profiling]], and I knew I couldn't stop there --
conversion between String and ByteString had become a major cost center.
So today, converted all the code that reads and parses symlinks and pointer files
to ByteString, so now ByteString is used all the way from disk to Key. Also
put in some caching, so git-annex does not need to re-serialize a Key
that it's just deserialized from a ByteString.
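
The caching idea can be sketched roughly like this. It is only an illustration, not the actual git-annex code: the `Key` record, its field names, and the simplified `backend--name` format are all assumptions made up for the example. The point is that a parsed key holds on to the bytes it was parsed from, so serializing it again costs nothing.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as B

-- Hypothetical, simplified key. The real git-annex Key has more fields;
-- these names are assumptions made for this sketch.
data Key = Key
    { keyBackend :: B.ByteString
    , keyName :: B.ByteString
    , keySerialization :: B.ByteString -- cached serialized form
    }

-- Build a key from its parts. The serialization field is lazy, so the
-- concatenation only happens if something asks for it.
mkKey :: B.ByteString -> B.ByteString -> Key
mkKey backend name = Key backend name (B.concat [backend, "--", name])

-- Parse the simplified "backend--name" format. The input bytes are kept
-- as the cached serialization, so nothing needs to be rebuilt later.
parseKey :: B.ByteString -> Maybe Key
parseKey b
    | B.null backend || B.null name = Nothing
    | otherwise = Just (Key backend name b)
  where
    (backend, rest) = B.breakSubstring "--" b
    name = B.drop 2 rest

-- Serializing is just a field lookup.
serializeKey :: Key -> B.ByteString
serializeKey = keySerialization
```

With something like this, `serializeKey` on a key that came from `parseKey` hands back the original ByteString rather than building a new one.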
There's still some ByteString to String conversion when generating
FilePaths; avoiding that will need an equivalent of System.FilePath that
operates on RawFilePath, and I don't think there is one yet? But the
[[/profiling]] does show improvement; it's more and more dominated by IO
operations that can't be sped up, and less by slow code.
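
To give an idea of what a System.FilePath equivalent on RawFilePath might look like, here is a minimal sketch. The names `combineRaw` and `takeFileNameRaw` are hypothetical, the `RawFilePath` alias is defined locally to keep the example self-contained (the unix package defines the same alias), and only POSIX-style `/` separators are handled. It only approximates the behavior of `</>` and `takeFileName`, so edge cases may differ.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Char8 as B8

-- Local alias for the sketch; the unix package has the same definition.
type RawFilePath = B8.ByteString

-- Join two path components without ever converting to String.
-- Hypothetical helper, roughly like (</>) for POSIX paths.
combineRaw :: RawFilePath -> RawFilePath -> RawFilePath
combineRaw a b
    | B8.null a = b
    | B8.null b = a
    | B8.head b == '/' = b          -- absolute second path wins
    | B8.last a == '/' = a <> b     -- separator is already there
    | otherwise = B8.concat [a, "/", b]

-- Everything after the last '/', roughly like takeFileName.
takeFileNameRaw :: RawFilePath -> RawFilePath
takeFileNameRaw = snd . B8.breakEnd (== '/')
```

Both functions work on the bytes directly, so a path read from disk could be manipulated without the String round trip.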
This really does feel like a stopping place now.
Updated benchmarks (compared to last git-annex release):

    find on 10000 files, none present... 8% speedup
    whereis on 1000 files............... 12% speedup
    info on dir with 1000 files......... 7% speedup
    local get ; drop of 1000 files...... 4% speedup
    setting metadata in 1000 files...... 8% speedup
    getting metadata from 1000 files.... 7% speedup
    finding a single file out of 1000 that has a given metadata value... 8% speedup