git-annex/doc/devblog/day_566__stopping_place.mdwn
2019-01-14 19:00:38 -04:00

26 lines
1.3 KiB
Markdown

I said I was going to stop with the ByteString conversion, but then I
looked at [[/profiling]], and I knew I couldn't stop there --
conversion between String and ByteString had became a major cost center.
So today, converted all the code that reads and parses symlinks and pointer files
to ByteString, now ByteString is used all the way from disk to Key. Also
put in some caching, so git-annex does not need to re-serialize a Key
that it's just deserialized from a ByteString.
There's still some ByteString to String conversion when generating
FilePaths; to avoid that will need an equivilant of System.FilePath that
operates on RawFilePath, and I don't think there is one yet? But the
[[/profiling]] does show improvement, it's more and more dominated by IO
operations that can't be sped up, and less by slow code.
This really does feel like a stopping place now.
Updated benchmarks (compared to last git-annex release):
find on 10000 files, none present... 8% speedup
whereis on 1000 files............... 12% speedup
info on dir with 1000 files......... 7% speedup
local get ; drop of 1000 files...... 4% speedup
setting metadata in 1000 files...... 8% speedup
getting metadata from 1000 files.... 7% speedup
finding a single file out of 1000 that has a given metadata value... 8% speedup