queue more changes to keys db

Increasing the size of the queue 10x makes git-annex init 7% faster in a
repository with 86000 annexed files.

The memory use goes up, from 70876 kb to 85376 kb.
Joey Hess 2022-11-18 13:29:34 -04:00
parent 8fcee4ac9d
commit c834d2025a
GPG key ID: DB12DB0FF05F8F38
4 changed files with 11 additions and 5 deletions


@@ -1,7 +1,7 @@
 git-annex (10.20221105) UNRELEASED; urgency=medium
   * Support quettabyte and yottabyte.
-  * Sped up the initial scanning for annexed files by 15%.
+  * Sped up the initial scanning for annexed files by 21%.
 -- Joey Hess <id@joeyh.name>  Fri, 18 Nov 2022 12:58:06 -0400


@@ -73,8 +73,8 @@ newtype WriteHandle = WriteHandle H.DbQueue
 queueDb :: SqlPersistM () -> WriteHandle -> IO ()
 queueDb a (WriteHandle h) = H.queueDb h checkcommit a
   where
-	-- commit queue after 1000 changes
-	checkcommit sz _lastcommittime = pure (sz > 1000)
+	-- commit queue after 10000 changes
+	checkcommit sz _lastcommittime = pure (sz > 10000)
 -- Insert the associated file.
 -- When the file was associated with a different key before,
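The change above raises the size threshold at which the pending write queue is flushed to the database. A minimal standalone sketch of that policy (not git-annex's real code; the queue is modeled as a plain counter, and `queueChange`, `flushThreshold`, and the commit counter are all hypothetical names for illustration):

```haskell
import Data.IORef
import Control.Monad (replicateM_, when)

-- Threshold mirroring the new value in checkcommit above.
flushThreshold :: Int
flushThreshold = 10000

-- Queue one change; run the commit action once the number of
-- pending changes exceeds the threshold, then reset the queue.
queueChange :: IORef Int -> IO () -> IO ()
queueChange pending commit = do
  sz <- atomicModifyIORef' pending (\s -> (s + 1, s + 1))
  when (sz > flushThreshold) $ do
    commit
    writeIORef pending 0

main :: IO ()
main = do
  pending <- newIORef 0
  commits <- newIORef (0 :: Int)
  -- 10001 queued changes cross the threshold exactly once.
  replicateM_ 10001 $
    queueChange pending (modifyIORef' commits (+ 1))
  readIORef commits >>= print  -- prints 1
```

A larger threshold means fewer database commits per run, which is where the init speedup comes from, at the cost of holding more queued changes in memory.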


@@ -12,6 +12,4 @@ This will need some care to be implemented safely...
I benchmarked it, and using insertUnique is no faster, but using insert is.
This would be a 15% speed up.
Update: Implemented this optimisation.
"""]]


@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 14"""
+ date="2022-11-18T17:26:03Z"
+ content="""
+Implemented the two optimisations discussed above, and init in that
+repository dropped from 24 seconds to 19 seconds, a 21% speedup.
+"""]]