cache negative lookups of global numcopies and mincopies
Speeds up eg git-annex sync --content by up to 50%. When it does not need to
transfer or drop anything, it now noops a lot more quickly.

I didn't see anything else in sync --content noop loop that could really be
sped up. It has to cat git objects to keys, stat object files, etc.

Sponsored-by: unqueued on Patreon
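The idea of caching a negative lookup can be sketched in isolation. This is a minimal, self-contained illustration and not git-annex's code: it uses an IORef instead of the Annex state, and `Cache`, `slowLookup`, `getCached` and the local `NumCopies` type are hypothetical names made up for the example. The outer Maybe records whether the lookup has been done at all; the inner Maybe is the lookup's result, which may be "nothing configured".

```haskell
import Data.IORef

-- Hypothetical stand-in for a configured value; not git-annex's type.
newtype NumCopies = NumCopies Int deriving (Show, Eq)

-- Outer Maybe: has the lookup been done yet?
-- Inner Maybe: the lookup's result, which may be "not configured".
type Cache = IORef (Maybe (Maybe NumCopies))

-- Hypothetical expensive lookup. It counts how often it runs and
-- always reports that nothing is configured (the negative case).
slowLookup :: IORef Int -> IO (Maybe NumCopies)
slowLookup counter = do
    modifyIORef' counter (+1)
    return Nothing

-- Return the cached result if there is one, otherwise do the lookup
-- and remember its result, even when that result is Nothing.
getCached :: Cache -> IORef Int -> IO (Maybe NumCopies)
getCached cache counter = maybe load return =<< readIORef cache
  where
    load = do
        v <- slowLookup counter
        writeIORef cache (Just v)
        return v

main :: IO ()
main = do
    cache <- newIORef Nothing
    counter <- newIORef 0
    r1 <- getCached cache counter
    r2 <- getCached cache counter
    n <- readIORef counter
    print (r1, r2, n)  -- prints (Nothing,Nothing,1)
```

With a plain `Maybe NumCopies` cache, a Nothing result is indistinguishable from "not looked up yet", so the slow lookup would run on every call when nothing is configured.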
parent 4437e187e6
commit 3c15e0f7a0
5 changed files with 38 additions and 6 deletions
4
Annex.hs
@@ -183,8 +183,8 @@ data AnnexState = AnnexState
 	, hashobjecthandle :: Maybe (ResourcePool HashObjectHandle)
 	, checkattrhandle :: Maybe (ResourcePool CheckAttrHandle)
 	, checkignorehandle :: Maybe (ResourcePool CheckIgnoreHandle)
-	, globalnumcopies :: Maybe NumCopies
-	, globalmincopies :: Maybe MinCopies
+	, globalnumcopies :: Maybe (Maybe NumCopies)
+	, globalmincopies :: Maybe (Maybe MinCopies)
 	, limit :: ExpandableMatcher Annex
 	, timelimit :: Maybe (Duration, POSIXTime)
 	, sizelimit :: Maybe (TVar Integer)
@@ -79,6 +79,8 @@ git-annex (10.20230408) UNRELEASED; urgency=medium
   * Large speed up to importing trees from special remotes that contain a lot
     of files, by only processing changed files.
   * Some other speedups to importing trees from special remotes.
+  * Cache negative lookups of global numcopies and mincopies.
+    Speeds up eg git-annex sync --content by up to 50%.
 
  -- Joey Hess <id@joeyh.name>  Sat, 08 Apr 2023 13:57:18 -0400
 
@@ -45,22 +45,22 @@ setGlobalMinCopies new = do
 
 {- Value configured in the numcopies log. Cached for speed. -}
 getGlobalNumCopies :: Annex (Maybe NumCopies)
-getGlobalNumCopies = maybe globalNumCopiesLoad (return . Just)
+getGlobalNumCopies = maybe globalNumCopiesLoad return
 	=<< Annex.getState Annex.globalnumcopies
 
 {- Value configured in the mincopies log. Cached for speed. -}
 getGlobalMinCopies :: Annex (Maybe MinCopies)
-getGlobalMinCopies = maybe globalMinCopiesLoad (return . Just)
+getGlobalMinCopies = maybe globalMinCopiesLoad return
 	=<< Annex.getState Annex.globalmincopies
 
 globalNumCopiesLoad :: Annex (Maybe NumCopies)
 globalNumCopiesLoad = do
 	v <- getLog numcopiesLog
-	Annex.changeState $ \s -> s { Annex.globalnumcopies = v }
+	Annex.changeState $ \s -> s { Annex.globalnumcopies = Just v }
 	return v
 
 globalMinCopiesLoad :: Annex (Maybe MinCopies)
 globalMinCopiesLoad = do
 	v <- getLog mincopiesLog
-	Annex.changeState $ \s -> s { Annex.globalmincopies = v }
+	Annex.changeState $ \s -> s { Annex.globalmincopies = Just v }
 	return v
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 14"""
+ date="2023-06-06T17:11:35Z"
+ content="""
+There's only one import in the sync, and your output shows it completed
+(with error).
+
+The only other phase of sync that could be run after that and take a lot of
+time is content syncing. You would have to have annex.synccontent set
+somewhere for sync to do that. Do you?
+"""]]
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 15"""
+ date="2023-06-06T17:31:49Z"
+ content="""
+It would make a lot of sense for --content syncing to be what remains slow.
+That has to scan over all the files and when it decides that it does not
+need to copy the content anywhere, that's a tight loop with no output.
+
+In my repo with 10000 files that was set up by the latest test case,
+`git-annex sync` takes 13 seconds, and with --content it takes 61 seconds.
+
+I optimised a numcopies/mincopies lookup away, and that got it
+down to 28 seconds.
+
+The cidsdb does not get accessed by the --content scan
+in my testing, although there may be other situations where it does.
+"""]]