avoid double work in git-annex init, second try
reconcileStaged populates the db, so scanAnnexedFiles does not need to do it again. It still makes a pass over the HEAD tree, but populating the db was most of the expensive part.

Benchmarking with 100,000 files, git-annex init now takes 40 seconds, vs 37 seconds with the old, buggy version of this fix. It should be possible to win back those 3 precious seconds per 100k files, in the case when annex.thin is not set, with improvements to reconcileStaged that avoid needing this second pass.

Sponsored-by: Dartmouth College's Datalad project
parent 22185b4a4e
commit c941ab6f5b
3 changed files with 10 additions and 19 deletions
@@ -339,7 +339,7 @@ reconcileStaged qh = do
 (asTopFilePath file)
 (SQL.WriteHandle qh)
 when (dstmode /= fmtTreeItemType TreeSymlink) $
-reconcilerace (asTopFilePath file) key
+reconcilepointerfile (asTopFilePath file) key
 return True
 Nothing -> return False
 procdiff mdfeeder rest
@@ -367,7 +367,7 @@ reconcileStaged qh = do
 _ -> return conflicted -- parse failed
 procmergeconflictdiff _ _ conflicted = return conflicted

-reconcilerace file key = do
+reconcilepointerfile file key = do
 caches <- liftIO $ SQL.getInodeCaches key (SQL.ReadHandle qh)
 keyloc <- calcRepo (gitAnnexLocation key)
 keypopulated <- sameInodeCache keyloc caches