avoid unnecessary call to inAnnex
sync --content: Avoid a redundant checksum of a file that was incrementally verified, when used on NTFS and perhaps other filesystems.

When sync has just gotten the content, it does not need to check inAnnex a second time. On NTFS, for some reason the write of the inode cache after it gets the content is not immediately able to be read, and with an empty/non-matching inode cache due to that stale data, inAnnex falls back to hashing the whole object to determine if it's present.

Sponsored-by: Brock Spratlen on Patreon
parent 17a31f8e1b
commit b9a1cc512d
4 changed files with 33 additions and 10 deletions
@@ -11,6 +11,8 @@ git-annex (8.20210904) UNRELEASED; urgency=medium
   * Sped up git-annex smudge --clean by 25%.
   * Resume where it left off when copying a file to/from a local git remote
     was interrupted.
+  * sync --content: Avoid a redundant checksum of a file that was
+    incrementally verified, when used on NTFS and perhaps other filesystems.
 
  -- Joey Hess <id@joeyh.name>  Fri, 03 Sep 2021 12:02:55 -0400
 
@@ -809,10 +809,11 @@ syncFile ebloom rs af k = do
 	let (have, lack) = partition (\r -> Remote.uuid r `elem` locs) rs
 
 	got <- anyM id =<< handleget have inhere
-	putrs <- handleput lack
+	let inhere' = inhere || got
+	putrs <- handleput lack inhere'
 
 	u <- getUUID
-	let locs' = concat [if inhere || got then [u] else [], putrs, locs]
+	let locs' = concat [if inhere' then [u] else [], putrs, locs]
 
 	-- To handle --all, a bloom filter is populated with all the keys
 	-- of files in the working tree in the first pass. On the second
@@ -855,14 +856,15 @@ syncFile ebloom rs af k = do
 		| Remote.readonly r || remoteAnnexReadOnly (Remote.gitconfig r) = return False
 		| isThirdPartyPopulated r = return False
 		| otherwise = wantSend True (Just k) af (Remote.uuid r)
-	handleput lack = catMaybes <$> ifM (inAnnex k)
-		( forM lack $ \r ->
-			ifM (wantput r <&&> put r)
-				( return (Just (Remote.uuid r))
-				, return Nothing
-				)
-		, return []
-		)
+	handleput lack inhere
+		| inhere = catMaybes <$>
+			( forM lack $ \r ->
+				ifM (wantput r <&&> put r)
+					( return (Just (Remote.uuid r))
+					, return Nothing
+					)
+			)
+		| otherwise = return []
 	put dest = includeCommandAction $
 		Command.Move.toStart' dest Command.Move.RemoveNever af k ai si
 
@@ -724,3 +724,5 @@ from 159 to 296985).
 Git Annex is great. It works quite nicely with my multi-gigabyte backup files (largest around 180GB) via the BLAKE2B160E backend :)
 
 [[!meta title="windows: sync -C takes longer than get, apparently extra checksum"]]
+
+> [[fixed|done]] --[[Joey]]
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 11"""
+ date="2021-10-01T15:57:12Z"
+ content="""
+Fixed by avoiding sync calling inAnnex when it knows it has the content,
+because it just got it.
+
+This does leave open the possibility that there are similar situations
+elsewhere, that lead to either extra work like this, or to incorrect
+behavior. Since sqlite write followed by a read is generally something
+git-annex is careful of, and also since it is generally careful to have
+reasonable behavior if sqlite somehow loses data, I'm not too worried about
+incorrect behavior. I feel comfortable closing this bug with just this fix,
+despite not getting to the bottom of the issue of why sqlite writes are
+not immediately able to be read back on NTFS.
+"""]]
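
The idea behind the fix can be sketched in isolation. This is only an illustration, not git-annex code: `expensiveCheck` is a hypothetical stand-in for a presence check such as inAnnex that may fall back to hashing, and the remotes are plain strings. It contrasts re-checking presence before every send with passing along what the caller already knows (`inhere || got`).

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a costly presence check (e.g. one that may
-- fall back to hashing the whole object). It counts its own calls.
expensiveCheck :: IORef Int -> IO Bool
expensiveCheck counter = do
    modifyIORef' counter (+ 1)
    return True

-- Old shape: always re-check presence before deciding what to send.
sendAlwaysChecking :: IORef Int -> [String] -> IO [String]
sendAlwaysChecking counter remotes = do
    present <- expensiveCheck counter
    return (if present then remotes else [])

-- New shape: the caller passes in what it already knows, so no
-- presence check happens here at all.
sendWithKnownPresence :: Bool -> [String] -> IO [String]
sendWithKnownPresence inhere remotes
    | inhere = return remotes
    | otherwise = return []

main :: IO ()
main = do
    counter <- newIORef 0
    let inhere = False -- content was not present before the sync
        got = True     -- but it was just downloaded
    _ <- sendAlwaysChecking counter ["r1", "r2"]
    sent <- sendWithKnownPresence (inhere || got) ["r1", "r2"]
    calls <- readIORef counter
    -- Only the old shape incremented the counter.
    print sent
    print calls
```

The trade-off is that the second form trusts the caller's knowledge. That is reasonable here because the content was either already present or was fetched moments earlier in the same run.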