avoid unnecessary call to inAnnex
sync --content: Avoid a redundant checksum of a file that was incrementally verified, when used on NTFS and perhaps other filesystems.

When sync has just gotten the content, it does not need to check inAnnex a second time.

On NTFS, for some reason the write of the inode cache after it gets the content is not immediately able to be read, and with an empty/non-matching inode cache due to that stale data, inAnnex falls back to hashing the whole object to determine if it's present.

Sponsored-by: Brock Spratlen on Patreon
parent 17a31f8e1b
commit b9a1cc512d
4 changed files with 33 additions and 10 deletions
@@ -11,6 +11,8 @@ git-annex (8.20210904) UNRELEASED; urgency=medium
   * Sped up git-annex smudge --clean by 25%.
   * Resume where it left off when copying a file to/from a local git remote
     was interrupted.
+  * sync --content: Avoid a redundant checksum of a file that was
+    incrementally verified, when used on NTFS and perhaps other filesystems.
 
  -- Joey Hess <id@joeyh.name> Fri, 03 Sep 2021 12:02:55 -0400
 
@@ -809,10 +809,11 @@ syncFile ebloom rs af k = do
 	let (have, lack) = partition (\r -> Remote.uuid r `elem` locs) rs
 
 	got <- anyM id =<< handleget have inhere
-	putrs <- handleput lack
+	let inhere' = inhere || got
+	putrs <- handleput lack inhere'
 
 	u <- getUUID
-	let locs' = concat [if inhere || got then [u] else [], putrs, locs]
+	let locs' = concat [if inhere' then [u] else [], putrs, locs]
 
 	-- To handle --all, a bloom filter is populated with all the keys
 	-- of files in the working tree in the first pass. On the second
@@ -855,14 +856,15 @@ syncFile ebloom rs af k = do
 		| Remote.readonly r || remoteAnnexReadOnly (Remote.gitconfig r) = return False
 		| isThirdPartyPopulated r = return False
 		| otherwise = wantSend True (Just k) af (Remote.uuid r)
-	handleput lack = catMaybes <$> ifM (inAnnex k)
-		( forM lack $ \r ->
-			ifM (wantput r <&&> put r)
-				( return (Just (Remote.uuid r))
-				, return Nothing
-				)
-		, return []
-		)
+	handleput lack inhere
+		| inhere = catMaybes <$>
+			( forM lack $ \r ->
+				ifM (wantput r <&&> put r)
+					( return (Just (Remote.uuid r))
+					, return Nothing
+					)
+			)
+		| otherwise = return []
 	put dest = includeCommandAction $
 		Command.Move.toStart' dest Command.Move.RemoveNever af k ai si
 
@@ -724,3 +724,5 @@ from 159 to 296985).
 Git Annex is great. It works quite nicely with my multi-gigabyte backup files (largest around 180GB) via the BLAKE2B160E backend :)
 
 [[!meta title="windows: sync -C takes longer than get, apparently extra checksum"]]
+
+> [[fixed|done]] --[[Joey]]
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 11"""
+ date="2021-10-01T15:57:12Z"
+ content="""
+Fixed by avoiding sync calling inAnnex when it knows it has the content,
+because it just got it.
+
+This does leave open the possibility that there are similar situations
+elsewhere, that lead to either extra work like this, or to incorrect
+behavior. Since sqlite write followed by a read is generally something
+git-annex is careful of, and also since it is generally careful to have
+reasonable behavior is sqlite somehow loses data, I'm not too worried about
+incorrect behavior. I feel comfortable closing this bug with just this fix,
+despite not getting to the bottom of the issue of why sqlite writes are
+not immediately able to be read back on NTFS.
+"""]]
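
The shape of this change can be illustrated outside git-annex with a minimal, self-contained sketch. It is not the git-annex code: `expensivePresenceCheck`, `putBefore`, and `putAfter` are hypothetical names invented for illustration, the remotes are plain strings, and the "expensive" check only increments a counter where the real inAnnex might fall back to hashing the object. The sketch assumes, as the commit message describes, that the caller already knows whether the content is present because it just fetched it.

```haskell
import Control.Monad (forM)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Maybe (catMaybes)

-- Hypothetical stand-in for a presence check that may fall back to
-- hashing the whole object. Here it only counts how often it runs.
expensivePresenceCheck :: IORef Int -> IO Bool
expensivePresenceCheck counter = do
    modifyIORef' counter (+ 1)
    return True

-- Before: the put step re-checks presence itself, even when the
-- caller has just downloaded the content.
putBefore :: IORef Int -> [String] -> IO [String]
putBefore counter remotes = do
    present <- expensivePresenceCheck counter
    if present
        then catMaybes <$> forM remotes (return . Just)
        else return []

-- After: the caller passes in what it already knows, so no check
-- is performed in the put step at all.
putAfter :: Bool -> [String] -> IO [String]
putAfter inhere remotes
    | inhere = catMaybes <$> forM remotes (return . Just)
    | otherwise = return []

main :: IO ()
main = do
    counter <- newIORef 0
    let inhere = False -- content was not present initially
        got = True     -- but a download just succeeded
    sentBefore <- putBefore counter ["r1", "r2"]
    sentAfter <- putAfter (inhere || got) ["r1", "r2"]
    checks <- readIORef counter
    -- Both variants send to the same remotes; only putBefore
    -- triggered the expensive check.
    print (sentBefore, sentAfter, checks)
```

Both variants return the same list of remotes, and the counter shows that only the "before" variant ran the presence check. Passing the known state as a `Bool` makes the function's result depend on its arguments alone, which is also why the guard form replaces `ifM` in the diff above.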