assistant: Fix a race condition that could cause a pointer file to get ingested into the annex

This was caused by commit fb8ab2469d putting
an isPointerFile check in the wrong place. So if the file was not a pointer
file at that point, but got replaced by one before the file got locked
down, the pointer file would be ingested into the annex.

The fix is simply to move the isPointerFile check to after safeToAdd locks
down the file. Now if the file changes to a pointer file after the
isPointerFile check, ingestion will see that it changed after lockdown,
and will refuse to add it to the annex.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
This commit is contained in:
Joey Hess 2024-07-02 12:24:57 -04:00
parent a65068fb66
commit 12a0ca9656
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 45 additions and 8 deletions

View file

@ -290,12 +290,14 @@ handleAdds lockdowndir havelsof largefilematcher annexdotfiles delayadd cs = ret
refillChanges postponed refillChanges postponed
returnWhen (null toadd) $ do returnWhen (null toadd) $ do
(addedpointerfiles, toaddrest) <- partitionEithers
<$> mapM checkpointerfile toadd
(toaddannexed, toaddsmall) <- partitionEithers (toaddannexed, toaddsmall) <- partitionEithers
<$> mapM checksmall toadd <$> mapM checksmall toaddrest
addsmall toaddsmall addsmall toaddsmall
addedannexed <- addaction toadd $ addedannexed <- addaction toadd $
catMaybes <$> addannexed toaddannexed catMaybes <$> addannexed toaddannexed
return $ addedannexed ++ toaddsmall ++ otherchanges return $ addedannexed ++ toaddsmall ++ addedpointerfiles ++ otherchanges
where where
(incomplete, otherchanges) = partition (\c -> isPendingAddChange c || isInProcessAddChange c) cs (incomplete, otherchanges) = partition (\c -> isPendingAddChange c || isInProcessAddChange c) cs
@ -303,6 +305,19 @@ handleAdds lockdowndir havelsof largefilematcher annexdotfiles delayadd cs = ret
| c = return otherchanges | c = return otherchanges
| otherwise = a | otherwise = a
checkpointerfile change = do
let file = toRawFilePath $ changeFile change
mk <- liftIO $ isPointerFile file
case mk of
Nothing -> return (Right change)
Just key -> do
mode <- liftIO $ catchMaybeIO $ fileMode <$> R.getFileStatus file
liftAnnex $ stagePointerFile file mode =<< hashPointerFile key
return $ Left $ Change
(changeTime change)
(changeFile change)
(LinkChange (Just key))
checksmall change checksmall change
| not annexdotfiles && dotfile f = | not annexdotfiles && dotfile f =
return (Right change) return (Right change)

View file

@ -196,11 +196,8 @@ shouldRestage :: DaemonStatus -> Bool
shouldRestage ds = scanComplete ds || forceRestage ds shouldRestage ds = scanComplete ds || forceRestage ds
onAddFile :: Bool -> Handler onAddFile :: Bool -> Handler
onAddFile symlinkssupported f fs = do onAddFile symlinkssupported f fs =
mk <- liftIO $ isPointerFile $ toRawFilePath f onAddFile' contentchanged addassociatedfile addlink samefilestatus symlinkssupported f fs
case mk of
Nothing -> onAddFile' contentchanged addassociatedfile addlink samefilestatus symlinkssupported f fs
Just k -> addlink f k
where where
addassociatedfile key file = addassociatedfile key file =
Database.Keys.addAssociatedFile key Database.Keys.addAssociatedFile key

View file

@ -1,3 +1,10 @@
git-annex (10.20240702) UNRELEASED; urgency=medium
* assistant: Fix a race condition that could cause a pointer file to
get ingested into the annex.
-- Joey Hess <id@joeyh.name> Tue, 02 Jul 2024 12:14:53 -0400
git-annex (10.20240701) upstream; urgency=medium git-annex (10.20240701) upstream; urgency=medium
* git-annex remotes can now act as proxies that provide access to * git-annex remotes can now act as proxies that provide access to

View file

@ -104,3 +104,6 @@ on laptop where I dive into inception: 10.20240129
[[!meta author=yoh]] [[!meta author=yoh]]
[[!tag projects/repronim]] [[!tag projects/repronim]]
[[!meta title="inception: pointer file can be ingested into the annex due to assistant bug or manually"]]

View file

@ -0,0 +1,15 @@
[[!comment format=mdwn
username="joey"
subject="""comment 7"""
date="2024-07-02T16:15:28Z"
content="""
I've fixed this race in the assistant.
Question now is, can this bug be closed, or does it need to be left open,
and git-annex made to recover from this situation? Given the complexity
of making git-annex notice this, I'm sort of inclined to not have it
auto-recover. Manual recovery seems pretty simple, just delete the file and
re-add it with the right key.
Thoughts?
"""]]