recover from race between git mv+commit and git-annex get

Last of the known v6 races.

This also makes git add of a pointer file populate it when its content
is present in the annex, which I think makes sense to do.

This commit was supported by the NSF-funded DataLad project.
Joey Hess 2018-08-22 16:01:50 -04:00
parent 50fa17aee6
commit 98fd7ec6c9
2 changed files with 23 additions and 25 deletions

@@ -73,9 +73,11 @@ smudge file = do
 clean :: FilePath -> CommandStart
 clean file = do
     b <- liftIO $ B.hGetContents stdin
-    if isJust (parseLinkOrPointer b)
-        then liftIO $ B.hPut stdout b
-        else ifM (shouldAnnex file)
+    case parseLinkOrPointer b of
+        Just k -> do
+            getMoveRaceRecovery k file
+            liftIO $ B.hPut stdout b
+        Nothing -> ifM (shouldAnnex file)
             ( do
                 -- Before git 2.5, failing to consume all
                 -- stdin here would cause a SIGPIPE and
@@ -122,3 +124,21 @@ shouldAnnex file = do
 emitPointer :: Key -> IO ()
 emitPointer = putStr . formatPointer
+
+-- Recover from a previous race between eg git mv and git-annex get.
+-- That could result in the file remaining a pointer file, while
+-- its content is present in the annex. Populate the pointer file.
+--
+-- This also handles the case where a copy of a pointer file is made,
+-- then git-annex gets the content, and later git add is run on
+-- the pointer copy. It will then be populated with the content.
+getMoveRaceRecovery :: Key -> FilePath -> Annex ()
+getMoveRaceRecovery k file = void $ tryNonAsync $
+    liftIO (isPointerFile file) >>= \k' -> when (Just k == k') $
+        whenM (inAnnex k) $ do
+            obj <- calcRepo (gitAnnexLocation k)
+            -- Cannot restage because git add is running and has
+            -- the index locked.
+            populatePointerFile (Restage False) k obj file >>= \case
+                Nothing -> return ()
+                Just ic -> Database.Keys.addInodeCaches k [ic]
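
The decision that getMoveRaceRecovery makes can be modelled without any I/O, which may help when reasoning about which cases it touches. The following is only a sketch under stated assumptions: Key, Recovery, parsePointer and recoveryFor are hypothetical stand-ins invented for illustration, not git-annex's own definitions, and the pointer parser is deliberately simplified to a first line of the form /annex/objects/<key>.

```haskell
import Data.List (stripPrefix)

-- Hypothetical stand-in; git-annex's real Key type is richer.
newtype Key = Key String
    deriving (Eq, Show)

-- What the clean filter would do to the worktree file in this model.
data Recovery = Populate Key | LeaveAlone
    deriving (Eq, Show)

-- Simplified pointer parser (an assumption for this sketch):
-- the first line must be "/annex/objects/<key>" with a non-empty key.
parsePointer :: String -> Maybe Key
parsePointer s =
    case stripPrefix "/annex/objects/" (takeWhile (/= '\n') s) of
        Just k | not (null k) -> Just (Key k)
        _ -> Nothing

-- Pure model of the recovery decision.
--   present  : is the key's content in the annex? (stands in for inAnnex)
--   stdinBuf : what git feeds to the clean filter on stdin
--   fileBuf  : what the worktree file currently contains
recoveryFor :: (Key -> Bool) -> String -> String -> Recovery
recoveryFor present stdinBuf fileBuf =
    case parsePointer stdinBuf of
        -- Populate only when the worktree file is still a pointer to the
        -- same key and the content has since arrived in the annex.
        Just k
            | parsePointer fileBuf == Just k && present k -> Populate k
        _ -> LeaveAlone
```

In this model, a pointer on stdin whose worktree file is the same pointer yields Populate only when the presence check succeeds; a worktree file that already holds real content, or a key that is absent from the annex, is left alone. The real function additionally wraps the whole operation in tryNonAsync and records the inode cache, which this sketch does not attempt to reproduce.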

@@ -2,28 +2,6 @@ git-annex should use smudge/clean filters. v6 mode
 ### August sprint todo list

-* If `git mv` of an unlocked file is run at the same time as `git annex drop`,
-  and when git-annex starts up, the mv has not happened yet, but once it
-  wants to update the associated file to drop the content, the mv has
-  happened, then the content will be left in the working tree despite
-  git-annex having said it dropped it. And `git annex move` has the inverse
-  problem.
-
-  git-annex fsck does notice and fix this problem, at least sometimes.
-
-  This could be partially dealt with in reconcileStaged. The next time
-  git-annex runs it, it will notice the staged change, and it could update
-  the worktree file that was not gotten/dropped before. -- this is done now
-
-  But, if a git mv is run, and then a git commit, reconcileStaged won't
-  get a chance to notice the changes. git commit does run the clean filter.
-  If the file was supposed to be dropped but is still present, the clean
-  filter will re-inject it, and it's as if the drop never happened.
-
-  OTOH, if the file was supposed to be gotten but is not present, the clean
-  filter currently does nothing. It would need to update the worktree file
-  to have the content to fully recover from the race.
-
 * Checking out a different branch causes git to smudge all changed files,
   and write their content. This does not honor annex.thin. A warning
   message is printed in this case.