make repair interruption safe

Fixed bug that interrupting git-annex repair (or assistant) while it was
fixing repository corruption would lose objects that were contained in pack
files.

Unpack all pack files and move objects into place *before* deleting the
pack files. The old approach moved the pack files to a temp directory
before unpacking them, which was not interruption safe.

Sponsored-By: Jochen Bartl on Patreon
This commit is contained in:
Joey Hess 2021-06-29 12:57:19 -04:00
parent a492553eca
commit 199391befe
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
3 changed files with 48 additions and 16 deletions

View file

@ -13,6 +13,9 @@ git-annex (8.20210622) UNRELEASED; urgency=medium
with youtube-dl or a special remote. In particular this can avoid
falling back to raw download when youtube-dl is blocked for some
reason.
* Fixed bug that interrupting git-annex repair (or assistant) while
it was fixing repository corruption would lose objects that were
contained in pack files.
-- Joey Hess <id@joeyh.name> Mon, 21 Jun 2021 12:25:25 -0400

View file

@ -1,6 +1,6 @@
{- git repository recovery
-
- Copyright 2013-2014 Joey Hess <id@joeyh.name>
- Copyright 2013-2021 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
@ -29,6 +29,7 @@ import Git.Sha
import Git.Types
import Git.Fsck
import Git.Index
import Git.Env
import qualified Git.Config as Config
import qualified Git.Construct as Construct
import qualified Git.LsTree as LsTree
@ -61,15 +62,14 @@ cleanCorruptObjects fsckresults r = do
whenM (isMissing s r) $
removeLoose s
{- Explodes all pack files, and deletes them.
{- Explodes all pack files to loose objects, and deletes the pack files.
-
- First moves all pack files to a temp dir, before unpacking them each in
- turn.
- git unpack-objects will not unpack objects from a pack file that are
- in the git repo. So, GIT_OBJECT_DIRECTORY is pointed to a temporary
- directory, and the loose objects then are moved into place, before
- deleting the pack files.
-
- This is because unpack-objects will not unpack a pack file if it's in the
- git repo.
-
- Also, this prevents unpack-objects from possibly looking at corrupt
- Also, that prevents unpack-objects from possibly looking at corrupt
- pack files to see if they contain an object, while unpacking a
- non-corrupt pack file.
-}
@ -78,18 +78,28 @@ explodePacks r = go =<< listPackFiles r
where
go [] = return False
go packs = withTmpDir "packs" $ \tmpdir -> do
r' <- addGitEnv r "GIT_OBJECT_DIRECTORY" tmpdir
putStrLn "Unpacking all pack files."
forM_ packs $ \packfile -> do
moveFile packfile (tmpdir </> takeFileName packfile)
removeWhenExistsWith R.removeLink
(packIdxFile (toRawFilePath packfile))
forM_ packs $ \packfile -> do
let tmp = tmpdir </> takeFileName packfile
allowRead (toRawFilePath tmp)
-- Just in case permissions are messed up.
allowRead (toRawFilePath packfile)
-- May fail, if pack file is corrupt.
void $ tryIO $
pipeWrite [Param "unpack-objects", Param "-r"] r $ \h ->
L.hPut h =<< L.readFile tmp
pipeWrite [Param "unpack-objects", Param "-r"] r' $ \h ->
L.hPut h =<< L.readFile packfile
objs <- dirContentsRecursive tmpdir
forM_ objs $ \objfile -> do
f <- relPathDirToFile
(toRawFilePath tmpdir)
(toRawFilePath objfile)
let dest = objectsDir r P.</> f
createDirectoryIfMissing True
(fromRawFilePath (parentDir dest))
moveFile objfile (fromRawFilePath dest)
forM_ packs $ \packfile -> do
let f = toRawFilePath packfile
removeWhenExistsWith R.removeLink f
removeWhenExistsWith R.removeLink (packIdxFile f)
return True
{- Try to retrieve a set of missing objects, from the remotes of a

View file

@ -0,0 +1,19 @@
[[!comment format=mdwn
username="joey"
subject="""comment 16"""
date="2021-06-29T16:11:03Z"
content="""
I have verified that an interrupted repair results in data loss, when
.git/objects contains pack files.
Fixed that. Which leaves 2 open questions:
* Was the assistant interrupted somehow while it was running repair, in
Atemu's case? That could have been anything from the laptop being
powered off, to it being stopped explicitly, to it crashing.
* What caused the assistant to think it needed repair in the first place?
Although the answer to the second question doesn't matter much if the
data loss bug is fixed -- if there's a problem of some kind causing
unncessary repairs, it would only be some excess CPU load.
"""]]