make repair interruption safe
Fixed bug that interrupting git-annex repair (or assistant) while it was fixing repository corruption would lose objects that were contained in pack files. Unpack all pack files and move objects into place *before* deleting the pack files. The old approach moved the pack files to a temp directory before unpacking them, which was not interruption safe. Sponsored-By: Jochen Bartl on Patreon
This commit is contained in:
parent
a492553eca
commit
199391befe
3 changed files with 48 additions and 16 deletions
|
@ -13,6 +13,9 @@ git-annex (8.20210622) UNRELEASED; urgency=medium
|
||||||
with youtube-dl or a special remote. In particular this can avoid
|
with youtube-dl or a special remote. In particular this can avoid
|
||||||
falling back to raw download when youtube-dl is blocked for some
|
falling back to raw download when youtube-dl is blocked for some
|
||||||
reason.
|
reason.
|
||||||
|
* Fixed bug that interrupting git-annex repair (or assistant) while
|
||||||
|
it was fixing repository corruption would lose objects that were
|
||||||
|
contained in pack files.
|
||||||
|
|
||||||
-- Joey Hess <id@joeyh.name> Mon, 21 Jun 2021 12:25:25 -0400
|
-- Joey Hess <id@joeyh.name> Mon, 21 Jun 2021 12:25:25 -0400
|
||||||
|
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
{- git repository recovery
|
{- git repository recovery
|
||||||
-
|
-
|
||||||
- Copyright 2013-2014 Joey Hess <id@joeyh.name>
|
- Copyright 2013-2021 Joey Hess <id@joeyh.name>
|
||||||
-
|
-
|
||||||
- Licensed under the GNU AGPL version 3 or higher.
|
- Licensed under the GNU AGPL version 3 or higher.
|
||||||
-}
|
-}
|
||||||
|
@ -29,6 +29,7 @@ import Git.Sha
|
||||||
import Git.Types
|
import Git.Types
|
||||||
import Git.Fsck
|
import Git.Fsck
|
||||||
import Git.Index
|
import Git.Index
|
||||||
|
import Git.Env
|
||||||
import qualified Git.Config as Config
|
import qualified Git.Config as Config
|
||||||
import qualified Git.Construct as Construct
|
import qualified Git.Construct as Construct
|
||||||
import qualified Git.LsTree as LsTree
|
import qualified Git.LsTree as LsTree
|
||||||
|
@ -61,15 +62,14 @@ cleanCorruptObjects fsckresults r = do
|
||||||
whenM (isMissing s r) $
|
whenM (isMissing s r) $
|
||||||
removeLoose s
|
removeLoose s
|
||||||
|
|
||||||
{- Explodes all pack files, and deletes them.
|
{- Explodes all pack files to loose objects, and deletes the pack files.
|
||||||
-
|
-
|
||||||
- First moves all pack files to a temp dir, before unpacking them each in
|
- git unpack-objects will not unpack objects from a pack file that are
|
||||||
- turn.
|
- in the git repo. So, GIT_OBJECT_DIRECTORY is pointed to a temporary
|
||||||
|
- directory, and the loose objects then are moved into place, before
|
||||||
|
- deleting the pack files.
|
||||||
-
|
-
|
||||||
- This is because unpack-objects will not unpack a pack file if it's in the
|
- Also, that prevents unpack-objects from possibly looking at corrupt
|
||||||
- git repo.
|
|
||||||
-
|
|
||||||
- Also, this prevents unpack-objects from possibly looking at corrupt
|
|
||||||
- pack files to see if they contain an object, while unpacking a
|
- pack files to see if they contain an object, while unpacking a
|
||||||
- non-corrupt pack file.
|
- non-corrupt pack file.
|
||||||
-}
|
-}
|
||||||
|
@ -78,18 +78,28 @@ explodePacks r = go =<< listPackFiles r
|
||||||
where
|
where
|
||||||
go [] = return False
|
go [] = return False
|
||||||
go packs = withTmpDir "packs" $ \tmpdir -> do
|
go packs = withTmpDir "packs" $ \tmpdir -> do
|
||||||
|
r' <- addGitEnv r "GIT_OBJECT_DIRECTORY" tmpdir
|
||||||
putStrLn "Unpacking all pack files."
|
putStrLn "Unpacking all pack files."
|
||||||
forM_ packs $ \packfile -> do
|
forM_ packs $ \packfile -> do
|
||||||
moveFile packfile (tmpdir </> takeFileName packfile)
|
-- Just in case permissions are messed up.
|
||||||
removeWhenExistsWith R.removeLink
|
allowRead (toRawFilePath packfile)
|
||||||
(packIdxFile (toRawFilePath packfile))
|
|
||||||
forM_ packs $ \packfile -> do
|
|
||||||
let tmp = tmpdir </> takeFileName packfile
|
|
||||||
allowRead (toRawFilePath tmp)
|
|
||||||
-- May fail, if pack file is corrupt.
|
-- May fail, if pack file is corrupt.
|
||||||
void $ tryIO $
|
void $ tryIO $
|
||||||
pipeWrite [Param "unpack-objects", Param "-r"] r $ \h ->
|
pipeWrite [Param "unpack-objects", Param "-r"] r' $ \h ->
|
||||||
L.hPut h =<< L.readFile tmp
|
L.hPut h =<< L.readFile packfile
|
||||||
|
objs <- dirContentsRecursive tmpdir
|
||||||
|
forM_ objs $ \objfile -> do
|
||||||
|
f <- relPathDirToFile
|
||||||
|
(toRawFilePath tmpdir)
|
||||||
|
(toRawFilePath objfile)
|
||||||
|
let dest = objectsDir r P.</> f
|
||||||
|
createDirectoryIfMissing True
|
||||||
|
(fromRawFilePath (parentDir dest))
|
||||||
|
moveFile objfile (fromRawFilePath dest)
|
||||||
|
forM_ packs $ \packfile -> do
|
||||||
|
let f = toRawFilePath packfile
|
||||||
|
removeWhenExistsWith R.removeLink f
|
||||||
|
removeWhenExistsWith R.removeLink (packIdxFile f)
|
||||||
return True
|
return True
|
||||||
|
|
||||||
{- Try to retrieve a set of missing objects, from the remotes of a
|
{- Try to retrieve a set of missing objects, from the remotes of a
|
||||||
|
|
|
@ -0,0 +1,19 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""comment 16"""
|
||||||
|
date="2021-06-29T16:11:03Z"
|
||||||
|
content="""
|
||||||
|
I have verified that an interrupted repair results in data loss, when
|
||||||
|
.git/objects contains pack files.
|
||||||
|
|
||||||
|
Fixed that. Which leaves 2 open questions:
|
||||||
|
|
||||||
|
* Was the assistant interrupted somehow while it was running repair, in
|
||||||
|
Atemu's case? That could have been anything from the laptop being
|
||||||
|
powered off, to it being stopped explicitly, to it crashing.
|
||||||
|
* What caused the assistant to think it needed repair in the first place?
|
||||||
|
|
||||||
|
Although the answer to the second question doesn't matter much if the
|
||||||
|
data loss bug is fixed -- if there's a problem of some kind causing
|
||||||
|
unncessary repairs, it would only be some excess CPU load.
|
||||||
|
"""]]
|
Loading…
Add table
Add a link
Reference in a new issue