dropping unused marks as dead

Dropping an object with drop --unused or dropunused will mark it as
dead, preventing fsck --all from complaining about it after it's been
dropped from all repositories.

If another repository still has a copy, it won't be treated as dead
until it's also dropped from there.

The drop has to use --unused, can't be --key or something else, because
this indicates that the user has recently ran git-annex unused. If it
checked the unused log on every drop, bad things would happen when the
unused log was out of date, eg a file used to be unused but then got
re-added. Marking such a file as dead could be confusing. When the user
uses --unused/dropunused, they must consider the unused information to be
up-to-date.

The particular workflow this enables is:

	git annex add foo
	git annex unannex foo
	git annex unused
	git annex drop --unused / dropunused
	git annex fsck --all # no warnings

The docs for git-annex unannex say to use git-annex unused and dropunused,
so the user should be pointed in this direction when they want to undo an
accidental add.

Sponsored-by: Brock Spratlen on Patreon
This commit is contained in:
Joey Hess 2021-06-25 15:22:05 -04:00
parent b08b9ed210
commit 3a14648142
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
10 changed files with 88 additions and 31 deletions

View file

@ -118,10 +118,10 @@ handleDropsFrom locs rs reason fromhere key afile si preverified runner = do
dropl fs n = checkdrop fs n Nothing $ \pcc numcopies mincopies ->
stopUnless (inAnnex key) $
Command.Drop.startLocal pcc afile ai si numcopies mincopies key preverified
Command.Drop.startLocal pcc afile ai si numcopies mincopies key preverified (Command.Drop.DroppingUnused False)
dropr fs r n = checkdrop fs n (Just $ Remote.uuid r) $ \pcc numcopies mincopies ->
Command.Drop.startRemote pcc afile ai si numcopies mincopies key r
Command.Drop.startRemote pcc afile ai si numcopies mincopies key (Command.Drop.DroppingUnused False) r
ai = mkActionItem (key, afile)

View file

@ -6,6 +6,9 @@ git-annex (8.20210622) UNRELEASED; urgency=medium
* Added annex.freezecontent-command and annex.thawcontent-command
configs.
* Improve display of errors when transfers fail.
* Dropping an unused object with drop --unused or dropunused will
mark it as dead, preventing fsck --all from complaining about it
after it's been dropped from all repositories.
-- Joey Hess <id@joeyh.name> Mon, 21 Jun 2021 12:25:25 -0400

View file

@ -86,29 +86,32 @@ start' o from key afile ai si =
checkDropAuto (autoMode o) from afile key $ \numcopies mincopies ->
stopUnless wantdrop $
case from of
Nothing -> startLocal pcc afile ai si numcopies mincopies key []
Just remote -> startRemote pcc afile ai si numcopies mincopies key remote
Nothing -> startLocal pcc afile ai si numcopies mincopies key [] ud
Just remote -> startRemote pcc afile ai si numcopies mincopies key ud remote
where
wantdrop
| autoMode o = wantDrop False (Remote.uuid <$> from) (Just key) afile Nothing
| otherwise = return True
pcc = PreferredContentChecked (autoMode o)
ud = case (batchOption o, keyOptions o) of
(NoBatch, Just WantUnusedKeys) -> DroppingUnused True
_ -> DroppingUnused False
startKeys :: DropOptions -> Maybe Remote -> (SeekInput, Key, ActionItem) -> CommandStart
startKeys o from (si, key, ai) = start' o from key (AssociatedFile Nothing) ai si
startLocal :: PreferredContentChecked -> AssociatedFile -> ActionItem -> SeekInput -> NumCopies -> MinCopies -> Key -> [VerifiedCopy] -> CommandStart
startLocal pcc afile ai si numcopies mincopies key preverified =
startLocal :: PreferredContentChecked -> AssociatedFile -> ActionItem -> SeekInput -> NumCopies -> MinCopies -> Key -> [VerifiedCopy] -> DroppingUnused -> CommandStart
startLocal pcc afile ai si numcopies mincopies key preverified ud =
starting "drop" (OnlyActionOn key ai) si $
performLocal pcc key afile numcopies mincopies preverified
performLocal pcc key afile numcopies mincopies preverified ud
startRemote :: PreferredContentChecked -> AssociatedFile -> ActionItem -> SeekInput -> NumCopies -> MinCopies -> Key -> Remote -> CommandStart
startRemote pcc afile ai si numcopies mincopies key remote =
startRemote :: PreferredContentChecked -> AssociatedFile -> ActionItem -> SeekInput -> NumCopies -> MinCopies -> Key -> DroppingUnused -> Remote -> CommandStart
startRemote pcc afile ai si numcopies mincopies key ud remote =
starting ("drop " ++ Remote.name remote) (OnlyActionOn key ai) si $
performRemote pcc key afile numcopies mincopies remote
performRemote pcc key afile numcopies mincopies remote ud
performLocal :: PreferredContentChecked -> Key -> AssociatedFile -> NumCopies -> MinCopies -> [VerifiedCopy] -> CommandPerform
performLocal pcc key afile numcopies mincopies preverified = lockContentForRemoval key fallback $ \contentlock -> do
performLocal :: PreferredContentChecked -> Key -> AssociatedFile -> NumCopies -> MinCopies -> [VerifiedCopy] -> DroppingUnused -> CommandPerform
performLocal pcc key afile numcopies mincopies preverified ud = lockContentForRemoval key fallback $ \contentlock -> do
u <- getUUID
(tocheck, verified) <- verifiableCopies key [u]
doDrop pcc u (Just contentlock) key afile numcopies mincopies [] (preverified ++ verified) tocheck
@ -120,7 +123,7 @@ performLocal pcc key afile numcopies mincopies preverified = lockContentForRemov
]
removeAnnex contentlock
notifyDrop afile True
next $ cleanupLocal key
next $ cleanupLocal key ud
, do
notifyDrop afile False
stop
@ -131,10 +134,10 @@ performLocal pcc key afile numcopies mincopies preverified = lockContentForRemov
-- is present, but due to buffering, may find it present for the
-- second file before the first is dropped. If so, nothing remains
-- to be done except for cleaning up.
fallback = next $ cleanupLocal key
fallback = next $ cleanupLocal key ud
performRemote :: PreferredContentChecked -> Key -> AssociatedFile -> NumCopies -> MinCopies -> Remote -> CommandPerform
performRemote pcc key afile numcopies mincopies remote = do
performRemote :: PreferredContentChecked -> Key -> AssociatedFile -> NumCopies -> MinCopies -> Remote -> DroppingUnused -> CommandPerform
performRemote pcc key afile numcopies mincopies remote ud = do
-- Filter the uuid it's being dropped from out of the lists of
-- places assumed to have the key, and places to check.
(tocheck, verified) <- verifiableCopies key [uuid]
@ -147,23 +150,36 @@ performRemote pcc key afile numcopies mincopies remote = do
, show proof
]
ok <- Remote.action (Remote.removeKey remote key)
next $ cleanupRemote key remote ok
next $ cleanupRemote key remote ud ok
, stop
)
where
uuid = Remote.uuid remote
cleanupLocal :: Key -> CommandCleanup
cleanupLocal key = do
logStatus key InfoMissing
cleanupLocal :: Key -> DroppingUnused -> CommandCleanup
cleanupLocal key ud = do
logStatus key (dropStatus ud)
return True
cleanupRemote :: Key -> Remote -> Bool -> CommandCleanup
cleanupRemote key remote ok = do
cleanupRemote :: Key -> Remote -> DroppingUnused -> Bool -> CommandCleanup
cleanupRemote key remote ud ok = do
when ok $
Remote.logStatus remote key InfoMissing
Remote.logStatus remote key (dropStatus ud)
return ok
{- Set when the user explicitly chose to operate on unused content.
- Presumably the user still expects the last git-annex unused to be
- correct at this point. -}
newtype DroppingUnused = DroppingUnused Bool
{- When explicitly dropping unused content, mark the key as dead, at least
- in the repository it was dropped from. It may still be in other
- repositories, and will not be treated as dead until dropped from all of
- them. -}
dropStatus :: DroppingUnused -> LogStatus
dropStatus (DroppingUnused False) = InfoMissing
dropStatus (DroppingUnused True) = InfoDead
{- Before running the dropaction, checks specified remotes to
- verify that enough copies of a key exist to allow it to be
- safely removed (with no data loss).

View file

@ -49,7 +49,7 @@ perform :: Maybe Remote -> NumCopies -> MinCopies -> Key -> CommandPerform
perform from numcopies mincopies key = case from of
Just r -> do
showAction $ "from " ++ Remote.name r
Command.Drop.performRemote pcc key (AssociatedFile Nothing) numcopies mincopies r
Command.Drop.performRemote pcc key (AssociatedFile Nothing) numcopies mincopies r ud
Nothing -> ifM (inAnnex key)
( droplocal
, ifM (objectFileExists key)
@ -63,8 +63,9 @@ perform from numcopies mincopies key = case from of
)
)
where
droplocal = Command.Drop.performLocal pcc key (AssociatedFile Nothing) numcopies mincopies []
droplocal = Command.Drop.performLocal pcc key (AssociatedFile Nothing) numcopies mincopies [] ud
pcc = Command.Drop.PreferredContentChecked False
ud = Command.Drop.DroppingUnused True
performOther :: (Key -> Git.Repo -> RawFilePath) -> Key -> CommandPerform
performOther filespec key = do

View file

@ -69,7 +69,8 @@ startKey o afile (si, key, ai) = case fromToOptions o of
( Command.Move.toStart Command.Move.RemoveNever afile key ai si =<< getParsed r
, do
(numcopies, mincopies) <- getSafestNumMinCopies afile key
Command.Drop.startRemote pcc afile ai si numcopies mincopies key =<< getParsed r
Command.Drop.startRemote pcc afile ai si numcopies mincopies key (Command.Drop.DroppingUnused False)
=<< getParsed r
)
FromRemote r -> checkFailedTransferDirection ai Download $ do
haskey <- flip Remote.hasKey key =<< getParsed r
@ -82,7 +83,7 @@ startKey o afile (si, key, ai) = case fromToOptions o of
Right False -> ifM (inAnnex key)
( do
(numcopies, mincopies) <- getSafestNumMinCopies afile key
Command.Drop.startLocal pcc afile ai si numcopies mincopies key []
Command.Drop.startLocal pcc afile ai si numcopies mincopies key [] (Command.Drop.DroppingUnused False)
, stop
)
where

View file

@ -186,7 +186,7 @@ toPerform dest removewhen key afile fastcheck isthere = do
removeAnnex contentlock
next $ do
() <- setpresentremote
Command.Drop.cleanupLocal key
Command.Drop.cleanupLocal key (Command.Drop.DroppingUnused False)
faileddrophere setpresentremote = do
showLongNote "(Use --force to override this check, or adjust numcopies.)"
showLongNote "Content not dropped from here."
@ -199,7 +199,7 @@ toPerform dest removewhen key afile fastcheck isthere = do
-- is present, but due to buffering, may find it present for the
-- second file before the first is dropped. If so, nothing remains
-- to be done except for cleaning up.
lockfailed = next $ Command.Drop.cleanupLocal key
lockfailed = next $ Command.Drop.cleanupLocal key (Command.Drop.DroppingUnused False)
fromStart :: RemoveWhen -> AssociatedFile -> Key -> ActionItem -> SeekInput -> Remote -> CommandStart
fromStart removewhen afile key ai si src =
@ -262,7 +262,7 @@ fromPerform src removewhen key afile = do
, "(" ++ reason ++ ")"
]
ok <- Remote.action (Remote.removeKey src key)
next $ Command.Drop.cleanupRemote key src ok
next $ Command.Drop.cleanupRemote key src (Command.Drop.DroppingUnused False) ok
faileddropremote = do
showLongNote "(Use --force to override this check, or adjust numcopies.)"

View file

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="joey"
subject="""comment 9"""
date="2021-06-25T18:19:14Z"
content="""
I think this is the same thing being discussed in
[[todo/dead_files_in_checkout_directly]].
"""]]

View file

@ -9,7 +9,7 @@ git annex dropunused `[number|range ...]`
# DESCRIPTION
Drops the data corresponding to the numbers, as listed by the last
git annex unused`
`git annex unused`
You can also specify ranges of numbers, such as "1-1000".
Or, specify "all" to drop all unused data.

View file

@ -1,3 +1,5 @@
When you want to dead a file in your checkout, you can only do so via the key of the file. You can find the corresponding key with a bit of bash like this: `git annex dead --key $(basename $(readlink file))` but that shouldn't be necessary IMO.
It'd be a lot better if you could just dead files like this: `git annex dead --file file` or even like this: `git annex dead --file file1 file2 file3 otherfiles.*` (or maybe even like this: `git annex dead --file file1 file2 --key $key1 $key2`).
> [[done]] in another way --[[Joey]]

View file

@ -0,0 +1,26 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2021-06-25T18:22:04Z"
content="""
In <https://git-annex.branchable.com/forum/Forget_about_accidentally_added_file__63__/>
there is an idea of `git annex unannex --forget file`
And using unannex for this makes some sense; it's intended to be used to undo an
accidental `git-annex add`. When it's used that way, and later `git-annex
unused` finds the object file is not used by anything and the object gets
deleted, fsck --all will start complaining about it.
But there are still many ways it could go wrong. Being run recursively by
accident. Or another file, perhaps in another branch, using the same key,
which gets marked as dead.
Hmm, `git annex dropunused` (or `drop --unused`)
could mark the key as dead. At that point it's known to be unused.
This way, the existing workflow of git-annex unannex followed by git-annex
unused followed by dropping can be followed, and fsck --all does
not later complain about the key.
Done!
"""]]