finishing up move --from --to

Lock the local content for drop after getting it from src, to prevent another
process from using the local content as a copy and dropping it from src,
which would prevent dropping the local content after sending it to dest.

Support resuming an interrupted move that downloaded the content from
src, leaving the local content populated. In this case, the location log
has not been updated to say the content is present locally, so we can
assume that it's resuming and go ahead and drop the local content after
sending it to dest.
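
The decision described above can be sketched as a pure function. This is only an illustration, assuming dest does not yet have the content; `Step` and `planMove` are hypothetical names, not part of git-annex, and the real code interleaves these steps with cleanup actions and location log updates.

```haskell
-- Hypothetical model of the steps taken by `move --from --to`.
data Step
    = DownloadFromSrc
    | LockLocalForRemoval
    | SendToDest
    | DropLocal
    | DropFromSrc
    deriving (Eq, Show)

-- First argument: is the content present locally?
-- Second argument: does the location log say it is present locally?
-- Assumption: dest does not already have the content.
planMove :: Bool -> Bool -> [Step]
-- Present and logged present: the local copy belongs to the user,
-- so it is kept.
planMove True True = [SendToDest, DropFromSrc]
-- Otherwise the local copy is (or will be) temporary. If it is already
-- present but not logged, assume a resumed move and skip the download.
planMove ispresent _ =
    [DownloadFromSrc | not ispresent]
        ++ [LockLocalForRemoval, SendToDest, DropLocal, DropFromSrc]
```

For example, `planMove True False` models the resumed move: no download, but the local content is still locked, sent, and dropped.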

Note that if a `git-annex get` is being run at the same time as a
`git-annex move --from --to`, it may get a file just before the move
processes it. In that case the location log has not been updated yet, so
the move thinks it's resuming, and the local copy is dropped after it's
sent to dest. This race is something we'll just have to live with,
it seems.

I also gave up on the idea of checking if the location log had been updated
by a `git-annex get` that is run at the same time. That wouldn't work, because
the location log is precached in the seek stage, so reading it again after
sending the content to dest would not notice changes made to it, unless the cache
were invalidated, which would slow it down a lot. That idea was anyway subject
to races where it would not detect the concurrent `git-annex get`.
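
A minimal sketch of why a precached read misses a concurrent update. Here an `IORef` stands in for the location log and a plain value for the seek-stage cache; both, along with the names `LocationLog` and `staleCacheDemo`, are illustrative assumptions and not git-annex's real caching code.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

type UUID = String

-- Hypothetical stand-in for the location log of a single key.
type LocationLog = IORef [UUID]

-- Simulates: precache the log at seek time, let a concurrent
-- `git-annex get` record the local repository, then compare what the
-- cached value and a fresh read say about local presence.
staleCacheDemo :: IO (Bool, Bool)
staleCacheDemo = do
    locationLog <- newIORef ["src"] :: IO LocationLog
    -- Seek stage: the log is read once and the result is kept.
    cached <- readIORef locationLog
    -- A concurrent get updates the log after the precache.
    modifyIORef locationLog ("here" :)
    -- Only a fresh read (cache invalidation) sees the update.
    fresh <- readIORef locationLog
    return ("here" `elem` cached, "here" `elem` fresh)
```

Running `staleCacheDemo` yields `(False, True)`: the cached value does not see the update, while the fresh read does.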

So concurrent `git-annex get` will have results that may be surprising.
To make that less surprising, updated the documentation of this feature to
be explicit that it downloads content to the local repository
temporarily.

Sponsored-by: Dartmouth College's DANDI project
Joey Hess 2023-01-23 17:07:21 -04:00
parent f5f799f17e
commit acc3f6211f
3 changed files with 60 additions and 43 deletions


@ -139,7 +139,8 @@ lockContentShared key a = lockContentUsing lock key notpresent $
-
- If locking fails, throws an exception rather than running the action.
-
- If locking fails because the content is not present, runs the
- When the content file itself is used as the lock file,
- and locking fails because the content is not present, runs the
- fallback action instead. However, the content is not guaranteed to be
- present when this succeeds.
-}


@ -149,7 +149,10 @@ expectedPresent dest key = do
return $ dest `elem` remotes
toPerform :: Remote -> RemoveWhen -> Key -> AssociatedFile -> Bool -> Either String Bool -> CommandPerform
toPerform dest removewhen key afile fastcheck isthere = do
toPerform = toPerform' Nothing
toPerform' :: Maybe ContentRemovalLock -> Remote -> RemoveWhen -> Key -> AssociatedFile -> Bool -> Either String Bool -> CommandPerform
toPerform' mcontentlock dest removewhen key afile fastcheck isthere = do
srcuuid <- getUUID
case isthere of
Left err -> do
@ -178,7 +181,7 @@ toPerform dest removewhen key afile fastcheck isthere = do
setpresentremote
logMoveCleanup deststartedwithcopy
next $ return True
RemoveSafe -> lockContentForRemoval key lockfailed $ \contentlock -> do
RemoveSafe -> lockcontentforremoval $ \contentlock -> do
srcuuid <- getUUID
r <- willDropMakeItWorse srcuuid destuuid deststartedwithcopy key afile >>= \case
DropAllowed -> drophere setpresentremote contentlock "moved"
@ -213,6 +216,10 @@ toPerform dest removewhen key afile fastcheck isthere = do
() <- setpresentremote
return False
lockcontentforremoval a = case mcontentlock of
Nothing -> lockContentForRemoval key lockfailed a
Just contentlock -> a contentlock
-- This occurs when, for example, two files are being dropped
-- and have the same content. The seek stage checks if the content
-- is present, but due to buffering, may find it present for the
@ -350,29 +357,26 @@ fromToStart removewhen afile key ai si src dest = do
- Using a regular download of the local copy, rather than download to
- some other file makes resuming an interrupted download work as usual,
- and simplifies implementation. It does mean that, if `git-annex get` of
- the same content is being run at the same time, it will see that
- the local copy exists, but then it would get deleted. To avoid that
- unexpected behavior, check the location log before dropping the local
- copy, and if it has been updated (by another process) to say that the
- content is present locally, skip dropping the local copy.
-
- (That leaves a small race, where the other process updates the location
- log after we check it. And another where the other process sees the
- local copy exists just before we drop it. In either case the resulting
- behavior is similar to `git-annex move --to` being run concurrently
- with `git-annex get`.)
-
- The other complication of this approach is that the temporary local
- copy could be seen by another process that uses it as one of the
- necessary copies when dropping from somewhere else. To avoid the number
- of copies being reduced in such a situation (or the local copy not being
- able to be safely dropped), lock the local copy for drop before
- downloading it (v10) or immediately after download (v9 or older).
- the same content is being run at the same time as this move, the content
- may end up locally present, or not. This is similar to the behavior
- when running `git-annex move --to` concurrently with git-annex get.
-}
fromToPerform :: Remote -> Remote -> RemoveWhen -> Key -> AssociatedFile -> CommandPerform
fromToPerform src dest removewhen key afile = go =<< inAnnex key
fromToPerform src dest removewhen key afile = do
hereuuid <- getUUID
loggedpresent <- any (== hereuuid)
<$> loggedLocations key
ispresent <- inAnnex key
go ispresent loggedpresent
where
go True = do
-- The content is present, and is logged as present, so it
-- can be sent to dest and dropped from src.
--
-- When resuming an interrupted move --from --to, where the content
-- was not present but got downloaded from src, it will not be
-- logged present, and so this won't be used. Instead, the local
-- content will get dropped after being copied to dest.
go True True = do
haskey <- Remote.hasKey dest key
-- Prepare to drop from src later. Doing this first
-- makes "from src" be shown consistently before
@ -380,12 +384,12 @@ fromToPerform src dest removewhen key afile = go =<< inAnnex key
dropsrc <- fromsrc True
combinecleanups
-- Send to dest, preserve local copy.
(todest RemoveNever haskey)
(todest Nothing RemoveNever haskey)
(\senttodest -> if senttodest
then dropsrc removewhen
else stop
)
go False = do
go ispresent _loggedpresent = do
haskey <- Remote.hasKey dest key
case haskey of
Left err -> do
@ -399,25 +403,34 @@ fromToPerform src dest removewhen key afile = go =<< inAnnex key
dropfromsrc id
Right False -> do
-- Get local copy from src, defer dropping
-- from src until later.
cleanupfromsrc <- fromsrc False
combinecleanups
-- Send to dest and remove local copy.
(todest RemoveSafe haskey)
(\senttodest ->
-- Drop from src, checking
-- copies including dest.
combinecleanups
(cleanupfromsrc RemoveNever)
(\_ -> if senttodest
then dropfromsrc (\l -> UnVerifiedRemote dest : l)
else stop
)
)
-- from src until later. Note that fromsrc
-- does not update the location log.
cleanupfromsrc <- if ispresent
then return $ const $ next (return True)
else fromsrc False
-- Lock the local copy for removal early,
-- to avoid other processes relying on it
-- as a copy, and removing other copies
-- (such as the one in src), which would prevent
-- dropping the local copy later.
lockContentForRemoval key stop $ \contentlock ->
combinecleanups
-- Send to dest and remove local copy.
(todest (Just contentlock) RemoveSafe haskey)
(\senttodest ->
-- Drop from src, checking
-- copies including dest.
combinecleanups
(cleanupfromsrc RemoveNever)
(\_ -> if senttodest
then dropfromsrc (\l -> UnVerifiedRemote dest : l)
else stop
)
)
fromsrc present = fromPerform' present False src key afile
todest removewhen' = toPerform dest removewhen' key afile False
todest mcontentlock removewhen' = toPerform' mcontentlock dest removewhen' key afile False
dropfromsrc adjusttocheck =
logMove (Remote.uuid src) (Remote.uuid dest) True key $ \deststartedwithcopy ->


@ -31,9 +31,12 @@ Paths of files or directories to operate on can be specified.
* `--from=remote1 --to=remote2`
Move the content of files that are in remote1 to remote2. Does not change
what is stored in the local repository.
what is stored in the local repository.
Note: This may need to store an intermediate copy of the content on disk.
This is implemented by first downloading the content from remote1 to the
local repository (if not already present), then sending it to remote2, and
then deleting the content from the local repository (if it was not present
to start with).
* `--force`