export: fix multi-file delete bug

export: Fix a bug that left a file on a special remote when two files with
the same content were both deleted in the exported tree.

Case of the wrong data structure leading to the wrong result.
The DiffMap now contains all the old filenames, and all the new filenames.

Note that, when 2 files with the same content are both renamed,
it only renames the first, but deletes and re-exports the second.
Improving that is possible, but it would need to use a different temporary
filename. Anyway, that is an unusual case, and there are known to be other
unusual cases where export does not rename with maximum efficiency, IIRC.
(Or maybe this is the case that I remember?)

Sponsored-by: Dartmouth College's OpenNeuro project
This commit is contained in:
Joey Hess 2022-11-09 16:24:37 -04:00
parent 56d563577f
commit b2cc63d5bf
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
3 changed files with 23 additions and 18 deletions

View file

@ -1,11 +1,13 @@
git-annex (10.20221104) UNRELEASED; urgency=medium
* export: Fix a bug that left a file on a special remote when
two files with the same content were both deleted in the exported tree.
* S3: Support signature=anonymous to access a S3 bucket anonymously.
This can be used, for example, with importtree=yes to import from
a public bucket.
This feature needs git-annex to be built with aws-0.23.
* stack.yaml: Updated to lts-19.32
-- Joey Hess <id@joeyh.name> Thu, 03 Nov 2022 14:07:31 -0400
git-annex (10.20221103) upstream; urgency=medium

View file

@ -149,23 +149,24 @@ changeExport r db (ExportFiltered new) = do
[] -> updateExportTree db emptyTree new
[oldtreesha] -> do
diffmap <- mkDiffMap oldtreesha new db
let seekdiffmap a = commandActions $
map a (M.toList diffmap)
let seekdiffmap a = mapM_ a (M.toList diffmap)
-- Rename old files to temp, or delete.
seekdiffmap $ \(ek, (moldf, mnewf)) -> do
case (moldf, mnewf) of
(Just oldf, Just _newf) ->
let deleteoldf = \ek oldf -> commandAction $
startUnexport' r db oldf ek
seekdiffmap $ \case
(ek, (oldf:oldfs, _newf:_)) -> do
commandAction $
startMoveToTempName r db oldf ek
(Just oldf, Nothing) ->
startUnexport' r db oldf ek
_ -> stop
forM_ oldfs (deleteoldf ek)
(ek, (oldfs, [])) ->
forM_ oldfs (deleteoldf ek)
(_ek, ([], _)) -> noop
waitForAllRunningCommandActions
-- Rename from temp to new files.
seekdiffmap $ \(ek, (moldf, mnewf)) ->
case (moldf, mnewf) of
(Just _oldf, Just newf) ->
startMoveFromTempName r db ek newf
_ -> stop
seekdiffmap $ \case
(ek, (_oldf:_, newf:_)) -> commandAction $
startMoveFromTempName r db ek newf
_ -> noop
waitForAllRunningCommandActions
ts -> do
warning "Resolving export conflict.."
@ -204,7 +205,7 @@ changeExport r db (ExportFiltered new) = do
void $ liftIO cleanup
-- Map of old and new filenames for each changed Key in a diff.
type DiffMap = M.Map Key (Maybe TopFilePath, Maybe TopFilePath)
type DiffMap = M.Map Key ([TopFilePath], [TopFilePath])
mkDiffMap :: Git.Ref -> Git.Ref -> ExportHandle -> Annex DiffMap
mkDiffMap old new db = do
@ -213,14 +214,14 @@ mkDiffMap old new db = do
void $ liftIO cleanup
return diffmap
where
combinedm (srca, dsta) (srcb, dstb) = (srca <|> srcb, dsta <|> dstb)
combinedm (srca, dsta) (srcb, dstb) = (srca ++ srcb, dsta ++ dstb)
mkdm i = do
srcek <- getek (Git.DiffTree.srcsha i)
dstek <- getek (Git.DiffTree.dstsha i)
updateExportTree' db srcek dstek i
return $ catMaybes
[ (, (Just (Git.DiffTree.file i), Nothing)) <$> srcek
, (, (Nothing, Just (Git.DiffTree.file i))) <$> dstek
[ (, ([Git.DiffTree.file i], [])) <$> srcek
, (, ([], [Git.DiffTree.file i])) <$> dstek
]
getek sha
| sha `elem` nullShas = return Nothing

View file

@ -395,3 +395,5 @@ ok
Thank you for all your work making this wonderful tool!
[[!meta title="export tree bug when two files with the same content both should be removed"]]
> [[fixed|done]] --[[Joey]]