Fix bug importing from a special remote into a subdirectory more than one level deep

Which generated unusual git trees that could confuse git merge,
since they incorrectly had 2 subtrees with the same name.

Root of the bug was a) not testing that at all! but also
b) confusing graftdirs, which contains eg "foo/bar" with
non-recursively read trees, which would contain eg "bar"
when reading a subtree of "foo".

It's worth noting that Annex.Import uses graftTree, but it really
shouldn't have needed to. Eg, when importing into foo/bar from a remote,
it's enough to generate a tree of foo/bar/x, foo/bar/y, and does not
include other files that are at the top of the master branch. It uses
graftTree, so it does include the other files, as well as the foo/bar
tree. git merge will do the same thing for both trees. With that said,
switching it away from graftTree would result in another import
generating a new commit that seems to delete files that were there in a
previous commit, so it probably has to keep using graftTree since it
used it before.

This commit was sponsored by Kevin Mueller on Patreon.
This commit is contained in:
Joey Hess 2021-03-26 16:01:55 -04:00
parent 4a387eda54
commit 4611813ef1
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
4 changed files with 29 additions and 13 deletions

View file

@ -28,6 +28,9 @@ git-annex (8.20210311) UNRELEASED; urgency=medium
* Improved display of errors when accessing a git http remote fails. * Improved display of errors when accessing a git http remote fails.
* borg: Fix a bug that prevented importing keys of type URL and WORM. * borg: Fix a bug that prevented importing keys of type URL and WORM.
* borg: Support importing files that are hard linked in the borg backup. * borg: Support importing files that are hard linked in the borg backup.
* Fix bug importing from a special remote into a subdirectory more than
one level deep, which generated unusual git trees that could confuse
git merge.
-- Joey Hess <id@joeyh.name> Fri, 12 Mar 2021 12:06:37 -0400 -- Joey Hess <id@joeyh.name> Fri, 12 Mar 2021 12:06:37 -0400

View file

@ -306,43 +306,43 @@ graftTree'
-> Repo -> Repo
-> MkTreeHandle -> MkTreeHandle
-> IO Sha -> IO Sha
graftTree' subtree graftloc basetree repo hdl = go basetree graftdirs graftTree' subtree graftloc basetree repo hdl = go basetree subdirs graftdirs
where where
go tsha (topmostgraphdir:restgraphdirs) = do go tsha (subdir:restsubdirs) (topmostgraphdir:restgraphdirs) = do
Tree t <- getTree LsTree.LsTreeNonRecursive tsha repo Tree t <- getTree LsTree.LsTreeNonRecursive tsha repo
t' <- case partition isabovegraft t of let abovegraftpoint i = gitPath i == gitPath subdir
t' <- case partition abovegraftpoint t of
-- the graft point is not already in the tree,
-- so graft it in, keeping the existing tree
-- content
([], _) -> do ([], _) -> do
graft <- graftin (topmostgraphdir:restgraphdirs) graft <- graftin (topmostgraphdir:restgraphdirs)
return (graft:t) return (graft:t)
-- normally there can only be one matching item
-- in the tree, but it's theoretically possible
-- for a git tree to have multiple items with the
-- same name, so process them all
(matching, rest) -> do (matching, rest) -> do
newshas <- forM matching $ \case newshas <- forM matching $ \case
RecordedSubTree tloc tsha' _ RecordedSubTree tloc tsha' _
| null restgraphdirs -> return $ | null restgraphdirs -> return $
RecordedSubTree tloc subtree [] RecordedSubTree tloc subtree []
| otherwise -> do | otherwise -> do
tsha'' <- go tsha' restgraphdirs tsha'' <- go tsha' restsubdirs restgraphdirs
return $ RecordedSubTree tloc tsha'' [] return $ RecordedSubTree tloc tsha'' []
_ -> graftin (topmostgraphdir:restgraphdirs) _ -> graftin (topmostgraphdir:restgraphdirs)
return (newshas ++ rest) return (newshas ++ rest)
mkTree hdl t' mkTree hdl t'
go _ [] = return subtree go _ _ [] = return subtree
isabovegraft i = beneathSubTree i graftloc || gitPath i == gitPath graftloc
graftin t = recordSubTree hdl $ graftin' t graftin t = recordSubTree hdl $ graftin' t
graftin' [] = RecordedSubTree graftloc subtree [] graftin' [] = RecordedSubTree graftloc subtree []
graftin' (d:rest) graftin' (d:rest)
| d == graftloc = graftin' [] | d == graftloc = graftin' []
| otherwise = NewSubTree d [graftin' rest] | otherwise = NewSubTree d [graftin' rest]
subdirs = splitDirectories $ gitPath graftloc
-- For a graftloc of "foo/bar/baz", this generates -- For a graftloc of "foo/bar/baz", this generates
-- ["foo", "foo/bar", "foo/bar/baz"] -- ["foo", "foo/bar", "foo/bar/baz"]
graftdirs = map (asTopFilePath . toInternalGitPath . encodeBS) $ graftdirs = map (asTopFilePath . toInternalGitPath . encodeBS) $
mkpaths [] $ splitDirectories $ gitPath graftloc mkpaths [] subdirs
mkpaths _ [] = [] mkpaths _ [] = []
mkpaths base (d:rest) = (joinPath base </> d) : mkpaths (base ++ [d]) rest mkpaths base (d:rest) = (joinPath base </> d) : mkpaths (base ++ [d]) rest

View file

@ -100,3 +100,5 @@ supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8 local repository version: 8
``` ```
> [[fixed|done]], thanks for an excellent test case. --[[Joey]]

View file

@ -0,0 +1,11 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2021-03-26T18:46:19Z"
content="""
Note that level1/level2 is sufficient for git-annex to generate the bad
tree. I also saw git merge fail at that depth.
I've fixed this, at least the test case given and some other similar ones
I thought of to try.
"""]]