formatLsTree did not use a tab where git does

Fixed that, and made parserLsTree accept a space as well as a tab.

Fixes a reversion that made import of a tree from a special remote result in
a merge that deleted files that were not preferred content of that special
remote.
Joey Hess 2021-01-28 12:36:37 -04:00
parent 3b8fcefb45
commit e3224ff77d
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 74 additions and 9 deletions


@@ -1,3 +1,11 @@
+git-annex (8.20210128) UNRELEASED; urgency=medium
+  * Fix a reversion that made import of a tree from a special remote
+    result in a merge that deleted files that were not preferred content
+    of that special remote.
+ -- Joey Hess <id@joeyh.name> Thu, 28 Jan 2021 12:34:32 -0400
 git-annex (8.20210127) upstream; urgency=medium
   * Added mincopies configuration. This is like numcopies, but is


@@ -101,6 +101,10 @@ parseLsTreeStrict b = go (AS.parse parserLsTree b)
 {- Parses a line of ls-tree output, in format:
  - mode SP type SP sha TAB file
  -
+ - The TAB can also be a space. Git does not use that, but an earlier
+ - version of formatLsTree did, and this keeps parsing what it output
+ - working.
+ -
  - (The --long format is not currently supported.) -}
 parserLsTree :: A.Parser TreeItem
 parserLsTree = TreeItem
@@ -111,8 +115,8 @@ parserLsTree = TreeItem
 	<*> A8.takeTill (== ' ')
 	<* A8.char ' '
 	-- sha
-	<*> (Ref <$> A8.takeTill (== '\t'))
-	<* A8.char '\t'
+	<*> (Ref <$> A8.takeTill A8.isSpace)
+	<* A8.space
 	-- file
 	<*> (asTopFilePath . Git.Filename.decode <$> A.takeByteString)
@@ -122,5 +126,4 @@ formatLsTree ti = unwords
 	[ showOct (mode ti) ""
 	, decodeBS (typeobj ti)
 	, fromRef (sha ti)
-	, fromRawFilePath (getTopFilePath (file ti))
-	]
+	] ++ ('\t' : fromRawFilePath (getTopFilePath (file ti)))


@@ -197,10 +197,12 @@ logExportExcluded u a = do
 getExportExcluded :: UUID -> Annex [Git.Tree.TreeItem]
 getExportExcluded u = do
 	logf <- fromRepo $ gitAnnexExportExcludeLog u
-	liftIO $ catchDefaultIO [] $ parser
+	liftIO $ catchDefaultIO [] $ exportExcludedParser
 		<$> L.readFile (fromRawFilePath logf)
   where
-	parser = map Git.Tree.lsTreeItemToTreeItem
+exportExcludedParser :: L.ByteString -> [Git.Tree.TreeItem]
+exportExcludedParser = map Git.Tree.lsTreeItemToTreeItem
 	. rights
 	. map Git.LsTree.parseLsTree
 	. L.split (fromIntegral $ ord '\n')


@@ -41,3 +41,5 @@ Debian Buster
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 I use git-annex for all kinds of stuff. I love it!
+> [[fixed|done]] --[[Joey]]


@@ -0,0 +1,50 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2021-01-28T15:52:03Z"
content="""
Note that there's no data loss here; you can still check
out branches with the deleted files, or revert the merge.

Seems like importtree should add to the imported tree all files that
were in the export but were not preferred content.

Hmm, in Annex.Import.addBackExportExcluded, it tries to do just that. The
implementation uses a log file in .git/annex/export.ex that lists the
previously excluded files. There must be a bug in that.

I can easily reproduce this bug:

    joey@darkstar:/tmp/bench2>mkdir d
    joey@darkstar:/tmp/bench2>git init r
    joey@darkstar:/tmp/bench2>cd r
    joey@darkstar:/tmp/bench2/r>git annex init
    joey@darkstar:/tmp/bench2/r>git annex initremote d type=directory directory=../d exporttree=yes importtree=yes encryption=none
    joey@darkstar:/tmp/bench2/r>git annex wanted d 'exclude=*.mp3'
    joey@darkstar:/tmp/bench2/r>date > foo.bar
    joey@darkstar:/tmp/bench2/r>date > foo.mp3
    joey@darkstar:/tmp/bench2/r>git annex add
    joey@darkstar:/tmp/bench2/r>git commit -m add
    joey@darkstar:/tmp/bench2/r>git annex export master --to d
    export d foo.bar ok
    joey@darkstar:/tmp/bench2/r>git annex import master --from d
    list d ok
    update refs/remotes/d/master ok
    (recording state in git...)
    joey@darkstar:/tmp/bench2/r>git merge d/master
    Updating f818c13..b1a0434
    Fast-forward
     foo.mp3 | 1 -
     1 file changed, 1 deletion(-)
     delete mode 120000 foo.mp3
    joey@darkstar:/tmp/bench2/r>cat .git/annex/export.ex/72c8d14c-af03-408c-845d-cac418d49e61
    120000 blob 64c3e1f1f81026cb8ab5a6593d4120a1d73044c3 foo.mp3

So it seems adding back the exported file from the log is where the bug lies.
And specifically, it seems when it tries to read this log, it silently fails
to parse it, and so adds nothing back.

Aha! The parser is expecting a tab in the git ls-tree-like log, but it's
written with a space instead. It used to work, but the parser got rewritten
for speed and was changed to only accept tab, not both space and tab.
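
The failure mode described above can be sketched without git-annex itself. This is only an illustration using `String` and `base`, not the attoparsec parser from the commit; `Item`, `formatOld`, `formatNew`, `parseWith`, `strict`, `lenient`, `readLog` and `example` are all hypothetical names made up for it. It assumes the log is read the way `exportExcludedParser` is shaped, with unparseable lines dropped by `rights`.

```haskell
import Data.Either (rights)

-- Hypothetical stand-in for a tree item; all fields are plain Strings.
data Item = Item
  { itemMode :: String
  , itemType :: String
  , itemSha  :: String
  , itemFile :: String
  } deriving (Eq, Show)

-- Old formatting: every field joined with a space, as unwords does.
formatOld :: Item -> String
formatOld (Item m t s f) = unwords [m, t, s, f]

-- Fixed formatting: a tab before the file name, as git ls-tree emits.
formatNew :: Item -> String
formatNew (Item m t s f) = unwords [m, t, s] ++ ('\t' : f)

-- Parse "mode SP type SP sha SEP file"; isSep decides which characters
-- may separate the sha from the file name.
parseWith :: (Char -> Bool) -> String -> Either String Item
parseWith isSep l0 = do
  (m, l1) <- field (== ' ') l0
  (t, l2) <- field (== ' ') l1
  (s, f)  <- field isSep l2
  Right (Item m t s f)
  where
    -- Take a non-empty field up to the separator, then skip the separator.
    field p str = case break p str of
      (x, _ : rest) | not (null x) -> Right (x, rest)
      _ -> Left ("cannot parse: " ++ str)

-- Tab only (the behavior that broke) versus tab or space (the fix).
strict, lenient :: String -> Either String Item
strict  = parseWith (== '\t')
lenient = parseWith (\c -> c == '\t' || c == ' ')

-- Lines that fail to parse are silently dropped by rights.
readLog :: (String -> Either String Item) -> String -> [Item]
readLog p = rights . map p . lines

example :: Item
example = Item "120000" "blob" "64c3e1f1f81026cb8ab5a6593d4120a1d73044c3" "foo.mp3"
```

With these definitions, `readLog strict (formatOld example)` evaluates to `[]`, which is the "adds nothing back" case, while `readLog lenient` recovers the item from both the old space-separated and the new tab-separated form.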
"""]]