fix reversion in skipping deleted files

And add a test case for that.

This certainly loses some of the 2x performance improvement in file
seeking that seekFilteredKeys led to, because now it has to stat the
worktree files again. I have not benchmarked it, but I expect there will
still be a sizable improvement, and the git-annex branch precaching that
seekFilteredKeys can do will still be a win of its approach.
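
The extra cost described above amounts to one lstat per listed file. A minimal sketch of such a check follows, assuming the `unix` package is available. The names `existsInWorktree` and `skipDeleted` are hypothetical, and `try` stands in for git-annex's own `catchMaybeIO` helper; this is an illustration, not the code from the commit.

```haskell
import Control.Exception (IOException, try)
import Control.Monad (filterM)
import System.Posix.Files (FileStatus, getSymbolicLinkStatus)

-- | True when the path itself is present in the worktree.
-- getSymbolicLinkStatus is lstat, so a symlink counts as present
-- even when its target is missing.
existsInWorktree :: FilePath -> IO Bool
existsInWorktree p = do
    r <- try (getSymbolicLinkStatus p) :: IO (Either IOException FileStatus)
    return (either (const False) (const True) r)

-- | Drop listed paths that are no longer present, keeping their order.
-- (Hypothetical helper; the commit does this inline while processing.)
skipDeleted :: [FilePath] -> IO [FilePath]
skipDeleted = filterM existsInWorktree
```

Using lstat rather than stat matters for symlinked annexed files whose content is not present locally: the link dangles, but the file has not been deleted, so it should still be processed.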

Also worth noting that lookupKey, when the file does not exist, checks if
it's in an adjusted branch with hidden files, and if so, finds the key for
the file anyway. That was intended to make git-annex sync --content be
able to process those files, but a side effect was that, when a file was
deleted but the deletion not yet staged, git-annex commands used to
still list it. That was actually a bug. This commit fixes that bug too.
(git-annex sync --content on such a branch does not use seekFilteredKeys,
so was not affected by the reversion or by this behavior change.)

This commit was sponsored by Jake Vosloo on Patreon.
Joey Hess 2020-07-19 20:33:10 -04:00
parent 5dbb2924bb
commit 889603336a
GPG key ID: DB12DB0FF05F8F38
4 changed files with 29 additions and 9 deletions


@@ -36,6 +36,8 @@ git-annex (8.20200618) UNRELEASED; urgency=medium
   * fsck: Detect if WORM keys contain a carriage return, and recommend
     upgrading the key. (git-annex could have maybe created such keys back
     in 2013).
+  * When on an adjust --hide-missing branch, fix handling of files that
+    have been deleted but the deletion is not yet staged.
 
  -- Joey Hess <id@joeyh.name>  Thu, 18 Jun 2020 12:21:14 -0400


@@ -335,14 +335,15 @@ seekFilteredKeys seeker listfs = do
     process matcher ofeeder mdfeeder mdcloser seenpointer ((f, sha, mode):rest) =
         case Git.toTreeItemType mode of
             Just Git.TreeSymlink -> do
-                -- Once a pointer file has been seen,
-                -- symlinks have to be sent via the
-                -- metadata processor too. That is slightly
-                -- slower, but preserves the requested
-                -- file order.
-                if seenpointer
-                    then liftIO $ mdfeeder (f, sha)
-                    else feedmatches matcher ofeeder f sha
+                whenM (exists f) $
+                    -- Once a pointer file has been seen,
+                    -- symlinks have to be sent via the
+                    -- metadata processor too. That is slightly
+                    -- slower, but preserves the requested
+                    -- file order.
+                    if seenpointer
+                        then liftIO $ mdfeeder (f, sha)
+                        else feedmatches matcher ofeeder f sha
                 process matcher ofeeder mdfeeder mdcloser seenpointer rest
             Just Git.TreeSubmodule ->
                 process matcher ofeeder mdfeeder mdcloser seenpointer rest
@@ -350,12 +351,17 @@ seekFilteredKeys seeker listfs = do
             -- file in git, possibly large. Avoid catting
             -- large files by first looking up the size.
             Just _ -> do
-                liftIO $ mdfeeder (f, sha)
+                whenM (exists f) $
+                    liftIO $ mdfeeder (f, sha)
                 process matcher ofeeder mdfeeder mdcloser True rest
             Nothing ->
                 process matcher ofeeder mdfeeder mdcloser seenpointer rest
     process _ _ _ mdcloser _ [] = liftIO $ void mdcloser
 
+    -- Check if files exist, because a deleted file will still be
+    -- listed by ls-tree, but should not be processed.
+    exists p = isJust <$> liftIO (catchMaybeIO $ R.getSymbolicLinkStatus p)
+
     mdprocess matcher mdreader ofeeder ocloser = liftIO mdreader >>= \case
         Just (f, Just (sha, size, _type))
             | size < maxPointerSz -> do
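
The comments in the hunks above describe an ordering rule: once a possible pointer file has been seen, symlinks must also go through the metadata processor so that the requested file order is preserved. The following pure model is a rough sketch of that rule only. The types `ItemType` and `Route` and the function `route` are hypothetical, and the sketch leaves out the existence check, the matcher, and the unrecognised-mode case.

```haskell
-- Hypothetical, simplified model of the dispatch done by `process`.
data ItemType = Symlink | Submodule | RegularFile
    deriving (Eq, Show)

data Route = Direct FilePath | ViaMetadata FilePath
    deriving (Eq, Show)

-- | Walk the listing in order. Regular files (which might be pointer
-- files) always go via the metadata processor. Symlinks go direct until
-- the first regular file has been seen; after that they are routed via
-- the metadata processor too, so output order matches input order.
-- Submodules are skipped.
route :: [(FilePath, ItemType)] -> [Route]
route = go False
  where
    go _ [] = []
    go seen ((f, t) : rest) = case t of
        Symlink
            | seen      -> ViaMetadata f : go seen rest
            | otherwise -> Direct f : go seen rest
        Submodule   -> go seen rest
        RegularFile -> ViaMetadata f : go True rest
```

In this model a symlink listed before any regular file takes the faster direct path, while one listed after takes the slower path, which is the trade-off the source comment mentions.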

Test.hs

@@ -292,6 +292,7 @@ unitTests :: String -> TestTree
 unitTests note = testGroup ("Unit Tests " ++ note)
     [ testCase "add dup" test_add_dup
     , testCase "add extras" test_add_extras
+    , testCase "ignore deleted files" test_ignore_deleted_files
     , testCase "metadata" test_metadata
     , testCase "export_import" test_export_import
     , testCase "export_import_subdir" test_export_import_subdir
@@ -403,6 +404,15 @@ test_add_extras = intmpclonerepo $ do
     annexed_present wormannexedfile
     checkbackend wormannexedfile backendWORM
 
+test_ignore_deleted_files :: Assertion
+test_ignore_deleted_files = intmpclonerepo $ do
+    git_annex "get" [annexedfile] @? "get failed"
+    git_annex_expectoutput "find" [] [annexedfile]
+    nukeFile annexedfile
+    -- A file that has been deleted, but the deletion not staged,
+    -- is a special case; make sure git-annex skips these.
+    git_annex_expectoutput "find" [] []
+
 test_metadata :: Assertion
 test_metadata = intmpclonerepo $ do
     git_annex "metadata" ["-s", "foo=bar", annexedfile] @? "set metadata"


@@ -13,3 +13,5 @@ why did it break? --[[Joey]]
 > Also, there should be a test case for this, IIRC there was one past case
 > of a bug involving this and it's a bit of an easy edge case to forget
 > about. --[[Joey]]
+
+[[done]]