Revert "smudge: check for known annexed inodes before checking annex.largefiles"
This reverts commit 424bef6b6f
.
This commit caused other buggy behavior unfortunately.
This commit is contained in:
parent
56ccc0302e
commit
72a8bbce12
4 changed files with 65 additions and 30 deletions
|
@ -2,9 +2,6 @@ git-annex (8.20210429) UNRELEASED; urgency=medium
|
||||||
|
|
||||||
* fromkey: Create an unlocked file when used in an adjusted branch
|
* fromkey: Create an unlocked file when used in an adjusted branch
|
||||||
where the file should be unlocked, or when configured by annex.addunlocked.
|
where the file should be unlocked, or when configured by annex.addunlocked.
|
||||||
* smudge: Fix a case where an unlocked annexed file that annex.largefiles
|
|
||||||
does not match could get its unchanged content checked into git,
|
|
||||||
due to git running the smudge filter unecessarily.
|
|
||||||
* Fix behavior of several commands, including reinject, addurl, and rmurl
|
* Fix behavior of several commands, including reinject, addurl, and rmurl
|
||||||
when given an absolute path to an unlocked file, or a relative path
|
when given an absolute path to an unlocked file, or a relative path
|
||||||
that leaves and re-enters the repository.
|
that leaves and re-enters the repository.
|
||||||
|
|
|
@ -168,25 +168,24 @@ clean file = do
|
||||||
filepath <- liftIO $ absPath file
|
filepath <- liftIO $ absPath file
|
||||||
return $ not $ dirContains repopath filepath
|
return $ not $ dirContains repopath filepath
|
||||||
|
|
||||||
-- If annex.largefiles is configured (and not disabled by annex.gitaddtoannex
|
-- If annex.largefiles is configured, matching files are added to the
|
||||||
-- being set to false), matching files are added to the annex and the rest to
|
-- annex. But annex.gitaddtoannex can be set to false to disable that.
|
||||||
-- git.
|
|
||||||
--
|
--
|
||||||
-- When annex.largefiles is not configured, files are normally not
|
-- When annex.largefiles is not configured, files are normally not
|
||||||
-- added to the annex, so will be added to git. However, if the file
|
-- added to the annex, so will be added to git. But some heuristics
|
||||||
-- is annexed in the index, keep it annexed. This prevents accidental
|
-- are used to avoid bad behavior:
|
||||||
-- conversions when previously annexed files get modified and added.
|
|
||||||
--
|
--
|
||||||
-- In either case, if the file's inode is the same as one that was used
|
-- If the file is annexed in the index, keep it annexed.
|
||||||
-- for annexed content before, annex it. And if the file is not annexed
|
-- This prevents accidental conversions.
|
||||||
-- in the index, and has the same content, leave it in git.
|
--
|
||||||
-- This handles cases such as renaming a file followed by git add,
|
-- Otherwise, when the file's inode is the same as one that was used for
|
||||||
-- which the user naturally expects to behave the same as git mv.
|
-- annexed content before, annex it. This handles cases such as renaming an
|
||||||
|
-- unlocked annexed file followed by git add, which the user naturally
|
||||||
|
-- expects to behave the same as git mv.
|
||||||
shouldAnnex :: RawFilePath -> Maybe (Sha, FileSize, ObjectType) -> Maybe Key -> Annex Bool
|
shouldAnnex :: RawFilePath -> Maybe (Sha, FileSize, ObjectType) -> Maybe Key -> Annex Bool
|
||||||
shouldAnnex file indexmeta moldkey = do
|
shouldAnnex file indexmeta moldkey = ifM (annexGitAddToAnnex <$> Annex.getGitConfig)
|
||||||
ifM (annexGitAddToAnnex <$> Annex.getGitConfig)
|
( checkunchangedgitfile $ checkmatcher checkheuristics
|
||||||
( checkunchanged $ checkmatcher checkwasannexed
|
, checkunchangedgitfile checkheuristics
|
||||||
, checkunchanged checkwasannexed
|
|
||||||
)
|
)
|
||||||
where
|
where
|
||||||
checkmatcher d
|
checkmatcher d
|
||||||
|
@ -200,21 +199,14 @@ shouldAnnex file indexmeta moldkey = do
|
||||||
matcher <- largeFilesMatcher
|
matcher <- largeFilesMatcher
|
||||||
checkFileMatcher' matcher file d
|
checkFileMatcher' matcher file d
|
||||||
|
|
||||||
checkwasannexed = pure $ isJust moldkey
|
checkheuristics = case moldkey of
|
||||||
|
Just _ -> return True
|
||||||
|
Nothing -> checkknowninode
|
||||||
|
|
||||||
isknownannexedinode = withTSDelta (liftIO . genInodeCache file) >>= \case
|
checkknowninode = withTSDelta (liftIO . genInodeCache file) >>= \case
|
||||||
Nothing -> pure False
|
Nothing -> pure False
|
||||||
Just ic -> Database.Keys.isInodeKnown ic =<< sentinalStatus
|
Just ic -> Database.Keys.isInodeKnown ic =<< sentinalStatus
|
||||||
|
|
||||||
-- If the inode matches one known used for annexed content,
|
|
||||||
-- keep the file annexed. This handles a case where the file
|
|
||||||
-- has been annexed before, and the git is running the clean filter
|
|
||||||
-- again on it for whatever reason.
|
|
||||||
checkunchanged cont = ifM isknownannexedinode
|
|
||||||
( return True
|
|
||||||
, checkunchangedgitfile cont
|
|
||||||
)
|
|
||||||
|
|
||||||
-- This checks for a case where the file had been added to git
|
-- This checks for a case where the file had been added to git
|
||||||
-- previously, not to the annex before, and its content is not
|
-- previously, not to the annex before, and its content is not
|
||||||
-- changed, but git is running the clean filter again on it
|
-- changed, but git is running the clean filter again on it
|
||||||
|
|
|
@ -115,3 +115,5 @@ Thanks for taking a look.
|
||||||
|
|
||||||
[[!meta author=kyle]]
|
[[!meta author=kyle]]
|
||||||
[[!tag projects/datalad]]
|
[[!tag projects/datalad]]
|
||||||
|
|
||||||
|
> commit reverted; [[done]] --[[Joey]]
|
||||||
|
|
|
@ -0,0 +1,44 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""comment 1"""
|
||||||
|
date="2021-05-10T16:04:07Z"
|
||||||
|
content="""
|
||||||
|
On the second `git annex add foo`, I see:
|
||||||
|
|
||||||
|
add foo (non-large file; adding content to git repository) ok
|
||||||
|
|
||||||
|
Which seems right as far as the largefiles config goes.
|
||||||
|
|
||||||
|
That in turn runs `git add`, which runs the smudge filter. Due to my change
|
||||||
|
in [[!commit 424bef6b6]], the smudge filter then sees an inode it already
|
||||||
|
knows (because the file was unlocked), and so it avoids adding the file to
|
||||||
|
git, and leaves it annexed.
|
||||||
|
|
||||||
|
I don't quite understand how the different ways of running git commit play
|
||||||
|
into this, or how the file ends up, after the commit, to having a
|
||||||
|
non-annexed file recorded in the index. I guess the smudge filter must
|
||||||
|
end up being run again, and perhaps the [[!commit 424bef6b6]] change
|
||||||
|
doesn't work then, but anyway the behavior before that point is already a
|
||||||
|
problem.
|
||||||
|
|
||||||
|
Another instance of I think the same problem is following the
|
||||||
|
[[tips/largefiles]] recipe to convert an annexed file to git:
|
||||||
|
|
||||||
|
git init repo; cd repo; git annex init
|
||||||
|
echo foo > file
|
||||||
|
git annex add file
|
||||||
|
git commit -m add
|
||||||
|
git annex unlock file
|
||||||
|
git rm --cached file
|
||||||
|
git annex add --force-small file
|
||||||
|
git commit -m convert file
|
||||||
|
|
||||||
|
This results in a commit that unlocked the file but leaves
|
||||||
|
it annexed, and after the commit, git diff --cached shows that the index
|
||||||
|
has staged a conversion from unlocked to stored in git. So very similar,
|
||||||
|
and clearly [[!commit 424bef6b6]] cannot be allowed to cause that breakage,
|
||||||
|
which does not even involve largefiles!
|
||||||
|
|
||||||
|
So I've reverted the commit for now, and we can discuss next steps in the
|
||||||
|
forum thread.
|
||||||
|
"""]]
|
Loading…
Add table
Add a link
Reference in a new issue