diff --git a/CHANGELOG b/CHANGELOG index e0407b3919..deaa942d35 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -12,6 +12,10 @@ git-annex (10.20250631) UNRELEASED; urgency=medium branch in a linked worktree on a filesystem not supporting symlinks. * Add --url option and url= preferred content expression, to match content that is recorded as present in an url. + * Improved workaround for git 2.50 bug, avoding an occasional test suite + failure, as well as some situations where an unlocked file did not get + populated when adding another file to the repository with the same + content. -- Joey Hess Mon, 07 Jul 2025 15:59:42 -0400 diff --git a/Database/Keys.hs b/Database/Keys.hs index 98a1db9053..d3fce7bbd8 100644 --- a/Database/Keys.hs +++ b/Database/Keys.hs @@ -260,7 +260,7 @@ isInodeKnown i s = or <$> runReaderIO ContentTable - is an associated file. -} reconcileStaged :: Bool -> H.DbQueue -> Annex DbTablesChanged -reconcileStaged dbisnew qh = ifM notneeded +reconcileStaged dbisnew qh = ifM isBareRepo ( return mempty , do gitindex <- inRepo currentIndexFile @@ -299,12 +299,12 @@ reconcileStaged dbisnew qh = ifM notneeded inRepo $ update' lastindexref newtree fastDebug "Database.Keys" "reconcileStaged end" return (DbTablesChanged True True) - -- git write-tree will fail if the index is locked or when there is - -- a merge conflict. To get up-to-date with the current index, - -- diff --staged with the old index tree. The current index tree - -- is not known, so not recorded, and the inode cache is not updated, - -- so the next time git-annex runs, it will diff again, even - -- if the index is unchanged. + -- Was not able to run git write-tree, or it failed due to the + -- index being locked or a merge conflict. To get up-to-date with + -- the current index, diff --staged with the old index tree. The + -- current index tree is not known, so not recorded, and the inode + -- cache is not updated, so the next time git-annex runs, it will + -- diff again, even if the index is unchanged. -- -- When there is a merge conflict, that will not see the new local -- version of the files that are conflicted. So a second diff @@ -327,21 +327,22 @@ reconcileStaged dbisnew qh = ifM notneeded processor l False `finally` void cleanup - -- Avoid running smudge clean filter, which would block trying to - -- access the locked database. git write-tree sometimes calls it, - -- even though it is not adding work tree files to the index, - -- and so the filter cannot have an effect on the contents of the - -- index or on the tree that gets written from it. - getindextree = inRepo $ \r -> writeTreeQuiet $ r - { gitGlobalOpts = gitGlobalOpts r ++ bypassSmudgeConfig } - - notneeded = isBareRepo - -- Avoid doing anything when run by the - -- smudge clean filter. When that happens in a conflicted - -- merge situation, running git write-tree - -- here would cause git merge to fail with an internal - -- error. This works around around that bug in git. - <||> Annex.getState Annex.insmudgecleanfilter + -- This avoids running git write-tree when run by the smudge clean + -- filter, in order to work around a bug in git. That causes + -- git merge to fail with an internal error when git write-tree is + -- run by the smudge clean filter in conflicted merge situation. + -- + -- When running git write-tree, avoid it running the smudge clean + -- filter, which would block trying to access the locked database. + -- git write-tree sometimes calls it, even though it is not adding + -- work tree files to the index, and so the filter cannot have an + -- effect on the contents of the index or on the tree that gets + -- written from it. + getindextree = ifM (Annex.getState Annex.insmudgecleanfilter) + ( return Nothing + , inRepo $ \r -> writeTreeQuiet $ r + { gitGlobalOpts = gitGlobalOpts r ++ bypassSmudgeConfig } + ) diff old new = -- Avoid running smudge clean filter, since we want the diff --git a/doc/bugs/flaky_test_failure_add_dup.mdwn b/doc/bugs/flaky_test_failure_add_dup.mdwn new file mode 100644 index 0000000000..f685a9861a --- /dev/null +++ b/doc/bugs/flaky_test_failure_add_dup.mdwn @@ -0,0 +1,74 @@ + Repo Tests v10 unlocked + [...] + add dup: FAIL (0.25s) + ./Test/Framework.hs:393: + checkcontent foo + expected: "annexed file content" + but got: "/annex/objects/SHA256E-s20--e394a389d787383843decc5d3d99b6d184ffa5fddeec23b911f9ee7fc8b9ea77\n" + Use -p '/add dup/' to rerun this test only. + +I am able to produce this failure after about a minute of running the +test in a loop with: + + while git-annex test -p '/add dup/' ; do :;done + +Inside the test repo, file "foo" indeed is an unpopulated pointer file, +despite the file "foodup", which has the same git-annex key, being populated. + +Reverting [[!commit fb155b1e3e59cc1f9cf8a4fe7d47cba49d1c81af]] avoids +this test suite failure. (Or at least if it is flaky, it's much mess likely to +fail. I ran the loop for 10 minutes.) + +--- + +What the test suite is doing is a `git add` and is using the smudge filter +to add the new file as an unlocked annexed file. + +I have reproduced the problem doing the same outside the test suite. + +---- + +It seems that the keys database does not always get updated to indicate the +key used by file "foo". As shown here looking at the keys db in a failed +testcase dir: + + sqlite> select * from associated; + 1|SHA256E-s20--e394a389d787383843decc5d3d99b6d184ffa5fddeec23b911f9ee7fc8b9ea77|foodup + sqlite> + +That happens in a fresh clone of the repository. So, `reconcileStaged` never +gets a chance to do anything, because it's only ever called from inside the +smudge filter. And [[!commit fb155b1e3e59cc1f9cf8a4fe7d47cba49d1c81af]] made it +not run in the smudge filter. + +This particular failure could be avoided if `git-annex init` called +`reconcileStaged`. Then it would learn about pointer files in the tree. + +But would that be a complete fix for all situations? If the user is +only running git-annex via `git add` (the smudge clean filter), +but is making other changes to the tree too, I don't think +it would. Consider for example: + + git clone r r2 + cd r2 + git-annex fsck + git mv foo bar + git config annex.largefiles anything + echo hi > baz + git add baz + +In the above example, the `git-annex fsck` updates the associated files, +so it know that the file foo has the key. But then foo is renamed to bar +and when `git add` is run on a file, generating the same key, +`reconcileStaged` does not update the associated files, so it does not +know about the rename to baz. So it leaves bar unpopulated. + +Conclusion: `reconcileStaged` needs to run even in the smudge clean filter. +But to avoid the git bug worked around by +[[!commit fb155b1e3e59cc1f9cf8a4fe7d47cba49d1c81af]], it must avoid +running `git write-tree` when called in smudge clean, at least when +in a conflicted merge situation. Luckily, `reconcileStaged` does contain +code to update things when it is unable to run `git write-tree`, so that +only needs to be used when in the smudge clean filter. + +> [[fixed|done]] and confirmed the fix works with git 2.50. --[[Joey]]