git-annex/doc/bugs/case_where_keys_db_lags_reality.mdwn
Joey Hess 0101363eb8
correctly update keys db in merge conflict
This is quite a subtle edge case, see the bug report for full details.

The second git diff is needed only when there's a merge conflict.
It would be possible to speed it up marginally by using
--diff-filter=Unmerged, but probably not enough to bother with.

Sponsored-by: Graham Spencer on Patreon
2021-06-07 12:52:36 -04:00

55 lines
2.9 KiB
Markdown

Found a case where the associated files in the keys db end up out-of-date.
Make a repo with an locked file, clone it to a second repo, and set up a
conflict involving that file in both repos, using git-annex add to add the
conflicting version, committing, and not running other git-annex commands
after that, before pulling the conflicting branch. When the associated
files db gets updated in the conflict situation, only 1 key has the
conflicting file associated with it, rather than 2 or 3.
The original key before the conflict has the file associated with it, but
the new local key and new remote key do not.
The result is that a drop of another file that uses the same key may not
honor the preferred content of the file that is in conflict.
Once the conflict is resolved, git-annex will recover, the problem only
occurs while there's an unmerged conflict, and only when git-annex did not
get a change to notice the local modification before the conflict happened.
This only affected locked files, because when an unlocked file is staged,
git-annex updates the keys db. So, one solution to this bug will be for
git-annex to also update the keys db when staging locked files.
(Unfortunately this would make mass adds somewhat slower.)
Or, possibly, for reconcileStaged to not use git diff --cached in this case,
but git diff with -1 and -3. That lets both sides of the merge conflict be
accessed, and it could then add the file to both keys. As well as not
slowing down git-annex add, this would let it honor the preferred content
of the conflicting file for all 3 keys. --[[Joey]]
> On second thought, it's not really necessary that all 3 keys have the
> conflicted file associated with them. The original key doesn't because
> the user has already changed the file to use the new key. The new remote
> key does not really need to, and there might not even be any effect if it
> did. The new local key is the one that this bug is really about.
>
> Consider that checkDrop uses catKeyFile to double-check the associated
> files. And that will see the file pointing to the new local key. So
> if the original key or new remote key are also associated with the file,
> it will ignore them and drop anyway. And that's ok, from the user's
> perspective the one it needs to retain is the one that the file in the
> working tree uses, which is the new local key.
>
> > Hmm, -1 and -3 are not what's needed to get the new local key.
> > It's using `git diff oldtree --cached`, and the code preserves the old
> > key when it sees a merge conflict. Using instead
> > `git diff HEAD --cached` has the new key as the src sha, and nullsha as
> > the dst sha.
> >
> > However, the diff with the old tree is needed to incrementally
> > update when it's not in the middle of a merge conflict.
> > So what can be done is do the diff as now; when it sees a merge
> > conflict, run diff a second time with `HEAD --cached` to get the new
> > key.
> >
> > > [[done]] --[[Joey]]