A couple of times I have seen `git-annex sync -C .` upload files to the
remote that are not part of the remote's preferred content. In the most
recent case, I had moved the file to another directory while the sync was
downloading an earlier file. I suspect that removing the file from its
original location causes the preferred content check to go wrong. --[[Joey]]

> Reproduced reliably as follows: Have a bigfile in the remote
> and a smallfile in the local repo. Have the remote's preferred content
> be "not (copies=1)". Have the local repo's preferred content
> be `include=*`. Run `git-annex sync -C.` and, while that's running, `git rm
> smallfile`. (bigfile has to be big enough to give time to run that
> command.)
>
> smallfile gets sent to the remote unexpectedly. If it's not deleted
> first, that does not happen.
>
> --[[Joey]]
> > Hmm, so limitCopies uses checkKey, which for MatchingFile, uses
> > lookupKey. And with a deleted file, lookupKey falls
> > into a case where it uses catKeyFile, but since the file has been
> > removed from the index, that also fails. And when it fails,
> > that means it assumes it does not have 1 copy, and so the
> > "not (copies=1)" evaluates to true, so it thinks it's matched as
> > preferred content.
> >
> > The preferred content is being checked via wantSend, which already knows
> > the key in this case.
> >
> > It knows the key already because sync uses seekFilteredKeys and so it's
> > already streamed the file through and looked up the key before
> > it's deleted. If the file got deleted before that could look up the
> > key, it would skip it. It may be that recent changes to add this
> > streaming for performance led to this bug.
> >
> > So one fix might be to change it to use MatchingKey,
> > and so avoid the later lookup? Investigating the git history
> > and the code I see no reason not to do this. It didn't used to be that
> > MatchingKey included an AssociatedFile, which is probably why it was
> > not used in this case originally.
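
The analysis above can be illustrated with a toy model. This is only a sketch in Python, not git-annex's Haskell code: `Repo`, `limit_copies`, and `negate` are hypothetical names that loosely mirror lookupKey, limitCopies, and the "not" in the preferred content expression, and the key name `KEY-small` is made up. It assumes, as described above, that a failed key lookup is treated as "does not have 1 copy".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MatchingFile:
    """Match input that only carries a path; the key is looked up later."""
    path: str


@dataclass(frozen=True)
class MatchingKey:
    """Match input that already carries the key and its associated file."""
    key: str
    associated_file: Optional[str] = None


class Repo:
    """Hypothetical stand-in for the local repository state."""

    def __init__(self):
        self.index = {}   # path -> key, models the git index
        self.copies = {}  # key -> number of known copies

    def lookup_key(self, path):
        # Models the lookup failing once the file is gone from the index.
        return self.index.get(path)


def limit_copies(repo, want):
    """Build a matcher for copies=N (N or more known copies)."""
    def check(item):
        if isinstance(item, MatchingKey):
            key = item.key
        else:
            key = repo.lookup_key(item.path)
        if key is None:
            # Lookup failed: assumed to mean "does not have N copies".
            return False
        return repo.copies.get(key, 0) >= want
    return check


def negate(matcher):
    """Model of "not (...)" in a preferred content expression."""
    return lambda item: not matcher(item)


repo = Repo()
repo.index["smallfile"] = "KEY-small"
repo.copies["KEY-small"] = 1
remote_wants = negate(limit_copies(repo, 1))  # "not (copies=1)"

key = repo.lookup_key("smallfile")  # sync already looked up the key
del repo.index["smallfile"]         # `git rm smallfile` during the sync

sent_by_file = remote_wants(MatchingFile("smallfile"))
sent_by_key = remote_wants(MatchingKey(key, "smallfile"))
print(sent_by_file, sent_by_key)  # prints "True False"
```

In this model, matching by file after the deletion wrongly reports the file as wanted by the remote, while matching by the key that was already known gives the expected answer.
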
[[fixed|done]]