git-annex/doc/todo/sync_fast_import.mdwn
Joey Hess 3eaaec3113
consistently use importKey when available
This avoids import with --no-content and with --content potentially
generating two different trees, leading to a merge conflict when run in
two different clones of a repo. And it's necessary groundwork to make
git-annex sync --no-content import from special remotes that support
importKey.

Only the directory special remote currently supports importKey, and it
generates the same key as git-annex usually does, so there is no
behavior change for it.

Future special remotes will need to take care when adding importKey,
if it generates different keys. Added some warnings about that to
comments.

This commit was sponsored by Noam Kremen on Patreon.
2020-09-28 15:27:46 -04:00

git-annex import --no-content from a directory special remote is
implemented, but git-annex sync, when run without --content, does not
operate on import/export special remotes. This is inconsistent, and it
would be useful if it did.

<https://git-annex.branchable.com/todo/importing_from_special_remote_without_downloading/#comment-e3db95e073f01a05b205e26f422f5bc5>
describes a problem with doing that, involving merge conflicts. That
should not actually happen with the directory special remote, because
it generates the same key with importKey as git-annex import would.
But other special remotes that later use this interface might generate a
key using a different hash than usual.

The suggestion there is that there could be a separate config that controls
whether sync does a fast import or an import with content. Then when sync
is run without --content, it can do a fast import. And when run with
--content, it can do a fast import, followed by getting the content.

Or maybe that should just be what it always does, when a remote supports
importKey? (If so, git-annex import should do the same.) Yeah, this seems
better than a config. Look at it like this: The special remote makes pseudo
"commits" when changes are made to it. And maybe it chooses to use a
different kind of key than the local repository would use. The same could
happen when pulling from someone else's repo, if they've configured
git-annex to use a different backend.
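
To make the merge conflict risk concrete, here is a minimal Python sketch of why two clones can end up with different trees for the same remote content. It is only an illustration, not git-annex's code: `key_for` and `tree_for` are hypothetical helpers, the `BACKEND-sSIZE--HASH` key shape is a simplification, and a "tree" is modelled as a plain mapping from path to key.

```python
import hashlib


def key_for(content: bytes, backend: str) -> str:
    """Build a simplified key of the form BACKEND-sSIZE--HASH (hypothetical helper)."""
    digest = hashlib.new(backend.lower(), content).hexdigest()
    return f"{backend}-s{len(content)}--{digest}"


def tree_for(files: dict, backend: str) -> dict:
    """Model an imported tree as a mapping from path to key (an assumption)."""
    return {path: key_for(data, backend) for path, data in files.items()}


# The same file on the remote, imported by two clones.
remote_files = {"photo.jpg": b"same bytes on the remote"}

# Clone A keys the file one way, clone B another way.
tree_a = tree_for(remote_files, "SHA256")
tree_b = tree_for(remote_files, "SHA1")

# Same paths and same content, but the trees differ, so merging the two
# imports would conflict.
print(tree_a == tree_b)
```

When both clones use the same kind of key, the two mappings are identical and there is nothing to conflict over, which is why consistently using importKey when it is available avoids the problem.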

> There could be future transition problems here. If a remote does not
> support importKey, and imports are done from it, and then in a new
> version, it does support importKey, there would be the same risk of
> conflicts.
>
> Could be solved by the remote's code indicating if
> importKey is safe to use by default. If a remote started off implementing
> only imports w/o importKey, and then added importKey, and importKey
> generates different keys than the keys generated by hashing downloaded
> content, then the remote could say not to use importKey by default.
> (Or more likely, only the directory remote will be able to support
> importKey by default.)
>
> Problem: When annex.largefiles matches on file content, importKey
> cannot be used. So should sync --content not use importKey in that
> case, risking generating a different tree? Or should it fail, even
> though importing with content is possible?
>
> > Well, different annex.largefiles settings in different clones
> > can already risk generating a different tree on import. So,
> > the former option seems preferable.
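
The choices discussed above can be summarised as a small decision function. This is a hedged Python sketch, not git-annex's implementation: `Remote`, `import_key_safe_by_default` and `import_plan` are hypothetical names, and the "skip" result for cases where neither importKey nor a content download is available is an assumption, since the discussion above only settles the --content case.

```python
from dataclasses import dataclass


@dataclass
class Remote:
    supports_import_key: bool
    # Hypothetical flag: the remote's code says whether importKey yields
    # the same keys that hashing downloaded content would.
    import_key_safe_by_default: bool = True


def import_plan(remote: Remote, largefiles_matches_content: bool,
                want_content: bool) -> str:
    """Pick how a sync would import from a remote (illustrative only)."""
    can_fast_import = (
        remote.supports_import_key
        and remote.import_key_safe_by_default
        # annex.largefiles matching on content needs the content itself.
        and not largefiles_matches_content
    )
    if can_fast_import:
        # Always fast import; with --content, get the content afterwards.
        return "importKey, then get content" if want_content else "importKey"
    if want_content:
        # Preferred option above: import with content, accepting that the
        # tree may differ from one generated with importKey.
        return "import with content"
    # Assumption: without content and without importKey, nothing is imported.
    return "skip"


directory = Remote(supports_import_key=True)
print(import_plan(directory, largefiles_matches_content=False, want_content=False))
print(import_plan(directory, largefiles_matches_content=True, want_content=True))
```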

---

See also, [[todo/import_--no-content_largefiles_conflict]]

> [[done]] --[[Joey]]