make import tree from remote generate a merge commit

This way no history is lost, neither what was exported to the remote,
nor the history of changes that is imported from it. No complicated
correlation of two possibly very different histories is needed; just
record what we know, and git merge will do a good job.

Also, it notices when the remote tracking branch doesn't need to be updated,
and avoids doing anything, so noop remotes are super cheap.
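
A minimal sketch of that check, following the steps described in the
documentation change in this commit. It is written in Python for brevity
and is not the git-annex implementation; Commit, needs_update and
import_from_remote are hypothetical names, and commits are modelled as
plain values rather than real git objects.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """Toy stand-in for a git commit: a tree id plus its parent commits."""
    tree: str
    parents: Tuple["Commit", ...] = ()


def needs_update(tracking: Optional[Commit], imported: Commit) -> bool:
    """Return False when the remote tracking branch already records the import."""
    if tracking is None:
        return True
    # The tracking branch is exactly the imported history: nothing to do.
    if tracking == imported:
        return False
    # The tracking branch is already a merge whose tree is the top imported
    # tree and one of whose parents is the full imported history.
    if (len(tracking.parents) > 1
            and tracking.tree == imported.tree
            and imported in tracking.parents):
        return False
    return True


def import_from_remote(tracking: Optional[Commit],
                       exported: Commit,
                       imported: Commit) -> Optional[Commit]:
    """Return the new value of the remote tracking branch."""
    if not needs_update(tracking, imported):
        return tracking  # noop import, no new commit is made
    # Merge the previously exported commit with the imported history;
    # the merge's tree is simply the current imported tree.
    return Commit(tree=imported.tree, parents=(exported, imported))
```

Running import_from_remote a second time with its own result as the
tracking branch returns that same value, which is the cheap noop case.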

The only catch here is that, since the commits generated for imports
from the remote don't have a stable date or author/committer, each
(non-noop) import generates different commits for the same imported
trees. So, when the imported remote tracking branch is merged into master
and then a change is imported again, there will be an extra series of
commits, which will get more and more expensive each time.

This seems to call for making stable commits for imports. That also
seems like a good idea because it would make importing in several
repositories have the same result.
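
To illustrate why the date and author matter: a git commit id is a SHA-1
over the commit object, which includes the author and committer lines.
The sketch below hashes a commit object by hand, in Python. commit_id is
a hypothetical helper, the identity and message are made up, and the
timezone is fixed at +0000 for simplicity.

```python
import hashlib

# Id of git's well-known empty tree, used here only as a placeholder tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parents, ident, timestamp, message):
    """Compute the SHA-1 id git would assign to a commit object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    who = f"{ident} {timestamp} +0000"
    lines += [f"author {who}", f"committer {who}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    # Git hashes the object with a "commit <size>\0" header prepended.
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


ident = "importer <importer@example.com>"  # made-up identity
first = commit_id(EMPTY_TREE, [], ident, 1556655201, "import from remote")
later = commit_id(EMPTY_TREE, [], ident, 1556655999, "import from remote")
fixed = commit_id(EMPTY_TREE, [], ident, 1556655201, "import from remote")

print(first != later)  # same tree, different date: different commit
print(first == fixed)  # same tree, same metadata: same commit
```

With fixed metadata, the same imported tree always hashes to the same
commit, so repeated imports and imports made in other repositories
would converge on the same history.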
Joey Hess 2019-04-30 16:13:21 -04:00
parent b69d11ec42
commit 1503b86a14
3 changed files with 71 additions and 106 deletions


@@ -42,23 +42,26 @@ and `git annex sync --content` can be configured to use it.
 while the S3 history is
 [[newname, foo, bar], [foo, bar], [oldname, foo, bar], etc]
-(Both of the "etc" values are the same.
+(Both of the "etc" values are the same.)
 While perhaps a heuristic to detect renames could be added to the History
-comparison, maybe it would be better to make it notice that these
-are the same after a certian point, and so preserve the divergence,
-but with a less ugly history.
+comparison, better would be to generate a merge between the git commit
+that was exported to the remote before, and the imported history from the
+remote. The merge just needs to have as its tree the current imported
+tree.
+This way whatever happened on the remote as a consequence of
+exports and other changes is preserved in the git history in full.
-Ie, rather than creating a history like:
+When creating such a merge, first check if the old value of the remote
+tracking branch matches the imported history. If so, nothing to do.
-[[newname, foo, bar], [foo, bar], [oldname, foo, bar], etc, [newname, foo, bar], [oldname, foo, bar], etc]
+Next, check if the old value of the remote tracking branch is a merge,
+and its tree matches the top of the imported history, and one
+of its parents matches the full imported history. If so, nothing to do
+because that is what we want to generate.
-Create:
-[[newname, foo, bar], [oldname, foo, bar], etc]
-and then upon merge of that s3/master, there will be two lines of
-devlopment that branch out after "etc" and rejoin at the top.
+Otherwise, commit the imported history and generate a merge commit.
 * S3 buckets can be set up to allow reads and listing by an anonymous user.
 That should allow importing from such a bucket, but the S3 remote