8b56d6b283
In a situation where two repos have diverged and each pushes in turn to git-remote-annex, the first to push updates it. Then the second push fails because it is not a fast-forward.

The problem is that, before git push fails with "non-fast-forward", it actually calls git-remote-annex with push. So, to the user it appears as if the push failed, but it actually reached the remote, and overwrote the other push!

The only solution to this seems to be for git-remote-annex push to notice when a non-force-push would overwrite a ref stored in the remote, and refuse to push that ref, returning an error to git. This seems strange: why would git make remote helpers implement that when it later checks the same thing itself?

With this fix, it's still possible for a race to overwrite a change to the MANIFEST and lose work that was pushed from the other repo. But that needs two pushes to be running at the same time. From the user's perspective, that situation is the same as if one repo pushed new work, then the other repo did a git push --force, overwriting the first repo's push. In the first repo, another push will then fail as a non-fast-forward, and the user can recover as usual.

But a MANIFEST overwrite will leave bundle files in the remote that are not listed in the MANIFEST. It seems likely that git-annex will eventually be able to detect that after the fact and clean it up. E.g., it can learn all bundles that are stored in the remote using the location log, and compare them to the MANIFEST to find bundles that got lost.

The race can also appear to the user as if they pushed a ref, but then it got deleted from the remote. This happens when two pushes are pushing different ref names. This might be harder for the user to notice; git fetch does not indicate that a remote ref got deleted. They would have to use git fetch --prune to notice the deletion. Once the user does notice, they can re-push their ref to recover.

Sponsored-by: Jack Hill on Patreon
This adds two new object types to git-annex, GITMANIFEST and GITBUNDLE.

GITMANIFEST--$UUID is the manifest for a git repository stored in the
git-annex repository with that UUID.

GITBUNDLE--sha256 is a git bundle.

# format of the manifest file

An ordered list of bundle keys, one per line.

The last bundle in the list provides all refs that are currently stored in
the repository. The bundles before it in the list can incrementally provide
objects, but not refs.

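A minimal sketch of reading such a manifest, in Python rather than git-annex's own Haskell. The function names are made up for illustration, and skipping blank lines is an assumption; the format only says one key per line.

```python
from typing import List, Optional


def parse_manifest(text: str) -> List[str]:
    """Return the bundle keys from a manifest, in the order listed."""
    # Assumption: blank lines and surrounding whitespace are ignored.
    return [line.strip() for line in text.splitlines() if line.strip()]


def refs_bundle(keys: List[str]) -> Optional[str]:
    """The last listed bundle is the one that provides the refs."""
    return keys[-1] if keys else None


# Placeholder keys; real ones would end in a sha256 hash.
example = "GITBUNDLE--aaaa\nGITBUNDLE--bbbb\n"
keys = parse_manifest(example)
print(keys)
print(refs_bundle(keys))
```
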
# fetching

1. download GITMANIFEST for the uuid of the special remote
2. download each listed GITBUNDLE object that we don't have
3. `git bundle unbundle` each bundle in order
4. `git fetch` from the last bundle listed in the manifest

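Steps 2 through 4 come down to a little bookkeeping over the manifest. A sketch of just that part, assuming the set of bundle keys already present locally is known; `plan_fetch` is a hypothetical helper, and the git invocations themselves are left out.

```python
from typing import Iterable, List, Optional, Tuple


def plan_fetch(manifest_keys: List[str],
               have: Iterable[str]) -> Tuple[List[str], Optional[str]]:
    """Decide what to download and which bundle to fetch refs from.

    Returns (keys to download, in manifest order; key of the bundle
    that provides the refs, or None if the manifest is empty).
    """
    present = set(have)
    # Manifest order is preserved so that incremental bundles can be
    # applied on top of the objects provided by earlier ones.
    to_download = [k for k in manifest_keys if k not in present]
    refs_from = manifest_keys[-1] if manifest_keys else None
    return to_download, refs_from


# Placeholder keys for illustration only.
manifest = ["GITBUNDLE--aaaa", "GITBUNDLE--bbbb", "GITBUNDLE--cccc"]
print(plan_fetch(manifest, have={"GITBUNDLE--aaaa"}))
```
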
# pushing (incrementally)

1. create git bundle containing all refs that will be stored in the
   repository, and objects since the previously pushed refs
2. hash to calculate GITBUNDLE key
3. upload GITBUNDLE object
4. download current manifest
5. append GITBUNDLE key to manifest

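Steps 2 and 5 can be sketched as follows. The key layout used here (`GITBUNDLE--` followed by the hex sha256 of the bundle) is an assumption read off the `GITBUNDLE--sha256` notation above, and both function names are hypothetical.

```python
import hashlib


def bundle_key(bundle_bytes: bytes) -> str:
    """Derive a GITBUNDLE key from the bundle's content."""
    # Assumed layout: "GITBUNDLE--" + hex sha256 of the bundle file.
    return "GITBUNDLE--" + hashlib.sha256(bundle_bytes).hexdigest()


def append_to_manifest(manifest_text: str, key: str) -> str:
    """Return the manifest text with the new key appended as the last line."""
    keys = [line for line in manifest_text.splitlines() if line.strip()]
    keys.append(key)
    return "\n".join(keys) + "\n"


old_manifest = "GITBUNDLE--aaaa\n"  # placeholder key
new_key = bundle_key(b"stand-in for the bytes of a git bundle file")
print(append_to_manifest(old_manifest, new_key))
```

Because the new key goes last, it becomes the bundle that provides the refs, which matches the manifest format.
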
# pushing (replacing incrementals with single bundle)

1. create git bundle containing all refs stored in the repository, and all
   objects
2. hash to calculate GITBUNDLE object name
3. upload GITBUNDLE object
4. download old manifest
5. upload new manifest listing only the single new GITBUNDLE
6. delete all other GITBUNDLEs that were listed in the old manifest

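The manifest rewrite and the deletion list can be sketched like this. `replace_with_single_bundle` is a hypothetical name, and excluding the new key from the deletion list is a precaution added here, in case the full bundle happens to hash to a key that was already listed.

```python
from typing import List, Tuple


def replace_with_single_bundle(old_keys: List[str],
                               new_key: str) -> Tuple[List[str], List[str]]:
    """Return (new manifest keys, old bundle keys that can be deleted)."""
    new_manifest = [new_key]
    # Never schedule the bundle we just uploaded for deletion.
    to_delete = [k for k in old_keys if k != new_key]
    return new_manifest, to_delete


# Placeholder keys for illustration only.
old = ["GITBUNDLE--aaaa", "GITBUNDLE--bbbb"]
print(replace_with_single_bundle(old, "GITBUNDLE--cccc"))
```

Deleting only after the new manifest is uploaded, as in the list above, means the manifest never points at a bundle that has already been removed.
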
# multiple GITMANIFEST files

Usually there will only be one per special remote, but it's possible for
multiple special remotes to point to the same object storage, and if so
multiple GITMANIFEST objects can be stored.

It follows that the UUID of the special remote has to be included in the
annex:// uri, to know which GITMANIFEST to use when cloning from it.
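
To illustrate why the UUID matters, a small sketch of deriving the manifest key from it. `manifest_key` is a hypothetical helper and the UUIDs are made up; validating the UUID with Python's `uuid` module is an extra added here, not something these notes require.

```python
import uuid


def manifest_key(remote_uuid: str) -> str:
    """Build the GITMANIFEST key for the special remote with this UUID."""
    # uuid.UUID raises ValueError on malformed input; str() gives the
    # canonical lowercase hyphenated form.
    return "GITMANIFEST--" + str(uuid.UUID(remote_uuid))


# Two special remotes sharing one object storage get distinct manifests.
print(manifest_key("11111111-2222-3333-4444-555555555555"))
print(manifest_key("aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"))
```
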