In a situation where there are two repos that are diverged and each pushes
in turn to git-remote-annex, the first to push updates it. Then the second
push fails because it is not a fast-forward. The problem is, before git
push fails with "non-fast-forward", it actually calls git-remote-annex
with push.
So, to the user it appears as if the push failed, but it actually reached
the remote, and overwrote the other push!
The only solution to this seems to be for git-remote-annex push to notice
when a non-force-push would overwrite a ref stored in the remote, and
refuse to push that ref, returning an error to git. This seems strange,
why would git make remote helpers implement that when it later checks the
same thing itself?
With this fix, it's still possible for a race to overwrite a change to
the MANIFEST and lose work that was pushed from the other repo. But that
needs two pushes to be running at the same time. From the user's
perspective, that situation is the same as if one repo pushed new work,
then the other repo did a git push --force, overwriting the first repo's
push. In the first repo, another push will then fail as a non
fast-forward, and the user can recover as usual. But, a MANIFEST
overwrite will leave bundle files in the remote that are not listed in
the MANIFEST. It seems likely that git-annex will eventually be able to
detect that after the fact and clean it up. Eg, it can learn all bundles
that are stored in the remote using the location log, and compare them
to the MANIFEST to find bundles that got lost.
The race can also appear to the user as if they pushed a ref, but then
it got deleted from the remote. This happens when two two pushes are
pushing different ref names. This might be harder for the user to
notice; git fetch does not indicate that a remote ref got deleted.
They would have to use git fetch --prune to notice the deletion.
Once the user does notice, they can re-push their ref to recover.
Sponsored-by: Jack Hill on Patreon
This is a shell script, so not final code, and it does not use git-annex
at all, but it shows how to push to git bundles, listed in a MANIFEST,
the same as the git-remote-annex program will eventually do.
While developing this, I realized that the design needed to be changed
slightly regarding where refs are stored. Since a push can delete a ref
from a remote, storing each newly pushed ref in a bundle won't work,
because deleting a ref would then entail deleting all old bundles and
re-uploading from scratch. So instead, only the refs in the last bundle
listed in the MANIFEST are the active refs. Any refs in prior bundles
are just old refs that were stored previously (a reflog as it were).
That means that, in a situation where two different people are pushing
to the same special remote from different repos, whoever pushes last
wins. Any refs pushed by the other person earlier will be ignored. This
may not be desirable, and git-annex might be able use the git-annex
branch to detect such situations and rescue the refs that got lost. Even
without such a recovery process though, the refs that the other person
thought they pushed will be preserved in their refs/namespaces/mine, so
a pull followed by a push will generally resolve the situation.
Note that the use of refs/namespaces/mine in the bundle is not really
desirable, and it might be worth making a local clone of the repo in
order to set up the refs that will be put in the bundle. Which seems to
be the only way to avoid needing that. But it does need to maintain
the refs/namespaces/mine/ in the git repo in order to remember what refs
have been pushed to the remote before, in order to include them in the
next bundle pushed. A name that includes the remote uuid will be needed
in the final implementation.
Anyway, this shell script seems to fully work, including incremental
pushing, force pushing, and pushes that delete refs.
Sponsored-by: Brett Eisenberg on Patreon
Added rclone special remote, which can be used without needing to install
the git-annex-remote-rclone program. This needs a new version of rclone,
which supports "rclone gitannex".
This is implemented as a variant of an external special remote, that
runs "rclone gitannex" instead of the usual git-annex-remote- command.
Parameterized Remote.External to support that.
Sponsored-by: Luke T. Shumaker on Patreon