avoid interrupted push leaving remote without a manifest

Added a backup manifest key, which is used if the main manifest key is
not present. When uploading a new manifest, a key is only ever dropped
while the other key is present, so the remote always has a manifest.

It's entirely possible for the two manifest keys to get out of sync, due
to races. The main one wins when it's present, but dropping the main one
can expose the backup one, which may have a different push recorded.
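
The behaviour described above can be sketched roughly as follows. This is
only an illustration in Python, not the code from this commit: the remote
is modelled as a plain dict, and the names `MAIN`, `BACKUP`,
`upload_manifest`, `fetch_manifest`, `InterruptedPush` and the
`interrupt_after` parameter are all hypothetical. The exact step order is
also an assumption; the only property taken from the message is that a key
is never dropped unless the other key is present.

```python
# Hypothetical model: a remote is a dict mapping key name -> manifest content.
MAIN = "GITMANIFEST--UUID"   # stands in for GITMANIFEST--$UUID
BACKUP = MAIN + ".bak"       # stands in for GITMANIFEST--$UUID.bak


class InterruptedPush(Exception):
    """Simulates a push that dies part way through."""


def upload_manifest(remote, manifest, interrupt_after=None):
    """Replace both manifest keys, dropping a key only while the other exists.

    interrupt_after=n raises InterruptedPush once n steps have completed
    (each drop and each upload counts as one step).
    """
    done = 0

    def step():
        nonlocal done
        done += 1
        if done == interrupt_after:
            raise InterruptedPush(f"interrupted after step {done}")

    # Start with the main key when a backup exists to cover for it;
    # otherwise create the backup first (assumed ordering).
    order = [MAIN, BACKUP] if BACKUP in remote else [BACKUP, MAIN]
    for key in order:
        other = BACKUP if key == MAIN else MAIN
        if key in remote:
            assert other in remote, "would leave the remote without a manifest"
            del remote[key]
            step()
        remote[key] = manifest
        step()


def fetch_manifest(remote):
    """The main key wins when present; otherwise fall back to the backup."""
    if MAIN in remote:
        return remote[MAIN]
    return remote.get(BACKUP)


if __name__ == "__main__":
    remote = {MAIN: "push1", BACKUP: "push1"}
    # First push dies after re-uploading the main key: the keys are out of sync.
    try:
        upload_manifest(remote, "push2", interrupt_after=2)
    except InterruptedPush:
        pass
    print(fetch_manifest(remote))
    # Second push dies right after dropping the main key: the backup is exposed.
    try:
        upload_manifest(remote, "push3", interrupt_after=1)
    except InterruptedPush:
        pass
    print(fetch_manifest(remote))
```

In this model a fetch always finds some manifest, whichever step the push
is interrupted at. The second half of the demo shows the out-of-sync case:
while the main key is missing, a fetch sees the older push that is still
recorded in the backup key.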
Joey Hess 2024-05-20 15:41:09 -04:00
parent 594ca2fd3a
commit 3a38520aac
GPG key ID: DB12DB0FF05F8F38
4 changed files with 67 additions and 67 deletions


@@ -4,9 +4,10 @@ repository to a special remote, and later cloning from it.
This adds two new key types to git-annex, GITMANIFEST and a GITBUNDLE.
GITMANIFEST--$UUID is the manifest for a git repository stored in the
-git-annex repository with that UUID.
+git-annex repository with that UUID. When that is not present,
+GITMANIFEST--$UUID.bak is a backup copy that can be used instead.
-GITBUNDLE--$UUID-sha256 is a git bundle.
+GITBUNDLE--$UUID-$sha256 is a git bundle.
# format of the manifest file
@@ -23,11 +24,10 @@ and are in the process of being deleted.
In an exporttree=yes remote, the GITMANIFEST and GITBUNDLE objects are
stored in the remote, under the `.git/annex/objects/` path.
-# multiple GITMANIFEST files
+# multiple special remotes in the same place
-Usually there will only be one per special remote, but it's possible for
-multiple special remotes to point to the same object storage, and if so
-multiple GITMANIFEST objects can be stored.
+It's possible for multiple special remotes to point to the same
+object storage.
This is why the UUID of the special remote is included in the GITMANIFEST
key, and in the annex:: uri.


@@ -47,26 +47,6 @@ This is implemented and working. Remaining todo list for it:
(with or without exporttree=yes). This is because the ContentIdentifier
db is not populated. It should be possible to work around this.
-* See XXX in uploadManifest about recovering from a situation
-where the remote is left with a deleted manifest when a push
-is interrupted part way through.
-This should be recoverable
-by caching the manifest locally and re-uploading it when
-the remote has no manifest or prompting the user to merge and re-push.
-But, this leaves the remote unusable for fetching until that is dealt
-with.
-Or, could have two identical manifest files, A and B. When pushing, first
-delete and upload A. Then delete and upload B. When fetching, if A does
-not exist, use B instead. However, allows for races and interruptions
-that cause A and B to be out of sync, with one push in A and another in B.
-Once out of sync, in the window where a push has deleted but not
-re-uploaded A yet, B will have a different content. So a fetch at that
-point will see something that was pushed by a push that otherwise had
-lost a push race.
* It would be nice if git-annex could generate an annex:: url
for a special remote and show it to the user, eg when
they have set the shorthand "annex::" url, so they know the full url.