fc37243ffe
Rather than requiring the last listed bundle in the manifest include all refs that are in the remote, build up refs from each bundle listed in the manifest.

This fixes a bug where pushing first a new branch foo from one clone, and then pushing a new branch bar from another clone, caused the second push to lose branch foo. Now the second push will add a new bundle, but the foo ref in the bundle from the first push will still be used.

Pushing a deletion of a ref now has to delete all bundles and push a new bundle with only the remaining refs in it.

In a "list for-push", it now has to unbundle all bundles, in order for a deletion repush to have available all objects. (And a non-deletion push can also rely on refs/namespaces/mine/ being up-to-date.)

It would have been possible to fix the bug by only making it do that unbundling in "list for-push", without changing what's stored in the bundles. But I think I prefer to populate the bundles this way. For one thing, deleting a pushed ref now really deletes all data relating to it, rather than leaving it present in old bundles. For another, it's easier to explain since there is no special case for the last bundle. And, it will often result in smaller bundles.

Note that further efficiency gains are possible with respect to what objects are included in an incremental bundle. Two XXX comments document how to reduce excess objects. It didn't seem worth implementing those optimisations in this proof of concept code.

Sponsored-by: Brock Spratlen on Patreon
52 lines
1.7 KiB
Markdown
This adds two new object types to git-annex, GITMANIFEST and GITBUNDLE.

GITMANIFEST--$UUID is the manifest for a git repository stored in the
git-annex repository with that UUID.

GITBUNDLE--sha256 is a git bundle.

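As a rough illustration of the two object names, here is a minimal Python sketch. The helper names `manifest_key` and `bundle_key` are hypothetical, and writing the SHA-256 as lowercase hex over the whole bundle file is an assumption; the text above only says the name contains a sha256.

```python
import hashlib


def manifest_key(remote_uuid: str) -> str:
    """Name of the manifest object for the repository with this UUID."""
    return f"GITMANIFEST--{remote_uuid}"


def bundle_key(bundle_bytes: bytes) -> str:
    """Name of a bundle object, derived from the SHA-256 of its content.

    Assumption: the digest is written as lowercase hex.
    """
    return f"GITBUNDLE--{hashlib.sha256(bundle_bytes).hexdigest()}"
```

Because the bundle name depends only on the bundle's content, two clones that produce byte-identical bundles would arrive at the same name.
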
# format of the manifest file

An ordered list of bundle keys, one per line.

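A manifest in this format could be read and written roughly as follows. This is only a sketch: `parse_manifest` and `format_manifest` are hypothetical names, and skipping blank lines and ending the file with a newline are assumptions not stated above.

```python
def parse_manifest(text: str) -> list[str]:
    """Return the bundle keys in the order they are listed.

    Assumption: blank lines and surrounding whitespace are ignored.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def format_manifest(keys: list[str]) -> str:
    """Write one bundle key per line, preserving order."""
    return "".join(f"{key}\n" for key in keys)
```

Order matters, since bundles are fetched in the order listed, so the parser returns a list rather than a set.
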
# fetching

1. download GITMANIFEST for the uuid of the special remote
2. download each listed GITBUNDLE object that we don't have
3. `git fetch` from each new bundle in order
   (note that later bundles can update refs from the versions in previous
   bundles)

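The fetch steps can be modelled in a few lines of Python. This is a toy model, not the real implementation: each bundle is represented as a plain dict from ref name to commit id (a real bundle also carries objects), `git fetch` is replaced by a dict update, and the function name `fetch` and its parameters are hypothetical.

```python
Bundle = dict[str, str]  # ref name -> commit id (toy stand-in for a git bundle)


def fetch(manifest: list[str],
          remote_bundles: dict[str, Bundle],
          local_bundles: dict[str, Bundle],
          refs: dict[str, str]) -> list[str]:
    """Apply the bundles we do not have yet, in manifest order.

    `manifest` is the already downloaded list of bundle keys (step 1).
    Returns the keys that were newly downloaded.
    """
    # Step 2: download each listed bundle that we don't have.
    new_keys = [key for key in manifest if key not in local_bundles]
    for key in new_keys:
        local_bundles[key] = remote_bundles[key]

    # Step 3: "fetch" from each new bundle in order, so a later bundle
    # overrides the version of a ref found in an earlier one.
    for key in new_keys:
        refs.update(local_bundles[key])
    return new_keys
```

With a manifest listing a bundle that sets `refs/heads/foo` followed by one that sets it again, the model ends up with the later value, which is the behaviour the note in step 3 describes.
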
# pushing (incrementally)

This is how pushes are usually done.

1. create git bundle of all refs that are being pushed and have changed,
   and objects since the previously pushed refs
2. hash to calculate GITBUNDLE key
3. upload GITBUNDLE object
4. download current manifest
5. append GITBUNDLE key to manifest

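Steps 2 to 5 can be sketched against an in-memory stand-in for the remote's object storage. Everything here is illustrative: `push_incremental` is a hypothetical name, the store is a plain dict, the bundle from step 1 is passed in as bytes, and writing the updated manifest back to the store is an assumption (the list above stops at appending the key).

```python
import hashlib


def push_incremental(store: dict[str, bytes], remote_uuid: str,
                     bundle: bytes) -> str:
    """Upload one incremental bundle and list it last in the manifest."""
    # Step 2: hash the bundle to get its key (lowercase hex is assumed).
    key = "GITBUNDLE--" + hashlib.sha256(bundle).hexdigest()

    # Step 3: upload the bundle object.
    store[key] = bundle

    # Step 4: download the current manifest (empty if none exists yet).
    manifest_name = f"GITMANIFEST--{remote_uuid}"
    keys = store.get(manifest_name, b"").decode().splitlines()

    # Step 5: append the new key; storing the result is an assumption.
    keys.append(key)
    store[manifest_name] = "".join(k + "\n" for k in keys).encode()
    return key
```

In this model, pushing branch foo from one clone and then branch bar from another leaves both bundles listed, so a later fetch still sees foo.
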
# pushing (full)

Note that this can be used to replace incrementals with a single bundle for
performance. It is also the only way to handle a push that deletes a
previously pushed ref.

1. create git bundle containing all refs stored in the repository, and all
   objects
2. hash to calculate GITBUNDLE object name
3. upload GITBUNDLE object
4. download old manifest
5. upload new manifest listing only the single new GITBUNDLE
6. delete all other GITBUNDLEs that were listed in the old manifest

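A full push can be modelled the same way, again with a plain dict standing in for the remote's object storage. `push_full` is a hypothetical name, the bundle of all refs is passed in as bytes, and the lowercase hex digest is an assumption.

```python
import hashlib


def push_full(store: dict[str, bytes], remote_uuid: str,
              bundle: bytes) -> str:
    """Replace everything listed in the manifest with one full bundle."""
    # Hash the bundle to get its object name, then upload it.
    key = "GITBUNDLE--" + hashlib.sha256(bundle).hexdigest()
    store[key] = bundle

    # Download the old manifest to learn which bundles it listed.
    manifest_name = f"GITMANIFEST--{remote_uuid}"
    old_keys = store.get(manifest_name, b"").decode().splitlines()

    # Upload the new manifest listing only the new bundle.
    store[manifest_name] = (key + "\n").encode()

    # Delete all other bundles that the old manifest listed.
    for old_key in old_keys:
        if old_key != key:
            store.pop(old_key, None)
    return key
```

The sketch keeps the order given above: the new manifest is written before the old bundles are deleted. One plausible benefit is that a concurrent fetch never reads a manifest naming a bundle that is already gone, though the text does not state that as the reason.
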
# multiple GITMANIFEST files

Usually there will only be one per special remote, but it's possible for
multiple special remotes to point to the same object storage, and if so
multiple GITMANIFEST objects can be stored.

It follows that the UUID of the special remote has to be included in the
annex:// uri, to know which GITMANIFEST to use when cloning from it.
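
To illustrate why the UUID is needed, here is a small sketch of one object store shared by two special remotes. The function name `manifest_for`, the dict-based store, and the placeholder UUIDs `uuid-A` and `uuid-B` are all hypothetical.

```python
def manifest_for(store: dict[str, str], remote_uuid: str) -> list[str]:
    """Pick the manifest belonging to one special remote by its UUID."""
    return store[f"GITMANIFEST--{remote_uuid}"].splitlines()


# Two special remotes sharing the same object storage (placeholder UUIDs).
shared_store = {
    "GITMANIFEST--uuid-A": "GITBUNDLE--aaa\n",
    "GITMANIFEST--uuid-B": "GITBUNDLE--bbb\nGITBUNDLE--ccc\n",
}
```

Without a UUID, a clone looking at `shared_store` would have no way to choose between the two manifests.
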