I hope to support importtree=yes eventually, but it does not currently
work.
Added remote.<name>.allow-encrypted-gitrepo that needs to be set to
allow using it with encrypted git repos.
Note that even encryption=pubkey uses a cipher stored in the git repo
to encrypt the keys stored in the remote. While it would be possible to
not encrypt the GITBUNDLE and GITMANIFEST keys, and then allow using
encryption=pubkey, it doesn't currently work, and that would be a
complication that I doubt is worth it.
Updating the remote list needs the config to be written to the git-annex
branch, which was deliberately not done, for good reasons. While it would
be possible to instead use Remote.List.remoteGen without writing to the
branch, I
already have a plan to discard git-annex branch writes made by
git-remote-annex, so the simplest fix is to write the config to the
branch.
Sponsored-by: k0ld on Patreon
Put the annex objects in .git/annex/objects/ inside the export remote.
This way, when importing from the remote, they will be filtered out.
Note that, when importtree=yes, content identifiers are used, and this
means that pushing to a remote updates the git-annex branch. Urk.
Will need to try to prevent that later, but I already had a todo about
that for other reasons.
Untested!
Sponsored-by: Brock Spratlen on Patreon
Otherwise, it can be confusing to clone from a wrong url, since it fails
to download a manifest and so appears as if the remote exists but is empty.
Sponsored-by: Jack Hill on Patreon
This will eventually be used to recover from an interrupted fullPush
and drop the old bundle keys it was unable to delete.
Sponsored-by: Luke T. Shumaker on Patreon
Update its todo with remaining items.
Add changelog entry.
Simplified internals document to no longer be notes to myself, but
target users who want to understand how the data is stored
and might want to extract these repos manually.
Sponsored-by: Kevin Mueller on Patreon
Making GITBUNDLE be in the backend list allows those keys to be
hashed to verify them, both when git-remote-annex downloads them, and by
other transfers and by git-annex fsck.
GITMANIFEST is not in the backend list, because those keys will never be
stored in .git/annex/objects and can't be verified in any case.
This does mean that git-annex version will include GITBUNDLE in the list
of backends.
Also documented these in backends.mdwn
Sponsored-by: Kevin Mueller on Patreon
Not quite there yet.
Also, changed the format of GITBUNDLE keys to use only one '-'
after the UUID. A sha256 does not contain that character, so can just
split at the last one.
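The split at the last '-' can be sketched in Python as follows. This is
only an illustration, not the Haskell code in git-annex. The function
name split_bundle_key_name and the sample value are hypothetical, and the
input is assumed to be just the "<UUID>-<sha256>" part of a key name, with
anything that precedes the UUID already removed.

```python
from typing import Tuple


def split_bundle_key_name(name: str) -> Tuple[str, str]:
    """Split '<UUID>-<sha256>' into (uuid, sha256).

    A UUID contains '-' characters, but a hex sha256 does not, so the
    last '-' in the name is always the separator between the two.
    """
    uuid, sep, sha256 = name.rpartition("-")
    if not sep or not uuid or not sha256:
        raise ValueError(f"not a UUID-sha256 pair: {name!r}")
    return uuid, sha256


# Hypothetical sample: a UUID followed by a placeholder 64 hex digit hash.
example = "358ff77e-0bc3-11ef-bc49-872e6695c0e3-" + "ab" * 32
uuid, sha256 = split_bundle_key_name(example)
```

Splitting from the right means the dashes inside the UUID never need to
be counted or escaped.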
Amusingly, the sha256 will probably not actually be verified. A git
bundle contains its own checksums that git uses to verify it. And if
someone wanted to replace the content of a GITBUNDLE object, they
could just edit the manifest to use a new one whose sha256 does verify.
Sponsored-by: Nicholas Golder-Manning
Changed the format of the url to use annex: rather than annex::
The reason is that, in the future, it might be useful to support an url
that includes an uriAuthority part, eg:
annex://foo@example.com:42/358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/
To parse that foo@example.com:42 as an uriAuthority it needs to start with
annex: rather than annex::
That would also need something to be done with uriAuthority, and also
the uriPath (the UUID) is prefixed with "/" in that example. So the
parser won't handle that example currently. But this leaves the
possibility for expansion.
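To illustrate why the prefix matters, here is a small sketch using
Python's urllib.parse as a stand-in for the Haskell URI parser that
git-annex uses. The two libraries may differ in details, so this only
shows the general behaviour: an authority is recognised when "//"
directly follows the scheme's colon, which "annex::" prevents.

```python
from urllib.parse import parse_qs, urlsplit

url = ("annex://foo@example.com:42/358ff77e-0bc3-11ef-bc49-872e6695c0e3"
       "?type=directory&encryption=none&directory=/mnt/foo/")

parts = urlsplit(url)
authority = parts.netloc          # the user, host and port part
uuid = parts.path.lstrip("/")     # the path is prefixed with "/"
params = parse_qs(parts.query)    # each parameter maps to a list of values

# With the old "annex::" prefix, the text after the scheme starts with
# ":" rather than "//", so no authority is parsed out of it.
old_style = urlsplit(url.replace("annex:", "annex::", 1))
```

In this sketch, authority comes out as "foo@example.com:42", while
old_style.netloc is empty and the whole remainder lands in the path.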
Sponsored-by: Joshua Antonishen on Patreon
The UUID is included in the GITMANIFEST in order to allow a single
key/value store to be used to store several special remotes, without any
namespacing. In that situation though, if the same ref is pushed to two
special remotes, it will result in git bundles with the same content.
Which is ok, until a re-push happens to one of the special remotes.
At that point, the old git bundle will be deleted. That will prevent
fetching it from the other special remote, where the re-push has not
happened.
Adding the UUID avoids this problem.
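The failure mode can be shown with a toy simulation in Python. All of it
is hypothetical: a dict stands in for the shared key/value store, the
key names are simplified to either "<sha256>" or "<UUID>-<sha256>", and
"uuid-A", "uuid-B", "hash1" and "hash2" are placeholder values.

```python
def key_name(sha256, uuid=None):
    """Simplified bundle key name, optionally made unique per remote."""
    return sha256 if uuid is None else f"{uuid}-{sha256}"


def other_remote_survives_repush(use_uuid):
    store = {}      # shared key/value store: key name -> bundle content
    manifests = {}  # special remote UUID -> list of bundle key names

    # The same ref is pushed to both special remotes, so both bundles
    # have identical content and therefore the same hash.
    for remote in ("uuid-A", "uuid-B"):
        key = key_name("hash1", remote if use_uuid else None)
        store[key] = b"bundle with the first push"
        manifests[remote] = [key]

    # A re-push to remote A stores a new bundle and deletes the old one.
    old = manifests["uuid-A"][0]
    new = key_name("hash2", "uuid-A" if use_uuid else None)
    store[new] = b"bundle with the re-push"
    manifests["uuid-A"] = [new]
    del store[old]

    # Can remote B still fetch every bundle its manifest lists?
    return all(key in store for key in manifests["uuid-B"])
```

Here other_remote_survives_repush(False) returns False, because both
remotes shared one key, while other_remote_survives_repush(True) returns
True.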
And document remote.<name>.git-remote-annex-max-bundles which will
configure it.
datalad-annex uses a similar url format, but with some enhancements.
See https://github.com/datalad/datalad-next/blob/main/datalad_next/gitremotes/datalad_annex.py
I added the UUID to the URL, because it is needed in order to pick out which
manifest file to use. The design allows for a single key/value store to have
several special remotes all stored in it, and so the manifest includes
the UUID in its name.
While datalad-annex allows datalad-annex::<url>?, and allows referencing
pieces of the url in the parameters, needing the UUID prevents
git-remote-annex from supporting that syntax. And anyway, it is a
complication and I want to keep things simple for now.
Sponsored-by: unqueued on Patreon
Added to git-annex_proxies todo because this is something OpenNeuro
would need in order to use the git-annex proxy.
Sponsored-by: Dartmouth College's OpenNeuro project
Rather than requiring the last listed bundle in the manifest include all
refs that are in the remote, build up refs from each bundle listed in
the manifest.
This fixes a bug where pushing first a new branch foo from one clone,
and then pushing a new branch bar from another clone, caused the second
push to lose branch foo. Now the second push will add a new bundle, but
the foo ref in the bundle from the first push will still be used.
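A minimal sketch of that idea in Python, not the actual implementation:
each bundle is modelled as a dict from ref name to a placeholder commit
id, and the manifest as a list of such dicts in the order listed. The
function name refs_from_manifest is hypothetical, and letting a later
bundle override an earlier one for the same ref is an assumption made
for the sketch.

```python
def refs_from_manifest(bundles):
    """Build up the remote's refs from every bundle, in manifest order."""
    refs = {}
    for bundle_refs in bundles:
        # Assumption: a later bundle wins when it has the same ref.
        refs.update(bundle_refs)
    return refs


# The first clone pushed branch foo, then a second clone pushed bar.
manifest = [
    {"refs/heads/foo": "1111"},
    {"refs/heads/bar": "2222"},
]

refs = refs_from_manifest(manifest)  # contains both foo and bar
last_only = manifest[-1]             # old behaviour: foo is lost
```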
Pushing a deletion of a ref now has to delete all bundles and push a new
bundle with only the remaining refs in it.
In a "list for-push", it now has to unbundle all bundles, in order for a
deletion repush to have available all objects. (And a non-deletion push
can also rely on refs/namespaces/mine/ being up-to-date.)
It would have been possible to fix the bug by only making it do that
unbundling in "list for-push", without changing what's stored in the
bundles. But I think I prefer to populate the bundles this way. For one
thing, deleting a pushed ref now really deletes all data relating to it,
rather than leaving it present in old bundles. For another, it's easier
to explain since there is no special case for the last bundle. And, it
will often result in smaller bundles.
Note that further efficiency gains are possible with respect to what
objects are included in an incremental bundle. Two XXX comments
document how to reduce excess objects. It didn't seem worth implementing
those optimisations in this proof of concept code.
Sponsored-by: Brock Spratlen on Patreon