24af51e66d
This turns out to only be necessary in edge cases. Most of the time, git-annex unused --from remote doesn't see git-remote-annex keys at all, because it does not record a location log for them. On the other hand, git-annex unused does find them, since it does not rely on the location log. And that's good because they're a local cache that the user should be able to drop. If, however, the user ran git-annex unused and then git-annex move --unused --to remote, the keys would have a location log for that remote. Then git-annex unused --from remote would see them, and would consider them unused, even when they are present on the special remote they belong to. That risks losing data if the user drops the keys from the special remote without expecting that to delete git branches they had pushed to it. So, make git-annex unused --from skip git-remote-annex keys whose uuid is the same as the remote.
git-remote-annex will be a program that allows push/pull/clone of a git
repository to many types of git-annex special remote.

This is a redesign and reimplementation of git-remote-datalad-annex.
It will be a safer implementation, will support incremental pushes, and
will be available to users who don't use datalad.
--[[Joey]]

---

This is implemented and working. Remaining todo list for it:

* Test pushes that delete branches.

* Test incremental pushes that don't fast-forward.

* Cloning from an annex:: url with importtree=yes doesn't work
(with or without exporttree=yes). This is because the ContentIdentifier
db is not populated.

* Need to mention git-remote-annex in special remotes page, and perhaps
write a tip for it. Also link to it from git-annex man page.

* It would be nice if git-annex could generate an annex:: url
for a special remote and show it to the user, eg when
they have set the shorthand "annex::" url, so they know the full url.
`git-annex info $remote` could also display it.
Currently, the user has to remember how the special remote was
configured and replicate it all in the url.

There are some difficulties to doing this, including that
RemoteConfig can have hidden fields that should be omitted.
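
As a rough illustration of the url generation idea, here is a minimal Python sketch. It assumes the full url has the form `annex::<uuid>?key=value&key=value`, and that the set of hidden fields is known in advance. The function name `annex_url`, the `hidden` parameter, and the field name `cipher` are all hypothetical and only serve the example; they are not taken from git-annex itself.

```python
from urllib.parse import quote


def annex_url(uuid, config, hidden=frozenset()):
    """Build an annex:: url from a special remote's config.

    Assumption: the url looks like annex::<uuid>?key=value&key=value.
    Fields named in `hidden` are left out of the url.
    """
    params = [
        # Percent-encode keys and values so that characters such as
        # "/", "&" and "=" cannot be confused with url syntax.
        f"{quote(key, safe='')}={quote(str(value), safe='')}"
        for key, value in sorted(config.items())
        if key not in hidden
    ]
    url = f"annex::{uuid}"
    if params:
        url += "?" + "&".join(params)
    return url
```

For example, `annex_url("u1", {"type": "directory", "directory": "/mnt/foo", "cipher": "secret"}, hidden={"cipher"})` returns `annex::u1?directory=%2Fmnt%2Ffoo&type=directory`. Sorting the fields is only a choice made here to keep the generated url stable between runs.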

* initremote/enableremote could have an option that sets the url of a
special remote to an annex:: url. This would make it easier to use
git-remote-annex, since the user would not need to set up the url
themselves. (Also it would then avoid setting `skipFetchAll = true`.)

* Improve recovery from interrupted push by using outManifest to clean up
after it. (Requires populating outManifest.)

* See XXX in uploadManifest about recovering from a situation
where the remote is left with a deleted manifest when a push
is interrupted part way through. This should be recoverable
by caching the manifest locally and re-uploading it when
the remote has no manifest, or by prompting the user to merge and re-push.

* datalad-annex supports cloning from the web special remote,
using an url that contains the result of pushing to eg, a directory
special remote.
`datalad-annex::https://example.com?type=web&url={noquery}`
Supporting something like this would be good.

* Improve behavior in push races. A race can overwrite a change
to the MANIFEST and lose work that was pushed from the other repo.
From the user's perspective, that situation is the same as if one repo
pushed new work, then the other repo did a git push --force, overwriting
the first repo's push. In the first repo, another push will then fail as
a non-fast-forward, and the user can recover as usual. This is probably
acceptable.

But a MANIFEST overwrite will leave bundle files in the remote that
are not listed in the MANIFEST. It seems likely that git-annex could
detect that after the fact and clean it up. Eg, if it caches
the last MANIFEST it uploaded, next time it downloads the MANIFEST
it can check if there are bundle files in the old one that are not
in the new one. If so, it can drop those bundle files from the remote.
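
The check described above amounts to a set difference between two manifests. A minimal Python sketch follows, assuming a manifest is plain text with one bundle key per line; that format, the function names, and the key names in the example are assumptions made for illustration, not the actual git-annex data structures.

```python
def parse_manifest(text):
    """Return the bundle keys in a manifest.

    Assumption: one key per line; blank lines are ignored.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def orphaned_bundles(cached_manifest, downloaded_manifest):
    """List bundle keys that were in the manifest this repo last uploaded
    but are missing from the manifest now found on the remote.

    These are the candidates to drop from the remote. The order of the
    cached manifest is preserved.
    """
    current = set(parse_manifest(downloaded_manifest))
    return [key for key in parse_manifest(cached_manifest) if key not in current]
```

With a cached manifest listing `bundle-a` and `bundle-b`, and a downloaded manifest listing `bundle-a` and `bundle-c`, `orphaned_bundles` returns `["bundle-b"]`. Bundles that appear only in the downloaded manifest belong to the other repo's push and are left alone.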

* A push race can also appear to the user as if they pushed a ref, but then
it got deleted from the remote. This happens when two pushes are
pushing different ref names. This might be harder for the user to
notice; git fetch does not indicate that a remote ref got deleted.
They would have to use git fetch --prune to notice the deletion.
Once the user does notice, they can re-push their ref to recover.
Can this be improved?

* The race condition described in
[[!commit 797f27ab0517e0021363791ff269300f2ba095a5]]:
before git-annex init has been run in a repo, running git-remote-annex
at the same time as git-annex init can lose changes that the latter
command (and ones after it) write to the git-annex branch.

This should be fixable by making git-remote-annex not write to the
git-annex branch, but to eg, a temporary journal directory.

Also, cloning currently writes the special remote config into remote.log,
which might differ slightly from the config already stored in
remote.log for the same remote. Cloning should not change the stored
config of a remote, but that branch write was necessary. So throwing
away the branch write is also good for this case.

Also, when the remote uses importtree=yes, pushing to it updates
content identifiers, which currently get recorded in the git-annex
branch. It would be good to avoid that being written as well.