2024-05-14 17:52:20 +00:00

git-remote-annex will be a program that allows push/pull/clone of a git
repository to many types of git-annex special remote.

This is a redesign and reimplementation of git-remote-datalad-annex.
It will be a safer implementation, will support incremental pushes, and
will be available to users who don't use datalad.

--[[Joey]]

---

This is implemented and working. Remaining todo list for it:

* Test incremental push edge cases involving checkprereq.

* Cloning from an annex:: url with importtree=yes doesn't work
  (with or without exporttree=yes). This is because the ContentIdentifier
  db is not populated.

* It would be nice if git-annex could generate an annex:: url
  for a special remote and show it to the user, eg when
  they have set the shorthand "annex::" url, so they know the full url.
  `git-annex info $remote` could also display it.
  Currently, the user has to remember how the special remote was
  configured and replicate it all in the url.

  There are some difficulties in doing this, including that
  RemoteConfig can have hidden fields that should be omitted.

* initremote/enableremote could have an option that configures the url of a
  special remote to be an annex:: url. This would make it easier to use
  git-remote-annex, since the user would not need to set up the url
  themselves. (Also it would then avoid setting `skipFetchAll = true`.)

* Improve recovery from interrupted push by using outManifest to clean up
  after it. (Requires populating outManifest.)

* See XXX in uploadManifest about recovering from a situation
  where the remote is left with a deleted manifest when a push
  is interrupted part way through. This should be recoverable
  by caching the manifest locally and re-uploading it when
  the remote has no manifest, or by prompting the user to merge and re-push.

* datalad-annex supports cloning from the web special remote,
  using a url that contains the result of pushing to eg, a directory
  special remote.
  `datalad-annex::https://example.com?type=web&url={noquery}`
  Supporting something like this would be good.

* Improve behavior in push races. A race can overwrite a change
  to the MANIFEST and lose work that was pushed from the other repo.
  From the user's perspective, that situation is the same as if one repo
  pushed new work, then the other repo did a git push --force, overwriting
  the first repo's push. In the first repo, another push will then fail as
  a non fast-forward, and the user can recover as usual. This is probably
  okish.

  But.. a MANIFEST overwrite will leave bundle files in the remote that
  are not listed in the MANIFEST. It seems likely that git-annex could
  detect that after the fact and clean it up. Eg, if it caches
  the last MANIFEST it uploaded, next time it downloads the MANIFEST
  it can check if there are bundle files in the old one that are not
  in the new one. If so, it can drop those bundle files from the remote.

* A push race can also appear to the user as if they pushed a ref, but then
  it got deleted from the remote. This happens when two pushes are
  pushing different ref names. This might be harder for the user to
  notice; git fetch does not indicate that a remote ref got deleted.
  They would have to use git fetch --prune to notice the deletion.
  Once the user does notice, they can re-push their ref to recover.
  Can this be improved?

* The race condition described in
  [[!commit 797f27ab0517e0021363791ff269300f2ba095a5]]:
  before git-annex init has been run in a repo,
  using git-remote-annex at the same time as git-annex init can lose
  changes that the latter command (and ones after it) write to the
  git-annex branch.

  This should be fixable by making git-remote-annex not write to the
  git-annex branch, but to eg, a temporary journal directory.

  Also, cloning currently writes the special remote config into remote.log,
  which might differ slightly in some way from the config in
  remote.log for the same remote. Cloning should not change the stored
  config of a remote, but that branch write was necessary. So throwing
  away the branch write is also good for this case.

  Also, when the remote uses importtree=yes, pushing to it updates
  content identifiers, which currently get recorded in the git-annex
  branch. It would be good to avoid that being written as well.

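
The url generation item in the list above could be approached roughly as follows. This is only a sketch, written in Python rather than git-annex's Haskell. It assumes the full url has the form `annex::<uuid>?param=value&param=value`, and that the remote's config is available as a plain mapping. The function name `build_annex_url`, the example config, and the field name `example-hidden-field` are all hypothetical; which RemoteConfig fields are really hidden is not stated here.

```python
from urllib.parse import quote


def build_annex_url(uuid, config, hidden):
    """Build an annex:: url from a remote's uuid and its config fields.

    config: mapping of config field name to value (assumed representation).
    hidden: set of field names to leave out of the url (hypothetical).
    """
    params = [
        # Percent-encode names and values; "/" is kept readable in values.
        f"{quote(key, safe='')}={quote(value, safe='/')}"
        for key, value in config.items()
        if key not in hidden
    ]
    return f"annex::{uuid}?" + "&".join(params)


# Example with made-up values.
example_url = build_annex_url(
    "00000000-0000-0000-0000-000000000000",
    {
        "type": "directory",
        "encryption": "none",
        "directory": "/mnt/foo/",
        "example-hidden-field": "not shown",
    },
    hidden={"example-hidden-field"},
)
print(example_url)
```

Taking the hidden field names as a parameter keeps the sketch neutral about how git-annex would decide which fields to omit.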
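
The cleanup suggested in the push race item above (compare the last MANIFEST that was uploaded with the one that was just downloaded) comes down to a set difference. The following Python sketch illustrates that idea only. It assumes a manifest can be treated as a list of bundle key names; `orphaned_bundles`, `clean_up_after_race` and the `drop_bundle` callback are hypothetical names, not part of git-annex.

```python
def orphaned_bundles(cached_manifest, current_manifest):
    """Return bundles listed in the cached manifest but not in the current one.

    Both arguments are lists of bundle key names (assumed representation).
    The order of the cached manifest is preserved in the result.
    """
    current = set(current_manifest)
    return [key for key in cached_manifest if key not in current]


def clean_up_after_race(cached_manifest, current_manifest, drop_bundle):
    """Drop every orphaned bundle using the caller-supplied drop_bundle function."""
    dropped = orphaned_bundles(cached_manifest, current_manifest)
    for key in dropped:
        drop_bundle(key)
    return dropped


# Example: this repo uploaded bundles A and B, then another repo's push
# overwrote the MANIFEST so that it now lists A and C.
removed = []
clean_up_after_race(["A", "B"], ["A", "C"], removed.append)
print(removed)
```

Bundles that appear only in the new manifest are left alone, since they belong to the other repository's push.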