3e7324bbcb
This avoids some apparently otherwise unsolveable problems involving races that resulted in the manifest listing bundles that were deleted. Removed the annex-max-git-bundles config because it can't actually result in deleting old bundles. It would still be possible to have a config that controls how often to do a full push, which would avoid needing to download too many bundles on clone, as well as needing to checkpresent too many bundles in verifyManifest. But it would need a different name and description.
65 lines
3.1 KiB
Markdown
git-remote-annex will be a program that allows push/pull/clone of a git
repository to many types of git-annex special remote.

This is a redesign and reimplementation of git-remote-datalad-annex.
It will be a safer implementation, will support incremental pushes, and
will be available to users who don't use datalad.
--[[Joey]]

---

This is implemented and working. Remaining todo list for it:

* Test incremental push edge cases involving checkprereq.

* Cloning a special remote with an empty manifest results in a repo where
  git fetch fails, claiming the special remote is encrypted, when it's not.

* Cloning from an annex:: url with importtree=yes doesn't work
  (with or without exporttree=yes). This is because the ContentIdentifier
  db is not populated. It should be possible to work around this.

* It would be nice if git-annex could generate an annex:: url
  for a special remote and show it to the user, e.g. when
  they have set the shorthand "annex::" url, so they know the full url.
  `git-annex info $remote` could also display it.
  Currently, the user has to remember how the special remote was
  configured and replicate it all in the url.

  There are some difficulties in doing this, including that
  RemoteConfig can have hidden fields that should be omitted.

* initremote/enableremote could have an option that sets the url of a
  special remote to an annex:: url. This would make it easier to use
  git-remote-annex, since the user would not need to set up the url
  themselves. (Also it would then avoid setting `skipFetchAll = true`)

* datalad-annex supports cloning from the web special remote,
  using a url that contains the result of pushing to, e.g., a directory
  special remote.
  `datalad-annex::https://example.com?type=web&url={noquery}`
  Supporting something like this would be good.

* Improve behavior in push races. A race can overwrite a change
  to the MANIFEST and lose work that was pushed from the other repo.
  From the user's perspective, that situation is the same as if one repo
  pushed new work, then the other repo did a git push --force, overwriting
  the first repo's push. In the first repo, another push will then fail as
  a non-fast-forward, and the user can recover as usual. This is probably
  okish.

  But a MANIFEST overwrite will leave bundle files in the remote that
  are not listed in the MANIFEST. It seems likely that git-annex could
  detect that after the fact and clean it up. E.g., if it caches
  the last MANIFEST it uploaded, next time it downloads the MANIFEST
  it can check if there are bundle files in the old one that are not
  in the new one. If so, it can drop those bundle files from the remote.
  (May be unsafe, see below section on bundle deletion problems.)

* A push race can also appear to the user as if they pushed a ref, but then
  it got deleted from the remote. This happens when two pushes are
  pushing different ref names. This might be harder for the user to
  notice; git fetch does not indicate that a remote ref got deleted.
  They would have to use git fetch --prune to notice the deletion.
  Once the user does notice, they can re-push their ref to recover.
  Can this be improved?
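
To make the url generation item in the list above more concrete, here is a
minimal Python sketch of turning a remote's configuration into a full annex::
url while leaving out hidden fields. It is only an illustration, not how
git-annex does or will do it. The url shape `annex::<uuid>?key=value&...`, the
function name `annex_url`, the `hidden` parameter, and the field name
`secret-field` are all assumptions made for the example.

```python
from urllib.parse import quote


def annex_url(uuid, config, hidden=frozenset()):
    """Build an annex:: url from a remote's uuid and its config fields.

    Assumes the url has the form annex::<uuid>?key=value&key=value.
    Fields named in `hidden` (a hypothetical set) are left out.
    """
    params = [
        f"{quote(key)}={quote(value)}"
        for key, value in config.items()
        if key not in hidden
    ]
    return f"annex::{uuid}?" + "&".join(params)


# Example values are made up; "secret-field" stands in for a hidden field.
example_config = {
    "type": "directory",
    "encryption": "none",
    "directory": "/mnt/foo/",
    "secret-field": "x",
}
example_url = annex_url("1111-2222", example_config, hidden={"secret-field"})
print(example_url)
```

Which RemoteConfig fields actually count as hidden is the hard part mentioned
above, so the sketch takes that set as an input rather than guessing it.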
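
The MANIFEST cleanup idea from the push race item above boils down to
comparing the last MANIFEST that was uploaded with the one that was just
downloaded. A minimal Python sketch of that comparison follows. It assumes a
MANIFEST is plain text listing one bundle key per line, and the names
`parse_manifest` and `orphaned_bundles` are invented for the example. It only
reports candidates and drops nothing, since dropping may be unsafe.

```python
def parse_manifest(text):
    """Return the bundle keys listed in a manifest.

    Assumes one bundle key per line; blank lines are ignored.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def orphaned_bundles(cached_manifest, current_manifest):
    """Bundle keys in the cached (last uploaded) manifest that the
    current (just downloaded) manifest no longer lists."""
    current = set(parse_manifest(current_manifest))
    return [key for key in parse_manifest(cached_manifest) if key not in current]


# Made-up bundle keys: our last upload listed a and b, but another
# repo's push overwrote the manifest with a and c.
cached = "bundle-a\nbundle-b\n"
current = "bundle-a\nbundle-c\n"
candidates = orphaned_bundles(cached, current)
print(candidates)
```

Here `candidates` contains only `bundle-b`, the bundle that this repo uploaded
and that the overwritten MANIFEST no longer mentions.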