finalized design for proxying to exporttree=yes annexobjects=yes special remotes

Joey Hess 2024-08-06 11:45:45 -04:00
parent 84d27cf34f
commit 4750ffbd3b
GPG key ID: DB12DB0FF05F8F38
3 changed files with 39 additions and 22 deletions

@@ -616,27 +616,33 @@ store any key:
 * Configure annex-tracking-branch in the proxy's git config.
 * Then the user's workflow is simply: `git-annex push`
-* The proxy handles PUT/GET/REMOVE of a key that is not in the
-annex-tracking branch that it currently knows about, by using
-the special remote's .git/annex/objects/ location.
-* Upon receiving a new annex-tracking-branch or any transfer of a key
-used in the current annex-tracking-branch, the proxy can update
-the exporttree=yes remote. This needs to happen incrementally,
-eg upon receiving a key, just proxy it on to the exporttree=yes remote,
-and update the export database. Once all keys are received, update
-the git-annex branch to indicate a new tree has been exported.
-* `git-annex sync` may optionally push updates to the annex-tracking-branch
-before sending content. This can let the proxy be more efficient,
-especially when the special remote does not support renaming.
-
-Note that this necessarily means that an object that the client uploads
-once to the proxy might need to be uploaded multiple times from the proxy
-to the special remote. Eg, if a key is used 10 times in a tree, it will
-need to upload 10 times. Adding a "copy" operation to exportActions would
-avoid this problem, but only for special remotes that were able to
-implement it. Even a rename of a single file can need the proxy to download
-it from the special remote and upload it back under a new name, when the
-special remote does not support renames.
+* The proxy handles PUT by always storing to the special remote's
+.git/annex/objects/ location, not updating the exported tree.
+* The proxy allows REMOVE from the special remote's
+.git/annex/objects/ location, but not removal of keys
+that are in the currently exported tree.
+* When `git-annex post-receive` is run by the post-receive hook
+and the annex-tracking-branch has been updated, it exports
+the tree to the special remote.
+(But, `git-annex push` sends the updated tree first, so
+this will often be an incomplete export.)
+* When there is an incomplete export and a key is received
+that is part of that export, check if it is the *last* key
+that is needed to complete the export. If so, export the tree to the
+special remote again.
+(This avoids overhead and complication of incrementally updating
+the export. It relies on the special remote supporting renameExport.
+Incrementally updating the export might be worth doing eventually,
+for special remotes that do no support renameExport.)
+* When exporting a tree to the special remote, handle cases
+where a single key is used by multiple files, and the key is not
+present locally. In this case it currently fails to update
+one of the files (and renames the annexobjects location to the other
+one). It will need to download the content from the special remote and
+send it back to it.
+* When the special remote does not support renameExport, will need to
+download from the annexobjects location in order to store to the export
+location.
 ## possible enhancement: indirect uploads
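
Reviewer's sketch of the control flow the new bullets in this hunk describe. This is a toy Python model, not git-annex code: the class `ProxyModel` and all of its fields and methods are hypothetical names, and it collapses the annexobjects and export locations into a single set of keys present on the special remote. It only illustrates the three rules: PUT never updates the export by itself, REMOVE is refused for keys in the currently exported tree, and an incomplete export is re-run once the last missing key arrives.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyModel:
    """Toy bookkeeping for a proxied exporttree=yes annexobjects=yes remote.

    All names here are hypothetical; this is not git-annex's implementation.
    """
    exported_tree: dict = field(default_factory=dict)  # path -> key
    present: set = field(default_factory=set)          # keys stored on the remote
    export_complete: bool = True
    exports_run: int = 0

    def missing_keys(self):
        # Keys the exported tree needs that the remote does not have yet.
        return set(self.exported_tree.values()) - self.present

    def export_tree(self):
        # Stand-in for exporting the tree to the special remote.
        self.exports_run += 1
        self.export_complete = not self.missing_keys()

    def post_receive(self, new_tree):
        # The annex-tracking-branch was updated: export right away,
        # even though the content has usually not been sent yet.
        self.exported_tree = dict(new_tree)
        self.export_tree()

    def put(self, key):
        # PUT always stores to the annexobjects location.
        self.present.add(key)
        # Re-export only when this was the last key the export needed.
        if (not self.export_complete
                and key in self.exported_tree.values()
                and not self.missing_keys()):
            self.export_tree()

    def remove(self, key):
        # Refuse to remove keys that the currently exported tree uses.
        if key in self.exported_tree.values():
            return False
        self.present.discard(key)
        return True
```

In this model a push of a two-file tree triggers one (incomplete) export at post-receive time and exactly one more when the second key arrives, which is the saving over updating the export incrementally on every received key.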

@@ -351,7 +351,6 @@ content from the key-value store.
 See [[git-annex-extendcluster]](1) for details.
 * `updateproxy`
 Update records with proxy configuration.

@@ -33,6 +33,18 @@ Planned schedule of work:
 * Working on `exportreeplus` branch which is groundwork for proxying to
 exporttree=yes special remotes.
+* `git-annex post-receive` of a proxied exporttree=yes special remote's
+annex-tracking-branch needs to exporttree.
+* When there is an incomplete export and a key is received, the proxy
+should check if it's the *last* key that is needed to complete the
+export, and when so, do a final exporttree.
+* Handle cases where a single key is used by multiple files in the exported
+tree. Need to download from the special remote in order to export
+multiple copies to it.
+* Handle case where the special remote does not support renameExport.
+Each key will need to be downloaded from it in order to export the key
+back to it, if the proxy is to support such a remote.
 ## items deferred until later for p2p protocol over http
 * `git-annex p2phttp` should support serving several repositories at the same
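
Reviewer's sketch of the last two todo items added in this hunk (a key used by several files, and a remote without renameExport). It is a hypothetical planner in Python, not git-annex code: `plan_export`, its parameters, and the step names `download`, `rename` and `upload` are invented for illustration. It assumes a key that is not present locally already sits in the special remote's annexobjects location.

```python
from collections import defaultdict


def plan_export(tree, local_keys, supports_rename):
    """Plan the transfers needed to export `tree` (path -> key).

    Returns a list of (action, key, path) steps. Hypothetical model only.
    """
    # Group exported paths by the key they use.
    paths_by_key = defaultdict(list)
    for path, key in sorted(tree.items()):
        paths_by_key[key].append(path)

    steps = []
    for key, paths in paths_by_key.items():
        if key in local_keys:
            # Content is on the proxy: send it to every path directly.
            steps.extend(("upload", key, p) for p in paths)
        elif supports_rename and len(paths) == 1:
            # Cheap case: move from the annexobjects location into place.
            steps.append(("rename", key, paths[0]))
        else:
            # Either renameExport is unavailable, or one rename cannot
            # populate several paths: fetch the content back once.
            steps.append(("download", key, None))
            if supports_rename:
                steps.append(("rename", key, paths[0]))
                paths = paths[1:]
            steps.extend(("upload", key, p) for p in paths)
    return steps
```

For a tree where key `K1` is used by files `a` and `b` and nothing is present locally, the plan on a rename-capable remote is one download, one rename and one upload; on a remote without renameExport every key costs a download plus one upload per path.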