Joey Hess 2024-06-12 11:55:18 -04:00
parent 2e76a4744f
commit 67d1e2a459
GPG key ID: DB12DB0FF05F8F38
2 changed files with 58 additions and 31 deletions

@@ -108,16 +108,34 @@ The only real difference seems to be that the UUID of a remote is cached,
so A could only do this the first time we accessed it, and not later.
With UUID discovery, A can do that at any time.
## user interface
## proxied remote names
What to name the instantiated remotes? Probably the best that could
be done is to use the proxy's own remote names as suffixes on the client.
Eg, the proxy's "node1" remote is "proxy-node1".
But, the user might have their own "proxy-node1" remote configured that
points to something else. To avoid a proxy changing the configuration of
the user's remote to point to its remote, git-annex must avoid
instantiating a proxied remote when there's already a configuration for a
remote with that same name.
That does mean that, if a user wants to set a git config for a proxy
remote, they will need to manually set its annex-uuid and its url.
Which is awkward. Many git configs of the proxy remote can be inherited by
the instantiated remotes, so users won't often need to do that.
A user can also set up a remote with another name that they
prefer, that points at a remote behind a proxy. They just need to set
its annex-uuid and its url. Perhaps there should be a git-annex command
that eases setting up a remote like that?
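The naming rule above can be sketched roughly as follows. This is only an illustration in Python, not git-annex code: the function name `instantiate_proxied_remotes` and the shapes of its arguments are assumptions made up for the example. It prefixes each of the proxy's remote names with the client's name for the proxy, and skips any name the user has already configured.

```python
def instantiate_proxied_remotes(proxy_name, proxied, configured):
    """Return {remote_name: uuid} for the proxied remotes that may be instantiated.

    proxy_name: the client's name for the proxy remote, eg "proxy"
    proxied:    {name_on_proxy: uuid}, as advertised by the proxy (hypothetical shape)
    configured: set of remote names that already have git config on the client
    """
    instantiated = {}
    for name, uuid in proxied.items():
        candidate = f"{proxy_name}-{name}"
        if candidate in configured:
            # The user already has a remote with this name. Leave it alone,
            # so the proxy cannot repoint it at one of its own remotes.
            continue
        instantiated[candidate] = uuid
    return instantiated
```

With a proxy named "proxy" that advertises "node1" and "node2", and a user who already has their own "proxy-node2" configured, only "proxy-node1" would be instantiated.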
## user interface
But the user probably doesn't want to pick which node to send content to.
They don't necessarily know anything about the nodes. Ideally the user
would `git-annex copy --to proxy` or `git-annex push` and let it pick
which instantiated remote(s) to send to.
which proxied remote(s) to send to.
To make `git-annex copy --to proxy` work, `storeKey` could be changed to
allow returning a UUID (or UUIDs) where the content was actually stored.
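
As a rough sketch of that idea, assuming a store operation that reports the UUIDs it actually wrote to: the Python below is not the real `storeKey` (which is Haskell and part of git-annex's remote interface), and `store_key_via_proxy`, `nodes`, and `wants` are hypothetical names used only for illustration.

```python
def store_key_via_proxy(key, nodes, wants):
    """Fan a key out to the nodes behind a proxy; return UUIDs where it was stored.

    nodes: {uuid: store_fn}, where store_fn(key) returns True on success (hypothetical)
    wants: function (uuid, key) -> bool, whether that node wants the content (hypothetical)
    """
    stored = []
    for uuid, store in nodes.items():
        # Only send to nodes that want the content, and only record
        # the ones where the store actually succeeded.
        if wants(uuid, key) and store(key):
            stored.append(uuid)
    return stored
```

The caller could then update location tracking for each returned UUID, rather than for the proxy itself.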

@@ -26,46 +26,55 @@ In development on the `proxy` branch.
For June's work on [[design/passthrough_proxy]], implementation plan:
1. UUID discovery via git-annex branch. Add a log file listing UUIDs
accessible via proxy UUIDs. It also will contain the names
of the remotes that the proxy is a proxy for,
from the perspective of the proxy. (done)
* UUID discovery via git-annex branch. Add a log file listing UUIDs
accessible via proxy UUIDs. It also will contain the names
of the remotes that the proxy is a proxy for,
from the perspective of the proxy. (done)
1. Add `git-annex updateproxy` command and remote.name.annex-proxy
configuration. (done)
* Add `git-annex updateproxy` command and remote.name.annex-proxy
configuration. (done)
2. Remote instantiation for proxies. (done)
* Remote instantiation for proxies. (done)
3. Implement git-annex-shell proxying to git remotes. (done)
* Implement git-annex-shell proxying to git remotes. (done)
3. Proxy should update location tracking information for proxied remotes,
so it is available to other users who sync with it. (done)
* Proxy should update location tracking information for proxied remotes,
so it is available to other users who sync with it. (done)
4. Let `storeKey` return a list of UUIDs where content was stored,
and make proxies accept uploads directed at them, rather than a specific
instantiated remote, and fan out the upload to whatever nodes behind
the proxy want it. This will need P2P protocol extensions.
* Would it be possible to get instantiated remotes into git remote list?
This would make eg, tab completion of remote names work. Just setting
annex-uuid would suffice, but currently any such config prevents setting
up a remote as an instantiated remote. Perhaps if only annex-uuid and no
other config is set, treat that as an instantiated remote, and overwrite
the annex-uuid config as necessary? Or, add a config that says this is
an instantiated remote, and when set, allow overwriting configs.
This seems better, it would let `git push proxy-foo` work, for example.
5. Make `git-annex copy --from $proxy` pick a node that contains each
file, and use the instantiated remote for getting the file. Same for
similar commands.
* Cycle prevention. See design.
6. Make `git-annex drop --from $proxy` drop, when possible, from every
remote accessible by the proxy. Communicate partial drops somehow.
* Make `git-annex copy --from $proxy` pick a node that contains each
file, and use the instantiated remote for getting the file. Same for
similar commands.
7. Make commands like `git-annex push` not iterate over instantiate
remotes, and instead just send content to the proxy for fanout.
* Make `git-annex drop --from $proxy` drop, when possible, from every
remote accessible by the proxy. Communicate partial drops somehow.
8. Optimise proxy speed. See design for idea.
* Let `storeKey` return a list of UUIDs where content was stored,
and make proxies accept uploads directed at them, rather than a specific
instantiated remote, and fan out the upload to whatever nodes behind
the proxy want it. This will need P2P protocol extensions.
8. Use `sendfile()` to avoid data copying overhead when
`receiveBytes` is being fed right into `sendBytes`.
* Make commands like `git-annex push` not iterate over instantiated
remotes, and instead just send content to the proxy for fanout.
9. Encryption and chunking. See design for issues.
* Optimise proxy speed. See design for ideas.
10. Cycle prevention. See design.
* Use `sendfile()` to avoid data copying overhead when
`receiveBytes` is being fed right into `sendBytes`.
11. indirect uploads (to be considered). See design.
* Encryption and chunking. See design for issues.
12. Support using a proxy when its url is a P2P address.
(Eg tor-annex remotes.)
* indirect uploads (to be considered). See design.
* Support using a proxy when its url is a P2P address.
(Eg tor-annex remotes.)