Joey Hess 2024-06-27 13:40:09 -04:00
parent dabd05e547
commit dceb8dc776
GPG key ID: DB12DB0FF05F8F38
2 changed files with 17 additions and 14 deletions

@@ -318,22 +318,24 @@ This does mean that cycles need to be prevented. See section below.
## speed
A passthrough proxy should be as fast as possible so as not to add overhead
A proxy should be as fast as possible so as not to add overhead
to a file retrieve, store, or checkpresent. This probably means that
it keeps TCP connections open to each host in the cluster. It might use a
it keeps TCP connections open to each host. It might use a
protocol with less overhead than ssh.
In the case of checkpresent, it would be possible for the proxy to not
communicate with the cluster to check that the data is still present on it.
As long as all access is intermediated via the proxy, its git-annex branch
could be relied on to always be correct, in theory. Proving that theory,
making sure to account for all possible race conditions and other scenarios,
would be necessary for such an optimisation.
In the case of checkpresent, it would be possible for the gateway to not
communicate with cluster nodes to check that the data is still present
in the cluster. As long as all access is intermediated via a single gateway,
its git-annex branch could be relied on to always be correct, in theory.
Proving that theory, making sure to account for all possible race conditions
and other scenarios, would be necessary for such an optimisation. This
would not work for multi-gateway clusters unless the gateways were kept in
sync about locations, which they currently are not.
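
The checkpresent shortcut described above can be sketched roughly as follows. This is only an illustration, not git-annex code: `Gateway`, `ask_node`, `record_store`, and `record_drop` are hypothetical names, and the `locations` dict stands in for the location tracking in the gateway's git-annex branch. It assumes that every store and drop passes through the one gateway, which is exactly the theory the text says would still need proving.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Gateway:
    """Hypothetical cluster gateway; all names here are illustrative."""
    nodes: List[str]
    # Stand-in for a network round trip: (node, key) -> is the key present?
    ask_node: Callable[[str, str], bool]
    single_gateway: bool
    # Stand-in for location tracking in the gateway's git-annex branch.
    locations: Dict[str, Set[str]] = field(default_factory=dict)

    def record_store(self, node: str, key: str) -> None:
        self.locations.setdefault(key, set()).add(node)

    def record_drop(self, node: str, key: str) -> None:
        self.locations.get(key, set()).discard(node)

    def checkpresent(self, key: str) -> bool:
        if self.single_gateway:
            # All access went through this gateway, so (in theory) the
            # local record is correct and no node needs to be contacted.
            return bool(self.locations.get(key))
        # With several gateways the local record may be stale,
        # so fall back to asking the nodes themselves.
        return any(self.ask_node(node, key) for node in self.nodes)
```

With `single_gateway=True` the call never touches `ask_node`; with `single_gateway=False` it contacts nodes in order until one reports the key.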
Another way the proxy could speed things up is to cache some subset of
content. Eg, analyze what files are typically requested, and store another
copy of those on the proxy. Perhaps prioritize storing smaller files, where
latency tends to swamp transfer speed.
Another way the cluster gateway could speed things up is to cache some
subset of content. Eg, analyze what files are typically requested, and
store another copy of those on the proxy. Perhaps prioritize storing
smaller files, where latency tends to swamp transfer speed.
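
One possible shape for such a cache policy, as a sketch under stated assumptions: the request log, the `sizes` map, and the byte `budget` are all hypothetical inputs, and scoring by requests per byte is just one simple way to favour files that are both popular and small. Nothing here reflects an actual git-annex implementation.

```python
from collections import Counter
from typing import Dict, List


def choose_cache_set(requests: List[str], sizes: Dict[str, int],
                     budget: int) -> List[str]:
    """Pick keys to cache on the gateway, within a byte budget.

    requests: keys in the order they were requested (hypothetical log).
    sizes:    size in bytes of each requested key.
    budget:   bytes of cache space available on the gateway.
    """
    counts = Counter(requests)
    # Rank by requests per byte, so frequently requested small files come
    # first; the key itself breaks ties to keep the order deterministic.
    ranked = sorted(counts, key=lambda k: (-counts[k] / max(sizes[k], 1), k))
    chosen: List[str] = []
    used = 0
    for key in ranked:
        if used + sizes[key] <= budget:
            chosen.append(key)
            used += sizes[key]
    return chosen
```

The greedy fill skips any key that does not fit and keeps going, so a large popular file does not block smaller ones ranked after it.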
## proxying to special remotes

@@ -33,6 +33,9 @@ For June's work on [[design/passthrough_proxy]], remaining todos:
* Encryption and chunking. See design for issues.
* Indirect uploads when proxying for special remote
(to be considered). See design.
* Getting a key from a cluster currently always selects the lowest cost
remote, and always the same remote if cost is the same. Should
round-robin among remotes, and prefer to avoid using remotes that
@@ -45,8 +48,6 @@ For June's work on [[design/passthrough_proxy]], remaining todos:
Library to use:
<https://hackage.haskell.org/package/hsyscall-0.4/docs/System-Syscall.html>
* Indirect uploads (to be considered). See design.
* Support using a proxy when its url is a P2P address.
(Eg tor-annex remotes.)
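
For the todo item above about getting a key from a cluster, the round-robin part could look something like this minimal sketch. `RoundRobinSelector` and the `costs` map are hypothetical; it only rotates among the remotes tied for the lowest cost, and does not attempt the additional preference the todo goes on to mention.

```python
from itertools import count
from typing import Dict


class RoundRobinSelector:
    """Rotate among the remotes that share the lowest cost (illustrative)."""

    def __init__(self) -> None:
        self._turn = count()  # counts how many picks have been made

    def pick(self, costs: Dict[str, float]) -> str:
        lowest = min(costs.values())
        # Sort so the rotation order does not depend on dict ordering.
        candidates = sorted(r for r, c in costs.items() if c == lowest)
        return candidates[next(self._turn) % len(candidates)]
```

Successive calls with `{"a": 100, "b": 100, "c": 200}` alternate between `a` and `b` and never select `c`.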