Joey Hess 2024-06-27 13:40:09 -04:00
parent dabd05e547
commit dceb8dc776
GPG key ID: DB12DB0FF05F8F38
2 changed files with 17 additions and 14 deletions

@@ -318,22 +318,24 @@ This does mean that cycles need to be prevented. See section below.
 
 ## speed
 
-A passthrough proxy should be as fast as possible so as not to add overhead
+A proxy should be as fast as possible so as not to add overhead
 to a file retrieve, store, or checkpresent. This probably means that
-it keeps TCP connections open to each host in the cluster. It might use a
+it keeps TCP connections open to each host. It might use a
 protocol with less overhead than ssh.
 
-In the case of checkpresent, it would be possible for the proxy to not
-communicate with the cluster to check that the data is still present on it.
-As long as all access is intermediated via the proxy, its git-annex branch
-could be relied on to always be correct, in theory. Proving that theory,
-making sure to account for all possible race conditions and other scenarios,
-would be necessary for such an optimisation.
+In the case of checkpresent, it would be possible for the gateway to not
+communicate with cluster nodes to check that the data is still present
+in the cluster. As long as all access is intermediated via a single gateway,
+its git-annex branch could be relied on to always be correct, in theory.
+Proving that theory, making sure to account for all possible race conditions
+and other scenarios, would be necessary for such an optimisation. This
+would not work for multi-gateway clusters unless the gateways were kept in
+sync about locations, which they currently are not.
 
-Another way the proxy could speed things up is to cache some subset of
-content. Eg, analize what files are typically requested, and store another
-copy of those on the proxy. Perhaps prioritize storing smaller files, where
-latency tends to swamp transfer speed.
+Another way the cluster gateway could speed things up is to cache some
+subset of content. Eg, analize what files are typically requested, and
+store another copy of those on the proxy. Perhaps prioritize storing
+smaller files, where latency tends to swamp transfer speed.
 
 ## proxying to special remotes
 
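
The caching idea in the last changed paragraph (cache what is typically requested, preferring smaller files) could be sketched roughly as below. This is only an illustration, not how git-annex works: `KeyStats`, `choose_cache`, the byte budget, and the requests-per-byte ranking are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyStats:
    """Hypothetical per-key statistics a gateway might collect."""
    key: str       # key name (illustrative only)
    size: int      # size in bytes
    requests: int  # how many times the key was requested


def choose_cache(stats, budget):
    """Greedily pick keys to cache within `budget` bytes.

    Keys are ranked by requests per byte, so small, frequently
    requested files come first. Keys never requested are skipped.
    """
    ranked = sorted(
        stats,
        key=lambda s: (-s.requests / max(s.size, 1), s.size, s.key),
    )
    chosen, used = [], 0
    for s in ranked:
        if s.requests > 0 and used + s.size <= budget:
            chosen.append(s.key)
            used += s.size
    return chosen
```

Ranking by requests per byte is one simple way to favour files where latency dominates transfer time. A real gateway would also need eviction and a way to keep location tracking correct.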

@@ -33,6 +33,9 @@ For June's work on [[design/passthrough_proxy]], remaining todos:
 
 * Encryption and chunking. See design for issues.
 
+* Indirect uploads when proxying for special remote
+  (to be considered). See design.
+
 * Getting a key from a cluster currently always selects the lowest cost
   remote, and always the same remote if cost is the same. Should
   round-robin amoung remotes, and prefer to avoid using remotes that
@@ -45,8 +48,6 @@ For June's work on [[design/passthrough_proxy]], remaining todos:
   Library to use:
   <https://hackage.haskell.org/package/hsyscall-0.4/docs/System-Syscall.html>
 
-* Indirect uploads (to be considered). See design.
-
 * Support using a proxy when its url is a P2P address.
   (Eg tor-annex remotes.)
 
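
The todo item about always selecting the same lowest-cost remote suggests round-robin among remotes of equal cost. A minimal sketch of that selection follows. It is not git-annex code: the `RoundRobinSelector` class, the `(name, cost)` tuples, and the node names are hypothetical. The second half of the todo item (which remotes to avoid) is cut off in the diff context, so it is not modelled here.

```python
from itertools import count


class RoundRobinSelector:
    """Rotate among the remotes that share the lowest cost (illustrative)."""

    def __init__(self):
        # Monotonic counter; each selection advances the rotation by one.
        self._turn = count()

    def select(self, remotes):
        """Pick a remote name from a list of (name, cost) pairs."""
        if not remotes:
            raise ValueError("no remotes to select from")
        lowest = min(cost for _, cost in remotes)
        # Sort names so the rotation order is stable across calls.
        candidates = sorted(name for name, cost in remotes if cost == lowest)
        return candidates[next(self._turn) % len(candidates)]
```

With two remotes at cost 100 and one at cost 200, repeated calls alternate between the two cheaper remotes and never pick the more expensive one.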