This commit is contained in:
Joey Hess 2024-03-13 10:29:48 -04:00
parent 1dd4109bc8
commit 3877be35bd
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38

View file

@ -88,3 +88,22 @@ The remote interface operates on object files stored on disk. See
[[todo/transitive_transfers]] for discussion of that problem. If proxies
get implemented, that problem should be revisited.
## speed
A passthrough proxy should be as fast as possible so as not to add overhead
to a file retrieve, store, or checkpresent. This probably means that
it keeps TCP connections open to each host in the cluster. It might use a
protocol with less overhead than ssh.
In the case of checkpresent, it would be possible for the proxy to not
communicate with the cluster to check that the data is still present on it.
As long as all access is intermediated via the proxy, its git-annex branch
could be relied on to always be correct, in theory. Proving that theory,
making sure to account for all possible race conditions and other scenarios,
would be necessary for such an optimisation.
Another way the proxy could speed things up is to cache some subset of
content. Eg, analize what files are typically requested, and store another
copy of those on the proxy. Perhaps prioritize storing smaller files, where
latency tends to swamp transfer speed.