diff --git a/doc/design/passthrouh_proxy.mdwn b/doc/design/passthrouh_proxy.mdwn
index 6b5063e7df..7860d0f21f 100644
--- a/doc/design/passthrouh_proxy.mdwn
+++ b/doc/design/passthrouh_proxy.mdwn
@@ -88,3 +88,22 @@ The remote interface operates on object files stored on disk.
 See [[todo/transitive_transfers]] for discussion of that problem. If proxies
 get implemented, that problem should be revisited.
 
+## speed
+
+A passthrough proxy should be as fast as possible so as not to add overhead
+to a file retrieve, store, or checkpresent. This probably means that
+it keeps TCP connections open to each host in the cluster. It might use a
+protocol with less overhead than ssh.
+
+In the case of checkpresent, it would be possible for the proxy to not
+communicate with the cluster to check that the data is still present on it.
+As long as all access is intermediated via the proxy, its git-annex branch
+could be relied on to always be correct, in theory. Proving that theory,
+making sure to account for all possible race conditions and other scenarios,
+would be necessary for such an optimisation.
+
+Another way the proxy could speed things up is to cache some subset of
+content. E.g., analyze what files are typically requested, and store another
+copy of those on the proxy. Perhaps prioritize storing smaller files, where
+latency tends to swamp transfer speed.
+
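
The caching idea in the last paragraph could be sketched roughly as follows. This is only an illustration, not part of git-annex: `ProxyCache`, its methods, and the requests-per-byte score are all hypothetical, chosen to show one way of favouring files that are requested often and are small enough that latency dominates.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyCache:
    """Hypothetical policy: prefer keys that are requested often and are small."""

    capacity_bytes: int
    requests: dict = field(default_factory=dict)  # key -> request count
    sizes: dict = field(default_factory=dict)     # key -> size in bytes

    def record_request(self, key: str, size: int) -> None:
        # Count every retrieve that passes through the proxy.
        self.requests[key] = self.requests.get(key, 0) + 1
        self.sizes[key] = size

    def score(self, key: str) -> float:
        # Requests per byte (an assumed heuristic): frequent, small files
        # score highest.
        return self.requests[key] / max(self.sizes[key], 1)

    def keys_to_cache(self) -> list:
        # Greedily pick the best-scoring keys that still fit in the budget.
        chosen, used = [], 0
        for key in sorted(self.requests, key=self.score, reverse=True):
            if used + self.sizes[key] <= self.capacity_bytes:
                chosen.append(key)
                used += self.sizes[key]
        return chosen


cache = ProxyCache(capacity_bytes=1000)
for _ in range(3):
    cache.record_request("small", 100)
for _ in range(2):
    cache.record_request("medium", 500)
cache.record_request("big", 950)
print(cache.keys_to_cache())
```

A real proxy would also need to expire old request counts and decide when to drop cached copies; the sketch leaves both out.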