Sped up proxied downloads from special remotes, by streaming

This currently works only for special remotes that don't use fileRetriever.
Those that do download to a different filename and then rename it into
place, which defeats the streaming.

This actually benchmarks slightly slower when getting a large file from
a fast proxied special remote. However, when the proxied special remote
is slow, it will be a big win.
Joey Hess 2024-10-15 12:22:34 -04:00
parent 76a1989a0e
commit edaed18e4c
GPG key ID: DB12DB0FF05F8F38
3 changed files with 110 additions and 32 deletions

@@ -30,28 +30,11 @@ Planned schedule of work:
* Currently working on streaming download via proxy from special remote.
* Tried implementing a background thread in the proxy that runs while
retrieving a file, to stream it out as it comes in. That failed because
reading from a file that the same process is writing to is prevented by
locking in haskell. (Could be gotten around by using FD rather than Handle,
but would need to read from the FD and use packCString to make a ByteString.)
But also, remotes using fileRetriever retrieve to the temp object file,
* Remotes using fileRetriever retrieve to the temp object file,
before it is renamed to the requested file. In the case of a proxy,
that is a different file, and so it won't see the file until it's all
been transferred and renamed.
* Could the P2P protocol be used as an alternate interface for a special
remote? Would avoid needing temp files when proxying for special remotes,
and would support resume from offset as well for special remotes for
which that makes sense.
But this would need encryption and chunking to be implemented on top of
the P2P protocol, and all special remotes rewritten, and a bridge for the
current external special remote interface or rewrite all external special
remotes. Probably not worth it to unify the two things like this, if the
only benefit is streaming through a proxy.
## completed items for September's work on proving behavior of preferred content
* Static analysis to detect "not present", "not balanced", and similar