oops, add the new todos meant to be in prev commit
parent 87871f724e
commit 3c973aba57
5 changed files with 94 additions and 0 deletions
8  doc/todo/assistant_does_not_use_LiveUpdate.mdwn  Normal file
@@ -0,0 +1,8 @@
The assistant is using NoLiveUpdate, but it should be possible to plumb
a LiveUpdate through it from preferred content checking to location log
updating.
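
A minimal sketch of the shape of that plumbing, with hypothetical helper
names; only the NoLiveUpdate and LiveUpdate constructor names come from
git-annex, everything else is illustrative:

    import Control.Concurrent.MVar

    -- Hypothetical stand-in for git-annex's LiveUpdate type.
    data LiveUpdate = NoLiveUpdate | LiveUpdate (MVar Integer)

    -- Preferred content checking would allocate the live update...
    startLiveUpdate :: IO LiveUpdate
    startLiveUpdate = LiveUpdate <$> newMVar 0

    -- ...and location log updating would complete it with the new repo
    -- size, instead of the assistant passing NoLiveUpdate throughout.
    finishLiveUpdate :: LiveUpdate -> Integer -> IO ()
    finishLiveUpdate NoLiveUpdate _ = return ()
    finishLiveUpdate (LiveUpdate v) sz = modifyMVar_ v (\_ -> return sz)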

The benefit would be that, when using balanced preferred content
expressions, the assistant would get live updates about repo sizes.

(This is a deferred item from the [[todo/git-annex_proxies]] megatodo.) --[[Joey]]
35  doc/todo/faster_proxying.mdwn  Normal file
@@ -0,0 +1,35 @@
Not that proxying is super slow, but it does involve bouncing content
through the proxy, and could be made faster. Some ideas:

* A proxy to a local git repository spawns git-annex-shell
  to communicate with it. It would be more efficient to operate
  directly on the Remote, especially when transferring content to/from it.
  But: When a cluster has several nodes that are local git repositories,
  and is sending data to all of them, this would need an alternate
  interface to `storeKey` that supports streaming chunks of a
  ByteString (see the first sketch after this list).

* Use `sendfile()` to avoid data copying overhead when
  `receiveBytes` is being fed right into `sendBytes`.
  Library to use:
  <https://hackage.haskell.org/package/hsyscall-0.4/docs/System-Syscall.html>
  (A binding is sketched after this list.)

* Getting a key from a cluster currently picks from among
  the lowest cost nodes at random. This could be smarter,
  eg prefer to avoid using nodes that are doing other transfers at the
  same time (see the selection sketch after this list).

* The cost of a proxied node that is accessed via an intermediate gateway
  is currently the same as a node accessed via the cluster gateway. So in
  such a situation, git-annex may make a suboptimal choice of path.
  To fix this, there needs to be some way to tell how many hops through
  gateways it takes to get to a node. Currently the only way is to
  guess based on the number of dashes in the node name, which is not
  satisfying (see the last sketch after this list).

  Even counting hops is not very satisfying; one cluster gateway could
  be much more expensive to traverse than another one.

  If seriously tackling this, it might be worth making enough information
  available to use spanning tree protocol for routing inside clusters.
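
For the first idea, a sketch of what an alternate streaming interface
could look like; StreamingStore and its fields are hypothetical names,
not git-annex's actual Remote API:

    import qualified Data.ByteString as S

    -- Hypothetical alternate interface to storeKey: instead of handing
    -- the remote a complete file, feed it chunks as they arrive, so one
    -- incoming stream can be fanned out to several local repositories.
    data StreamingStore = StreamingStore
        { storeChunk :: S.ByteString -> IO ()  -- called per chunk
        , storeDone  :: IO Bool                -- finalize; True on success
        }

    -- Fan one chunk out to every node's store in turn.
    fanout :: [StreamingStore] -> S.ByteString -> IO ()
    fanout stores chunk = mapM_ (\s -> storeChunk s chunk) stores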
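
For the second idea, a sketch of getting at sendfile(2) via a direct FFI
binding; the hsyscall package linked above would be another way to reach
the syscall:

    {-# LANGUAGE ForeignFunctionInterface #-}

    import Foreign (Ptr, nullPtr)
    import Foreign.C.Types (CInt, CSize)
    import System.Posix.Types (COff, CSsize, Fd(..))

    -- ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
    foreign import ccall unsafe "sendfile"
        c_sendfile :: CInt -> CInt -> Ptr COff -> CSize -> IO CSsize

    -- Copy up to n bytes from infd to outfd inside the kernel, avoiding
    -- the userspace copy that receiveBytes/sendBytes currently do;
    -- returns the number of bytes sent, or -1 on error.
    kernelCopy :: Fd -> Fd -> CSize -> IO CSsize
    kernelCopy (Fd outfd) (Fd infd) n = c_sendfile outfd infd nullPtr n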
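
For the third idea, a sketch of a slightly smarter picker; the Node type
and its fields are hypothetical, and a non-empty node list is assumed:

    import System.Random (randomRIO)

    -- Hypothetical node representation; git-annex's real types differ.
    data Node = Node
        { nodeCost        :: Int
        , activeTransfers :: Int
        }

    -- Pick among the lowest cost nodes as now, but prefer ones that
    -- are not already busy with other transfers.
    pickNode :: [Node] -> IO Node
    pickNode nodes = (candidates !!) <$> randomRIO (0, length candidates - 1)
      where
        lowest     = minimum (map nodeCost nodes)
        cheapest   = filter ((== lowest) . nodeCost) nodes
        idle       = filter ((== 0) . activeTransfers) cheapest
        candidates = if null idle then cheapest else idle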
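
And for concreteness, the dash-counting guess that is currently the only
option for estimating hops:

    -- Guess hop count from a node name like "gateway-subgateway-node".
    -- Unsatisfying, since ordinary node names can contain dashes too.
    guessHops :: String -> Int
    guessHops = length . filter (== '-')
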
(This is a deferred item from the [[todo/git-annex_proxies]] megatodo.) --[[Joey]]
12  doc/todo/git-remote-annex_support_for_p2phttp.mdwn  Normal file
@@ -0,0 +1,12 @@
Should be possible to use a git-remote-annex annex::$uuid url as
remote.foo.url with remote.foo.annexUrl using annex+http, and so
not need a separate web server to serve the git repository when using
`git-annex p2phttp`.

Doesn't work currently because git-remote-annex urls only support
special remotes.

It would need a new form of git-remote-annex url, eg:

    annex::$uuid?annex+http://example.com/git-annex/
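
A sketch of parsing that proposed form; parseAnnexUrl is a hypothetical
helper, not git-remote-annex's actual parser:

    import Data.List (stripPrefix)

    -- Split the proposed "annex::$uuid?annex+http://..." form into the
    -- uuid and the optional annexUrl; a bare "annex::$uuid" keeps its
    -- current special remote meaning.
    parseAnnexUrl :: String -> Maybe (String, Maybe String)
    parseAnnexUrl u = do
        rest <- stripPrefix "annex::" u
        return $ case break (== '?') rest of
            (uuid, '?' : annexurl) -> (uuid, Just annexurl)
            (uuid, _)              -> (uuid, Nothing)
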
(This is a deferred item from the [[todo/git-annex_proxies]] megatodo.) --[[Joey]]
13  doc/todo/proxying_for_p2phttp_and_tor-annex_remotes.mdwn  Normal file
@@ -0,0 +1,13 @@
git-annex can proxy for remotes that are accessed locally or over
ssh, as well as special remotes. But, it cannot proxy for remotes that
themselves have an annex+http annexUrl.

This would need a translation from the P2P protocol to the servant
client. Should not be very hard to implement if someone needs it for
some reason. A rough sketch of the shape of it follows.
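
In this sketch the request type and endpoint paths are illustrative
only, not the real P2P protocol messages or the p2phttp API:

    -- Map each P2P protocol request onto an HTTP call against the
    -- remote's annex+http annexUrl. All names here are made up.
    data P2PRequest
        = CheckPresent String  -- key
        | Get String
        | Put String

    endpointFor :: String -> P2PRequest -> String
    endpointFor base (CheckPresent key) = base ++ "/checkpresent?key=" ++ key
    endpointFor base (Get key)          = base ++ "/get?key=" ++ key
    endpointFor base (Put key)          = base ++ "/put?key=" ++ key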

Also, git-annex could support proxying to remotes whose url is a P2P
address. Eg, tor-annex remotes. This only needs a way to
generate a RemoteSide for them.

(This is a deferred item from the [[todo/git-annex_proxies]] megatodo.) --[[Joey]]
26  doc/todo/smarter_use_of_disk_when_proxying.mdwn  Normal file
@@ -0,0 +1,26 @@
When proxying for a special remote, downloads can stream in from it and
out of the proxy, but that does happen via a temporary file, which grows
to the full size of the file being downloaded. And uploads to a special
remote get buffered to a temporary file.

It would be nice to do full streaming without temp files, but it's also
a hard change to make.

Some improvements that could be made without making such a big change:

* When an upload to a cluster is distributed to multiple special remotes,
  a temporary file is written for each one, which may even happen in
  parallel. This is a lot of extra work and may use excess disk space.
  It should be possible to only write a single temp file (see the first
  sketch after this list).

* Check annex.diskreserve when proxying for special remotes
  to avoid the proxy's disk filling up with the temporary object file
  cached there (see the second sketch after this list).

* Resuming an interrupted download from a proxied special remote makes
  the proxy re-download the whole content. It could instead keep some of
  the object files around when the client does not send SUCCESS. This
  would use more disk, but could be limited to eg, the last 2 or so
  (see the last sketch after this list).
  The [[design/passthrough_proxy]] design doc has some more thoughts
  about this.
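
For the first item, a sketch of the single-temp-file approach; the
per-node store actions are hypothetical stand-ins for real special
remote uploads:

    import qualified Data.ByteString.Lazy as B

    -- Buffer the uploaded content to one temp file, then hand that
    -- same file to each node's store action, instead of writing one
    -- temp file per node.
    distributeUpload :: FilePath -> B.ByteString -> [FilePath -> IO Bool] -> IO Bool
    distributeUpload tmpfile content stores = do
        B.writeFile tmpfile content
        and <$> mapM ($ tmpfile) stores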
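
For the second item, the check itself could be as simple as this sketch;
obtaining the free space and the configured reserve is left to
git-annex's existing annex.diskreserve machinery:

    -- Decide whether a temporary object file of the given size may be
    -- written without eating into the reserve.
    canBufferTemp
        :: Integer  -- free space on the proxy's disk, in bytes
        -> Integer  -- annex.diskreserve, in bytes
        -> Integer  -- size of the object file to buffer
        -> Bool
    canBufferTemp freespace reserve objectsize =
        freespace - objectsize >= reserve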
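
For the third item, a sketch of bounding the cache of partial object
files; the caller would supply the cached paths sorted by recency:

    import System.Directory (removeFile)

    -- Keep only the n most recently used partial object files (the
    -- list is assumed sorted newest first), deleting the rest to bound
    -- the extra disk use.
    pruneCache :: Int -> [FilePath] -> IO ()
    pruneCache n = mapM_ removeFile . drop n
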
(This is a deferred item from the [[todo/git-annex_proxies]] megatodo.) --[[Joey]]