This is a summary todo covering several subprojects, which would extend
git-annex to be able to use proxies which sit in front of a cluster of
repositories.

1. [[design/passthrough_proxy]]
2. [[design/p2p_protocol_over_http]]
3. [[design/balanced_preferred_content]]
4. [[todo/track_free_space_in_repos_via_git-annex_branch]]
5. [[todo/proving_preferred_content_behavior]]

Joey has received funding to work on this.
Planned schedule of work:

* June: git-annex proxy
* July, part 1: git-annex proxy support for exporttree
* July, part 2: p2p protocol over http
* August: balanced preferred content
* September: streaming through proxy to special remotes (especially S3)
* October: proving behavior of balanced preferred content with proxies

[[!tag projects/openneuro]]

# work notes

In development on the `proxy` branch.

For June's work on [[design/passthrough_proxy]], implementation plan:

* UUID discovery via git-annex branch. Add a log file listing UUIDs
  accessible via proxy UUIDs. It also will contain the names
  of the remotes that the proxy is a proxy for,
  from the perspective of the proxy. (done)

* Add `git-annex updateproxy` command and remote.name.annex-proxy
  configuration. (done)

* Remote instantiation for proxies. (done)

* Implement git-annex-shell proxying to git remotes. (done)

* Proxy should update location tracking information for proxied remotes,
  so it is available to other users who sync with it. (done)

* Consider getting instantiated remotes into git remote list.
  See design.

* Cycle prevention. See design.

* Make `git-annex copy --from $proxy` pick a node that contains each
  file, and use the instantiated remote for getting the file. Same for
  similar commands.

* Make `git-annex drop --from $proxy` drop, when possible, from every
  remote accessible by the proxy. Communicate partial drops somehow.

* Let `storeKey` return a list of UUIDs where content was stored,
  and make proxies accept uploads directed at them, rather than a specific
  instantiated remote, and fan out the upload to whatever nodes behind
  the proxy want it. This will need P2P protocol extensions.

* Make commands like `git-annex push` not iterate over instantiated
  remotes, and instead just send content to the proxy for fanout.

* Optimise proxy speed. See design for ideas.

* Use `sendfile()` to avoid data copying overhead when
  `receiveBytes` is being fed right into `sendBytes`.

* Encryption and chunking. See design for issues.

* indirect uploads (to be considered). See design.

* Support using a proxy when its url is a P2P address.
  (Eg tor-annex remotes.)
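
The fanout, node selection, and partial drop items above can be pictured with a toy model. This is only a sketch in Python, not git-annex code: the `Node` and `Proxy` classes, the `wants` and `locked` flags, and all method names are made up for illustration. `wants` stands in for whatever preferred content check the proxy would run, and `locked` stands in for any reason a node refuses a drop.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    """Hypothetical repository sitting behind the proxy."""
    uuid: str
    wants: bool = True     # stand-in for a preferred content check
    locked: bool = False   # stand-in for any reason a drop is refused
    keys: Set[str] = field(default_factory=set)

    def store(self, key: str) -> bool:
        self.keys.add(key)
        return True

    def drop(self, key: str) -> bool:
        if self.locked or key not in self.keys:
            return False
        self.keys.remove(key)
        return True


@dataclass
class Proxy:
    """Hypothetical proxy in front of a list of nodes."""
    nodes: List[Node]

    def store_key(self, key: str) -> List[str]:
        # Fan the upload out to every node that wants it, and report
        # the UUIDs where the content ended up being stored.
        return [n.uuid for n in self.nodes if n.wants and n.store(key)]

    def pick_source(self, key: str) -> Optional[str]:
        # Pick the first node that contains the key, if any.
        for n in self.nodes:
            if key in n.keys:
                return n.uuid
        return None

    def drop_key(self, key: str) -> Dict[str, List[str]]:
        # Try to drop from every node holding the key, and report
        # which drops succeeded, so a partial drop is visible.
        dropped: List[str] = []
        kept: List[str] = []
        for n in self.nodes:
            if key not in n.keys:
                continue
            (dropped if n.drop(key) else kept).append(n.uuid)
        return {"dropped": dropped, "kept": kept}
```

With three nodes where one does not want the content and one refuses drops, `store_key` returns only the UUIDs that stored the key, and `drop_key` returns both a `dropped` and a `kept` list. Some equivalent of that second list is what would need to be communicated back to the client for a partial drop.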