184 lines
6.3 KiB
Markdown
184 lines
6.3 KiB
Markdown
This is a summary todo covering several subprojects, which extend
|
|
git-annex to be able to use proxies which sit in front of a cluster of
|
|
repositories.
|
|
|
|
1. [[design/passthrough_proxy]]
|
|
2. [[design/p2p_protocol_over_http]]
|
|
3. [[design/balanced_preferred_content]]
|
|
4. [[todo/track_free_space_in_repos_via_git-annex_branch]]
|
|
5. [[todo/proving_preferred_content_behavior]]
|
|
|
|
## table of contents
|
|
|
|
[[!toc ]]
|
|
|
|
## plan
|
|
|
|
Joey has received funding to work on this.
|
|
Planned schedule of work:
|
|
|
|
* June: git-annex proxies and clusters
|
|
* July: p2p protocol over http
|
|
* August, part 1: git-annex proxy support for exporttree
|
|
* August, part 2: balanced preferred content
|
|
* September: proving behavior of balanced preferred content with proxies
|
|
* October: streaming through proxy to special remotes (especially S3)
|
|
|
|
> This project is now complete! [[done]] --[[Joey]]
|
|
|
|
[[!tag projects/openneuro]]
|
|
|
|
## some todos that spun off from this project and didn't get implemented during it:
|
|
|
|
For balanced preferred content and maxsize tracking:
|
|
|
|
* [[todo/assistant_does_not_use_LiveUpdate]]
|
|
* [[todo/git-annex_info_with_limit_overcounts]]
|
|
|
|
For p2p protocol over http:
|
|
|
|
* [[p2phttp_serve_multiple_repositories]]
|
|
* [[git-remote-annex_support_for_p2phttp]]
|
|
|
|
For proxying:
|
|
|
|
* [[proxying_for_p2phttp_and_tor-annex_remotes]]
|
|
* [[faster_proxying]]
|
|
* [[smarter_use_of_disk_when_proxying]]
|
|
|
|
## completed items for October's work on streaming through proxy to special remotes
|
|
|
|
* Stream downloads through proxy for all special remotes that indicate
|
|
they download in order.
|
|
* Added ORDERED message to external special remote protocol.
|
|
* Added DATA-PRESENT and documented in
|
|
[[tips/client_side_upload_to_a_special_remote]]
|
|
|
|
## completed items for September's work on proving behavior of preferred content
|
|
|
|
* Static analysis to detect "not present", "not balanced", and similar
|
|
unstable preferred content expressions and avoid problems with them.
|
|
* Implemented `git-annex sim` command.
|
|
* Simulated a variety of repository networks, and random preferred content
|
|
expressions, checking that a stable state is always reached.
|
|
* Fix bug that prevented anything being stored in an empty
|
|
repository whose preferred content expression uses sizebalanced.
|
|
(Identified via `git-annex sim`)
|
|
|
|
## completed items for August's work on balanced preferred content
|
|
|
|
* Balanced preferred content basic implementation, including --rebalance
|
|
option.
|
|
* Implemented [[track_free_space_in_repos_via_git-annex_branch]]
|
|
* Implemented tracking of live changes to repository sizes.
|
|
* `git-annex maxsize`
|
|
* annex.fullybalancedthreshhold
|
|
|
|
## completed items for August's work on git-annex proxy support for exporttre
|
|
|
|
* Special remotes configured with exporttree=yes annexobjects=yes
|
|
can store objects in .git/annex/objects, as well as an exported tree.
|
|
|
|
* Support proxying to special remotes configured with
|
|
exporttree=yes annexobjects=yes.
|
|
|
|
* post-retrieve: When proxying is enabled for an exporttree=yes
|
|
special remote and the configured remote.name.annex-tracking-branch
|
|
is received, the tree is exported to the special remote.
|
|
|
|
* When getting from a P2P HTTP remote, prompt for credentials when
|
|
required, instead of failing.
|
|
|
|
* Prevent `updateproxy` and `updatecluster` from adding
|
|
an exporttree=yes special remote that does not have
|
|
annexobjects=yes, to avoid foot shooting.
|
|
|
|
* Implement `git-annex export treeish --to=foo --from=bar`, which
|
|
gets from bar as needed to send to foo. Make post-retrieve use
|
|
`--to=r --from=r` to handle the multiple files case.
|
|
|
|
## completed items for July's work on p2p protocol over http
|
|
|
|
* HTTP P2P protocol design [[design/p2p_protocol_over_http]].
|
|
|
|
* addressed [[doc/todo/P2P_locking_connection_drop_safety]]
|
|
|
|
* implemented server and client for HTTP P2P protocol
|
|
|
|
* added git-annex p2phttp command to serve HTTP P2P protocol
|
|
|
|
* Make git-annex p2phttp support https.
|
|
|
|
* Allow using annex+http urls in remote.name.annexUrl
|
|
|
|
* Make http server support proxying.
|
|
|
|
* Make http server support serving a cluster.
|
|
|
|
## completed items for June's work on [[design/passthrough_proxy]]:
|
|
|
|
* UUID discovery via git-annex branch. Add a log file listing UUIDs
|
|
accessible via proxy UUIDs. It also will contain the names
|
|
of the remotes that the proxy is a proxy for,
|
|
from the perspective of the proxy. (done)
|
|
|
|
* Add `git-annex updateproxy` command (done)
|
|
|
|
* Remote instantiation for proxies. (done)
|
|
|
|
* Implement git-annex-shell proxying to git remotes. (done)
|
|
|
|
* Proxy should update location tracking information for proxied remotes,
|
|
so it is available to other users who sync with it. (done)
|
|
|
|
* Implement `git-annex initcluster` and `git-annex updatecluster` commands (done)
|
|
|
|
* Implement cluster UUID insertation on location log load, and removal
|
|
on location log store. (done)
|
|
|
|
* Omit cluster UUIDs when constructing drop proofs, since lockcontent will
|
|
always fail on a cluster. (done)
|
|
|
|
* Don't count cluster UUID as a copy in numcopies checking etc. (done)
|
|
|
|
* Tab complete proxied remotes and clusters in eg --from option. (done)
|
|
|
|
* Getting a key from a cluster should proxy from one of the nodes that has
|
|
it. (done)
|
|
|
|
* Implement upload with fanout to multiple cluster nodes and reporting back
|
|
additional UUIDs over P2P protocol. (done)
|
|
|
|
* Implement cluster drops, trying to remove from all nodes, and returning
|
|
which UUIDs it was dropped from. (done)
|
|
|
|
* `git-annex testremote` works against proxied remote and cluster. (done)
|
|
|
|
* Avoid `git-annex sync --content` etc from operating on cluster nodes by
|
|
default since syncing with a cluster implicitly syncs with its nodes. (done)
|
|
|
|
* On upload to cluster, send to nodes where its preferred content, and not
|
|
to other nodes. (done)
|
|
|
|
* Support annex.jobs for clusters. (done)
|
|
|
|
* Add `git-annex extendcluster` command and extend `git-annex updatecluster`
|
|
to support clusters with multiple gateways. (done)
|
|
|
|
* Support proxying for a remote that is proxied by another gateway of
|
|
a cluster. (done)
|
|
|
|
* Support distributed clusters: Make a proxy for a cluster repeat
|
|
protocol messages on to any remotes that have the same UUID as
|
|
the cluster. Needs extension to P2P protocol to avoid cycles.
|
|
(done)
|
|
|
|
* Proxied cluster nodes should have slightly higher cost than the cluster
|
|
gateway. (done)
|
|
|
|
* Basic support for proxying special remotes. (But not exporttree=yes ones
|
|
yet.) (done)
|
|
|
|
* Tab complete remotes in all relevant commands (done)
|
|
|
|
* Display cluster and proxy information in git-annex info (done)
|