From b9889917a32334e4dcbb101fa74c4adbc33871ab Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Tue, 25 Jun 2024 15:27:03 -0400
Subject: [PATCH] thoughts on cycles

Rejected the idea of automatically instantiating remotes for
proxies-of-proxies. That needs cycle protection, while the current
behavior, which happened for free, is that running git-annex updateproxy
on the proxy can be used to configure it, but only for topologies that
actually exist.
---
 doc/design/passthrough_proxy.mdwn | 35 ++++++++-----------------------
 doc/todo/git-annex_proxies.mdwn   | 11 +++-------
 2 files changed, 12 insertions(+), 34 deletions(-)

diff --git a/doc/design/passthrough_proxy.mdwn b/doc/design/passthrough_proxy.mdwn
index 24917f8aac..fe91ac7e7f 100644
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -462,36 +462,19 @@ proxy, etc.
 
 Since the proxied repo uuid is communicated to git-annex-shell via
 --uuid, a repo that advertises proxying for itself will be connected to
-with its own uuid. No proxying is done in this case. Same happens with a
-larger cycle.
-
-Instantiating remotes needs to identity cycles and break them. Otherwise
-it would construct an infinite number of proxied remotes with names
-like "foo-foo-foo-foo-..." or "foo-bar-foo-bar-..."
-
-Once `git-annex copy --to proxy` is implemented, and the proxy decides
-where to send content that is being sent directly to it, cycles will
-become an issue with that as well.
+with its own uuid. No proxying is done in this case.
 
 What if repo A is a proxy and has repo B as a remote. Meanwhile, repo B is
-a proxy and has repo A as a remote?
+a proxy and has repo A as a remote? git-annex-shell on repo A will get
+A's uuid, and so will operate on it directly without proxying. So larger
+cycles are also not a problem on the proxy side.
 
-An upload to repo A will start by checking if repo B wants the content and if so,
-start an upload to repo B. Then the same happens on repo B, leading it to
-start an upload to repo A.
+On the client side, instantiating remotes needs to identity cycles and
+break them. Otherwise it would construct an infinite number of proxied
+remotes with names like "foo-foo-foo-foo-..." or "foo-bar-foo-bar-..."
 
-At this point, it might be possible for git-annex to detect the cycle,
-if the proxy uses a transfer lock file. If repo B or repo A had some other
-remote that is not part of a cycle, they could deposit the upload there and
-the upload still succeed. Otherwise the upload would fail, which is
-probably the best that can be done with such a broken configuration.
-
-So, it seems like proxies would need to take transfer locks for uploads,
-even though the content is being proxied to elsewhere.
-
-Dropping could have similar cycles with content presence locking, which
-needs to be thought through as well. A cycle of the actual dropContent
-operation might also be possible.
+Clusters could also have cycles, if a cluster's UUID were configured as
+a node of itself, or of another cluster that was a node of it.
 
 ## exporttree=yes
 
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index e3491b57cc..5c8ee8e875 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -33,14 +33,9 @@ For June's work on [[design/passthrough_proxy]], remaining todos:
 
 * Basic proxying to special remote support (non-streaming).
 
-* Support proxies-of-proxies better, eg foo-bar-baz.
-  Currently, it does work, but have to run `git-annex updateproxy`
-  on foo in order for it to notice the bar-baz proxied remote exists,
-  and record it as foo-bar-baz. Make it skip recording proxies of
-  proxies like that, and instead automatically generate those from the log.
-  (With cycle prevention there of course.)
-
-* Cycle prevention including cluster-in-cluster cycles. See design.
+* Make sure that cluster-in-cluster cycles are prevented.
+  (Actually supporting cluster-in-cluster is optional, and it might
+  be added later.)
 
 * Optimise proxy speed. See design for ideas.
 