P2P protocol version 2, adding SUCCESS-PLUS and ALREADY-HAVE-PLUS

Client side support for SUCCESS-PLUS and ALREADY-HAVE-PLUS is complete. When a PUT stores to additional repositories beyond the expected one, the location log is updated with the additional UUIDs that contain the content.

Started implementing PUT fanout to multiple remotes for clusters. It is untested, and I fear fencepost errors in the relative offset calculations. It is also missing proxying for the protocol after DATA.
2024-06-18 12:07:01 -04:00 · f18740699e
commit f18740699e
parent ca08f3fcc2
12 changed files with 206 additions and 61 deletions
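
The client-side behaviour described in the commit message can be sketched roughly as follows. This is a minimal Python illustration, not the code from this commit (git-annex itself is written in Haskell). The names `PutResult`, `parse_put_reply` and `locations_to_record` are hypothetical, and the sketch assumes the UUID list follows the keyword on the same line, separated by spaces.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PutResult:
    """Outcome of a PUT as seen by the client (hypothetical type)."""
    present: bool                      # content is now on the server side
    extra_uuids: List[str] = field(default_factory=list)


def parse_put_reply(line: str, protocol_version: int) -> PutResult:
    """Parse one reply line to a PUT.

    Assumes a space-separated wire format such as
    "SUCCESS-PLUS uuid1 uuid2"; that layout is an assumption of this sketch.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty reply")
    keyword, rest = tokens[0], tokens[1:]
    if keyword in ("SUCCESS", "ALREADY-HAVE"):
        return PutResult(present=True)
    if keyword == "FAILURE":
        return PutResult(present=False)
    if keyword in ("SUCCESS-PLUS", "ALREADY-HAVE-PLUS"):
        # The -PLUS replies only exist from protocol version 2 onward.
        if protocol_version < 2:
            raise ValueError(f"{keyword} is not valid before protocol version 2")
        return PutResult(present=True, extra_uuids=rest)
    raise ValueError(f"unexpected reply: {keyword}")


def locations_to_record(expected_uuid: str, result: PutResult,
                        expected_is_cluster: bool = False) -> List[str]:
    """UUIDs the location log should list for this key after the PUT.

    Cluster UUIDs are not written to the location log, so the expected
    UUID is skipped when it names a cluster.
    """
    if not result.present:
        return []
    candidates = [] if expected_is_cluster else [expected_uuid]
    candidates.extend(result.extra_uuids)
    unique: List[str] = []
    for uuid in candidates:            # de-duplicate, keeping order
        if uuid not in unique:
            unique.append(uuid)
    return unique
```

For a PUT to a cluster, `locations_to_record("cluster-uuid", parse_put_reply("SUCCESS-PLUS node1 node2", 2), expected_is_cluster=True)` would yield only the node UUIDs, which matches the point that cluster UUIDs never reach the location log.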
--- a/doc/design/p2p_protocol.mdwn
+++ b/doc/design/p2p_protocol.mdwn
@@ -55,7 +55,7 @@ any authentication.

 The client sends the highest protocol version it supports:

-	VERSION 1
+	VERSION 2

 The server responds with the highest protocol version it supports
 that is less than or equal to the version the client sent:
@@ -132,7 +132,14 @@ spaces, since it's not the last token in the line. Use '%' to indicate
 whitespace.)

 The server may respond with ALREADY-HAVE if it already
-had the conent of that key. Otherwise, it responds with:
+had the content of that key.
+
+In protocol version 2, the server can optionally reply with
+ALREADY-HAVE-PLUS. The subsequent list of UUIDs are additional
+UUIDs where the content is stored, in addition to the UUID where
+the client was going to send it.
+
+Otherwise, it responds with:

 	PUT-FROM Offset

@@ -152,6 +159,10 @@ was being sent.
 If the server successfully receives the data and stores the content,
 it replies with SUCCESS. Otherwise, FAILURE.

+In protocol version 2, the server can optionally reply with SUCCESS-PLUS.
+The subsequent list of UUIDs are additional UUIDs where the content was
+stored, in addition to the UUID where the client was sending it.
+
 ## Getting content from the server

 To get content from the server, the client sends:
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -251,31 +251,19 @@ No other protocol extensions or special cases should be needed.
 If we want to send a file to multiple repositories that are behind the same
 proxy, it would be wasteful to upload it through the proxy repeatedly.

-Perhaps a good user interface to this is `git-annex copy --to proxy`.
-The proxy could fan out the upload and store it in one or more nodes behind
-it. Using preferred content to select which nodes to use.
-This would need `storeKey` to be changed to allow returning a UUID (or UUIDs)
-where the content was actually stored.
+This is certainly needed when doing `git-annex copy --to remote-cluster`,
+the cluster picks the nodes to store the content in, and it needs to report
+back some UUID that is different than the cluster UUID, in order for the
+location log to get updated. (Cluster UUIDs are not written to the location
+log.) So this will need a change to the P2P protocol to support reporting
+back additional UUIDs where the content was stored.

-Alternatively, `git-annex copy --to proxy-foo` could notice that proxy-bar
-also wants the content, and fan out a copy to there. Then it could 
-record in its git-annex branch that the content is present in proxy-bar.
-If the user later does `git-annex copy --to proxy-bar`, it would avoid
-another upload (and the user would learn at that point that it was in
-proxy-bar). This avoids needing to change the `storeKey` interface.
-
-Should a proxy always fanout? if `git-annex copy --to proxy` is what does
-fanout, and `git-annex copy --to proxy-foo` doesn't, then the user has
-content. But if the latter does fanout, that might be annoying to users who
-want to use proxies, but want full control over what lands where, and don't
-want to use preferred content to do it. So probably fanout should be
-configurable. But it can't be configured client side, because the fanout
-happens on the proxy. Seems like remote.name.annex-fanout could be set to
-false to prevent fanout to a specific remote. (This is analagous to a
-remote having `git-annex assistant` running on it, it might fan out uploads
-to it to other repos, and only the owner of that repo can control it.)
-
-Alternatively, fanout could be limited to clusters.
+This might also be useful for proxies. `git-annex copy --to proxy-foo`
+could notice that proxy-bar also wants the content, and fan out a copy to
+there. But that might be annoying to users, who want full control over what
+goes where when using a proxy. Seems it would need a config setting. But
+since clusters will support fanout, it seems unnecessary to make proxies
+also support it.

 A command like `git-annex push` would see all the instantiated remotes and
 would pick ones to send content to. If fanout is done, this would