From 9984252ab5a0de5cac0fadeaecec055ae86bdb5e Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 22 Jul 2024 19:50:08 -0400 Subject: [PATCH] P2P protocol is finalized --- doc/design/p2p_protocol_over_http.mdwn | 492 ++++++++++++++---- doc/design/p2p_protocol_over_http/draft1.mdwn | 423 --------------- doc/todo/git-annex_proxies.mdwn | 11 +- 3 files changed, 388 insertions(+), 538 deletions(-) delete mode 100644 doc/design/p2p_protocol_over_http/draft1.mdwn diff --git a/doc/design/p2p_protocol_over_http.mdwn b/doc/design/p2p_protocol_over_http.mdwn index 6fe7ea4d4e..53863760bc 100644 --- a/doc/design/p2p_protocol_over_http.mdwn +++ b/doc/design/p2p_protocol_over_http.mdwn @@ -1,153 +1,427 @@ [[!toc ]] -## motivation +## introduction The [[P2P protocol]] is a custom protocol that git-annex speaks over a ssh -connection (mostly). This is a design working on supporting the P2P -protocol over HTTP. +connection (mostly). This is a translation of that protocol to HTTP. -Upload of annex objects to git remotes that use http is currently not -supported by git-annex, and this would be a generally very useful addition. +## base64 encoding of keys, uuids, and filenames -For use cases such as OpenNeuro's javascript client, ssh is too difficult -to support, so they currently use a special remote that talks to a http -endpoint in order to upload objects. Implementing this would let them -talk to git-annex over http. +A git-annex key can contain text in any encoding. So can a filename, +and it's even possible, though unlikely, that the UUID of a git-annex +repository might. -With the [[passthrough_proxy]], this would let clients configure a single -http remote that accesses a more complicated network of git-annex -repositories. +But this protocol requires that UTF-8 be used throughout, except +where bodies use `Content-Type: application/octet-stream`. 
-## integration with git +So this protocol allows using +[base64url](https://datatracker.ietf.org/doc/html/rfc4648#section-5) +encoding for such values. Any key, filename, or UUID wrapped in square +brackets is a base64url encoded value. +For example, "[Zm9v]" is the same as "foo". -A webserver that is configured to serve a git repository either serves the -files in the repository with dumb http, or uses the git-http-backend CGI -program for url paths under eg `/git/`. +A filename like "[foo]" will need to itself be encoded that way: "[W2Zvb10=]" -To integrate with that, git-annex would need a git-annex-http-backend CGI -program, that the webserver is configured to run for url paths under -`/git/.*/annex/`. +## authentication -So, for a remote with an url `http://example.com/git/foo`, git-annex would -use paths under `http://example.com/git/foo/annex/` to run its CGI. +Some requests need authentication. Which requests do depends on the +configuration of the HTTP server. When a request needs authentication, +it will fail with 401 Unauthorized. -But, the CGI interface is a poor match for the P2P protocol. +Authentication is done using HTTP basic auth. The realm to use when +authenticating is "git-annex". The charset is UTF-8. -A particular problem is that `LOCKCONTENT` would need to be in one CGI -request, followed by another request to `UNLOCKCONTENT`. Unless -git-annex-http-backend forked a daemon to keep the content locked, it would -not be able to retain a file lock across the 2 requests. While the 10 -minute retention lock would paper over that, UNLOCKCONTENT would not be -able to delete the retention lock, because there is no way to know if -another LOCKCONTENT was received later. So LOCKCONTENT would always lock -content for 10 minutes. Which would result in some undesirable behaviors. +When authentication is successful but does not allow a request to be +performed, it will fail with 403 Forbidden. -Another problem is with proxies and clusters. 
The CGI would need to open -ssh (or http) connections to the proxied repositories and cluster nodes -each time it is run. That would add a lot of latency to every request. +Note that HTTP basic auth is not encrypted so is only secure when used +over HTTPS. -And running a git-annex process once per CGI request also makes git-annex's -own startup speed, which is ok but not great, add latency. And each time -the CGI needed to change the git-annex branch, it would have to commit on -shutdown. Lots of time and space optimisations would be prevented by using -the CGI interface. +## protocol version -So, rather than having the CGI program do anything in the repository -itself, have it pass each request through to a long-running server. -(This does have the downside that files would get double-copied -through the CGI, which adds some overhead.) -A reasonable way to do that would be to have a webserver speaking a -HTTP version of the git-annex P2P protocol and the CGI just talks to that. +Requests are versioned. The versions correspond to +P2P protocol versions. The version is part of the request path, +eg "v3" -The CGI program then becomes tiny, and just needs to know the url to -connect to the git-annex HTTP server. +If the server does not support a particular protocol version, the +request will fail with a 404, and the client should fall +back to an earlier protocol version. -Alternatively, a remote's configuration could include that url, and -then we don't need the complication and overhead of the CGI program at all. -Eg: +## common request parameters - git config remote.origin.annex-url http://example.com:8080/ +Every request supports this parameter, and unless documented +otherwise, it is required to be included. -So, the rest of this design will focus on implementing that. The CGI -program can be added later if desired, so avoid users needing to configure -an additional thing. 
+* `clientuuid` -Note that, one nice benefit of having a separate annex-url is it allows -having remote.origin.url on eg github, but with an annex-url configured -that remote can also be used as a git-annex repository. + The value is the UUID of the git-annex repository of the client. -## approach 1: websockets +Any request may also optionally include these parameters: -The client connects to the server over a websocket. From there on, -the protocol is encapsulated in websockets. +* `bypass` -This seems nice and simple to implement, but not very web native. Anyone -wanting to talk to this web server would need to understand the P2P -protocol. Just to upload a file would need to deal with AUTH, -AUTH-SUCCESS, AUTH-FAILURE, VERSION, PUT, ALREADY-HAVE, PUT-FROM, DATA, -INVALID, VALID, SUCCESS, and FAILURE messages. Seems like a lot. + The value is the UUID of a cluster gateway, which the server should avoid + connecting to when serving a cluster. This is the equivalent of the + `BYPASS` message in the [[P2P_Protocol]]. -Some requests like `LOCKCONTENT` do need full duplex communication like -websockets provide. But, it might be more web native to only use websockets -for that request, and not for everything. + This parameter can be given multiple times to list several cluster + gateway UUIDs. -## approach 2: web-native API + This parameter is only available for v2 and above. -Another approach is to define a web-native API with endpoints that -correspond to each action in the P2P protocol. +[Internally, git-annex can use these common parameters, plus the protocol +version, and remote UUID, to create a P2P session. The P2P session is +driven through the AUTH, VERSION, and BYPASS messages, leaving the session +ready to service requests.] 
-Something like this: +## requests - > POST /git-annex/v1/AUTH?clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.0 - < AUTH-SUCCESS ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6 +### GET /git-annex/$uuid/key/$key - > POST /git-annex/v1/CHECKPRESENT?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925&serveruuid=ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6 HTTP/1.0 - > SUCCESS +This is a simple, unversioned interface to get the content of a key +from a repository. - > POST /git-annex/v1/PUT-FROM?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925&serveruuid=ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6 HTTP/1.0 - < PUT-FROM 0 +It is not part of the P2P protocol per se, but is provided to let +other clients than git-annex easily download the content of keys from the +http server. - > POST /git-annex/v1/PUT?key=SHA1--foo&associatedfile=bar&put-from=0&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925&serveruuid=ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6 HTTP/1.0 - > Content-Type: application/octet-stream - > Content-Length: 20 - > foo - > {"valid": true} - < {"stored": true} +When the key is not present on the server, it will respond +with 404 Not Found. -(In the last example above "foo" is the content, it is followed by a line of json. -This seems better than needing an entire other request to indicate validitity.) +### GET /git-annex/$uuid/v3/key/$key -This needs a more complex spec. But it's easier for others to implement, -especially since it does not need a session identifier, so the HTTP server can -be stateless. +Get the content of a key from the repository with the specified uuid. -A full draft protocol for this is being developed at [[p2p_protocol_over_http/draft1]]. +Example: -## HTTP GET - -It should be possible to support a regular HTTP get of a key, with -no additional parameters, so that annex objects can be served to other clients -from this web server. 
- - > GET /git-annex/key/SHA1--foo HTTP/1.0 + > GET /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/key/SHA1--foo?associatedfile=bar&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < X-git-annex-data-length: 3 + < Content-Type: application/octet-stream + < < foo -Although this would be a special case, not used by git-annex, because the P2P -protocol's GET has the complication of offsets, and of the server sending -VALID/INVALID after the content, and of needing to know the client's UUID in -order to update the location log. +All parameters are optional, including the common parameters, and these: -## Problem: CONNECT +* `associatedfile` -The CONNECT message allows both sides of the P2P protocol to send DATA -messages in any order. This seems difficult to encapsulate in HTTP. + The name of a file in the git repository, for informational purposes + only. -Probably this can be not implemented, it's probably not needed for a HTTP -remote? This is used to tunnel git protocol over the P2P protocol, but for -a HTTP remote the git repository can be accessed over HTTP as well. +* `offset` -## security + Number of bytes to skip sending from the beginning of the file. -Should support HTTPS and/or be limited to only HTTPS. +Request headers are currently ignored, so eg Range requests are +not supported. (This would be possible to implement, up to a point.) -Authentication via http basic auth? +The body of the request is empty. + +The server's response will have a `Content-Type` header of +`application/octet-stream`. + +The server's response will have a `X-git-annex-data-length` +header that indicates the number of bytes of content that are expected to +be sent. Note that there is no Content-Length header. + +The body of the response is the content of the key. + +If the length of the body is different than what the +X-git-annex-data-length header indicated, then the data is invalid and +should not be used. 
This can happen when eg, the data was being sent from +an unlocked annexed file, which got modified while it was being sent. + +When the content is not present, the server will respond with +422 Unprocessable Content. + +### GET /git-annex/$uuid/v2/key/$key + +Identical to v3. + +### GET /git-annex/$uuid/v1/key/$key + +Identical to v3. + +### GET /git-annex/$uuid/v0/key/$key + +Same as v3, except the X-git-annex-data-length header is not used. +Additional checking client-side will be required to validate the data. + +### POST /git-annex/$uuid/v3/checkpresent + +Checks if a key is currently present on the server. + +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/checkpresent?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < {"present": true} + +There is one required additional parameter, `key`. + +The body of the request is empty. + +The server responds with a JSON object with a "present" field that is true +if the key is present, or false if it is not present. + +### POST /git-annex/$uuid/v2/checkpresent + +Identical to v3. + +### POST /git-annex/$uuid/v1/checkpresent + +Identical to v3. + +### POST /git-annex/$uuid/v0/checkpresent + +Identical to v3. + +### POST /git-annex/$uuid/v3/lockcontent + +Locks the content of a key on the server, preventing it from being removed. + +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/lockcontent?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < {"locked": true, "lockid": "foo"} + +There is one required additional parameter, `key`. + +The server will reply with `{"locked": true}` if it was able +to lock the key, or `{"locked": false}` if it was not. + +The key will remain locked for 10 minutes. But, usually `keeplocked` +is used to control the lifetime of the lock, using the "lockid" +parameter from the server's reply. (See below.) + +### POST /git-annex/$uuid/v2/lockcontent + +Identical to v3. 
+ +### POST /git-annex/$uuid/v1/lockcontent + +Identical to v3. + +### POST /git-annex/$uuid/v0/lockcontent + +Identical to v3. + +### POST /git-annex/$uuid/v3/keeplocked + +Controls the lifetime of a lock on a key that was earlier obtained +with `lockcontent`. + +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/keeplocked?lockid=foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + > Connection: Keep-Alive + > Keep-Alive: timeout=1200 + [some time later] + > {"unlock": true} + < {"locked": false} + +There is one required additional parameter, `lockid`. + +This uses long polling. So it's important to use +Connection and Keep-Alive headers. + +This keeps an active lock from expiring until the client sends +`{"unlock": true}`, and then it immediately unlocks it. + +The client can send `{"unlock": false}` any number of times first. +This has no effect, but may be useful to keep the connection alive. + +This must be called within ten minutes of `lockcontent`, otherwise +the lock will have already expired when this runs. Note that this +does not indicate if the lock expired, it always returns +`{"locked": false}`. + +If the connection is closed before the client sends `{"unlock": true}`, +or even if the web server gets shut down, the content will remain +locked for 10 minutes from the time it was first locked. + +Note that the common parameters bypass and clientuuid, while +accepted, have no effect. + +### POST /git-annex/$uuid/v2/keeplocked + +Identical to v3. + +### POST /git-annex/$uuid/v1/keeplocked + +Identical to v3. + +### POST /git-annex/$uuid/v0/keeplocked + +Identical to v3. + +### POST /git-annex/$uuid/v3/remove + +Remove a key's content from the server. + +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/remove?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < {"removed": true} + +There is one required additional parameter, `key`. + +The body of the request is empty. 
+ +The server responds with a JSON object with a "removed" field that is true +if the key was removed (or was not present on the server), +or false if the key was not able to be removed. + +The JSON object can have an additional field "plusuuids" that is a list of +UUIDs of other repositories that the content was removed from. + +### POST /git-annex/$uuid/v2/remove + +Identical to v3. + +### POST /git-annex/$uuid/v1/remove + +Same as v3, except the JSON will not include "plusuuids". + +### POST /git-annex/$uuid/v0/remove + +Identical to v1. + +### POST /git-annex/$uuid/v3/remove-before + +Remove a key's content from the server, but only before a specified time. + +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/remove-before?timestamp=4949292929&key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < {"removed": true} + +This is the same as the `remove` request, but with an additional parameter, +`timestamp`. + +If the server's monotonic clock is past the specified timestamp, the +removal will fail and the server will respond with: `{"removed": false}` + +This is used to avoid removing content after a point in +time where it is no longer locked in other repositories. + +### POST /git-annex/$uuid/v3/gettimestamp + +Gets the current timestamp from the server. + +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/gettimestamp?clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < {"timestamp": 59459392} + +The body of the request is empty. + +The server responds with a JSON object with a timestamp field that has the +current value of its monotonic clock, as a number of seconds. + +Important: If multiple servers are serving this protocol for the same +repository, they MUST all use the same monotonic clock. + +### POST /git-annex/$uuid/v3/put + +Store content on the server. 
+ +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/put?key=SHA1--foo&associatedfile=bar&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + > Content-Type: application/octet-stream + > X-git-annex-data-length: 3 + > + > foo + < {"stored": true} + +There is one required additional parameter, `key`. + +There are also these optional parameters: + +* `associatedfile` + + The name of a file in the git repository, for informational purposes + only. + +* `offset` + + Number of bytes that have been omitted from the beginning of the file. + Usually this will be determined by making a `putoffset` request. + +The `Content-Type` header should be `application/octet-stream`. + +The `X-git-annex-data-length` header must be included. It indicates the number +of bytes of content that are expected to be sent. +Note that there is no need to send a Content-Length header. + +If the length of the body is different than what the +X-git-annex-data-length header indicated, then the data is invalid and +should not be used. This can happen when eg, the data was being sent from +an unlocked annexed file, which got modified while it was being sent. + +The server responds with a JSON object with a field "stored" +that is true if it received the data and stored the content. + +The JSON object can have an additional field "plusuuids" that is a list of +UUIDs of other repositories that the content was stored to. + +### POST /git-annex/$uuid/v2/put + +Identical to v3. + +### POST /git-annex/$uuid/v1/put + +Same as v3, except the JSON will not include "plusuuids". + +### POST /git-annex/$uuid/v0/put + +Same as v1, except additional checking is done to validate the data. + +### POST /git-annex/$uuid/v3/putoffset + +Asks the server what `offset` can be used in a `put` of a key. + +This should usually be used right before sending a `put` request. +The offset may not be valid after some point in time, which could result in +the `put` request failing. 
+ +Example: + + > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/putoffset?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 + < {"offset": 10} + +There is one required additional parameter, `key`. + +The body of the request is empty. + +The server responds with a JSON object with an "offset" field that +is the largest allowable offset. + +If the server already has the content of the key, it will respond instead +with a JSON object with an "alreadyhave" field that is set to true. This JSON +object may also have a field "plusuuids" that lists +the UUIDs of other repositories where the content is stored, in addition to +the serveruuid. + +[Implementation note: This will be implemented by sending `PUT` and +returning the `PUT-FROM` offset. To avoid leaving the P2P protocol stuck +part way through a `PUT`, a synthetic empty `DATA` followed by `INVALID` +will be used to get the P2P protocol back into a state where it will accept +any request.] + +### POST /git-annex/$uuid/v2/putoffset + +Identical to v3. + +### POST /git-annex/$uuid/v1/putoffset + +Same as v3, except the JSON will not include "plusuuids". + +## parts of P2P protocol that are not supported over HTTP + +`NOTIFYCHANGE` is not supported, but it would be possible to extend +this HTTP protocol to support it. + +`CONNECT` is not supported, and due to the bi-directional message passing +nature of it, it cannot easily be done over HTTP (would need websockets). +It should not be necessary anyway, because the git repository itself can be +accessed over HTTP. diff --git a/doc/design/p2p_protocol_over_http/draft1.mdwn b/doc/design/p2p_protocol_over_http/draft1.mdwn deleted file mode 100644 index 25108f336e..0000000000 --- a/doc/design/p2p_protocol_over_http/draft1.mdwn +++ /dev/null @@ -1,423 +0,0 @@ -[[!toc ]] - -Draft 1 of a complete [[P2P_protocol]] over HTTP. - -## base64 encoding of keys, uuids, and filenames - -A git-annex key can contain text in any encoding. 
So can a filename, -and it's even possible, though unlikely, that the UUID of a git-annex -repository might. - -But this protocol requires that UTF-8 be used throughout, except -where bodies use `Content-Type: application/octet-stream`. - -So this protocol allows using -[base64url](https://datatracker.ietf.org/doc/html/rfc4648#section-5) -encoding for such values. Any key, filename, or UUID wrapped in square -brackets is a base64url encoded value. -For example, "[Zm9v]" is the same as "foo". - -A filename like "[foo]" will need to itself be encoded that way: "[W2Zvb10=]" - -## authentication - -Some requests need authentication. Which requests do depends on the -configuration of the HTTP server. When a request needs authentication, -it will fail with 401 Unauthorized. - -Authentication is done using HTTP basic auth. The realm to use when -authenticating is "git-annex". The charset is UTF-8. - -When authentication is successful but does not allow a request to be -performed, it will fail with 403 Forbidden. - -Note that HTTP basic auth is not encrypted so is only secure when used -over HTTPS. - -## protocol version - -Each request in the protocol is versioned. The versions correspond -to P2P protocol versions. - -If the server does not support a particular protocol version, the -request will fail with a 400 Bad Request, and the client should fall -back to an earlier protocol version. - -## common request parameters - -Every request supports this parameter, and unless documented -otherwise, a request it to be included. - -* `clientuuid` - - The value is the UUID of the git-annex repository of the client. - -Any request may also optionally include these parameters: - -* `bypass` - - The value is the UUID of a cluster gateway, which the server should avoid - connecting to when serving a cluster. This is the equivilant of the - `BYPASS` message in the [[P2P_Protocol]]. - - This parameter can be given multiple times to list several cluster - gateway UUIDs. 
- - This parameter is only available for v2 and above. - -[Internally, git-annex can use these common parameters, plus the protocol -version, and remote UUID, to create a P2P session. The P2P session is -driven through the AUTH, VERSION, and BYPASS messages, leaving the session -ready to service requests.] - -## requests - -### GET /git-annex/$uuid/key/$key - -This is a simple, unversioned interface to get the content of a key -from a repository. - -It is not part of the P2P protocol per se, but is provided to let -other clients than git-annex easily download the content of keys from the -http server. - -When the key is not present on the server, it will respond -with 404 Not Found. - -### GET /git-annex/$uuid/v3/key/$key - -Get the content of a key from the repository with the specified uuid. - -Example: - - > GET /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/key/SHA1--foo&associatedfile=bar&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < X-git-annex-data-length: 3 - < Content-Type: application/octet-stream - < - < foo - -All parameters are optional, including the common parameters, and these: - -* `associatedfile` - - The name of a file in the git repository, for informational purposes - only. - -* `offset` - - Number of bytes to skip sending from the beginning of the file. - -Request headers are currently ignored, so eg Range requests are -not supported. (This would be possible to implement, up to a point.) - -The body of the request is empty. - -The server's response will have a `Content-Type` header of -`application/octet-stream`. - -The server's response will have a `X-git-annex-data-length` -header that indicates the number of bytes of content that are expected to -be sent. Note that there is no Content-Length header. - -The body of the response is the content of the key. - -If the length of the body is different than what the the -X-git-annex-data-length header indicated, then the data is invalid and -should not be used. 
This can happen when eg, the data was being sent from -an unlocked annexed file, which got modified while it was being sent. - -When the content is not present, the server will respond with -422 Unprocessable Content. - -### GET /git-annex/$uuid/v2/key/$key - -Identical to v3. - -### GET /git-annex/$uuid/v1/key/$key - -Identical to v3. - -### GET /git-annex/$uuid/v0/key/$key - -Same as v3, except the X-git-annex-data-length header is not used. -Additional checking client-side will be required to validate the data. - -### POST /git-annex/$uuid/v3/checkpresent - -Checks if a key is currently present on the server. - -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/checkpresent?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < {"present": true} - -There is one required additional parameter, `key`. - -The body of the request is empty. - -The server responds with a JSON object with a "present" field that is true -if the key is present, or false if it is not present. - -### POST /git-annex/$uuid/v2/checkpresent - -Identical to v3. - -### POST /git-annex/$uuid/v1/checkpresent - -Identical to v3. - -### POST /git-annex/$uuid/v0/checkpresent - -Identical to v3. - -### POST /git-annex/$uuid/v3/lockcontent - -Locks the content of a key on the server, preventing it from being removed. - -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/lockcontent?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < {"locked": true, "lockid": "foo"} - -There is one required additional parameter, `key`. - -The server will reply with `{"locked": true}` if it was able -to lock the key, or `{"locked": false}` if it was not. - -The key will remain locked for 10 minutes. But, usually `keeplocked` -is used to control the lifetime of the lock, using the "lockid" -parameter from the server's reply. (See below.) - -### POST /git-annex/$uuid/v2/lockcontent - -Identical to v3. 
- -### POST /git-annex/$uuid/v1/lockcontent - -Identical to v3. - -### POST /git-annex/$uuid/v0/lockcontent - -Identical to v3. - -### POST /git-annex/$uuid/v3/keeplocked - -Controls the lifetime of a lock on a key that was earlier obtained -with `lockcontent`. - -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/keeplocked?lockid=foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - > Connection: Keep-Alive - > Keep-Alive: timeout=1200 - [some time later] - > {"unlock": true} - < {"locked": false} - -There is one required additional parameter, `lockid`. - -This uses long polling. So it's important to use -Connection and Keep-Alive headers. - -This keeps an active lock from expiring until the client sends -`{"unlock": true}`, and then it immediately unlocks it. - -The client can send `{"unlock": false}` any number of times first. -This has no effect, but may be useful to keep the connection alive. - -This must be called within ten minutes of `lockcontent`, otherwise -the lock will have already expired when this runs. Note that this -does not indicate if the lock expired, it always returns -`{"locked": false}`. - -If the connection is closed before the client sends `{"unlock": true}, -or even if the web server gets shut down, the content will remain -locked for 10 minutes from the time it was first locked. - -Note that the common parameters bypass and clientuuid, while -accepted, have no effect. - -### POST /git-annex/$uuid/v2/keeplocked - -Identical to v3. - -### POST /git-annex/$uuid/v1/keeplocked - -Identical to v3. - -### POST /git-annex/$uuid/v0/keeplocked - -Identical to v3. - -### POST /git-annex/$uuid/v3/remove - -Remove a key's content from the server. - -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/remove?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < {"removed": true} - -There is one required additional parameter, `key`. - -The body of the request is empty. 
- -The server responds with a JSON object with a "removed" field that is true -if the key was removed (or was not present on the server), -or false if the key was not able to be removed. - -The JSON object can have an additional field "plusuuids" that is a list of -UUIDs of other repositories that the content was removed from. - -### POST /git-annex/$uuid/v2/remove - -Identical to v3. - -### POST /git-annex/$uuid/v1/remove - -Same as v3, except the JSON will not include "plusuuids". - -### POST /git-annex/$uuid/v0/remove - -Identical to v1. - -## POST /git-annex/$uuid/v3/remove-before - -Remove a key's content from the server, but only before a specified time. - -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/remove-before?timestamp=4949292929&key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < {"removed": true} - -This is the same as the `remove` request, but with an additional parameter, -`timestamp`. - -If the server's monotonic clock is past the specified timestamp, the -removal will fail and the server will respond with: `{"removed": false}` - -This is used to avoid removing content after a point in -time where it is no longer locked in other repostitories. - -## POST /git-annex/$uuid/v3/gettimestamp - -Gets the current timestamp from the server. - -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/gettimestamp?clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < {"timestamp": 59459392} - -The body of the request is empty. - -The server responds with JSON object with a timestmap field that has the -current value of its monotonic clock, as a number of seconds. - -Important: If multiple servers are serving this protocol for the same -repository, they MUST all use the same monotonic clock. - -### POST /git-annex/$uuid/v3/put - -Store content on the server. 
- -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/put?key=SHA1--foo&associatedfile=bar&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - > Content-Type: application/octet-stream - > X-git-annex-data-length: 3 - > - > foo - < {"stored": true} - -There is one required additional parameter, `key`. - -There are are also these optional parameters: - -* `associatedfile` - - The name of a file in the git repository, for informational purposes - only. - -* `offset` - - Number of bytes that have been omitted from the beginning of the file. - Usually this will be determined by making a `putoffset` request. - -The `Content-Type` header should be `application/octet-stream`. - -The `X-git-annex-data-length` must be included. It indicates the number -of bytes of content that are expected to be sent. -Note that there is no need to send a Content-Length header. - -If the length of the body is different than what the the -X-git-annex-data-length header indicated, then the data is invalid and -should not be used. This can happen when eg, the data was being sent from -an unlocked annexed file, which got modified while it was being sent. - -The server responds with a JSON object with a field "stored" -that is true if it received the data and stored the content. - -The JSON object can have an additional field "plusuuids" that is a list of -UUIDs of other repositories that the content was stored to. - -### POST /git-annex/$uuid/v2/put - -Identical to v3. - -### POST /git-annex/$uuid/v1/put - -Same as v3, except the JSON will not include "plusuuids". - -### POST /git-annex/$uuid/v0/put - -Same as v1, except additional checking is done to validate the data. - -### POST /git-annex/$uuid/v3/putoffset - -Asks the server what `offset` can be used in a `put` of a key. - -This should usually be used right before sending a `put` request. -The offset may not be valid after some point in time, which could result in -the `put` request failing. 
- -Example: - - > POST /git-annex/ecf6d4ca-07e8-11ef-8990-9b8c1f696bf6/v3/putoffset?key=SHA1--foo&clientuuid=79a5a1f4-07e8-11ef-873d-97f93ca91925 HTTP/1.1 - < {"offset": 10} - -There is one required additional parameter, `key`. - -The body of the request is empty. - -The server responds with a JSON object with an "offset" field that -is the largest allowable offset. - -If the server already has the content of the key, it will respond instead -with a JSON object with an "alreadyhave" field that is set to true. This JSON -object may also have a field "plusuuids" that lists -the UUIDs of other repositories where the content is stored, in addition to -the serveruuid. - -[Implementation note: This will be implemented by sending `PUT` and -returning the `PUT-FROM` offset. To avoid leaving the P2P protocol stuck -part way through a `PUT`, a synthetic empty `DATA` followed by `INVALID` -will be used to get the P2P protocol back into a state where it will accept -any request.] - -### POST /git-annex/$uuid/v2/putoffset - -Identical to v3. - -### POST /git-annex/$uuid/v1/putoffset - -Same as v3, except the JSON will not include "plusuuids". - -## parts of P2P protocol that are not supported over HTTP - -`NOTIFYCHANGE` is not supported, but it would be possible to extend -this HTTP protocol to support it. - -`CONNECT` is not supported, and due to the bi-directional message passing -nature of it, it cannot easily be done over HTTP (would need websockets). -It should not be necessary anyway, because the git repository itself can be -accessed over HTTP. diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn index 04718d829e..7a068f8478 100644 --- a/doc/todo/git-annex_proxies.mdwn +++ b/doc/todo/git-annex_proxies.mdwn @@ -28,9 +28,9 @@ Planned schedule of work: ## work notes -* Test serveLockContent - -* A Locker should expire the lock on its own after 10 minutes initially. +* A Locker should expire the lock on its own after 10 minutes, + initially. 
Once keeplocked is called, the expiry should end with the end + of that call. * Make Remote.Git use http client when remote.name.annex-url is configured. @@ -41,10 +41,9 @@ Planned schedule of work: ## completed items for July's work on p2p protocol over http -* addressed [[doc/todo/P2P_locking_connection_drop_safety]] +* HTTP P2P protocol document [[design/p2p_protocol_over_http]]. -* finalized HTTP P2P protocol draft 1, - [[design/p2p_protocol_over_http/draft1]] +* addressed [[doc/todo/P2P_locking_connection_drop_safety]] * implemented server and client for HTTP P2P protocol