The git-annex P2P protocol is a custom protocol that git-annex uses to
communicate between peers.

There's a common line-based serialization of the protocol, but other
serializations are also possible. The line-based serialization is spoken
by [[git-annex-shell]], and by git-annex over tor.

One peer is known as the client, and is the peer that initiates the
connection and sends commands. The other peer is known as the server, and
is the peer that the client connects to. It's possible for two connections
to be run at the same time between the same two peers, in different
directions.

## Errors

Either the client or the server may send an error message at any
time.

When the client sends an ERROR, the server will close the connection.

If the server sends an ERROR in response to the client's
request, the connection will remain open, and the client can make
another request.

    ERROR this repository is read-only; write access denied

## Authentication

The protocol generally starts with authentication. However, if
authentication already occurs on another layer, as is the case with
git-annex-shell, authentication will be skipped.

The client starts by sending an authentication command to the server,
along with its UUID. The AuthToken is some arbitrary token that has been
agreed upon beforehand.

    AUTH UUID AuthToken

The server responds with its own UUID when authentication
is successful. Otherwise, it fails the authentication and closes the
connection.

    AUTH-SUCCESS UUID
    AUTH-FAILURE

Note that authentication does not guarantee that the client is talking to
who they expect to be talking to. This, and encryption of the connection,
are handled at a lower level.

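As a rough illustration of the exchange above, the following Python sketch builds the client's AUTH line and interprets the server's reply. The function names are hypothetical, and it assumes that messages are newline-terminated and that UUIDs and tokens contain no spaces.

```python
def auth_request(client_uuid: str, auth_token: str) -> str:
    """Build the client's AUTH line (line-based serialization)."""
    return f"AUTH {client_uuid} {auth_token}\n"


def parse_auth_reply(line: str):
    """Return the server's UUID on AUTH-SUCCESS, or None on AUTH-FAILURE."""
    words = line.rstrip("\n").split(" ")
    if words[0] == "AUTH-SUCCESS" and len(words) == 2:
        return words[1]
    if words == ["AUTH-FAILURE"]:
        # The server closes the connection after this reply.
        return None
    raise ValueError(f"unexpected reply to AUTH: {line!r}")
```
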
## Protocol version

The default protocol version is 0. The client can choose to
negotiate a new version with the server. This must come after
any authentication.

The client sends the highest protocol version it supports:

    VERSION 3

The server responds with the highest protocol version it supports
that is less than or equal to the version the client sent:

    VERSION 1

Now both client and server should use version 1.

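The server's side of this negotiation can be sketched in Python as follows. The function name is hypothetical, and falling back to version 0 when nothing else matches is an assumption based on 0 being the default version.

```python
def server_version_reply(server_supported, client_version):
    """Pick the highest version the server supports that is <= client_version."""
    usable = [v for v in server_supported if v <= client_version]
    # Assumption: version 0 is the default, so use it when nothing else matches.
    return max(usable, default=0)
```

With a server that supports versions 0 and 1, a client sending VERSION 3 would get VERSION 1 back, as in the example above.
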
## Cluster cycle prevention

In protocol version 2 and above, immediately after VERSION, the
client can send an additional message that is used to
prevent cycles when accessing clusters.

    BYPASS UUID1 UUID2 ...

The UUIDs are cluster gateways to avoid connecting to when
serving a cluster.

The server makes no response to this message.

## Binary data

The protocol allows raw binary data to be sent. This is done
using a DATA message, which carries a Len value that tells how many
bytes of data to read. In the line-based serialization, the message comes
on its own line, followed by a newline and the binary data.

    DATA 3
    foo

Note that there is no newline after the binary data; the next protocol
message will come immediately after it.

If the sender finds itself unable to send as many bytes of data as it
promised (perhaps because a file got truncated while it was being sent),
its only option is to close the protocol connection.

And if the receiver finds itself unable to receive all the data for some
reason (eg, out of disk space), its only option is to close the protocol
connection.

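A minimal Python sketch of reading one DATA message in the line-based serialization is shown below. It uses an in-memory stream for illustration; the function name is hypothetical, and a real socket would need to loop until all Len bytes have arrived.

```python
import io


def read_data(stream):
    """Read one DATA message from a binary stream and return its payload."""
    header = stream.readline()  # for example b"DATA 3\n"
    words = header.rstrip(b"\n").split(b" ")
    if len(words) != 2 or words[0] != b"DATA":
        raise ValueError(f"expected DATA, got {header!r}")
    length = int(words[1])
    payload = stream.read(length)
    if len(payload) != length:
        # The sender did not deliver what it promised; the connection is unusable.
        raise ConnectionError("short read in DATA")
    return payload


# No newline follows the payload: the next message starts right after it.
stream = io.BytesIO(b"DATA 3\nfooVALID\n")
payload = read_data(stream)
next_line = stream.readline()
```

Reading exactly Len bytes, rather than reading up to a newline, is what keeps binary content from being interpreted as protocol messages.
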
## Checking if content is present

To check if a key is currently present on the server, the client sends:

    CHECKPRESENT Key

The server responds with either SUCCESS or FAILURE.

## Locking content

To lock content on the server, preventing it from being removed,
the client sends:

    LOCKCONTENT Key

The server responds with either SUCCESS or FAILURE.
SUCCESS indicates that the content is locked.

After SUCCESS, the content will remain locked until the
client sends its next message, which must be:

    UNLOCKCONTENT Key

The server makes no response to that.

If the connection is broken before the client sends UNLOCKCONTENT,
the content will remain locked for at least 10 minutes from when the server
sent SUCCESS.

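One way a client might structure the lock and unlock pair is as a context manager, sketched here in Python. The `send` and `recv` callables and the key name are hypothetical stand-ins for a real connection; nothing else may be sent on the connection while the lock is held.

```python
from contextlib import contextmanager


@contextmanager
def locked_content(send, recv, key):
    """Hold a LOCKCONTENT lock on key for the duration of the with-block."""
    send(f"LOCKCONTENT {key}")
    if recv() != "SUCCESS":
        raise RuntimeError(f"could not lock {key}")
    try:
        yield
    finally:
        # UNLOCKCONTENT must be the client's next message; no reply is expected.
        send(f"UNLOCKCONTENT {key}")


# Illustration with an in-memory stand-in for the connection.
sent = []
replies = iter(["SUCCESS"])
with locked_content(sent.append, lambda: next(replies), "SHA256E-s3--abc"):
    pass  # work that relies on the server keeping its copy goes here
```
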
## Removing content

To remove a key's content from the server, the client sends:

    REMOVE Key

The server responds with either SUCCESS or FAILURE.

Note that if the content was not present, SUCCESS will be returned.

In protocol version 2 and above, the server can optionally reply with
SUCCESS-PLUS or FAILURE-PLUS. Each is followed by a list of UUIDs of
repositories that the content was removed from.

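A client that speaks protocol version 2 or above has to accept all four replies. The Python sketch below shows one way to parse them; the function name is hypothetical, and it assumes the UUIDs follow on the same line separated by spaces, as they do in BYPASS.

```python
def parse_remove_reply(line):
    """Return (succeeded, uuids) for a reply to REMOVE."""
    words = line.split()
    if not words:
        raise ValueError("empty reply")
    status, uuids = words[0], words[1:]
    if status in ("SUCCESS", "FAILURE") and not uuids:
        return status == "SUCCESS", []
    if status in ("SUCCESS-PLUS", "FAILURE-PLUS"):
        # The UUIDs name repositories the content was removed from.
        return status == "SUCCESS-PLUS", uuids
    raise ValueError(f"unexpected reply to REMOVE: {line!r}")
```
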
## Removing content before a specified time

This is only available in protocol version 3 and above.

To remove a key's content from the server, but only before a specified time,
the client sends:

    REMOVE-BEFORE Timestamp Key

The server responds to the message in the same way as to REMOVE.

If the server receives the message at a time after the specified timestamp,
the remove must fail. This is used to avoid removing content after a point
in time where it is no longer locked in other repositories.

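The server-side check can be sketched in Python as below. The function name and the `remove` callback are hypothetical; the sketch assumes the timestamp is a number of seconds and that `now` is read from the same clock the server reports to clients.

```python
def handle_remove_before(line, now, remove):
    """Server-side sketch: return the reply to a REMOVE-BEFORE message."""
    # The key is the last token, so split at most twice.
    command, timestamp, key = line.split(" ", 2)
    if command != "REMOVE-BEFORE":
        raise ValueError(f"not a REMOVE-BEFORE message: {line!r}")
    if now > float(timestamp):
        # Received after the deadline: the remove must fail without running.
        return "FAILURE"
    return "SUCCESS" if remove(key) else "FAILURE"
```
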
## Getting a timestamp

This is only available in protocol version 3 and above.

To get the current timestamp from the server, the client sends:

    GETTIMESTAMP

The server responds with TIMESTAMP followed by its current time, as a
number of seconds. Note that this uses a monotonic clock.

## Storing content on the server

To store content on the server, the client sends:

    PUT AssociatedFile Key

Here AssociatedFile may be the name of a file in the git
repository, for information purposes only. Or it can be the
empty string. It will always have unix directory separators.

(Note that in the line-based serialization, AssociatedFile may not contain any
spaces, since it's not the last token in the line. Use '%' to indicate
whitespace.)

The server may respond with ALREADY-HAVE if it already
had the content of that key.

In protocol version 2 and above, the server can optionally reply with
ALREADY-HAVE-PLUS, followed by a list of UUIDs where the content is
stored, in addition to the UUID where the client was going to send it.

Otherwise, it responds with:

    PUT-FROM Offset

Offset is the number of bytes into the file that the server wants
the client to start from. This allows resuming transfers.

The client then sends a DATA message with content of the file from
the offset to the end of file.

In protocol version 1 and above, after the data, the client sends an
additional message, to indicate if the content of the file has changed
while it was being sent.

    INVALID
    VALID

If the server successfully receives the data and stores the content,
it replies with SUCCESS. Otherwise, FAILURE.

In protocol version 2 and above, the server can optionally reply with
SUCCESS-PLUS and a list of UUIDs where the content was stored.

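The client's half of a resumed PUT, in protocol version 1 and above, might look like the following Python sketch. The function names are hypothetical; it holds the whole file in memory for brevity, which a real client would avoid.

```python
def parse_put_from(line: str) -> int:
    """Extract Offset from the server's PUT-FROM reply."""
    words = line.split()
    if len(words) != 2 or words[0] != "PUT-FROM":
        raise ValueError(f"expected PUT-FROM, got {line!r}")
    return int(words[1])


def put_data_message(content: bytes, offset: int, still_valid: bool = True) -> bytes:
    """Bytes the client sends after PUT-FROM: DATA, the payload, then VALID or INVALID."""
    rest = content[offset:]
    header = b"DATA %d\n" % len(rest)
    # No newline separates the payload from the following message.
    trailer = b"VALID\n" if still_valid else b"INVALID\n"
    return header + rest + trailer
```

For example, if the server already holds the first two bytes of a five-byte file, it replies PUT-FROM 2 and the client sends only the remaining three bytes.
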
## Getting content from the server

To get content from the server, the client sends:

    GET Offset AssociatedFile Key

The Offset is the number of bytes into the file that the client wants
the server to skip, which allows resuming transfers.
See description of AssociatedFile above.

The server then sends a DATA message with the content of the file
from the offset to end of file.

In protocol version 1 and above, after the data, the server sends an additional
message, to indicate if the content of the file has changed while it
was being sent.

    INVALID
    VALID

The client replies with SUCCESS or FAILURE.

Note that the client responding with SUCCESS does not indicate to the
server that it has stored the content. It may receive it and throw it away.

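A client resuming a download could derive Offset from the bytes it already holds, as in this Python sketch. The function names are hypothetical, and replying FAILURE when the server reports INVALID is an assumption about client policy rather than something stated above.

```python
def resume_get(key: str, associated_file: str, partial: bytes) -> str:
    """GET line asking the server to skip the bytes already held in partial."""
    return f"GET {len(partial)} {associated_file} {key}"


def finish_get(partial: bytes, payload: bytes, validity: str):
    """Combine partial content with a received DATA payload and pick the reply."""
    if validity == "VALID":
        return partial + payload, "SUCCESS"
    # Assumption: content that changed during the transfer is discarded.
    return None, "FAILURE"
```
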
## Connection to services

This is used to connect to services like git-upload-pack and
git-receive-pack that speak their own protocol.

The client sends a message to request the connection.
Service is the name of the service, eg "git-upload-pack".

    CONNECT Service

Both client and server may now exchange DATA messages in any order,
encapsulating the service's protocol.

When the service exits, the server indicates this by telling the client
its exit code.

    CONNECTDONE ExitCode

After that, the server closes the connection.

## Change notification

The client can request to be notified when a ref in
the git repository on the server changes.

    NOTIFYCHANGE

The server will block until at least
one of the refs changes, and send a list of changed
refs.

    CHANGED ChangedRefs

For example:

    CHANGED refs/heads/master refs/heads/git-annex

Some servers may not support this command.

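Parsing the CHANGED reply is a matter of splitting the line, as this small Python sketch shows. The function name is hypothetical, and it assumes the refs are separated by spaces as in the example above.

```python
def parse_changed(line: str):
    """Return the list of changed refs from a CHANGED message."""
    words = line.split()
    if not words or words[0] != "CHANGED":
        raise ValueError(f"expected CHANGED, got {line!r}")
    return words[1:]
```
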