From 2de27751d6c81f1842aa762fbd7e01d84968896e Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Mon, 7 Jul 2025 14:26:02 -0400
Subject: [PATCH] design

---
 ..._d1bb9a968329e889e05879001a2b41de._comment | 38 ++++++++++
 ..._eae9998285899a22b7619fc75c52e270._comment | 69 +++++++++++++++++++
 ..._3ec0a998f89533555c14a9745956f800._comment | 24 +++++++
 3 files changed, 131 insertions(+)
 create mode 100644 doc/todo/generic_p2p_socket_transport/comment_10_d1bb9a968329e889e05879001a2b41de._comment
 create mode 100644 doc/todo/generic_p2p_socket_transport/comment_8_eae9998285899a22b7619fc75c52e270._comment
 create mode 100644 doc/todo/generic_p2p_socket_transport/comment_9_3ec0a998f89533555c14a9745956f800._comment

diff --git a/doc/todo/generic_p2p_socket_transport/comment_10_d1bb9a968329e889e05879001a2b41de._comment b/doc/todo/generic_p2p_socket_transport/comment_10_d1bb9a968329e889e05879001a2b41de._comment
new file mode 100644
index 0000000000..49b17eb5f7
--- /dev/null
+++ b/doc/todo/generic_p2p_socket_transport/comment_10_d1bb9a968329e889e05879001a2b41de._comment
@@ -0,0 +1,38 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2025-07-07T17:29:16Z"
+ content="""
+I had suggested using the remote's configuration to determine the socket
+that remotedaemon listens on.
+
+> Eg, a remote with uuid U could use .git/annex/p2p/U as its socket file.
+
+But it may be that only incoming connections are wanted to be served,
+without having any remotes configured that use a P2P network. (And there
+could be multiple remotes that use the same P2P network.)
+
+Instead, I think that remotedaemon should use socket files in the form
+`.git/annex/p2p/$address`, for each P2P address that loadP2PAddresses
+returns (except tor ones).
+
+There could be a `git-annex p2p --enable` command, which is passed
+the P2P address to enable. Eg:
+
+    git-annex p2p --enable p2p-annex::yggstack+somepubkey.pk.ygg
+
+That is similar to `git-annex enable-tor` in that it would run
+`storeP2PAddress`. And so configure remotedaemon to listen on the socket
+file for that address.
+
+It could also generate an AuthToken and output a version of the address
+with the AuthToken included, similar to `git-annex p2p --gen-addresses`.
+
+That would let its output be communicated to the remote users, who can feed
+it into `git-annex p2p --link`. For that matter, I think that `git-annex
+p2p --pair` would also work.
+
+The address passed to `git-annex p2p --enable` could be anything,
+but using a p2p-annex::foo address makes a `git-annex-p2p-foo` command be
+used when connecting to the address.
+"""]]
diff --git a/doc/todo/generic_p2p_socket_transport/comment_8_eae9998285899a22b7619fc75c52e270._comment b/doc/todo/generic_p2p_socket_transport/comment_8_eae9998285899a22b7619fc75c52e270._comment
new file mode 100644
index 0000000000..cf639281c8
--- /dev/null
+++ b/doc/todo/generic_p2p_socket_transport/comment_8_eae9998285899a22b7619fc75c52e270._comment
@@ -0,0 +1,69 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""AuthTokens"""
+ date="2025-07-07T14:54:36Z"
+ content="""
+I wrote:
+
+> If the P2P protocol's AUTH is provided with an AuthToken, there would
+> need to be an interface to record the one to use for a given p2p
+> connection.
+
+But, as implemented `git-annex remotedaemon` will accept
+any of the authtokens in its list for any p2p connection. So if there are
+2 onion services for the same repository for some reason, there will be 2
+authtokens, but either can be used with either.
+
+If there are 2 P2P connections and you decide to stop listening to one of
+them, it does mean that authtoken needs to be removed from the list,
+otherwise someone could still use it with the other P2P connection. If we
+think about 2 different P2P protocols, one might turn out to be insecure,
+so you stop using it. But then if the insecurity allowed someone else to
+observe the authtoken that was used with it, and you didn't remove it from
+the list, they could use that to connect via the other P2P service.
+
+And the user does not know about authtokens, they're an implementation
+detail currently. So expecting the user to remove them from the list isn't
+really sufficient.
+
+So it seems better for each P2P address to have its own unique authtoken,
+that is not accepted for any other address. Or at least each P2P address
+that needs an authtoken; perhaps some don't. (I don't think it's a problem
+that for tor each hidden service accepts all listed authtokens though.)
+
+@matrrs wrote:
+
+> A configuration option annex.start-p2psocket=true would instruct
+> remotedaemon to listen on .git/annex/p2psocket (I think a hardcoded
+> location is fine, as there only really needs to be one such socket even
+> with multiple networks
+
+That single socket wouldn't work if each P2P address has its own unique
+authtoken. Because remotedaemon would have no way to know what P2P address
+that socket was connected with.
+
+It also could be that some P2P protocol is 100% certain not to need an
+authtoken for security. That would need a separate socket where
+remotedaemon does not require AUTH with a valid authtoken. Or, setting up
+a P2P connection for such a network would need to exchange authtokens, even
+though there is no security benefit in doing so.
+
+I don't know if I would want to make the determination of whether or not
+some P2P protocol needs an authtoken or not. It may be that the security
+situation of a P2P protocol evolves over time.
+Consider the case of tor, where it used to be fairly trivially possible to
+enumerate onion addresses. See for example
+[this paper](https://pure.port.ac.uk/ws/files/11523722/paper.pdf).
+(Which is why I made tor use AuthTokens in the first place IIRC.)
+Apparently changes were later made to tor to prevent that. I don't know
+how secure it is considered to be in this area now though.
+
+If `git-annex p2p` is used to set up the P2P connection, it handles
+generating the authtokens and exchanging them, fairly transparently to the
+user. So maybe it would be simplest to always require authtokens.
+
+There is another reason for the authtoken: The socket file may be
+accessible by other users of the system. This is the case with the tor
+socket, since tor runs as another user, and so the socket file is made
+world writable.
+"""]]
diff --git a/doc/todo/generic_p2p_socket_transport/comment_9_3ec0a998f89533555c14a9745956f800._comment b/doc/todo/generic_p2p_socket_transport/comment_9_3ec0a998f89533555c14a9745956f800._comment
new file mode 100644
index 0000000000..d2778b8a19
--- /dev/null
+++ b/doc/todo/generic_p2p_socket_transport/comment_9_3ec0a998f89533555c14a9745956f800._comment
@@ -0,0 +1,24 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 9"""
+ date="2025-07-07T16:00:45Z"
+ content="""
+> A configuration option annex.expose-p2p-via=foo that could be supplied
+> zero, one, or multiple times, and each of these configurations would
+> instruct remotedaemon to start the external program
+> git-annex-p2ptransport-foo after the p2p socket is ready
+
+Hmm, I don't know if it would generally make sense for remotedaemon to
+start up external programs that run P2P networks. That might be something
+that runs system-wide, like tor (often) does. Or the user might expect to
+run it themselves and only have git-annex use it when it's running.
+
+It seems to me that in your yggstack example, there's no real need
+for remotedaemon to be responsible for running
+`git-annex-p2ptransport-yggstack`. You could run that yourself first.
+Then the remotedaemon can create the socket file and listen to it.
+
+If a tcp connection comes in before the socket file exists, socat handles
+it by closing that connection, and keeps listening for further
+connections.
+"""]]