Commit graph

330 commits

Author SHA1 Message Date
Joey Hess
c15dad6040
improve AuthToken display in P2P --debug
Using an empty string to obscure the AuthToken made it impossible to
tell if one was really being presented or not.
2025-08-01 12:58:03 -04:00
Joey Hess
7e0802e951
avoid broken pipe zombies
hClose crashes if the pipe is broken, preventing waiting for the process
2025-07-31 14:57:51 -04:00
Joey Hess
f1781d01d8
remotedaemon support for generic P2P transports
RemoteDaemon.Transport.Tor was refactored into this, and most of the
code is reused between them.

getSocketFile does not yet deal with repositories on crippled
filesystems that don't support sockets. Annex.Ssh detects that and
allows the user to set an environment variable, and something similar
could be done here.

And it does not deal with a situation where there is no path to the
socket file that is not too long. In that situation it would crash out
I suppose. Probably though, remotedaemon is ran from the top of the
repo, and in that case the path is just ".git/annex/p2p/<md5>" so nice
and short.

This seems to mostly work. But I don't yet have a working git-annex-p2p-
command to test it with.

And with my not quite working git-annex-p2p-foo test script, running
remotedaemon results in an ever-growing number of zombie processes
that it's not waiting on.
2025-07-31 14:45:32 -04:00
Joey Hess
7403aeb95f
use Annex.ExternalAddonProcess for P2P.Generic processes
These are another sort of external addon process, and this makes several
things work including shell scripts on windows. And it makes for nicer
error messages when the command is not in the path.

Note that the refactored startExternalAddonProcess used by this
does not use propGitEnv to set git environment variables in the
environment. Unlike startExternalAddonProcessProtocol which does.
This is because it runs in IO and does not have access to that
information. But also, I don't think that P2P.Generic processes need
that.
2025-07-30 14:46:37 -04:00
Joey Hess
d3fbda13e4
p2p --enable
p2p: Added --enable option, which can be used to enable P2P networks
provided by external commands git-annex-p2p-<netname>

Made git-annex p2p --enable tor behave the same as git-annex enable-tor,
to make tor a bit less of a special case. However, it canot be run as root,
since it cannot take the user id parameter.
2025-07-30 14:08:59 -04:00
Joey Hess
4fb9b7cb67
support P2PAnnex in connectPeer
This is probably enough to support accessing remotes using p2p-annex:: urls.
Not tested yet of course since there is not yet support for serving the
other side of such a connection, or for setting up such a connection.

P2P.Generic has an implementation of the whole interface to the
git-annex-p2p-<netname> commands.
2025-07-30 13:23:23 -04:00
Joey Hess
a6f8248465
add connProcess to P2PConnection
When using the new generic P2P transport to open an outgoing connection
to a peer, this will hold the pid of the git-annex-p2p-<netname>
command.

closeConnection simply waits for it. Rather than relying on garbage
collection of the closed handles to close it.

In Remote.Helper.Ssh, connProcess is set to Nothing, even though there
is a similar process being used there. That code stores the pid in
OpenConnection instead, and handles waiting for it itself. A bit ugly,
but not worth cleaning up at this point, maybe later.
2025-07-30 12:35:16 -04:00
Joey Hess
f631bc9e56
add P2PAnnex constructor
This is for p2p-annex:: urls that will use the new generic P2P
transport.

In addressCredsFile, threw in an url encoding of any non-alphanumeric
characters that are in the address. This is to avoid any possible path
traversal attacks via a p2p-annex:: url, since the address part of it
could contain any characters. And, went ahead and did the same url
encoding of tor-annex:: urls, even though tor onion addresses are all
alphanumerics, on the off chance that might avoid a similar problem.
(It does not seem likely enough to treat it as a security hole.)
2025-07-30 12:09:17 -04:00
Joey Hess
46ee651c94
non-tor AuthTokens
As groundwork for making git-annex p2p support other P2P networks than
tor hidden services, when an AuthToken is not a TorAnnex value, but
something else (that will be added later), store the P2PAddress that it
will be used with along with the AuthToken. And in loadP2PAuthTokens,
only return AuthTokens for the specified P2PAddress.

See commit 2de27751d6 for some design work
that led to this.

Also, git-annex p2p --gen-addresses is changed to generate a separate
AuthToken for every P2P address. Rather than generating a single
AuthToke and using it for every one. When we have more than just tor,
this will be important for security, to avoid a compromise of one P2P
network exposing the AuthToken used for another network.
2025-07-07 15:10:15 -04:00
Joey Hess
0a89dc0d5d
remove garbage from middle of comment 2025-07-07 13:21:21 -04:00
Joey Hess
e81fd72018
Added remote.name.annex-web-options config
Which is a per-remote version of the annex.web-options config.

Had to plumb RemoteGitConfig through to getUrlOptions. In cases where a
special remote does not use curl, there was no need to do that and I used
Nothing instead.

In the case of the addurl and importfeed commands, it seemed best to say
that running these commands is not using the web special remote per se,
so the config is not used for those commands.
2025-04-01 10:17:38 -04:00
Joey Hess
4f1eea9061
remove unused adjustedBranchRefresh associated file parameter 2025-02-21 14:51:02 -04:00
Joey Hess
0b9e9cbf70
more OsPath conversion (502/749)
Sponsored-by: Kevin Mueller on Patreon
2025-02-05 13:29:58 -04:00
Joey Hess
474cf3bc8b
more OsPath conversion
Sponsored-by: Brock Spratlen
2025-02-01 11:54:19 -04:00
Joey Hess
a9f3a31a52
more OsPath conversion
Sponsored-by: Kevin Mueller
2025-01-29 16:24:51 -04:00
Joey Hess
46ea041eba
more fixing for building without servant 2024-11-21 15:35:06 -04:00
Joey Hess
204c19583c
more fixing for building without servant 2024-11-21 15:34:07 -04:00
Joey Hess
757f93203a
Merge branch 'p2phttp-multi' 2024-11-21 15:16:06 -04:00
Joey Hess
4c785c338a
p2phttp: notice when new repositories are added to --directory
When a uuid is not known, rescan for new repositories. Easy.

When a repository is removed, it will also get removed from the server
state on the next scan. But until a new uuid is seen, there will not be
a scan. This leaves the server trying to serve a uuid whose repository
is gone. That seems buggy. While getting just fails, dropping fails the
first time, but seems to leave the server in an unusable state, so the
next drop attempt hangs. The server is still able to serve other uuids,
only the one whose repository was removed has that problem.
2024-11-21 15:09:12 -04:00
Joey Hess
3c18398d5a
p2phttp support --jobs with --directory
--jobs is usually an Annex option setter, but --directory runs in IO, so
would not have that available. So instead moved the option parser into
the command's Options.
2024-11-21 14:15:14 -04:00
Joey Hess
9f84dd82da
p2phttp --directory implementation
Untested, but it compiles, so.

Known problems:

* --jobs is not available to startIO
* Does not notice when new repositories are added to a directory.
* Does not notice when repositories are removed from a directory.
2024-11-21 14:02:58 -04:00
Joey Hess
6bdf4a85fb
move the p2phttp server state map into a data type 2024-11-21 12:24:14 -04:00
Joey Hess
475823c2d3
fix to build w/o servant 2024-11-20 16:29:43 -04:00
Joey Hess
07026cf58b
add proxied uuids to http server state map
This fixes support for proxying after last commit broke it.

Note that withP2PConnections is called at server startup, and so only
proxies seen at that point will appear in the map and be used. It was
already the case that a proxy added after p2phttp was running would not
be served.

I think that is possibly a bug, but at least this commit doesn't
introduce the problem, though it might make it harder to fix it.

As bugs go, it's probably not a big deal, because after all,
git configs needs to be set in the local repository, followed by
git-annex updateproxy being run, to set up proxying. If someone is doing
that, they can restart their http server I suppose.
2024-11-20 13:22:25 -04:00
Joey Hess
254073569f
p2pHttpApp with a map of UUIDs to server states
This is early groundwork for making p2phttp support serving multiple
repositories from a single daemon.

So far only 1 repository is served still. And this commit breaks support
for proxying!
2024-11-20 12:51:25 -04:00
Joey Hess
b8a717a617
reuse http url password for p2phttp url when on same host
When remote.name.annexUrl is an annex+http(s) url, that uses the same
hostname as remote.name.url, which is itself a http(s) url, they are
assumed to share a username and password.

This avoids unnecessary duplicate password prompts.
2024-11-19 15:27:26 -04:00
Joey Hess
54dc1d6f6e
fix recording present on the proxy when proxying DATA-PRESENT
Apparently the protoerrhandler parameter never runs. Also the const
typo prevented the type checker from complaining that relayPUTRecord was
being called with 1 less parameter than needed.

So move the relayPUTRecord out of it.
2024-10-30 13:17:31 -04:00
Joey Hess
2fc3fbfed2
support DATA-PRESENT in the p2p protocol proxy
But not yet when proxying to special remotes.

When proxying for a cluster, the client can store the object on any node
or nodes of the cluster, and send DATA-PRESENT. That gets proxied to each
node, and if any of them agree that they have the data, the proxy will
respond with SUCCESS or SUCCESS-PLUS.

I think it's ok to not check for the proxied remotes supporting protocol
version 4. When there are multiple remotes in a cluster, it behaves as
described above, and if they all respond with ERROR, the result will be
FAILURE. And when not proxying for a cluster, the proxy negotiates the
p2p protocol to be the same version or lower than the proxied remote,
which will prevent sending DATA-PRESENT when it's too old.
2024-10-29 14:44:23 -04:00
Joey Hess
a4e9057486
implement put data-present parameter in http servant
Changed the protocol docs because servant parses "true" and "false" for
booleans in query parameters, not "1" and "0".

clientPut with datapresent=True is not used by git-annex, and I don't
anticipate it being used in git-annex, except for testing.

I've tested this by making clientPut be called with datapresent=True and
git-annex copy to a remote succeeds once the object file is first
manually copied to the remote. That would be a good test for the test
suite, but running the http client means exposing it to at least
localhost, and would fail if a real http client was already running on
that port.
2024-10-29 13:32:43 -04:00
Joey Hess
57e27adb55
implement DATA-PRESENT in p2p protocol
Not yet implemented for the http server or the proxy.
2024-10-29 13:12:12 -04:00
Joey Hess
20df236a13
update http servant for p2p protocol version 4
This is all just adding the v4 routes and boilerplate. At this point v4
is implemented the same as v3.
2024-10-29 12:13:56 -04:00
Joey Hess
de138c642b
p2phttp: Allow unauthenticated users to lock content by default
* p2phttp: Allow unauthenticated users to lock content by default.
* p2phttp: Added --unauth-nolocking option to prevent unauthenticated
  users from locking content.

The rationalle for this is that locking is not really a write operation, so
makes sense to allow in a repository that only allows read-only access. Not
supporting locking in that situation will prevent the user from dropping
content from a special remote they control in cases where the other copy of
the content is on the p2phttp server.

Also, when p2phttp is configured to also allow authenticated access,
lockcontent was resulting in a password prompt for users who had no way to
authenticate. And there is no good way to distinguish between the two types
of users client side.

--unauth-nolocking anticipates that this might be abused, and seems better
than disabling unauthenticated access entirely if a server is being
attacked. It may be that rate limiting locking by IP address or similar
would be an effective measure in such a situation. Or just limiting the
number of locks by anonymous users that can be live at any one time. Since
the impact of such an DOS attempt is limited to preventing dropping content
from the server, it seems not a very appealing target anyway.
2024-10-21 10:02:12 -04:00
Joey Hess
b83fdf66df
Allow enabling the servant build flag with older versions of stm
Allowing building with ghc 9.0.2 (debian stable).

Updated patch covering all uses of writeTMVar.
2024-10-17 20:55:31 -04:00
Joey Hess
3a53c60121
Allow enabling the servant build flag with older versions of stm
Allowing building with ghc 9.0.2 (debian stable).
2024-10-17 14:04:31 -04:00
Joey Hess
0629219617
p2phttp combining unauth and auth options
p2phttp: Support serving unauthenticated users while requesting
authentication for operations that need it. Eg, --unauth-readonly can be
combined with --authenv.

Drop locking currently needs authentication so it will prompt for that.
That still needs to be addressed somehow.
2024-10-17 11:10:28 -04:00
Joey Hess
84d1bb746b
LiveUpdate for clusters 2024-08-24 10:20:12 -04:00
Joey Hess
1d51f18dd0
remove FIXME
Using NoLiveUpdate here is appropriate, because this is running the
server side of the P2P protocol. There no preferred content checking is
done.
2024-08-24 09:34:22 -04:00
Joey Hess
c3d40b9ec3
plumb in LiveUpdate (WIP)
Each command that first checks preferred content (and/or required
content) and then does something that can change the sizes of
repositories needs to call prepareLiveUpdate, and plumb it through the
preferred content check and the location log update.

So far, only Command.Drop is done. Many other commands that don't need
to do this have been updated to keep working.

There may be some calls to NoLiveUpdate in places where that should be
done. All will need to be double checked.

Not currently in a compilable state.
2024-08-23 16:35:12 -04:00
Joey Hess
509b23fa00
catch ClientError from withClientM
When getting from a P2P HTTP remote, prompt for credentials when required,
instead of failing.

This feels like it might be a bug in servant-client. withClientM's type
suggests it would not throw a ClientError. But it does in this case.
2024-08-07 11:24:34 -04:00
Joey Hess
7c6c3e703b
clean up build warnings when built w/o servant 2024-07-31 14:07:30 -04:00
Joey Hess
76f31d59b0
more fixes to build w/o servant 2024-07-30 12:39:17 -04:00
Joey Hess
456ec9ccf2
typo 2024-07-30 12:18:39 -04:00
Joey Hess
1632beaf70
fix negative DATA when 1 node of a cluster has a partial transfer 2024-07-30 11:42:17 -04:00
Joey Hess
41cef62dad
fix build without servant some more 2024-07-30 10:53:44 -04:00
Joey Hess
640fdffd12
fix build without servant 2024-07-30 09:49:37 -04:00
Joey Hess
acb436b999
fix build with text older than 2.0 2024-07-29 18:15:29 -04:00
Joey Hess
1467fed572
fix build with old text
Don't need decodeUtf8Lenient here because B64.encode surely always
generates utf8. So decodeUtf8 is safe, it will never throw an exception.
2024-07-29 17:21:41 -04:00
Joey Hess
7402ae61d9
fix reversion in GET from proxy over http
4f3ae96666 caused a hang in GET,
which git-annex testremote could reliably cause.

The problem is that closing both P2P handles before waiting on the
asyncworker prevents all the DATA from getting sent.

The solution is to only close the P2P handles early when the
P2PConnection is being closed. When it's being released, let the
asyncworker finish. closeP2PConnection is called in GET when it was
unable to send all data, and in PUT when it did not receive all the
data, and in both cases closing the P2P handles early is ok.
2024-07-29 11:07:09 -04:00
Joey Hess
4f3ae96666
cleanly close proxy connection on interrupted PUT
An interrupted PUT to cluster that has a node that is a special remote
over http left open the connection to the cluster, so the next request
opens another one. So did an interrupted PUT directly to the proxied
special remote over http.

proxySpecialRemote was stuck waiting for all the DATA. Its connection
remained open so it kept waiting.

In servePut, checktooshort handles closing the P2P connection
when too short a data is received from PUT. But, checktooshort was only
called after the protoaction, which is what runs the proxy, which is
what was getting stuck. Modified it to run as a background thread,
which waits for the tooshortv to be written to, which gather always does
once it gets to the end of the data received from the http client.

That makes proxyConnection's releaseconn run once all data is received
from the http client. Made it close the connection handles before
waiting on the asyncworker thread. This lets proxySpecialRemote finish
processing any data from the handle, and then it will give up,
more or less cleanly, if it didn't receive enough data.

I say "more or less cleanly" because with both sides of the P2P
connection taken down, some protocol unhappyness results. Which can lead
to some ugly debug messages. But also can cause the asyncworker thread
to throw an exception. So made withP2PConnections not crash when it
receives an exception from releaseconn.

This did have a small change to the behavior of an interrupted PUT when
proxying to a regular remote. proxyConnection has a protoerrorhandler
that closes the proxy connection on a protocol error. But the proxy
connection is also closed by checktooshort when it closes the P2P
connection. Closing the same proxy connection twice is not a problem,
it just results in duplicated debug messages about it.
2024-07-29 10:37:19 -04:00
Joey Hess
c8e7231f48
add debugging of opening and closing connections to proxies 2024-07-29 09:52:26 -04:00