Commit graph

44 commits

Author SHA1 Message Date
Joey Hess
3c18398d5a
p2phttp support --jobs with --directory
--jobs is usually an Annex option setter, but --directory runs in IO, so
would not have that available. So instead moved the option parser into
the command's Options.
2024-11-21 14:15:14 -04:00
Joey Hess
9f84dd82da
p2phttp --directory implementation
Untested, but it compiles, so.

Known problems:

* --jobs is not available to startIO
* Does not notice when new repositories are added to a directory.
* Does not notice when repositories are removed from a directory.
2024-11-21 14:02:58 -04:00
Joey Hess
6bdf4a85fb
move the p2phttp server state map into a data type 2024-11-21 12:24:14 -04:00
Joey Hess
07026cf58b
add proxied uuids to http server state map
This fixes support for proxying after last commit broke it.

Note that withP2PConnections is called at server startup, and so only
proxies seen at that point will appear in the map and be used. It was
already the case that a proxy added after p2phttp was running would not
be served.

I think that is possibly a bug, but at least this commit doesn't
introduce the problem, though it might make it harder to fix it.

As bugs go, it's probably not a big deal, because after all,
git configs needs to be set in the local repository, followed by
git-annex updateproxy being run, to set up proxying. If someone is doing
that, they can restart their http server I suppose.
2024-11-20 13:22:25 -04:00
Joey Hess
254073569f
p2pHttpApp with a map of UUIDs to server states
This is early groundwork for making p2phttp support serving multiple
repositories from a single daemon.

So far only 1 repository is served still. And this commit breaks support
for proxying!
2024-11-20 12:51:25 -04:00
Joey Hess
de138c642b
p2phttp: Allow unauthenticated users to lock content by default
* p2phttp: Allow unauthenticated users to lock content by default.
* p2phttp: Added --unauth-nolocking option to prevent unauthenticated
  users from locking content.

The rationalle for this is that locking is not really a write operation, so
makes sense to allow in a repository that only allows read-only access. Not
supporting locking in that situation will prevent the user from dropping
content from a special remote they control in cases where the other copy of
the content is on the p2phttp server.

Also, when p2phttp is configured to also allow authenticated access,
lockcontent was resulting in a password prompt for users who had no way to
authenticate. And there is no good way to distinguish between the two types
of users client side.

--unauth-nolocking anticipates that this might be abused, and seems better
than disabling unauthenticated access entirely if a server is being
attacked. It may be that rate limiting locking by IP address or similar
would be an effective measure in such a situation. Or just limiting the
number of locks by anonymous users that can be live at any one time. Since
the impact of such an DOS attempt is limited to preventing dropping content
from the server, it seems not a very appealing target anyway.
2024-10-21 10:02:12 -04:00
Joey Hess
b83fdf66df
Allow enabling the servant build flag with older versions of stm
Allowing building with ghc 9.0.2 (debian stable).

Updated patch covering all uses of writeTMVar.
2024-10-17 20:55:31 -04:00
Joey Hess
0629219617
p2phttp combining unauth and auth options
p2phttp: Support serving unauthenticated users while requesting
authentication for operations that need it. Eg, --unauth-readonly can be
combined with --authenv.

Drop locking currently needs authentication so it will prompt for that.
That still needs to be addressed somehow.
2024-10-17 11:10:28 -04:00
Joey Hess
7402ae61d9
fix reversion in GET from proxy over http
4f3ae96666 caused a hang in GET,
which git-annex testremote could reliably cause.

The problem is that closing both P2P handles before waiting on the
asyncworker prevents all the DATA from getting sent.

The solution is to only close the P2P handles early when the
P2PConnection is being closed. When it's being released, let the
asyncworker finish. closeP2PConnection is called in GET when it was
unable to send all data, and in PUT when it did not receive all the
data, and in both cases closing the P2P handles early is ok.
2024-07-29 11:07:09 -04:00
Joey Hess
4f3ae96666
cleanly close proxy connection on interrupted PUT
An interrupted PUT to cluster that has a node that is a special remote
over http left open the connection to the cluster, so the next request
opens another one. So did an interrupted PUT directly to the proxied
special remote over http.

proxySpecialRemote was stuck waiting for all the DATA. Its connection
remained open so it kept waiting.

In servePut, checktooshort handles closing the P2P connection
when too short a data is received from PUT. But, checktooshort was only
called after the protoaction, which is what runs the proxy, which is
what was getting stuck. Modified it to run as a background thread,
which waits for the tooshortv to be written to, which gather always does
once it gets to the end of the data received from the http client.

That makes proxyConnection's releaseconn run once all data is received
from the http client. Made it close the connection handles before
waiting on the asyncworker thread. This lets proxySpecialRemote finish
processing any data from the handle, and then it will give up,
more or less cleanly, if it didn't receive enough data.

I say "more or less cleanly" because with both sides of the P2P
connection taken down, some protocol unhappyness results. Which can lead
to some ugly debug messages. But also can cause the asyncworker thread
to throw an exception. So made withP2PConnections not crash when it
receives an exception from releaseconn.

This did have a small change to the behavior of an interrupted PUT when
proxying to a regular remote. proxyConnection has a protoerrorhandler
that closes the proxy connection on a protocol error. But the proxy
connection is also closed by checktooshort when it closes the P2P
connection. Closing the same proxy connection twice is not a problem,
it just results in duplicated debug messages about it.
2024-07-29 10:37:19 -04:00
Joey Hess
c8e7231f48
add debugging of opening and closing connections to proxies 2024-07-29 09:52:26 -04:00
Joey Hess
dfe65b92c8
avoid repeatedly parsing the proxy log 2024-07-28 16:04:20 -04:00
Joey Hess
6722a61a21
clusters need enableInteractiveBranchAccess
As seen in commit 770aac97a7, a cluster
relies accurate location logs. If long-running processes are serving a
cluster, and one process puts a file, the other process needs to see
what nodes it was stored on when checking if the file is present.
2024-07-28 12:39:42 -04:00
Joey Hess
fbbedae497
add --clusterjobs option and default to 1
The default of 1 is not ideal at all, but it avoids an accidental M*N
causing so much concurrency it becomes unusable.
2024-07-28 10:36:22 -04:00
Joey Hess
1259ad89b6
cluster support in http API server
Wired it up and it seems to basically work, although the test suite is
not fully passing.

Note that --jobs currently gets multiplied by the number of nodes in the
cluster, which is probably not good.
2024-07-28 10:17:29 -04:00
Joey Hess
d1faa13d6a
implement proxy connection pool
removeOldestProxyConnectionPool will be innefficient the larger the pool
is. A better data structure could be more efficient. Eg, make each value
in the pool include the timestamp of its oldest element, then the oldest
value can be found and modified, rather than rebuilding the whole Map.

But, for pools of a few hundred items, this should be fine. It's O(n*n log n)
or so.

Also, when more than 1 connection with the same pool key exists,
it's efficient even for larger pools, since removeOldestProxyConnectionPool
is not needed.

The default of 1 idle connection could perhaps be larger.. like the
number of jobs? Otoh, it seems good to ramp up and down the number of
connections, which does happen. With 1, there is at most one stale
connection, which might cause a request to fail.
2024-07-26 17:03:31 -04:00
Joey Hess
267a202e72
clean up after http p2p proxy GET is interrupted
There was an annex worker thread that did not get stopped.

It was stuck in ReceiveMessage from the P2PHandleTMVar.

Fixed by making P2PHandleTMVar closeable.

In serveGet, releaseP2PConnection has to come first, else the
annexworker may not shut down, if it's waiting to read from it.

In proxyConnection, call closeRemoteSide in order to wait for the ssh
process (for example).
2024-07-26 15:33:20 -04:00
Joey Hess
5ebbb31b36
close proxy remote side when done with it 2024-07-26 13:57:28 -04:00
Joey Hess
ad025b8e5e
clean up protocol version for proxying
The proxy always checks the protocol version of a remote before talking
to it in a version-specific way, so the protocol version in the ProxyParams
is the client's protocol version. The remote will always be at the same or
an older protocol version than the client.

Note that in relayDATAFinish, when the client is at protocol version 0,
the remote must thus be as well, and that's why its version is not
checked in the case for that.

With that clarified, it's evident that, in P2P.Http.State, there's no
need to look at the proxied remote's protocol version at all.
2024-07-26 13:49:05 -04:00
Joey Hess
576ec6ed71
fix hang in GET from http p2p proxy
serverP2PConnection = proxyfromclientconn causes serveGet to
signalFullyConsumedByteString to it, which is what it's waiting for
2024-07-26 12:51:00 -04:00
Joey Hess
cc1da2d516
http p2p proxy is now largely working 2024-07-26 10:44:10 -04:00
Joey Hess
96ad0ccc5b
wip 2024-07-25 15:39:57 -04:00
Joey Hess
3d14e2cf58
http server support for proxies, incomplete
Refactored git-annex-shell code so this can use checkCanProxy'.

At this point all that remains is opening a proxy connection,
and using a proxy connection.
2024-07-25 13:19:24 -04:00
Joey Hess
f5624a69e3
expire lock after 10 minutes initially
Once keeplocked is called, the lock will expire at the end
of that call. But if keeplocked never gets called, this avoids
the lock persisting forever.
2024-07-24 14:25:40 -04:00
Joey Hess
53000cf06e
exit cleanly on eg, failure to bind socket 2024-07-22 20:37:37 -04:00
Joey Hess
e979e85bff
make serveKeepLocked check auth just to be safe 2024-07-22 19:15:52 -04:00
Joey Hess
f5dd7a8bc0
implemented serveLockContent (untested) 2024-07-22 17:38:42 -04:00
Joey Hess
a01426b713
avoid padding in servePut
This means that when the client sends a truncated data to indicate
invalidity, DATA is not passed the full expected data. That leaves the
P2P connection in a state where it cannot be reused. While so far, they
are not reused, they will be later when proxies are supported. So, have
to close the P2P connection in this situation.
2024-07-22 12:30:30 -04:00
Joey Hess
726c815a7f
fix releasing of p2p connection 2024-07-22 11:26:22 -04:00
Joey Hess
4826a3745d
servePut and clientPut implementation
Made the data-length header required even for v0. This simplifies the
implementation, and doesn't preclude extra verification being done for
v0.

The connectionWaitVar is an ugly hack. In servePut, nothing waits
on the waitvar, and I could not find a good way to make anything wait on
it.
2024-07-22 10:27:44 -04:00
Joey Hess
97a2d0e4fb
use worker pool in withLocalP2PConnections
This allows multiple clients to be handled at the same time.
2024-07-11 14:37:52 -04:00
Joey Hess
74c6175795
fix serveGet early handle close
Needed that waitv after all..
2024-07-11 09:55:17 -04:00
Joey Hess
3b37b9e53f
fix serveGet hang
This came down to SendBytes waiting on the waitv. Nothing ever filled
it.

Only Annex.Proxy needs the waitv, and it handles filling it. So make it
optional.
2024-07-11 07:46:52 -04:00
Joey Hess
1e0f92a5a1
implemented serveGet and clientGet
Both are only at bare proof of concept stage. Still need to deal with
signaling validity and invalidity, and checking it.

And there's a bad bug: After -JN*2 requests, another request hangs!

So, I think it's failing to free up the Annex worker and end of request
lifetime.

Perhaps I need to use this:

https://docs.servant.dev/en/stable/cookbook/managed-resource/ManagedResource.html
2024-07-10 16:06:39 -04:00
Joey Hess
f9b7ce7224
add Annex worker pool to P2PHttp
This will be needed for get and store, since those need to run Annex
actions.

withLocalP2PConnections will also probably use it.
2024-07-10 12:19:47 -04:00
Joey Hess
48f76cb3e8
implement serveRemove and send WWW-Authenticate header on auth failure 2024-07-10 09:13:01 -04:00
Joey Hess
97d0fc9b65
git-annex p2phttp options 2024-07-10 00:01:55 -04:00
Joey Hess
6a8a4d1775
authentication is implemented
just need to make Command.P2PHttp generate a GetServerMode from options
2024-07-09 20:54:47 -04:00
Joey Hess
08371c3745
started on auth 2024-07-09 17:30:55 -04:00
Joey Hess
dcd77ee555
fix p2phttp server to not get stuck
Process 1 command, then stop. Hopefully each of the Handlers will only
need 1 command.
2024-07-09 14:26:30 -04:00
Joey Hess
3d13521479
set up handles for p2phttp
Now it fully works.. for the first request. But then it gets stuck
waiting for the P2P protocol runner to shut down.
2024-07-09 13:50:42 -04:00
Joey Hess
edf8a3df2d
p2phttp is almost working for checkpresent
The server is fully running annex actions, only the P2PConnection is
wrong, currently using stdio.
2024-07-09 13:37:55 -04:00
Joey Hess
751b8e0baf
implemented serveCheckPresent
Still need a way to run Proto though
2024-07-09 09:08:42 -04:00
Joey Hess
9a592f946f
split module 2024-07-08 21:12:23 -04:00