git-annex

Author	SHA1	Message	Date
Joey Hess	46ea041eba	more fixing for building without servant	2024-11-21 15:35:06 -04:00
Joey Hess	204c19583c	more fixing for building without servant	2024-11-21 15:34:07 -04:00
Joey Hess	757f93203a	Merge branch 'p2phttp-multi'	2024-11-21 15:16:06 -04:00
Joey Hess	4c785c338a	p2phttp: notice when new repositories are added to --directory When a uuid is not known, rescan for new repositories. Easy. When a repository is removed, it will also get removed from the server state on the next scan. But until a new uuid is seen, there will not be a scan. This leaves the server trying to serve a uuid whose repository is gone. That seems buggy. While getting just fails, dropping fails the first time, but seems to leave the server in an unusable state, so the next drop attempt hangs. The server is still able to serve other uuids, only the one whose repository was removed has that problem.	2024-11-21 15:09:12 -04:00
Joey Hess	3c18398d5a	p2phttp support --jobs with --directory --jobs is usually an Annex option setter, but --directory runs in IO, so would not have that available. So instead moved the option parser into the command's Options.	2024-11-21 14:15:14 -04:00
Joey Hess	9f84dd82da	p2phttp --directory implementation Untested, but it compiles, so. Known problems: * --jobs is not available to startIO * Does not notice when new repositories are added to a directory. * Does not notice when repositories are removed from a directory.	2024-11-21 14:02:58 -04:00
Joey Hess	6bdf4a85fb	move the p2phttp server state map into a data type	2024-11-21 12:24:14 -04:00
Joey Hess	475823c2d3	fix to build w/o servant	2024-11-20 16:29:43 -04:00
Joey Hess	07026cf58b	add proxied uuids to http server state map This fixes support for proxying after last commit broke it. Note that withP2PConnections is called at server startup, and so only proxies seen at that point will appear in the map and be used. It was already the case that a proxy added after p2phttp was running would not be served. I think that is possibly a bug, but at least this commit doesn't introduce the problem, though it might make it harder to fix it. As bugs go, it's probably not a big deal, because after all, git configs need to be set in the local repository, followed by git-annex updateproxy being run, to set up proxying. If someone is doing that, they can restart their http server I suppose.	2024-11-20 13:22:25 -04:00
Joey Hess	254073569f	p2pHttpApp with a map of UUIDs to server states This is early groundwork for making p2phttp support serving multiple repositories from a single daemon. So far only 1 repository is served still. And this commit breaks support for proxying!	2024-11-20 12:51:25 -04:00
Joey Hess	b8a717a617	reuse http url password for p2phttp url when on same host When remote.name.annexUrl is an annex+http(s) url, that uses the same hostname as remote.name.url, which is itself a http(s) url, they are assumed to share a username and password. This avoids unnecessary duplicate password prompts.	2024-11-19 15:27:26 -04:00
Joey Hess	54dc1d6f6e	fix recording present on the proxy when proxying DATA-PRESENT Apparently the protoerrhandler parameter never runs. Also the const typo prevented the type checker from complaining that relayPUTRecord was being called with 1 less parameter than needed. So move the relayPUTRecord out of it.	2024-10-30 13:17:31 -04:00
Joey Hess	2fc3fbfed2	support DATA-PRESENT in the p2p protocol proxy But not yet when proxying to special remotes. When proxying for a cluster, the client can store the object on any node or nodes of the cluster, and send DATA-PRESENT. That gets proxied to each node, and if any of them agree that they have the data, the proxy will respond with SUCCESS or SUCCESS-PLUS. I think it's ok to not check for the proxied remotes supporting protocol version 4. When there are multiple remotes in a cluster, it behaves as described above, and if they all respond with ERROR, the result will be FAILURE. And when not proxying for a cluster, the proxy negotiates the p2p protocol to be the same version or lower than the proxied remote, which will prevent sending DATA-PRESENT when it's too old.	2024-10-29 14:44:23 -04:00
Joey Hess	a4e9057486	implement put data-present parameter in http servant Changed the protocol docs because servant parses "true" and "false" for booleans in query parameters, not "1" and "0". clientPut with datapresent=True is not used by git-annex, and I don't anticipate it being used in git-annex, except for testing. I've tested this by making clientPut be called with datapresent=True and git-annex copy to a remote succeeds once the object file is first manually copied to the remote. That would be a good test for the test suite, but running the http client means exposing it to at least localhost, and would fail if a real http client was already running on that port.	2024-10-29 13:32:43 -04:00
Joey Hess	57e27adb55	implement DATA-PRESENT in p2p protocol Not yet implemented for the http server or the proxy.	2024-10-29 13:12:12 -04:00
Joey Hess	20df236a13	update http servant for p2p protocol version 4 This is all just adding the v4 routes and boilerplate. At this point v4 is implemented the same as v3.	2024-10-29 12:13:56 -04:00
Joey Hess	de138c642b	p2phttp: Allow unauthenticated users to lock content by default * p2phttp: Allow unauthenticated users to lock content by default. * p2phttp: Added --unauth-nolocking option to prevent unauthenticated users from locking content. The rationale for this is that locking is not really a write operation, so makes sense to allow in a repository that only allows read-only access. Not supporting locking in that situation will prevent the user from dropping content from a special remote they control in cases where the other copy of the content is on the p2phttp server. Also, when p2phttp is configured to also allow authenticated access, lockcontent was resulting in a password prompt for users who had no way to authenticate. And there is no good way to distinguish between the two types of users client side. --unauth-nolocking anticipates that this might be abused, and seems better than disabling unauthenticated access entirely if a server is being attacked. It may be that rate limiting locking by IP address or similar would be an effective measure in such a situation. Or just limiting the number of locks by anonymous users that can be live at any one time. Since the impact of such a DoS attempt is limited to preventing dropping content from the server, it seems not a very appealing target anyway.	2024-10-21 10:02:12 -04:00
Joey Hess	b83fdf66df	Allow enabling the servant build flag with older versions of stm Allowing building with ghc 9.0.2 (debian stable). Updated patch covering all uses of writeTMVar.	2024-10-17 20:55:31 -04:00
Joey Hess	3a53c60121	Allow enabling the servant build flag with older versions of stm Allowing building with ghc 9.0.2 (debian stable).	2024-10-17 14:04:31 -04:00
Joey Hess	0629219617	p2phttp combining unauth and auth options p2phttp: Support serving unauthenticated users while requesting authentication for operations that need it. Eg, --unauth-readonly can be combined with --authenv. Drop locking currently needs authentication so it will prompt for that. That still needs to be addressed somehow.	2024-10-17 11:10:28 -04:00
Joey Hess	84d1bb746b	LiveUpdate for clusters	2024-08-24 10:20:12 -04:00
Joey Hess	1d51f18dd0	remove FIXME Using NoLiveUpdate here is appropriate, because this is running the server side of the P2P protocol. There, no preferred content checking is done.	2024-08-24 09:34:22 -04:00
Joey Hess	c3d40b9ec3	plumb in LiveUpdate (WIP) Each command that first checks preferred content (and/or required content) and then does something that can change the sizes of repositories needs to call prepareLiveUpdate, and plumb it through the preferred content check and the location log update. So far, only Command.Drop is done. Many other commands that don't need to do this have been updated to keep working. There may be some calls to NoLiveUpdate in places where that should be done. All will need to be double checked. Not currently in a compilable state.	2024-08-23 16:35:12 -04:00
Joey Hess	509b23fa00	catch ClientError from withClientM When getting from a P2P HTTP remote, prompt for credentials when required, instead of failing. This feels like it might be a bug in servant-client. withClientM's type suggests it would not throw a ClientError. But it does in this case.	2024-08-07 11:24:34 -04:00
Joey Hess	7c6c3e703b	clean up build warnings when built w/o servant	2024-07-31 14:07:30 -04:00
Joey Hess	76f31d59b0	more fixes to build w/o servant	2024-07-30 12:39:17 -04:00
Joey Hess	456ec9ccf2	typo	2024-07-30 12:18:39 -04:00
Joey Hess	1632beaf70	fix negative DATA when 1 node of a cluster has a partial transfer	2024-07-30 11:42:17 -04:00
Joey Hess	41cef62dad	fix build without servant some more	2024-07-30 10:53:44 -04:00
Joey Hess	640fdffd12	fix build without servant	2024-07-30 09:49:37 -04:00
Joey Hess	acb436b999	fix build with text older than 2.0	2024-07-29 18:15:29 -04:00
Joey Hess	1467fed572	fix build with old text Don't need decodeUtf8Lenient here because B64.encode surely always generates utf8. So decodeUtf8 is safe, it will never throw an exception.	2024-07-29 17:21:41 -04:00
Joey Hess	7402ae61d9	fix reversion in GET from proxy over http `4f3ae96666` caused a hang in GET, which git-annex testremote could reliably cause. The problem is that closing both P2P handles before waiting on the asyncworker prevents all the DATA from getting sent. The solution is to only close the P2P handles early when the P2PConnection is being closed. When it's being released, let the asyncworker finish. closeP2PConnection is called in GET when it was unable to send all data, and in PUT when it did not receive all the data, and in both cases closing the P2P handles early is ok.	2024-07-29 11:07:09 -04:00
Joey Hess	4f3ae96666	cleanly close proxy connection on interrupted PUT An interrupted PUT to a cluster that has a node that is a special remote over http left open the connection to the cluster, so the next request opens another one. So did an interrupted PUT directly to the proxied special remote over http. proxySpecialRemote was stuck waiting for all the DATA. Its connection remained open so it kept waiting. In servePut, checktooshort handles closing the P2P connection when too little data is received from PUT. But, checktooshort was only called after the protoaction, which is what runs the proxy, which is what was getting stuck. Modified it to run as a background thread, which waits for the tooshortv to be written to, which gather always does once it gets to the end of the data received from the http client. That makes proxyConnection's releaseconn run once all data is received from the http client. Made it close the connection handles before waiting on the asyncworker thread. This lets proxySpecialRemote finish processing any data from the handle, and then it will give up, more or less cleanly, if it didn't receive enough data. I say "more or less cleanly" because with both sides of the P2P connection taken down, some protocol unhappiness results. Which can lead to some ugly debug messages. But also can cause the asyncworker thread to throw an exception. So made withP2PConnections not crash when it receives an exception from releaseconn. This did have a small change to the behavior of an interrupted PUT when proxying to a regular remote. proxyConnection has a protoerrorhandler that closes the proxy connection on a protocol error. But the proxy connection is also closed by checktooshort when it closes the P2P connection. Closing the same proxy connection twice is not a problem, it just results in duplicated debug messages about it.	2024-07-29 10:37:19 -04:00
Joey Hess	c8e7231f48	add debugging of opening and closing connections to proxies	2024-07-29 09:52:26 -04:00
Joey Hess	5ef3f1e703	remove unused imports	2024-07-28 21:11:23 -04:00
Joey Hess	cd89f91aa5	remove uuid from annex+http urls Not needed it turns out.	2024-07-28 20:29:42 -04:00
Joey Hess	dfe65b92c8	avoid repeatedly parsing the proxy log	2024-07-28 16:04:20 -04:00
Joey Hess	5e205f215d	clean shut down of cluster connection when PUT is interrupted An interrupted `git-annex copy --to` a cluster via the http server, when repeated, failed. The http server output "transfer already in progress, or unable to take transfer lock". Apparently a second connection was opened to the cluster, because the first connection never got shut down. Turned out the problem was that when proxying to a cluster, it would read a short ByteString from the client, and send that to the nodes. But that left the nodes wanting more. Meanwhile, the proxy was expecting a SUCCESS/FAILURE message from the nodes. So it didn't return, and so the cluster connection stayed open.	2024-07-28 14:20:11 -04:00
Joey Hess	6722a61a21	clusters need enableInteractiveBranchAccess As seen in commit `770aac97a7`, a cluster relies on accurate location logs. If long-running processes are serving a cluster, and one process puts a file, the other process needs to see what nodes it was stored on when checking if the file is present.	2024-07-28 12:39:42 -04:00
Joey Hess	4304f1b6ae	better handling of content not available from cluster Sending ERROR caused the client to get confused and protocol to freeze. Better to send empty DATA and indicate it's not valid. This fixes a hang in git-annex testremote of a cluster accessed via the http server. That testremote is still failing, for some reason after storing a test key, the cluster reports it as not present.	2024-07-28 11:09:07 -04:00
Joey Hess	fbbedae497	add --clusterjobs option and default to 1 The default of 1 is not ideal at all, but it avoids an accidental M*N causing so much concurrency it becomes unusable.	2024-07-28 10:36:22 -04:00
Joey Hess	1259ad89b6	cluster support in http API server Wired it up and it seems to basically work, although the test suite is not fully passing. Note that --jobs currently gets multiplied by the number of nodes in the cluster, which is probably not good.	2024-07-28 10:17:29 -04:00
Joey Hess	0fb86d2916	UNLOCKCONTENT is not a top-level request proxyRequest was treating UNLOCKCONTENT as a separate request. That made it possible for there to be two different connections to the proxied remote, with LOCKCONTENT being sent to one, and UNLOCKCONTENT to the other one. A protocol error. git-annex testremote now passes against a http proxied remote.	2024-07-26 20:39:06 -04:00
Joey Hess	a3dab58be2	fix hang at end of PUT to proxied p2p http remote sendExactly will now be sure to evaluate the whole lazy ByteString. In this case, the lazy ByteString was exactly the right length. But, it seems that L.take caused it to not actually be fully evaluated. In servePut, this manifested as gather never being fully evaluated, which caused the hang. Very, very subtle, and horrible bug. Clearly the use of lazy ByteString (or really just laziness) is at fault, and it would be very worth moving to conduit or whatever to avoid this.	2024-07-26 19:50:15 -04:00
Joey Hess	d1faa13d6a	implement proxy connection pool removeOldestProxyConnectionPool will be inefficient the larger the pool is. A better data structure could be more efficient. Eg, make each value in the pool include the timestamp of its oldest element, then the oldest value can be found and modified, rather than rebuilding the whole Map. But, for pools of a few hundred items, this should be fine. It's O(n*n log n) or so. Also, when more than 1 connection with the same pool key exists, it's efficient even for larger pools, since removeOldestProxyConnectionPool is not needed. The default of 1 idle connection could perhaps be larger.. like the number of jobs? Otoh, it seems good to ramp up and down the number of connections, which does happen. With 1, there is at most one stale connection, which might cause a request to fail.	2024-07-26 17:03:31 -04:00
Joey Hess	fb43b7ea3f	closeP2PConnection on interrupted GET	2024-07-26 15:50:01 -04:00
Joey Hess	267a202e72	clean up after http p2p proxy GET is interrupted There was an annex worker thread that did not get stopped. It was stuck in ReceiveMessage from the P2PHandleTMVar. Fixed by making P2PHandleTMVar closeable. In serveGet, releaseP2PConnection has to come first, else the annexworker may not shut down, if it's waiting to read from it. In proxyConnection, call closeRemoteSide in order to wait for the ssh process (for example).	2024-07-26 15:33:20 -04:00
Joey Hess	5ebbb31b36	close proxy remote side when done with it	2024-07-26 13:57:28 -04:00
Joey Hess	b028b1a379	oops	2024-07-26 13:55:14 -04:00

315 commits