git-annex

Author	SHA1	Message	Date
Joey Hess	f98605bce7	a local git remote cannot proxy Prevent listProxied from listing anything when the proxy remote's url is a local directory. Proxying does not work in that situation, because the proxied remotes have the same url, and so git-annex-shell is not run when accessing them, instead the proxy remote is accessed directly. I don't think there is any good way to support this. Even if the instantiated git repos for the proxied remotes somehow used an url that caused it to use git-annex-shell to access them, planned features like `git-annex copy --to proxy` accepting a key and sending it on to nodes behind the proxy would not work, since git-annex-shell is not used to access the proxy. So it would need to use something to access the proxy that causes git-annex-shell to be run and speaks P2P protocol over it. And we have that. It's a ssh connection to localhost. Of course, it would be possible to take ssh out of that mix, and swap in something that does not have encryption overhead and authentication complications, but otherwise behaves the same as ssh. And if the user wants to do that, GIT_SSH does exist.	2024-06-12 10:16:04 -04:00
Joey Hess	c6e0710281	proxying to local git remotes works This just happened to work correctly. Rather surprisingly. It turns out that openP2PSshConnection actually also supports local git remotes, by just running git-annex-shell with the path to the remote. Renamed "P2PSsh" to "P2PShell" to make this clear.	2024-06-12 10:10:11 -04:00
Joey Hess	178da0dc99	Merge branch 'master' into proxy	2024-06-12 09:49:30 -04:00
Joey Hess	345494e3b4	expanding on the exporttree=yes design	2024-06-12 09:43:59 -04:00
Joey Hess	6e1df33960	minimized code duplication due to type checker limitations	2024-06-11 17:16:49 -04:00
Joey Hess	5beaffb412	proxying PUT now working The almost identical code duplication between relayDATA and relayDATA' is very annoying. I tried quite a few things to parameterize them, but the type checker is having fits when I try it.	2024-06-11 16:56:52 -04:00
Joey Hess	ed4fda098b	todo	2024-06-11 15:15:58 -04:00
Joey Hess	a2f4a8eddf	proxying GET now working Memory use is small and constant; receiveBytes returns a lazy bytestring and it does stream. Comparing speed of a get of a 500 mb file over proxy from origin-origin, vs from the same remote over a direct ssh: joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from origin-origin get bigfile (from origin-origin...) ok (recording state in git...) 1.89user 0.67system 0:10.79elapsed 23%CPU (0avgtext+0avgdata 68716maxresident)k 0inputs+984320outputs (0major+10779minor)pagefaults 0swaps joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from direct-ssh get bigfile (from direct-ssh...) ok 1.79user 0.63system 0:10.49elapsed 23%CPU (0avgtext+0avgdata 65776maxresident)k 0inputs+1024312outputs (0major+9773minor)pagefaults 0swaps So the proxy doesn't add much overhead even when run on the same machine as the client and remote. Still, piping receiveBytes into sendBytes like this does suggest that the proxy could be made to use less CPU resources by using `sendfile()`.	2024-06-11 15:09:43 -04:00
Joey Hess	09b5e53f49	set annex.uuid in proxy's Repo getRepoUUID looks at that, and was seeing the annex.uuid of the proxy. Which caused it to unnecessarily set the git config. Probably also would have led to other problems.	2024-06-11 13:40:50 -04:00
Joey Hess	657a91527a	update	2024-06-11 13:22:03 -04:00
Joey Hess	dd429ba8fe	Merge branch 'master' of ssh://git-annex.branchable.com	2024-06-11 13:08:45 -04:00
Joey Hess	5bb7f8cd64	Merge branch 'master' into proxy	2024-06-11 13:08:23 -04:00
Joey Hess	d2e3c5c89f	update	2024-06-11 13:07:53 -04:00
Joey Hess	60e63fb85b	enable proxying for git-annex-shell p2pstdio	2024-06-11 13:07:04 -04:00
Joey Hess	58d8ba5a4f	implement simple proxy actions (untested) Still need to implement GET and PUT, and will implement CONNECT and NOTIFYCHANGE for completeness. All ServerMode checking is implemented for the proxy. There are two possible approaches for how the proxy sends back messages from the remote to the client. One would be to have a background thread that reads messages and sends them back as they come in. The other, which is being implemented so far, is to read messages from the remote at points where it is expected to send them, and relay back to the client before reading the next message from the client. At this point, I'm unsure which approach would be better. The need for proxynoresponse to be used by UNLOCKCONTENT, for example, builds protocol knowledge into the proxy which it would not need with the other method.	2024-06-11 12:56:20 -04:00
Joey Hess	373ae49c87	factor out helper functions These will be used by the proxy, which needs to check the ServerMode in the same way.	2024-06-11 12:04:58 -04:00
Joey Hess	92c83a417f	refactoring	2024-06-11 10:22:05 -04:00
NewUser	124c1313bb		2024-06-11 13:31:01 +00:00
Joey Hess	501d65eeab	started implementing git-annex-shell proxy So far, it negotiates VERSION with both parties. This is a tricky dance. Untested.	2024-06-10 18:01:36 -04:00
Joey Hess	7b1548dbfa	correct AUTH-SUCCESS and AUTH-FAILURE It's AUTH_SUCCESS internally in git-annex, but the line based serialization uses AUTH-SUCCESS.	2024-06-10 15:06:27 -04:00
Joey Hess	317786d219	remove dead code	2024-06-10 14:28:58 -04:00
Joey Hess	649b87bedd	Merge branch 'master' into proxy	2024-06-10 14:26:18 -04:00
Joey Hess	9a8391078a	git-annex-shell: block relay requests connRepo is only used when relaying git upload-pack and receive-pack. That's only supposed to be used when git-annex-remotedaemon is serving git-remote-tor-annex connections over tor. But, it was always set, and so could be used in other places possibly. Fixed by making connRepo optional in the P2P protocol interface. In Command.EnableTor, it's not needed, because it only speaks the protocol in order to check that it's able to connect back to itself via the hidden service. So changed that to pass Nothing rather than the git repo. In Remote.Helper.Ssh, it's connecting to git-annex-shell p2pstdio, so is making the requests, so will never need connRepo. In git-annex-shell p2pstdio, it was accepting git upload-pack and receive-pack requests over the P2P protocol, even though nothing sent them. This is arguably a security hole, particularly if the user has set environment variables like GIT_ANNEX_SHELL_LIMITED to prevent git push/pull via git-annex-shell.	2024-06-10 14:16:27 -04:00
Joey Hess	d2576e5f1a	git-annex-shell: accept uuid of remote that proxying is enabled for For NotifyChanges and also for the fallthrough case where git-annex-shell passes a command off to git-shell, proxying is currently ignored. So every remote that is accessed via a proxy will be treated as the same git repository. Every other command listed in cmdsMap will need to check if Annex.proxyremote is set, and if so handle the proxying appropriately. Probably only P2PStdio will need to support proxying. For now, everything else refuses to work when proxying. The part of that I don't like is that there's the possibility a command later gets added to the list that doesn't check proxying. When proxying is not enabled, it's important that git-annex-shell not leak information that it would not have exposed before. Such as the names or uuids of remotes. I decided that, in the case where a repository used to have proxying enabled, but no longer supports any proxies, it's ok to give the user a clear error message indicating that proxying is not configured, rather than a confusing uuid mismatch message. Similarly, if a repository has proxying enabled, but not for the requested repository, give a clear error message. A tricky thing here is how to handle the case where there is more than one remote, with proxying enabled, with the specified uuid. One way to handle that would be to plumb the proxyRemoteName all the way through from the remote git-annex to git-annex-shell, eg as a field, and use only a remote with the same name. That would be very intrusive though. Instead, I decided to let the proxy pick which remote it uses to access a given Remote. And so it picks the least expensive one. The client after all doesn't necessarily know any details about the proxy's configuration. This does mean though, that if the least expensive remote is not accessible, but another remote would have worked, an access via the proxy will fail.	2024-06-10 12:44:35 -04:00
Joey Hess	783eb8879a	notes on behavior	2024-06-10 11:07:04 -04:00
jlueters@79a910340cdff27611c6a650c108afbe2f61c5f6	daa2c6cce1		2024-06-10 14:24:34 +00:00
Joey Hess	b1cc8c6837	Merge branch 'master' of ssh://git-annex.branchable.com	2024-06-07 16:52:04 -04:00
Joey Hess	25a6ab6f11	Avoid grafting in export tree objects that are missing They could be missing due to an interrupted git-annex at just the wrong time during a prior graft, after which the tree objects got garbage collected. Or they could be missing because of manual messing with the git-annex branch, eg resetting it to back before the graft commit. Sponsored-by: Dartmouth College's OpenNeuro project	2024-06-07 16:51:50 -04:00
emilymaers	3947a51cc8	removed	2024-06-07 20:46:34 +00:00
emilymaers	4fbfc5e5ac	Added a comment: blockchain	2024-06-07 20:46:20 +00:00
Joey Hess	b32c4c2e98	atomic git-annex branch update when regrafting in transition Fix a bug where interrupting git-annex while it is updating the git-annex branch could lead to git fsck complaining about missing tree objects. Interrupting git-annex while regraftexports is running in a transition that is forgetting git-annex branch history would leave the repository with a git-annex branch that did not contain the tree shas listed in export.log. That lets those trees be garbage collected. A subsequent run of the same transition then regrafts the trees listed in export.log into the git-annex branch. But those trees have been lost. Note that both sides of `if neednewlocalbranch` are atomic now. I had thought only the True side needed to be, but I do think there may be cases where the False side needs to be as well. Sponsored-by: Dartmouth College's OpenNeuro project	2024-06-07 16:34:10 -04:00
Joey Hess	f5532be954	graft in exported tree before updating the export log It was possible for the export.log to get written and then git-annex was interrupted, before it could graft in the exported tree. Which could result in export.log referencing a tree that got garbage collected.	2024-06-07 15:25:02 -04:00
Joey Hess	6568ba4904	Merge branch 'master' into proxy	2024-06-07 12:35:47 -04:00
Joey Hess	43ff697f25	update status and design work on proxy encryption and chunking	2024-06-07 12:35:04 -04:00
Joey Hess	a0e59c1d17	comment	2024-06-07 12:35:00 -04:00
Joey Hess	4b940c92bb	proxied remotes working on client side Got the right git config settings inherited now. Note that the url config is not passed on to git, so it won't be able to access the proxied remote. That would need some kind of git-remote-annex but for proxied remotes anyway. Unsure yet if that will be needed.	2024-06-07 12:11:46 -04:00
Joey Hess	5aaa285083	Merge branch 'master' into proxy	2024-06-07 10:43:13 -04:00
Joey Hess	058726ee86	next step identified	2024-06-06 18:06:45 -04:00
Joey Hess	d59383beaf	update	2024-06-06 17:25:22 -04:00
Joey Hess	9bc4dd635c	update	2024-06-06 17:23:51 -04:00
Joey Hess	b43c835def	instantiate remotes that are behind a proxy remote Untested, but this should be close to working. The proxied remotes have the same url but a different uuid. When talking to current git-annex-shell, it will fail due to a uuid mismatch. Once it supports proxies, it will know that the presented uuid is for a remote that it proxies for. The check for any git config settings for a remote with the same name as the proxied remote is there for several reasons. One is security: Writing a name to the proxy log should not cause changes to how an existing, configured git remote operates in a different clone of the repo. It's possible that the user has been using a proxied remote, and decides to set a git config for it. We can't tell the difference between that scenario and an evil remote trying to eg, intercept a file upload by replacing their remote with a proxied remote. Also, if the user sets some git config, does it override the config inherited from the proxy remote? Seems a difficult question. Luckily, the above means we don't need to think through it. This does mean though, that in order for a user to change the config of a proxy remote, they have to manually set its annex-uuid and url, as well as the config they want to change. They may also have to set any of the inherited configs that they were relying on.	2024-06-06 17:15:32 -04:00
Joey Hess	7f1cdb3107	remote git config inheritance for proxied remotes When there is a proxy remote, remotes that it proxies need to be constructed with the right subset of the remote git-config settings. Obviously, the url is the same, and the uuid is different. Added proxyInheritedFields that lists all the fields that should be inherited. These will be copied into the proxied remote when instantiating it. There were a lot of decisions here, made without certainty in some cases. May need to revisit them. The RemoteGitConfigField type was added to make sure that every config used in extractRemoteGitConfig gets considered for proxy inheritance, including new ones that get added going forward. And to avoid needing to write the field string more than once.	2024-06-06 16:30:40 -04:00
Joey Hess	a72d0f69d0	filter out illegal remote names when reading proxy log	2024-06-06 12:51:30 -04:00
Joey Hess	d208b03e5d	Merge branch 'master' into proxy	2024-06-06 12:42:18 -04:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	1e6b4f324a	removed	2024-06-06 13:40:26 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	6274d16102	Added a comment	2024-06-06 11:23:55 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	d4993248eb	Added a comment	2024-06-06 11:23:34 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	a1e1af35af		2024-06-06 10:29:21 +00:00
nobodyinperson	6985c62a47	Added a comment	2024-06-06 09:09:03 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	7dbfb16415		2024-06-05 17:45:49 +00:00
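
Commit d2576e5f1a above describes how the proxy chooses among several remotes that share the requested uuid and have proxying enabled: it picks the least expensive one, and does not fall back to another if that one turns out to be inaccessible. A minimal Python sketch of that selection rule follows. It is an illustration only, not git-annex's Haskell code; the `Remote` record, its field names, the `pick_proxied_remote` helper, and the example costs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Remote:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    uuid: str
    cost: float
    proxy_enabled: bool


def pick_proxied_remote(remotes: Iterable[Remote], wanted_uuid: str) -> Optional[Remote]:
    """Return the cheapest proxy-enabled remote with the requested uuid.

    Returns None when no remote qualifies, which corresponds to the
    "proxying is not configured for this repository" error case.
    """
    candidates = [r for r in remotes if r.proxy_enabled and r.uuid == wanted_uuid]
    if not candidates:
        return None
    # Lowest cost wins. There is deliberately no fallback to a more
    # expensive remote, mirroring the trade-off noted in the commit message.
    return min(candidates, key=lambda r: r.cost)


remotes = [
    Remote("slow", "uuid-a", 200.0, True),
    Remote("fast", "uuid-a", 100.0, True),
    Remote("unproxied", "uuid-a", 50.0, False),
    Remote("other", "uuid-b", 10.0, True),
]
chosen = pick_proxied_remote(remotes, "uuid-a")
```

Here `chosen` is the remote named `fast`: the cheaper `unproxied` remote is skipped because proxying is not enabled for it, and `other` has a different uuid. Keeping the choice on the proxy side matches the commit's reasoning that the client does not necessarily know anything about the proxy's configuration.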

44878 commits