git-annex

Author	SHA1	Message	Date
Joey Hess	46d4316954	implement annex.retry et al Added annex.retry, annex.retry-delay, and per-remote versions to configure transfer retries. This commit was supported by the NSF-funded DataLad project.	2018-03-29 13:04:07 -04:00
Joey Hess	31e1adc005	deal with unlocked files P2P protocol version 1 adds VALID|INVALID after DATA; INVALID means the file was detected to change content while it was being sent and so we may not have received the valid content of the file. Added new MustVerify constructor for Verification, which forces verification even when annex.verify=false etc. This is used when INVALID and in protocol version 0. As well as changing git-annex-shell p2pstdio, this makes git-annex tor remotes always force verification, since they don't yet use protocol version 1. Previously, annex.verify=false could skip verification when using tor remotes, and let bad data into the repository. This commit was sponsored by Jack Hill on Patreon.	2018-03-13 14:27:14 -04:00
Joey Hess	e16b069331	use total size from DATA Noticed that getting a key whose size is not known resulted in a progress display that didn't include the percent complete. Fixed for P2P by making the size sent with DATA be used to update the meter's total size. In order for rateLimitMeterUpdate to also learn the total size, had to make it be passed the Meter, and some other reorg in Utility.Metered was also done so that --json-progress can construct a Meter to pass to rateLimitMeterUpdate. When the fallback rsync is done, the progress display still doesn't include the percent complete. Only way to fix that seems to be to let rsync display its output again, but that would conflict with git-annex's own progress meter, which is also being displayed. This commit was sponsored by Henrik Riomar on Patreon.	2018-03-12 21:46:58 -04:00
Joey Hess	28589c92d2	no protocol 1 yet	2018-03-12 15:42:15 -04:00
Joey Hess	596af7cbc4	move protocol version stuff to the Net free monad Needs to be in Net not Local, so that Net actions can take the protocol version into account. This commit was sponsored by an anonymous bitcoin donor.	2018-03-12 15:20:51 -04:00
Joey Hess	c81768d425	version the P2P protocol Unfortunately ReceiveMessage didn't handle unknown messages the way it was documented to; client sending VERSION would cause the server to return an ERROR and hang up. Fixed that, but old releases of git-annex use the P2P protocol for tor and will still have that behavior. So, version is not negotiated for Remote.P2P connections, only for Remote.Git connections, which will support VERSION from their first release. There will need to be a later flag day to change Remote.P2P; left a commented out line that is the only thing that will need to be changed then. Version 1 of the P2P protocol is not implemented yet, but updated the docs for the DATA change that will be allowed by that version. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-03-12 14:36:35 -04:00
Joey Hess	c036a380b2	p2p ssh connection pools Much like Remote.P2P, there's a pool of connections to a peer, in order to support concurrent operations. Deals with old git-annex-ssh on the remote that does not support p2pstdio, by only trying once to use it, and remembering if it's not supported. Made p2pstdio send an AUTH_SUCCESS with its uuid, which serves the dual purposes of something to detect to see that the connection is working, and a way to verify that it's connected to the right uuid. (There's a redundant uuid check since the uuid field is sent by git_annex_shell, but I anticipate that being removed later when the legacy git-annex-shell stuff gets removed.) Not entirely happy with Remote.Git.runSsh's behavior when the proto action fails. Running the fallback will work ok, but what will we do when the fallbacks later get removed? It might be better to try to reconnect, in case the connection got closed. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-03-08 15:11:31 -04:00
Joey Hess	6ddfa9807b	implemented git-annex-shell p2pstdio Not yet used by git-annex, but this will allow faster transfers etc than using individual ssh connections and rsync. Not called git-annex-shell p2p, because git-annex p2p does something else and I don't want two subcommands with the same name between the two for sanity reasons. This commit was sponsored by Øyvind Andersen Holm.	2018-03-07 15:38:01 -04:00
Joey Hess	f4103744c3	make sure that lockContentShared is always paired with an inAnnex check lockContentShared had a screwy caveat that it didn't verify that the content was present when locking it, but in the most common case, eg indirect mode, it failed to lock when the content is not present. That led to a few callers forgetting to check inAnnex when using it, but the potential data loss was unlikely to be noticed because it only affected direct mode I think. Fix data loss bug when the local repository uses direct mode, and a locally modified file is dropped from a remote repository. The bug caused the modified file to be counted as a copy of the original file. (This is not a severe bug because in such a situation, dropping from the remote and then modifying the file is allowed and has the same end result.) And, in content locking over tor, when the remote repository is in direct mode, it neglected to check that the content was actually present when locking it. This could cause git annex drop to remove the only copy of a file when it thought the tor remote had a copy. So, make lockContentShared do its own inAnnex check. This could perhaps be optimised for direct mode, to avoid the check then, since locking the content necessarily verifies it exists there, but I have not bothered with that. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-03-07 14:23:52 -04:00
Joey Hess	572a45ae00	add readonly mode to serve P2P protocol This will be used by git-annex-shell when configured to be readonly. This commit was sponsored by Nick Daly on Patreon.	2018-03-07 13:15:55 -04:00
Joey Hess	ba53f60801	refactor	2018-03-06 15:14:53 -04:00
Joey Hess	73704b22a9	comment typo	2018-03-06 14:58:24 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	00be07070c	fix build on windows	2016-12-30 12:31:51 -04:00
Joey Hess	b219be5100	refactor	2016-12-30 12:31:17 -04:00
Joey Hess	8484c0c197	Always use filesystem encoding for all file and handle reads and writes. This is a big scary change. I have convinced myself it should be safe. I hope!	2016-12-24 14:46:31 -04:00
Joey Hess	e08691b393	enable-tor: When run as a regular user, test a connection back to the hidden service over tor. This way we know that after enable-tor, the tor hidden service is fully published and working, and so there should be no problems with it at pairing time. It has to start up its own temporary listener on the hidden service. It would be nice to have it start the remotedaemon running, so that extra step is not needed afterwards. But, there may already be a remotedaemon running, in communication with the assistant and we don't want to start another one. I thought about trying to HUP any running remotedaemon, but Windows does not make it easy to do that. In any case, having the user start the remotedaemon themselves lets them know it needs to be running to serve the hidden service. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2016-12-24 12:50:23 -04:00
Joey Hess	f3a4b9191c	refactor	2016-12-24 12:14:14 -04:00
Joey Hess	22252e8e4c	Revert "close" This reverts commit `3aaabc906b`. Commit contained incomplete work.	2016-12-24 12:07:15 -04:00
Joey Hess	3aaabc906b	close	2016-12-22 13:59:21 -04:00
Joey Hess	405fbd25e1	include tor-annex in hidden service directory names To make it easier to manage/delete them etc. Backwards compatibility is preserved for existing tor configs.	2016-12-21 14:39:32 -04:00
Joey Hess	38f9337e16	Revert "p2p --link now defaults to setting up a bi-directional link" This reverts commit `3037feb1bf`. On second thought, this was an overcomplication of what should be the lowest-level primitive. Let's build bi-directional links at the pairing level with eg magic wormhole.	2016-12-16 18:26:07 -04:00
Joey Hess	3037feb1bf	p2p --link now defaults to setting up a bi-directional link Both the local and remote git repositories get remotes added pointing at one-another. Makes pairing twice as easy! Security: The new LINK command in the protocol can be sent repeatedly, but only by a peer who has authenticated with us. So, it's entirely safe to add a link back to that peer, or to some other peer it knows about. Anything we receive over such a link, the peer could send us over the current connection. There is some risk of being flooded with LINKs, and adding too many remotes. To guard against that, there's a hard cap on the number of remotes that can be set up this way. This will only be a problem if setting up large p2p networks that have exceptional interconnectedness. A new, dedicated authtoken is created when sending LINK. This also allows, in theory, using a p2p network like tor, to learn about links on other networks, like telehash. This commit was sponsored by Bruno BEAUFILS on Patreon.	2016-12-16 16:38:06 -04:00
Joey Hess	16c6333f09	fix build with old ghc	2016-12-10 11:12:18 -04:00
Joey Hess	fa1b3a19f9	hang up connection after relaying Seems that git upload-pack outputs a "ONCDN " that is not read by the remote git receive-pack. This fixes: [2016-12-09 17:08:32.77159731] P2P > ERROR protocol parse error: "ONCDN "	2016-12-09 17:11:16 -04:00
Joey Hess	52ccd44812	avoid exposing auth tokens in debug	2016-12-09 16:55:48 -04:00
Joey Hess	217c3b0a21	debug dump P2P messages	2016-12-09 16:45:36 -04:00
Joey Hess	9dd510bf29	make tor hidden service work when directory watching is not available Avoid crashing when built w/o inotify..	2016-12-09 16:40:47 -04:00
Joey Hess	2c907fff51	remotedaemon: git change detection over tor hidden service	2016-12-09 16:02:43 -04:00
Joey Hess	f7687e0876	only start ref change watcher thread once per P2P connection This is more efficient. Note that the peer will get CHANGED messages for all refs changed since the connection opened, even if those changes happened before it sent NOTIFYCHANGE.	2016-12-09 15:08:54 -04:00
Joey Hess	e152c322f8	refactor ref change watching Added change notification to P2P protocol. Switched to a TBChan so that a single long-running thread can be started, and serve perhaps intermittent requests for change notifications, without buffering all changes in memory. The P2P runner currently starts up a new thread each time it waits for a change, but that should allow later reusing a thread. Although each connection from a peer will still need a new watcher thread to run. The dependency on stm-chans is more or less free; some stuff in yesod uses it, so it was already indirectly pulled in when building with the webapp. This commit was sponsored by Francois Marier on Patreon.	2016-12-09 15:01:09 -04:00
Joey Hess	15be5c04a6	git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks. The attacker could just send a very large amount of data, with no \n and it would all be buffered in memory until the kernel killed git-annex or perhaps OOM killed some other more valuable process. This is a low impact security hole, only affecting communication between local git-annex and git-annex-shell on the remote system. (With either able to be the attacker). Only those with the right ssh key can do it. And, there are probably lots of ways to construct git repositories that make git use a lot of memory in various ways, which would have similar impact as this attack. The fix in P2P/IO.hs would have been higher impact, if it had made it to a released version, since it would have allowed DOSing the tor hidden service without needing to authenticate. (The LockContent and NotifyChanges instances may not be really exploitable; since the line is read and ignored, it probably gets read lazily and does not end up staying buffered in memory.)	2016-12-09 13:34:32 -04:00
Joey Hess	bdf2a31424	typo	2016-12-09 12:54:12 -04:00
Joey Hess	71e8cd408e	content removal is supposed to succeed if the content was already not present	2016-12-09 12:48:22 -04:00
Joey Hess	38516b2fca	update progress logs in remotedaemon send/receive	2016-12-08 19:56:02 -04:00
Joey Hess	0f4ee4f298	fix memory leak I'm unsure why this fixed it, but it did. Seems to suggest that the memory leak is not due to a bug in my code, but that ghc didn't manage to take full advantage of laziness, or was failing to gc something it could have.	2016-12-08 18:42:52 -04:00
Joey Hess	af41519126	convert P2P runners from Maybe to Either String So we get some useful error messages when things fail. This commit was sponsored by Peter Hogg on Patreon.	2016-12-08 15:47:49 -04:00
Joey Hess	c05f4eb631	fix laziness problem in git relaying The switch to hGetMetered subtly changed the laziness of how DATA was read, and broke git protocol relaying. Fix by sending received data to the git process's stdin immediately, which ensures that the lazy bytestring is all read from the peer before going on to process the next message from the peer.	2016-12-08 15:15:29 -04:00
Joey Hess	df67626cb7	fix build with old ghc	2016-12-08 13:58:03 -04:00
Joey Hess	0541f19bea	fix math error that caused resumes to always fail	2016-12-07 15:36:39 -04:00
Joey Hess	db79b69aa0	ReadWriteMode not AppendMode AppendMode does not allow seeking..	2016-12-07 15:24:28 -04:00
Joey Hess	99c36f318c	open file for append, not write, so resuming works WriteMode zeros any existing content, so the seek filled with zeros, and verification failed after download.	2016-12-07 15:06:07 -04:00
Joey Hess	b55399e3ac	offset meters when resuming	2016-12-07 14:52:10 -04:00
Joey Hess	ad5ef51040	more p2p progress meters Display progress meter on send and receive from remote. Added a new hGetMetered that can read an exact number of bytes (or less), updating a meter as it goes. This commit was sponsored by Andreas on Patreon.	2016-12-07 14:25:01 -04:00
Joey Hess	83ea1cec86	update progress meter when sending to p2p remote This commit was sponsored by Thom May on Patreon.	2016-12-07 13:37:35 -04:00
Joey Hess	bb5168e894	need to auth with the peer	2016-12-06 15:50:02 -04:00
Joey Hess	f744bd5391	refactor	2016-12-06 15:43:03 -04:00
Joey Hess	2bd2e0880c	added StoreContentTo This is needed in addition to StoreContent, because retrieveKeyFile can be used to retrieve to different destination files, not only the tmp file for a key. This commit was sponsored by Ole-Morten Duesund on Patreon.	2016-12-06 15:05:44 -04:00
Joey Hess	b29088b8dc	stub Remote.P2P Similar to GCrypt remotes, P2P remotes have an url, so Remote.Git has to separate them out and handle them, passing off to Remote.P2P. This commit was sponsored by Ignacio on Patreon.	2016-12-06 12:27:58 -04:00
Joey Hess	a8c868c2e1	plumb associated files through P2P protocol for updating transfer logs ReadContent can't update the log, since it reads lazily. This part of the P2P monad will need to be rethought. Associated files are heavily sanitized when received from a peer; they could be an exploit vector. This commit was sponsored by Jochen Bartl on Patreon.	2016-12-02 16:42:54 -04:00
Joey Hess	b16a1cee4b	plumb peer uuid through to runLocal This will allow updating transfer logs with the uuid.	2016-12-02 15:39:49 -04:00
Joey Hess	71ddb10699	initial implementation of P2P.Annex runner Untested, and it does not yet update transfer logs. Verifying transferred content is modeled on git-annex-shell recvkey. In a direct mode or annex.thin repository, content can change while it's being transferred. So, verification is always done, even if annex.verify would normally prevent it. Note that a WORM or URL key could change in a way the verification doesn't catch. That can happen in git-annex-shell recvkey too. We don't worry about it, because those key backends don't guarantee preservation of data. (Which is to say, I worried about it, and then convinced myself again it was ok.)	2016-12-02 14:54:33 -04:00
Joey Hess	c29f2e262a	catch non-IO exceptions too	2016-12-02 14:16:50 -04:00
Joey Hess	881274d021	make remote-daemon able to send and receive objects over tor Each worker thread needs to run in the Annex monad, but the remote-daemon's liftAnnex can only run 1 action at a time. Used Annex.Concurrent to deal with that. P2P.Annex is incomplete as of yet.	2016-12-02 13:52:43 -04:00
Joey Hess	7b7afbbedc	improve Local monad	2016-12-02 13:47:42 -04:00
Joey Hess	15dc63d47f	make sure that the specified number of bytes of DATA are always sent It's possible, in direct or thin mode, that an object file gets truncated or appended to as it's being sent. This would break the protocol badly, so make sure never to send too many bytes, and to close the protocol connection if too few bytes are available.	2016-12-02 13:45:45 -04:00
Joey Hess	3dce6a080e	cleanups	2016-12-01 00:42:01 -04:00
Joey Hess	94dad1e979	more flexible types for Proto runners This will allow a runner in the Annex monad.	2016-12-01 00:27:07 -04:00
Joey Hess	00f48ac407	better comments	2016-11-30 23:54:00 -04:00
Joey Hess	e714e0f67a	actually check p2p authtokens for tor connections This commit was sponsored by Ethan Aubin.	2016-11-30 16:46:02 -04:00
Joey Hess	b88e44ea9a	use P2P auth for git-remote-tor-annex This changes the environment variable name to the more generic GIT_ANNEX_P2P_AUTHTOKEN. This commit was sponsored by andrea rota.	2016-11-30 15:26:55 -04:00
Joey Hess	3ab12ba923	implement p2p --link This commit was sponsored by Riku Voipio.	2016-11-30 15:16:25 -04:00
Joey Hess	bfc8305814	implement p2p command	2016-11-30 14:35:24 -04:00
Joey Hess	f86a7f673c	comments	2016-11-29 17:33:49 -04:00
Joey Hess	38425fdc39	finish git-annex enable-tor Make it stash the address away for git-annex p2p to use later, rather than outputting it. And, look up the UUID itself.	2016-11-29 17:30:27 -04:00
Joey Hess	3ed8895a09	fix build	2016-11-24 16:36:16 -04:00
Joey Hess	158ef45d76	add P2P.Auth	2016-11-22 14:37:50 -04:00
Joey Hess	b08799893f	reorg	2016-11-22 14:37:09 -04:00
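
Commit `15dc63d47f` above describes a framing rule for DATA: once a byte count has been announced, never send more than that many bytes, and close the connection if fewer are available. Below is a minimal sketch of that rule, written in Python for illustration rather than in the project's own Haskell. It is not git-annex's implementation; the names `send_exact` and `chunk_size` are hypothetical.

```python
import io


def send_exact(src, dst, length, chunk_size=65536):
    """Copy exactly `length` bytes from `src` to `dst` (hypothetical helper).

    Never writes more than `length` bytes, even if the source has grown
    since its size was announced. Raises EOFError if the source ends
    early; the caller is expected to close the connection in that case.
    """
    remaining = length
    while remaining > 0:
        # Cap each read so we can never overshoot the announced length.
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError(f"source truncated: {remaining} bytes missing")
        dst.write(chunk)
        remaining -= len(chunk)


if __name__ == "__main__":
    out = io.BytesIO()
    # The source holds 8 bytes, but only 5 were announced.
    send_exact(io.BytesIO(b"abcdefgh"), out, 5)
    print(out.getvalue())
```

Raising an exception on a short read, rather than padding the output, leaves the decision to hang up with the caller, which matches the commit's description that the connection is closed when too few bytes are available.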

118 commits