git-annex

Author	SHA1	Message	Date
Joey Hess	c34152777b	Use http-conduit for url downloads by default, annex.web-options enables curl * For url downloads, git-annex now defaults to using a http library, rather than wget or curl. But, if annex.web-options is set, it will use curl. To use the .netrc file, run: git config annex.web-options --netrc * git-annex no longer uses wget (and wget is no longer shipped with git-annex builds). Note that curl is always run in silent mode, since the new API for download has a MeterUpdate and doesn't make way for curl progress output. It might be worth writing a parser for curl's progress output to update the meter when using it, but I didn't bother with this edge case for now. This commit was supported by the NSF-funded DataLad project.	2018-04-06 17:36:20 -04:00
Joey Hess	9b98d3f630	better HTTP connection reuse Enable HTTP connection reuse across multiple files, when git-annex uses http-conduit. Before, a new Manager was created each time Utility.Url used it. Now, a single Manager gets created the first time, so connections are reused. Doesn't help when external programs are used for url download, but does speed up addurl --fast, fsck --from web, etc. Testing fsck --fast --from web with 3 files, over high-latency satellite internet, it sped up from 19.37s to 14.96s. This commit was supported by the NSF-funded DataLad project.	2018-04-04 15:39:40 -04:00
Joey Hess	2ec07bc29f	Avoid running annex.http-headers-command more than once.	2018-04-04 15:15:08 -04:00
Joey Hess	46d4316954	implement annex.retry et al Added annex.retry, annex.retry-delay, and per-remote versions to configure transfer retries. This commit was supported by the NSF-funded DataLad project.	2018-03-29 13:04:07 -04:00
Joey Hess	31e1adc005	deal with unlocked files P2P protocol version 1 adds VALID\|INVALID after DATA; INVALID means the file was detected to change content while it was being sent and so we may not have received the valid content of the file. Added new MustVerify constructor for Verification, which forces verification even when annex.verify=false etc. This is used when INVALID and in protocol version 0. As well as changing git-annex-shell p2pstdio, this makes git-annex tor remotes always force verification, since they don't yet use protocol version 1. Previously, annex.verify=false could skip verification when using tor remotes, and let bad data into the repository. This commit was sponsored by Jack Hill on Patreon.	2018-03-13 14:27:14 -04:00
Joey Hess	b96b845ffd	fix nested progress meters when using git-annex-shell fallback Caused an ugly blank line when the first progress meter was not used, but also it may have confused -J display.	2018-03-12 19:20:10 -04:00
Joey Hess	1c2c8995ac	hide rsync progress output when metered but not in other uses of rsync	2018-03-12 18:36:07 -04:00
Joey Hess	cb05ef06bf	fix lost metering for fallback rsyncs `08814327ff` accidentally got rid of it, when it removed commandMetered.	2018-03-12 18:22:48 -04:00
Joey Hess	c3df5d1f10	avoid double-connect to unreachable ssh remote When git-annex-shell p2pstdio fails with 255, it's because the ssh server is not reachable. Avoid running the fallback action in this case, since it would just try a second time to connect, and presumably fail. Note that the closed P2PSshConnection will not be stored in the pool, so the next request tries again to connect. This is just the right behavior; when the remote becomes reachable again, the same git-annex process will start using it. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-03-12 16:50:21 -04:00
Joey Hess	d7f54671bf	refactoring	2018-03-09 13:48:10 -04:00
Joey Hess	936ab43932	use P2P for locking keys The P2P protocol is now fully used for git-annex-shell. This commit was sponsored by Ewen McNeill on Patreon.	2018-03-09 13:42:55 -04:00
Joey Hess	08814327ff	use P2P protocol for checkpresent, retrieve, and store Note that, due to not using rsync to transfer files to ssh remotes any longer, permissions and other file metadata of annexed files will no longer be preserved when copying them to ssh remotes. Other remotes never supported preserving that information, so this is not considered a regression. Added NEWS item about this. Another significant side effect of this is that, even when rsync is run to retrieve a file, its progress display will no longer be shown, and instead the native git-annex progress display will appear. It would be possible to use the rsync progress display when rsync is used (old git-annex-shell and also retrieval from a local repository), but it would have complicated the code unnecessarily, and been inconsistent behavior. (I'd been thinking for a while about eliminating the rsync progress display, since it's got some annoying verbosities, including display of the key and the "(xfr#1, to-chk=0/1)" bit and was already somewhat inconsistent.) retrieveKeyFileCheap still uses rsync, since that ensures that it gets the actual file content from the remote. Using the P2P protocol would use the local content, as long as the local and remote size are the same. This commit was sponsored by John Pellman on Patreon.	2018-03-09 13:25:16 -04:00
Joey Hess	5bc0ab3f31	going AGPL Remote/Git.hs now contains AGPL licensed code, thus the license of git-annex as a whole is AGPL. This was already the case when git-annex was built with the webapp enabled. The AGPL license will apply to all code added to Remote/Git.hs in the future, which is going to include support for using `git-annex-shell p2pstdio`.	2018-03-09 01:03:46 -04:00
Joey Hess	6a59bc4845	use P2P protocol for drop Not yet used for everything else, but this is enough to verify that it works, and do some benchmarking. Some bugfixes included, which got it working. Also fallback to old actions has been verified to work correctly. Benchmarked dropping one thousand files from a ssh remote on localhost. Using the old git-annex 40.867 seconds. With the P2P protocol 9.905 seconds! This commit was sponsored by Jochen Bartl on Patreon.	2018-03-08 16:56:17 -04:00
Joey Hess	16af259209	refactor p2p remote action code Make a Remote.Helper.P2P using code that was in Remote.P2P, converted to use generic protocol runner actions. This will allow it to be reused in Remote.Git. This commit was sponsored by mo on Patreon.	2018-03-08 16:11:00 -04:00
Joey Hess	c036a380b2	p2p ssh connection pools Much like Remote.P2P, there's a pool of connections to a peer, in order to support concurrent operations. Deals with old git-annex-shell on the remote that does not support p2pstdio, by only trying once to use it, and remembering if it's not supported. Made p2pstdio send an AUTH_SUCCESS with its uuid, which serves the dual purposes of something to detect to see that the connection is working, and a way to verify that it's connected to the right uuid. (There's a redundant uuid check since the uuid field is sent by git_annex_shell, but I anticipate that being removed later when the legacy git-annex-shell stuff gets removed.) Not entirely happy with Remote.Git.runSsh's behavior when the proto action fails. Running the fallback will work ok, but what will we do when the fallbacks later get removed? It might be better to try to reconnect, in case the connection got closed. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-03-08 15:11:31 -04:00
Joey Hess	f4103744c3	make sure that lockContentShared is always paired with an inAnnex check lockContentShared had a screwy caveat that it didn't verify that the content was present when locking it, but in the most common case, eg indirect mode, it failed to lock when the content is not present. That led to a few callers forgetting to check inAnnex when using it, but the potential data loss was unlikely to be noticed because it only affected direct mode I think. Fix data loss bug when the local repository uses direct mode, and a locally modified file is dropped from a remote repository. The bug caused the modified file to be counted as a copy of the original file. (This is not a severe bug because in such a situation, dropping from the remote and then modifying the file is allowed and has the same end result.) And, in content locking over tor, when the remote repository is in direct mode, it neglected to check that the content was actually present when locking it. This could cause git annex drop to remove the only copy of a file when it thought the tor remote had a copy. So, make lockContentShared do its own inAnnex check. This could perhaps be optimised for direct mode, to avoid the check then, since locking the content necessarily verifies it exists there, but I have not bothered with that. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-03-07 14:23:52 -04:00
Joey Hess	a28c541e23	add remote.<name>.annex-checkuuid Added remote.<name>.annex-checkuuid config, which can be set to false to disable the default checking of the uuid of remotes that point to directories. This can be useful to avoid unnecessary drive spin-ups and automounting. Note that the UUID check is still done before writing to the repository, to avoid writing to the wrong repository if it got relocated. Check is also done before checkPresent to avoid getting confused about what is in which repo. This is effectively the same as the use of git-annex-shell with a uuid to check that the remote repository is the expected one. Did not bother with the check for retrieveKeyFile because it doesn't matter if the wrong repo is used then. This commit was sponsored by Trenton Cronholm on Patreon.	2018-01-10 14:21:18 -04:00
Joey Hess	2b66492d6e	Improve startup time for commands that do not operate on remotes And for tab completion, by not unnecessarily statting paths to remotes, which used to cause eg, spin-up of removable drives. Got rid of the remotes member of Git.Repo. This was a bit painful. Remote.Git modifies the list of remotes as it reads their configs, so still need a persistent list of remotes. So, put it in as Annex.gitremotes. It's only populated by getGitRemotes, so commands like examinekey that don't care about remotes won't do so. This commit was sponsored by Jake Vosloo on Patreon.	2018-01-09 16:22:07 -04:00
Joey Hess	f5edb16729	Display progress meter when uploading a key without size information Getting the size by statting the content file. This commit was supported by the NSF-funded DataLad project.	2017-11-14 16:40:49 -04:00
Joey Hess	5c32196a37	fix process and FD leak Fix process and file descriptor leak that was exposed when git-annex was built with ghc 8.2.1. Apparently ghc has changed its behavior of GC of open file handles that are pipes to running processes. That broke git-annex test on OSX due to running out of FDs. Audited for all uses of Annex.new and made stopCoProcesses be called once it's done with the state. Fixed several places that might have leaked in other situations than running the test suite. This commit was sponsored by Ewen McNeill.	2017-09-29 22:36:08 -04:00
Joey Hess	16eb2f976c	prevent exporttree=yes on remotes that don't support exports Don't allow "exporttree=yes" to be set when the special remote does not support exports. That would be confusing since the user would set up a special remote for exports, but `git annex export` to it would later fail. This commit was supported by the NSF-funded DataLad project.	2017-09-07 13:48:44 -04:00
Joey Hess	28e2cad849	implement exporttree=yes configuration * Only export to remotes that were initialized to support it. * Prevent storing key/value on export remotes. * Prevent enabling exporttree=yes and encryption in the same remote. SetupStage Enable was changed to take the old RemoteConfig. This allowed only setting exporttree when initially setting up a remote, and not configuring it later after stuff might already be stored in the remote. Went with =yes rather than =true for consistency with other parts of git-annex. Changed docs accordingly. This commit was supported by the NSF-funded DataLad project.	2017-09-04 13:09:38 -04:00
Joey Hess	a4328b49d2	refactor ExportActions This will allow disabling exports for remotes that are not configured to allow them. Also, exportSupported will be useful for the external special remote to probe. This commit was supported by the NSF-funded DataLad project	2017-09-01 13:05:09 -04:00
Joey Hess	e55e445a36	add API for exporting Implemented so far for the directory special remote. Several remotes don't make sense to export to. Regular Git remotes, obviously, do not. Bup remotes almost certainly do not, since bup would need to be used to extract the export; same story for Ddar. Web and Bittorrent are download-only. GCrypt is always encrypted so exporting to it would be pointless. There's probably no point complicating the Hook remotes with exporting at this point. External, S3, Glacier, WebDAV, Rsync, and possibly Tahoe should be modified to support export. Thought about trying to reuse the storeKey/retrieveKeyFile/removeKey interface, rather than adding a new interface. But, it seemed better to keep it separate, to avoid a complicated interface that sometimes encrypts/chunks key/value storage and sometimes uses non-key/value storage. Any common parts can be factored out. Note that storeExport is not atomic. doc/design/exporting_trees_to_special_remotes.mdwn has some things in the "resuming exports" section that bear on this decision. Basically, I don't think, at this time, that an atomic storeExport would help with resuming, because exports are not key/value storage, and we can't be sure that a partially uploaded file is the same content we're currently trying to export. Also, note that ExportLocation will always use unix path separators. This is important, because users may export from a mix of windows and unix, and it avoids complicating the API with path conversions, and ensures that in such a mix, they always use the same locations for exports. This commit was sponsored by Bruno BEAUFILS on Patreon.	2017-08-29 13:00:41 -04:00
Joey Hess	d39c120afa	add annex-ignore-command and annex-sync-command configs Added remote configuration settings annex-ignore-command and annex-sync-command, which are dynamic equivalents of the annex-ignore and annex-sync configurations. For this I needed a new DynamicConfig infrastructure. Its implementation should be as fast as before when there is no dynamic config, and it caches so shell commands are only run once. Note that annex-ignore-command exits nonzero when the remote should be ignored. While that may seem backwards, it allows using the same command for it as for annex-sync-command when you want to disable both. This commit was sponsored by Trenton Cronholm on Patreon.	2017-08-17 13:54:14 -04:00
Joey Hess	db1600b2de	de-Maybe remoteGitConfig It's always set, so does not need to be a Maybe.	2017-05-11 16:05:01 -04:00
Joey Hess	3c8eb59860	When a http remote does not expose an annex.uuid config, only warn about it once, not every time git-annex is run. Same behavior as for a ssh remote.	2017-03-29 12:43:47 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	e6857e75a6	sync hack to make updateInstead work on eg FAT sync: When syncing with a local repository located on a crippled filesystem, run the post-receive hook there, since it wouldn't get run otherwise. This makes pushing to repos on FAT-formatted removable drives update them when receive.denyCurrentBranch=updateInstead. Made Remote.Git export onLocal, which was cleaned up to not have so many caveats about its use. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-02-17 15:21:52 -04:00
Joey Hess	00464fbed7	have onLocal stop any coprocesses, not only cat-file I have not seen any other coprocesses being started, but let's avoid problems if any do for whatever reason.	2017-02-17 14:30:18 -04:00
Joey Hess	f07af03018	Run ssh with -n whenever input is not being piped into it ... to avoid it consuming stdin that it shouldn't. This fixes git-annex-checkpresentkey --batch remote, which didn't output results for all keys passed into it. Other git-annex commands that communicate with a remote over ssh may also have been consuming stdin that they shouldn't have, which could have impacted using them in eg, shell scripts. For example, a shell script reading files from stdin and passing them to git annex drop would be impacted by this bug, whenever git annex drop ran git-annex-shell checkpresent, it would consume part/all of the stdin that the shell script was supposed to consume. Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which is used throughout git-annex to run ssh (in order for ssh connection caching to work). Every call site was checked to see if it used CreatePipe for stdin, and if not was marked NoConsumeStdin.	2017-02-15 15:08:46 -04:00
Joey Hess	5c804cf42e	add SetupStage parameter to RemoteType.setup Most remotes have an idempotent setup that can be reused for enableremote, but in a few cases, it needs to tell which, and whether a UUID was provided to setup was used. This is groundwork for making initremote be able to provide a UUID. It should not change any behavior. Note that it would be nice to make the UUID always be provided to setup, and make setup not need to generate and return a UUID. What prevented this simplification is Remote.Git.gitSetup, which needs to reuse the UUID of the git remote when setting it up, and so has to return that UUID. This commit was sponsored by Thom May on Patreon.	2017-02-07 14:55:58 -04:00
Joey Hess	15be5c04a6	git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks. The attacker could just send a very large amount of data, with no \n and it would all be buffered in memory until the kernel killed git-annex or perhaps OOM killed some other more valuable process. This is a low impact security hole, only affecting communication between local git-annex and git-annex-shell on the remote system. (With either able to be the attacker). Only those with the right ssh key can do it. And, there are probably lots of ways to construct git repositories that make git use a lot of memory in various ways, which would have similar impact as this attack. The fix in P2P/IO.hs would have been higher impact, if it had made it to a released version, since it would have allowed DOSing the tor hidden service without needing to authenticate. (The LockContent and NotifyChanges instances may not be really exploitable; since the line is read and ignored, it probably gets read lazily and does not end up staying buffered in memory.)	2016-12-09 13:34:32 -04:00
Joey Hess	58f5d41cac	fix	2016-12-09 12:56:38 -04:00
Joey Hess	0f3a3ff1e5	make clear that log is only updated after successful removal This does not change behavior, because an exception is thrown on unsuccessful removal. But is clearer.	2016-12-09 12:54:18 -04:00
Joey Hess	b29088b8dc	stub Remote.P2P Similar to GCrypt remotes, P2P remotes have an url, so Remote.Git has to separate them out and handle them, passing off to Remote.P2P. This commit was sponsored by Ignacio on Patreon.	2016-12-06 12:27:58 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	8dcf79694d	enable forwardRetry for command-line transfers If a transfer fails for some reason, but some data managed to be sent, the transfer will be retried. (The assistant already did this.) Possible impacts: * More ssh prompts if ssh needs to prompt for a password to connect to a host, or is prompting about some other problem like a ssh key mismatch. * More data transfer due to retrying, especially when a remote does not support resuming a transfer. In the worst case, a lot of data will be transferred but it fails before the end, and then all that data gets transferred again plus one byte more; repeat until it manages to get the whole file.	2016-10-26 15:38:27 -04:00
Joey Hess	312ef4dfae	make --json-progress update meter when getting from git remote with rsync	2016-09-09 16:05:45 -04:00
Joey Hess	10ddf2c3bd	remove TransferObserver unused after last commit	2016-08-03 13:46:20 -04:00
Joey Hess	f4db181d9b	fix warning	2016-05-27 11:15:52 -04:00
Joey Hess	1b3bde0625	enableremote: Remove annex-ignore configuration from a remote.	2016-05-24 15:58:27 -04:00
Joey Hess	91df4c6b53	Pass the various gnupg-options configs to gpg in several cases where they were not before. Removed the instance LensGpgEncParams RemoteConfig because it encouraged code that does not take the RemoteGitConfig into account. RemoteType's setup was changed to take a RemoteGitConfig, although the only place that is able to provide a non-empty one is enableremote, when it's changing an existing remote. This led to several follow-on changes, and got RemoteGitConfig plumbed through.	2016-05-23 17:03:20 -04:00
Joey Hess	bfb4095c13	Improve behavior when a just added http remote is not available during uuid probe. Do not mark it as annex-ignore, so it will be tried again later.	2016-05-03 12:53:42 -04:00
Joey Hess	850d0da699	Fix duplicate progress meter display when downloading from a git remote over http with -J.	2016-04-19 13:10:56 -04:00
Joey Hess	2d7e46ea98	fix drop hang reported by musicmatze Fix hang when dropping content needs to lock the content on a ssh remote, which occurred when the remote has git-annex version 5.20151019 or newer. Analysis: `race` runs 2 threads at once, and the hGetLine finishes first. So, it tries to cancel the waitForProcess, but unfortunately that is making a foreign call and so cannot be canceled. The remote git-annex-shell is waiting for a line on stdin before it will exit. Deadlock. This only occurred sometimes; I reproduced it going from darkstar to elephant, but not from darkstar to darkstar. Not sure how that fits into the above analysis -- perhaps a race condition is also involved? Fixed by not using `race`; now the hGetLine will fail with an exception if the remote git-annex-shell exits without any output.	2016-04-18 14:04:50 -04:00
Joey Hess	cf06dac2b8	hard links on windows * annex.thin and annex.hardlink are now supported on Windows. * unannex --fast now makes hard links on Windows.	2016-04-08 15:25:32 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	ecd0684bfc	avoid hard linking object from other repository when annex.thin is set This is simpler and less expensive than checking if the src file has a link count >= 2, and also is unlocked.	2016-01-13 14:19:31 -04:00
Joey Hess	2513c1dfd0	remove redundant isDirect check Already checked in wantHardLink	2016-01-13 14:13:37 -04:00
Joey Hess	d0da52f1b1	typo	2015-12-26 15:11:32 -04:00
Joey Hess	1b55af4c3c	deal with unlocked files when calling rsyncParamsRemote In copyFromRemote, it used to check isDirect, but that was not needed; the remote is sending the file, so it doesn't matter if the local, receiving repository is in direct mode or not. And, since the content is not present, yet, it's certainly not unlocked. Note that, the remote may indeed be sending an unlocked file, but sendkey uses sendAnnex, which will detect if the file is modified before or during transfer, and will exit nonzero, aborting the upload. So, the receiver doesn't need any checks. In copyToRemote, it forces recvkey to verify content whenever it's being sent from a v6 repository. recvkey is almost always going to verify content anyway, unless annex.verify is not set. So, this doesn't make it any more expensive, except for in that unusual configuration. The alternative would be to change the recvkey interface, so that the sender checks afterwards if what it was sending changed, and the receiver then throws out the bad transfer. That would be less expensive for the receiver, as it would not need to do a checksum verification. But, it would mean another network round trip, and since rsync closes the connection, it would need to open another ssh connection to do this. Even with connection caching, that would add latency to uploads. It would also complicate the interface, especially because an older git-annex-shell would not have the new interface available. For these reasons, I prefer punting on that at this time, and instead someone might set annex.verify=false and be unhappy that it still verifies. (One other gotcha not dealt with is that a v5 repo could be upgraded to v6 while an upload is in progress, and a file unlocked and modified.) (Also, I double-checked Remote.GCrypt's calls to rsyncParamsRemote, and they're fine. When a file is being uploaded to gcrypt, or any other special repository, it is mediated by sendAnnex, so changes will be detected at that level and the special remote implementation doesn't need to worry about them.)	2015-12-26 14:16:27 -04:00
Joey Hess	2b8f6b8b2f	check inode cache in prepSendAnnex This does mean one query of the database every time an object is sent. May impact performance.	2015-12-10 14:50:52 -04:00
Joey Hess	e97fce35a6	Display progress meter in -J mode when downloading from the web. Including in addurl, and get --from web, but also in S3 and External special remotes when a web url is known for content in those remotes.	2015-11-16 21:00:54 -04:00
Joey Hess	1244eb3770	refactor	2015-11-16 20:27:01 -04:00
Joey Hess	7943442dff	Display progress meter in -J mode when copying from a local git repo, to a local git repo, and from a remote git repo. Had everything available, just didn't combine the progress meter with the other places progress is sent to update it. (And to a remote repo already did show progress.) Most special remotes should already display progress meters with -J, same as without it. One exception to this is the web, since it relies on wget/curl progress display without -J. Still todo..	2015-11-16 19:32:30 -04:00
Joey Hess	4fd03ccd7b	concurrent-output, first pass Output without -Jn should be unchanged from before. With -Jn, concurrent-output is used for messages, but regions are not used yet, so it's a mess.	2015-11-04 13:45:34 -04:00
Joey Hess	806819be57	Avoid displaying network transport warning when a ssh remote does not yet have an annex.uuid set. Instead, only display transport error if the configlist output doesn't include an annex.uuid line, even an empty one. A recent change made git-annex init try to get all the remote uuids, and so the transport error would be displayed by it. It was also displayed when eg, copying files to a remote that had no uuid yet.	2015-10-15 15:36:54 -04:00
Joey Hess	b0e5c09408	fix various build warnings, mostly on Windows And some when S3 is disabled	2015-10-13 13:24:44 -04:00
Joey Hess	2154b7a38f	add inAnnex check to local lockKey	2015-10-09 18:00:37 -04:00
Joey Hess	6145f905e0	improve display when lockcontent fails /dev/null stderr; ssh is still able to display a password prompt despite this Show some messages so the user knows it's locking a remote, and knows if that locking failed.	2015-10-09 17:31:02 -04:00
Joey Hess	3b89d5a20c	implement lockContent for ssh remotes	2015-10-09 16:55:41 -04:00
Joey Hess	6a72045707	fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal	2015-10-09 15:48:02 -04:00
Joey Hess	865dd11dbf	fix lockKey to run callback in original Annex monad, not local remote's	2015-10-09 13:35:28 -04:00
Joey Hess	4c6095b6f5	content locking during drop working for local git remotes Only ssh remotes lack locking now	2015-10-09 13:12:58 -04:00
Joey Hess	b1abe59193	add removeKey action to Remote Not implemented for any remotes yet; probably the git remote is the only one that will ever implement it.	2015-10-08 15:01:38 -04:00
Joey Hess	4d50958ed7	add lockContentShared Also, rename lockContent to lockContentExclusive inAnnexSafe should perhaps be eliminated, and instead use `lockContentShared inAnnex`. However, I'm waiting on that, as there are only 2 call sites for inAnnexSafe and it's fiddly.	2015-10-08 14:29:35 -04:00
Joey Hess	2def1d0a23	other 80% of avoiding verification when hard linking to objects in shared repo In `c6632ee5c8`, it actually only handled uploading objects to a shared repository. To avoid verification when downloading objects from a shared repository, was a lot harder. On the plus side, if the process of downloading a file from a remote is able to verify its content on the side, the remote can indicate this now, and avoid the extra post-download verification. As of yet, I don't have any remotes (except Git) using this ability. Some more work would be needed to support it in special remotes. It would make sense for tahoe to implicitly verify things downloaded from it; as long as you trust your tahoe server (which typically runs locally), there's cryptographic integrity. OTOH, despite bup being based on shas, a bup repo under an attacker's control could have the git ref used for an object changed, and so a bup repo shouldn't implicitly verify. Indeed, tahoe seems unique in being trustworthy enough to implicitly verify.	2015-10-02 14:35:12 -04:00
Joey Hess	c6632ee5c8	avoid verification when hard linking to objects in shared repository Such a repository is implicitly trusted, so there's no point.	2015-10-02 12:36:03 -04:00
Joey Hess	2fb3722ce9	Do verification of checksums of annex objects downloaded from remotes. * When annex objects are received into git repositories, their checksums are verified then too. * To get the old, faster, behavior of not verifying checksums, set annex.verify=false, or remote.<name>.annex-verify=false. * setkey, rekey: These commands also now verify that the provided file matches the key, unless annex.verify=false. * reinject: Already verified content; this can now be disabled by setting annex.verify=false. recvkey and reinject already did verification, so removed now duplicate code from them. fsck still does its own verification, which is ok since it does not use getViaTmp, so verification doesn't happen twice when using fsck --from.	2015-10-01 15:56:39 -04:00
Joey Hess	807ba6a903	refactor	2015-10-01 14:07:06 -04:00
Joey Hess	ffa8221517	annex.hardlink extended to also try to use hard links when copying from the repository to a remote. Also, it used to only check that one of the repos was not in direct mode; now when either repo is direct mode, annex.hardlink won't have an effect.	2015-09-14 12:13:38 -04:00
Joey Hess	127c3db162	add some debugs to get timings Note that I had one in Annex.Action.startup too, but it resulted in a weird message printed by ssh, "channel 2: bad ext data". I don't know why, but it only happened when transferinfo was run, so I wonder if `983a95f021` introduced a fragility somehow.	2015-08-13 16:13:16 -04:00
Joey Hess	983a95f021	Sped up downloads of files from ssh remotes, reducing the non-data-transfer overhead 6x.	2015-08-13 14:20:28 -04:00
Joey Hess	f99ae3d713	remove debug print	2015-08-13 13:18:47 -04:00
Joey Hess	c5b8484c2e	Simplify setup process for a ssh remote. Now it suffices to run git remote add, followed by git-annex sync. Now the remote is automatically initialized for use by git-annex, where before the git-annex branch had to manually be pushed before using git-annex sync. Note that this involved changes to git-annex-shell, so if the remote is using an old version, the manual push is still needed. Implementation required git-annex-shell be changed, so configlist can autoinit a repository even when no git-annex branch has been pushed yet. Unfortunate because we'll have to wait for it to get deployed to servers before being able to rely on this change in the documentation. Did consider making git-annex sync push the git-annex branch to repos that didn't have a uuid, but this seemed difficult to do without complicating it in messy ways. It would be cleaner to split a command out from configlist to handle the initialization. But this is difficult without sacrificing backwards compatibility, for users of old git-annex versions which would not use the new command.	2015-08-05 13:49:58 -04:00
Joey Hess	61ccf95004	Avoid accumulating transfer failure log files unless the assistant is being used. Only the assistant uses these, and only the assistant cleans them up, so make only git annex transferkeys write them, There is one behavior change from this. If glacier is being used, and a manual git annex get --from glacier fails because the file isn't available yet, the assistant will no longer later see that failed transfer file and retry the get. Hope no-one depended on that old behavior.	2015-05-12 15:53:38 -04:00
Joey Hess	e27b97d364	Merge branch 'master' into concurrentprogress Conflicts: Command/Fsck.hs Messages.hs Remote/Directory.hs Remote/Git.hs Remote/Helper/Special.hs Types/Remote.hs debian/changelog git-annex.cabal	2015-05-12 13:23:22 -04:00
Joey Hess	addc82dab7	removed all uses of undefined from code base It's a code smell, can lead to hard to diagnose error messages.	2015-04-19 00:38:29 -04:00
Joey Hess	0def1f0b53	Fix fsck --from a git remote in a local directory, and from a directory special remote. This was a reversion caused by the relative path changes in 5.20150113. The directory special remote was not affected in its normal configuration, since annex-directory is an absolute path normally. But it could fail when a relative path was used. The git remote was affected even when an absolute path to it was used in .git/config, since git-annex now converts all such paths to relative.	2015-04-18 13:36:12 -04:00
Joey Hess	a2902cdaaf	add filename to progress bar, and display ok/failed at end This needed plumbing an AssociatedFile through retrieveKeyFileCheap.	2015-04-14 16:35:10 -04:00
Joey Hess	dc4de7faf7	add missing progress bar	2015-04-14 16:00:20 -04:00
Joey Hess	75b6b5cbc7	only display built-in meters in parallel mode	2015-04-10 15:20:23 -04:00
Joey Hess	f8e700ed06	use built-in progress meters for git when in parallel mode	2015-04-10 15:15:21 -04:00
Joey Hess	e87f3b40eb	propagate outer output state into inner state when running onLocal Otherwise, progress displays would not be suppressed here when running with --quiet. Interesting wrinkle!	2015-04-03 20:08:38 -04:00
Joey Hess	450ee53ab6	When re-execing git-annex, use current program location, rather than ~/.config/git-annex/program, when possible. Most of the time, there will be no discrepancy between programPath and readProgramFile. But, the programFile might have been written by an old version of git-annex that is still installed, while a newer one is currently running. In this case, we want to run the same one that's currently running. This is especially important for things like the GIT_SSH=git-annex used for ssh connection caching. The only code that still uses readProgramFile directly is the upgrade code, which needs to know where the standalone git-annex was installed, in order to upgrade it.	2015-02-28 17:23:13 -04:00
Joey Hess	a22eaaae27	comment	2015-02-09 14:16:42 -04:00
Joey Hess	009bd050c1	implement annex.tune.objecthashlower Split out Annex.DirHashes which never really belonged in Locations.	2015-01-28 16:52:08 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	4f657aa14e	add getFileSize, which can get the real size of a large file on Windows Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows. Note that the InodeCache code only needs to compare a file size, so it doesn't matter if the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize. This commit was sponsored by Christian Dietrich.	2015-01-20 17:09:24 -04:00
Joey Hess	534c29deae	implemented old Richih wishlist about remote/uuid info * info: Can now display info about a given uuid. * Added to remote/uuid info: Count of the number of keys present on the remote, and their size. This is rather expensive to calculate, so comes last and --fast will disable it. * Git remote info now includes the date of the last sync with the remote.	2015-01-13 18:13:14 -04:00
Joey Hess	3bab5dfb1d	revert parentDir change Reverts `965e106f24` Unfortunately, this caused breakage on Windows, and possibly elsewhere, because parentDir and takeDirectory do not behave the same when there is a trailing directory separator.	2015-01-09 13:11:56 -04:00
Joey Hess	965e106f24	made parentDir return a Maybe FilePath; removed most uses of it parentDir is less safe than takeDirectory, especially when working with relative FilePaths. It's really only useful in loops that want to terminate at / This commit was sponsored by Audric SCHILTKNECHT.	2015-01-06 18:55:56 -04:00
Joey Hess	c9a3e80d32	fixed all remaining build warnings on Windows	2014-12-29 17:30:20 -04:00
Joey Hess	2cd84fcc8b	Expand checkurl to support recommended filename, and multi-file-urls This commit was sponsored by an anonymous bitcoiner.	2014-12-11 15:33:42 -04:00
Joey Hess	30bf112185	Urls can now be claimed by remotes. This will allow creating, for example, a external special remote that handles magnet: and *.torrent urls.	2014-12-08 19:15:07 -04:00
Joey Hess	cb6e16947d	add stub claimUrl	2014-12-08 13:40:15 -04:00
Joey Hess	a0297915c1	add per-remote-type info Now `git annex info $remote` shows info specific to the type of the remote, for example, it shows the rsync url. Remote types that support encryption or chunking also include that in their info. This commit was sponsored by Ævar Arnfjörð Bjarmason.	2014-10-21 14:36:09 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	b874f84086	New annex.hardlink setting. Closes: #758593 * New annex.hardlink setting. Closes: #758593 * init: Automatically detect when a repository was cloned with --shared, and set annex.hardlink=true, as well as marking the repository as untrusted. Had to reorganize Logs.Trust a bit to avoid a cycle between it and Annex.Init.	2014-09-05 13:44:09 -04:00
Joey Hess	6eb5c3f479	Do not preserve permissions and acls when copying files from one local git repository to another. Timestamps are still preserved as long as cp --preserve=timestamps is supported. This avoids cp -a overriding the default mode acls that the user might have set in a git repository. With GNU cp, this behavior change should not be a breaking change, because git-annex also uses rsync sometimes in the same situation, and has only ever preserved timestamps when using rsync. Systems without GNU cp will no longer use cp -a, but instead just cp. So, timestamps will no longer be preserved. Preserving timestamps when copying between repos is not guaranteed anyway. Closes: #729757	2014-08-26 17:10:25 -07:00
Joey Hess	aebcc395ff	use types to enforce that removeAnnex can only be called inside lockContent This fixed one bug where it needed to be and wasn't (in Assistant.Unused). And also found one place where lockContent was used unnecessarily (by drop --from remote). A few other places like uninit probably don't really need to lockContent, but it doesn't hurt to call it anyway. This commit was sponsored by David Wagner.	2014-08-20 20:13:47 -04:00
Joey Hess	96dc423e39	When accessing a local remote, shut down git-cat-file processes afterwards, to ensure that remotes on removable media can be unmounted. Closes: #758630 This does mean that eg, copying multiple files to a local remote will become slightly slower, since it now restarts git-cat-file after each copy. Should not be significant slowdown. The reason git-cat-file is run on the remote at all is to update its location log. In order to add an item to it, it needs to get the current content of the log. Finding a way to avoid needing to do that would be a good path to avoiding this slowdown if it does become a problem somehow. This commit was sponsored by Evan Deaubl.	2014-08-20 12:07:57 -04:00
Joey Hess	6adbd50cd9	testremote: Add testing of behavior when remote is not available Added a mkUnavailable method, which a Remote can use to generate a version of itself that is not available. Implemented for several, but not yet all remotes. This allows testing that checkPresent properly throws an exception when it cannot check if a key is present or not. It also allows testing that the other methods don't throw exceptions in these circumstances. This immediately found several bugs, which this commit also fixes! * git remotes using ssh accidentally had checkPresent return an exception, rather than throwing it * The chunking code accidentally returned False rather than propagating an exception when there were no chunks and checkPresent threw an exception for the non-chunked key. This commit was sponsored by Carlo Matteo Capocasa.	2014-08-10 15:02:59 -04:00
Joey Hess	4f1ba9a23d	fix checkPresent error handling for non-present local git repos guardUsable r (error "foo") returned an error, rather than throwing it	2014-08-08 19:18:08 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	b4cf22a388	pushed checkPresent exception handling out of Remote implementations I tend to prefer moving toward explicit exception handling, not away from it, but in this case, I think there are good reasons to let checkPresent throw exceptions: 1. They can all be caught in one place (Remote.hasKey), and we know every possible exception is caught there now, which we didn't before. 2. It simplified the code of the Remotes. I think it makes sense for Remotes to be able to be implemented without needing to worry about catching exceptions inside them. (Mostly.) 3. Types.StoreRetrieve.Preparer can only work on things that return a Bool, which all the other relevant remote methods already did. I do not see a good way to generalize that type; my previous attempts failed miserably.	2014-08-06 13:45:19 -04:00
Joey Hess	6f4592966d	make testremote work with gcrypt repos This involved making Remote.Gcrypt.gen expect a Repo with a regular, non-gcrypt path. Since that is what's stored as the Remote's gitrepo, testremote can then modify it and feed it back into gen.	2014-08-04 08:42:04 -04:00
Joey Hess	1cd2273035	finally properly fixed ssh zombie leak The leak was caused by the thread that sshd'd to send transferinfo not waiting on its ssh. Doh.	2014-08-03 20:14:20 -04:00
Joey Hess	cdf61071bc	optimise handling of unavailable repos The exception handling resulted in git config --list being run twice for unavailable repos. This dials it back down to running it only once.	2014-07-15 14:45:27 -04:00
Joey Hess	bd514eb65a	catch exception when repo is really not available	2014-07-15 14:39:31 -04:00
Joey Hess	522a0922b8	sync: Fix git sync with local git remotes even when they don't have an annex.uuid set. Catch an exception when ensureInitialized is run in a non-initted repository. In this case, just read the git config, so that the Git.Repo object is not LocalUnknown, which is what is used to represent remotes on eg, drives that are not connected. The assistant already got this right, and like with the assistant, this causes an implicit git-annex init of the local remote on the second sync, once the git-annex branch has been pushed to it. See this comment for more analysis: http://git-annex.branchable.com/todo/Recovering_from_a_bad_sync/#comment-64e469a2c1969829ee149cbb41b1c138 This commit was sponsored by jscit.	2014-07-15 14:27:43 -04:00
Joey Hess	a44fd2c019	export CreateProcess fields from Utility.Process update code to avoid cwd and env redefinition warnings	2014-06-10 19:20:14 -04:00
Joey Hess	c07343e4f7	initremote/enableremote: Basic support for using with regular git remotes initremote stores the location of an already existing git remote, and enableremote sets up a remote using its stored location.	2014-05-22 13:42:17 -04:00
Joey Hess	c34b5e09f8	factor out getRemoteGitConfig	2014-05-16 16:08:20 -04:00
Joey Hess	0b899fa2f1	show a much longer message when annex-ignore is automatically set, to help the user fix their problem	2014-05-16 12:58:50 -04:00
Joey Hess	f00cb21037	Bring back rsync -p, but only when git-annex is running on a non-crippled file system. This is a better approach to fix #700282 while not unnecessarily losing file permissions on non-crippled systems.	2014-04-17 14:31:42 -04:00
Joey Hess	e426fac273	add desktop notifications Motivation: Hook scripts for nautilus or other file managers need to provide the user with feedback that a file is being downloaded. This commit was sponsored by THM Schoemaker.	2014-03-22 14:12:19 -04:00
Joey Hess	b63276309e	clean up cleanup action enumeration	2014-03-13 19:06:26 -04:00
Joey Hess	4d06037fdd	Fix zombie leak and general inefficiency when copying files to a local git repo. Benchmarking this with 1000 small files being copied, the time reduced from 15.98s to 14.64s -- an 8% improvement in the non-data-transfer overhead of git-annex copy.	2014-03-06 17:13:27 -04:00
Joey Hess	360ecb9f35	fix bare repo optimisation on Windows	2014-02-25 13:47:09 -04:00
Joey Hess	003fc2b7e1	add UrlOptions sum type	2014-02-24 22:00:25 -04:00
Joey Hess	c69d6eb035	Make annex.web-options be used in several places that call curl.	2014-02-24 21:29:37 -04:00
Joey Hess	089c0109a2	Added ways to configure rsync options to be used only when uploading or downloading from a remote. Useful to eg limit upload bandwidth.	2014-02-02 16:06:34 -04:00
Joey Hess	74b101d1dd	reorg	2014-01-26 16:36:31 -04:00
Joey Hess	1ca111620d	reorg	2014-01-26 16:32:55 -04:00
Joey Hess	5fc2d760ea	Optimise non-bare http remotes; no longer does a 404 to the wrong url every time before trying the right url. Needs annex-bare to be set to false, which is done when initially probing the uuid of a http remote.	2014-01-26 13:03:25 -04:00
Joey Hess	207ac67aaa	avoid needing a build-dep on hxt for Data.AssocList	2014-01-14 16:42:10 -04:00
Joey Hess	d07f2d7865	Fix a long-standing bug that could cause the wrong index file to be used when committing to the git-annex branch, if GIT_INDEX_FILE is set in the environment. This typically resulted in git-annex branch log files being committed to the master branch and later showing up in the work tree. (These log files can be safely removed.)	2014-01-14 15:36:33 -04:00
Joey Hess	c20f31a1ad	add GETAVAILABILITY to external special remote protocol And some reworking of types, and added an annex-availability git config setting.	2014-01-13 14:41:10 -04:00
Joey Hess	7be69a2491	gcrypt, bup: Fix bug that prevented using these special remotes with encryption=pubkey. I think both of these are all that's affected, but I went ahead and fixed all the remotes that set their config to M.empty to instead store the actual config. Who knows what will expect it to be actually present in future, the Remote instance of getGpgEncParams came to..	2013-11-02 16:37:28 -04:00
Joey Hess	7ed8e87a34	assistant: Support repairing git remotes that are locally accessible (eg, on removable drives) gcrypt remotes are not yet handled. This commit was sponsored by Sören Brunk.	2013-10-27 15:38:59 -04:00
Joey Hess	a6e9386d39	fix remote fsck to run in remote	2013-10-14 15:05:29 -04:00
Joey Hess	c78aaed317	ye olde inverted logic	2013-10-14 12:26:46 -04:00
Joey Hess	1ffb3bb0ba	add remote fsck interface Currently only implemented for local git remotes. May try to add support to git-annex-shell for ssh remotes later. Could conceivably also be supported by some special remote, although that seems unlikely. Cronner uses this when available, and when not falls back to fsck --fast --from remote. git annex fsck --from does not itself use this interface. To do so, I would need to pass --fast and all other options that influence fsck on to the git annex fsck that it runs inside the remote. And that seems like a lot of work for a result that would be no better than cd remote; git annex fsck. This may need to be revisited if git-annex-shell gets support, since it may be the case that the user cannot ssh to the server to run git-annex fsck there, but can run git-annex-shell there. This commit was sponsored by Damien Diederen.	2013-10-11 16:03:18 -04:00
Joey Hess	747f5b123c	url size fixes addurl: Improve message when adding url with wrong size to existing file. Before the message suggested the url didn't exist. Fixed handling of URL keys that have no recorded size. Before, if the key has no size, the url also had to not declare any size, which was unlikely and wrong, or it was taken to not exist. This probably would mostly affect keys that were added to the annex with addurl --relaxed.	2013-10-11 13:05:00 -04:00
Joey Hess	4e1e625fa6	fix transferring to gcrypt repo from direct mode repo recvkey was told it was receiving a HMAC key from a direct mode repo, and that confused it into rejecting the transfer, since it has no way to verify a key using that backend, since there is no HMAC backend. I considered making recvkey skip verification in the case of an unknown backend. However, that could lead to bad results; a key can legitimately be in the annex with a backend that the remote git-annex-shell doesn't know about. Better to keep it rejecting if it cannot verify. Instead, made the gcrypt special remote not set the direct mode flag when sending (and receiving) files. Also, added some recvkey messages when its checks fail, since otherwise all that is shown is a confusing error message from rsync when the remote git-annex-shell exits nonzero.	2013-10-01 14:38:46 -04:00
Joey Hess	12f6b9693a	Send a git-annex user-agent when downloading urls. Overridable with --user-agent option. Not yet done for S3 or WebDAV due to limitations of libraries used -- neither allows a user-agent header to be specified. This commit sponsored by Michael Zehrer.	2013-09-28 14:35:21 -04:00
Joey Hess	c1990702e9	hlint	2013-09-25 23:19:01 -04:00
Joey Hess	3192b059b5	add back lost check that git-annex-shell supports gcrypt	2013-09-24 17:51:12 -04:00
Joey Hess	f9e438c1bc	factor out more ssh stuff from git remote This has the dual benefits of making Remote.Git shorter, and letting Remote.GCrypt use these utilities.	2013-09-24 13:37:41 -04:00
Joey Hess	e8e209f4e5	better probing for gcrypt repositories using new --check option Now can tell if a repo uses gcrypt or not, and whether it's decryptable with the current gpg keys. This closes the hole that undecryptable gcrypt repos could have before been combined into the repo in encrypted mode.	2013-09-19 12:53:24 -04:00
Joey Hess	5fe49b98f8	Support hot-swapping of removable drives containing gcrypt repositories. To support this, a core.gcrypt-id is stored by git-annex inside the git config of a local gcrypt repository, when setting it up. That is compared with the remote's cached gcrypt-id. When different, a drive has been changed. git-annex then looks up the remote config for the uuid mapped from the core.gcrypt-id, and tweaks the configuration appropriately. When there is no known config for the uuid, it will refuse to use the remote.	2013-09-12 15:54:35 -04:00
Joey Hess	b64f5baf2d	sync: support gcrypt	2013-09-09 10:02:15 -04:00
Joey Hess	00fb5705ff	ignore gcrypt remotes w/o an annex-uuid	2013-09-08 15:19:14 -04:00
Joey Hess	7c1a9cdeb9	partially complete gcrypt remote (local send done; rest not) This is a git-remote-gcrypt encrypted special remote. Only sending files in to the remote works, and only for local repositories. Most of the work so far has involved making initremote work. A particular problem is that remote setup in this case needs to generate its own uuid, derived from the gcrypt-id. That required some larger changes in the code to support. For ssh remotes, this will probably just reuse Remote.Rsync's code, so should be easy enough. And for downloading from a web remote, I will need to factor out the part of Remote.Git that does that. One particular thing that will need work is supporting hot-swapping a local gcrypt remote. I think it needs to store the gcrypt-id in the git config of the local remote, so that it can check it every time, and compare with the cached annex-uuid for the remote. If there is a mismatch, it can change both the cached annex-uuid and the gcrypt-id. That should work, and I laid some groundwork for it by already reading the remote's config when it's local. (Also needed for other reasons.) This commit was sponsored by Daniel Callahan.	2013-09-07 18:38:00 -04:00
Joey Hess	a48a4e2f8a	automatically derive an annex-uuid from a gcrypt-uuids	2013-09-05 16:02:39 -04:00
Joey Hess	06db8e0bd9	squash compiler warnings on Windows	2013-08-04 13:18:05 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00

401 commits