* Display error message when http download fails.
There's nothing in the http-client library to nicely format a http
exception, so in some cases it has to fall back to using show on it.
Seems better than just saying "it failed" or only showing the http
status code.
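A minimal sketch of the fallback idea, assuming http-client's current
HttpException type (this is not git-annex's exact code): format the
common status-code case readably, and fall back to show for anything
else the library can throw.

    import Network.HTTP.Client
    import Network.HTTP.Types.Status (statusCode, statusMessage)
    import qualified Data.ByteString.Char8 as B8

    describeHttpException :: HttpException -> String
    describeHttpException (HttpExceptionRequest _ (StatusCodeException resp _)) =
            let s = responseStatus resp
            in show (statusCode s) ++ " " ++ B8.unpack (statusMessage s)
    describeHttpException e = show e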
* Avoid forward retry when 0 bytes were received.
forwardRetry was comparing Nothing to Just 0, and so thought there had
been progress made when 0 bytes were received.
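Illustrative logic only (the names here are hypothetical, not
forwardRetry's actual code): treat a missing previous byte count as 0,
so that receiving 0 bytes no longer looks like progress.

    import Data.Maybe (fromMaybe)

    madeProgress :: Maybe Integer -> Maybe Integer -> Bool
    madeProgress old new = fromMaybe 0 new > fromMaybe 0 old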
This commit was supported by the NSF-funded DataLad project.
* For url downloads, git-annex now defaults to using a http library,
rather than wget or curl. But, if annex.web-options is set, it will
use curl. To use the .netrc file, run:
git config annex.web-options --netrc
* git-annex no longer uses wget (and wget is no longer shipped with
git-annex builds).
Note that curl is always run in silent mode, since the new API for
download takes a MeterUpdate and doesn't leave a way to pass through
curl's own progress output. It might be worth writing a parser for curl's progress output
to update the meter when using it, but I didn't bother with this edge
case for now.
This commit was supported by the NSF-funded DataLad project.
Remote.S3 and Remote.Helper.Http both had similar code to sink a
http-conduit Response to a file; refactor out sinkResponseFile.
downloadC downloads an url to a file using http-conduit, and supports
resuming. Falls back to curl to handle urls that http-conduit does not
support. This is not used yet, but the goal is to replace download with
it.
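A rough sketch of the streaming involved, assuming a recent
http-conduit; this is illustrative, not the actual
sinkResponseFile/downloadC code, and it glosses over details (for one,
a resumed download must check for a 206 response before appending).

    import qualified Data.ByteString as B
    import qualified Data.ByteString.Char8 as B8
    import Control.Monad.IO.Class (liftIO)
    import Control.Monad.Trans.Resource (runResourceT)
    import Data.Conduit
    import Network.HTTP.Client (Manager, parseRequest, requestHeaders, responseBody)
    import Network.HTTP.Conduit (http)
    import Network.HTTP.Types.Header (hRange)
    import System.Directory (doesFileExist, getFileSize)
    import System.IO

    -- Stream an url to a file, resuming from whatever is already present.
    downloadTo :: Manager -> String -> FilePath -> IO ()
    downloadTo mgr url dest = do
            req <- parseRequest url
            exists <- doesFileExist dest
            sz <- if exists then getFileSize dest else return 0
            -- ask for a ranged download when resuming
            let req' = if sz > 0
                    then req { requestHeaders = (hRange, B8.pack ("bytes=" ++ show sz ++ "-")) : requestHeaders req }
                    else req
            withBinaryFile dest AppendMode $ \h ->
                    runResourceT $ do
                            resp <- http req' mgr
                            runConduit $ responseBody resp .| awaitForever (liftIO . B.hPut h)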
git-annex.cabal: conduit-extra has not actually been used for a long time;
remove the dep. conduit moves into the main dependency list, but since
http-conduit was already in there, and it depends on conduit, that's not
really adding a new build dep.
This commit was supported by the NSF-funded DataLad project.
Enable HTTP connection reuse across multiple files, when git-annex
uses http-conduit. Before, a new Manager was created each time
Utility.Url used it. Now, a single Manager gets created the first time,
so connections are reused.
Doesn't help when external programs are used for url download,
but does speed up addurl --fast, fsck --from web, etc.
Testing fsck --fast --from web with 3 files, over high-latency
satellite internet, it sped up from 19.37s to 14.96s.
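One common way to get such a single shared Manager in Haskell (a
sketch only; Utility.Url's approach may differ) is a NOINLINE
top-level value:

    import Network.HTTP.Client (Manager, newManager)
    import Network.HTTP.Client.TLS (tlsManagerSettings)
    import System.IO.Unsafe (unsafePerformIO)

    -- Created the first time it is used, then shared for the whole
    -- process, so connections can be reused across requests.
    {-# NOINLINE sharedManager #-}
    sharedManager :: Manager
    sharedManager = unsafePerformIO (newManager tlsManagerSettings)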
This commit was supported by the NSF-funded DataLad project.
Fourth or fifth try at this and finally found a way to make it work.
Absurd amount of busy-work forced on me by a change in cabal's behavior.
Split up Utility modules that need posix stuff out of ones used by
Setup. Various other hacks around the inability of Setup to use anything
that ifdefs a use of unix.
Probably lost a full day of my life to this.
This is how build systems make their users hate them. Just saying.
This avoids warnings from stack about the module not being listed in the
cabal file. So, the generated file is also renamed to Build/SysConfig.
Note that the setup program seems to be cached despite these changes; I
had to cabal clean to get cabal to update it so that Build/SysConfig was
written.
This commit was sponsored by Jochen Bartl on Patreon.
Now youtubeDlCheck downloads the beginning of the url's content and
checks if it's html; only when it is does it pass it off to youtube-dl
to check if it supports it.
This means more work is done for urls that youtube-dl does support,
but it is probably more efficient for other urls, since it only downloads
the first chunk of content, while youtube-dl probably downloads more.
As well as the reported bug, this also fixes behavior when an url
was added with youtube-dl, but the url content has now changed from
a html page to something else. Remote.Web.checkKey used to wrongly
succeed in that situation, since youtube-dl said sure it can download
that something else.
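The html check itself can be as simple as this sketch (the name and
markers are illustrative, not the actual youtubeDlCheck code):

    import qualified Data.ByteString as B
    import qualified Data.ByteString.Char8 as B8
    import Data.Char (toLower)

    -- Does the beginning of the downloaded content look like a html page?
    looksLikeHtml :: B.ByteString -> Bool
    looksLikeHtml b = any (`B.isInfixOf` lowered) (map B8.pack ["<html", "<!doctype html"])
      where
            lowered = B8.map toLower b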
This commit was supported by the NSF-funded DataLad project.
webdav: Checking if a non-existent file is present on Box.com triggered a
bug in its webdav support that generates an infinite series of redirects.
It seems to redirect foo to foo/ to foo/index.php to
foo/index.php/index.php ... Why a webdav endpoint would behave this way,
who knows.
Deal with such problems by assuming such behavior means the file is not
present.
Can't simply disable following redirects, because the webdav endpoint could
legitimately be redirected to a new endpoint. So, when this happens, up to
10 redirects are followed before it gives up and assumes the file does not
exist.
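The rule reduces to something like this purely illustrative sketch
(the probe action here is a placeholder, not the DAV library's API):

    -- Possible outcomes of one probe of the file's url.
    data Probe = Redirect | Found | NotFound

    -- Follow up to n redirects; if it is still redirecting after that,
    -- assume the endpoint is misbehaving and the file is not present.
    existsDespiteRedirects :: Int -> IO Probe -> IO Bool
    existsDespiteRedirects n probe
            | n <= 0 = return False
            | otherwise = do
                    r <- probe
                    case r of
                            Redirect -> existsDespiteRedirects (n - 1) probe
                            Found -> return True
                            NotFound -> return False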
This commit was supported by the NSF-funded DataLad project.
* Run curl with -S, so HTTP errors are displayed, even when
it's otherwise silent.
* When downloading in --json or --quiet mode, use curl in preference
to wget, since curl is able to display only errors to stderr, unlike
wget.
This does mean that downloadQuiet is only silent on stdout, not necessarily
on stderr, which affects a couple of other callers of it. For example,
downloading the .git/config of a http remote may show an error message now,
perhaps with slightly suboptimal formatting due to other output.
This adds one extra line of output when a download is successful,
after the progress bar. I don't much like that, but wget does not provide a
way to show HTTP errors without it.
This is the kind of annoying thing that makes me not want to use a library.
conduitManagerSettings was a perfectly fine name and could have been kept
forever.
Since I want git-annex to keep building on debian stable, I need to still
support the old http-client, which required explicit calls to
closeManager, or use of withManager to get Managers to close at appropriate
times. This is not needed in the new version, and so they added a
deprecation warning. IMHO much too early, because look at the mess I had to
go through to avoid that deprecation warning while supporting both
versions..
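The resulting shim looks something like this sketch (the version
bound here is illustrative, not necessarily the one git-annex uses):

    {-# LANGUAGE CPP #-}
    import Network.HTTP.Client

    closeManager' :: Manager -> IO ()
    #if MIN_VERSION_http_client(0,4,18)
    -- newer http-client closes managers automatically, and calling the
    -- deprecated closeManager would trigger the warning
    closeManager' _ = return ()
    #else
    closeManager' = closeManager
    #endif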
This removes a bit of complexity, and should make things faster, since it
avoids tokenizing the Params string and probably involves less garbage
collection.
In a few places, it was useful to use Params to avoid needing a list,
but that is easily avoided.
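For illustration, a typical conversion looks like this (the type is a
stand-in for git-annex's CommandParam, and the call site is made up):

    -- Stand-in for the real CommandParam type, for illustration only.
    data CommandParam = Param String | Params String

    -- before: the Params string gets tokenized on whitespace at run time
    before :: [CommandParam]
    before = [Param "commit", Params "-a -m update"]

    -- after: each word is passed explicitly, so nothing needs tokenizing
    after :: [CommandParam]
    after = [Param "commit", Param "-a", Param "-m", Param "update"]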
Problems noticed while doing this conversion:
* Some uses of Params "oneword", which were entirely unnecessary
overhead.
* A few places that built up a list of parameters with ++
and then used Params to split it!
Test suite passes.
That failed on OSX. The temp dir was
/var/folders/fb/pnwjj52n7fg0r9mnvpsfll180000gr/T/downloadurl
and the relative path
../../../../../../Volumes/Visitors/joeyh/git-annex/r/.git/...
Didn't work. I have no clue why; how did OSX manage to break this?
But, the relative path is longer most of the time anyway, so let's
just use the absolute path.
In this situation, curl -o exits successfully without creating the output
file.
There was already a workaround for curl file:/// but I did not realize this
also affected regular url downloads.
To fix it, pre-create the destination file before starting curl.
Since we cannot always know the size of an url before trying to download
it, let's always do this.
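A minimal sketch of the workaround (the helper name is hypothetical):

    import Control.Monad (unless)
    import System.Directory (doesFileExist)

    -- Ensure the destination file exists (possibly empty), so curl -o
    -- cannot exit successfully while leaving no file behind.
    preCreate :: FilePath -> IO ()
    preCreate f = do
            exists <- doesFileExist f
            unless exists $ writeFile f ""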
Note that since curl is told -C -, we have to consider if this
makes curl try to do a ranged download, which might fail on some servers
where a regular download would have succeeded. My testing indicates
this isn't a problem; since the file is empty, curl seems to not try to
do a ranged download.
Original report: https://github.com/datalad/datalad/issues/79
Curl bug report: https://github.com/bagder/curl/issues/183
Avoid using fileSize which maxes out at just 2 gb on Windows.
Instead, use hFileSize, which doesn't have a bounded size.
Fixes support for files > 2 gb on Windows.
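A hedged sketch of the hFileSize approach (git-annex's actual helper
may well differ in details):

    import System.IO

    -- hFileSize returns an Integer, so it is not subject to the 2 gb
    -- limit that the COff-based fileSize has on Windows.
    getFileSize :: FilePath -> IO Integer
    getFileSize f = withBinaryFile f ReadMode hFileSize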
Note that the InodeCache code only needs to compare a file size,
so it doesn't matter if the file size wraps. So it has been
left as-is. This was necessary both to avoid invalidating existing inode
caches, and because the code passed FileStatus around and would have become
more expensive if it called getFileSize.
This commit was sponsored by Christian Dietrich.
The hoary old HTTP library was only used when checking if an url exists,
when curl was not available. It had many problems, including not supporting
https at all.
Now, this is done using http-conduit for all urls that it supports. Falls
back to curl for any url that http-conduit doesn't like (probably ftp etc,
but could also be an url that its parser chokes on for whatever reason).
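A hedged sketch of the kind of existence check this enables, assuming
a recent http-client (this is not Utility.Url's exact code): issue a
HEAD request and treat a 2xx response as the url existing.

    import Control.Exception (try)
    import Network.HTTP.Client
    import Network.HTTP.Types.Method (methodHead)
    import Network.HTTP.Types.Status (statusIsSuccessful)

    -- Urls that the parser rejects entirely would be handed off to curl
    -- instead (not shown here).
    urlExists :: Manager -> String -> IO Bool
    urlExists mgr u = case parseUrlThrow u of
            Nothing -> return False
            Just req -> do
                    let req' = req { method = methodHead }
                    r <- try (httpNoBody req' mgr) :: IO (Either HttpException (Response ()))
                    return (either (const False) (statusIsSuccessful . responseStatus) r)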
This adds a new dependency on http-conduit, but webdav support already
indirectly depended on that, and the s3-aws branch also uses it.
This opens up the possibility of using http-conduit for large file
downloads, but for now I've left it using wget/curl.
This commit was sponsored by Paul Tötterman.
Removed old extensible-exceptions, only needed for very old ghc.
Made webdav use Utility.Exception, to work after some changes in DAV's
exception handling.
Removed Annex.Exception. Mostly this was trivial, but note that
tryAnnex is replaced with tryNonAsync and catchAnnex replaced with
catchNonAsync. In theory that could be a behavior change, since the former
caught all exceptions, and the latter don't catch async exceptions.
However, in practice, nothing in the Annex monad uses async exceptions.
Grepping for throwTo and killThread only finds stuff in the assistant,
which does not seem related.
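Roughly what tryNonAsync does, sketched from the usual pattern
(Utility.Exception's version may differ in details): rethrow async
exceptions, catch everything else.

    {-# LANGUAGE ScopedTypeVariables #-}
    import Control.Exception (AsyncException, SomeException)
    import Control.Monad.Catch (MonadCatch, Handler(..), catches, throwM)

    tryNonAsync :: MonadCatch m => m a -> m (Either SomeException a)
    tryNonAsync a = (Right <$> a) `catches`
            [ Handler (\ (e :: AsyncException) -> throwM e)
            , Handler (\ (e :: SomeException) -> return (Left e))
            ]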
Command.Add.undo is changed to accept a SomeException, and things
that use it for rollback now catch non-async exceptions, rather than
only IOExceptions.
addurl: Improve message when adding url with wrong size to existing file.
Before, the message suggested the url didn't exist.
Fixed handling of URL keys that have no recorded size. Before, if the key
had no size, the url also had to not declare any size (which was unlikely,
and wrong anyway), or the key was taken to not exist. This would mostly
affect keys that were added to the annex with addurl --relaxed.
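Illustrative only (the name is hypothetical): the size comparison has
to treat a key with no recorded size as matching whatever size the url
declares.

    sizeAcceptable :: Maybe Integer -> Maybe Integer -> Bool
    sizeAcceptable Nothing _ = True                -- key has no recorded size
    sizeAcceptable (Just _) Nothing = True         -- url declares no size
    sizeAcceptable (Just keysize) (Just urlsize) = keysize == urlsize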
Overridable with --user-agent option.
Not yet done for S3 or WebDAV due to limitations of the libraries used --
neither allows a user-agent header to be specified.
This commit sponsored by Michael Zehrer.
<RichiH> i richih@eudyptes (git)-[master] ~git/debconf-share/debconf13/photos/chrysn % rm /home/richih/work/git/debconf-share/.git/annex/tmp/SHA256E-s3044235--693b74fcb12db06b5e79a8b99d03e2418923866506ee62d24a4e9ae8c5236758.JPG
<RichiH> richih@eudyptes (git)-[master] ~git/debconf-share/debconf13/photos/chrysn % git annex get P8060008.JPG
<RichiH> get P8060008.JPG (from website...) --2013-08-21 21:42:45-- http://annex.debconf.org/debconf-share/.git//annex/objects/1a4/67d/SHA256E-s3044235--693b74fcb12db06b5e79a8b99d03e2418923866506ee62d24a4e9ae8c5236758.JPG/SHA256E-s3044235--693b74fcb12db06b5e79a8b99d03e2418923866506ee62d24a4e9ae8c5236758.JPG
<RichiH> Resolving annex.debconf.org (annex.debconf.org)... 5.153.231.227, 2001:41c8:1000:19::227:2
<RichiH> Connecting to annex.debconf.org (annex.debconf.org)|5.153.231.227|:80... connected.
<RichiH> HTTP request sent, awaiting response... 404 Not Found
<RichiH> 2013-08-21 21:42:45 ERROR 404: Not Found.
<RichiH> File `/home/richih/work/git/debconf-share/.git/annex/tmp/SHA256E-s3044235--693b74fcb12db06b5e79a8b99d03e2418923866506ee62d24a4e9ae8c5236758.JPG' already there; not retrieving.
<RichiH> Unable to access these remotes: website
<RichiH> Try making some of these repositories available:
<RichiH> 3e0356ac-0743-11e3-83a5-1be63124a102 -- website (annex.debconf.org)
<RichiH> a7495021-9f2d-474e-80c7-34d29d09fec6 -- chrysn@hephaistos:~/data/projects/debconf13/debconf-share
<RichiH> eb8990f7-84cd-4e6b-b486-a5e71efbd073 -- joeyh passport usb drive
<RichiH> f415f118-f428-4c68-be66-c91501da3a93 -- joeyh laptop
<RichiH> failed
<RichiH> git-annex: get: 1 failed
<RichiH> richih@eudyptes (git)-[master] ~git/debconf-share/debconf13/photos/chrysn %
I was not able to reproduce the failure, but I did reproduce that
wget -O http://404/ results in an empty file being written.
* addurl: Escape invalid characters in urls, rather than failing to
use an invalid url.
* addurl: Properly handle url-escaped characters in file:// urls.
The code explicitly switches from HEAD to GET for most redirects.
Possibly because someone misread a spec (which does require switching from
POST to GET for 303 redirects). Or possibly because the spec really is that
bad. Upstream bug: https://github.com/haskell/HTTP/issues/24
Since we absolutely don't want to download entire (large) files from
the web when checking that they exist with HEAD, I wrote my own redirect
follower, based closely on the one used by Network.Browser, but without
this misfeature.
Note that Network.Browser checks that the redirect url is a http url
and fails if not. I don't, because I want to not need to change this
code when it gets https support (related: I'm surprised to see it
doesn't support https yet..). The check does not seem security significant;
it doesn't support file:// urls for example. If a http url is redirected
to https, Network.Browser will actually make a http connection again.
This could loop, but only up to 5 times.
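A hedged sketch of such a follower using the HTTP library (this is
not the actual code, which is modeled on Network.Browser; for
simplicity only absolute Location headers are followed here):

    import Network.HTTP
    import Network.URI (URI, parseURI)

    -- HEAD the url, following at most 5 redirects, without ever
    -- switching to GET, so no file content is downloaded.
    headExists :: URI -> IO Bool
    headExists = go (5 :: Int)
      where
            go 0 _ = return False
            go n u = do
                    r <- simpleHTTP (mkRequest HEAD u :: Request String)
                    case r of
                            Left _ -> return False
                            Right resp -> case rspCode resp of
                                    (2, _, _) -> return True
                                    (3, _, _) -> maybe (return False) (go (n - 1))
                                            (findHeader HdrLocation resp >>= parseURI)
                                    _ -> return False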
If there's no Content-Length, or the key has no size, this check is not
done, but it should happen most of the time, and protect against web
content that has changed.