c34152777b
* For url downloads, git-annex now defaults to using a http library, rather than wget or curl. But, if annex.web-options is set, it will use curl. To use the .netrc file, run: git config annex.web-options --netrc * git-annex no longer uses wget (and wget is no longer shipped with git-annex builds). Note that curl is always run in silent mode, since the new API for download has a MeterUpdate and doesn't make way for curl progress output. It might be worth writing a parser for curl's progress output to update the meter when using it, but I didn't bother with this edge case for now. This commit was supported by the NSF-funded DataLad project.
52 lines
2.3 KiB
Markdown
52 lines
2.3 KiB
Markdown
Currently git-annex uses wget and curl for downloading urls.
|
|
Which is used depends on the situation, since both have their limitations
|
|
and quirks.
|
|
|
|
This often confuses users, who expect annex.web-options to only apply
|
|
to whichever program git-annex was running, and put in an option that
|
|
breaks the other program. Or, configure a netrc file, which wget uses by
|
|
default, but curl does not.
|
|
|
|
Also, using these external programs avoids keeping a http connection open
|
|
and pipelining requests, so it makes mass url downloads a lot slower than
|
|
if git-annex used http-conduit to do url downloads itself. [[users/yoh]]
|
|
has requested http pipelining.
|
|
|
|
(git-annex was creating a new http manager each time it hit an url,
|
|
except for in the S3 remote which reused a single manager. That's now been
|
|
improved, so all http-conduit use in git-annex reuses a http manager, and
|
|
so will do http pipelining.)
|
|
|
|
For file: ftp: and more unusual urls, http-conduit can't support them.
|
|
git-annex does support those urls, and people rely on that, so it would
|
|
still need to use wget or curl for those.
|
|
|
|
wget is also not shipped with git-annex on Windows or OSX, only curl is,
|
|
and it would be good to only use one of the programs, not both, when
|
|
handing those unusual urls.
|
|
|
|
See also, [[support_.netrc_for_fsck_--from_web]]. That some users rely on
|
|
git-annex using wget and a netrc file is kind of problimatic if switching
|
|
to http-conduit which does not support it. Maybe require users to set
|
|
`annex.web-download-command` if they want to make it use something that
|
|
supports netrc?
|
|
|
|
--[[Joey]]
|
|
|
|
> Implemented Utility.Url.downloadC that is the (nontrivial)
|
|
> download a file with resume support using http-conduit.
|
|
> It falls back to curl to handle urls that http-conduit does not support.
|
|
> Now we only have to decide what to do about the above edge cases..
|
|
|
|
> > Let's drop use of wget entirely, as it was only using it because I
|
|
> > preferred wget's progress bar to curl's. The user can still force wget
|
|
> > with annex.web-download-command.
|
|
> >
|
|
> > That leaves users who have a .netrc file or want to use
|
|
> > annex.web-options. Since curl requires --netrc in order to use the
|
|
> > .netrc file, require users who want to use the .netrc to
|
|
> > set "annex.web-options = --netrc". When "annex.web-options" is
|
|
> > set, always use curl (unless overridden by annex.web-download-command).
|
|
> > Otherwise, use conduit.
|
|
|
|
[[done]] --[[Joey]]
|