git-annex/doc/todo/stop_using_curl_and_wget.mdwn

53 lines
2.3 KiB
Text
Raw Normal View History

2018-04-04 18:32:32 +00:00
Currently git-annex uses wget and curl for downloading urls.
Which is used depends on the situation, since both have their limitations
and quirks.
This often confuses users, who expect annex.web-options to only apply
to whichever program git-annex was running, and put in an option that
breaks the other program. Or, configure a netrc file, which wget uses by
default, but curl does not.
Also, using these external programs avoids keeping a http connection open
and pipelining requests, so it makes mass url downloads a lot slower than
if git-annex used http-conduit to do url downloads itself. [[users/yoh]]
has requested http pipelining.
(git-annex was creating a new http manager each time it hit an url,
except for in the S3 remote which reused a single manager. That's now been
improved, so all http-conduit use in git-annex reuses a http manager, and
so will do http pipelining.)
2018-04-04 18:32:32 +00:00
For file: ftp: and more unusual urls, http-conduit can't support them.
git-annex does support those urls, and people rely on that, so it would
still need to use wget or curl for those.
wget is also not shipped with git-annex on Windows or OSX, only curl is,
and it would be good to only use one of the programs, not both, when
handing those unusual urls.
See also, [[support_.netrc_for_fsck_--from_web]]. That some users rely on
git-annex using wget and a netrc file is kind of problimatic if switching
to http-conduit which does not support it. Maybe require users to set
`annex.web-download-command` if they want to make it use something that
supports netrc?
--[[Joey]]
> Implemented Utility.Url.downloadC that is the (nontrivial)
> download a file with resume support using http-conduit.
> It falls back to curl to handle urls that http-conduit does not support.
> Now we only have to decide what to do about the above edge cases..
> > Let's drop use of wget entirely, as it was only using it because I
> > preferred wget's progress bar to curl's. The user can still force wget
> > with annex.web-download-command.
> >
> > That leaves users who have a .netrc file or want to use
> > annex.web-options. Since curl requires --netrc in order to use the
> > .netrc file, require users who want to use the .netrc to
> > set "annex.web-options = --netrc". When "annex.web-options" is
> > set, always use curl (unless overridden by annex.web-download-command).
> > Otherwise, use conduit.
[[done]] --[[Joey]]