Use http-conduit for url downloads by default, annex.web-options enables curl

* For url downloads, git-annex now defaults to using a http library,
  rather than wget or curl. But, if annex.web-options is set, it will
  use curl. To use the .netrc file, run:
    git config annex.web-options --netrc
* git-annex no longer uses wget (and wget is no longer shipped with
  git-annex builds).

Note that curl is always run in silent mode, since the new API for
download has a MeterUpdate and doesn't make way for curl progress
output. It might be worth writing a parser for curl's progress output
to update the meter when using it, but I didn't bother with this edge
case for now.

This commit was supported by the NSF-funded DataLad project.
This commit is contained in:
Joey Hess 2018-04-06 17:00:46 -04:00
parent 0791c24221
commit c34152777b
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
17 changed files with 104 additions and 181 deletions

View file

@ -939,16 +939,14 @@ saveState nocommit = doSideAction $ do
{- Downloads content from any of a list of urls. -} {- Downloads content from any of a list of urls. -}
downloadUrl :: Key -> MeterUpdate -> [Url.URLString] -> FilePath -> Annex Bool downloadUrl :: Key -> MeterUpdate -> [Url.URLString] -> FilePath -> Annex Bool
downloadUrl k p urls file = meteredFile file (Just p) k $ downloadUrl k p urls file =
-- Poll the file to handle configurations where an external
-- download command is used.
meteredFile file (Just p) k $
go =<< annexWebDownloadCommand <$> Annex.getGitConfig go =<< annexWebDownloadCommand <$> Annex.getGitConfig
where where
go Nothing = do go Nothing = Url.withUrlOptions $ \uo ->
a <- ifM commandProgressDisabled liftIO $ anyM (\u -> Url.download p u file uo) urls
( return Url.downloadQuiet
, return Url.download
)
Url.withUrlOptions $ \uo ->
liftIO $ anyM (\u -> a u file uo) urls
go (Just basecmd) = anyM (downloadcmd basecmd) urls go (Just basecmd) = anyM (downloadcmd basecmd) urls
downloadcmd basecmd url = downloadcmd basecmd url =
progressCommand "sh" [Param "-c", Param $ gencmd url basecmd] progressCommand "sh" [Param "-c", Param $ gencmd url basecmd]

View file

@ -1,5 +1,5 @@
{- Url downloading, with git-annex user agent and configured http {- Url downloading, with git-annex user agent and configured http
- headers and wget/curl options. - headers and curl options.
- -
- Copyright 2013-2018 Joey Hess <id@joeyh.name> - Copyright 2013-2018 Joey Hess <id@joeyh.name>
- -

View file

@ -36,6 +36,7 @@ import Utility.Tmp.Dir
import Utility.UserInfo import Utility.UserInfo
import Utility.Gpg import Utility.Gpg
import Utility.FileMode import Utility.FileMode
import Utility.Metered
import qualified Utility.Lsof as Lsof import qualified Utility.Lsof as Lsof
import qualified BuildInfo import qualified BuildInfo
import qualified Utility.Url as Url import qualified Utility.Url as Url
@ -322,8 +323,8 @@ downloadDistributionInfo = do
liftIO $ withTmpDir "git-annex.tmp" $ \tmpdir -> do liftIO $ withTmpDir "git-annex.tmp" $ \tmpdir -> do
let infof = tmpdir </> "info" let infof = tmpdir </> "info"
let sigf = infof ++ ".sig" let sigf = infof ++ ".sig"
ifM (Url.downloadQuiet distributionInfoUrl infof uo ifM (Url.download nullMeterUpdate distributionInfoUrl infof uo
<&&> Url.downloadQuiet distributionInfoSigUrl sigf uo <&&> Url.download nullMeterUpdate distributionInfoSigUrl sigf uo
<&&> verifyDistributionSig gpgcmd sigf) <&&> verifyDistributionSig gpgcmd sigf)
( parseInfoFile <$> readFileStrict infof ( parseInfoFile <$> readFileStrict infof
, return Nothing , return Nothing

View file

@ -69,14 +69,6 @@ preferredBundledPrograms = catMaybes
, Just "rsync" , Just "rsync"
#ifndef mingw32_HOST_OS #ifndef mingw32_HOST_OS
, Just "sh" , Just "sh"
#endif
#ifndef mingw32_HOST_OS
#ifndef darwin_HOST_OS
-- wget on OSX has been problematic, looking for certs in the wrong
-- places. Don't ship it, use curl or the OSX's own wget if it has
-- one.
, ifset BuildInfo.wget "wget"
#endif
#endif #endif
, BuildInfo.lsof , BuildInfo.lsof
, BuildInfo.gcrypt , BuildInfo.gcrypt

View file

@ -6,14 +6,11 @@ module Build.Configure where
import Build.TestConfig import Build.TestConfig
import Build.Version import Build.Version
import Utility.PartialPrelude
import Utility.Process
import Utility.SafeCommand import Utility.SafeCommand
import Utility.ExternalSHA import Utility.ExternalSHA
import Utility.Env.Basic import Utility.Env.Basic
import Utility.Exception import Utility.Exception
import qualified Git.Version import qualified Git.Version
import Utility.DottedVersion
import Utility.Directory import Utility.Directory
import Control.Monad.IfElse import Control.Monad.IfElse
@ -34,8 +31,6 @@ tests =
, TestCase "xargs -0" $ testCmd "xargs_0" "xargs -0 </dev/null" , TestCase "xargs -0" $ testCmd "xargs_0" "xargs -0 </dev/null"
, TestCase "rsync" $ testCmd "rsync" "rsync --version >/dev/null" , TestCase "rsync" $ testCmd "rsync" "rsync --version >/dev/null"
, TestCase "curl" $ testCmd "curl" "curl --version >/dev/null" , TestCase "curl" $ testCmd "curl" "curl --version >/dev/null"
, TestCase "wget" $ testCmd "wget" "wget --version >/dev/null"
, TestCase "wget unclutter options" checkWgetUnclutter
, TestCase "bup" $ testCmd "bup" "bup --version >/dev/null" , TestCase "bup" $ testCmd "bup" "bup --version >/dev/null"
, TestCase "nice" $ testCmd "nice" "nice true >/dev/null" , TestCase "nice" $ testCmd "nice" "nice true >/dev/null"
, TestCase "ionice" $ testCmd "ionice" "ionice -c3 true >/dev/null" , TestCase "ionice" $ testCmd "ionice" "ionice -c3 true >/dev/null"
@ -106,19 +101,6 @@ getGitVersion = go =<< getEnv "FORCE_GIT_VERSION"
error $ "installed git version " ++ show v ++ " is too old! (Need " ++ show oldestallowed ++ " or newer)" error $ "installed git version " ++ show v ++ " is too old! (Need " ++ show oldestallowed ++ " or newer)"
return $ Config "gitversion" $ StringConfig $ show v return $ Config "gitversion" $ StringConfig $ show v
checkWgetUnclutter :: Test
checkWgetUnclutter = Config "wgetunclutter" . BoolConfig
. maybe False (>= normalize "1.16")
<$> getWgetVersion
getWgetVersion :: IO (Maybe DottedVersion)
getWgetVersion = catchDefaultIO Nothing $
extract <$> readProcess "wget" ["--version"]
where
extract s = case lines s of
[] -> Nothing
(l:_) -> normalize <$> headMaybe (drop 2 $ words l)
getSshConnectionCaching :: Test getSshConnectionCaching :: Test
getSshConnectionCaching = Config "sshconnectioncaching" . BoolConfig <$> getSshConnectionCaching = Config "sshconnectioncaching" . BoolConfig <$>
boolSystem "sh" [Param "-c", Param "ssh -o ControlPersist=yes -V >/dev/null 2>/dev/null"] boolSystem "sh" [Param "-c", Param "ssh -o ControlPersist=yes -V >/dev/null 2>/dev/null"]

View file

@ -9,7 +9,8 @@
- for that. - for that.
- -
- To build the installer, git-annex should already be built to - To build the installer, git-annex should already be built to
- ./git-annex.exe and the necessary utility programs (rsync and wget) - ./git-annex.exe and the necessary utility programs
- (specifically rsync)
- already installed in PATH from msys32. - already installed in PATH from msys32.
- -
- Copyright 2013-2015 Joey Hess <id@joeyh.name> - Copyright 2013-2015 Joey Hess <id@joeyh.name>

View file

@ -1,6 +1,13 @@
git-annex (6.20180317) UNRELEASED; urgency=medium git-annex (6.20180317) UNRELEASED; urgency=medium
* Added adb special remote which allows exporting files to Android devices. * Added adb special remote which allows exporting files to Android devices.
* For url downloads, git-annex now defaults to using a http library,
rather than wget or curl. But, if annex.web-options is set, it will
use curl. To use the .netrc file, run:
git config annex.web-options --netrc
* git-annex no longer uses wget (and wget is no longer shipped with
git-annex builds).
* Enable HTTP connection reuse across multiple files for improved speed.
* Fix calculation of estimated completion for progress meter. * Fix calculation of estimated completion for progress meter.
* OSX app: Work around libz/libPng/ImageIO.framework version skew * OSX app: Work around libz/libPng/ImageIO.framework version skew
by not bundling libz, assuming OSX includes a suitable libz.1.dylib. by not bundling libz, assuming OSX includes a suitable libz.1.dylib.
@ -12,8 +19,6 @@ git-annex (6.20180317) UNRELEASED; urgency=medium
don't copy the data metadata from the old version of the file, don't copy the data metadata from the old version of the file,
instead use the mtime of the file. instead use the mtime of the file.
* Avoid running annex.http-headers-command more than once. * Avoid running annex.http-headers-command more than once.
* Enable HTTP connection reuse across multiple files, when git-annex
uses http-conduit.
* info: Added "combined size of repositories containing these files" * info: Added "combined size of repositories containing these files"
stat when run on a directory. stat when run on a directory.
* info: Changed sorting of numcopies stats table, so it's ordered * info: Changed sorting of numcopies stats table, so it's ordered

View file

@ -27,6 +27,7 @@ import Types.KeySource
import Types.UrlContents import Types.UrlContents
import Annex.FileMatcher import Annex.FileMatcher
import Logs.Location import Logs.Location
import Messages.Progress
import Utility.Metered import Utility.Metered
import Utility.FileSystemEncoding import Utility.FileSystemEncoding
import Utility.HtmlDetect import Utility.HtmlDetect
@ -260,9 +261,8 @@ downloadWeb o url urlinfo file =
go =<< downloadWith' downloader urlkey webUUID url (AssociatedFile (Just file)) go =<< downloadWith' downloader urlkey webUUID url (AssociatedFile (Just file))
where where
urlkey = addSizeUrlKey urlinfo $ Backend.URL.fromUrl url Nothing urlkey = addSizeUrlKey urlinfo $ Backend.URL.fromUrl url Nothing
downloader f p = do downloader f p = metered (Just p) urlkey (pure Nothing) $
showOutput \_ p' -> downloadUrl urlkey p' [url] f
downloadUrl urlkey p [url] f
go Nothing = return Nothing go Nothing = return Nothing
-- If we downloaded a html file, try to use youtube-dl to -- If we downloaded a html file, try to use youtube-dl to
-- extract embedded media. -- extract embedded media.

View file

@ -33,6 +33,7 @@ import Logs.Web
import Logs.File import Logs.File
import qualified Utility.Format import qualified Utility.Format
import Utility.Tmp import Utility.Tmp
import Utility.Metered
import Command.AddUrl (addUrlFile, downloadRemoteFile, parseDownloadOptions, DownloadOptions(..)) import Command.AddUrl (addUrlFile, downloadRemoteFile, parseDownloadOptions, DownloadOptions(..))
import Annex.UUID import Annex.UUID
import Backend.URL (fromUrl) import Backend.URL (fromUrl)
@ -148,12 +149,10 @@ findDownloads u = go =<< downloadFeed u
downloadFeed :: URLString -> Annex (Maybe Feed) downloadFeed :: URLString -> Annex (Maybe Feed)
downloadFeed url downloadFeed url
| Url.parseURIRelaxed url == Nothing = giveup "invalid feed url" | Url.parseURIRelaxed url == Nothing = giveup "invalid feed url"
| otherwise = do | otherwise = Url.withUrlOptions $ \uo ->
showOutput
Url.withUrlOptions $ \uo ->
liftIO $ withTmpFile "feed" $ \f h -> do liftIO $ withTmpFile "feed" $ \f h -> do
hClose h hClose h
ifM (Url.download url f uo) ifM (Url.download nullMeterUpdate url f uo)
( parseFeedString <$> readFileStrict f ( parseFeedString <$> readFileStrict f
, return Nothing , return Nothing
) )

View file

@ -193,13 +193,13 @@ downloadTorrentFile u = do
( return True ( return True
, do , do
showAction "downloading torrent file" showAction "downloading torrent file"
showOutput
createAnnexDirectory (parentDir torrent) createAnnexDirectory (parentDir torrent)
if isTorrentMagnetUrl u if isTorrentMagnetUrl u
then do then do
tmpdir <- tmpTorrentDir u tmpdir <- tmpTorrentDir u
let metadir = tmpdir </> "meta" let metadir = tmpdir </> "meta"
createAnnexDirectory metadir createAnnexDirectory metadir
showOutput
ok <- downloadMagnetLink u metadir torrent ok <- downloadMagnetLink u metadir torrent
liftIO $ removeDirectoryRecursive metadir liftIO $ removeDirectoryRecursive metadir
return ok return ok
@ -208,7 +208,7 @@ downloadTorrentFile u = do
withTmpFileIn misctmp "torrent" $ \f h -> do withTmpFileIn misctmp "torrent" $ \f h -> do
liftIO $ hClose h liftIO $ hClose h
ok <- Url.withUrlOptions $ ok <- Url.withUrlOptions $
liftIO . Url.download u f liftIO . Url.download nullMeterUpdate u f
when ok $ when ok $
liftIO $ renameFile f torrent liftIO $ renameFile f torrent
return ok return ok

View file

@ -252,7 +252,7 @@ tryGitConfigRead autoinit r
v <- liftIO $ withTmpFile "git-annex.tmp" $ \tmpfile h -> do v <- liftIO $ withTmpFile "git-annex.tmp" $ \tmpfile h -> do
hClose h hClose h
let url = Git.repoLocation r ++ "/config" let url = Git.repoLocation r ++ "/config"
ifM (Url.downloadQuiet url tmpfile uo) ifM (Url.download nullMeterUpdate url tmpfile uo)
( Just <$> pipedconfig "git" [Param "config", Param "--null", Param "--list", Param "--file", File tmpfile] ( Just <$> pipedconfig "git" [Param "config", Param "--null", Param "--list", Param "--file", File tmpfile]
, return Nothing , return Nothing
) )

View file

@ -17,6 +17,7 @@ import Annex.Content
import Config.Cost import Config.Cost
import Logs.Web import Logs.Web
import Annex.UUID import Annex.UUID
import Messages.Progress
import Utility.Metered import Utility.Metered
import qualified Annex.Url as Url import qualified Annex.Url as Url
import Annex.YoutubeDl import Annex.YoutubeDl
@ -74,13 +75,14 @@ downloadKey key _af dest p = unVerified $ get =<< getWebUrls key
get [] = do get [] = do
warning "no known url" warning "no known url"
return False return False
get urls = do get urls = untilTrue urls $ \u -> do
showOutput -- make way for download progress bar
untilTrue urls $ \u -> do
let (u', downloader) = getDownloader u let (u', downloader) = getDownloader u
case downloader of case downloader of
YoutubeDownloader -> youtubeDlTo key u' dest YoutubeDownloader -> do
_ -> downloadUrl key p [u'] dest showOutput
youtubeDlTo key u' dest
_ -> metered (Just p) key (pure Nothing) $ \_ p' ->
downloadUrl key p' [u'] dest
downloadKeyCheap :: Key -> AssociatedFile -> FilePath -> Annex Bool downloadKeyCheap :: Key -> AssociatedFile -> FilePath -> Annex Bool
downloadKeyCheap _ _ _ = return False downloadKeyCheap _ _ _ = return False

View file

@ -25,8 +25,6 @@ module Utility.Url (
getUrlInfo, getUrlInfo,
assumeUrlExists, assumeUrlExists,
download, download,
downloadQuiet,
downloadC,
sinkResponseFile, sinkResponseFile,
downloadPartial, downloadPartial,
parseURIRelaxed, parseURIRelaxed,
@ -35,7 +33,6 @@ module Utility.Url (
) where ) where
import Common import Common
import Utility.Tmp.Dir
import Utility.Metered import Utility.Metered
import qualified BuildInfo import qualified BuildInfo
@ -72,23 +69,30 @@ type UserAgent = String
data UrlOptions = UrlOptions data UrlOptions = UrlOptions
{ userAgent :: Maybe UserAgent { userAgent :: Maybe UserAgent
, reqHeaders :: Headers , reqHeaders :: Headers
, reqParams :: [CommandParam] , urlDownloader :: UrlDownloader
, applyRequest :: Request -> Request , applyRequest :: Request -> Request
, httpManager :: Manager , httpManager :: Manager
} }
data UrlDownloader
= DownloadWithConduit
| DownloadWithCurl [CommandParam]
defUrlOptions :: IO UrlOptions defUrlOptions :: IO UrlOptions
defUrlOptions = UrlOptions defUrlOptions = UrlOptions
<$> pure Nothing <$> pure Nothing
<*> pure [] <*> pure []
<*> pure [] <*> pure DownloadWithConduit
<*> pure id <*> pure id
<*> newManager managerSettings <*> newManager managerSettings
mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> Manager -> UrlOptions mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> Manager -> UrlOptions
mkUrlOptions defuseragent reqheaders reqparams manager = mkUrlOptions defuseragent reqheaders reqparams manager =
UrlOptions useragent reqheaders reqparams applyrequest manager UrlOptions useragent reqheaders urldownloader applyrequest manager
where where
urldownloader = if null reqparams
then DownloadWithConduit
else DownloadWithCurl reqparams
applyrequest = \r -> r { requestHeaders = requestHeaders r ++ addedheaders } applyrequest = \r -> r { requestHeaders = requestHeaders r ++ addedheaders }
addedheaders = uaheader ++ otherheaders addedheaders = uaheader ++ otherheaders
useragent = maybe defuseragent (Just . B8.toString . snd) useragent = maybe defuseragent (Just . B8.toString . snd)
@ -105,11 +109,16 @@ mkUrlOptions defuseragent reqheaders reqparams manager =
(' ':v') -> (h', B8.fromString v') (' ':v') -> (h', B8.fromString v')
_ -> (h', B8.fromString v) _ -> (h', B8.fromString v)
addUserAgent :: UrlOptions -> [CommandParam] -> [CommandParam] curlParams :: UrlOptions -> [CommandParam] -> [CommandParam]
addUserAgent uo ps = case userAgent uo of curlParams uo ps = ps ++ uaparams ++ headerparams ++ addedparams
Nothing -> ps where
-- --user-agent works for both wget and curl commands uaparams = case userAgent uo of
Just ua -> ps ++ [Param "--user-agent", Param ua] Nothing -> []
Just ua -> [Param "--user-agent", Param ua]
headerparams = concatMap (\h -> [Param "-H", Param h]) (reqHeaders uo)
addedparams = case urlDownloader uo of
DownloadWithConduit -> []
DownloadWithCurl l -> l
{- Checks that an url exists and could be successfully downloaded, {- Checks that an url exists and could be successfully downloaded,
- also checking that its size, if available, matches a specified size. -} - also checking that its size, if available, matches a specified size. -}
@ -144,8 +153,8 @@ assumeUrlExists = UrlInfo True Nothing Nothing
- also returning its size and suggested filename if available. -} - also returning its size and suggested filename if available. -}
getUrlInfo :: URLString -> UrlOptions -> IO UrlInfo getUrlInfo :: URLString -> UrlOptions -> IO UrlInfo
getUrlInfo url uo = case parseURIRelaxed url of getUrlInfo url uo = case parseURIRelaxed url of
Just u -> case parseUrlConduit (show u) of Just u -> case (urlDownloader uo, parseUrlConduit (show u)) of
Just req -> catchJust (DownloadWithConduit, Just req) -> catchJust
-- When http redirects to a protocol which -- When http redirects to a protocol which
-- conduit does not support, it will throw -- conduit does not support, it will throw
-- a StatusCodeException with found302. -- a StatusCodeException with found302.
@ -155,7 +164,7 @@ getUrlInfo url uo = case parseURIRelaxed url of
`catchNonAsync` (const dne) `catchNonAsync` (const dne)
-- http-conduit does not support file:, ftp:, etc urls, -- http-conduit does not support file:, ftp:, etc urls,
-- so fall back to reading files and using curl. -- so fall back to reading files and using curl.
Nothing _
| uriScheme u == "file:" -> do | uriScheme u == "file:" -> do
let f = unEscapeString (uriPath u) let f = unEscapeString (uriPath u)
s <- catchMaybeIO $ getFileStatus f s <- catchMaybeIO $ getFileStatus f
@ -171,12 +180,12 @@ getUrlInfo url uo = case parseURIRelaxed url of
dne = return $ UrlInfo False Nothing Nothing dne = return $ UrlInfo False Nothing Nothing
found sz f = return $ UrlInfo True sz f found sz f = return $ UrlInfo True sz f
curlparams = addUserAgent uo $ curlparams = curlParams uo $
[ Param "-s" [ Param "-s"
, Param "--head" , Param "--head"
, Param "-L", Param url , Param "-L", Param url
, Param "-w", Param "%{http_code}" , Param "-w", Param "%{http_code}"
] ++ concatMap (\h -> [Param "-H", Param h]) (reqHeaders uo) ++ (reqParams uo) ]
extractlencurl s = case lastMaybe $ filter ("Content-Length:" `isPrefixOf`) (lines s) of extractlencurl s = case lastMaybe $ filter ("Content-Length:" `isPrefixOf`) (lines s) of
Just l -> case lastMaybe $ words l of Just l -> case lastMaybe $ words l of
@ -184,13 +193,11 @@ getUrlInfo url uo = case parseURIRelaxed url of
_ -> Nothing _ -> Nothing
_ -> Nothing _ -> Nothing
extractlen = readish . B8.toString <=< firstheader hContentLength extractlen = readish . B8.toString
<=< lookup hContentLength . responseHeaders
extractfilename = contentDispositionFilename . B8.toString extractfilename = contentDispositionFilename . B8.toString
<=< firstheader hContentDisposition <=< lookup hContentDisposition . responseHeaders
firstheader h = headMaybe . map snd .
filter (\p -> fst p == h) . responseHeaders
existsconduit req = do existsconduit req = do
let req' = headRequest (applyRequest uo req) let req' = headRequest (applyRequest uo req)
@ -240,102 +247,25 @@ headRequest r = r
(requestHeaders r) (requestHeaders r)
} }
{- Download a perhaps large file, with auto-resume of incomplete downloads.
-
- Uses wget or curl program for its progress bar and resuming support.
- Which program to use is determined at run time depending on which is
- in path and which works best in a particular situation.
-}
download :: URLString -> FilePath -> UrlOptions -> IO Bool
download = download' False
{- No output to stdout. -}
downloadQuiet :: URLString -> FilePath -> UrlOptions -> IO Bool
downloadQuiet = download' True
download' :: Bool -> URLString -> FilePath -> UrlOptions -> IO Bool
download' quiet url file uo = do
case parseURIRelaxed url of
Just u
| uriScheme u == "file:" -> curl
-- curl is preferred in quiet mode, because
-- it displays http errors to stderr, while wget
-- does not display them in quiet mode
| quiet -> ifM (inPath "curl") (curl, wget)
-- wget is preferred mostly because it has a better
-- progress bar
| otherwise -> ifM (inPath "wget") (wget , curl)
_ -> return False
where
headerparams = map (\h -> Param $ "--header=" ++ h) (reqHeaders uo)
wget = go "wget" $ headerparams ++ quietopt "-q" ++ wgetparams
{- Regular wget needs --clobber to continue downloading an existing
- file. On Android, busybox wget is used, which does not
- support, or need that option.
-
- When the wget version is new enough, pass options for
- a less cluttered download display. Using -nv rather than -q
- avoids most clutter while still displaying http errors.
-}
#ifndef __ANDROID__
wgetparams = concat
[ if BuildInfo.wgetunclutter && not quiet
then [Param "-nv", Param "--show-progress"]
else []
, [ Param "--clobber", Param "-c", Param "-O"]
]
#else
wgetparams = [Param "-c", Param "-O"]
#endif
{- Uses the -# progress display, because the normal
- one is very confusing when resuming, showing
- the remainder to download as the whole file,
- and not indicating how much percent was
- downloaded before the resume. -}
curl = do
-- curl does not create destination file
-- if the url happens to be empty, so pre-create.
unlessM (doesFileExist file) $
writeFile file ""
go "curl" $ headerparams ++ quietopt "-sS" ++
[ Param "-f"
, Param "-L"
, Param "-C", Param "-"
, Param "-#"
, Param "-o"
]
{- Run wget in a temp directory because it has been buggy
- and overwritten files in the current directory, even though
- it was asked to write to a file elsewhere. -}
go cmd opts = withTmpDir "downloadurl" $ \tmp -> do
absfile <- absPath file
let ps = addUserAgent uo $ opts++reqParams uo++[File absfile, File url]
boolSystem' cmd ps $ \p -> p { cwd = Just tmp }
quietopt s
| quiet = [Param s]
| otherwise = []
{- Download a perhaps large file, with auto-resume of incomplete downloads. {- Download a perhaps large file, with auto-resume of incomplete downloads.
- -
- By default, conduit is used for the download, except for file: urls, - By default, conduit is used for the download, except for file: urls,
- which are copied. If the url scheme is not supported by conduit, falls - which are copied. If the url scheme is not supported by conduit, falls
- back to using curl. - back to using curl.
-} -}
downloadC :: MeterUpdate -> URLString -> FilePath -> UrlOptions -> IO Bool download :: MeterUpdate -> URLString -> FilePath -> UrlOptions -> IO Bool
downloadC meterupdate url file uo = go `catchNonAsync` (const $ return False) download meterupdate url file uo = go `catchNonAsync` (const $ return False)
where where
go = case parseURIRelaxed url of go = case parseURIRelaxed url of
Just u -> case parseUrlConduit (show u) of Just u -> case (urlDownloader uo, parseUrlConduit (show u)) of
Just req -> catchJust (DownloadWithConduit, Just req) -> catchJust
-- When http redirects to a protocol which -- When http redirects to a protocol which
-- conduit does not support, it will throw -- conduit does not support, it will throw
-- a StatusCodeException with found302. -- a StatusCodeException with found302.
(matchStatusCodeException (== found302)) (matchStatusCodeException (== found302))
(downloadconduit req) (downloadconduit req)
(const downloadcurl) (const downloadcurl)
Nothing _
| uriScheme u == "file:" -> do | uriScheme u == "file:" -> do
let src = unEscapeString (uriPath u) let src = unEscapeString (uriPath u)
withMeteredFile src meterupdate $ withMeteredFile src meterupdate $
@ -371,7 +301,6 @@ downloadC meterupdate url file uo = go `catchNonAsync` (const $ return False)
dl = runResourceT $ do dl = runResourceT $ do
let req' = req { requestHeaders = resumeFromHeader sz : requestHeaders req } let req' = req { requestHeaders = resumeFromHeader sz : requestHeaders req }
resp <- http req' (httpManager uo) resp <- http req' (httpManager uo)
liftIO $ print ("XXX", responseStatus resp)
if responseStatus resp == partialContent206 if responseStatus resp == partialContent206
then store (BytesProcessed sz) AppendMode resp then store (BytesProcessed sz) AppendMode resp
else if responseStatus resp == ok200 else if responseStatus resp == ok200
@ -387,20 +316,13 @@ downloadC meterupdate url file uo = go `catchNonAsync` (const $ return False)
-- if the url happens to be empty, so pre-create. -- if the url happens to be empty, so pre-create.
unlessM (doesFileExist file) $ unlessM (doesFileExist file) $
writeFile file "" writeFile file ""
let headerparams = map (\h -> Param $ "--header=" ++ h) (reqHeaders uo) let ps = curlParams uo
let opts =
[ Param "-sS" [ Param "-sS"
, Param "-f" , Param "-f"
, Param "-L" , Param "-L"
, Param "-C", Param "-" , Param "-C", Param "-"
, Param "-o"
]
boolSystem "curl" $ addUserAgent uo $ concat
[ headerparams
, opts
, reqParams uo
, [File file, File url]
] ]
boolSystem "curl" (ps ++ [Param "-o", File file, File url])
{- Sinks a Response's body to a file. The file can either be opened in {- Sinks a Response's body to a file. The file can either be opened in
- WriteMode or AppendMode. Updates the meter as data is received. - WriteMode or AppendMode. Updates the meter as data is received.

2
debian/control vendored
View file

@ -83,7 +83,6 @@ Build-Depends:
libimage-magick-perl, libimage-magick-perl,
git (>= 1:1.8.1), git (>= 1:1.8.1),
rsync, rsync,
wget,
curl, curl,
openssh-client, openssh-client,
git-remote-gcrypt (>= 0.20130908-6), git-remote-gcrypt (>= 0.20130908-6),
@ -101,7 +100,6 @@ Section: utils
Depends: ${misc:Depends}, ${shlibs:Depends}, Depends: ${misc:Depends}, ${shlibs:Depends},
git (>= 1:1.8.1), git (>= 1:1.8.1),
rsync, rsync,
wget,
curl, curl,
openssh-client (>= 1:5.6p1) openssh-client (>= 1:5.6p1)
Recommends: Recommends:

View file

@ -1356,8 +1356,11 @@ Here are all the supported configuration settings.
* `annex.web-options` * `annex.web-options`
Options to pass when running wget or curl. Setting this makes git-annex use curl to download urls
For example, to force IPv4 only, set it to "-4" (rather than the default built-in url downloader).
For example, to force IPv4 only, set it to "-4".
Or to make curl use your ~/.netrc file, set it to "--netrc".
* `annex.youtube-dl-options` * `annex.youtube-dl-options`
@ -1387,7 +1390,6 @@ Here are all the supported configuration settings.
* `annex.web-download-command` * `annex.web-download-command`
Use to specify a command to run to download a file from the web. Use to specify a command to run to download a file from the web.
(The default is to use wget or curl.)
In the command line, %url is replaced with the url to download, In the command line, %url is replaced with the url to download,
and %file is replaced with the file that it should be saved to. and %file is replaced with the file that it should be saved to.

View file

@ -37,3 +37,16 @@ supports netrc?
> download a file with resume support using http-conduit. > download a file with resume support using http-conduit.
> It falls back to curl to handle urls that http-conduit does not support. > It falls back to curl to handle urls that http-conduit does not support.
> Now we only have to decide what to do about the above edge cases.. > Now we only have to decide what to do about the above edge cases..
> > Let's drop use of wget entirely, as it was only using it because I
> > preferred wget's progress bar to curl's. The user can still force wget
> > with annex.web-download-command.
> >
> > That leaves users who have a .netrc file or want to use
> > annex.web-options. Since curl requires --netrc in order to use the
> > .netrc file, require users who want to use the .netrc to
> > set "annex.web-options = --netrc". When "annex.web-options" is
> > set, always use curl (unless overridden by annex.web-download-command).
> > Otherwise, use conduit.
[[done]] --[[Joey]]

View file

@ -8,3 +8,11 @@ is gone permanently).
I do not want to encode my credentials into the URLs (eg. I do not want to encode my credentials into the URLs (eg.
username:password@example.com) because my password changes frequently and I would username:password@example.com) because my password changes frequently and I would
have to update all of the URLs. have to update all of the URLs.
> git-annex 6.20180406 and onwards use http-conduit for everything
> by default. To use the .netrc file, run:
git config annex.web-options --netrc
> That will make git-annex use curl for all web accesses, and configures
> curl to use your netrc file. [[done]] --[[Joey]]