limit url downloads to whitelisted schemes
Security fix! Allowing any scheme, particularly file: and possibly others like scp:, allowed file exfiltration by anyone who had write access to the git repository, since they could add an annexed file using such an url, or using an url that redirected to such an url, and wait for the victim to get it into their repository and send them a copy.

* Added annex.security.allowed-url-schemes setting, which defaults to only allowing http and https URLs. Note especially that file:/ is no longer enabled by default.

* Removed annex.web-download-command, since its interface does not allow supporting annex.security.allowed-url-schemes across redirects. If you used this setting, you may want to instead use annex.web-options to pass options to curl.

With annex.web-download-command removed, nearly all url accesses in git-annex are made via Utility.Url via http-client or curl. http-client only supports http and https, so no problem there. (Disabling one and not the other is not implemented.)

Used curl --proto to limit the allowed url schemes.

Note that this will cause git annex fsck --from web to mark files using a disallowed url scheme as not being present in the web. That seems acceptable; fsck --from web also does that when a web server is not available.

youtube-dl already disabled file: itself (probably for similar reasons). The scheme check was also added to youtube-dl urls for completeness, although that check won't catch any redirects it might follow. But youtube-dl goes off and does its own thing with other protocols anyway, so that's fine.

Special remotes that support other domain-specific url schemes are not affected by this change. In the bittorrent remote, aria2c can still download magnet: links. The download of the .torrent file is otherwise now limited by annex.security.allowed-url-schemes.

This does not address any external special remotes that might download an url themselves. Current thinking is all external special remotes will need to be audited for this problem, although many of them will use http libraries that only support http and not curl's menagerie.

The related problem of accessing private localhost and LAN urls is not addressed by this commit.

This commit was sponsored by Brett Eisenberg on Patreon.
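The allowlist check described above can be sketched as follows. This is a minimal illustration and not the commit's code: `urlScheme` and `schemeAllowed` are hypothetical names, the scheme is taken as the text before the first colon rather than parsed with a URI library, and lower-casing stands in for the case-insensitive comparison, so that only base and containers are needed.

```haskell
import Data.Char (toLower)
import qualified Data.Set as S

-- Hypothetical stand-in for the commit's Scheme type. Lower-casing is an
-- assumption made here to get case-insensitive comparison with base only.
newtype Scheme = Scheme String
  deriving (Eq, Ord, Show)

mkScheme :: String -> Scheme
mkScheme = Scheme . map toLower

-- Default allowlist named in the commit message: http and https only.
defaultSchemes :: S.Set Scheme
defaultSchemes = S.fromList (map mkScheme ["http", "https"])

-- Text before the first ':' of a url; Nothing if there is no scheme.
urlScheme :: String -> Maybe Scheme
urlScheme url = case break (== ':') url of
  (s, ':' : _) | not (null s) -> Just (mkScheme s)
  _ -> Nothing

-- A url is allowed only when its scheme is in the allowlist.
schemeAllowed :: S.Set Scheme -> String -> Bool
schemeAllowed allowed url = maybe False (`S.member` allowed) (urlScheme url)

main :: IO ()
main = mapM_ (print . schemeAllowed defaultSchemes)
  [ "https://example.com/file"  -- allowed by the default set
  , "FILE:///etc/passwd"        -- rejected: file is not in the default set
  , "scp://host/path"           -- rejected
  ]
```

A check like this only sees the url it is given. As the message notes, redirects have to be constrained by the downloader itself, which is why curl is also passed `--proto`.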
parent c8559a0403
commit 28720c795f
16 changed files with 139 additions and 68 deletions
@@ -943,18 +943,8 @@ downloadUrl k p urls file =
 	-- Poll the file to handle configurations where an external
 	-- download command is used.
 	meteredFile file (Just p) k $
-		go =<< annexWebDownloadCommand <$> Annex.getGitConfig
-  where
-	go Nothing = Url.withUrlOptions $ \uo ->
-		liftIO $ anyM (\u -> Url.download p u file uo) urls
-	go (Just basecmd) = anyM (downloadcmd basecmd) urls
-	downloadcmd basecmd url =
-		progressCommand "sh" [Param "-c", Param $ gencmd url basecmd]
-			<&&> liftIO (doesFileExist file)
-	gencmd url = massReplace
-		[ ("%file", shellEscape file)
-		, ("%url", shellEscape url)
-		]
+		Url.withUrlOptions $ \uo ->
+			liftIO $ anyM (\u -> Url.download p u file uo) urls
 
 {- Copies a key's content, when present, to a temp file.
  - This is used to speed up some rsyncs. -}
@@ -39,6 +39,7 @@ getUrlOptions = Annex.getState Annex.urloptions >>= \case
 		<*> headers
 		<*> options
 		<*> liftIO (U.newManager U.managerSettings)
+		<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig)
 	headers = annexHttpHeadersCommand <$> Annex.getGitConfig >>= \case
 		Just cmd -> lines <$> liftIO (readProcess "sh" ["-c", cmd])
 		Nothing -> annexHttpHeaders <$> Annex.getGitConfig
@@ -11,7 +11,7 @@ module Annex.YoutubeDl (
 	youtubeDlSupported,
 	youtubeDlCheck,
 	youtubeDlFileName,
-	youtubeDlFileName',
+	youtubeDlFileNameHtmlOnly,
 ) where
 
 import Annex.Common
@@ -41,8 +41,11 @@ import Control.Concurrent.Async
 -- (Note that we can't use --output to specifiy the file to download to,
 -- due to <https://github.com/rg3/youtube-dl/issues/14864>)
 youtubeDl :: URLString -> FilePath -> Annex (Either String (Maybe FilePath))
-youtubeDl url workdir
-	| supportedScheme url = ifM (liftIO $ inPath "youtube-dl")
+youtubeDl url workdir = withUrlOptions $ youtubeDl' url workdir
+
+youtubeDl' :: URLString -> FilePath -> UrlOptions -> Annex (Either String (Maybe FilePath))
+youtubeDl' url workdir uo
+	| supportedScheme uo url = ifM (liftIO $ inPath "youtube-dl")
 		( runcmd >>= \case
 			Right True -> workdirfiles >>= \case
 				(f:[]) -> return (Right (Just f))
@@ -134,8 +137,11 @@ youtubeDlSupported url = either (const False) id <$> youtubeDlCheck url
 
 -- Check if youtube-dl can find media in an url.
 youtubeDlCheck :: URLString -> Annex (Either String Bool)
-youtubeDlCheck url
-	| supportedScheme url = catchMsgIO $ htmlOnly url False $ do
+youtubeDlCheck = withUrlOptions . youtubeDlCheck'
+
+youtubeDlCheck' :: URLString -> UrlOptions -> Annex (Either String Bool)
+youtubeDlCheck' url uo
+	| supportedScheme uo url = catchMsgIO $ htmlOnly url False $ do
 		opts <- youtubeDlOpts [ Param url, Param "--simulate" ]
 		liftIO $ snd <$> processTranscript "youtube-dl" (toCommand opts) Nothing
 	| otherwise = return (Right False)
@@ -144,18 +150,22 @@ youtubeDlCheck url
 --
 -- (This is not always identical to the filename it uses when downloading.)
 youtubeDlFileName :: URLString -> Annex (Either String FilePath)
-youtubeDlFileName url
-	| supportedScheme url = flip catchIO (pure . Left . show) $
-		htmlOnly url nomedia (youtubeDlFileName' url)
-	| otherwise = return nomedia
+youtubeDlFileName url = withUrlOptions go
   where
+	go uo
+		| supportedScheme uo url = flip catchIO (pure . Left . show) $
+			htmlOnly url nomedia (youtubeDlFileNameHtmlOnly' url uo)
+		| otherwise = return nomedia
 	nomedia = Left "no media in url"
 
 -- Does not check if the url contains htmlOnly; use when that's already
 -- been verified.
-youtubeDlFileName' :: URLString -> Annex (Either String FilePath)
-youtubeDlFileName' url
-	| supportedScheme url = flip catchIO (pure . Left . show) go
+youtubeDlFileNameHtmlOnly :: URLString -> Annex (Either String FilePath)
+youtubeDlFileNameHtmlOnly = withUrlOptions . youtubeDlFileNameHtmlOnly'
+
+youtubeDlFileNameHtmlOnly' :: URLString -> UrlOptions -> Annex (Either String FilePath)
+youtubeDlFileNameHtmlOnly' url uo
+	| supportedScheme uo url = flip catchIO (pure . Left . show) go
 	| otherwise = return nomedia
   where
 	go = do
@@ -189,12 +199,13 @@ youtubeDlOpts addopts = do
 	opts <- map Param . annexYoutubeDlOptions <$> Annex.getGitConfig
 	return (opts ++ addopts)
 
-supportedScheme :: URLString -> Bool
-supportedScheme url = case uriScheme <$> parseURIRelaxed url of
+supportedScheme :: UrlOptions -> URLString -> Bool
+supportedScheme uo url = case parseURIRelaxed url of
 	Nothing -> False
-	-- avoid ugly message from youtube-dl about not supporting file:
-	Just "file:" -> False
-	-- ftp indexes may look like html pages, and there's no point
-	-- involving youtube-dl in a ftp download
-	Just "ftp:" -> False
-	Just _ -> True
+	Just u -> case uriScheme u of
+		-- avoid ugly message from youtube-dl about not supporting file:
+		"file:" -> False
+		-- ftp indexes may look like html pages, and there's no point
+		-- involving youtube-dl in a ftp download
+		"ftp:" -> False
+		_ -> allowedScheme uo u
CHANGELOG (12 lines changed)
@@ -1,3 +1,15 @@
+git-annex (6.20180622) UNRELEASED; urgency=high
+
+  * Added annex.security.allowed-url-schemes setting, which defaults
+    to only allowing http and https URLs. Note especially that file:/
+    is no longer enabled by default. This is a security fix.
+  * Removed annex.web-download-command, since its interface does not allow
+    supporting annex.security.allowed-url-schemes across redirects.
+    If you used this setting, you may want to instead use annex.web-options
+    to pass options to curl.
+
+ -- Joey Hess <id@joeyh.name>  Wed, 30 May 2018 11:49:08 -0400
+
 git-annex (6.20180530) UNRELEASED; urgency=medium
 
   * Fix build with ghc 8.4+, which broke due to the Semigroup Monoid change.
@@ -277,7 +277,7 @@ downloadWeb o url urlinfo file =
 	-- Ask youtube-dl what filename it will download
 	-- first, and check if that is already an annexed file,
 	-- to avoid unnecessary work in that case.
-	| otherwise = youtubeDlFileName' url >>= \case
+	| otherwise = youtubeDlFileNameHtmlOnly url >>= \case
 		Right dest -> ifAnnexed dest
 			(alreadyannexed dest)
 			(dl dest)
NEWS (11 lines changed)
@@ -1,3 +1,14 @@
+git-annex (6.20180622) upstream; urgency=high
+
+  A security fix has changed git-annex to only support http and https
+  URL schemes by default. You can enable other URL schemes, at your own risk,
+  using annex.security.allowed-url-schemes.
+
+  The annex.web-download-command configuration has been removed,
+  use annex.web-options instead.
+
+ -- Joey Hess <id@joeyh.name>  Fri, 15 Jun 2018 17:54:23 -0400
+
 git-annex (6.20180309) upstream; urgency=medium
 
   Note that, due to not using rsync to transfer files over ssh
Test.hs (6 lines changed)
@@ -1714,10 +1714,12 @@ test_add_subdirs = intmpclonerepo $ do
 test_addurl :: Assertion
 test_addurl = intmpclonerepo $ do
 	-- file:// only; this test suite should not hit the network
+	let filecmd c ps = git_annex c ("-cannex.security.allowed-url-schemes=file" : ps)
 	f <- absPath "myurl"
 	let url = replace "\\" "/" ("file:///" ++ dropDrive f)
 	writeFile f "foo"
-	git_annex "addurl" [url] @? ("addurl failed on " ++ url)
+	not <$> git_annex "addurl" [url] @? "addurl failed to fail on file url"
+	filecmd "addurl" [url] @? ("addurl failed on " ++ url)
 	let dest = "addurlurldest"
-	git_annex "addurl" ["--file", dest, url] @? ("addurl failed on " ++ url ++ " with --file")
+	filecmd "addurl" ["--file", dest, url] @? ("addurl failed on " ++ url ++ " with --file")
 	doesFileExist dest @? (dest ++ " missing after addurl --file")
@@ -33,8 +33,10 @@ import Config.DynamicConfig
 import Utility.HumanTime
 import Utility.Gpg (GpgCmd, mkGpgCmd)
 import Utility.ThreadScheduler (Seconds(..))
+import Utility.Url (Scheme, mkScheme)
 
 import Control.Concurrent.STM
+import qualified Data.Set as S
 
 -- | A configurable value, that may not be fully determined yet because
 -- the global git config has not yet been loaded.
@@ -71,7 +73,6 @@ data GitConfig = GitConfig
 	, annexWebOptions :: [String]
 	, annexYoutubeDlOptions :: [String]
 	, annexAriaTorrentOptions :: [String]
-	, annexWebDownloadCommand :: Maybe String
 	, annexCrippledFileSystem :: Bool
 	, annexLargeFiles :: Maybe String
 	, annexAddSmallFiles :: Bool
@@ -93,6 +94,7 @@ data GitConfig = GitConfig
 	, annexSecureHashesOnly :: Bool
 	, annexRetry :: Maybe Integer
 	, annexRetryDelay :: Maybe Seconds
+	, annexAllowedUrlSchemes :: S.Set Scheme
 	, coreSymlinks :: Bool
 	, coreSharedRepository :: SharedRepository
 	, receiveDenyCurrentBranch :: DenyCurrentBranch
@@ -133,7 +135,6 @@ extractGitConfig r = GitConfig
 	, annexWebOptions = getwords (annex "web-options")
 	, annexYoutubeDlOptions = getwords (annex "youtube-dl-options")
 	, annexAriaTorrentOptions = getwords (annex "aria-torrent-options")
-	, annexWebDownloadCommand = getmaybe (annex "web-download-command")
 	, annexCrippledFileSystem = getbool (annex "crippledfilesystem") False
 	, annexLargeFiles = getmaybe (annex "largefiles")
 	, annexAddSmallFiles = getbool (annex "addsmallfiles") True
@@ -159,6 +160,9 @@ extractGitConfig r = GitConfig
 	, annexRetry = getmayberead (annex "retry")
 	, annexRetryDelay = Seconds
 		<$> getmayberead (annex "retrydelay")
+	, annexAllowedUrlSchemes = S.fromList $ map mkScheme $
+		maybe ["http", "https"] words $
+			getmaybe (annex "security.allowed-url-schemes")
 	, coreSymlinks = getbool "core.symlinks" True
 	, coreSharedRepository = getSharedRepository r
 	, receiveDenyCurrentBranch = getDenyCurrentBranch r
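How the setting's value turns into a set of schemes can be sketched in isolation. This is an illustrative approximation rather than the code in the diff: `parseAllowedSchemes` is a hypothetical name, the raw config value is modelled as a `Maybe String`, and lower-casing replaces the case-insensitive `Scheme` type so that the example needs only base and containers.

```haskell
import Data.Char (toLower)
import qualified Data.Set as S

-- Hypothetical helper: the raw config value is Nothing when unset.
-- Unset falls back to http and https; a set value is split on whitespace.
parseAllowedSchemes :: Maybe String -> S.Set String
parseAllowedSchemes =
  S.fromList . map (map toLower) . maybe ["http", "https"] words

main :: IO ()
main = do
  print (S.toList (parseAllowedSchemes Nothing))
  print (S.toList (parseAllowedSchemes (Just "http https file")))
  -- An explicitly empty value yields an empty set, so nothing is allowed.
  print (S.toList (parseAllowedSchemes (Just "")))
```

Under this reading, a configured value replaces the default rather than extending it, so a value of just "file" would no longer permit http or https.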
@@ -15,6 +15,9 @@ module Utility.Url (
 	managerSettings,
 	URLString,
 	UserAgent,
+	Scheme,
+	mkScheme,
+	allowedScheme,
 	UrlOptions(..),
 	defUrlOptions,
 	mkUrlOptions,
@@ -41,6 +44,7 @@ import qualified Data.CaseInsensitive as CI
 import qualified Data.ByteString as B
 import qualified Data.ByteString.UTF8 as B8
 import qualified Data.ByteString.Lazy as L
+import qualified Data.Set as S
 import Control.Monad.Trans.Resource
 import Network.HTTP.Conduit
 import Network.HTTP.Client (brRead, withResponse)
@@ -65,12 +69,22 @@ type Headers = [String]
 
 type UserAgent = String
 
+newtype Scheme = Scheme (CI.CI String)
+	deriving (Eq, Ord)
+
+mkScheme :: String -> Scheme
+mkScheme = Scheme . CI.mk
+
+fromScheme :: Scheme -> String
+fromScheme (Scheme s) = CI.original s
+
 data UrlOptions = UrlOptions
 	{ userAgent :: Maybe UserAgent
 	, reqHeaders :: Headers
 	, urlDownloader :: UrlDownloader
 	, applyRequest :: Request -> Request
 	, httpManager :: Manager
+	, allowedSchemes :: S.Set Scheme
 	}
 
 data UrlDownloader
@@ -84,8 +98,9 @@ defUrlOptions = UrlOptions
 	<*> pure DownloadWithConduit
 	<*> pure id
 	<*> newManager managerSettings
+	<*> pure (S.fromList $ map mkScheme ["http", "https"])
 
-mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> Manager -> UrlOptions
+mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> Manager -> S.Set Scheme -> UrlOptions
 mkUrlOptions defuseragent reqheaders reqparams manager =
 	UrlOptions useragent reqheaders urldownloader applyrequest manager
   where
@@ -115,7 +130,7 @@ mkUrlOptions defuseragent reqheaders reqparams manager =
 			_ -> (h', B8.fromString v)
 
 curlParams :: UrlOptions -> [CommandParam] -> [CommandParam]
-curlParams uo ps = ps ++ uaparams ++ headerparams ++ addedparams
+curlParams uo ps = ps ++ uaparams ++ headerparams ++ addedparams ++ schemeparams
   where
 	uaparams = case userAgent uo of
 		Nothing -> []
@@ -124,6 +139,25 @@ curlParams uo ps = ps ++ uaparams ++ headerparams ++ addedparams
 	addedparams = case urlDownloader uo of
 		DownloadWithConduit -> []
 		DownloadWithCurl l -> l
+	schemeparams =
+		[ Param "--proto"
+		, Param $ intercalate "," ("-all" : schemelist)
+		]
+	schemelist = map fromScheme $ S.toList $ allowedSchemes uo
+
+checkPolicy :: UrlOptions -> URI -> a -> IO a -> IO a
+checkPolicy uo u onerr a
+	| allowedScheme uo u = a
+	| otherwise = do
+		hPutStrLn stderr $
+			"Configuration does not allow accessing " ++ show u
+		hFlush stderr
+		return onerr
+
+allowedScheme :: UrlOptions -> URI -> Bool
+allowedScheme uo u = uscheme `S.member` allowedSchemes uo
+  where
+	uscheme = mkScheme $ takeWhile (/=':') (uriScheme u)
 
 {- Checks that an url exists and could be successfully downloaded,
  - also checking that its size, if available, matches a specified size. -}
@@ -158,7 +192,8 @@ assumeUrlExists = UrlInfo True Nothing Nothing
  - also returning its size and suggested filename if available. -}
 getUrlInfo :: URLString -> UrlOptions -> IO UrlInfo
 getUrlInfo url uo = case parseURIRelaxed url of
-	Just u -> case (urlDownloader uo, parseUrlConduit (show u)) of
+	Just u -> checkPolicy uo u dne $
+		case (urlDownloader uo, parseUrlConduit (show u)) of
 		(DownloadWithConduit, Just req) -> catchJust
 			-- When http redirects to a protocol which
 			-- conduit does not support, it will throw
@@ -166,7 +201,7 @@ getUrlInfo url uo = case parseURIRelaxed url of
 			(matchStatusCodeException (== found302))
 			(existsconduit req)
 			(const (existscurl u))
-			`catchNonAsync` (const dne)
+			`catchNonAsync` (const $ return dne)
 		-- http-conduit does not support file:, ftp:, etc urls,
 		-- so fall back to reading files and using curl.
 		_
@@ -177,11 +212,11 @@ getUrlInfo url uo = case parseURIRelaxed url of
 					Just stat -> do
 						sz <- getFileSize' f stat
 						found (Just sz) Nothing
-					Nothing -> dne
+					Nothing -> return dne
 			| otherwise -> existscurl u
-	Nothing -> dne
+	Nothing -> return dne
   where
-	dne = return $ UrlInfo False Nothing Nothing
+	dne = UrlInfo False Nothing Nothing
 	found sz f = return $ UrlInfo True sz f
 
 	curlparams = curlParams uo $
@@ -213,7 +248,7 @@ getUrlInfo url uo = case parseURIRelaxed url of
 			then found
 				(extractlen resp)
 				(extractfilename resp)
-			else dne
+			else return dne
 
 	existscurl u = do
 		output <- catchDefaultIO "" $
@@ -230,7 +265,7 @@ getUrlInfo url uo = case parseURIRelaxed url of
 			-- don't try to parse ftp status codes; if curl
 			-- got a length, it's good
 			_ | isftp && isJust len -> good
-			_ -> dne
+			_ -> return dne
 
 	-- Parse eg: attachment; filename="fname.ext"
 	-- per RFC 2616
@@ -265,7 +300,8 @@ download meterupdate url file uo =
 		`catchNonAsync` showerr
   where
 	go = case parseURIRelaxed url of
-		Just u -> case (urlDownloader uo, parseUrlConduit (show u)) of
+		Just u -> checkPolicy uo u False $
+			case (urlDownloader uo, parseUrlConduit (show u)) of
 			(DownloadWithConduit, Just req) -> catchJust
 				-- When http redirects to a protocol which
 				-- conduit does not support, it will throw
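The `--proto` argument built in `curlParams` can be sketched on its own. The following is a simplified illustration, not the module's code: `protoArgs` is a hypothetical name, and plain strings replace the `CommandParam` and `Scheme` types so that only base and containers are needed.

```haskell
import Data.List (intercalate)
import qualified Data.Set as S

-- Hypothetical helper mirroring the shape of schemeparams in the diff:
-- start from "-all" to deny every protocol, then list the allowed ones.
protoArgs :: S.Set String -> [String]
protoArgs allowed =
  [ "--proto"
  , intercalate "," ("-all" : S.toList allowed)
  ]

main :: IO ()
main = putStrLn (unwords (protoArgs (S.fromList ["https", "http"])))
-- prints: --proto -all,http,https
```

The result has the same form as the `curl --proto -all,http,https` invocation used in the bittorrent remote script changed by this commit. Because curl enforces the restriction itself, it applies to whatever curl is asked to fetch, whereas the scheme check in `checkPolicy` only inspects the url it is handed.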
@@ -1387,12 +1387,20 @@ Here are all the supported configuration settings.
   If set, the command is run and each line of its output is used as a HTTP
   header. This overrides annex.http-headers.
 
-* `annex.web-download-command`
+* `annex.security.allowed-url-schemes`
 
-  Use to specify a command to run to download a file from the web.
+  List of URL schemes that git-annex is allowed to download content from.
+  The default is "http https".
 
-  In the command line, %url is replaced with the url to download,
-  and %file is replaced with the file that it should be saved to.
+  Think very carefully before changing this; there are security
+  implications. For example, if it's changed to allow "file" URLs,
+  then anyone who can get a commit into your git-annex repository
+  could add a pointer to a private file located outside that repository,
+  risking it being copied into the repository and transferred on to other
+  remotes, exposing its content.
+
+  Some special remotes support their own domain-specific URL
+  schemes; those are not affected by this configuration setting.
 
 * `annex.secure-erase-command`
 
@@ -123,7 +123,7 @@ while read line; do
 			url="$2"
 			# List contents of torrent.
 			tmp=$(mktemp)
-			if ! runcmd curl -o "$tmp" "$url"; then
+			if ! runcmd curl --proto -all,http,https -o "$tmp" "$url"; then
 				echo CHECKURL-FAILURE
 			else
 				oldIFS="$IFS"
@@ -166,7 +166,7 @@ while read line; do
 						echo TRANSFER-FAILURE RETRIEVE "$key" "no known torrent urls for this key"
 					else
 						tmp=$(mktemp)
-						if ! runcmd curl -o "$tmp" "$url"; then
+						if ! runcmd curl --proto -all,http,https -o "$tmp" "$url"; then
 							echo TRANSFER-FAILURE RETRIEVE "$key" "failed downloading torrent file from $url"
 						else
 							filenum="$(echo "$url" | sed 's/(.*#\(\d*\)/\1/')"
@@ -6,6 +6,6 @@ See [[tips/using_the_web_as_a_special_remote]] for usage examples.
 Currently git-annex only supports downloading content from the web;
 it cannot upload to it or remove content.
 
-This special remote uses arbitrary urls on the web as the source for content.
+This special remote uses urls on the web as the source for content.
 git-annex can also download content from a normal git remote, accessible by
 http.
@@ -5,4 +5,7 @@
  date="2013-08-17T08:59:11Z"
  content="""
 When it says \"arbitrary urls\", it means it. The only requirement is that the url be well formed and that wget or whatever command you have it configured to use via annex.web-download-command knows how to download it.
+
+Update 2018: That used to be the case, but it's now limited by default to
+http and https urls.
 """]]
@@ -11,12 +11,6 @@ The first step is to install the Firefox plugin
 [FlashGot](http://flashgot.net/). We will use it to provide the Firefox
 shortcuts to add things to our annex.
 
-We also need a normal download manager, if we want to get status updates as
-the download is done. We'll need to configure git-annex to use it by
-setting `annex.web-download-command` as Joey describes in his comment on
-[[todo/wishlist: allow configuration of downloader for addurl]]. See the
-manpage [[git-annex]] for more information on setting configuration.
-
 Once we have installed all that, we need a script that has an interface
 which FlashGot can treat as a downloader, but which calls git-annex to do
 the actual downloading. Such a script is available from
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="http://joeyh.name/"
- nickname="joey"
- subject="comment 1"
- date="2013-04-11T20:16:02Z"
- content="""
-As of my last commit, you don't really need a separate download manager. The webapp will now display urls that `git annex addurl` is downloading in among the other transfers.
-"""]]
@@ -22,3 +22,10 @@ what about the other settings, is it okay to hardcode those?
 
 maybe this would be easier if there would be an options override just
 like rsync, but separate ones for curl and wget... --[[anarcat]]
+
+> git-annex now only uses curl, and defaults to a built-in http downloader.
+> The annex.web-download-command is no longer supported. annex.web-options
+> can be used to pass options to curl.
+>
+> So, I don't think this todo is relevant anymore, closing [[done]].
+> --[[Joey]]