prevent http connections to localhost and private ips by default

Security fix!

* git-annex will refuse to download content from http servers on
  localhost, or any private IP addresses, to prevent accidental
  exposure of internal data. This can be overridden with the
  annex.security.allowed-http-addresses setting.
* Since curl's interface does not have a way to prevent it from accessing
  localhost or private IP addresses, curl defaults to not being used
  for url downloads, even if annex.web-options enabled it before.
  Only when annex.security.allowed-http-addresses=all will curl be used.

Since S3 and WebDav use the Manager, the same policies apply to them too.

youtube-dl is not handled yet, and a http proxy configuration can bypass
these checks too. Those cases are still TBD.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
This commit is contained in:
Joey Hess 2018-06-17 13:05:30 -04:00
parent 43bf219a3c
commit b54b2cdc0e
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 141 additions and 63 deletions

View file

@ -1,5 +1,5 @@
{- Url downloading, with git-annex user agent and configured http {- Url downloading, with git-annex user agent and configured http
- headers and curl options. - headers, security restrictions, etc.
- -
- Copyright 2013-2018 Joey Hess <id@joeyh.name> - Copyright 2013-2018 Joey Hess <id@joeyh.name>
- -
@ -16,8 +16,12 @@ module Annex.Url (
import Annex.Common import Annex.Common
import qualified Annex import qualified Annex
import Utility.Url as U import Utility.Url as U
import Utility.IPAddress
import Utility.HttpManagerRestricted
import qualified BuildInfo import qualified BuildInfo
import Network.Socket
defaultUserAgent :: U.UserAgent defaultUserAgent :: U.UserAgent
defaultUserAgent = "git-annex/" ++ BuildInfo.packageversion defaultUserAgent = "git-annex/" ++ BuildInfo.packageversion
@ -34,16 +38,51 @@ getUrlOptions = Annex.getState Annex.urloptions >>= \case
{ Annex.urloptions = Just uo } { Annex.urloptions = Just uo }
return uo return uo
where where
mk = mkUrlOptions mk = do
<$> getUserAgent (urldownloader, manager) <- checkallowedaddr
<*> headers mkUrlOptions
<*> options <$> getUserAgent
<*> liftIO (U.newManager U.managerSettings) <*> headers
<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig) <*> pure urldownloader
<*> pure manager
<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig)
headers = annexHttpHeadersCommand <$> Annex.getGitConfig >>= \case headers = annexHttpHeadersCommand <$> Annex.getGitConfig >>= \case
Just cmd -> lines <$> liftIO (readProcess "sh" ["-c", cmd]) Just cmd -> lines <$> liftIO (readProcess "sh" ["-c", cmd])
Nothing -> annexHttpHeaders <$> Annex.getGitConfig Nothing -> annexHttpHeaders <$> Annex.getGitConfig
options = map Param . annexWebOptions <$> Annex.getGitConfig
checkallowedaddr = words . annexAllowedHttpAddresses <$> Annex.getGitConfig >>= \case
["all"] -> do
-- Only allow curl when all are allowed,
-- as its interface does not allow preventing
-- it from accessing specific IP addresses.
curlopts <- map Param . annexWebOptions <$> Annex.getGitConfig
let urldownloader = if null curlopts
then U.DownloadWithCurl curlopts
else U.DownloadWithConduit
manager <- liftIO $ U.newManager U.managerSettings
return (urldownloader, manager)
allowedaddrs -> do
addrmatcher <- liftIO $
(\l v -> any (\f -> f v) l) . catMaybes
<$> mapM makeAddressMatcher allowedaddrs
-- Default to not allowing access to loopback
-- and private IP addresses to avoid data
-- leakage.
let isallowed addr
| addrmatcher addr = True
| isLoopbackAddress addr = False
| isPrivateAddress addr = False
| otherwise = True
let r = Restriction
{ addressRestriction = \addr ->
if isallowed (addrAddress addr)
then Nothing
else Just (addrConnectionRestricted addr)
}
manager <- liftIO $ U.newManager $
restrictManagerSettings r U.managerSettings
return (U.DownloadWithConduit, manager)
withUrlOptions :: (U.UrlOptions -> Annex a) -> Annex a withUrlOptions :: (U.UrlOptions -> Annex a) -> Annex a
withUrlOptions a = a =<< getUrlOptions withUrlOptions a = a =<< getUrlOptions

View file

@ -7,6 +7,14 @@ git-annex (6.20180622) UNRELEASED; urgency=high
supporting annex.security.allowed-url-schemes across redirects. supporting annex.security.allowed-url-schemes across redirects.
If you used this setting, you may want to instead use annex.web-options If you used this setting, you may want to instead use annex.web-options
to pass options to curl. to pass options to curl.
* git-annex will refuse to download content from http servers on
localhost, or any private IP addresses, to prevent accidental
exposure of internal data. This can be overridden with the
annex.security.allowed-http-addresses setting.
* Since curl's interface does not have a way to prevent it from accessing
localhost or private IP addresses, curl defaults to not being used
for url downloads, even if annex.web-options enabled it before.
Only when annex.security.allowed-http-addresses=all will curl be used.
-- Joey Hess <id@joeyh.name> Wed, 30 May 2018 11:49:08 -0400 -- Joey Hess <id@joeyh.name> Wed, 30 May 2018 11:49:08 -0400

View file

@ -95,6 +95,7 @@ data GitConfig = GitConfig
, annexRetry :: Maybe Integer , annexRetry :: Maybe Integer
, annexRetryDelay :: Maybe Seconds , annexRetryDelay :: Maybe Seconds
, annexAllowedUrlSchemes :: S.Set Scheme , annexAllowedUrlSchemes :: S.Set Scheme
, annexAllowedHttpAddresses :: String
, coreSymlinks :: Bool , coreSymlinks :: Bool
, coreSharedRepository :: SharedRepository , coreSharedRepository :: SharedRepository
, receiveDenyCurrentBranch :: DenyCurrentBranch , receiveDenyCurrentBranch :: DenyCurrentBranch
@ -163,6 +164,8 @@ extractGitConfig r = GitConfig
, annexAllowedUrlSchemes = S.fromList $ map mkScheme $ , annexAllowedUrlSchemes = S.fromList $ map mkScheme $
maybe ["http", "https"] words $ maybe ["http", "https"] words $
getmaybe (annex "security.allowed-url-schemes") getmaybe (annex "security.allowed-url-schemes")
, annexAllowedHttpAddresses = fromMaybe "" $
getmaybe (annex "security.allowed-http-addresses")
, coreSymlinks = getbool "core.symlinks" True , coreSymlinks = getbool "core.symlinks" True
, coreSharedRepository = getSharedRepository r , coreSharedRepository = getSharedRepository r
, receiveDenyCurrentBranch = getDenyCurrentBranch r , receiveDenyCurrentBranch = getDenyCurrentBranch r

View file

@ -18,6 +18,7 @@ module Utility.Url (
Scheme, Scheme,
mkScheme, mkScheme,
allowedScheme, allowedScheme,
UrlDownloader(..),
UrlOptions(..), UrlOptions(..),
defUrlOptions, defUrlOptions,
mkUrlOptions, mkUrlOptions,
@ -37,6 +38,7 @@ module Utility.Url (
import Common import Common
import Utility.Metered import Utility.Metered
import Utility.HttpManagerRestricted
import Network.URI import Network.URI
import Network.HTTP.Types import Network.HTTP.Types
@ -100,19 +102,10 @@ defUrlOptions = UrlOptions
<*> newManager managerSettings <*> newManager managerSettings
<*> pure (S.fromList $ map mkScheme ["http", "https"]) <*> pure (S.fromList $ map mkScheme ["http", "https"])
mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> Manager -> S.Set Scheme -> UrlOptions mkUrlOptions :: Maybe UserAgent -> Headers -> UrlDownloader -> Manager -> S.Set Scheme -> UrlOptions
mkUrlOptions defuseragent reqheaders reqparams manager = mkUrlOptions defuseragent reqheaders urldownloader manager =
UrlOptions useragent reqheaders urldownloader applyrequest manager UrlOptions useragent reqheaders urldownloader applyrequest manager
where where
urldownloader = if null reqparams
#if MIN_VERSION_cryptonite(0,6,0)
then DownloadWithConduit
#else
-- Work around for old cryptonite bug that broke tls.
-- https://github.com/vincenthz/hs-tls/issues/109
then DownloadWithCurl reqparams
#endif
else DownloadWithCurl reqparams
applyrequest = \r -> r { requestHeaders = requestHeaders r ++ addedheaders } applyrequest = \r -> r { requestHeaders = requestHeaders r ++ addedheaders }
addedheaders = uaheader ++ otherheaders addedheaders = uaheader ++ otherheaders
useragent = maybe defuseragent (Just . B8.toString . snd) useragent = maybe defuseragent (Just . B8.toString . snd)
@ -154,6 +147,12 @@ checkPolicy uo u onerr a
hFlush stderr hFlush stderr
return onerr return onerr
unsupportedUrlScheme :: URI -> IO ()
unsupportedUrlScheme u = do
hPutStrLn stderr $
"Unsupported url scheme" ++ show u
hFlush stderr
allowedScheme :: UrlOptions -> URI -> Bool allowedScheme :: UrlOptions -> URI -> Bool
allowedScheme uo u = uscheme `S.member` allowedSchemes uo allowedScheme uo u = uscheme `S.member` allowedSchemes uo
where where
@ -194,31 +193,24 @@ getUrlInfo :: URLString -> UrlOptions -> IO UrlInfo
getUrlInfo url uo = case parseURIRelaxed url of getUrlInfo url uo = case parseURIRelaxed url of
Just u -> checkPolicy uo u dne $ Just u -> checkPolicy uo u dne $
case (urlDownloader uo, parseUrlConduit (show u)) of case (urlDownloader uo, parseUrlConduit (show u)) of
(DownloadWithConduit, Just req) -> catchJust (DownloadWithConduit, Just req) ->
-- When http redirects to a protocol which existsconduit req
-- conduit does not support, it will throw `catchNonAsync` (const $ return dne)
-- a StatusCodeException with found302. (DownloadWithConduit, Nothing)
(matchStatusCodeException (== found302)) | isfileurl u -> existsfile u
(existsconduit req) | otherwise -> do
(const (existscurl u)) unsupportedUrlScheme u
`catchNonAsync` (const $ return dne) return dne
-- http-conduit does not support file:, ftp:, etc urls, (DownloadWithCurl _, _)
-- so fall back to reading files and using curl. | isfileurl u -> existsfile u
_ | otherwise -> existscurl u
| uriScheme u == "file:" -> do
let f = unEscapeString (uriPath u)
s <- catchMaybeIO $ getFileStatus f
case s of
Just stat -> do
sz <- getFileSize' f stat
found (Just sz) Nothing
Nothing -> return dne
| otherwise -> existscurl u
Nothing -> return dne Nothing -> return dne
where where
dne = UrlInfo False Nothing Nothing dne = UrlInfo False Nothing Nothing
found sz f = return $ UrlInfo True sz f found sz f = return $ UrlInfo True sz f
isfileurl u = uriScheme u == "file:"
curlparams = curlParams uo $ curlparams = curlParams uo $
[ Param "-s" [ Param "-s"
, Param "--head" , Param "--head"
@ -266,6 +258,15 @@ getUrlInfo url uo = case parseURIRelaxed url of
-- got a length, it's good -- got a length, it's good
_ | isftp && isJust len -> good _ | isftp && isJust len -> good
_ -> return dne _ -> return dne
existsfile u = do
let f = unEscapeString (uriPath u)
s <- catchMaybeIO $ getFileStatus f
case s of
Just stat -> do
sz <- getFileSize' f stat
found (Just sz) Nothing
Nothing -> return dne
-- Parse eg: attachment; filename="fname.ext" -- Parse eg: attachment; filename="fname.ext"
-- per RFC 2616 -- per RFC 2616
@ -287,10 +288,6 @@ headRequest r = r
} }
{- Download a perhaps large file, with auto-resume of incomplete downloads. {- Download a perhaps large file, with auto-resume of incomplete downloads.
-
- By default, conduit is used for the download, except for file: urls,
- which are copied. If the url scheme is not supported by conduit, falls
- back to using curl.
- -
- Displays error message on stderr when download failed. - Displays error message on stderr when download failed.
-} -}
@ -302,21 +299,19 @@ download meterupdate url file uo =
go = case parseURIRelaxed url of go = case parseURIRelaxed url of
Just u -> checkPolicy uo u False $ Just u -> checkPolicy uo u False $
case (urlDownloader uo, parseUrlConduit (show u)) of case (urlDownloader uo, parseUrlConduit (show u)) of
(DownloadWithConduit, Just req) -> catchJust (DownloadWithConduit, Just req) ->
-- When http redirects to a protocol which downloadconduit req
-- conduit does not support, it will throw (DownloadWithConduit, Nothing)
-- a StatusCodeException with found302. | isfileurl u -> downloadfile u
(matchStatusCodeException (== found302)) | otherwise -> do
(downloadconduit req) unsupportedUrlScheme u
(const downloadcurl) return False
_ (DownloadWithCurl _, _)
| uriScheme u == "file:" -> do | isfileurl u -> downloadfile u
let src = unEscapeString (uriPath u) | otherwise -> downloadcurl
withMeteredFile src meterupdate $
L.writeFile file
return True
| otherwise -> downloadcurl
Nothing -> return False Nothing -> return False
isfileurl u = uriScheme u == "file:"
downloadconduit req = catchMaybeIO (getFileSize file) >>= \case downloadconduit req = catchMaybeIO (getFileSize file) >>= \case
Nothing -> runResourceT $ do Nothing -> runResourceT $ do
@ -369,6 +364,10 @@ download meterupdate url file uo =
let msg = case he of let msg = case he of
HttpExceptionRequest _ (StatusCodeException _ msgb) -> HttpExceptionRequest _ (StatusCodeException _ msgb) ->
B8.toString msgb B8.toString msgb
HttpExceptionRequest _ (InternalException ie) ->
case fromException ie of
Nothing -> show ie
Just (ConnectionRestricted why) -> why
HttpExceptionRequest _ other -> show other HttpExceptionRequest _ other -> show other
_ -> show he _ -> show he
#else #else
@ -401,6 +400,12 @@ download meterupdate url file uo =
, Param "-C", Param "-" , Param "-C", Param "-"
] ]
boolSystem "curl" (ps ++ [Param "-o", File file, File url]) boolSystem "curl" (ps ++ [Param "-o", File file, File url])
downloadfile u = do
let src = unEscapeString (uriPath u)
withMeteredFile src meterupdate $
L.writeFile file
return True
{- Sinks a Response's body to a file. The file can either be opened in {- Sinks a Response's body to a file. The file can either be opened in
- WriteMode or AppendMode. Updates the meter as data is received. - WriteMode or AppendMode. Updates the meter as data is received.

View file

@ -1356,12 +1356,16 @@ Here are all the supported configuration settings.
* `annex.web-options` * `annex.web-options`
Setting this makes git-annex use curl to download urls Options to pass to curl when git-annex uses it to download urls
(rather than the default built-in url downloader). (rather than the default built-in url downloader).
For example, to force IPv4 only, set it to "-4". For example, to force IPv4 only, set it to "-4".
Or to make curl use your ~/.netrc file, set it to "--netrc". Or to make curl use your ~/.netrc file, set it to "--netrc".
Setting this option makes git-annex use curl, but only
when annex.security.allowed-http-addresses is configured in a
specific way. See its documentation.
* `annex.youtube-dl-options` * `annex.youtube-dl-options`
Options to pass to youtube-dl when using it to find the url to download Options to pass to youtube-dl when using it to find the url to download
@ -1393,15 +1397,34 @@ Here are all the supported configuration settings.
The default is "http https". The default is "http https".
Think very carefully before changing this; there are security Think very carefully before changing this; there are security
implications. For example, if it's changed to allow "file" URLs, implications. For example, if it's changed to allow "file" URLs, then
then anyone who can get a commit into your git-annex repository anyone who can get a commit into your git-annex repository could
could add a pointer to a private file located outside that repository, `git-annex addurl` a pointer to a private file located outside that
risking it being copied into the repository and transferred on to other repository, possibly causing it to be copied into your repository
remotes, exposing its content. and transferred on to other remotes, exposing its content.
Some special remotes support their own domain-specific URL Some special remotes support their own domain-specific URL
schemes; those are not affected by this configuration setting. schemes; those are not affected by this configuration setting.
* `annex.security.allowed-http-addresses`
By default, git-annex only makes HTTP connections to public IP addresses;
it will refuse to use HTTP servers on localhost or on a private network.
This setting can override that behavior, allowing access to particular
IP addresses. For example "127.0.0.1 ::1" allows access to localhost
(both IPV4 and IPV6). To allow access to all IP addresses, use "all"
Think very carefully before changing this; there are security
implications. Anyone who can get a commit into your git-annex repository
could `git annex addurl` an url on a private http server, possibly
causing it to be downloaded into your repository transferred to
other remotes, exposing its content.
Note that, since curl's interface does not allow these IP address
restrictions to be enforced, any configuration that enables use of curl
will be ignored unless annex.security.allowed-http-addresses=all.
* `annex.secure-erase-command` * `annex.secure-erase-command`
This can be set to a command that should be run whenever git-annex This can be set to a command that should be run whenever git-annex