make it easier to use curl for unusual url schemes

Use curl when annex.security.allowed-url-schemes includes an url scheme not
supported by git-annex internally, as long as
annex.security.allowed-ip-addresses is configured to allow using curl.

Sponsored-by: Luke Shumaker on Patreon
Joey Hess 2022-08-15 12:22:01 -04:00
parent 2fc9a0096f
commit 840bd50390
GPG key ID: DB12DB0FF05F8F38
5 changed files with 37 additions and 4 deletions
Annex
CHANGELOG
Utility
doc
forum/Use_addurl_with_a_file_on_an_HPC_cluster
git-annex.mdwn


@ -1,7 +1,7 @@
{- Url downloading, with git-annex user agent and configured http
- headers, security restrictions, etc.
-
- - Copyright 2013-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2013-2022 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
@ -43,6 +43,7 @@ import Network.Socket
import Network.HTTP.Client
import Network.HTTP.Client.TLS
import Text.Read
+import qualified Data.Set as S
defaultUserAgent :: U.UserAgent
defaultUserAgent = "git-annex/" ++ BuildInfo.packageversion
@ -78,7 +79,8 @@ getUrlOptions = Annex.getState Annex.urloptions >>= \case
checkallowedaddr = words . annexAllowedIPAddresses <$> Annex.getGitConfig >>= \case
["all"] -> do
curlopts <- map Param . annexWebOptions <$> Annex.getGitConfig
-let urldownloader = if null curlopts
+allowedurlschemes <- annexAllowedUrlSchemes <$> Annex.getGitConfig
+let urldownloader = if null curlopts && not (any (`S.notMember` U.conduitUrlSchemes) allowedurlschemes)
then U.DownloadWithConduit $
U.DownloadWithCurlRestricted mempty
else U.DownloadWithCurl curlopts
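
The selection rule in this hunk can be read in isolation as a small pure function. The following is a minimal sketch, not the code in Annex/Url.hs: `Scheme`, `Downloader`, `conduitSchemes`, and `chooseDownloader` are hypothetical stand-ins for the real types, and it assumes annex.security.allowed-ip-addresses is already set to "all" (the only branch in which this choice is made).

```haskell
import qualified Data.Set as S

-- Hypothetical stand-ins for the types in Utility.Url.
newtype Scheme = Scheme String
  deriving (Eq, Ord, Show)

data Downloader = UseConduit | UseCurl [String]
  deriving (Eq, Show)

-- Mirrors the set the diff names conduitUrlSchemes.
conduitSchemes :: S.Set Scheme
conduitSchemes = S.fromList (map Scheme ["http", "https", "ftp"])

-- Stay with the built-in downloader only when no curl options are set
-- and every allowed scheme is in the built-in set; otherwise use curl.
chooseDownloader :: [String] -> S.Set Scheme -> Downloader
chooseDownloader curlopts allowed
  | null curlopts && all (`S.member` conduitSchemes) allowed = UseConduit
  | otherwise = UseCurl curlopts

main :: IO ()
main = do
  let schemes = S.fromList . map Scheme
  print (chooseDownloader [] (schemes ["http", "https"]))  -- UseConduit
  print (chooseDownloader [] (schemes ["http", "sftp"]))   -- UseCurl []
  print (chooseDownloader ["--netrc"] conduitSchemes)      -- UseCurl ["--netrc"]
```

Writing the test as `all (member ...)` is equivalent to the hunk's `not (any (notMember ...))`; the point in both forms is that a single allowed scheme outside the built-in set is enough to switch to curl.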


@ -20,6 +20,9 @@ git-annex (10.20220725) UNRELEASED; urgency=medium
* Added annex.dbdir config which can be used to move sqlite databases
to a different filesystem than the git-annex repo, when the repo is on
a filesystem that sqlite does not work well in.
+* Use curl when annex.security.allowed-url-schemes includes an url
+scheme not supported by git-annex internally, as long as
+annex.security.allowed-ip-addresses is configured to allow using curl.
-- Joey Hess <id@joeyh.name> Mon, 25 Jul 2022 15:35:45 -0400


@ -1,6 +1,6 @@
{- Url downloading.
-
- - Copyright 2011-2021 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2022 Joey Hess <id@joeyh.name>
-
- License: BSD-2-clause
-}
@ -40,6 +40,7 @@ module Utility.Url (
noBasicAuth,
applyBasicAuth',
extractFromResourceT,
+conduitUrlSchemes,
) where
import Common
@ -111,10 +112,13 @@ defUrlOptions = UrlOptions
<*> pure (DownloadWithConduit (DownloadWithCurlRestricted mempty))
<*> pure id
<*> newManager tlsManagerSettings
-<*> pure (S.fromList $ map mkScheme ["http", "https", "ftp"])
+<*> pure conduitUrlSchemes
<*> pure Nothing
<*> pure noBasicAuth
+conduitUrlSchemes :: S.Set Scheme
+conduitUrlSchemes = S.fromList $ map mkScheme ["http", "https", "ftp"]
mkUrlOptions :: Maybe UserAgent -> Headers -> UrlDownloader -> Manager -> S.Set Scheme -> Maybe (URI -> String) -> GetBasicAuth -> UrlOptions
mkUrlOptions defuseragent reqheaders urldownloader =
UrlOptions useragent reqheaders urldownloader applyrequest


@ -0,0 +1,21 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2022-08-15T15:46:40Z"
content="""
git-annex can be used with any url scheme that curl supports, but you have to
configure it to allow using it. See the documentation
of annex.security.allowed-url-schemes in the git-annex man page.
You will also have to set annex.security.allowed-ip-addresses
to "all".
It seems that even with both settings, git-annex still avoids using curl
for unsupported url schemes, unless you also set annex.web-options
to some option used by curl. That forces it to use curl. I set it to
"--netrc". You will probably need to use that option anyway since I think
curl needs configuration in a netrc file to authenticate for sftp.
(I feel that it's a bug that annex.web-options needs to be set to make it
use curl, and I've fixed that in master.)
"""]]


@ -1745,6 +1745,9 @@ Remotes are configured using these settings in `.git/config`.
repository, possibly causing it to be copied into your repository
and transferred on to other remotes, exposing its content.
+Any url schemes supported by curl can be listed here, but you will
+also need to configure annex.security.allowed-ip-addresses to allow using curl.
Some special remotes support their own domain-specific URL
schemes; those are not affected by this configuration setting.