git-remote-annex: Support urls like annex::https://example.com/foo-repo

Using the usual url download machinery even allows these urls to require
http basic auth, which is prompted for with git-credential. That opens
the possibility of using urls that contain a secret, eg the cipher
for encryption=shared. The user is currently on their own constructing
such an url, but I do think it would work.

Limited to httpalso for now, for security reasons. Since both httpalso
and the retrieval of this very url are limited by the usual
annex.security.allowed-ip-addresses configs, it's not possible for an
attacker to make one of these urls that sets up a httpalso url that
opens the garage door. That is one class of attacks to keep in mind
with this thing.

It seems that there could be either a git-config that allows other types
of special remotes to be set up this way, or special remotes could
indicate when they are safe. I do worry that the git-config would
encourage users to set it without thinking through the security
implications. One remote config might be safe to access this way, but
another config, for a remote of the same type, might not be. This will need
further thought, and real-world examples, to decide what to do.
Joey Hess 2024-05-30 12:19:46 -04:00
parent 3f33616068
commit 0155abfba4
GPG key ID: DB12DB0FF05F8F38
3 changed files with 103 additions and 33 deletions

@ -24,6 +24,7 @@ import qualified Git.Version
import qualified Annex.SpecialRemote as SpecialRemote
import qualified Annex.Branch
import qualified Annex.BranchState
import qualified Annex.Url as Url
import qualified Types.Remote as Remote
import qualified Logs.Remote
import qualified Remote.External
@ -57,6 +58,7 @@ import Utility.FileMode
import Network.URI
import Data.Either
import Data.Char
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import qualified Data.Map.Strict as M
@ -65,21 +67,25 @@ import qualified Utility.RawFilePath as R
import qualified Data.Set as S
run :: [String] -> IO ()
run (remotename:url:[]) =
-- git strips the "annex::" prefix of the url
-- when running this command, so add it back
let url' = "annex::" ++ url
in case parseSpecialRemoteNameUrl remotename url' of
Left e -> giveup e
Right src -> do
repo <- getRepo
state <- Annex.new repo
Annex.eval state (run' src url')
run (remotename:url:[]) = do
repo <- getRepo
state <- Annex.new repo
Annex.eval state $
resolveSpecialRemoteWebUrl url >>= \case
-- git strips the "annex::" prefix of the url
-- when running this command, so add it back
Nothing -> parseurl ("annex::" ++ url) pure
Just url' -> parseurl url' checkAllowedFromSpecialRemoteWebUrl
where
parseurl u checkallowed =
case parseSpecialRemoteNameUrl remotename u of
Right src -> checkallowed src >>= run' u
Left e -> giveup e
run (_remotename:[]) = giveup "remote url not configured"
run _ = giveup "expected remote name and url parameters"
run' :: SpecialRemoteConfig -> String -> Annex ()
run' src url = do
run' :: String -> SpecialRemoteConfig -> Annex ()
run' url src = do
sab <- startAnnexBranch
whenM (Annex.getRead Annex.debugenabled) $
enableDebugOutput
@ -477,7 +483,36 @@ parseSpecialRemoteUrl url remotename = case parseURI url of
let (k, sv) = break (== '=') kv
v = if null sv then sv else drop 1 sv
in (Proposed (unEscapeString k), Proposed (unEscapeString v))
-- Handles an url that contains a http address, by downloading
-- the web page and using it as the full annex:: url.
-- The passed url has already had "annex::" stripped off.
resolveSpecialRemoteWebUrl :: String -> Annex (Maybe String)
resolveSpecialRemoteWebUrl url
| "http://" `isPrefixOf` lcurl || "https://" `isPrefixOf` lcurl =
Url.withUrlOptionsPromptingCreds $ \uo ->
withTmpFile "git-remote-annex" $ \tmp h -> do
liftIO $ hClose h
Url.download' nullMeterUpdate Nothing url tmp uo >>= \case
Left err -> giveup $ url ++ " " ++ err
Right () -> liftIO $
(headMaybe . lines)
<$> readFileStrict tmp
| otherwise = return Nothing
where
lcurl = map toLower url
-- Only some types of special remotes are allowed to come from
-- resolveSpecialRemoteWebUrl. Throws an error if this one is not.
checkAllowedFromSpecialRemoteWebUrl :: SpecialRemoteConfig -> Annex SpecialRemoteConfig
checkAllowedFromSpecialRemoteWebUrl src@(ExistingSpecialRemote {}) = pure src
checkAllowedFromSpecialRemoteWebUrl src@(SpecialRemoteConfig {}) =
case M.lookup typeField (specialRemoteConfig src) of
Nothing -> giveup "Web URL did not include a type field."
Just t
| t == Proposed "httpalso" -> return src
| otherwise -> giveup "Web URL can only be used for a httpalso special remote."
getSpecialRemoteUrl :: Remote -> Annex (Maybe String)
getSpecialRemoteUrl rmt = do
rcp <- Remote.configParser (Remote.remotetype rmt)

@ -10,31 +10,61 @@ git fetch annex::uuid?param=value&param=value...
This is a git remote helper program that allows git to clone,
pull and push from a git repository that is stored in a git-annex
special remote.
special remote with an URL that starts with "annex::"
The format of the remote URL is "annex::" followed by the UUID of the
special remote, and then followed by all of the configuration parameters of
the special remote.
The special remote needs to have a `remote.<name>.url`
configured to use this. That is set up automatically when git
cloning from a special remote.
For example, to clone from a directory special remote:
To make [[git-annex-initremote]](1) and [[git-annex-enableremote]](1)
configure the url, pass them the `--with-url` option.
git clone annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/
Or, to configure an existing special remote with a shorthand URL, run:
But you don't need to generate such an url yourself. Instead, you can use
the shorthand url of "annex::" with an existing special remote.
git config remote.name.url annex::
git-annex initremote foo type=directory encryption=none directory=/mnt/foo
git config remote.foo.url annex::
git push foo master
Once the URL is configured, you can use `git pull`, `git push`, etc
with the special remote much like with any other git remote.
But see CONFLICTING PUSHES below for some situations where it behaves
slightly differently.
Configuring the url like that is automatically done when cloning from a
special remote. To make [[git-annex-initremote]](1) and
[[git-annex-enableremote]](1) configure the url, pass them the `--with-url`
option.
# URL FORMAT
When using the shorthand "annex::" url, the full url will be displayed
each time you git pull or push, when it's possible for git-annex to
determine it.
This uses an URL that starts with "annex::". There are three forms of such
URLs:
* Complete URL
This contains the UUID and all configuration parameters
of the special remote that were passed when using
`git-annex initremote`.
For example, to clone from a directory special remote:
git clone annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/
* Shorthand URL
This makes it easy to configure an existing special remote with an URL
without having to come up with the complete URL.
annex::
When using this shorthand URL, the full URL will be displayed each time you
git pull or push, when it's possible for git-annex to determine it.
(Although in some cases, like the directory special remote, some
parameters may be left off of the displayed URL.)
* Web URL
This URL points at a file on the web, which contains the complete annex::
URL.
annex::https://example.com/foo-repo
Not all special remotes can be accessed by such an URL,
for security reasons. Currently, this is limited to httpalso special
remotes.
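
The Web URL handling described above can be pictured with a small, pure sketch. This is not the code from git-annex itself: `AnnexUrl`, `parseCompleteUrl`, `checkAllowed` and `resolveFromFile` are hypothetical names made up for illustration, the download step is left out, and no percent-decoding is done. It only assumes what the text says: the first line of the downloaded file is the complete annex:: URL, and only a httpalso special remote is accepted.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe)

-- Hypothetical simplified view of a complete annex:: URL:
-- the UUID, followed by key=value configuration parameters.
data AnnexUrl = AnnexUrl
    { urlUuid :: String
    , urlParams :: [(String, String)]
    } deriving (Show, Eq)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn _ [] = []
splitOn c s = case break (== c) s of
    (a, []) -> [a]
    (a, _ : b) -> a : splitOn c b

-- Parse "annex::uuid?key=value&key=value". Assumption: values are
-- used as-is, without the unescaping a real parser would need.
parseCompleteUrl :: String -> Maybe AnnexUrl
parseCompleteUrl s = do
    rest <- stripPrefix "annex::" s
    let (uuid, query) = break (== '?') rest
    if null uuid
        then Nothing
        else Just (AnnexUrl uuid (map keyValue (splitOn '&' (drop 1 query))))
  where
    keyValue kv =
        let (k, v) = break (== '=') kv
        in (k, drop 1 v)

-- Only type=httpalso is accepted when the URL came from the web.
checkAllowed :: AnnexUrl -> Either String AnnexUrl
checkAllowed u = case lookup "type" (urlParams u) of
    Nothing -> Left "Web URL did not include a type field."
    Just "httpalso" -> Right u
    Just _ -> Left "Web URL can only be used for a httpalso special remote."

-- Take the first line of the downloaded file as the complete URL,
-- then apply the type restriction.
resolveFromFile :: String -> Either String AnnexUrl
resolveFromFile content =
    case listToMaybe (lines content) >>= parseCompleteUrl of
        Nothing -> Left "no complete annex:: URL found on the first line"
        Just u -> checkAllowed u
```

Under these assumptions, a file whose first line has `type=httpalso` resolves to a `Right` value, while one with `type=directory` is rejected with a `Left` error. Unlike this sketch, the real helper also accepts URLs that refer to an existing special remote.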
# CONFLICTING PUSHES
@ -48,13 +78,13 @@ time, for one of the pushes to be overwritten by the other one. In this
situation, the overwritten push will appear to have succeeded, but pulling
later will show the true situation.
# HTTP ACCESS
# HTTPALSO
If the content of a special remote is published via http, a httpalso
special remote can be initialized, and used to `git clone` and `git fetch`
over http.
For example, if the directory special remote set up above is published
For example, a directory special remote named "foo" is published
at `https://example.com/foo/`, set up the httpalso remote like this
to access it:

@ -17,3 +17,8 @@ shortener can be used. --[[Joey]]
> Perhaps it could be limited to safe special remotes. httpalso is surely
> safe in this context. Would anything else be? Any external special
> remotes? --[[Joey]]
>> Implemented this, but it was proving a bit hard to handle a redirect to an
>> annex:: url, and in any case with httpalso, the user has a web server
>> they can host files on. So made the url be downloaded as a file, and
>> the first line contains the complete annex:: url. [[done]]