git-remote-annex: Support urls like annex::https://example.com/foo-repo

Using the usual url download machinery means these urls can even require
http basic auth, which is prompted for with git-credential. That opens
the possibility of using urls that contain a secret, eg the cipher
for encryption=shared. The user is currently on their own constructing
such an url, but I do think it would work.

Limited to httpalso for now, for security reasons. Since both httpalso
and retrieving this very url are limited by the usual
annex.security.allowed-ip-addresses configs, it's not possible for an
attacker to make one of these urls that sets up a httpalso url that
opens the garage door. That is one class of attacks to keep in mind
with this thing.
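
A minimal sketch of the restriction described above, not the code from this
commit. It assumes the parsed url parameters are available as a plain map of
strings; RemoteParams and checkAllowedFromWeb are hypothetical names used
only for illustration.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for the key=value parameters parsed from an
-- annex:: url. The real code uses its own config types.
type RemoteParams = M.Map String String

-- Accept only configs whose "type" is httpalso. Anything else is
-- rejected, including a config that has no type field at all.
checkAllowedFromWeb :: RemoteParams -> Either String RemoteParams
checkAllowedFromWeb params = case M.lookup "type" params of
    Nothing -> Left "Web URL did not include a type field."
    Just "httpalso" -> Right params
    Just _ -> Left "Web URL can only be used for a httpalso special remote."

main :: IO ()
main = do
    -- Illustrative parameter sets, not working remote configs.
    print (checkAllowedFromWeb (M.fromList [("type", "httpalso")]))
    print (checkAllowedFromWeb (M.fromList [("type", "directory")]))
```

The check is an allow-list rather than a deny-list, so a new special remote
type stays rejected until someone decides it is safe.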

Either a git-config could allow other types of special remotes to be
set up this way, or special remotes could indicate when they are safe.
I do worry that the git-config would encourage users to set it without
thinking through the security implications. One remote config might be
safe to access this way, but another config, for a remote of the same
type, might not be. This will need further thought, and real-world
examples to decide what to do.
Joey Hess 2024-05-30 12:19:46 -04:00
parent 3f33616068
commit 0155abfba4
3 changed files with 103 additions and 33 deletions

@@ -24,6 +24,7 @@ import qualified Git.Version
 import qualified Annex.SpecialRemote as SpecialRemote
 import qualified Annex.Branch
 import qualified Annex.BranchState
+import qualified Annex.Url as Url
 import qualified Types.Remote as Remote
 import qualified Logs.Remote
 import qualified Remote.External
@@ -57,6 +58,7 @@ import Utility.FileMode
 import Network.URI
 import Data.Either
+import Data.Char
 import qualified Data.ByteString as B
 import qualified Data.ByteString.Char8 as B8
 import qualified Data.Map.Strict as M
@@ -65,21 +67,25 @@ import qualified Utility.RawFilePath as R
 import qualified Data.Set as S
 
 run :: [String] -> IO ()
-run (remotename:url:[]) =
-	-- git strips the "annex::" prefix of the url
-	-- when running this command, so add it back
-	let url' = "annex::" ++ url
-	in case parseSpecialRemoteNameUrl remotename url' of
-		Left e -> giveup e
-		Right src -> do
-			repo <- getRepo
-			state <- Annex.new repo
-			Annex.eval state (run' src url')
+run (remotename:url:[]) = do
+	repo <- getRepo
+	state <- Annex.new repo
+	Annex.eval state $
+		resolveSpecialRemoteWebUrl url >>= \case
+			-- git strips the "annex::" prefix of the url
+			-- when running this command, so add it back
+			Nothing -> parseurl ("annex::" ++ url) pure
+			Just url' -> parseurl url' checkAllowedFromSpecialRemoteWebUrl
+  where
+	parseurl u checkallowed =
+		case parseSpecialRemoteNameUrl remotename u of
+			Right src -> checkallowed src >>= run' u
+			Left e -> giveup e
 run (_remotename:[]) = giveup "remote url not configured"
 run _ = giveup "expected remote name and url parameters"
 
-run' :: SpecialRemoteConfig -> String -> Annex ()
-run' src url = do
+run' :: String -> SpecialRemoteConfig -> Annex ()
+run' url src = do
 	sab <- startAnnexBranch
 	whenM (Annex.getRead Annex.debugenabled) $
 		enableDebugOutput
@@ -477,7 +483,36 @@ parseSpecialRemoteUrl url remotename = case parseURI url of
 		let (k, sv) = break (== '=') kv
 		    v = if null sv then sv else drop 1 sv
 		in (Proposed (unEscapeString k), Proposed (unEscapeString v))
 
+-- Handles an url that contains a http address, by downloading
+-- the web page and using it as the full annex:: url.
+-- The passed url has already had "annex::" stripped off.
+resolveSpecialRemoteWebUrl :: String -> Annex (Maybe String)
+resolveSpecialRemoteWebUrl url
+	| "http://" `isPrefixOf` lcurl || "https://" `isPrefixOf` lcurl =
+		Url.withUrlOptionsPromptingCreds $ \uo ->
+			withTmpFile "git-remote-annex" $ \tmp h -> do
+				liftIO $ hClose h
+				Url.download' nullMeterUpdate Nothing url tmp uo >>= \case
+					Left err -> giveup $ url ++ " " ++ err
+					Right () -> liftIO $
+						(headMaybe . lines)
+							<$> readFileStrict tmp
+	| otherwise = return Nothing
+  where
+	lcurl = map toLower url
+
+-- Only some types of special remotes are allowed to come from
+-- resolveSpecialRemoteWebUrl. Throws an error if this one is not.
+checkAllowedFromSpecialRemoteWebUrl :: SpecialRemoteConfig -> Annex SpecialRemoteConfig
+checkAllowedFromSpecialRemoteWebUrl src@(ExistingSpecialRemote {}) = pure src
+checkAllowedFromSpecialRemoteWebUrl src@(SpecialRemoteConfig {}) =
+	case M.lookup typeField (specialRemoteConfig src) of
+		Nothing -> giveup "Web URL did not include a type field."
+		Just t
+			| t == Proposed "httpalso" -> return src
+			| otherwise -> giveup "Web URL can only be used for a httpalso special remote."
+
 getSpecialRemoteUrl :: Remote -> Annex (Maybe String)
 getSpecialRemoteUrl rmt = do
 	rcp <- Remote.configParser (Remote.remotetype rmt)

@@ -10,31 +10,61 @@ git fetch annex::uuid?param=value&param=value...
 
 This is a git remote helper program that allows git to clone,
 pull and push from a git repository that is stored in a git-annex
-special remote.
+special remote with an URL that starts with "annex::"
 
-The format of the remote URL is "annex::" followed by the UUID of the
-special remote, and then followed by all of the configuration parameters of
-the special remote.
+The special remote needs to have a `remote.<name>.url`
+configured to use this. That is set up automatically when git
+cloning from a special remote.
 
-For example, to clone from a directory special remote:
+To make [[git-annex-initremote]](1) and [[git-annex-enableremote]](1)
+configure the url, pass them the `--with-url` option.
 
-    git clone annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/
+Or, to configure an existing special remote with a shorthand URL, run:
 
-But you don't need to generate such an url yourself. Instead, you can use
-the shorthand url of "annex::" with an existing special remote.
+    git config remote.name.url annex::
 
-    git-annex initremote foo type=directory encryption=none directory=/mnt/foo
-    git config remote.foo.url annex::
-    git push foo master
+Once the URL is configured, you can use `git pull`, `git push`, etc
+with the special remote much like with any other git remote.
+But see CONFLICTING PUSHES below for some situations where it behaves
+slightly differently.
 
-Configuring the url like that is automatically done when cloning from a
-special remote. To make [[git-annex-initremote]](1) and
-[[git-annex-enableremote]](1) configure the url, pass them the `--with-url`
-option.
+# URL FORMAT
 
-When using the shorthand "annex::" url, the full url will be displayed
-each time you git pull or push, when it's possible for git-annex to
-determine it.
+This uses an URL that starts with "annex::". There are three forms of such
+URLs:
+
+* Complete URL
+
+  This contains the UUID and all configuration parameters
+  of the special remote that were passed when using
+  `git-annex initremote`.
+
+  For example, to clone from a directory special remote:
+
+        git clone annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/
+
+* Shorthand URL
+
+  This makes it easy to configure an existing special remote with an URL
+  without having to come up with the complete URL.
+
+        annex::
+
+  When using this shorthand URL, the full URL will be displayed each time you
+  git pull or push, when it's possible for git-annex to determine it.
+  (Although in some cases, like the directory special remote, some
+  parameters may be left off of the displayed URL.)
+
+* Web URL
+
+  This URL points at a file on the web, which contains the complete annex::
+  URL.
+
+        annex::https://example.com/foo-repo
+
+  Not all special remotes can be accessed by such an URL,
+  for security reasons. Currently, this is limited to httpalso special
+  remotes.
 
 # CONFLICTING PUSHES
 
@@ -48,13 +78,13 @@ time, for one of the pushes to be overwritten by the other one. In this
 situation, the overwritten push will appear to have succeeded, but pulling
 later will show the true situation.
 
-# HTTP ACCESS
+# HTTPALSO
 
 If the content of a special remote is published via http, a httpalso
 special remote can be initialized, and used to `git clone` and `git fetch`
 over http.
 
-For example, if the directory special remote set up above is published
+For example, a directory special remote named "foo" is published
 at `https://example.com/foo/`, set up the httpalso remote like this
 to access it:
 

@@ -17,3 +17,8 @@ shortener can be used. --[[Joey]]
 > Perhaps it could be limited to safe special remotes. httpalso is surely
 > safe in this context. Would anything else be? Any external special
 > remotes? --[[Joey]]
+
+>> Implemented this, but it was being a bit hard to handle a redirect to an
+>> annex:: url, and in any case with httpalso, the user has a web server
+>> they can host files on. So made the url be downloaded as a file, and
+>> the first line contains the complete annex:: url. [[done]]
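
The "first line of the downloaded file" rule can be sketched with pure
functions. This is a simplified illustration, not the code from the commit:
isWebUrl and firstLineUrl are hypothetical names, the download step is left
out, and the file content is assumed to be already in memory as a String.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)

-- True when the url (with "annex::" already stripped off) looks like a
-- web address. The scheme is compared case-insensitively.
isWebUrl :: String -> Bool
isWebUrl url = any (`isPrefixOf` lcurl) ["http://", "https://"]
  where
    lcurl = map toLower url

-- The first line of the downloaded content is taken as the complete
-- annex:: url. Later lines are ignored; empty content gives Nothing.
firstLineUrl :: String -> Maybe String
firstLineUrl = listToMaybe . lines

main :: IO ()
main = do
    print (isWebUrl "HTTPS://example.com/foo-repo")
    -- Illustrative content only, not a working remote config.
    print (firstLineUrl "annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=httpalso\nignored\n")
```

Reading a plain file this way avoids having to follow a redirect to a
non-http scheme, which is the difficulty the note above mentions.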