This commit is contained in:
Joey Hess 2018-06-17 16:13:45 -04:00
parent e62c4543c3
commit 8703fdd3b7
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
4 changed files with 265 additions and 0 deletions

This is a security hole that allows exposure of
private data in files located outside the git-annex repository.
The attacker needs to have control over one of the remotes of the git-annex
repository. For example, they may provide a public git-annex repository
that the victim clones. Or the victim may have paired repositories with
them. Or, equivalently, the attacker could have read access to the victim's
git-annex repository (eg on a server somewhere), and some channel to get
commits into it (eg a pull request).
The attacker does `git-annex addurl --relaxed file:///etc/passwd` and commits
this to the repository in some out of the way place. Then they wait for the
victim to pull the change.
The easiest exploit is when the victim is running the assistant, or is
periodically doing `git annex sync --content`. The victim may also perform
the equivalent actions manually, not noticing they're operating on the
file.
In either case, git-annex gets the content of the annexed file, following
the file:// url. Then git-annex transfers the content back to the
attacker's repository.
Note that the victim can always detect this attack after the fact, since
even if the attacker deletes the file, the git history will still
contain it.
It may also be possible to exploit scp:// sftp:// smb:// etc urls to get at
files on other computers that the user has access to, as well as on localhost.
I was not able to get curl to download from scp:// or sftp:// on debian
(unstable) but there may be configurations that allow it.
If the url is attached to a key using a cryptographic hash, then the
attacker would need to already know at least the hash of the content
to exploit this.
Sending that content back to them might not be considered a security
hole in itself. Except that I can guess what content some file on your system
might have, and use this attack as an oracle to determine whether I guessed
right, and work toward penetrating your system using that information.
So, best to not treat addurl with a hash any differently than
--relaxed and --fast when addressing this hole.
----
The fix must involve locking down the set of allowed url schemes.
Better to have a whitelist than a blacklist. http and https seem like the
right default.
Some users do rely on file:// urls, and this is fine in some use cases,
eg when you're not merging changes from anyone else.
So this probably needs to be a git config of allowed url schemes,
with an appropriately scary name, like `annex.security.allowed-url-schemes`.
Redirects from one url scheme to another could be used to bypass such a
whitelist. curl's `--proto` also affects redirects. http-conduit
is limited to only http and will probably remain that way.
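The whitelist check itself is small: parse out the scheme and compare it
against the configured set. A minimal sketch in Python (illustrative only;
git-annex is written in Haskell, and the default set below follows the
http/https default proposed above):

```python
from urllib.parse import urlparse

# Default mirrors the proposed annex.security.allowed-url-schemes setting.
ALLOWED_SCHEMES = frozenset({"http", "https"})

def scheme_allowed(url, allowed=ALLOWED_SCHEMES):
    """Return True only when the url's scheme is on the whitelist."""
    return urlparse(url).scheme.lower() in allowed
```

A user who does rely on file:// urls would add file to the allowed set via
the config.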
> done in [[!commit 28720c795ff57a55b48e56d15f9b6bcb977f48d9]] --[[Joey]]
----
The same kind of attack can be used to see the content of
localhost urls and other non-globally-available urls.
Redirects and DNS rebinding attacks mean that checking the IP address
of the hostname in the url is not good enough. It needs to check the IP
address that is actually connected to, for each http connection made,
including redirects.
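Concretely, that means resolving the hostname once, filtering the resolved
addresses, and then connecting only to an address that passed the filter. A
sketch of the resolution step in Python (illustrative only; git-annex is
Haskell, and `is_global` here is just a stand-in for whatever policy the
config ends up expressing):

```python
import ipaddress
import socket

def vetted_addresses(host, port=80):
    """Resolve host and return only its globally-routable addresses.
    The caller must then connect to one of these vetted IPs directly;
    re-resolving the hostname at connect time would reopen the DNS
    rebinding window."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [addr for addr in {info[4][0] for info in infos}
            if ipaddress.ip_address(addr).is_global]
```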
There will need to be a config to relax these checks, again with an
appropriately scary name, like
`annex.security.allowed-http-addresses`. Users will want to enable
support for specific subnets, or perhaps allow all addresses.
When git-annex is configured to use curl, there seems to be no way
to prevent curl from following urls that redirect to localhost, other than
disabling redirections. And unless git-annex also pre-resolves the IP
address and rewrites it into the url passed to curl, DNS rebinding attacks
would still be possible. Also, one of the remaining reasons people enable
curl is to use a netrc file with passwords, and the content of
urls on those sites could also be exposed by this attack. So, it seems curl
should not be enabled by default and should have a big security warning if
it's supported at all. Probably ought to only enable it
when `annex.security.allowed-http-addresses=all`
http-client does not have hooks that can be used to find out what IP
address it's connecting to in a secure enough way.
See <https://github.com/snoyberg/http-client/issues/354>
Seems that building my own http Manager is the only way to go. By building
my own, I can do the IP address checks inside it when it's setting up
connections. And, the same manager can be passed to the S3 and WebDav libraries.
(The url scheme checks could also be moved into that Manager, to prevent
S3 redirect to file: url scenarios.)
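The idea of such a restricted Manager can be sketched as a connect function
that applies the address check at socket-setup time, so every connection it
makes is covered, redirects included (Python for illustration only; the real
code would hook into http-client's connection setup):

```python
import socket

def open_restricted(host, port, addr_allowed):
    """Open a TCP connection only to an address that passes addr_allowed.
    Doing the check here, at connection setup, means every request made
    through the manager is covered, including connections for redirects."""
    last_err = OSError("no allowed address for %r" % host)
    for family, stype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        if not addr_allowed(sockaddr[0]):
            continue
        sock = socket.socket(family, stype, proto)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as e:
            sock.close()
            last_err = e
    raise last_err
```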
> restricted http manager done and used in
> [[!commit b54b2cdc0ef1373fc200c0d28fded3c04fd57212]];
> curl also disabled by default
http proxies are another problem. They could be on the local network,
or even on localhost, and http-client does not provide a way to force
a http proxy to connect to a particular IP address (is that even possible?)
May have to disable use of http proxies unless
`annex.security.allowed-http-addresses=all`
Or better, find what http proxy is enabled and prevent using it if it's on
an IP not allowed there.
> TODO
----
The external special remote interface is another way to exploit this.
Since it bypasses git-annex's usual url download code, whatever fixes are
put in place there won't help external special remotes.
External special remotes that use GETURLS, typically in conjunction with
CLAIMURL and CHECKURL, and then themselves download the content of urls
in response to a TRANSFER RETRIEVE will have the same problems as
git-annex's url downloading.
> Checked all known external special remotes
> to see if they used GETURLS; only one that did is ipfs, which doesn't use
> http so is ok.
> TODO document this security issue in special remote protocol page
An external special remote might also make a simple http request to a
key/value API to download a key, and follow a redirect to file:///
or http://localhost/.
If the key uses a cryptographic hash, git-annex verifies the content.
But, the attacker could have committed a key that doesn't
use a hash. Also, the attacker could use the hash check as an oracle.
What if an external special remote uses https to a hardcoded server (either
fully hardcoded, or configured by the user when they enable the remote),
and uses a http library that only supports https and http (not file:///).
Is that good enough? An attacker would need to either compromise the server
to redirect to a local private webserver, or MITM, and https should prevent
the MITM.
This seems to require auditing of all external special remotes.
git-annex could add a new command to the external protocol, that asks
the special remote if it's been audited, and refuse to use ones that have
not.
I have emailed all relevant external special remote authors a heads-up and
some questions.
> TODO
----
Built-in special remotes that use protocols on top of http, eg S3 and WebDAV,
don't use Utility.Url, could also be exploited, and will need to be fixed
separately.
> not affected for url schemes, because http-client only supports http,
> not file:/
> done for localhost/lan in [[!commit b54b2cdc0ef1373fc200c0d28fded3c04fd57212]]
youtube-dl
> already disabled file:/. Added a scheme check, but it won't block
> redirects to other schemes. However, youtube-dl goes off and does its own thing with other
> protocols anyway, so that's fine.
>
> The youtube-dl generic extractor will download media files (including
> videos and photos) if passed a direct url to them. It does not seem to
> extract video etc from tags on html pages.
> git-annex first checks if a web page
> is html before pointing youtube-dl at it, to avoid using it to download
> direct urls to media files. But that would not prevent a DNS rebinding
> attack which made git-annex see the html page, and youtube-dl then see
> a media file on localhost.
>
> Also, the current code in htmlOnly
> runs youtube-dl if the html page download test fails to download
> anything.
>
> Also, in the course of a download, youtube-dl could chain to other urls,
> depending on how its extractor works. Those urls could redirect to
> a localhost/lan web server.
>
> So, youtube-dl needs to be disabled unless http address security checks
> are turned off.
>
> > done in [[!commit e62c4543c31a61186ebf2e4e0412df59fc8630c8]]
glacier
> This special remote uses glacier-cli, which will need to be audited.
> Emailed basak.
> TODO

----
I'm writing this on a private branch, it won't be posted until a week from
now when the security hole is disclosed.
Security is not compositional. You can have one good feature, and add
another good feature, and the result is not two good features, but a new
security hole. In this case
[[bugs/security_hole_private_data_exposure_via_addurl]]. And it can be hard
to spot this kind of security hole, but then once it's known it
seems blindingly obvious.
It came to me last night and by this morning I had decided the potential
impact was large enough to do a coordinated disclosure. Spent the first
half of the day thinking through ways to fix it that don't involve writing
my own http library. Then started getting in touch with all the
distributions' security teams. And then coded up a fairly complete fix for
the worst part of the security hole, although a secondary part is going to
need considerably more work.
It looks like the external special remotes are going to need at least some
security review too, and I'm still thinking that part of the problem over.
Exhausted.
Today's work was sponsored by Trenton Cronholm
[on Patreon](https://patreon.com/joeyh).
[[!meta date="Jun 15 2018 7:00 pm"]]

----
Most of the day was spent staring at the http-client source code and trying
to find a way to add the IP address checks to it that I need to fully close
the security hole.
In the end, I did find a way, with the duplication of a couple dozen lines
of code from http-client. It will let the security fix be used with
libraries like aws and DAV that build on top of http-client, too.
While the code is in git-annex for now, it's fully disconnected and
would also be useful if a web browser were implemented in Haskell,
to implement same-origin restrictions while avoiding DNS rebinding attacks.
Looks like http proxies and curl will need to be disabled by default,
since this fix can't support either of them securely. I wonder how web
browsers deal with http proxies, DNS rebinding attacks and same-origin?
I can't think of a secure way.
Next I need a function that checks if an IP address is a link-local address
or a private network address. For both ipv4 and ipv6. Could not find
anything handy on hackage, so I'm gonna have to stare at some RFCs. Perhaps
this evening, for now, it's time to swim in the river.
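For reference, the classification needed here corresponds to a handful of
well-known reserved ranges (RFC 1918 private networks, RFC 3927/RFC 4291
link-local, loopback, and their IPv6 counterparts). A sketch in Python, where
the standard ipaddress module already encodes these ranges (a Haskell version
has to encode them by hand, hence the RFC-staring):

```python
import ipaddress

def is_local_or_private(ip_string):
    """True for loopback, link-local and private-network addresses,
    for both IPv4 and IPv6."""
    ip = ipaddress.ip_address(ip_string)
    return ip.is_loopback or ip.is_link_local or ip.is_private
```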
Today's work was sponsored by Jake Vosloo
[on Patreon](https://patreon.com/joeyh)
[[!meta date="June 16 2018 4:00 pm"]]

----
Got the IP address restrictions for http implemented. (Except for http
proxies.)
Unfortunately as part of this, had to make youtube-dl and curl not be used
by default. The annex.security.allowed-http-addresses config has to be
opened up by the user in order to use those external commands, since they
can follow arbitrary redirects.
Also thought some more about how external special remotes might be
affected, and sent their authors a heads-up.
[[!meta date="June 17 2018 4:00 pm"]]