From 8703fdd3b75c1b249fe9143f23d1128390f391cc Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 17 Jun 2018 16:13:45 -0400 Subject: [PATCH] add --- ...hole_private_data_exposure_via_addurl.mdwn | 199 ++++++++++++++++++ doc/devblog/day_499__security_hole.mdwn | 27 +++ .../day_500__security_hole_part_2.mdwn | 26 +++ .../day_501__security_hole_part_3.mdwn | 13 ++ 4 files changed, 265 insertions(+) create mode 100644 doc/bugs/security_hole_private_data_exposure_via_addurl.mdwn create mode 100644 doc/devblog/day_499__security_hole.mdwn create mode 100644 doc/devblog/day_500__security_hole_part_2.mdwn create mode 100644 doc/devblog/day_501__security_hole_part_3.mdwn diff --git a/doc/bugs/security_hole_private_data_exposure_via_addurl.mdwn b/doc/bugs/security_hole_private_data_exposure_via_addurl.mdwn new file mode 100644 index 0000000000..42ba0dd9ae --- /dev/null +++ b/doc/bugs/security_hole_private_data_exposure_via_addurl.mdwn @@ -0,0 +1,199 @@ +This is a security hole that allows exposure of +private data in files located outside the git-annex repository. + +The attacker needs to have control over one of the remotes of the git-annex +repository. For example, they may provide a public git-annex repository +that the victim clones. Or the victim may have paired repositories with +them. Or, equivilantly, the attacker could have read access to the victim's +git-annex repository (eg on a server somewhere), and some channel to get +commits into it (eg a pull requests). + +The attacker does `git-annex addurl --relaxed file:///etc/passwd` and commits +this to the repository in some out of the way place. Then they wait for the +victim to pull the change. + +The easiest exploit is when the victim is running the assistant, or is +periodically doing `git annex sync --content`. The victim may also perform +the equivilant actions manually, not noticing they're operating on the +file. + +In either case, git-annex gets the content of the annexed file, following +the file:// url. Then git-annex transfers the content back to the +attacker's repository. + +Note that the victim can always detect this attack after the fact, since +even if the attacker deletes the file, the git history will still +contain it. + +It may also be possible to exploit scp:// sftp:// smb:// etc urls to get at +files on other computers that the user has access to as well as localhost. +I was not able to get curl to download from scp:// or sftp:// on debian +(unstable) but there may be configurations that allow it. + +If the url is attached to a key using a cryptographic hash, then the +attacker would need to already know at least the hash of the content +to exploit this. + +Sending that content back to them could be considered not a security +hole. Except, I can guess what content some file on your system might have, +and use this attack as an oracle to determine if I guessed right, and work +toward penetrating your system using that information. + +So, best to not treat addurl with a hash any differently than +--relaxed and --fast when addressing this hole. + +---- + +The fix must involve locking down the set of allowed in url schemes. +Better to have a whitelist than a blacklist. http and https seems like the +right default. + +Some users do rely on file:// urls, and this is fine in some use cases, +eg when you're not merging changes from anyone else. + +So this probably needs to be a git config of allowed url schemes, +with an appropriatly scary name, like `annex.security.allowed-url-schemes`. + +Redirects from one url scheme to another could be usd to bypass such a +whitelist. curl's `--proto` also affects redirects. http-conduit +is limited to only http and will probably remain that way. + +> done in [[!commit 28720c795ff57a55b48e56d15f9b6bcb977f48d9]] --[[Joey]] + +---- + +The same kind of attack can be used to see the content of +localhost urls and other non-globally-available urls. + +Redirects and DNS rebinding attacks mean that checking the IP address +of the hostname in the url is not good enough. It needs to check the IP +address that is actually connected to, for each http connection made, +including redirects. + +There will need to be a config to relax checks, like +with an appropriatly scary name, like +`annex.security.allowed-http-addresses`. Users will want to enable +support for specific subnets, or perhaps allow all addresses. + +When git-annex is configured to use curl, there seems to be no way +to prevent curl from following urls that redirect to localhost, other than +disabling redirections. And unless git-annex also pre-resolves the IP +address and rewrites it into the url passed to curl, DNS rebinding attacks +would still be possible. Also, one of the remaining reasons people enable +curl is to use a netrc file with passwords, and the content of +urls on those sites could also be exposed by this attack. So, it seems curl +should not be enabled by default and should have a big security warning if +it's supported at all. Probably ought to only enable it +when `annex.security.allowed-http-addresses=all` + +http-client does not have hooks that can be used to find out what IP +address it's connecting to in a secure enough way. +See + +Seems that building my own http Manager is the only way to go. By building +my own, I can do the IP address checks inside it when it's setting up +connections. And, the same manager can be passed to the S3 and WebDav libraries. +(The url scheme checks could also be moved into that Manager, to prevent +S3 redirect to file: url scenarios..) + +> restricted http manager done and used in +> [[!commit b54b2cdc0ef1373fc200c0d28fded3c04fd57212]]; +> curl also disabled by default + +http proxies are another problem. They could be on the local network, +or even on localhost, and http-client does not provide a way to force +a http proxy to connect to an particular IP address (is that even possible?) +May have to disable use of http proxies unless +`annex.security.allowed-http-addresses=all` +Or better, find what http proxy is enabled and prevent using it if it's on +an IP not allowed there. + +> TODO + +---- + +The external special remote interface is another way to exploit this. +Since it bypasses git-annex's usual url download code, whatever fixes are +put in place there won't help external special remotes. + +External special remotes that use GETURLS, typically in conjunction with +CLAIMURL and CHECKURL, and then themselves download the content of urls +in response to a TRANSFER RETRIEVE will have the same problems as +git-annex's url downloading. + +> Checked all known external special remotes +> to see if they used GETURLS; only one that did is ipfs, which doesn't use +> http so is ok. + +> TODO document this security issue in special remote protocol page + +An external special remote might also make a simple http request to a +key/value API to download a key, and follow a redirect to file:/// +or http://localhost/. + +If the key uses a cryptographic hash, git-annex verifies the content. +But, the attacker could have committed a key that doesn't +use a hash. Also, the attacker could use the hash check as an oracle. + +What if an external special remote uses https to a hardcoded server (either +fully hardcoded, or configured by the user when they enable the remote), +and uses a http library that only supports https and http (not file:///). +Is that good enough? An attacker would need to either compromise the server +to redirect to a local private webserver, or MITM, and https should prevent +the MITM. + +This seems to require auditing of all external special remotes. +git-annex could add a new command to the external protocol, that asks +the special remote if it's been audited, and refuse to use ones that have +not. + +I have emailed all relevant external special remote authors a heads-up and +some questions. + +> TODO + +---- + +Built-in special remotes that use protocols on top of http, eg S3 and WebDAV, +don't use Utility.Url, could also be exploited, and will need to be fixed +separately. + +> not affected for url schemes, because http-client only supports http, +> not file:/ + +> done for localhost/lan in [[!commit b54b2cdc0ef1373fc200c0d28fded3c04fd57212]] + +youtube-dl + +> already disabled file:/. Added a scheme check, but it won't block +> redirects to other schemes. But youtube-dl goes off and does its own thing with other +> protocols anyway, so that's fine. +> +> The youtube-dl generic extractor will download media files (including +> videos and photos) if passed an direct url to them. It does not seem to +> extract video etc from tags on html pages. + +> git-annex first checks if a web page +> is html before pointing youtube-dl at it, to avoid using it to download +> direct urls to media files. But that would not prevent a DNS rebinding +> attack which made git-annex see the html page, and youtube-dl then see +> a media file on localhost. +> +> Also, the current code in htmlOnly +> runs youtube-dl if the html page download test fails to download +> anything. +> +> Also, in the course of a download, youtube-dl could chain to other urls, +> depending on how its extractor works. Those urls could redirect to +> a localhost/lan web server. +> +> So, youtube-dl needs to be disabled unless http address security checks +> are turned off. +> +> > done in [[!commit e62c4543c31a61186ebf2e4e0412df59fc8630c8]] + +glacier + +> This special remote uses glacier-cli, which will need to be audited. +> Emailed basak. +> TODO diff --git a/doc/devblog/day_499__security_hole.mdwn b/doc/devblog/day_499__security_hole.mdwn new file mode 100644 index 0000000000..9f0d532d6d --- /dev/null +++ b/doc/devblog/day_499__security_hole.mdwn @@ -0,0 +1,27 @@ +I'm writing this on a private branch, it won't be posted until a week from +now when the security hole is disclosed. + +Security is not compositional. You can have one good feature, and add +another good feature, and the result is not two good features, but a new +security hole. In this case +[[bugs/security_hole_private_data_exposure_via_addurl]]. And it can be hard +to spot this kind of security hole, but then once it's known it +seems blindly obvious. + +It came to me last night and by this morning I had decided the potential +impact was large enough to do a coordinated disclosure. Spent the first +half of the day thinking through ways to fix it that don't involve writing +my own http library. Then started getting in touch with all the +distributions' security teams. And then coded up a fairly complete fix for +the worst part of the security hole, although a secondary part is going to +need considerably more work. + +It looks like the external special remotes are going to need at least some +security review too, and I'm still thinking that part of the problem over. + +Exhausted. + +Today's work was sponsored by Trenton Cronholm +[on Patreon](https://patreon.com/joeyh). + +[[!meta date="Jun 15 2018 7:00 pm"]] diff --git a/doc/devblog/day_500__security_hole_part_2.mdwn b/doc/devblog/day_500__security_hole_part_2.mdwn new file mode 100644 index 0000000000..3d485ee73c --- /dev/null +++ b/doc/devblog/day_500__security_hole_part_2.mdwn @@ -0,0 +1,26 @@ +Most of the day was spent staring at the http-client source code and trying +to find a way to add the IP address checks to it that I need to fully close +the security hole. + +In the end, I did find a way, with the duplication of a couple dozen lines +of code from http-client. It will let the security fix be used with +libraries like aws and DAV that build on top of http-client, too. + +While the code is in git-annex for now, it's fully disconnected and +would also be useful if a web browser were implemented in Haskell, +to implement same-origin restrictions while avoiding DNS rebinding attacks. + +Looks like http proxies and curl will need to be disabled by default, +since this fix can't support either of them securely. I wonder how web +browsers deal with http proxies, DNS rebinding attacks and same-origin? +I can't think of a secure way. + +Next I need a function that checks if an IP address is a link-local address +or a private network address. For both ipv4 and ipv6. Could not find +anything handy on hackage, so I'm gonna have to stare at some RFCs. Perhaps +this evening, for now, it's time to swim in the river. + +Today's work was sponsored by Jake Vosloo +[on Patreon](https://patreon.com/joeyh) + +[[!meta date="June 16 2018 4:00 pm"]] diff --git a/doc/devblog/day_501__security_hole_part_3.mdwn b/doc/devblog/day_501__security_hole_part_3.mdwn new file mode 100644 index 0000000000..6187f7d534 --- /dev/null +++ b/doc/devblog/day_501__security_hole_part_3.mdwn @@ -0,0 +1,13 @@ +Got the IP address restrictions for http implemented. (Except for http +proxies.) + +Unforunately as part of this, had to make youtube-dl and curl not be used +by default. The annex.security.allowed-http-addresses config has to be +opened up by the user in order to use those external commands, since they +can follow arbitrary redirects. + +Also thought some more about how external special remotes might be +affected, and sent their authors' a heads-up. + +[[!meta date="June 17 2018 4:00 pm"]] +