This commit is contained in:
Joey Hess 2018-06-17 16:13:45 -04:00
parent e62c4543c3
commit 8703fdd3b7
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
4 changed files with 265 additions and 0 deletions

This is a security hole that allows exposure of
private data in files located outside the git-annex repository.
The attacker needs to have control over one of the remotes of the git-annex
repository. For example, they may provide a public git-annex repository
that the victim clones. Or the victim may have paired repositories with
them. Or, equivalently, the attacker could have read access to the victim's
git-annex repository (eg on a server somewhere), and some channel to get
commits into it (eg a pull request).
The attacker does `git-annex addurl --relaxed file:///etc/passwd` and commits
this to the repository in some out of the way place. Then they wait for the
victim to pull the change.
The easiest exploit is when the victim is running the assistant, or is
periodically doing `git annex sync --content`. The victim may also perform
the equivalent actions manually, not noticing they're operating on the
file.
In either case, git-annex gets the content of the annexed file, following
the file:// url. Then git-annex transfers the content back to the
attacker's repository.
Note that the victim can always detect this attack after the fact, since
even if the attacker deletes the file, the git history will still
contain it.
It may also be possible to exploit scp:// sftp:// smb:// etc urls to get at
files on other computers that the user has access to, as well as on localhost.
I was not able to get curl to download from scp:// or sftp:// on debian
(unstable) but there may be configurations that allow it.
If the url is attached to a key using a cryptographic hash, then the
attacker would need to already know at least the hash of the content
to exploit this.
Sending that content back to them might not be considered a security
hole in itself. Except that I can guess what content some file on your system
might have, and use this attack as an oracle to determine whether I guessed
right, and work toward penetrating your system using that information.
So, best to not treat addurl with a hash any differently than
--relaxed and --fast when addressing this hole.
----
The fix must involve locking down the set of allowed url schemes.
Better to have a whitelist than a blacklist. http and https seem like the
right default.
Some users do rely on file:// urls, and this is fine in some use cases,
eg when you're not merging changes from anyone else.
So this probably needs to be a git config of allowed url schemes,
with an appropriately scary name, like `annex.security.allowed-url-schemes`.
Redirects from one url scheme to another could be used to bypass such a
whitelist. curl's `--proto` also affects redirects. http-conduit
is limited to only http and will probably remain that way.
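The whitelist check itself is small: parse out the scheme and compare it
against the configured set. A minimal sketch in Python (illustrative only;
git-annex is written in Haskell, and the default set below follows the
http/https default proposed above):

```python
from urllib.parse import urlparse

# Default mirrors the proposed annex.security.allowed-url-schemes setting.
ALLOWED_SCHEMES = frozenset({"http", "https"})

def scheme_allowed(url, allowed=ALLOWED_SCHEMES):
    """Return True only when the url's scheme is on the whitelist."""
    return urlparse(url).scheme.lower() in allowed
```

A user who does rely on file:// urls would add file to the allowed set via
the config.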
> done in [[!commit 28720c795ff57a55b48e56d15f9b6bcb977f48d9]] --[[Joey]]
----
The same kind of attack can be used to see the content of
localhost urls and other non-globally-available urls.
Redirects and DNS rebinding attacks mean that checking the IP address
of the hostname in the url is not good enough. It needs to check the IP
address that is actually connected to, for each http connection made,
including redirects.
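Concretely, that means resolving the hostname once, filtering the resolved
addresses, and then connecting only to an address that passed the filter. A
sketch of the resolution step in Python (illustrative only; git-annex is
Haskell, and `is_global` here is just a stand-in for whatever policy the
config ends up expressing):

```python
import ipaddress
import socket

def vetted_addresses(host, port=80):
    """Resolve host and return only its globally-routable addresses.
    The caller must then connect to one of these vetted IPs directly;
    re-resolving the hostname at connect time would reopen the DNS
    rebinding window."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [addr for addr in {info[4][0] for info in infos}
            if ipaddress.ip_address(addr).is_global]
```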
There will need to be a config to relax these checks, again with an
appropriately scary name, like
`annex.security.allowed-http-addresses`. Users will want to enable
support for specific subnets, or perhaps allow all addresses.
When git-annex is configured to use curl, there seems to be no way
to prevent curl from following urls that redirect to localhost, other than
disabling redirections. And unless git-annex also pre-resolves the IP
address and rewrites it into the url passed to curl, DNS rebinding attacks
would still be possible. Also, one of the remaining reasons people enable
curl is to use a netrc file with passwords, and the content of
urls on those sites could also be exposed by this attack. So, it seems curl
should not be enabled by default and should have a big security warning if
it's supported at all. Probably ought to only enable it
when `annex.security.allowed-http-addresses=all`
http-client does not have hooks that can be used to find out what IP
address it's connecting to in a secure enough way.
See <https://github.com/snoyberg/http-client/issues/354>
Seems that building my own http Manager is the only way to go. By building
my own, I can do the IP address checks inside it when it's setting up
connections. And, the same manager can be passed to the S3 and WebDav libraries.
(The url scheme checks could also be moved into that Manager, to prevent
S3 redirect to file: url scenarios.)
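The idea of such a restricted Manager can be sketched as a connect function
that applies the address check at socket-setup time, so every connection it
makes is covered, redirects included (Python for illustration only; the real
code would hook into http-client's connection setup):

```python
import socket

def open_restricted(host, port, addr_allowed):
    """Open a TCP connection only to an address that passes addr_allowed.
    Doing the check here, at connection setup, means every request made
    through the manager is covered, including connections for redirects."""
    last_err = OSError("no allowed address for %r" % host)
    for family, stype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        if not addr_allowed(sockaddr[0]):
            continue
        sock = socket.socket(family, stype, proto)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as e:
            sock.close()
            last_err = e
    raise last_err
```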
> restricted http manager done and used in
> [[!commit b54b2cdc0ef1373fc200c0d28fded3c04fd57212]];
> curl also disabled by default
http proxies are another problem. They could be on the local network,
or even on localhost, and http-client does not provide a way to force
a http proxy to connect to a particular IP address (is that even possible?)
May have to disable use of http proxies unless
`annex.security.allowed-http-addresses=all`
Or better, find what http proxy is enabled and prevent using it if it's on
an IP not allowed there.
> TODO
----
The external special remote interface is another way to exploit this.
Since it bypasses git-annex's usual url download code, whatever fixes are
put in place there won't help external special remotes.
External special remotes that use GETURLS, typically in conjunction with
CLAIMURL and CHECKURL, and then themselves download the content of urls
in response to a TRANSFER RETRIEVE will have the same problems as
git-annex's url downloading.
> Checked all known external special remotes
> to see if they used GETURLS; only one that did is ipfs, which doesn't use
> http so is ok.
> TODO document this security issue in special remote protocol page
An external special remote might also make a simple http request to a
key/value API to download a key, and follow a redirect to file:///
or http://localhost/.
If the key uses a cryptographic hash, git-annex verifies the content.
But, the attacker could have committed a key that doesn't
use a hash. Also, the attacker could use the hash check as an oracle.
What if an external special remote uses https to a hardcoded server (either
fully hardcoded, or configured by the user when they enable the remote),
and uses a http library that only supports https and http (not file:///).
Is that good enough? An attacker would need to either compromise the server
to redirect to a local private webserver, or MITM, and https should prevent
the MITM.
This seems to require auditing of all external special remotes.
git-annex could add a new command to the external protocol, that asks
the special remote if it's been audited, and refuse to use ones that have
not.
I have emailed all relevant external special remote authors a heads-up and
some questions.
> TODO
----
Built-in special remotes that use protocols on top of http, eg S3 and WebDAV,
don't use Utility.Url, could also be exploited, and will need to be fixed
separately.
> not affected for url schemes, because http-client only supports http,
> not file:/
> done for localhost/lan in [[!commit b54b2cdc0ef1373fc200c0d28fded3c04fd57212]]
youtube-dl
> already disabled file:/. Added a scheme check, but it won't block
> redirects to other schemes. However, youtube-dl goes off and does its own thing with other
> protocols anyway, so that's fine.
>
> The youtube-dl generic extractor will download media files (including
> videos and photos) if passed a direct url to them. It does not seem to
> extract video etc from tags on html pages.
> git-annex first checks if a web page
> is html before pointing youtube-dl at it, to avoid using it to download
> direct urls to media files. But that would not prevent a DNS rebinding
> attack which made git-annex see the html page, and youtube-dl then see
> a media file on localhost.
>
> Also, the current code in htmlOnly
> runs youtube-dl if the html page download test fails to download
> anything.
>
> Also, in the course of a download, youtube-dl could chain to other urls,
> depending on how its extractor works. Those urls could redirect to
> a localhost/lan web server.
>
> So, youtube-dl needs to be disabled unless http address security checks
> are turned off.
>
> > done in [[!commit e62c4543c31a61186ebf2e4e0412df59fc8630c8]]
glacier
> This special remote uses glacier-cli, which will need to be audited.
> Emailed basak.
> TODO

----
I'm writing this on a private branch, it won't be posted until a week from
now when the security hole is disclosed.
Security is not compositional. You can have one good feature, and add
another good feature, and the result is not two good features, but a new
security hole. In this case
[[bugs/security_hole_private_data_exposure_via_addurl]]. And it can be hard
to spot this kind of security hole, but then once it's known it
seems blindingly obvious.
It came to me last night and by this morning I had decided the potential
impact was large enough to do a coordinated disclosure. Spent the first
half of the day thinking through ways to fix it that don't involve writing
my own http library. Then started getting in touch with all the
distributions' security teams. And then coded up a fairly complete fix for
the worst part of the security hole, although a secondary part is going to
need considerably more work.
It looks like the external special remotes are going to need at least some
security review too, and I'm still thinking that part of the problem over.
Exhausted.
Today's work was sponsored by Trenton Cronholm
[on Patreon](https://patreon.com/joeyh).
[[!meta date="Jun 15 2018 7:00 pm"]]

----
Most of the day was spent staring at the http-client source code and trying
to find a way to add the IP address checks to it that I need to fully close
the security hole.
In the end, I did find a way, with the duplication of a couple dozen lines
of code from http-client. It will let the security fix be used with
libraries like aws and DAV that build on top of http-client, too.
While the code is in git-annex for now, it's fully disconnected and
would also be useful if a web browser were implemented in Haskell,
to implement same-origin restrictions while avoiding DNS rebinding attacks.
Looks like http proxies and curl will need to be disabled by default,
since this fix can't support either of them securely. I wonder how web
browsers deal with http proxies, DNS rebinding attacks and same-origin?
I can't think of a secure way.
Next I need a function that checks if an IP address is a link-local address
or a private network address. For both ipv4 and ipv6. Could not find
anything handy on hackage, so I'm gonna have to stare at some RFCs. Perhaps
this evening, for now, it's time to swim in the river.
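For reference, the classification needed here corresponds to a handful of
well-known reserved ranges (RFC 1918 private networks, RFC 3927/RFC 4291
link-local, loopback, and their IPv6 counterparts). A sketch in Python, where
the standard ipaddress module already encodes these ranges (a Haskell version
has to encode them by hand, hence the RFC-staring):

```python
import ipaddress

def is_local_or_private(ip_string):
    """True for loopback, link-local and private-network addresses,
    for both IPv4 and IPv6."""
    ip = ipaddress.ip_address(ip_string)
    return ip.is_loopback or ip.is_link_local or ip.is_private
```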
Today's work was sponsored by Jake Vosloo
[on Patreon](https://patreon.com/joeyh)
[[!meta date="June 16 2018 4:00 pm"]]

----
Got the IP address restrictions for http implemented. (Except for http
proxies.)
Unfortunately as part of this, had to make youtube-dl and curl not be used
by default. The annex.security.allowed-http-addresses config has to be
opened up by the user in order to use those external commands, since they
can follow arbitrary redirects.
Also thought some more about how external special remotes might be
affected, and sent their authors a heads-up.
[[!meta date="June 17 2018 4:00 pm"]]