Tighten key parser to not accept keys containing a non-numeric fields, which could be used to embed data useful for a SHA1 attack against git.

Also todo about why this is important, and with some further hardening to
add.

This commit was sponsored by Ignacio on Patreon.
This commit is contained in:
Joey Hess 2017-02-24 00:17:25 -04:00
parent 0dec2257f0
commit 60d99a80a6
No known key found for this signature in database
GPG key ID: C910D9222512E3C7
3 changed files with 51 additions and 1 deletions

View file

@ -33,6 +33,9 @@ git-annex (6.20170215) UNRELEASED; urgency=medium
to wget, since curl is able to display only errors to stderr, unlike
wget.
* status: Pass --ignore-submodules=when option on to git status.
* Tighten key parser to not accept keys containing a non-numeric
fields, which could be used to embed data useful for a SHA1
attack against git.
-- Joey Hess <id@joeyh.name> Tue, 14 Feb 2017 15:54:25 -0400

View file

@ -22,6 +22,7 @@ module Types.Key (
import System.Posix.Types
import Data.Aeson
import Data.Char
import qualified Data.Text as T
import Common
@ -108,6 +109,16 @@ file2key s
findfields _ v = v
addbackend k v = Just k { keyBackendName = v }
-- This is a strict parser for security reasons; a key
-- can contain only 4 fields, which all consist only of numbers.
-- Any key containing other fields, or non-numeric data is
-- rejected with Nothing.
--
-- If a key contained non-numeric fields, they could be used to
-- embed data used in a SHA1 collision attack, which would be a
-- problem since the keys are committed to git.
addfield _ _ v | not (all isDigit v) = Nothing
addfield 's' k v = do
sz <- readish v
return $ k { keySize = Just sz }
@ -120,7 +131,7 @@ file2key s
addfield 'C' k v = case readish v of
Just chunknum | chunknum > 0 ->
return $ k { keyChunkNum = Just chunknum }
_ -> return k
_ -> Nothing
addfield _ _ _ = Nothing
instance ToJSON Key where

View file

@ -0,0 +1,36 @@
Some git-annex backends allow embedding enough data in the names of keys
that it could be used for a SHA1 collision attack. So, a signed git commit
could point to a tree with such a key in it, and the blob for the key could
have two versions with the same SHA1.
Users who want to use git-annex with signed commits to mitigate git's own
SHA1 insecurities would like at least a way to disable the insecure
git-annex backends:
* WORM can contain fairly arbitrary data in a key name
* URL too (also, of course, URLs download arbitrary data from the web,
so a signed git commit pointing at URL keys doesn't have any security
even w/o SHA1 collisions)
* SHA1 and MD5 backends are insecure because there can be colliding
versions of the data they point to.
A config setting to prevent git-annex from using insecure backends would be
useful.
(git-annex might suggest enabling that configuration if commit.gpgSign
is enabled)
A few other potential problems:
* `*E` backends could embed sha1 collision data in a long filename
extension. That this is much harder to exploit because git-annex
checks the hash of the data when it enters the repository, and git-annex
fsck also verifies it. It still might be worth limiting the length
of an extension in such a key to the longest such extension git-annex has
ever supported (probably < 20 bytes or so), which would be less than the
size of the data needed for current SHA1 collision attacks.
* It might be possible to embed colliding data in a specially constructed
key name with an extra field in it, eg "SHA256-cXXXXXXXXXXXXXXX-...".
Need to review the code and see if such extra fields are allowed.
Update: All fields are numeric, but could contain arbitrary data
after the number. (fixed now)