make file2key reject E* backend keys with a long extension

I am not happy that I had to put backend-specific code in file2key. But
it would be very difficult to avoid this layering violation.

Most of the time, when parsing a Key from a symlink target, git-annex
never looks up its Backend at all, so adding this check to a method of
the Backend object would not work.

The Key could be made to contain the appropriate
Backend, but since Backend is parameterized on an "a" that is fixed to
the Annex monad later, that would need Key to change to "Key a".

The only way to clean this up that I can see would be to have the Key
contain a LowlevelBackend, and put the validation in LowlevelBackend.
Perhaps later, but that would be an extensive change, so let's not do
it in this commit which may want to cherry-pick to backports.

This commit was sponsored by Ethan Aubin.
This commit is contained in:
Joey Hess 2017-02-24 11:17:07 -04:00
parent 63df8d8966
commit 35739a74c2
No known key found for this signature in database
GPG key ID: C910D9222512E3C7
3 changed files with 36 additions and 7 deletions

View file

@ -33,9 +33,10 @@ git-annex (6.20170215) UNRELEASED; urgency=medium
to wget, since curl is able to display only errors to stderr, unlike
wget.
* status: Pass --ignore-submodules=when option on to git status.
* Tighten key parser to not accept keys containing non-numeric
fields, which could be used to embed data useful for a SHA1
attack against git.
* Tighten key parser to mitigate against hypothetical SHA1 preimage
attacks. This ensures that signed git commits of annexed files
will remain secure, even against the worst possible future SHA1
attacks, as long as git-annex is using a secure hashing backend.
-- Joey Hess <id@joeyh.name> Tue, 14 Feb 2017 15:54:25 -0400

View file

@ -1,6 +1,6 @@
{- git-annex Key data type
-
- Copyright 2011-2016 Joey Hess <id@joeyh.name>
- Copyright 2011-2017 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU GPL version 3 or higher.
-}
@ -104,7 +104,7 @@ file2key s
_ -> Nothing
findfields (c:v) (Just k)
| c == fieldSep = Just $ k { keyName = v }
| c == fieldSep = addkeyname k v
| otherwise = sepfield k v $ addfield c
findfields _ v = v
@ -134,6 +134,31 @@ file2key s
_ -> Nothing
addfield _ _ _ = Nothing
addkeyname k v
| validKeyName k v = Just $ k { keyName = v }
| otherwise = Nothing
{- A key with a backend ending in "E" is an extension preserving key,
- using some hash.
-
- The length of the extension is limited in order to mitigate against
- SHA1 collision attacks (specifically, chosen-prefix attacks).
- In such an attack, the extension of the key could be made to contain
- the collision generation data, with the result that a signed git commit
- including such keys would not be secure.
-
- The maximum extension length ever generated for such a key was 8
- characters; 20 is used here to give a little future wiggle-room.
- The SHA1 common-prefix attack used 128 bytes of data.
-
- This code is here, and not in Backend.Hash (where it really belongs)
- so that file2key can check it whenever a Key is constructed.
-}
validKeyName :: Key -> String -> Bool
validKeyName k v
| end (keyBackendName k) == "E" = length (takeExtensions v) <= 20
| otherwise = True
instance ToJSON Key where
toJSON = toJSON . key2file

View file

@ -36,12 +36,15 @@ A few other potential problems:
git-annex has ever supported (probably < 20 bytes or so), which would
be less than the size of the data needed for current SHA1 collision
attacks. Presumably aa chosen-prefix attack would need a similar amount of
data.
data. Update: Now done; git-annex refuses to use keys with super
long extensions.
* It might be possible to embed colliding data in a specially constructed
key name with an extra field in it, eg "SHA256-cXXXXXXXXXXXXXXX-...".
Need to review the code and see if such extra fields are allowed.
Update: All fields are numeric, but could contain arbitrary data
after the number. This has been fixed; git-annex refuses to parse
after the number. Could have been used in a chosen-prefix attack
(posibly; would require field to come after key name data) or
preimage attack. This has been fixed; git-annex refuses to parse
such fields, so it won't work with files that try to exploit this.