Some git-annex backends allow embedding enough data in the names of keys
that it could be used for a SHA1 collision attack. So, a signed git commit
could point to a tree with such a key in it, and the blob for the key could
have two versions with the same SHA1.

> All issues below are [[done]] --[[Joey]]

Users who want to use git-annex with signed commits to mitigate git's own
SHA1 insecurities would like at least a way to disable the insecure
git-annex backends:

* WORM can contain fairly arbitrary data in a key name
* URL too (also, of course, URLs download arbitrary data from the web,
  so a signed git commit pointing at URL keys doesn't have any security
  even w/o SHA1 collisions)
* SHA1 and MD5 backends are insecure because there can be colliding
  versions of the data they point to.

There could be a config setting, which would prevent git-annex from using
keys with such insecure backends. A user who checks git commit signatures
could enable the config setting when they initially clone their repository.
This should prevent any file contents using insecure backends from being
downloaded into the repository. (Even git-annex-shell recvkey would
refuse data using such a key, since it would fail parsing the key.)
The user would thus know that any file contents in their repository match
the files in signed git commits.

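As a rough illustration of what such a setting would check, here is a minimal Python sketch (git-annex itself is written in Haskell; this is not its code). The names `INSECURE_BACKENDS`, `backend_of` and `allowed` are hypothetical, and the inclusion of `SHA1E` and `MD5E` in the set is an assumption; the text above only names WORM, URL, SHA1 and MD5.

```python
# Hypothetical sketch: decide whether a key may be used when only
# secure hashes are allowed. The backend set is illustrative only.
INSECURE_BACKENDS = {"WORM", "URL", "SHA1", "SHA1E", "MD5", "MD5E"}


def backend_of(key):
    """Return the backend name, i.e. the text before the first '-' in a key."""
    return key.split("-", 1)[0]


def allowed(key, secure_hashes_only):
    """With the setting on, refuse keys whose backend is in the insecure set."""
    if not secure_hashes_only:
        return True
    return backend_of(key) not in INSECURE_BACKENDS
```

With the setting off every key is accepted; with it on, a key such as `WORM-s5-m1--foo` is refused while `SHA256-s100--deadbeef` is still accepted.
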
Enabling the config setting in a repository that already contains
file contents would be a mistake, because it might contain insecure keys.
And since git-annex would skip over such files, `git annex fsck` cannot
warn about such a mistake.

Perhaps, then, the config setting should be turned on by `git annex init`?
Or, we can document this gotcha.

> I've done some groundwork for this, but making git-annex not accept
> insecure keys into the repo at all requires changing file2key,
> which is a pure function that's used in eg, instances for serialization.
>
> So, how to make it vary depending on git config? Can't. Alternative
> would be to add lots of checks everywhere a key is read from disk
> or network, which feels like it would be a hard security boundary to
> manage.
>
> It doesn't really matter if content under an insecure key is in the
> repo, as long as there's not a signed commit referencing such a key.
> So, we could say, this is up to the user constructing a signed commit, to not
> put such keys in the commit.
>
> Or, we could use the pre-commit hook, and when
> the config setting disallows insecure keys, make it reject commits
> that contain them. But, if a past commit added a file using an insecure
> key, and the current commit does not touch it, should it be rejected?
> Rejecting it would then require a somewhat expensive look at the tree
> being committed.
>
> The user might be merging a branch from someone else; there seems no
> git hook that can sanity check a fast-forward merge.
>
> Perhaps leave it up to the person making signed commits to get it
> right, and make git annex fsck warn about such keys? That seems
> reasonable. --[[Joey]]

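The "warn about such keys" idea could look roughly like the following Python sketch. It is not how `git annex fsck` is implemented; `insecure_files` is a hypothetical helper, the input is assumed to be a mapping from file path to symlink target, and the backend set is illustrative.

```python
import posixpath

# Illustrative set of backends to warn about (an assumption, not git-annex's list).
INSECURE_BACKENDS = {"WORM", "URL", "SHA1", "SHA1E", "MD5", "MD5E"}


def insecure_files(tree):
    """Return the paths whose annex symlink target names an insecure key.

    `tree` maps a file path to its symlink target.
    """
    flagged = []
    for path, target in sorted(tree.items()):
        if "/annex/objects/" not in target:
            continue  # not an annexed file
        key = posixpath.basename(target)  # the key is the last path component
        if key.split("-", 1)[0] in INSECURE_BACKENDS:
            flagged.append(path)
    return flagged
```

Scanning the whole tree like this is the "somewhat expensive look" mentioned above, which is why doing it in fsck rather than in a pre-commit hook seems more palatable.
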
> > Rather than preventing SHA1/URL/WORM Keys, could put checks in
> > Annex.Content.moveAnnex to prevent SHA1/URL/WORM objects reaching the
> > repository. That would make moveAnnex a security boundary, which it is
> > not currently. Would need to audit to check if anything else populates
> > .git/annex/objects.
> >
> > Annex.Transfer.runTransfer could also check for disallowed objects,
> > not as a security boundary, but to prevent accidental expensive
> > transfers that would fail at the moveAnnex stage.

> > As to how to enable this, it may make sense to use git-annex-config
> > but only read the value from the git-annex branch when initializing the
> > repository, and cache it in git-config.
> >
> > This way, a repository can be created and configured not to allow
> > SHA1/URL/WORM, and all clones will inherit this configuration.
> >
> > Users can also set it in git-config on a per repository basis.
> >
> > If the git-annex-config setting is changed, existing clones won't
> > change their behavior, although new ones will. That's a mixed
> > blessing; it makes it harder to switch an existing repo to disallowing
> > SHA1/URL/WORM, but an accidental/malicious re-enabling won't affect
> > clones made while it was disabled.
> > > This is done now.
> >
> > Could a repository be configured to either always disallow
> > SHA1/URL/WORM, or always allow them, and then not let that be changed?
> > Maybe -- Look through all the history of the git-annex branch from the
> > earliest commit forward. The first value stored in
> > git-annex/disableinsecurehashes (eg 0 or 1) is the value to use;
> > any later changes are ignored.
> > That would be a little slow, but only needs to be done at init time.
> > It might be possible to fool this though. Create a new empty branch,
> > with an old date, make a commit enabling insecure hashes, and
> > merge it into git-annex branch HEAD. It now looks as if insecure hashes
> > were enabled earliest.

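The backdating trick described in the thread above can be shown with a toy Python model. Everything here is hypothetical: `ConfigCommit` and `first_value` are made-up names, and "earliest" is assumed to mean "smallest commit date", which is exactly the property an attacker controls.

```python
from dataclasses import dataclass


@dataclass
class ConfigCommit:
    date: int   # commit timestamp; the committer is free to choose it
    value: str  # "1" = insecure hashes disabled, "0" = allowed


def first_value(history):
    """Return the value from the commit with the earliest date ("first wins")."""
    return min(history, key=lambda c: c.date).value


honest = [ConfigCommit(date=1000, value="1"), ConfigCommit(date=2000, value="1")]
# An attacker merges in a commit with a forged, very old date.
forged = honest + [ConfigCommit(date=1, value="0")]
```

Here `first_value(honest)` yields `"1"`, but `first_value(forged)` yields `"0"`, so a "first value wins" rule keyed on dates does not actually pin the setting.
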
> > > Well, annex.securehashesonly is implemented now. It currently needs to be
> > > set in each clone that cares about it. --[[Joey]]

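The init-time caching proposed earlier in this thread (read the git-annex branch setting once at init, then rely only on local git config) can be sketched in Python as below. This is a model of the idea, not git-annex code: the two configs are plain dicts, `init_clone` and `secure_hashes_only` are hypothetical names, and leaving an existing local value untouched is an assumption.

```python
KEY = "annex.securehashesonly"


def init_clone(branch_config, local_config):
    """At init, copy the branch-wide setting into local config.

    Assumption: a value already set locally is left alone.
    """
    if KEY in branch_config and KEY not in local_config:
        local_config[KEY] = branch_config[KEY]
    return local_config


def secure_hashes_only(local_config):
    """After init, only the local config is consulted."""
    return local_config.get(KEY) == "true"
```

A clone initialized while the branch says `true` keeps enforcing it even if the branch value is later flipped to `false`; only clones initialized after the change pick up the new value. That is the "mixed blessing" described above.
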
----

A few other potential problems:

* A symlink target like .git/annex/objects/XX/YY/SHA256--foo
  might be able to be manipulated to add collision data in the path.
  For example, .git/annex/objects/collisiondata/../XX/YY/SHA256--foo

  I think this is not a valid attack, because at least on linux,
  such a symlink won't be followed, unless the
  .git/annex/objects/collisiondata directory exists.

* `*E` backends could embed sha1 collision data in a long filename
  extension in a key.

  The recent SHA1 common-prefix attack could be used to exploit this;
  the result would be two keys that have the same SHA1.

  This can be fixed by limiting the length
  of an extension allowed in such a key to the longest such extension
  git-annex has ever supported (probably < 20 bytes or so), which would
  be less than the size of the data needed for current SHA1 collision
  attacks. Now done; git-annex refuses to use keys with super
  long extensions.

* It might be possible to embed colliding data in a specially constructed
  key name with an extra field in it, eg "SHA256-cXXXXXXXXXXXXXXX-...".
  Need to review the code and see if such extra fields are allowed.

  Update: All fields are numeric, but could contain arbitrary data
  after the number. Could have been used in a common-prefix attack.
  This has been fixed; git-annex refuses to parse
  such fields, so it won't work with files that try to exploit this.

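The two key-name fixes in the list above (a cap on extension length for `*E` backends, and strictly numeric fields) can be sketched together in Python. This is an illustration, not git-annex's parser: `key_ok`, `KEY_RE` and `MAX_EXTENSION_LEN` are hypothetical, the key shape `BACKEND-fNNN-...--NAME` is simplified, and the limit of 20 is taken from the "probably < 20 bytes or so" guess above rather than from the code.

```python
import re

MAX_EXTENSION_LEN = 20  # assumed limit, see the lead-in

# Backend, then zero or more "-<letter><digits>" fields, then "--" and the name.
KEY_RE = re.compile(
    r"^(?P<backend>[A-Z0-9_]+)(?P<fields>(?:-[a-zA-Z][0-9]+)*)--(?P<name>.+)$",
    re.S,
)


def key_ok(key):
    """Reject keys with non-numeric field contents or overlong extensions."""
    m = KEY_RE.match(key)
    if m is None:
        return False  # wrong shape, or junk after a field's number
    if m["backend"].endswith("E"):
        # For *E backends, everything after the first '.' counts as extension.
        _, _, ext = m["name"].partition(".")
        if len(ext.encode()) > MAX_EXTENSION_LEN:
            return False
    return True
```

Under these assumptions `SHA256E-s100--abc.txt` passes, while `SHA256-s100junk--abc`, `SHA256-cXXXXXXXX--abc` and a `SHA256E` key with a 64-byte extension are all refused, leaving no room in the key name for common-prefix collision data.
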
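The claim about the `collisiondata/..` symlink can be checked with a small Python experiment. It is only a sketch: `demo` is a hypothetical name, the layout mirrors the example path from the list above, and the behaviour described is the Linux one the text refers to.

```python
import os
import tempfile


def demo():
    """Return (followed_before, followed_after) creating collisiondata/."""
    with tempfile.TemporaryDirectory() as repo:
        objects = os.path.join(repo, ".git", "annex", "objects")
        real = os.path.join(objects, "XX", "YY", "SHA256--foo")
        os.makedirs(os.path.dirname(real))
        with open(real, "w") as f:
            f.write("content")

        link = os.path.join(repo, "file")
        # Relative target with an extra path component that does not exist.
        os.symlink(".git/annex/objects/collisiondata/../XX/YY/SHA256--foo", link)

        before = os.path.exists(link)  # exists() follows the symlink
        os.mkdir(os.path.join(objects, "collisiondata"))
        after = os.path.exists(link)
        return before, after
```

On Linux the kernel resolves each path component in turn, so the link is dangling until a `collisiondata` directory really exists; textual normalization of `collisiondata/..` never happens during resolution.
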