Commit graph

29746 commits

Author SHA1 Message Date
Joey Hess
3f876f72e3
larger headings 2017-02-27 16:17:19 -04:00
Joey Hess
b78703ca4e
devblog 2017-02-27 16:11:35 -04:00
Joey Hess
e53070c1ff
inheritable annex.securehashesonly
* init: When annex.securehashesonly has been set with git-annex config,
  copy that value to the annex.securehashesonly git config.
* config --set: As well as setting value in git-annex branch,
  set local gitconfig. This is needed especially for
  annex.securehashesonly, which is read only from local gitconfig and not
  the git-annex branch.

doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn has the
rationale for doing it this way. There's no perfect solution; this
seems to be the least-bad one.

This commit was supported by the NSF-funded DataLad project.
2017-02-27 16:08:23 -04:00
Joey Hess
6e0e7d885c
update 2017-02-27 15:32:04 -04:00
Joey Hess
c33363dfa7
early cancelation of transfer that annex.securehashesonly prohibits
This avoids sending all the data to a remote, only to have it reject it
because it has annex.securehashesonly set. It assumes that local and
remote will have the same annex.securehashesonly setting in most cases.
If a remote does not have that set, and local does, the remote won't get
some content it would otherwise accept.

Also avoids downloading data that will not be added to the local object
store due to annex.securehashesonly.

Note that, while encrypted special remotes use a GPGHMAC key variety,
which is not collision resistant, Transfers are not used for such
keys, so this check is avoided. That is what we want, so encrypted
special remotes still work.

This commit was sponsored by Ewen McNeill.
2017-02-27 15:21:24 -04:00
Joey Hess
9db064f50c
reorg 2017-02-27 15:04:03 -04:00
Joey Hess
49114cf4ea
securehash matching
Added --securehash option to match files using a secure hash function, and
corresponding securehash preferred content expression.

This commit was sponsored by Ethan Aubin.
2017-02-27 15:02:44 -04:00
Joey Hess
31275754f5
mapM_ = sequence_ . map 2017-02-27 14:48:07 -04:00
Joey Hess
942e0174b3
make fsck check annex.securehashesonly, and new tip for working around SHA1 collisions with git-annex
This commit was sponsored by andrea rota.
2017-02-27 13:55:15 -04:00
Joey Hess
07f1e638ee
annex.securehashesonly
Cryptographically secure hashes can be forced to be used in a repository,
by setting annex.securehashesonly. This does not prevent the git repository
from containing files with insecure hashes, but it does prevent the content
of such files from being pulled into .git/annex/objects from another
repository.

We want to make sure that at no point does git-annex accept content into
.git/annex/objects that is hashed with an insecure key. Here's how it
was done:

* .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be
  written to it normally
* So every place that writes content must call thawContent or modifyContent.
  We can audit for these, and be sure we've considered all cases.
* The main functions are moveAnnex and linkToAnnex; these were made to
  check annex.securehashesonly, and are the main security boundary
  for annex.securehashesonly.
* Most other calls to modifyContent deal with other files in the KEY
  directory (inode cache etc). The other ones that mess with the content
  are:
	- Annex.Direct.toDirectGen, in which content already in the
	  annex directory is moved to the direct mode file, so not relevant.
	- fix and lock, which don't add new content
	- Command.ReKey.linkKey, which manually unlocks it to make a
	  copy.
* All other calls to thawContent appear safe.

Made moveAnnex return a Bool, so checked all callsites and made them
deal with a failure in appropriate ways.

linkToAnnex simply returns LinkAnnexFailed; all callsites already deal
with it failing in appropriate ways.

This commit was sponsored by Riku Voipio.
2017-02-27 13:33:59 -04:00
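
The security boundary described in the annex.securehashesonly commit above can be sketched as a single guarded entry point that reports failure instead of storing content. This is a minimal model under stated assumptions: `Key`, `Config`, `ObjectStore` and `moveAnnexModel` are hypothetical names for illustration, not git-annex's actual definitions, and the real moveAnnex runs in the Annex monad rather than as a pure function.

```haskell
-- Hypothetical, simplified model; none of these definitions are git-annex's own.
data Key = Key
    { keyName :: String
    , keyIsSecure :: Bool  -- stands in for a check on the key's hash variety
    } deriving (Eq, Show)

newtype Config = Config { secureHashesOnly :: Bool }

-- The object store, modelled as a list of (key name, content) pairs.
type ObjectStore = [(String, String)]

-- Single entry point for adding content. Returns False and leaves the
-- store untouched when the configuration prohibits the key.
moveAnnexModel :: Config -> Key -> String -> ObjectStore -> (Bool, ObjectStore)
moveAnnexModel cfg key content store
    | secureHashesOnly cfg && not (keyIsSecure key) = (False, store)
    | otherwise = (True, (keyName key, content) : store)
```

Returning a Bool forces each caller to decide what a rejected write means for it, which matches the note that all call sites were checked.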
Joey Hess
0fda7c08d0
add cryptographicallySecure
Note that GPGHMAC keys are not cryptographically secure, because their
content has no relation to the name of the key. So, things that use this
function to avoid sending keys to a remote will need to special case in
support for those keys. If GPGHMAC keys were accepted as
cryptographically secure, symlinks using them could be committed to a
git repo, and their content would be accepted into the repo, with no
guarantee that two repos got the same content, which is what we're aiming
to prevent.
2017-02-27 12:54:06 -04:00
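
The classification described in the cryptographicallySecure commit above can be sketched as a predicate over key varieties. This is an illustrative sketch only: the constructors below are a hypothetical, simplified stand-in for the real KeyVariety type, and the function body is an assumption about which varieties count as secure, based on the message's statement that GPGHMAC keys do not.

```haskell
-- Simplified, hypothetical model of key varieties; not git-annex's real type.
data KeyVariety
    = SHA2Key Int      -- SHA-2 with a digest size in bits, e.g. 256
    | SHA3Key Int      -- SHA-3 with a digest size in bits
    | SHA1Key
    | MD5Key
    | WORMKey          -- not derived from a hash of the content
    | URLKey
    | GPGHMACKey       -- used by encrypted special remotes
    deriving (Eq, Show)

-- True only when the key name is derived from the content by a hash
-- believed to be collision resistant. GPGHMAC keys are deliberately
-- excluded: their content has no relation to the name of the key.
cryptographicallySecure :: KeyVariety -> Bool
cryptographicallySecure (SHA2Key _) = True
cryptographicallySecure (SHA3Key _) = True
cryptographicallySecure _ = False
```

Listing the secure varieties and defaulting everything else to False means a newly added variety is treated as insecure until someone decides otherwise.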
Joey Hess
5e24e3ffe7
Merge branch 'master' of ssh://git-annex.branchable.com 2017-02-26 14:55:11 -04:00
Joey Hess
1dcd68a149
fix osxapp target
Broken by recent changes to other targets
2017-02-26 14:54:24 -04:00
michalrus
b4f7979391 Added a comment 2017-02-26 00:59:21 +00:00
openmedi
9bb93e2129 Added a comment 2017-02-25 20:35:53 +00:00
Joey Hess
e8bf942dc4
move thoughts 2017-02-25 15:00:22 -04:00
michalrus
03826e9759 2017-02-25 18:53:27 +00:00
Joey Hess
a463ba6e8a
more thoughts 2017-02-25 14:49:44 -04:00
michalrus
5fb21f1260 Added a comment 2017-02-25 18:47:36 +00:00
Joey Hess
d512098cbb
further thoughts 2017-02-25 12:55:38 -04:00
Joey Hess
78eb21cf88
remove cryptohash from debian build-dep option 2017-02-24 21:06:29 -04:00
Joey Hess
40327cab6e
Removed support for building with the old cryptohash library.
Building with that library made git-annex not support SHA3; it's time for
that to always be supported in case SHA2 dominoes.
2017-02-24 20:56:26 -04:00
Joey Hess
622b3fface
devblog 2017-02-24 20:03:36 -04:00
Joey Hess
6b52fcbb7e
SHA1 collisions in key names were more exploitable than I thought
Yesterday's SHA1 collision attack could be used to generate eg:

SHA256-sfoo--whatever.good
SHA256-sfoo--whatever.bad

Such that they collide. A repository with the good one could have the
bad one swapped in and signed commits would still verify.

I've already mitigated this.
2017-02-24 19:54:36 -04:00
Joey Hess
27eca014be
fix up Read instance incompatibility caused by recent commit
9c4650358c changed the Read instance for
Key.

I've checked all uses of that instance (by removing it and seeing what
breaks), and they're all limited to the webapp, except one.
That is GitAnnexDistribution's Read instance.

So, 9c4650358c would have broken upgrades
of git-annex from downloads.kitenet.net. Once the .info files there got
updated for a new release, old releases would have failed to parse them
and never upgraded.

To fix this, I found a way to make the .info files that contain
GitAnnexDistribution values be readable by the old version of git-annex.

This commit was sponsored by Ewen McNeill.
2017-02-24 18:59:12 -04:00
Joey Hess
634a485b50
update 2017-02-24 17:57:21 -04:00
Joey Hess
1f0d0ab4b3
Revert "pointer to a todo"
This reverts commit ae3f6705eb.

todo is not ready yet
2017-02-24 15:40:28 -04:00
Joey Hess
9c4650358c
add KeyVariety type
Where before the "name" of a key and a backend was a string, this makes
it a concrete data type.

This is groundwork for allowing some varieties of keys to be disabled
in file2key, so git-annex won't use them at all.

Benchmarks run in my big repo:

old git-annex info:

real	0m3.338s
user	0m3.124s
sys	0m0.244s

new git-annex info:

real	0m3.216s
user	0m3.024s
sys	0m0.220s

new git-annex find:

real	0m7.138s
user	0m6.924s
sys	0m0.252s

old git-annex find:

real	0m7.433s
user	0m7.240s
sys	0m0.232s

Surprising result; I'd have expected it to be slower since it now parses
all the key varieties. But, the parser is very simple and perhaps
sharing KeyVarieties uses less memory or something like that.

This commit was supported by the NSF-funded DataLad project.
2017-02-24 15:16:56 -04:00
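
The change from a string to a concrete data type, as described in the KeyVariety commit above, can be sketched as a parser and formatter pair. This is a cut-down illustration: the constructor set and the names `parseKeyVariety` and `formatKeyVariety` are assumptions made for the example, not a copy of the real module.

```haskell
-- Hypothetical, cut-down variety type; the real one has more constructors.
data KeyVariety
    = SHA256Key
    | SHA1Key
    | WORMKey
    | OtherKey String  -- unknown names are kept rather than rejected
    deriving (Eq, Show)

-- Turn the backend name found in a key into a concrete value.
parseKeyVariety :: String -> KeyVariety
parseKeyVariety "SHA256" = SHA256Key
parseKeyVariety "SHA1" = SHA1Key
parseKeyVariety "WORM" = WORMKey
parseKeyVariety other = OtherKey other

-- Inverse direction, used when a key is serialized again.
formatKeyVariety :: KeyVariety -> String
formatKeyVariety SHA256Key = "SHA256"
formatKeyVariety SHA1Key = "SHA1"
formatKeyVariety WORMKey = "WORM"
formatKeyVariety (OtherKey s) = s
```

With a concrete type, code that wants to disable a variety can pattern match on a constructor instead of comparing strings, and formatting a parsed name always gives back the original string.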
Joey Hess
ca0daa8bb8
factor non-type stuff out of Key 2017-02-24 13:42:30 -04:00
Joey Hess
ae3f6705eb
pointer to a todo 2017-02-24 13:41:29 -04:00
Joey Hess
9de0767d0e
update 2017-02-24 12:31:23 -04:00
Joey Hess
6346704a04
clarify that annex.backends is used when adding new files
Even if annex.backends does not include a backend, that does not prevent
git-annex commands from acting on a file using the missing backend.

(There's really no reason at all for annex.backends to be a list.)
2017-02-24 11:53:59 -04:00
Joey Hess
8971949d60
Merge branch 'master' of ssh://git-annex.branchable.com 2017-02-24 11:33:57 -04:00
Joey Hess
02d3fbbd8b
add back a configure target
Otherwise, make reconfigures every time and then rebuilds all files.

I went too far in 3af9f5ed1a. All that's
needed is to make the configure target not use Build/SysConfig.hs as the
target name, so make won't delete that file after a failed build.

This commit was supported by the NSF-funded DataLad project
2017-02-24 11:29:48 -04:00
Joey Hess
35739a74c2
make file2key reject E* backend keys with a long extension
I am not happy that I had to put backend-specific code in file2key. But
it would be very difficult to avoid this layering violation.

Most of the time, when parsing a Key from a symlink target, git-annex
never looks up its Backend at all, so adding this check to a method of
the Backend object would not work.

The Key could be made to contain the appropriate
Backend, but since Backend is parameterized on an "a" that is fixed to
the Annex monad later, that would need Key to change to "Key a".

The only way to clean this up that I can see would be to have the Key
contain a LowlevelBackend, and put the validation in LowlevelBackend.
Perhaps later, but that would be an extensive change, so let's not do
it in this commit which may want to cherry-pick to backports.

This commit was sponsored by Ethan Aubin.
2017-02-24 11:22:15 -04:00
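
The kind of check described in the file2key commit above can be sketched as a validation step applied to the backend name and key name. Everything here is an assumption for illustration: the limit of 20 characters, the rule that an extension-preserving backend is recognized by a trailing "E", and the names `isEBackend`, `extensionOf` and `validKeyName` are hypothetical and may not match what git-annex does.

```haskell
import Data.List (isSuffixOf)

-- Assumed limit for illustration only; the value git-annex uses may differ.
maxExtensionLen :: Int
maxExtensionLen = 20

-- Hypothetical test for an extension-preserving backend such as SHA256E.
isEBackend :: String -> Bool
isEBackend b = "E" `isSuffixOf` b

-- Everything from the first '.' in the key name onward, or "" if none.
extensionOf :: String -> String
extensionOf = dropWhile (/= '.')

-- Reject keys of E* backends whose extension is longer than the limit.
-- Keys of other backends are not affected by this check.
validKeyName :: String -> String -> Bool
validKeyName backend name
    | isEBackend backend = length (extensionOf name) <= maxExtensionLen
    | otherwise = True
```

Because the check only needs the backend name as a string, it can run while parsing a symlink target, without looking up a Backend object.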
benjamin.poldrack@d09ccff6d42dd20277610b59867cf7462927b8e3
4a7ae6f9c0 Added a comment 2017-02-24 13:00:10 +00:00
Joey Hess
63df8d8966
update 2017-02-24 02:14:36 -04:00
Joey Hess
44b9ac41a4
update 2017-02-24 01:21:54 -04:00
Joey Hess
102e04b30c
typo 2017-02-24 00:29:37 -04:00
Joey Hess
4cad401629
updates 2017-02-24 00:28:15 -04:00
Joey Hess
969da82b5c
update 2017-02-24 00:21:58 -04:00
Joey Hess
60d99a80a6
Tighten key parser to not accept keys containing non-numeric fields, which could be used to embed data useful for a SHA1 attack against git.
Also added a todo about why this is important, with some further hardening
still to add.

This commit was sponsored by Ignacio on Patreon.
2017-02-24 00:17:25 -04:00
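
The tightening described in the key parser commit above can be sketched as a parser that accepts a field only when it is a single letter followed by decimal digits. This is a simplified sketch under stated assumptions: the `Key` record, the helper names, and the accepted field letters ("smSC") are chosen for illustration and are not taken from the real parser.

```haskell
import Data.Char (isDigit)
import Data.List (isPrefixOf)

-- Hypothetical, simplified key: backend name, numeric fields, and a name.
data Key = Key
    { keyBackend :: String
    , keyFields :: [(Char, Integer)]
    , keyName :: String
    } deriving (Eq, Show)

-- Split "BACKEND-s100-m5--NAME" at the first "--".
splitAtDoubleDash :: String -> Maybe (String, String)
splitAtDoubleDash = go []
  where
    go acc rest
        | "--" `isPrefixOf` rest = Just (reverse acc, drop 2 rest)
        | otherwise = case rest of
            [] -> Nothing
            (c:cs) -> go (c:acc) cs

-- Split a string on a delimiter character.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
    (a, []) -> [a]
    (a, _:rest) -> a : splitOn d rest

-- A field is one known letter followed by one or more digits, nothing else.
parseField :: String -> Maybe (Char, Integer)
parseField (c:ds)
    | c `elem` "smSC" && not (null ds) && all isDigit ds = Just (c, read ds)
parseField _ = Nothing

-- Fails as a whole if any single field is not strictly numeric.
parseKey :: String -> Maybe Key
parseKey s = do
    (meta, name) <- splitAtDoubleDash s
    case splitOn '-' meta of
        (backend:fields)
            | not (null backend) && not (null name) ->
                Key backend <$> mapM parseField fields <*> pure name
        _ -> Nothing
```

Under this sketch, `parseKey "SHA256-s100--abc"` succeeds, while a key such as `SHA256-sfoo--whatever.good` is rejected because `sfoo` is not a letter followed by digits, so the field cannot carry arbitrary attacker-chosen data.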
Joey Hess
0dec2257f0
Merge branch 'master' of ssh://git-annex.branchable.com 2017-02-23 19:08:03 -04:00
Joey Hess
5a88cab005
add para 2017-02-23 19:06:06 -04:00
unicell@9c0b0afd4176d5933d4b5c41350ebe61488c1df0
342e256bc5 Added a comment 2017-02-23 23:05:10 +00:00
Joey Hess
3afc7d83f2
noCommit for PostReceive
This was noticed because it broke the datalad test suite, which pushed
to the remote and then fetched to check if it had received the expected
branches. Auto-init caused the git-annex branch on the remote to
diverge, breaking that test.

https://github.com/datalad/datalad/issues/1319#issuecomment-281649518

The auto-init still happens, it's staged in the journal, and will be
committed by some later git-annex command when it runs. Which is fine,
it's the same as that later command doing the auto-init.

This commit was supported by the NSF-funded DataLad project
2017-02-23 18:37:02 -04:00
Joey Hess
9bee19ed38
slight correction 2017-02-23 17:11:46 -04:00
Joey Hess
aa8ab352f2
Merge branch 'master' of ssh://git-annex.branchable.com 2017-02-23 16:44:07 -04:00
Joey Hess
aae9e15a97
devblog 2017-02-23 16:43:15 -04:00
benjamin.poldrack@d09ccff6d42dd20277610b59867cf7462927b8e3
9f9d7ae029 Added a comment 2017-02-22 16:48:04 +00:00