Merge branch 'encryption'

This commit is contained in:
Joey Hess 2013-09-05 00:09:11 -04:00
commit fc7b5cfe7d
30 changed files with 448 additions and 242 deletions

View file

@ -29,3 +29,6 @@ git-annex: user error (gpg ["--quiet","--trust-model","always","--encrypt","--no
failed
git-annex: enableremote: 1 failed
"""]]
> [[done]]; can now use: `git annex enableremote foo keyid-=REVOKEDKEY
> keyid+=NEWKEY` to remove it, and add a new key. --[[Joey]]

View file

@ -103,14 +103,17 @@ use the special remote.
## risks
A risk of this scheme is that, once the symmetric cipher has been obtained, it
allows full access to all the encrypted content. This scheme does not allow
revoking a given gpg key access to the cipher, since anyone with such a key
could have already decrypted the cipher and stored a copy.
A risk of this scheme is that, once the symmetric cipher has been
obtained, it allows full access to all the encrypted content. Indeed
anyone owning a key that used to be granted access could already have
decrypted the cipher and stored a copy. While it is possible to
remove a key with `keyid-=`, it is designed for a
[[completely_different_purpose|/encryption]] and does not actually revoke
access.
If git-annex stores the decrypted symmetric cipher in memory, then there
is a risk that it could be intercepted from there by an attacker. Gpg
amelorates these type of risks by using locked memory. For git-annex, note
ameliorates these types of risks by using locked memory. For git-annex, note
that an attacker with local machine access can tell at least all the
filenames and metadata of files stored in the encrypted remote anyway,
and can access whatever content is stored locally.

View file

@ -6,20 +6,90 @@ Encryption is needed when using [[special_remotes]] like Amazon S3, where
file content is sent to an untrusted party who does not have access to the
git repository.
Such an encrypted remote uses strong GPG encryption on the contents of files,
as well as HMAC hashing of the filenames. The size of the encrypted files,
and access patterns of the data, should be the only clues to what is
stored in such a remote.
Such an encrypted remote uses strong ([[symmetric|design/encryption]] or
asymmetric) encryption on the contents of files, as well as HMAC hashing
of the filenames. The size of the encrypted files, and access patterns
of the data, should be the only clues to what is stored in such a
remote.
You should decide whether to use encryption with a special remote before
any data is stored in it. So, `git annex initremote` requires you
to specify "encryption=none" when first setting up a remote in order
to disable encryption.
to disable encryption. To use encryption, you
run `git-annex initremote` in one of these ways:
If you want to use encryption, run `git annex initremote` with
"encryption=USERID". The value will be passed to `gpg` to find encryption keys.
Typically, you will say "encryption=2512E3C7" to use a specific gpg key.
Or, you might say "encryption=joey@kitenet.net" to search for matching keys.
* `git annex initremote newremote type=... encryption=hybrid keyid=KEYID ...`
* `git annex initremote newremote type=... encryption=shared`
* `git annex initremote newremote type=... encryption=pubkey keyid=KEYID ...`
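The three invocations listed above differ only in the `encryption=` value and in whether a `keyid=` is given. As a rough illustration, here is a minimal Python sketch that assembles such a command line. The helper name `initremote_args` and its signature are hypothetical and not part of git-annex; only parameter spellings that appear in this documentation are used, and type-specific parameters are passed through untouched.

```python
# Hypothetical helper (not part of git-annex): builds the argument list for
# `git annex initremote` using only parameters named in the documentation.

SCHEMES_WITH_KEYID = ("hybrid", "pubkey")
SCHEMES_WITHOUT_KEYID = ("none", "shared")


def initremote_args(name, remote_type, scheme, keyid=None, *extra):
    """Return the argv list for one of the documented encryption setups."""
    if scheme not in SCHEMES_WITH_KEYID + SCHEMES_WITHOUT_KEYID:
        raise ValueError(f"unknown encryption scheme: {scheme}")
    if scheme in SCHEMES_WITH_KEYID and keyid is None:
        raise ValueError(f"encryption={scheme} needs a keyid")
    if scheme in SCHEMES_WITHOUT_KEYID and keyid is not None:
        raise ValueError(f"encryption={scheme} does not take a keyid")

    args = ["git", "annex", "initremote", name,
            f"type={remote_type}", f"encryption={scheme}"]
    if keyid is not None:
        args.append(f"keyid={keyid}")
    # Type-specific parameters (for example datacenter=EU) are appended as-is.
    args.extend(extra)
    return args


if __name__ == "__main__":
    print(" ".join(initremote_args("mys3", "S3", "hybrid",
                                   "me@example.com", "datacenter=EU")))
```

The resulting list could be handed to `subprocess.run` if desired; the sketch only prints it, so it makes no changes to any repository.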
## hybrid encryption keys
The [[hybrid_key_design|design/encryption]] allows additional
encryption keys to be added on to a special remote later. Due to this
flexibility, it is the default and recommended encryption scheme.
git annex initremote newremote type=... [encryption=hybrid] keyid=KEYID ...
Here the KEYID(s) are passed to `gpg` to find encryption keys.
Typically, you will say "keyid=2512E3C7" to use a specific gpg key.
Or, you might say "keyid=joey@kitenet.net" to search for matching keys.
To add a new key and allow it to access all the content that is stored
in the encrypted special remote, just run `git annex
enableremote` specifying the new encryption key:
git annex enableremote myremote keyid+=788A3F4C
While a key can later be removed from the list, note that
doing so will **not** necessarily prevent the owner of the key
from accessing data on the remote (by design, this is impossible to prevent
short of deleting the remote). In fact, the only sound use of `keyid-=` is
probably to replace a revoked key:
git annex enableremote myremote keyid-=2512E3C7 keyid+=788A3F4C
See also [[encryption_design|design/encryption]] for other security
risks associated with encryption.
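To make the `keyid+=` and `keyid-=` semantics described in this section concrete, here is a small Python model of the key list bookkeeping. It is a sketch under stated assumptions, not git-annex's implementation: the function name `apply_keyid_params` is hypothetical, key ids are treated as plain strings, and no gpg lookup is performed. It only tracks who is on the list; it cannot model the fact that a removed key's owner may already hold a copy of the cipher.

```python
# Hypothetical model of keyid+= / keyid-= handling (assumption: key ids are
# compared as plain strings; real gpg key lookup is not modelled).

def apply_keyid_params(current, params):
    """Return the key list after applying keyid+=X / keyid-=X in order."""
    keys = list(current)
    for param in params:
        if param.startswith("keyid+="):
            key = param[len("keyid+="):]
            if key not in keys:  # adding an already listed key is a no-op
                keys.append(key)
        elif param.startswith("keyid-="):
            key = param[len("keyid-="):]
            keys = [k for k in keys if k != key]
        else:
            raise ValueError(f"not a keyid parameter: {param}")
    return keys


if __name__ == "__main__":
    # Mirrors the revoked-key replacement example above.
    print(apply_keyid_params(["2512E3C7"],
                             ["keyid-=2512E3C7", "keyid+=788A3F4C"]))
    # prints ['788A3F4C']
```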
## shared encryption key
Alternatively, you can configure git-annex to use a shared cipher to
encrypt data stored in a remote. This shared cipher is stored
**unencrypted** in the git repository, so it's shared among every
clone of the git repository.
git annex initremote newremote type=... encryption=shared
The advantage is you don't need to set up gpg keys. The disadvantage is
that this is **insecure** unless you trust every clone of the git
repository with access to the encrypted data stored in the special remote.
## regular public key encryption
This alternative simply encrypts the files in the special remotes to one or
more public keys. It might be considered more secure due to its simplicity
and since it's exactly the way everyone else uses gpg.
git annex initremote newremote type=... encryption=pubkey keyid=KEYID ...
A disadvantage is that it is not easy to later add additional public keys
to the special remote. While the `enableremote` parameters `keyid+=` and
`keyid-=` can be used, they have **no effect** on files that are already
present on the remote. Probably the only use for these parameters is
to replace a revoked key:
git annex enableremote myremote keyid-=2512E3C7 keyid+=788A3F4C
But even in this case, since the files are not re-encrypted, the revoked
key has to be kept around to be able to decrypt those files.
(Of course, if the reason for revocation is
that the key has been compromised, it is **insecure** to leave files
encrypted using that old key, and the user should re-encrypt everything.)
(Because filenames are MAC'ed, a cipher still needs to be
generated (and encrypted to the given key IDs).)
## MAC algorithm
The default MAC algorithm to be applied on the filenames is HMACSHA1. A
stronger one, for instance HMACSHA512, can be chosen upon creation
@ -27,29 +97,3 @@ of the special remote with the option `mac=HMACSHA512`. The available
MAC algorithms are HMACSHA1, HMACSHA224, HMACSHA256, HMACSHA384, and
HMACSHA512. Note that it is not possible to change algorithm for a
non-empty remote.
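The effect of the `mac=` option on stored names can be illustrated with Python's standard `hmac` module. This is only a sketch: the secret and the name below are made-up placeholders, and how git-annex actually derives the MAC secret and formats the resulting name is not shown here. It does suggest one plausible reason the algorithm cannot be changed on a non-empty remote, since the same name maps to a different value under each algorithm.

```python
import hashlib
import hmac

# The MAC algorithm names listed above, mapped to hashlib constructors.
MAC_ALGORITHMS = {
    "HMACSHA1": hashlib.sha1,
    "HMACSHA224": hashlib.sha224,
    "HMACSHA256": hashlib.sha256,
    "HMACSHA384": hashlib.sha384,
    "HMACSHA512": hashlib.sha512,
}


def mac_name(secret: bytes, name: str, mac: str = "HMACSHA1") -> str:
    """Return the hex HMAC of a name under the chosen algorithm."""
    digest = MAC_ALGORITHMS[mac]
    return hmac.new(secret, name.encode("utf-8"), digest).hexdigest()


if __name__ == "__main__":
    secret = b"placeholder-secret"  # assumption: stands in for the real secret
    name = "example-file-name"      # assumption: stands in for a stored name
    for mac in MAC_ALGORITHMS:
        print(mac, len(mac_name(secret, name, mac)), "hex characters")
```

Without the secret, the hashed name reveals nothing practical about the original name, which matches the goal that only file sizes and access patterns remain visible.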
The [[encryption_design|design/encryption]] allows additional encryption keys
to be added on to a special remote later. Once a key is added, it is able
to access content that has already been stored in the special remote.
To add a new key, just run `git annex enableremote` specifying the
new encryption key:
git annex enableremote myremote encryption=788A3F4C
Note that once a key has been given access to a remote, it's not
possible to revoke that access, short of deleting the remote. See
[[encryption_design|design/encryption]] for other security risks
associated with encryption.
## shared cipher mode
Alternatively, you can configure git-annex to use a shared cipher to
encrypt data stored in a remote. This shared cipher is stored,
**unencrypted** in the git repository. So it's shared amoung every
clone of the git repository. The advantage is you don't need to set up gpg
keys. The disadvantage is that this is **insecure** unless you
trust every clone of the git repository with access to the encrypted data
stored in the special remote.
To use shared encryption, specify "encryption=shared" when first setting
up a special remote.

View file

@ -307,19 +307,30 @@ subdirectories).
types of special remotes need different configuration values. The
command will prompt for parameters as needed.
All special remotes support encryption. You must either specify
encryption=none to disable encryption, or use encryption=keyid
(or encryption=emailaddress) to specify a gpg key that can access
the encrypted special remote.
All special remotes support encryption. You can either specify
`encryption=none` to disable encryption, or specify
`encryption=hybrid keyid=$keyid ...` to specify a gpg key id (or an email
address associated with a key).
Note that with encryption enabled, a gpg key is created. This requires
sufficient entropy. If initremote seems to hang or take a long time
while generating the key, you may want to ctrl-c it and re-run with --fast,
which causes it to use a lower-quality source of randomness.
There are actually three schemes that can be used for management of the
encryption keys. When using the encryption=hybrid scheme, additional
gpg keys can be given access to the encrypted special remote easily
(without re-encrypting everything). When using encryption=shared,
a shared key is generated and stored in the git repository, allowing
anyone who can clone the git repository to access it. Finally, when using
encryption=pubkey, content in the special remote is directly encrypted
to the specified gpg keys, and additional ones cannot easily be given
access.
Note that with encryption enabled, a cryptographic key is created.
This requires sufficient entropy. If initremote seems to hang or take
a long time while generating the key, you may want to ctrl-c it and
re-run with --fast, which causes it to use a lower-quality source of
randomness.
Example Amazon S3 remote:
git annex initremote mys3 type=S3 encryption=me@example.com datacenter=EU
git annex initremote mys3 type=S3 encryption=hybrid keyid=me@example.com datacenter=EU
* enableremote name [param=value ...]
@ -335,11 +346,28 @@ subdirectories).
For example, the directory special remote requires a directory= parameter.
This command can also be used to modify the configuration of an existing
special remote, by specifying new values for parameters that were originally
set when using initremote. For example, to add a new gpg key to the keys
that can access an encrypted remote:
special remote, by specifying new values for parameters that were
originally set when using initremote. (However, some settings such as
the encryption scheme cannot be changed once a special remote
has been created.)
git annex enableremote mys3 encryption=friend@example.com
The gpg keys that an encrypted special remote is encrypted to can be
changed using the keyid+= and keyid-= parameters. These respectively
add and remove keys from the list. However, note that removing a key
does NOT necessarily prevent the key's owner from accessing data
in the encrypted special remote
(which is by design impossible, short of deleting the remote).
One use-case of keyid-= is to replace a revoked key with
a new key:
git annex enableremote mys3 keyid-=revokedkey keyid+=newkey
Also, note that for encrypted special remotes using plain public-key
encryption (encryption=pubkey), adding or removing a key has NO effect
on files that have already been copied to the remote. Hence
keyid+= and keyid-= should be used with care with such remotes, and
make little sense except in cases like the revoked key example above.
* trust [repository ...]

View file

@ -15,13 +15,10 @@ can read inside the local git repository.
A number of parameters can be passed to `git annex initremote` to configure
the S3 remote.
* `encryption` - Required. Either "none" to disable encryption (not recommended),
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to access the encrypted data (use with caution).
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
* `embedcreds` - Optional. Set to "yes" to embed the login credentials inside
the git repository, which allows other clones to also access them. This is

View file

@ -19,14 +19,10 @@ for example; or clone bup's git repository to further back it up.
These parameters can be passed to `git annex initremote` to configure bup:
* `encryption` - Required. Either "none" to disable encryption of content
stored in bup (ssh will still be used to transport it securely),
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to access the encrypted data (use with caution).
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
* `buprepo` - Required. This is passed to `bup` as the `--remote`
to use to store data. To create the repository, `bup init` will be run.

View file

@ -14,13 +14,10 @@ Instead, you should use a regular `git clone` of your git-annex repository.
These parameters can be passed to `git annex initremote` to configure the
remote:
* `encryption` - Required. Either "none" to disable encryption,
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to decrypt the encrypted data.
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
* `chunksize` - Avoid storing files larger than the specified size in the
directory. For use on directories on mount points that have file size

View file

@ -21,13 +21,10 @@ can read inside the local git repository.
A number of parameters can be passed to `git annex initremote` to configure
the Glacier remote.
* `encryption` - Required. Either "none" to disable encryption (not recommended),
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to access the encrypted data (use with caution).
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
* `embedcreds` - Optional. Set to "yes" to embed the login credentials inside
the git repository, which allows other clones to also access them. This is

View file

@ -25,17 +25,14 @@ Can you spot the potential data loss bugs in the above simple example?
These parameters can be passed to `git annex initremote`:
* `encryption` - Required. Either "none" to disable encryption,
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to access the encrypted data.
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `hooktype` - Required. This specifies a collection of hooks to use for
this remote.
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
## hooks
Each type of hook remote is specified by a collection of hook commands.

View file

@ -2,26 +2,22 @@ This special remote type rsyncs file contents to somewhere else.
Setup example:
# git annex initremote myrsync type=rsync rsyncurl=rsync://rsync.example.com/myrsync encryption=joey@kitenet.net
# git annex initremote myrsync type=rsync rsyncurl=rsync://rsync.example.com/myrsync keyid=joey@kitenet.net
# git annex describe myrsync "rsync server"
Or, to use rsync over SSH:
# git annex initremote myrsync type=rsync rsyncurl=ssh.example.com:/myrsync encryption=joey@kitenet.net
# git annex initremote myrsync type=rsync rsyncurl=ssh.example.com:/myrsync keyid=joey@kitenet.net
# git annex describe myrsync "rsync server"
## configuration
These parameters can be passed to `git annex initremote` to configure rsync:
* `encryption` - Required. Either "none" to disable encryption of content
stored on the remote rsync server,
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to decrypt the encrypted data.
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
* `rsyncurl` - Required. This is the url or `hostname:/directory` to
pass to rsync to tell it where to store content.

View file

@ -10,13 +10,10 @@ can read inside the local git repository.
A number of parameters can be passed to `git annex initremote` to configure
the webdav remote.
* `encryption` - Required. Either "none" to disable encryption (not recommended),
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote, or "shared" which allows
every clone of the repository to access the encrypted data (use with caution).
* `encryption` - One of "none", "hybrid", "shared", or "pubkey".
See [[encryption]].
Note that additional gpg keys can be given access to a remote by
running enableremote with the new key id. See [[encryption]].
* `keyid` - Specifies the gpg key to use for [[encryption]].
* `embedcreds` - Optional. Set to "yes" to embed the login credentials inside
the git repository, which allows other clones to also access them. This is
@ -42,4 +39,4 @@ the webdav remote.
Setup example:
# WEBDAV_USERNAME=joey@kitenet.net WEBDAV_PASSWORD=xxxxxxx git annex initremote box.com type=webdav url=https://www.box.com/dav/git-annex chunksize=75mb encryption=joey@kitenet.net
# WEBDAV_USERNAME=joey@kitenet.net WEBDAV_PASSWORD=xxxxxxx git annex initremote box.com type=webdav url=https://www.box.com/dav/git-annex chunksize=75mb keyid=joey@kitenet.net

View file

@ -16,7 +16,7 @@ like "2512E3C7"
Next, create the Glacier remote.
# git annex initremote glacier type=glacier encryption=2512E3C7
# git annex initremote glacier type=glacier keyid=2512E3C7
initremote glacier (encryption setup with gpg key C910D9222512E3C7) (gpg) ok
The configuration for the Glacier remote is stored in git. So to make another

View file

@ -14,7 +14,7 @@ like "2512E3C7"
Next, create the S3 remote, and describe it.
# git annex initremote cloud type=S3 encryption=2512E3C7
# git annex initremote cloud type=S3 keyid=2512E3C7
initremote cloud (encryption setup with gpg key C910D9222512E3C7) (checking bucket) (creating bucket in US) (gpg) ok
# git annex describe cloud "at Amazon's US datacenter"
describe cloud ok

View file

@ -5,7 +5,7 @@ for providing 50 gb of free storage if you sign up with its Android client.
git-annex can use Box as a [[special remote|special_remotes]].
Recent versions of git-annex make this very easy to set up:
WEBDAV_USERNAME=you@example.com WEBDAV_PASSWORD=xxxxxxx git annex initremote box.com type=webdav url=https://www.box.com/dav/git-annex chunksize=75mb encryption=you@example.com
WEBDAV_USERNAME=you@example.com WEBDAV_PASSWORD=xxxxxxx git annex initremote box.com type=webdav url=https://www.box.com/dav/git-annex chunksize=75mb encryption=shared
Note the use of chunksize; Box has a 100 mb maximum file size, and this
breaks up large files into chunks before that limit is reached.