document encryption

Joey Hess 2011-04-16 19:30:31 -04:00
parent 1247bfeaa7
commit d2e74efdb2
6 changed files with 53 additions and 67 deletions


@ -1,15 +1,5 @@
git-annex mostly does not use encryption. Anyone with access to a git
repository can see all the filenames in it, its history, and can access
any annexed file contents.
Encryption is needed when using [[special_remotes]] like Amazon S3, where
file content is sent to an untrusted party who does not have access to the
git repository.
Such an encrypted remote uses strong encryption on the contents of files,
as well as the filenames. The size of the encrypted files, and access
patterns of the data, should be the only clues to what is stored in
such a remote.
This was the design doc for [[encryption]] and is preserved for
the curious.
[[!toc]]
@ -20,29 +10,6 @@ should be a way to tell what backend is responsible for a given filename
in an encrypted remote. (And since special remotes can also store files
unencrypted, differentiate from those as well.)
At a high level, an encryption backend needs to support these operations:
* Create a new encrypted cipher, or update the cipher. Some input
parameters will specify things like the gpg public keys that
can access the cipher.
* Initialize an instance of the encryption backend, that will use a
specified encrypted cipher.
* Given a key/value backend key, produce and return an encrypted key.
The same naming scheme git-annex uses for keys in regular key/value
[[backends]] can be used. So a filename for a key might be
"GPG-s12345--armoureddatahere"
* Given a streaming source of file content, encrypt it, and send it in
a stream to an action that consumes the encrypted content.
* Given a streaming source of encrypted content, decrypt it, and send
it in a stream to an action that consumes the decrypted content.
* Clean up.
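The operations listed above can be sketched as an interface. This is only an illustration in Python, not git-annex's actual code: the names `EncryptionBackend` and `ToyXorBackend` are hypothetical, streams are modelled as iterables of byte chunks rather than actions that consume content, and the toy backend uses XOR purely so the example runs without gpg. XOR is not encryption.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Iterator


class EncryptionBackend(ABC):
    """Hypothetical interface mirroring the operations in the list above."""

    @abstractmethod
    def encrypt_key(self, key: str) -> str:
        """Given a key/value backend key, return an encrypted key."""

    @abstractmethod
    def encrypt_stream(self, source: Iterable[bytes]) -> Iterator[bytes]:
        """Consume plaintext chunks and yield encrypted chunks."""

    @abstractmethod
    def decrypt_stream(self, source: Iterable[bytes]) -> Iterator[bytes]:
        """Consume encrypted chunks and yield plaintext chunks."""

    def close(self) -> None:
        """Clean up; nothing to do by default."""


class ToyXorBackend(EncryptionBackend):
    """Stand-in backend for illustration only. XOR is NOT secure."""

    def __init__(self, cipher: bytes) -> None:
        if not cipher:
            raise ValueError("cipher must not be empty")
        # Assumption: use the first cipher byte as a one-byte pad.
        self._pad = cipher[0]

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self._pad for b in data)

    def encrypt_key(self, key: str) -> str:
        # "TOY--" is a made-up prefix identifying this backend.
        return "TOY--" + self._xor(key.encode("utf-8")).hex()

    def encrypt_stream(self, source: Iterable[bytes]) -> Iterator[bytes]:
        for chunk in source:
            yield self._xor(chunk)

    def decrypt_stream(self, source: Iterable[bytes]) -> Iterator[bytes]:
        # XOR with the same pad reverses itself.
        return self.encrypt_stream(source)
```

A real backend would replace the XOR step with calls to gpg and would manage the encrypted cipher described in the first two list items.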
The rest of this page will describe a single encryption backend using GPG.
Probably only one will be needed, but who knows? Maybe that backend will
turn out badly designed, or some other encryptor needed. Designing

doc/encryption.mdwn Normal file

@ -0,0 +1,35 @@
git-annex mostly does not use encryption. Anyone with access to a git
repository can see all the filenames in it, its history, and can access
any annexed file contents.
Encryption is needed when using [[special_remotes]] like Amazon S3, where
file content is sent to an untrusted party who does not have access to the
git repository.
Such an encrypted remote uses strong GPG encryption on the contents of files,
as well as HMAC hashing of the filenames. The size of the encrypted files,
and access patterns of the data, should be the only clues to what is
stored in such a remote.
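As a rough illustration of what HMAC hashing of a filename means, here is a minimal Python sketch. The function name `hashed_name`, the choice of SHA-1, the `HMAC--` prefix, and the secret are all assumptions made for the example; the text above does not say which hash or naming scheme git-annex uses.

```python
import hashlib
import hmac


def hashed_name(secret: bytes, name: str) -> str:
    """Return a keyed hash of `name` that hides the original filename.

    Assumptions for illustration: HMAC-SHA1 and a made-up "HMAC--" prefix.
    """
    digest = hmac.new(secret, name.encode("utf-8"), hashlib.sha1).hexdigest()
    return "HMAC--" + digest


if __name__ == "__main__":
    # The same secret and name always give the same hashed name, so content
    # can be found again, but the name cannot be read back without the secret.
    print(hashed_name(b"example secret", "SHA1-s12345--abcdef"))
```

Because the hash is keyed, someone who sees only the remote cannot confirm a guessed filename without also knowing the secret.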
You should decide whether to use encryption with a special remote before
any data is stored in it. So, `git annex initremote` requires you
to specify "encryption=none" when first setting up a remote in order
to disable encryption.
If you want to use encryption, run `git annex initremote` with
"encryption=USERID". The value will be passed to `gpg` to find encryption keys.
Typically, you will say "encryption=2512E3C7" to use a specific gpg key.
Or, you might say "encryption=joey@kitenet.net" to search for matching keys.
The [[encryption_design|design/encryption]] allows additional encryption keys
to be added on to a special remote later. Once a key is added, it is able
to access content that has already been stored in the special remote.
To add a new key, just run `git annex initremote` again, specifying the
new encryption key:
git annex initremote myremote encryption=788A3F4C
Note that once a key has been given access to a remote, it's not
possible to revoke that access, short of deleting the remote. See
[[encryption_design|design/encryption]] for other security risks
associated with encryption.
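If you script remote setup, the `initremote` invocations described above can be assembled programmatically. The following is a small sketch, not part of git-annex: the helper name `initremote_args` is hypothetical, and it only builds the argument list in the `name=value` form shown in the commands above.

```python
from typing import List


def initremote_args(remote: str, **params: str) -> List[str]:
    """Build the argv for `git annex initremote REMOTE name=value ...`.

    Hypothetical helper; parameters are emitted in the order given.
    """
    if not remote:
        raise ValueError("remote name must not be empty")
    return ["git", "annex", "initremote", remote] + [
        f"{name}={value}" for name, value in params.items()
    ]


if __name__ == "__main__":
    # First setup with one gpg key, then give a second key access later.
    print(initremote_args("myremote", type="S3", encryption="2512E3C7"))
    print(initremote_args("myremote", encryption="788A3F4C"))
```

The resulting list can be passed to `subprocess.run(args, check=True)` inside a git-annex repository.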


@ -9,11 +9,12 @@ See [[walkthrough/using_Amazon_S3]] for usage examples.
A number of parameters can be passed to `git annex initremote` to configure
the S3 remote.
* `encryption` - Required. Either "none" to disable encryption,
* `encryption` - Required. Either "none" to disable encryption
(not recommended),
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote. Note that additional gpg
keys can be given access to a remote by rerunning initremote with
the new key id.
the new key id. See [[encryption]].
* `datacenter` - Defaults to "US". Other values include "EU",
"us-west-1", and "ap-southeast-1".
@ -28,13 +29,3 @@ the S3 remote.
* `bucket` - S3 requires that buckets have a globally unique name,
so by default, a bucket name is chosen based on the remote name
and UUID. This can be specified to pick a bucket name.
## data security
When encryption=none, there is **no** protection against your data being read
as it is sent to/from S3, or by Amazon when it is stored in S3. This should
only be used for public data.
** Encryption is not yet supported. **
See [[design/encryption]].


@ -15,11 +15,12 @@ for example; or clone bup's git repository to further back it up.
These parameters can be passed to `git annex initremote` to configure bup:
* `encryption` - Required. Either "none" to disable encryption,
* `encryption` - Required. Either "none" to disable encryption of content
stored in bup (ssh will still be used to transport it securely),
or a value that can be looked up (using gpg -k) to find a gpg encryption
key that will be given access to the remote. Note that additional gpg
keys can be given access to a remote by rerunning initremote with
the new key id.
the new key id. See [[encryption]].
* `buprepo` - Required. This is passed to `bup` as the `--remote`
to use to store data. To create the repository, `bup init` will be run.
@ -34,13 +35,3 @@ can be used to, for example, limit its bandwidth.
[[git-annex-shell]] does not support bup, due to the wacky way that bup
starts its server. So, to use bup, you need full shell access to the server.
## data security
When encryption=none, there is **no** protection against your data being read
by anyone who can access the bup remote. However, bup does transfer data
using ssh, and if you trust the security of the remote, that's fine.
** Encryption is not yet supported. **
See [[design/encryption]].


@ -1,8 +1,8 @@
This special remote type stores file contents in a directory.
One use case for this would be if you have a removable drive, that you
cannot put a git repository on for some reason, and you want to use it
to sneakernet files between systems. Just set up both systems to use
One use case for this would be if you have a removable drive that
you want to use to sneakernet files between systems (possibly with
[[encrypted|encryption]] contents). Just set up both systems to use
the drive's mountpoint as a directory remote.
Setup example:


@ -2,17 +2,19 @@ git-annex extends git's usual remotes with some [[special_remotes]], that
are not git repositories. This way you can set up a remote using say,
Amazon S3, and use git-annex to transfer files into the cloud.
**Note that encrypted buckets are not (yet) supported. Without encryption,
data sent to S3 is susceptible to snooping.**
First, export your S3 credentials:
# export ANNEX_S3_ACCESS_KEY_ID="08TJMT99S3511WOZEP91"
# export ANNEX_S3_SECRET_ACCESS_KEY="s3kr1t"
Next, create the S3 remote, and describe it.
Now, create a gpg key, if you don't already have one. This will be used
to encrypt everything stored in S3, for your privacy. Once you have
a gpg key, run `gpg --list-secret-keys` to look up its key id, something
like "2512E3C7".
# git annex initremote mys3 type=S3 encryption=none
Next, create the S3 remote, and describe it.
# git annex initremote mys3 type=S3 encryption=2512E3C7
initremote mys3 (checking bucket) (creating bucket in US) ok
# git annex describe mys3 "at Amazon's US datacenter"
describe mys3 ok