S3 updates; gpg keys

Joey Hess 2011-03-28 13:47:29 -04:00
parent c5fc4f3d2a
commit 3162a724f1
5 changed files with 63 additions and 18 deletions

2
debian/changelog vendored

@@ -1,6 +1,8 @@
git-annex (0.20110329) UNRELEASED; urgency=low
* Amazon S3 is now supported as a special type of remote.
Warning: Encrypting data before sending it to S3 is not currently
supported.
-- Joey Hess <joeyh@debian.org> Sat, 26 Mar 2011 14:36:16 -0400


@@ -132,20 +132,22 @@ Many git-annex commands will stage changes for later `git commit` by you.
by uuid. To change the description of the current repository, use
"."
* s3bucket name description [datacenter host port] [--key=gpgkey]
* s3bucket name gpgkey [datacenter host port]
Creates a bucket in Amazon S3. The bucket's name can be used
to configure git remote using the bucket.
Creates or updates the key of a bucket in Amazon S3. The bucket's
name can be used to configure a git remote using the bucket.
The gpgkey is a value that can be looked up (using gpg -k) to
find a gpg encryption key that will be given access to the bucket.
To disable encryption, specify "unencrypted". Note that additional gpg
keys can be given access to a bucket by running s3bucket on an existing
bucket, with a new key.
The datacenter defaults to "US". Other values include "EU",
"us-west-1", and "ap-southeast-1".
To use a different, S3-compatible service, specify a host and port.
By default, data (including filenames) is encrypted using gpg.
To use a key other than the default gpg key, specify it with
the --key option. To disable encryption, specify "none".
* fsck [path ...]
With no parameters, this command checks the whole annex for consistency,


@@ -6,8 +6,4 @@ But, git-annex also extends git's concept of remotes, with these special
types of remotes. These can be used just like any normal remote by git-annex.
They cannot be used by other git commands though.
## Amazon S3
Stores file contents in a bucket in Amazon S3 or a similar service.
Content is stored encrypted by gpg.
See [[walkthrough/using_Amazon_S3]] for examples.
* [[Amazon_S3]]


@@ -0,0 +1,45 @@
This special remote type stores file contents in a bucket in Amazon S3
or a similar service.
See [[walkthrough/using_Amazon_S3]] for usage examples.
## bucket names
When `git annex s3bucket` is used to create a new bucket, it generates a
UUID, and the name of the bucket includes that UUID, as well as the name
specified by the user. This makes for some unwieldy bucket names, but
since S3 requires that bucket names be globally unique, it avoids needing
to hunt for an unused bucket name.
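
As a rough illustration of the naming idea above, here is a minimal Python sketch. The `bucket_name` helper and the `<name>-<uuid>` layout are assumptions made for this example; the text only says that the bucket name includes both the user-specified name and a generated UUID, not how they are combined.

```python
import uuid


def bucket_name(user_name: str) -> str:
    """Combine a user-chosen name with a fresh UUID.

    Hypothetical layout "<name>-<uuid>"; the real format may differ.
    """
    return f"{user_name}-{uuid.uuid4()}"


# Each call yields a different, practically globally unique name.
name = bucket_name("mybucket")
print(name)
```

Because a random UUID is effectively unique, no lookup against S3 is needed to find a free name, at the cost of a long bucket name.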
## data security
When `git annex s3bucket` is used to create an unencrypted bucket,
there is **no** protection against your data being read as it is sent
to/from S3, or by Amazon when it is stored in S3. This should only be used
for public data.
**Encryption is not yet supported.**
When an encrypted bucket is created, all files stored in the bucket are
encrypted with gpg. Additionally, the filenames themselves are hashed
to obfuscate them. The size of the encrypted files, and access patterns of
the data, should be the only clues to what type of data you are storing in
S3.
[[!template id=note text="""
This scheme was originally developed by Lars Wirzenius et al. [for Obnam](http://braawi.org/obnam/encryption/).
"""]]
The data stored in S3 is encrypted by gpg with a symmetric cipher. The
passphrase of the cipher is itself checked into your git repository,
encrypted using one or more gpg public keys. This scheme allows new public
keys to be given access to a bucket's content, after the bucket is created
and is in use. It also allows revoking compromised public keys without
having to throw out the contents of the bucket. The symmetric cipher
is also hashed together with filenames used in the bucket, to obfuscate
the filenames.
To add a new gpg key to an existing bucket, just re-run `git annex
s3bucket`, specifying the new key id. For example:
# git annex s3bucket mybucket 16D0B8EF
s3bucket (adding gpg key 16D0B8EF) ok
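
The filename obfuscation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `new_cipher` and `obfuscated_name`, the 64-byte passphrase length, and the choice of HMAC-SHA256 are not taken from git-annex. The text says only that the symmetric cipher is hashed together with the filenames. Encrypting the passphrase to one or more gpg public keys is left out of the sketch.

```python
import hashlib
import hmac
import secrets


def new_cipher(nbytes: int = 64) -> bytes:
    """Generate a random symmetric passphrase (hypothetical helper)."""
    return secrets.token_bytes(nbytes)


def obfuscated_name(cipher: bytes, filename: str) -> str:
    """Hash the cipher together with a filename.

    HMAC-SHA256 is an assumption; the result is an opaque name that
    cannot be mapped back to the filename without the cipher.
    """
    return hmac.new(cipher, filename.encode("utf-8"), hashlib.sha256).hexdigest()


cipher = new_cipher()
name = obfuscated_name(cipher, "video/hackity_hack_and_kaxxt.mov")
print(name)
```

Since the passphrase itself never changes when a new gpg key is added, names derived this way stay stable; only the encrypted copy of the passphrase stored in the git repository would need to be rewritten.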


@@ -9,8 +9,11 @@ First, export your S3 credentials:
Next, create a bucket, giving it a name and a description:
git annex s3bucket mybucket "my Amazon S3 bucket"
s3bucket (creating mybucket...) ok
git annex s3bucket mybucket unencrypted
s3bucket (creating mybucket...) (no encryption!) ok
**Note that encrypted buckets are not (yet) supported. Data sent to S3
is susceptible to snooping.**
Finally, configure a git remote to use the bucket you created:
@@ -23,7 +26,4 @@ Now the remote can be used like any other remote.
# git annex move video/hackity_hack_and_kaxxt.mov --to mys3
move video/hackity_hack_and_kaxxt.mov (to mys3...) ok
An Amazon S3 remote works just like a ssh remote, except it does not have
a git repository at the other end, and it costs you money. :) In particular,
all data is stored encrypted with gpg, so neither Amazon nor anyone in
between can see it.
See [[special_remotes/Amazon_S3]] for details.