From 58af57493418d80eb9fb3c4719f20442725aa7f8 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Mon, 28 Mar 2011 19:08:12 -0400
Subject: [PATCH] generalize special remote configuration storage

---
 doc/git-annex.mdwn                   | 39 ++++++---------------------
 doc/internals.mdwn                   | 14 +++++-----
 doc/special_remotes/Amazon_S3.mdwn   | 40 ++++++++++++++++------------
 doc/walkthrough/using_Amazon_S3.mdwn | 10 +++----
 4 files changed, 40 insertions(+), 63 deletions(-)

diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 3985addc6c..ce5b380d0e 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -132,21 +132,17 @@ Many git-annex commands will stage changes for later `git commit` by you.
  by uuid. To change the description of the current repository, use
  "."

-* s3bucket name gpgkey [datacenter host port]
+* initremote name [type=value param=value ...]

- Create or updates the key of a bucket in Amazon S3. The bucket's
- name can be used to configure git remote using the bucket.
+ Sets up a [[special_remote|special_remotes]] of some type. The remote's
+ type and configuration is specified by the parameters. If a remote
+ with the specified name has already been configured, its configuration
+ is modified by any values specified. In either case, the remote will be
+ added added to `.git/config`.

- The gpgkey is a value that can be looked up (using gpg -k) to
- find a gpg encryption key that will be given access to the bucket.
- To disable encryption, specify "unencrypted". Note that additional gpg
- keys can be given access to a bucket by running s3bucket on an existing
- bucket, with a new key.
+ Example Amazon S3 remote:

- The datacenter defaults to "US". Other values include "EU",
- "us-west-1", and "ap-southeast-1".
-
- To use a different, S3-compatable service, specify a host and port.
+ initremote mys3 type=S3 encryption=none datacenter=EU

 * fsck [path ...]

@@ -403,25 +399,6 @@ Here are all the supported configuration settings.
  Default ssh and rsync options to use if a remote does not have
  specific options.

-* `remote..annex-s3-access-key-id`
-
- Your S3 Access Key ID. Does not need to be kept private.
- If not set, the environment variable `AWS_ACCESS_KEY_ID`
- will be used.
-
-* `remote..annex-s3-secret-access-key`
-
- Your S3 Secret Access Key. This is a password.
- If not set, the environment variable `AWS_SECRET_ACCESS_KEY`
- will be used.
-
-* `remote..annex-s3-storageclass`
-
- Storage class to use when adding new content to S3. The default
- is "STANDARD". If you have configured git-annex to preserve
- multiple [[copies]], consider setting this to "REDUCED_REDUNDANCY"
- to save money.
-
 * `annex.diskreserve`

  Amount of disk space to reserve. Disk space is checked when transferring
diff --git a/doc/internals.mdwn b/doc/internals.mdwn
index 55b1045a11..6296095035 100644
--- a/doc/internals.mdwn
+++ b/doc/internals.mdwn
@@ -30,16 +30,14 @@ space and then the description through to the end of the line. Example:
  e605dca6-446a-11e0-8b2a-002170d25c55 laptop
  26339d22-446b-11e0-9101-002170d25c55 usb disk

-## `git-annex/s3.log`
+## `git-annex/remotes.log`

-Associates the UUIDs of Amazon S3 buckets with a bucket nickname and connection
-information. Example:
+Holds persistent configuration settings for [[special_remotes]] such as
+Amazon S3.

- be72acb8-5901-11e0-b600-002170d25c55 mybucket s3.amazonaws.com 80
-
-Note that the actual bucket name used on S3 in the above example
-is "mybucket-be72acb8-5901-11e0-b600-002170d25c55". The UUID is included
-in the bucket name to ensure it is globally unique.
+The file format is one line per remote, starting with the uuid of the
+remote, followed by a space, the name of the remote, a space, and then
+a series of key=value pairs, each separated by whitespace.

 ## `.git-annex/trust.log`

diff --git a/doc/special_remotes/Amazon_S3.mdwn b/doc/special_remotes/Amazon_S3.mdwn
index ae3990a76e..42c4a54534 100644
--- a/doc/special_remotes/Amazon_S3.mdwn
+++ b/doc/special_remotes/Amazon_S3.mdwn
@@ -3,24 +3,36 @@ or a similar service.

 See [[walkthrough/using_Amazon_S3]] for usage examples.

-## bucket names
+## initremote parameters

-When `git annex s3bucket` is used to create a new bucket, it generates a
-UUID, and the name of the bucket includes that UUID, as well as the name
-specified by the user. This makes for some unweidly bucket names, but
-since S3 requires that bucket names be globally unique, it avoids needing
-to hunt for a unused bucket name.
+A number of parameters can be passed to `git annex initremote` to configure
+the S3 remote.
+
+* `encryption` - Either "none" to disable encryption,
+ or a value that can be looked up (using gpg -k) to find a gpg encryption
+ key that will be given access to the remote. Note that additional gpg
+ keys can be given access to a remote by rerunning initremote with
+ the new key id.
+
+* `datacenter` - Defaults to "US". Other values include "EU",
+ "us-west-1", and "ap-southeast-1".
+
+* `storageclass` - Default is "STANDARD". If you have configured git-annex
+ to preserve multiple [[copies]], consider setting this to "REDUCED_REDUNDANCY"
+ to save money.
+
+* `host` and `port` - Specify in order to use a different, S3 compatable
+ service.

 ## data security

-When `git annex s3bucket` is used to create an unencrypted bucket,
-there is **no** protection against your data being read as it is sent
-to/from S3, or by Amazon when it is stored in S3. This should only be used
-for public data.
+When encryption=none, there is **no** protection against your data being read
+as it is sent to/from S3, or by Amazon when it is stored in S3. This should
+only be used for public data.

 ** Encryption is not yet supported. **

-When an encrypted bucket is created, all files stored in the bucket are
+When encryption is enabled, all files stored in the bucket are
 encrypted with gpg. Additionally, the filenames themselves are hashed to
 obfuscate them. The size of the encrypted files, and access patterns of
 the data, should be the only clues to what type of data you are storing in
@@ -36,9 +48,3 @@ encrypted using one or more gpg public keys. This scheme allows new private
 keys to be given access to a bucket's content, after the bucket is created
 and is in use. The symmetric cipher is also hashed together with filenames
 used in the bucket, in order to obfuscate the filenames.
-
-To add a new gpg key to an existing bucket, just re-run `git annex
-s3bucket`, specifying the new key id. For example:
-
- # git annex s3bucket mybucket 16D0B8EF
- s3bucket (adding gpg key 16D0B8EF) ok
diff --git a/doc/walkthrough/using_Amazon_S3.mdwn b/doc/walkthrough/using_Amazon_S3.mdwn
index 2833a9c5a4..b87238a327 100644
--- a/doc/walkthrough/using_Amazon_S3.mdwn
+++ b/doc/walkthrough/using_Amazon_S3.mdwn
@@ -7,18 +7,14 @@ First, export your S3 credentials:
  export ANNEX_S3_ACCESS_KEY_ID="08TJMT99S3511WOZEP91"
  export ANNEX_S3_SECRET_ACCESS_KEY="s3kr1t"

-Next, create a bucket, giving it a name and a description:
+Next, create the remote.

- git annex s3bucket mybucket unencrypted
- s3bucket (creating mybucket...) (no encryption!) ok
+ git annex initremote mys3 encryption=none
+ initremote (creating bucket mys3-291d2fdc-5990-11e0-909a-002170d25c55...) ok

 **Note that encrypted buckets are not (yet) supported. Data sent to S3
 is susceptible to snooping.**

-Finally, configure a git remote to use the bucket you created:
-
- git config remote.mys3.annex-s3-bucket mybucket
-
 Now the remote can be used like any other remote.

  # git annex copy my_cool_big_file --to mys3
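
Note on the `git-annex/remotes.log` format described in the `doc/internals.mdwn` hunk: each line holds a uuid, a space, the remote name, a space, and then whitespace-separated key=value pairs. The Python sketch below shows one way such a line could be split apart. It is only an illustration, not git-annex's own parser; the function name `parse_remotes_log_line` and the sample line are hypothetical, and it assumes that neither the remote name nor any value contains whitespace.

```python
def parse_remotes_log_line(line):
    """Split one remotes.log-style line into (uuid, name, config dict).

    Hypothetical helper for illustration; assumes no whitespace inside
    the remote name or inside any value.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a uuid and a remote name")
    uuid, name, *pairs = fields
    config = {}
    for pair in pairs:
        # partition keeps any further '=' characters inside the value
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {pair!r}")
        config[key] = value
    return uuid, name, config


# Hypothetical sample line, built from the initremote example parameters.
sample = (
    "291d2fdc-5990-11e0-909a-002170d25c55 mys3 "
    "type=S3 encryption=none datacenter=EU"
)
uuid, name, config = parse_remotes_log_line(sample)
print(uuid, name, config)
```

Keeping the settings as generic key=value pairs is what lets the same log line describe any special remote type, rather than the fixed host and port columns of the old `s3.log`.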