public=yes config to send AclPublicRead

In my tests, this has to be set when uploading a file to the bucket;
the file can then be accessed using the bucketname.s3.amazonaws.com
URL.

Setting it when creating the bucket didn't seem to make the whole bucket
public, or allow accessing files stored in it. But I have gone ahead and
also sent it when creating the bucket, just in case that is needed in
some cases.
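
To illustrate the URL form described above, a hypothetical helper (the
`publicUrl` name and its String-based signature are invented for this
sketch; it is not part of the commit):

    -- Sketch: the public URL at which an uploaded object becomes readable.
    publicUrl :: String -> String -> String
    publicUrl bucketname key =
        "http://" ++ bucketname ++ ".s3.amazonaws.com/" ++ key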
Joey Hess 2015-06-05 14:38:01 -04:00
parent 5211b8fc70
commit 4acd28bf21
4 changed files with 50 additions and 25 deletions

Remote/S3.hs

@@ -308,12 +308,13 @@ genBucket c u = do
 		Right _ -> noop
 		Left _ -> do
 			showAction $ "creating bucket in " ++ datacenter
-			void $ sendS3Handle h $
-				S3.PutBucket (bucket info) Nothing $
-					mkLocationConstraint $
-						T.pack datacenter
+			void $ sendS3Handle h $ S3.PutBucket
+				(bucket info)
+				(acl info)
+				locconstraint
 	writeUUIDFile c u info h
   where
+	locconstraint = mkLocationConstraint $ T.pack datacenter
 	datacenter = fromJust $ M.lookup "datacenter" c

 {- Writes the UUID to an annex-uuid file within the bucket.
@@ -430,6 +431,7 @@ data S3Info = S3Info
 	, metaHeaders :: [(T.Text, T.Text)]
 	, partSize :: Maybe Integer
 	, isIA :: Bool
+	, acl :: Maybe S3.CannedAcl
 	}

 extractS3Info :: RemoteConfig -> Annex S3Info
@@ -445,6 +447,9 @@ extractS3Info c = do
 		, metaHeaders = getMetaHeaders c
 		, partSize = getPartSize c
 		, isIA = configIA c
+		, acl = case M.lookup "public" c of
+			Just "yes" -> Just S3.AclPublicRead
+			_ -> Nothing
 		}

 putObject :: S3Info -> T.Text -> RequestBody -> S3.PutObject
@@ -452,6 +457,7 @@ putObject info file rbody = (S3.putObject (bucket info) file rbody)
 	{ S3.poStorageClass = Just (storageClass info)
 	, S3.poMetadata = metaHeaders info
 	, S3.poAutoMakeBucket = isIA info
+	, S3.poAcl = acl info
 	}

 getBucketName :: RemoteConfig -> Maybe BucketName
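
For context, here is a minimal standalone sketch of the same mechanism,
using the aws library directly rather than git-annex's wrappers. The
bucket and object names are invented for the example, and error handling
is omitted:

    {-# LANGUAGE OverloadedStrings #-}
    import qualified Aws
    import qualified Aws.S3 as S3
    import Network.HTTP.Client (RequestBody(RequestBodyBS))

    main :: IO ()
    main = do
        -- Credentials are picked up from the environment.
        cfg <- Aws.baseConfiguration
        -- Attach the canned ACL to the upload, as putObject above does.
        let req = (S3.putObject "public-annex" "hello.txt" (RequestBodyBS "hi"))
                { S3.poAcl = Just S3.AclPublicRead }
        _ <- Aws.simpleAws cfg Aws.defServiceConfig req
        -- The object should now be anonymously readable at
        -- http://public-annex.s3.amazonaws.com/hello.txt
        return ()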

debian/changelog

@@ -12,6 +12,8 @@ git-annex (5.20150529) UNRELEASED; urgency=medium
   * import --clean-duplicates: Fix bug that didn't count local or trusted
     repo's copy of a file as one of the necessary copies to allow removing
     it from the import location.
+  * S3: Special remotes can be configured with public=yes to allow
+    the public to access the bucket's content.

 -- Joey Hess <id@joeyh.name>  Sat, 30 May 2015 02:07:18 -0400

doc/special_remotes/S3.mdwn

@@ -48,6 +48,11 @@ the S3 remote.
   so by default, a bucket name is chosen based on the remote name
   and UUID. This can be specified to pick a bucket name.

+* `public` - Set to "yes" to allow public read access to files sent
+  to the S3 remote. This is accomplished by setting an ACL when each
+  file is uploaded to the remote. So, it can be changed, but changes
+  will only affect subsequent uploads.
+
 * `partsize` - Amazon S3 only accepts uploads up to a certain file size,
   and storing larger files requires a multipart upload process.
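
As a minimal sketch of what the `public` option does internally (this
mirrors the extractS3Info change above; the standalone `aclFor` name is
invented here, and the config is a plain Map of strings, as RemoteConfig
is in git-annex):

    import qualified Data.Map as M
    import qualified Aws.S3 as S3

    -- "public=yes" in the remote config selects the public-read canned ACL.
    aclFor :: M.Map String String -> Maybe S3.CannedAcl
    aclFor c = case M.lookup "public" c of
        Just "yes" -> Just S3.AclPublicRead
        _ -> Nothing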

doc/tips/publishing_your_files_to_the_public.mdwn

@@ -2,28 +2,19 @@
 (In this example, I'll assume you'll be creating a bucket in S3 named **public-annex** and a special remote in git-annex, which will store its files in the previous bucket, named **public-s3**, but change these names if you are going to do the thing for real)

-First, in the AWS dashboard, go to (or create) the bucket you will use at S3 and add a public get policy to it:
-
-	{
-		"Version": "2008-10-17",
-		"Statement": [
-			{
-				"Sid": "AllowPublicRead",
-				"Effect": "Allow",
-				"Principal": {
-					"AWS": "*"
-				},
-				"Action": "s3:GetObject",
-				"Resource": "arn:aws:s3:::public-annex/*"
-			}
-		]
-	}
-
-Then set up your special [S3](http://git-annex.branchable.com/special_remotes/S3/) remote with (at least) these options:
+Set up your special [S3](http://git-annex.branchable.com/special_remotes/S3/) remote with (at least) these options:

-	git annex initremote public-s3 type=s3 encryption=none bucket=public-annex chunk=0
+	git annex initremote public-s3 type=s3 encryption=none bucket=public-annex chunk=0 public=yes

-This way git-annex will upload the files to this repo (when you call `git annex copy [FILES...] --to public-s3`) without encrypting them and without chunking them, and, because of the policy of the bucket, they will be accessible by anyone with the link.
+This way git-annex will upload the files to this repo (when you call `git
+annex copy [FILES...] --to public-s3`) without encrypting them and without
+chunking them. And, thanks to the public=yes, they will be
+accessible by anyone with the link.
+
+(Note that public=yes was added in git-annex version 5.20150605.
+If you have an older version, it will be silently ignored, and you
+will instead need to use the AWS dashboard to configure a public get policy
+for the bucket.)

 Following the example, the files will be accessible at `http://public-annex.s3.amazonaws.com/KEY` where `KEY` is the file key created by git-annex and which you can discover running
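
To check that a key really is public, it can be fetched anonymously; a
small sketch using http-conduit's simpleHttp (the URL is the example's,
with KEY standing in for a real git-annex key):

    import qualified Data.ByteString.Lazy as L
    import Network.HTTP.Conduit (simpleHttp)

    -- simpleHttp throws on a non-2xx status, so this only succeeds
    -- if the object is publicly readable.
    main :: IO ()
    main = do
        body <- simpleHttp "http://public-annex.s3.amazonaws.com/KEY"
        print (L.length body)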
@@ -31,8 +22,6 @@
 This way you can share a link to each file you have at your S3 remote.

-___________________
-
 ## Sharing all links in a folder

 To share all the links in a given folder, for example, you can go to that folder and run (this is an example with the _fish_ shell, but I'm sure you can do the same in _bash_, I just don't know exactly):
@@ -74,3 +63,26 @@
 </pre>

 `registerurl` was introduced in `5.20150317`. There's a todo open to ensure we don't have to do this by hand: [[todo/credentials-less access to s3]].
+
+## Manually configuring a public get policy
+
+Here is how to manually configure a public get policy
+for a bucket, in the AWS dashboard:
+
+	{
+		"Version": "2008-10-17",
+		"Statement": [
+			{
+				"Sid": "AllowPublicRead",
+				"Effect": "Allow",
+				"Principal": {
+					"AWS": "*"
+				},
+				"Action": "s3:GetObject",
+				"Resource": "arn:aws:s3:::public-annex/*"
+			}
+		]
+	}
+
+This should not be necessary if using a new enough version
+of git-annex, which can instead be configured with public=yes.