S3: Try to ensure bucket name is valid for archive.org.

Joey Hess 2013-10-16 16:35:47 -04:00
parent 02522cba68
commit c76c94a0da
4 changed files with 24 additions and 20 deletions


@ -102,23 +102,24 @@ s3Setup' u c = if isIA c then archiveorg else defaulthost
archiveorg = do archiveorg = do
showNote "Internet Archive mode" showNote "Internet Archive mode"
maybe (error "specify bucket=") (const noop) $ -- Ensure user enters a valid bucket name, since
getBucket archiveconfig -- this determines the name of the archive.org item.
writeUUIDFile archiveconfig u let bucket = replace " " "-" $ map toLower $
use archiveconfig fromMaybe (error "specify bucket=") $
where getBucket c
archiveconfig = let archiveconfig =
-- hS3 does not pass through x-archive-* headers -- hS3 does not pass through x-archive-* headers
M.mapKeys (replace "x-archive-" "x-amz-") $ M.mapKeys (replace "x-archive-" "x-amz-") $
-- encryption does not make sense here -- encryption does not make sense here
M.insert "encryption" "none" $ M.insert "encryption" "none" $
M.insert "bucket" bucket $
M.union c $ M.union c $
-- special constraints on key names -- special constraints on key names
M.insert "mungekeys" "ia" $ M.insert "mungekeys" "ia" $
-- bucket created only when files are uploaded -- bucket created only when files are uploaded
M.insert "x-amz-auto-make-bucket" "1" $ M.insert "x-amz-auto-make-bucket" "1" defaults
-- no default bucket name; should be human-readable writeUUIDFile archiveconfig u
M.delete "bucket" defaults use archiveconfig
store :: Remote -> Key -> AssociatedFile -> MeterUpdate -> Annex Bool store :: Remote -> Key -> AssociatedFile -> MeterUpdate -> Annex Bool
store r k _f p = s3Action r False $ \(conn, bucket) -> store r k _f p = s3Action r False $ \(conn, bucket) ->
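The bucket normalization this hunk introduces (lower-casing the name and turning spaces into dashes before storing it back in the config) can be sketched in isolation roughly as follows. This is a minimal standalone approximation, not the code from the commit: `normalizeBucket`, `Config`, and `withNormalizedBucket` are hypothetical names, a per-character map stands in for the `replace` call used above, and a missing bucket yields `Nothing` instead of calling `error`.

```haskell
import Data.Char (toLower)
import qualified Data.Map as M

-- Hypothetical helper (not part of git-annex): lower-case every
-- character and turn each space into a dash.
normalizeBucket :: String -> String
normalizeBucket = map dash . map toLower
  where
    dash ' ' = '-'
    dash c = c

-- Hypothetical stand-in for the remote's configuration map.
type Config = M.Map String String

-- Store the normalized bucket name back into the config.
-- Returns Nothing when no "bucket" key was supplied.
withNormalizedBucket :: Config -> Maybe Config
withNormalizedBucket c =
  fmap (\b -> M.insert "bucket" (normalizeBucket b) c) (M.lookup "bucket" c)

main :: IO ()
main = putStrLn (normalizeBucket "Panama Canal Lock Blueprints")
-- prints "panama-canal-lock-blueprints"
```

This only handles the two cases named in the removed documentation note (upper-case letters and spaces), which fits the commit's wording of "try to ensure"; other characters pass through unchanged.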

debian/changelog

@ -25,6 +25,7 @@ git-annex (4.20131003) UNRELEASED; urgency=low
on OSX. on OSX.
* sync: Fix automatic resolution of merge conflicts where one side is an * sync: Fix automatic resolution of merge conflicts where one side is an
annexed file, and the other side is a non-annexed file, or a directory. annexed file, and the other side is a non-annexed file, or a directory.
* S3: Try to ensure bucket name is valid for archive.org.
-- Joey Hess <joeyh@debian.org> Thu, 03 Oct 2013 15:41:24 -0400 -- Joey Hess <joeyh@debian.org> Thu, 03 Oct 2013 15:41:24 -0400


@ -28,3 +28,5 @@ initremote archive-moglenrepublica (Internet Archive mode) git-annex: The reques
"""]] """]]
Just thought it would be better to have a separate thread for this bug. :) Just thought it would be better to have a separate thread for this bug. :)
> [[fixed|done]] --[[Joey]]


@ -30,12 +30,6 @@ rather than having git-annex pick a random one; and you can optionally
specify `x-archive-meta*` headers to add metadata as explained in their specify `x-archive-meta*` headers to add metadata as explained in their
[documentation](http://www.archive.org/help/abouts3.txt). [documentation](http://www.archive.org/help/abouts3.txt).
[[!template id=note text="""
/!\ There seems to be a [[bug|bugs/S3 buckets with capital letters breaks authentication]] in either hS3 or the archive that breaks
authentication when the bucket name contains spaces or upper-case letters..
use all lowercase and no spaces when making the bucket with `initremote`.
"""]]
# git annex initremote archive-panama type=S3 \ # git annex initremote archive-panama type=S3 \
host=s3.us.archive.org bucket=panama-canal-lock-blueprints \ host=s3.us.archive.org bucket=panama-canal-lock-blueprints \
x-archive-meta-mediatype=texts x-archive-meta-language=eng \ x-archive-meta-mediatype=texts x-archive-meta-language=eng \
@ -51,8 +45,14 @@ Then you can annex files and copy them to the remote as usual:
# git annex copy photo1.jpeg --fast --to archive-panama # git annex copy photo1.jpeg --fast --to archive-panama
copy (to archive-panama...) ok copy (to archive-panama...) ok
Note the use of the SHA1E [[backend|backends]]. It makes most sense Once a file has been stored on archive.org, it cannot be (easily) removed
to use the WORM or SHA1E backend for files that will be stored in from it. Also, git-annex whereis will tell you a public url for the file
the Internet Archive, since the key name will be exposed as the filename on archive.org. (It may take a while for archive.org to make the file
there, and since the Archive does special processing of files based on publicly visible.)
their extension.
Note the use of the SHA1E [[backend|backends]] when adding files. That is
the default backend used by git-annex, but even if you don't normally use
it, it makes most sense to use the WORM or SHA1E backend for files that
will be stored in the Internet Archive, since the key name will be exposed
as the filename there, and since the Archive does special processing of
files based on their extension.