49 lines
2.3 KiB
Markdown
49 lines
2.3 KiB
Markdown
[The Internet Archive](http://www.archive.org/) allows members to upload
|
|
collections using an Amazon S3
|
|
[compatible API](http://www.archive.org/help/abouts3.txt), and this can
|
|
be used with git-annex's [[special_remotes/S3]] support.
|
|
|
|
So, you can locally archive things with git-annex, define remotes that
|
|
correspond to "items" at the Internet Archive, and use git-annex to upload
|
|
your files to there. Of course, your use of the Internet Archive must
|
|
comply with their [terms of service](http://www.archive.org/about/terms.php).
|
|
|
|
Sign up for an account, and get your access keys here:
|
|
<http://www.archive.org/account/s3.php>
|
|
|
|
# export AWS_ACCESS_KEY_ID=blahblah
|
|
# export AWS_SECRET_ACCESS_KEY=xxxxxxx
|
|
|
|
Specify `host=s3.us.archive.org` when doing `initremote` to set up
|
|
a remote at the Archive. This will enable a special Internet Archive mode:
|
|
Encryption is not allowed; you are required to specify a bucket name
|
|
rather than having git-annex pick a random one; and you can optionally
|
|
specify `x-archive-meta*` headers to add metadata as explained in their
|
|
[documentation](http://www.archive.org/help/abouts3.txt).
|
|
|
|
[[!template id=note text="""
|
|
/!\ There seems to be a bug in either hS3 or the archive that breaks
|
|
authentication when the bucket name contains spaces or upper-case letters..
|
|
use all lowercase and no spaces when making the bucket with `initremote`.
|
|
"""]]
|
|
|
|
# git annex initremote archive-panama type=S3 \
|
|
host=s3.us.archive.org bucket=panama-canal-lock-blueprints \
|
|
x-archive-meta-mediatype=texts x-archive-meta-language=eng \
|
|
x-archive-meta-title="original Panama Canal lock design blueprints"
|
|
initremote archive-panama (Internet Archive mode) (checking bucket) (creating bucket in US) ok
|
|
# git annex describe archive-panama "Internet Archive item for my grandfather's Panama Canal lock design blueprints"
|
|
describe archive-panama ok
|
|
|
|
Then you can annex files and copy them to the remote as usual:
|
|
|
|
# git annex add photo1.jpeg --backend=SHA1E
|
|
add photo1.jpeg (checksum...) ok
|
|
# git annex copy photo1.jpeg --fast --to archive-panama
|
|
copy (to archive-panama...) ok
|
|
|
|
Note the use of the SHA1E [[backend|backends]]. It makes most sense
|
|
to use the WORM or SHA1E backend for files that will be stored in
|
|
the Internet Archive, since the key name will be exposed as the filename
|
|
there, and since the Archive does special processing of files based on
|
|
their extension.
|