[The Internet Archive](http://www.archive.org/) allows members to upload collections using an Amazon S3 [compatible API](http://www.archive.org/help/abouts3.txt), and this can be used with git-annex's [[special_remotes/S3]] support. So, you can locally archive things with git-annex, define remotes that correspond to "items" at the Internet Archive, and use git-annex to upload your files to there. Of course, your use of the Internet Archive must comply with their [terms of service](http://www.archive.org/about/terms.php). A nice added feature is that whenever git-annex sends a file to the Internet Archive, it records its url, the same as if you'd run `git annex addurl`. So any users who can clone your repository can download the files from archive.org, without needing any login or password info. This makes the Internet Archive a nice way to publish the large files associated with a public git repository. ## webapp setup Just go to "Add Another Repository", pick "Internet Archive", and you're on your way. ## basic setup Sign up for an account, and get your access keys here: # export AWS_ACCESS_KEY_ID=blahblah # export AWS_SECRET_ACCESS_KEY=xxxxxxx Specify `host=s3.us.archive.org` when doing `initremote` to set up a remote at the Archive. This will enable a special Internet Archive mode: Encryption is not allowed; you are required to specify a bucket name rather than having git-annex pick a random one; and you can optionally specify `x-archive-meta*` headers to add metadata as explained in their [documentation](http://www.archive.org/help/abouts3.txt). # git annex initremote archive-panama type=S3 \ host=s3.us.archive.org bucket=panama-canal-lock-blueprints \ x-archive-meta-mediatype=texts x-archive-meta-language=eng \ x-archive-meta-title="original Panama Canal lock design blueprints" initremote archive-panama (Internet Archive mode) ok # git annex describe archive-panama "a man, a plan, a canal: panama" describe archive-panama ok Then you can annex files and copy them to the remote as usual: # git annex add photo1.jpeg --backend=SHA256E add photo1.jpeg (checksum...) ok # git annex copy photo1.jpeg --fast --to archive-panama copy (to archive-panama...) ok Once a file has been stored on archive.org, it cannot be (easily) removed from it. Also, git-annex whereis will tell you a public url for the file on archive.org. (It may take a while for archive.org to make the file publically visibile.) ## exporting trees By default, files stored in the Internet Archive will show up there named by their git-annex key, not the original filename. If the filenames are important, you can run `git annex initremote` with an additional parameter "exporttree=yes", and then use [[git-annex-export]] to publish a tree of files to the Internet Archive. Note that the Internet Archive does not support filenames containing whitespace and some other characters. Exporting such problem filenames will fail; you can rename the file and re-export.