S3 export (untested)

Each exported file opens its own HTTP connection, but then so does git annex copy --to s3.

Decided not to munge exported filenames for the Internet Archive (IA). The chance of the munging having confusing results is too large. Instead, export of files that IA does not support, e.g. files with spaces in their names, will fail.

This commit was supported by the NSF-funded DataLad project.
parent a1b195d84c
commit 44cd5ae313
5 changed files with 121 additions and 64 deletions
@@ -55,31 +55,14 @@ from it. Also, git-annex whereis will tell you a public url for the file
on archive.org. (It may take a while for archive.org to make the file
publicly visible.)

Note the use of the SHA256E [[backend|backends]] when adding files. That is
the default backend used by git-annex, but even if you don't normally use
it, it makes most sense to use the WORM or SHA256E backend for files that
will be stored in the Internet Archive, since the key name will be exposed
as the filename there, and since the Archive does special processing of
files based on their extension.

## exporting trees

## publishing only one subdirectory

By default, files stored in the Internet Archive will show up there named
by their git-annex key, not the original filename. If the filenames
are important, you can run `git annex initremote` with an additional
parameter "exporttree=yes", and then use [[git-annex-export]] to publish
a tree of files to the Internet Archive.
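
The export setup described in this paragraph might look roughly like the sketch below. It is an illustration, not a tested recipe: the remote name `archive-panama` is borrowed from the other example in this document, the bucket name `panama-canal-demo` is hypothetical, and the `type`, `host`, `bucket` and `encryption` values are assumptions about a typical archive.org S3 setup. The `run` helper is also an addition of this sketch; it only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: set up an Internet Archive special remote for tree export.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0
# inside a real git-annex repository, with archive.org S3 credentials in
# the environment, to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical names, chosen only for illustration.
REMOTE=archive-panama
BUCKET=panama-canal-demo

# exporttree=yes is the parameter named in the text above; the other
# parameters are assumptions of this sketch.
run git annex initremote "$REMOTE" type=S3 host=s3.us.archive.org \
    bucket="$BUCKET" encryption=none exporttree=yes

# Publish the tree of the master branch under its original filenames.
run git annex export master --to "$REMOTE"
```

Run as-is, the script only prints the two `git annex` commands, so it can be read and adjusted before anything touches a repository.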

Perhaps you have a repository with lots of files in it, and only want
to publish some of them to a particular Internet Archive item. Of course
you can specify which files to send manually, but it's useful to
configure [[preferred_content]] settings so git-annex knows what content
you want to store in the Internet Archive.

One way to do this is using the "public" repository type.

    git annex enableremote archive-panama preferreddir=panama
    git annex wanted archive-panama standard
    git annex group archive-panama public

Now anything in a "panama" directory will be sent to that remote,
and anything else won't. You can use `git annex copy --auto` or the
assistant and it'll do the right thing.

When setting up an Internet Archive item using the webapp, this
configuration is automatically done, using an item name that the user
enters as the name of the subdirectory.

Note that the Internet Archive does not support filenames containing
whitespace and some other characters. Exporting such problem filenames will
fail; you can rename the file and re-export.
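
Since such exports fail rather than being renamed automatically, it can help to look for problem names beforehand. The sketch below is one hypothetical way to do that. It checks only for whitespace, because the "other characters" are not enumerated here, and the function names `has_whitespace` and `check_names` are inventions of this example, not part of git-annex.

```shell
#!/bin/sh
# Sketch: report filenames containing whitespace before attempting an export.
# Only whitespace is checked; other unsupported characters are not covered.

# Succeeds (exit status 0) when the name contains any whitespace character.
has_whitespace() {
    case $1 in
        *[[:space:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints each problem name among the arguments; returns 1 if any was found.
check_names() {
    status=0
    for name in "$@"; do
        if has_whitespace "$name"; then
            printf 'would fail to export: %s\n' "$name"
            status=1
        fi
    done
    return "$status"
}
```

For instance, `check_names "lock 1.png" "lock-2.png"` reports only `lock 1.png` and returns a non-zero status, which signals that a rename is needed before re-exporting.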