S3: Allow removing files from IA, but warn about derived versions potentially still existing there.

Removal works, only derives are a potential issue, so allow removing with a warning. This way, unexporting a file works, and behavior is consistent with IA remotes whether or not exporttree=yes.

Also tested exporting filenames containing unicode, spaces, underscores. All worked, despite the IA's faq saying it doesn't.

This commit was sponsored by Trenton Cronholm on Patreon.
parent 7f0e2a4685
commit 267f47c473
4 changed files with 33 additions and 23 deletions
@@ -9,6 +9,8 @@ git-annex (6.20170819) UNRELEASED; urgency=medium
   * Support building with feed-1.0, while still supporting older versions.
   * init: Display an additional message when it detects a filesystem that
     allows writing to files whose write bit is not set.
+  * S3: Allow removing files from IA, but warn about derived versions
+    potentially still existing there.
 
  -- Joey Hess <id@joeyh.name> Mon, 28 Aug 2017 12:20:59 -0400
 

25
Remote/S3.hs
@@ -278,14 +278,17 @@ retrieveCheap _ _ _ = return False
  - While it may remove the file, there are generally other files
  - derived from it that it does not remove. -}
 remove :: S3Info -> S3Handle -> Remover
-remove info h k
+remove info h k = warnIARemoval info $ do
+	res <- tryNonAsync $ sendS3Handle h $
+		S3.DeleteObject (T.pack $ bucketObject info k) (bucket info)
+	return $ either (const False) (const True) res
+
+warnIARemoval :: S3Info -> Annex a -> Annex a
+warnIARemoval info a
 	| isIA info = do
-		warning "Cannot remove content from the Internet Archive"
-		return False
-	| otherwise = do
-		res <- tryNonAsync $ sendS3Handle h $
-			S3.DeleteObject (T.pack $ bucketObject info k) (bucket info)
-		return $ either (const False) (const True) res
+		warning "Derived versions of removed file may still be present in the Internet Archive"
+		a
+	| otherwise = a
 
 checkKey :: Remote -> S3Info -> Maybe S3Handle -> CheckPresent
 checkKey r info Nothing k = case getpublicurl info of
@@ -342,7 +345,7 @@ retrieveExportS3 r info _k loc f p =
 	return True
 
 removeExportS3 :: Remote -> S3Info -> Key -> ExportLocation -> Annex Bool
-removeExportS3 r info _k loc =
+removeExportS3 r info _k loc = warnIARemoval info $
 	catchNonAsync go (\e -> warning (show e) >> return False)
   where
 	go = withS3Handle (config r) (gitconfig r) (uuid r) $ \h -> do
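
The two hunks above route both removal paths through one wrapper that warns and then runs the wrapped action unchanged. As a rough illustration of that pattern outside git-annex, here is a minimal standalone sketch. `Info`, `warnRemoval`, and the explicit `warn` callback are hypothetical stand-ins for `S3Info`, `warnIARemoval`, and the `Annex` monad's `warning`; none of them are git-annex APIs.

```haskell
-- Hypothetical stand-in for S3Info; only the flag needed for this sketch.
newtype Info = Info { isIA :: Bool }

-- Emit a warning for Internet Archive remotes, then run the wrapped
-- action unchanged. For other remotes the action runs without a warning.
-- The warning function is passed in so the sketch works in any monad.
warnRemoval :: Monad m => (String -> m ()) -> Info -> m a -> m a
warnRemoval warn info action
    | isIA info = do
        warn "Derived versions of removed file may still be present in the Internet Archive"
        action
    | otherwise = action

main :: IO ()
main = do
    -- The removal itself is simulated by an action that reports success.
    ok <- warnRemoval putStrLn (Info True) (return True)
    print ok
```

Because the wrapper only adds a side effect and returns whatever the wrapped action returns, the same helper can wrap both the plain removal and the export removal, which is what keeps the two code paths consistent.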
@@ -620,9 +623,9 @@ getBucketObject c = munge . key2file
 getBucketExportLocation :: RemoteConfig -> ExportLocation -> FilePath
 getBucketExportLocation c (ExportLocation loc) = getFilePrefix c ++ loc
 
-{- Internet Archive limits filenames to a subset of ascii,
- - with no whitespace. Other characters are xml entity
- - encoded. -}
+{- Internet Archive documentation limits filenames to a subset of ascii.
+ - While other characters seem to work now, this entity encodes everything
+ - else to avoid problems. -}
 iaMunge :: String -> String
 iaMunge = (>>= munge)
   where

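
The body of `iaMunge` is not shown in this hunk. Purely to illustrate what "entity encodes everything else" could look like, here is a hedged sketch. The name `mungeSketch` and the set of characters passed through unchanged (ASCII letters, digits, `_`, `-`, `.`) are assumptions made for this example, not the rules git-annex actually uses.

```haskell
import Data.Char (isAlphaNum, isAscii, ord)

-- Hypothetical sketch: keep an assumed safe subset of ASCII as-is and
-- replace every other character with a numeric XML character entity.
-- concatMap is the list-monad bind that the diff spells as (>>= munge).
mungeSketch :: String -> String
mungeSketch = concatMap mungeChar
  where
    mungeChar c
        | isAscii c && isAlphaNum c = [c]
        | c `elem` "_-." = [c]
        | otherwise = "&" ++ show (ord c) ++ ";"

main :: IO ()
main = putStrLn (mungeSketch "a b.txt")  -- prints a&32;b.txt
```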
@@ -11,9 +11,10 @@ comply with their [terms of service](http://www.archive.org/about/terms.php).
 A nice added feature is that whenever git-annex sends a file to the
 Internet Archive, it records its url, the same as if you'd run `git annex
 addurl`. So any users who can clone your repository can download the files
-from archive.org, without needing any login or password info. This makes
-the Internet Archive a nice way to publish the large files associated with
-a public git repository.
+from archive.org, without needing any login or password info.
+The url to the content in the Internet Archive is also displayed by
+`git annex whereis`. This makes the Internet Archive a nice way to
+publish the large files associated with a public git repository.
 
 ## webapp setup
 
@@ -50,10 +51,15 @@ Then you can annex files and copy them to the remote as usual:
 	# git annex copy photo1.jpeg --fast --to archive-panama
 	copy (to archive-panama...) ok
 
-Once a file has been stored on archive.org, it cannot be (easily) removed
-from it. Also, git-annex whereis will tell you a public url for the file
-on archive.org. (It may take a while for archive.org to make the file
-publically visibile.)
+It may take a while for archive.org to make files publically visible after
+they've been uploaded.
+
+## removing files
+
+While files can be removed from the Internet Archive,
+[derived versions](https://archive.org/help/derivatives.php)
+of some files may continued to be stored there after the originals
+were removed. git-annex warns about this problem.
 
 ## exporting trees
 
@@ -63,6 +69,7 @@ are important, you can run `git annex initremote` with an additional
 parameter "exporttree=yes", and then use [[git-annex-export]] to publish
 a tree of files to the Internet Archive.
 
-Note that the Internet Archive does not support filenames containing
-whitespace and some other characters. Exporting such problem filenames will
-fail; you can rename the file and re-export.
+Note that the Internet Archive may not support certian characters
+in filenames ([see FAQ](http://archive.org/about/faqs.php#1099)).
+If exporting a filename fails due to such limitations, you would need
+to rename it in your git annex repository in order to export it.

@@ -29,8 +29,6 @@ Work is in progress. Todo list:
   Would need git-annex sync to export to the master tree?
   This is similar to the little-used preferreddir= preferred content
   setting and the "public" repository group.
-* Test export to IA via S3. In particualar, does removing an exported file
-  work?
 
 Low priority:
 