S3 versioning=yes config
Not yet used. This commit was supported by the NSF-funded DataLad project.
parent 358178fbfb
commit 0ff5a41311
4 changed files with 57 additions and 10 deletions
@@ -5,6 +5,8 @@ content and it can be accessed using a version ID (that S3 returns when
storing the content). So it should be possible for git-annex to allow
downloading old versions of files from such a remote.

<https://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectVersioning.html>

Basically, store the S3 version ID in git-annex branch and support
downloading using it.

@@ -28,6 +30,17 @@ which files still need to be sent to an export. It uses the export database
to keep track of that. This is important, because the location tracking
won't be updated, as discussed above.

The Haskell aws library does not seem to support enabling versioning when
creating a bucket, so it would need to be done from the web console.

If the user enables versioning in git-annex but forgets to enable it
in the bucket (or later suspends versioning in the bucket), it's no
big problem; old files will not be retained and git-annex will notice
this in the usual way (drop locking, fsck). So, it seems that initremote
does not need to check if the versioning=yes setting matches the bucket
configuration. For the same reasons, it's ok to enable versioning for an
existing remote.

## final plan

Add an "appendOnly" field to Remote, indicating it retains all content stored
@@ -44,15 +57,26 @@ Make exporttree=yes remotes that are appendOnly not be untrusted, and not force
verification of content, since the usual concerns about losing data when an
export is updated by someone else don't apply. done

Let S3 remotes be configured with versioned=yes or something like that
(what does S3 call the feature?) which enables appendOnly.
Let S3 remotes be configured with versioning=yes which enables appendOnly.
done

Make S3 store version IDs for uploaded keys in the per-remote log when so
Make S3 store version IDs for exported files in the per-remote log when so
configured, and use them when retrieving keys and for checkpresent.

Make S3 refuse to removeKey when configured appendOnly, failing with an error.

When a file was deleted from an exported tree, and then put back
in a later exported tree, it might get re-uploaded even though the content
is still retained in the versioned remote. S3 might have a way to avoid
such a redundant upload; if so, it could support using it.

S3 does allow DELETE of a version of an object from a bucket. So it would
be possible to support `git annex drop` of old versions of a file from an
export remote. Dropping the current version though, would make the export
database inconsistent; it would not know that a file in the exported tree
was no longer present. I don't think that inconsistency can easily be
resolved -- bear in mind that multiple repositories can have an export db,
so it would need to look at location tracking for all objects in the export
to find ones that some other repository dropped. And dropping of only
keys that are not used in the current export doesn't help because another
repository may have changed the exported tree and be relying on the dropped
key being present in the export. So, DELETE from an appendonly export
won't be supported, at least for now.
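
The plan in this hunk (record a version ID per exported file, use it for retrieval and checkpresent, and refuse removeKey on an appendOnly remote) can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class and method names (`VersionedBucket`, `AppendOnlyExport`, `store_export`, and so on) are hypothetical, the version IDs are made-up counters rather than real S3 version IDs, and none of this is git-annex's actual Haskell implementation or the S3 API.

```python
# Hypothetical in-memory model; not git-annex's code and not the S3 API.
import itertools


class VersionedBucket:
    """Toy stand-in for an S3 bucket with versioning enabled."""

    def __init__(self):
        self._versions = {}  # (object name, version ID) -> content
        self._counter = itertools.count(1)

    def put(self, name, content):
        # Every store creates a new version; older versions are retained.
        version_id = f"v{next(self._counter)}"
        self._versions[(name, version_id)] = content
        return version_id

    def get(self, name, version_id):
        return self._versions[(name, version_id)]

    def has(self, name, version_id):
        return (name, version_id) in self._versions


class AppendOnlyExport:
    """Records the version ID of each exported key in a per-remote log."""

    def __init__(self, bucket):
        self.bucket = bucket
        self.log = {}  # key -> (exported filename, version ID)

    def store_export(self, key, filename, content):
        self.log[key] = (filename, self.bucket.put(filename, content))

    def retrieve(self, key):
        filename, version_id = self.log[key]
        return self.bucket.get(filename, version_id)

    def check_present(self, key):
        return key in self.log and self.bucket.has(*self.log[key])

    def remove_key(self, key):
        # Mirrors the plan: an append-only remote fails with an error.
        raise PermissionError("remote is append-only; refusing to remove " + key)


bucket = VersionedBucket()
remote = AppendOnlyExport(bucket)
remote.store_export("KEY-old", "foo.txt", b"first version")
remote.store_export("KEY-new", "foo.txt", b"second version")
old_content = remote.retrieve("KEY-old")
```

Here `old_content` is the first version of `foo.txt`, even though a later export overwrote the file, because the log still maps `KEY-old` to its version ID.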