2018-08-28 16:14:06 +00:00
|
|
|
Some remotes like S3 support versioning of data stored in them.
|
|
|
|
When git-annex updates an export, it deletes the old
|
|
|
|
content from eg the S3 bucket, but with versioning enabled, S3 retains the
|
|
|
|
content and it can be accessed using a version ID (that S3 returns when
|
|
|
|
storing the content). So it should be possible for git-annex to allow
|
|
|
|
downloading old versions of files from such a remote.
|
|
|
|
|
2018-08-30 17:45:28 +00:00
|
|
|
<https://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectVersioning.html>
|
|
|
|
|
2018-08-29 18:12:18 +00:00
|
|
|
Basically, store the S3 version ID in git-annex branch and support
|
|
|
|
downloading using it.
|
2018-08-28 16:14:06 +00:00
|
|
|
|
2018-08-29 18:12:18 +00:00
|
|
|
But this has the problem that dropping makes git-annex think it's not in S3
|
|
|
|
any more, while what we want for export is for it to be removed from the
|
|
|
|
current bucket, but still tracked as present in S3.
|
2018-08-28 16:14:06 +00:00
|
|
|
|
|
|
|
The drop from S3 could fail, or "succeed" in a way that prevents the location
|
|
|
|
tracking being updated to say it lacks the content. Failing is how bup deals
|
2018-08-29 17:59:52 +00:00
|
|
|
with it. It seems confusing to have a drop appear to succeed but not really drop,
|
|
|
|
especially since dropping again would seem to do something a second time.
|
|
|
|
|
|
|
|
This does mean that git-annex drop/sync --content/assistant might try to do a
|
|
|
|
lot of drops from the remote, and generate a lot of noise when they fail.
|
|
|
|
Which is kind of ok for drop, since the user should be told that they can't
|
|
|
|
delete the data. Could add a way to say "this remote does not support drop",
|
|
|
|
and make at sync --content/assistant use that.
|
|
|
|
|
|
|
|
Note that git-annex export does not rely on location tracking to determine
|
|
|
|
which files still need to be sent to an export. It uses the export database
|
2018-08-29 18:12:18 +00:00
|
|
|
to keep track of that. This is important, because the location tracking
|
|
|
|
won't be updated, as discussed above.
|
2018-08-29 17:59:52 +00:00
|
|
|
|
2018-08-30 17:45:28 +00:00
|
|
|
The haskell aws library does not seem to support enabling versioning when
|
|
|
|
creating a bucket, so it would need to be done from the web console.
|
|
|
|
|
2018-08-31 17:56:32 +00:00
|
|
|
## other considerations
|
|
|
|
|
|
|
|
If the user enables versioning in git-annex but forgets to enable it
|
|
|
|
in the bucket (or later suspends versioning in the bucket), it's no
|
|
|
|
big problem; old files will not be retained and git-annex will notice
|
|
|
|
this in the usual way (drop locking, fsck). So, it seems that initremote
|
|
|
|
does not need to check if the versioning=yes setting matches the bucket
|
|
|
|
configuration. For same reasons, it's ok to enable versioning for an
|
|
|
|
existing remote.
|
|
|
|
|
|
|
|
S3 does allow DELETE of a version of an object from a bucket. So it would
|
|
|
|
be possible to support `git annex drop` of old versions of a file from an
|
|
|
|
export remote. Dropping the current version though, would make the export
|
|
|
|
database inconsistent; it would not know that a file in the exported tree
|
|
|
|
was no longer present. I don't think that inconsitency can easily be
|
|
|
|
resolved -- bear in ming that multiple repositories can have an export db,
|
|
|
|
so it would need to look at location tracking for all objects in the export
|
|
|
|
to find ones that some other repository dropped. And dropping of only
|
|
|
|
keys that are not used in the current export doesn't help because another
|
|
|
|
repository may have changed the exported tree and be relying on the dropped
|
|
|
|
key being present in the export. Unless... Could export conflict resultion
|
|
|
|
somehow detect that?
|
|
|
|
|
2018-08-31 18:01:24 +00:00
|
|
|
## final plan
|
2018-08-29 17:59:52 +00:00
|
|
|
|
|
|
|
Add an "appendOnly" field to Remote, indicating it retains all content stored
|
2018-08-30 15:18:20 +00:00
|
|
|
in it. done
|
2018-08-29 17:59:52 +00:00
|
|
|
|
|
|
|
Make `git annex export` check appendOnly when removing a file from an
|
|
|
|
export, and not update the location log, since the remote still contains
|
2018-08-30 15:18:20 +00:00
|
|
|
the content. done
|
2018-08-29 17:59:52 +00:00
|
|
|
|
|
|
|
Make git-annex sync and the assistant skip trying to drop from appendOnly
|
2018-08-30 15:41:44 +00:00
|
|
|
remotes since it's just going to fail. done
|
2018-08-29 17:59:52 +00:00
|
|
|
|
2018-08-30 15:41:44 +00:00
|
|
|
Make exporttree=yes remotes that are appendOnly not be untrusted, and not force
|
2018-08-29 17:59:52 +00:00
|
|
|
verification of content, since the usual concerns about losing data when an
|
2018-08-30 15:41:44 +00:00
|
|
|
export is updated by someone else don't apply. done
|
|
|
|
|
2018-08-30 17:45:28 +00:00
|
|
|
Let S3 remotes be configured with versioning=yes which enables appendOnly.
|
|
|
|
done
|
2018-08-30 15:41:44 +00:00
|
|
|
|
2018-08-30 17:45:28 +00:00
|
|
|
Make S3 store version IDs for exported files in the per-remote log when so
|
2018-08-30 18:22:26 +00:00
|
|
|
configured. done
|
|
|
|
|
2018-08-30 18:47:52 +00:00
|
|
|
Use version IDs when retrieving keys and for checkpresent. done
|
2018-08-30 18:22:26 +00:00
|
|
|
|
2018-08-31 18:01:24 +00:00
|
|
|
> all done! (but see [[support_public_versioned_S3_access]]) --[[Joey]]
|