remove false starts, simplify

2018-08-29 14:12:18 -04:00 · dad627fa9e
commit dad627fa9e
parent 5b78952f78
1 changed file with 7 additions and 56 deletions
--- a/doc/todo/versioning_in_export_remotes.mdwn
+++ b/doc/todo/versioning_in_export_remotes.mdwn
@@ -5,62 +5,12 @@ content and it can be accessed using a version ID (that S3 returns when
 storing the content). So it should be possible for git-annex to allow
 downloading old versions of files from such a remote.
-## remote pair approach
+Basically, store the S3 version ID in git-annex branch and support
 downloading using it. 
-One way would be to have the S3 remote, when storing a file to a S3 bucket
+But this has the problem that dropping makes git-annex think it's not in S3
-that is known to support versioning, to add an url using the S3 version ID
+any more, while what we want for export is for it to be removed from the
-to the web remote.
+current bucket, but still tracked as present in S3.
 However, some remotes that support versioning won't be accessible via the
 web, so that's not a general solution.
 (Also, S3 buckets only support web access when configured to be public.)
 This generalizes to a pair of remotes, it could be S3+web or S3 could instantiate
 two remotes automatically, and use the second for versioned data.
 Note that location tracking info has to be carefully managed, to avoid
 there appearing to be two copies of data that's only really stored in one place.
 When uploading to S3, it should not yet add the url or mark the content 
 as present in the web. Then when dropping from S3, after the
 drop succeeds, it can mark the content as present in the web and add its url.
 There's a potential race there still, since the remote does not update location
 tracking when dropping, the caller of the remote does. So if S3 marks content
 as being present in the web, it will breifly appear present in both locations
 and break numcopies counting. Would need to extend the API to avoid this race.
 > Ah, but: exporttree remotes are always untrusted for other reasons,
 > so location tracking is less of a problem. Even if location tracking
 > shows the content in two places, a drop will skip the exporttree remote
 > so will only treat the pair as one copy.
 > 
 > So the location tracking problem is limited to --copies=N matching incorrectly,
 > and whereis listing both locations, and some preferred content 
 > expressions behaving in surprising ways.
 Unfortunately this remote pair approach will leak out into git-annex's interface;
 it will show two remotes. Not a problem for S3+web really, but if S3 instantiates
 an S3oldversions remote, that necessarily adds the potential for confusion,
 and adds complexity in configuration of preferred content settings, repo groups,
 etc.
 > Could flip it; make the main remote track the versioned data, and the
 > exporttree remote be secondary. Since only git-annex export/sync need to
 > access that remote, they could have a special case to look for such a
 > secondary remote and act on it. All other commands would only operate on
 > the main remote. Indeed, the secondary remote would not need to be
 > in the RemoteList at all.
 > 
 > Doesn't avoid preferred content etc complexity, still.
 ## location tracking approach
 Another way is to store the S3 version ID in git-annex branch and support
 downloading using it. But this has the problem that dropping makes
 git-annex think it's not in S3 any more, while what we want for export
 is for it to be removed from the current bucket, but still tracked as
 present in S3.
 The drop from S3 could fail, or "succeed" in a way that prevents the location
 tracking being updated to say it lacks the content. Failing is how bup deals
@@ -75,7 +25,8 @@ and make at sync --content/assistant use that.
 Note that git-annex export does not rely on location tracking to determine
 which files still need to be sent to an export. It uses the export database
-to keep track of that.
+to keep track of that. This is important, because the location tracking
 won't be updated, as discussed above.
 ## final plan