From e65455fddab64ed168a816830b4a6aa79d36beef Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Thu, 18 Apr 2019 16:38:15 -0400
Subject: [PATCH] devblog

---
 .../day_581__starting_import_from_S3.mdwn     | 27 +++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 doc/devblog/day_581__starting_import_from_S3.mdwn

diff --git a/doc/devblog/day_581__starting_import_from_S3.mdwn b/doc/devblog/day_581__starting_import_from_S3.mdwn
new file mode 100644
index 0000000000..a9682bd0d7
--- /dev/null
+++ b/doc/devblog/day_581__starting_import_from_S3.mdwn
@@ -0,0 +1,27 @@
+Started today on `git annex import` from S3, in the "import-from-s3"
+branch.
+
+It looks like I'm going to support both versioned and unversioned buckets;
+the latter will need --force to initialize since it can lose data.
+
+One thought I had about that is: It's probably better for git-annex to be
+able to import data from an unversioned S3 bucket with caveats about
+avoiding unsafe operations (export) that could lose data, than it is for
+git-annex to not be able to import from the bucket at all, guaranteeing
+that past versions of modified files will be lost. (Rationalization is a
+powerful drug.)
+
+To support unversioned buckets, some kind of stable content identifier is
+needed other than the S3 version id. Luckily, S3 has etags, which are
+md5sum of the content, so will work great. But, the `aws` haskell library
+needs one small change to return an etag, so this will be
+blocked on that change.
+
+I've gotten listing importable contents from S3 working for unversioned
+buckets, including dealing with S3's 1000 item limit by paging.
+Listing importable contents from versioned buckets is harder, because
+it needs to synthesize a git version history from the information that S3
+provides. I think I have a method for doing this that will generate the
+trees that users will expect to see, and also will generate the same past
+trees every time, avoiding a proliferation of git trees. Next step:
+Converting my prose description of how to do that into haskell.
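
The paging mentioned in the post can be illustrated with a minimal sketch. This is not git-annex's code and does not use the `aws` library: `Page`, `listAll`, and `mockFetch` are hypothetical names, and the marker-style continuation is an assumption about how a listing limited to 1000 items per request might be resumed. Each entry carries its etag, since the post uses that as the content identifier for unversioned buckets.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical stand-ins, not types from the `aws` library.
type Key = String
type ETag = String

-- One response from a listing request.
data Page = Page
  { pageItems :: [(Key, ETag)]  -- entries returned by this request
  , nextMarker :: Maybe Key     -- Nothing once the listing is complete
  }

-- Keep requesting pages until the server stops returning a marker,
-- concatenating the entries in the order they were listed.
listAll :: Monad m => (Maybe Key -> m Page) -> m [(Key, ETag)]
listAll fetch = go Nothing
  where
    go marker = do
      page <- fetch marker
      case nextMarker page of
        Nothing -> return (pageItems page)
        Just m -> (pageItems page ++) <$> go (Just m)

-- A pure mock of a bucket that returns at most `limit` entries per
-- request and resumes after the given marker key.
-- Assumes limit >= 1 and that keys in the bucket are unique.
mockFetch :: Int -> [(Key, ETag)] -> Maybe Key -> Identity Page
mockFetch limit bucket marker = return (Page items next)
  where
    remaining = case marker of
      Nothing -> bucket
      Just k -> drop 1 (dropWhile ((/= k) . fst) bucket)
    (items, rest) = splitAt limit remaining
    next
      | null rest = Nothing
      | otherwise = Just (fst (last items))
```

For example, `runIdentity (listAll (mockFetch 1000 bucket))` walks a 2500-entry mock bucket in three requests and returns every entry in order. Keeping `listAll` polymorphic in the monad means the same loop could be driven by a real network request in `IO` instead of the mock.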