enableremote S3: Do not let versioning=yes be set on existing remote

Because when git-annex lacks S3 version IDs for files stored in the bucket,
deleting them would cause data loss.

Also because git-annex is not able to download unversioned objects from a bucket
when versioning=yes.

This also prevents setting versioning=no. While that would perhaps be
possible to do safely, it would add complexity, and would mean that if
the user accidentally did enableremote versioning=no, they would not be
able to undo it.
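
The rule described above can be sketched as a small pure function. This is an illustrative sketch only, not the code from this commit: `Stage`, `Config`, `Decision`, `versioningOn` and `decide` are hypothetical names, and treating a missing versioning= value as "disabled" is an assumption made for the example.

```haskell
-- Hypothetical stand-in for git-annex's remote config: key/value pairs.
type Config = [(String, String)]

-- Hypothetical stand-in for the setup stage.
data Stage
    = Init           -- initremote: a brand new remote
    | Enable Config  -- enableremote: carries the previously stored config

data Decision
    = EnableBucketVersioning  -- new remote created with versioning=yes
    | NoAction                -- nothing to change
    | Refuse String           -- versioning= differs from the stored value
    deriving (Eq, Show)

-- Assumption: anything other than versioning=yes counts as disabled.
versioningOn :: Config -> Bool
versioningOn c = lookup "versioning" c == Just "yes"

-- Versioning may only be turned on at init time. On enableremote,
-- any change to versioning= (yes->no or no->yes) is refused.
decide :: Stage -> Config -> Decision
decide Init new
    | versioningOn new = EnableBucketVersioning
    | otherwise = NoAction
decide (Enable old) new
    | versioningOn new /= versioningOn old =
        Refuse "Cannot change versioning= of existing S3 remote."
    | otherwise = NoAction
```

Under these assumptions, `decide (Enable []) [("versioning", "yes")]` yields a `Refuse`, while `decide Init [("versioning", "yes")]` yields `EnableBucketVersioning`. Refusing both directions keeps the check a single comparison.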

This commit was sponsored by Trenton Cronholm on Patreon.
Joey Hess 2019-01-29 14:08:42 -04:00
parent ee011b3cbb
commit bb9817ceae
GPG key ID: DB12DB0FF05F8F38
3 changed files with 38 additions and 45 deletions

@@ -7,6 +7,9 @@ git-annex (7.20190123) UNRELEASED; urgency=medium
with versioning=yes. Needs not yet released version 0.22 of aws library;
with older versions asks the user to configure the bucket versioning
themselves.
* enableremote S3: Do not let versioning=yes be set on existing remote,
because when git-annex lacks S3 version IDs for files stored in
the bucket, deleting them would cause data loss.
-- Joey Hess <id@joeyh.name> Wed, 23 Jan 2019 12:54:56 -0400

@@ -148,12 +148,7 @@ s3Setup' ss u mcreds c gc
]
use fullconfig = do
case ss of
Init -> do
info <- extractS3Info fullconfig
when (versioning info) $
enableBucketVersioning (bucket info) fullconfig gc u
_ -> return ()
enableBucketVersioning ss fullconfig gc u
gitConfigSpecialRemote u fullconfig [("s3", "true")]
return (fullconfig, u)
@@ -864,27 +859,31 @@ getS3VersionIDPublicUrls :: (S3Info -> BucketObject -> URLString) -> S3Info -> U
getS3VersionIDPublicUrls mk info u k =
map (s3VersionIDPublicUrl mk info) <$> getS3VersionID u k
-- Enable versioning on the bucket.
--
-- This must only be done at init time; setting versioning in a bucket
-- that git-annex has already exported files to risks losing the content of
-- those un-versioned files.
enableBucketVersioning :: S3.Bucket -> RemoteConfig -> RemoteGitConfig -> UUID -> Annex ()
enableBucketVersioning b c gc u = do
-- Enable versioning on the bucket can only be done at init time;
-- setting versioning in a bucket that git-annex has already exported
-- files to risks losing the content of those un-versioned files.
enableBucketVersioning :: SetupStage -> RemoteConfig -> RemoteGitConfig -> UUID -> Annex ()
enableBucketVersioning ss c gc u = do
info <- extractS3Info c
case ss of
Init -> when (versioning info) $
enableversioning (bucket info)
Enable oldc -> do
oldinfo <- extractS3Info oldc
when (versioning info /= versioning oldinfo) $
giveup "Cannot change versioning= of existing S3 remote."
where
enableversioning b = do
#if MIN_VERSION_aws(0,22,0)
showAction "enabling bucket versioning"
withS3Handle c gc u $ \h ->
void $ sendS3Handle h $ S3.putBucketVersioning b S3.VersioningEnabled
showAction "enabling bucket versioning"
withS3Handle c gc u $ \h ->
void $ sendS3Handle h $ S3.putBucketVersioning b S3.VersioningEnabled
#else
let ConfigKey c = remoteConfig c "s3-versioning-enabled"
showLongNote $ unlines
[ "This version of git-annex cannot auto-enable S3 bucket versioning."
, "You need to manually enable versioning in the S3 console"
, "for the bucket \"" ++ T.unpack b ++ "\""
, "https://docs.aws.amazon.com/AmazonS3/latest/user-guide/enable-versioning.html"
, "It's important you do this before storing anything in the bucket!"
]
showLongNote $ unlines
[ "This version of git-annex cannot auto-enable S3 bucket versioning."
, "You need to manually enable versioning in the S3 console"
, "for the bucket \"" ++ T.unpack b ++ "\""
, "https://docs.aws.amazon.com/AmazonS3/latest/user-guide/enable-versioning.html"
, "It's important you enable versioning before storing anything in the bucket!"
]
#endif
--
-- verifyBucketVersioningEnabled :: Annex Bool

@@ -35,26 +35,17 @@ only place a file was stored, they would lose data without git-annex being
run at all, so it's equivalent to them deleting the bucket, so this seems
not something it needs to worry about).
There is [an yet-unmerged pull
request](https://github.com/aristidb/aws/pull/255) to let buckets be
created with versioning enabled, that is kind of a prerequisite for this
change, otherwise the user would need to manually make the bucket and
enable versioning before initremote.
So, plan:
Plan:
* Wait for the PutBucketVersioning pull request to be merged.
* Add a remote.name.s3-versioning-enabled which needs to be true
in order for exporttree to remove files from a versioned remote.
* Enable versioning and set remote.name.s3-versioning-enabled during initremote,
when versioning=yes. If aws library is too old to enable versioning,
initremote should fail.
* Do no allow changing versioning= during enableremote.
* Any repos that used versioning=yes before this point will see removal
of files from them fail. The user can either manually set
remote.name.s3-versioning-enabled (if they are sure they enabled it from
the beginning), or can disable versioning. (Or perhaps other resolutions
to the problem, up to the user.)
(Done, not in a release yet, but will probably be aws-0.22)
* Auto-enable versioning during initremote (and not enableremote)
when versioning=yes. (Or prompt user to do it when aws is too old.)
(Done)
* Do not allow changing versioning= during enableremote.
* Any repos that previously enabled versioning after storing some
unversioned files still are at risk of data loss. Detect this
case and treat them as versioning=no. How?
# change exporttree