S3: Detect when version=yes but an exported file lacks versioning, and refuse to delete it, to avoid data loss.
This commit was sponsored by Denis Dzyubenko on Patreon.
This commit is contained in:
parent
a4f71aa1e8
commit
a8f1add4d1
3 changed files with 46 additions and 22 deletions
|
@ -10,6 +10,8 @@ git-annex (7.20190123) UNRELEASED; urgency=medium
|
|||
* enableremote S3: Do not let versioning=yes be set on existing remote,
|
||||
because when git-annex lacks S3 version IDs for files stored in
|
||||
the bucket, deleting them would cause data loss.
|
||||
* S3: Detect when version=yes but an exported file lacks versioning,
|
||||
and refuse to delete it, to avoid data loss.
|
||||
|
||||
-- Joey Hess <id@joeyh.name> Wed, 23 Jan 2019 12:54:56 -0400
|
||||
|
||||
|
|
24
Remote/S3.hs
24
Remote/S3.hs
|
@ -383,7 +383,7 @@ retrieveExportS3 u info mh _k loc f p =
|
|||
exporturl = bucketExportLocation info loc
|
||||
|
||||
removeExportS3 :: UUID -> S3Info -> Maybe S3Handle -> Key -> ExportLocation -> Annex Bool
|
||||
removeExportS3 _u info (Just h) _k loc =
|
||||
removeExportS3 u info (Just h) k loc = checkVersioning info u $
|
||||
catchNonAsync go (\e -> warning (show e) >> return False)
|
||||
where
|
||||
go = do
|
||||
|
@ -406,7 +406,8 @@ checkPresentExportS3 u info Nothing k loc = case getPublicUrlMaker info of
|
|||
|
||||
-- S3 has no move primitive; copy and delete.
|
||||
renameExportS3 :: UUID -> S3Info -> Maybe S3Handle -> Key -> ExportLocation -> ExportLocation -> Annex Bool
|
||||
renameExportS3 _u info (Just h) _k src dest = catchNonAsync go (\_ -> return False)
|
||||
renameExportS3 u info (Just h) k src dest = checkVersioning info u k src $
|
||||
catchNonAsync go (\_ -> return False)
|
||||
where
|
||||
go = do
|
||||
let co = S3.copyObject (bucket info) dstobject
|
||||
|
@ -887,3 +888,22 @@ enableBucketVersioning ss c gc u = do
|
|||
, "It's important you enable versioning before storing anything in the bucket!"
|
||||
]
|
||||
#endif
|
||||
|
||||
-- If the remote has versioning enabled, but the version ID is for some
|
||||
-- reason not being recorded, it's not safe to perform an action that
|
||||
-- will remove the unversioned file. The file may be the only copy of an
|
||||
-- annex object.
|
||||
--
|
||||
-- This code could be removed eventually, since enableBucketVersioning
|
||||
-- will avoid this situation. Before that was added, some remotes
|
||||
-- were created without versioning, some unversioned files exported to
|
||||
-- them, and then versioning enabled, and this is to avoid data loss in
|
||||
-- those cases.
|
||||
checkVersioning :: S3Info -> UUID -> Key -> Annex Bool -> Annex Bool
|
||||
checkVersioning info u k a
|
||||
| versioning info = getS3VersionID u k >>= \case
|
||||
[] -> do
|
||||
warning $ "Remote is configured to use versioning, but no S3 version ID is recorded for this key, so it cannot safely be modified."
|
||||
return False
|
||||
_ -> a
|
||||
| otherwise = a
|
||||
|
|
|
@ -3,10 +3,6 @@ it, and then it's later changed to also have versioning=yes, an exporttree
|
|||
that removes some of the original files can lose the only remaining copy of
|
||||
them.
|
||||
|
||||
exporttree does not currently check numcopies before removing from an
|
||||
export. Normally all export remotes are untrusted, so they can't count as a
|
||||
copy, and so removing something from them cannot violate numcopies.
|
||||
|
||||
An appendonly remote, such as S3 with exporttree=yes, is supposed to not
|
||||
let git-annex remove content from it. So such a remote can be not
|
||||
untrusted, and exporttree can remove content from its exported tree without
|
||||
|
@ -19,25 +15,31 @@ appendonly remote. When exporttree removes a file from that S3 remote,
|
|||
it could have contained the only copy of the file, and it may not have
|
||||
versioning info for that file, so the only copy is lost.
|
||||
|
||||
S3 remotes could refuse to allow versioning=yes to be set during
|
||||
enableremote, and only allow it at initremote time. And check that the
|
||||
bucket does indeed have versioning enabled or refuse to allow that
|
||||
configuration. That would avoid the problem.
|
||||
## Migration advice for users affected by this bug
|
||||
|
||||
(Unless the user changed the bucket configuration later to not allow
|
||||
versioning. But if they did so, and an old version of the bucket was the
|
||||
only place a file was stored, they would lose data without git-annex being
|
||||
run at all, so it's equivilant to them deleting the bucket, so this seems
|
||||
not something it needs to worry about).
|
||||
If you think your S3 remote may be affected by this problem, you should
|
||||
immediately set it to untrusted to avoid data loss:
|
||||
`git annex untrust $mys3remotename`
|
||||
|
||||
Plan:
|
||||
If you see a warning message "Remote is configured to use versioning, but no S3 version ID is recorded for this key",
|
||||
your S3 remote is affected.
|
||||
|
||||
Also, the fixed git-annex (version 7.20190129) will detect the problem,
|
||||
and refuse to delete unversioned files from your versioned S3 bucket.
|
||||
|
||||
This will leave you with a S3 remote containing some versioned and some
|
||||
unversioned files. Kind of a mess. Best thing to do is to make a new
|
||||
S3 remote, with versioning=yes exporttree=yes set from the beginning,
|
||||
and copy all the content that was in the old S3 remote over to it.
|
||||
Then you can delete the old S3 bucket, and use `git annex dead` to
|
||||
make git-annex stop using it.
|
||||
|
||||
## Fix
|
||||
|
||||
* Wait for the PutBucketVersioning pull request to be merged.
|
||||
(Done, not in a release yet, but will probably be aws-0.22)
|
||||
* Auto-enable versioning during initremote (and not enableremote)
|
||||
when versioning=yes. (Or prompt user to do it when aws is too old.)
|
||||
(Done)
|
||||
* Do not allow changing versioning= during enableremote.
|
||||
* Any repos that previously enabled versioning after storing some
|
||||
unversioned files still are at risk of data loss. Detect this
|
||||
case and treat them as versioning=no. How?
|
||||
* Make removeExport and renameExport check that
|
||||
there is a S3 version ID known, and fail if not.
|
||||
|
||||
[[done]] --[[Joey]]
|
||||
|
|
Loading…
Reference in a new issue