partially fix concurrency issue in updating the rollingtotal

It's possible for two processes or threads to both be doing the same
operation at the same time. Eg, both dropping the same key. If one
finishes and updates the rollingtotal, then the other one needs to be
prevented from later updating the rollingtotal as well. And they could
finish at the same time, or with some time in between.

Addressed this by making updateRepoSize be called with the journal
locked, and only once it's been determined that there is an actual
location change to record in the log. updateRepoSize waits for the
database to be updated.

When there is a redundant operation, updateRepoSize won't be called,
and the redundant LiveUpdate will be removed from the database on
garbage collection.

But: There will be a window where the redundant LiveUpdate is still
visible in the db, and processes can see it, combine it with the
rollingtotal, and arrive at the wrong size. This is a small window, but
it still ought to be addressed. Unsure if it would always be safe to
remove the redundant LiveUpdate? Consider the case where two drops and a
get are all running concurrently somehow, and the order they finish is
[drop, get, drop]. The second drop seems redundant to the first, but
it would not be safe to remove it. While this seems unlikely, it's hard
to rule out that a get and drop at different stages can both be running
at the same time.
This commit is contained in:
Joey Hess 2024-08-26 09:43:32 -04:00
parent 03c7f99957
commit db89e39df6
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
6 changed files with 42 additions and 72 deletions

View file

@ -10,7 +10,6 @@
module Annex.RepoSize.LiveUpdate where
import Annex.Common
import qualified Annex
import Logs.Presence.Pure
import qualified Database.RepoSize as Db
import Annex.UUID
@ -20,22 +19,17 @@ import qualified Data.Map.Strict as M
import qualified Data.Set as S
import System.Process
{- Called when a location log change is journalled, so the LiveUpdate
- is done. This is called with the journal still locked, so no concurrent
- changes can happen while it's running. Waits for the database
- to be updated. -}
updateRepoSize :: LiveUpdate -> UUID -> Key -> LogStatus -> Annex ()
updateRepoSize lu u k s = do
liftIO $ finishedLiveUpdate lu u k sc
rsv <- Annex.getRead Annex.reposizes
liftIO (takeMVar rsv) >>= \case
Nothing -> liftIO (putMVar rsv Nothing)
Just sizemap -> do
let !sizemap' = M.adjust
(fromMaybe (RepoSize 0) . f k . Just)
u sizemap
liftIO $ putMVar rsv (Just sizemap')
updateRepoSize lu u k s = liftIO $ finishedLiveUpdate lu u k sc
where
(sc, f) = case s of
InfoPresent -> (AddingKey, addKeyRepoSize)
InfoMissing -> (RemovingKey, removeKeyRepoSize)
InfoDead -> (RemovingKey, removeKeyRepoSize)
sc = case s of
InfoPresent -> AddingKey
InfoMissing -> RemovingKey
InfoDead -> RemovingKey
addKeyRepoSize :: Key -> Maybe RepoSize -> Maybe RepoSize
addKeyRepoSize k mrs = case mrs of
@ -88,11 +82,8 @@ prepareLiveUpdate mu k sc = do
{- Wait for finishedLiveUpdate to be called, or for the LiveUpdate
- to get garbage collected in the case where the change didn't
- actually happen. -}
- actually happen. Updates the database. -}
waitdone donev finishv h u cid = tryNonAsync (takeMVar donev) >>= \case
-- TODO need to update local state too, and it must be done
-- with locking around the state update and this database
-- update.
Right (Just (u', k', sc'))
| u' == u && k' == k && sc' == sc -> do
Db.successfullyFinishedLiveSizeChange h u k sc cid