Avoid running multiple bup split processes concurrently

Since bup split is not concurrency safe.

Used a lock file so that 2 git-annex processes only run one bup split
between them (per bup repo).

(Concurrent writes from different git-annex repository clones to the same
bup repo could still have concurrency problems.)

Sponsored-by: Noam Kremen on Patreon
This commit is contained in:
Joey Hess 2022-08-08 18:54:06 -04:00
parent 04247fb4d0
commit abd417d4fe
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
3 changed files with 31 additions and 2 deletions

View file

@ -15,6 +15,8 @@ git-annex (10.20220725) UNRELEASED; urgency=medium
when when core.untrackedCache is set, and broke git-annex init.
* Improve output when storing to bup.
* When bup split fails, display its stderr.
* Avoid running multiple bup split processes concurrently, since
bup is not concurrency safe.
-- Joey Hess <id@joeyh.name> Mon, 25 Jul 2022 15:35:45 -0400

View file

@ -5,13 +5,14 @@
- Licensed under the GNU AGPL version 3 or higher.
-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE RankNTypes, OverloadedStrings #-}
module Remote.Bup (remote) where
import qualified Data.Map as M
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified System.FilePath.ByteString as P
import Data.ByteString.Lazy.UTF8 (fromString)
import Control.Concurrent.Async
@ -35,6 +36,8 @@ import Utility.Hash
import Utility.UserInfo
import Annex.UUID
import Annex.Ssh
import Annex.LockFile
import Annex.Perms
import Utility.Metered
import Types.ProposedAccepted
@ -155,7 +158,7 @@ bupSplitParams r buprepo k src =
(os ++ [Param "-q", Param "-n", Param (bupRef k)] ++ src)
store :: Remote -> BupRepo -> Storer
store r buprepo = byteStorer $ \k b p -> do
store r buprepo = byteStorer $ \k b p -> lockBup r $ do
liftIO $ withNullHandle $ \nullh ->
let params = bupSplitParams r buprepo k []
cmd = (proc "bup" (toCommand params))
@ -183,6 +186,17 @@ store r buprepo = byteStorer $ \k b p -> do
" (stderr output: " ++ erroutput ++ ")"
go _ _ _ _ _ _ = error "internal"
{- Bup is not concurrency safe, so use a lock file to prevent more than
- one process from running. -}
lockBup :: Remote -> Annex a -> Annex a
lockBup r a = do
dir <- fromRepo gitAnnexRemotesDir
unlessM (liftIO $ doesDirectoryExist (fromRawFilePath dir)) $
createAnnexDirectory dir
let remoteid = fromUUID (uuid r)
let lck = dir P.</> remoteid <> ".lck"
withExclusiveLock (const lck) a
retrieve :: BupRepo -> Retriever
retrieve buprepo = byteRetriever $ \k sink -> do
let params = bupParams "join" buprepo [Param $ bupRef k]

View file

@ -0,0 +1,13 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2022-08-08T20:13:24Z"
content="""
Got confirmation that bup is generally not concurrency safe.
I've made git-annex limit the number of bup-split it runs to 1.
It may be that this will also need to be done with bup-join, but I think
probably not since it probably does not write to the repo, and a bup-split
is unlikely to get in its way.
"""]]