fix deadlock in restagePointerFiles

Fix a hang that occasionally occurred during commands such as move.
(A bug introduced in 10.20220927, in
commit 6a3bd283b8)

The restage.log was kept locked while running a complex index refresh
action. In an unusual situation, that action could need to write to the
restage log, which caused a deadlock.

The solution is a two-stage process. First the restage.log is moved to a
work file, which is done with the lock held. Then the content of the work
file is read and processed, which happens without the lock being held.
This is all done in a crash-safe manner.

Note that streamRestageLog may not be fully safe to run concurrently
with itself. That's ok, because restagePointerFiles uses it with the
index lock held, so only one can be run at a time.

streamRestageLog does delete the restage.old file at the end without
locking. If a calcRestageLog is run concurrently, it will either see the
file content before it was deleted, or will see it's missing. Either is
ok, because at most this will cause calcRestageLog to report more
work remains to be done than there is.

Sponsored-by: Dartmouth College's Datalad project
This commit is contained in:
Joey Hess 2022-12-08 14:18:54 -04:00
parent a7554f1a6a
commit 65f9e7a3c7
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 75 additions and 15 deletions

View file

@ -13,8 +13,10 @@ module Logs.File (
appendLogFile,
modifyLogFile,
streamLogFile,
streamLogFileUnsafe,
checkLogFile,
calcLogFile,
calcLogFileUnsafe,
) where
import Annex.Common
@ -98,7 +100,12 @@ checkLogFile f lck matchf = withSharedLock lck $ bracket setup cleanup go
-- | Folds a function over lines of a log file to calculate a value.
calcLogFile :: RawFilePath -> RawFilePath -> t -> (L.ByteString -> t -> t) -> Annex t
calcLogFile f lck start update = withSharedLock lck $ bracket setup cleanup go
calcLogFile f lck start update =
withSharedLock lck $ calcLogFileUnsafe f start update
-- | Unsafe version that does not do locking.
calcLogFileUnsafe :: RawFilePath -> t -> (L.ByteString -> t -> t) -> Annex t
calcLogFileUnsafe f start update = bracket setup cleanup go
where
setup = liftIO $ tryWhenExists $ openFile f' ReadMode
cleanup Nothing = noop
@ -125,7 +132,15 @@ calcLogFile f lck start update = withSharedLock lck $ bracket setup cleanup go
-- is running.
streamLogFile :: FilePath -> RawFilePath -> Annex () -> (String -> Annex ()) -> Annex ()
streamLogFile f lck finalizer processor =
withExclusiveLock lck $ bracketOnError setup cleanup go
withExclusiveLock lck $ do
streamLogFileUnsafe f finalizer processor
liftIO $ writeFile f ""
setAnnexFilePerm (toRawFilePath f)
-- Unsafe version that does not do locking, and does not empty the file
-- at the end.
streamLogFileUnsafe :: FilePath -> Annex () -> (String -> Annex ()) -> Annex ()
streamLogFileUnsafe f finalizer processor = bracketOnError setup cleanup go
where
setup = liftIO $ tryWhenExists $ openFile f ReadMode
cleanup Nothing = noop
@ -135,8 +150,6 @@ streamLogFile f lck finalizer processor =
mapM_ processor =<< liftIO (lines <$> hGetContents h)
liftIO $ hClose h
finalizer
liftIO $ writeFile f ""
setAnnexFilePerm (toRawFilePath f)
createDirWhenNeeded :: RawFilePath -> Annex () -> Annex ()
createDirWhenNeeded f a = a `catchNonAsync` \_e -> do