disable buggy concurrency in Command.Export

Fix a crash or potentially not all files being exported when sync -J
--content is used with an export remote.

Crash as described in fixed bug report.

waitForAllRunningCommandActions inserted in several points where all the
commandActions started before need to have finished before moving on to
the next stage of the export. A race across those points could have
maybe resulted in not all files being exported, or a wrong tree being
export.

For example, changeExport starting up an action like
a rename of A to B. Then, with that action still running, fillExport
uploading a new A, *before* the rename occurred. That race seems
unlikely to have happened. There are some other ones that this also
fixes.
This commit is contained in:
Joey Hess 2020-05-26 13:44:43 -04:00
parent 51432a6704
commit 864ba4ecaa
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
4 changed files with 52 additions and 8 deletions

View file

@ -4,6 +4,8 @@ git-annex (8.20200523) UNRELEASED; urgency=medium
* import: Added --json-progress. * import: Added --json-progress.
* addurl: Make --preserve-filename also apply when eg a torrent contains * addurl: Make --preserve-filename also apply when eg a torrent contains
multiple files. multiple files.
* Fix a crash or potentially not all files being exported when
sync -J --content is used with an export remote.
* export: Let concurrent transfers be done with -J or annex.jobs. * export: Let concurrent transfers be done with -J or annex.jobs.
* move --to, copy --to, mirror --to: When concurrency is enabled, run * move --to, copy --to, mirror --to: When concurrency is enabled, run
cleanup actions in separate job pool from uploads. cleanup actions in separate job pool from uploads.

View file

@ -1,6 +1,6 @@
{- git-annex command-line actions and concurrency {- git-annex command-line actions and concurrency
- -
- Copyright 2010-2019 Joey Hess <id@joeyh.name> - Copyright 2010-2020 Joey Hess <id@joeyh.name>
- -
- Licensed under the GNU AGPL version 3 or higher. - Licensed under the GNU AGPL version 3 or higher.
-} -}
@ -136,6 +136,15 @@ finishCommandActions = Annex.getState Annex.workers >>= \case
else retry else retry
mapM_ mergeState sts mapM_ mergeState sts
{- Waits for all worker threads that have been started so far to finish. -}
waitForAllRunningCommandActions :: Annex ()
waitForAllRunningCommandActions = Annex.getState Annex.workers >>= \case
Nothing -> noop
Just tv -> liftIO $ atomically $ do
pool <- readTMVar tv
unless (allIdle pool)
retry
{- Like commandAction, but without the concurrency. -} {- Like commandAction, but without the concurrency. -}
includeCommandAction :: CommandStart -> CommandCleanup includeCommandAction :: CommandStart -> CommandCleanup
includeCommandAction start = includeCommandAction start =

View file

@ -136,6 +136,7 @@ changeExport r db (PreferredFiltered new) = do
(Git.DiffTree.file diff) (Git.DiffTree.file diff)
forM_ (incompleteExportedTreeishes old) $ \incomplete -> forM_ (incompleteExportedTreeishes old) $ \incomplete ->
mapdiff recover incomplete new mapdiff recover incomplete new
waitForAllRunningCommandActions
-- Diff the old and new trees, and delete or rename to new name all -- Diff the old and new trees, and delete or rename to new name all
-- changed files in the export. After this, every file that remains -- changed files in the export. After this, every file that remains
@ -158,12 +159,14 @@ changeExport r db (PreferredFiltered new) = do
(Just oldf, Nothing) -> (Just oldf, Nothing) ->
startUnexport' r db oldf ek startUnexport' r db oldf ek
_ -> stop _ -> stop
waitForAllRunningCommandActions
-- Rename from temp to new files. -- Rename from temp to new files.
seekdiffmap $ \(ek, (moldf, mnewf)) -> seekdiffmap $ \(ek, (moldf, mnewf)) ->
case (moldf, mnewf) of case (moldf, mnewf) of
(Just _oldf, Just newf) -> (Just _oldf, Just newf) ->
startMoveFromTempName r db ek newf startMoveFromTempName r db ek newf
_ -> stop _ -> stop
waitForAllRunningCommandActions
ts -> do ts -> do
warning "Resolving export conflict.." warning "Resolving export conflict.."
forM_ ts $ \oldtreesha -> do forM_ ts $ \oldtreesha -> do
@ -181,6 +184,7 @@ changeExport r db (PreferredFiltered new) = do
(\diff -> commandAction $ startUnexport r db (Git.DiffTree.file diff) (unexportboth diff)) (\diff -> commandAction $ startUnexport r db (Git.DiffTree.file diff) (unexportboth diff))
oldtreesha new oldtreesha new
updateExportTree db emptyTree new updateExportTree db emptyTree new
waitForAllRunningCommandActions
liftIO $ recordExportTreeCurrent db new liftIO $ recordExportTreeCurrent db new
-- Waiting until now to record the export guarantees that, -- Waiting until now to record the export guarantees that,
@ -239,6 +243,7 @@ fillExport r db (PreferredFiltered newtree) mtbcommitsha = do
allfilledvar <- liftIO $ newMVar (AllFilled True) allfilledvar <- liftIO $ newMVar (AllFilled True)
commandActions $ map (startExport r db cvar allfilledvar) l commandActions $ map (startExport r db cvar allfilledvar) l
void $ liftIO $ cleanup void $ liftIO $ cleanup
waitForAllRunningCommandActions
case mtbcommitsha of case mtbcommitsha of
Nothing -> noop Nothing -> noop
@ -484,3 +489,4 @@ filterPreferredContent r tree = logExportExcluded (uuid r) $ \logwriter -> do
) )
-- Always export non-annexed files. -- Always export non-annexed files.
Nothing -> return (Just ti) Nothing -> return (Just ti)

View file

@ -1,5 +1,5 @@
git annex sync exportremote -J2 --content git annex sync exportremote -J2 --content
...
git-annex: thread blocked indefinitely in an MVar operation git-annex: thread blocked indefinitely in an MVar operation
failed failed
git-annex: thread blocked indefinitely in an STM transaction git-annex: thread blocked indefinitely in an STM transaction
@ -9,5 +9,32 @@ when adding -J to export, but then found sync had the same bug.
To reproduce this, there may need there to be a tree of several annexed To reproduce this, there may need there to be a tree of several annexed
files whose content is not locally available. In my case, files whose content is not locally available. In my case,
there were 338 of them. It seems to export all, or all but 1 there were 338 of them. It seems to act on almost all before
before crashing. --[[Joey]] crashing. --[[Joey]]
----
It's crashing in finishCommandActions. DebugLocks does not show a backtrace.
Dumping the worker pool inside the crashing STM action, it looks like this:
WorkerPool UsedStages {initialStage = PerformStage, stageSet = fromList [PerformStage,CleanupStage]} [IdleWorker StartStage,ActiveWorker PerformStage,IdleWorker PerformStage,IdleWorker StartStage,IdleWorker PerformStage,IdleWorker StartStage,IdleWorker CleanupStage,IdleWorker CleanupStage,IdleWorker CleanupStage] 8
Always ends with an ActiveWorker PerformStage. So a worker thread is
apparently still running, but the retry blocks indefinitely, so
somehow the worker thread never transitions back to idle.
Also, the MVar crash is not from this code, so maybe the MVar crash is
the real culprit and it just also leads to the STM crash.
---
Added debugLocks to the MVar uses in Command.Export, and it's
the one in failedsend that is causing the MVar deadlock. So that must be
the root cause. Looks like fillExport is starting threads with
commandActions, but then assumes they'll all be done, so takes a MVar,
before all the threads *are* done, so a thread tries to modify the MVar and
deadlocks.
Ok, [[fixed|done]] by using includeCommandAction instead, although that
does reduce the actual concurrency. --[[Joey]]