git-annex/Annex/Concurrent.hs
Joey Hess 659640e224
separate queue for cleanup actions
When running multiple concurrent actions, the cleanup phase is run in a
separate queue from the main action queue. This can make some commands
faster, because less time is spent on bookkeeping in between each file
transfer.

But as far as I can see, nothing will be sped up much by this yet, because
all the existing cleanup actions are very light-weight. This is just groundwork
for deferring checksum verification to cleanup time.

This change does mean that if the user expects -J2 will mean that they see no
more than 2 jobs running at a time, they may be surprised to see 4 in some
cases (if the cleanup actions are slow enough to notice).
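To illustrate why -J2 can show up to 4 jobs, here is a minimal sketch of a two-stage worker pool using only base. It is not git-annex's implementation: `Job`, `startPool`, and `runJobs` are hypothetical names, and the real code schedules stages differently. Each stage gets its own queue and its own n workers, so n main actions and n cleanup actions can be in flight at once.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM, replicateM_)

-- Hypothetical job type: the main action returns its cleanup action.
type Job = IO (IO ())

-- Start n workers that take items from the queue until they read Nothing.
-- Each returned MVar is filled when the corresponding worker exits.
startPool :: Int -> Chan (Maybe a) -> (a -> IO ()) -> IO [MVar ()]
startPool n queue handle = replicateM n $ do
    done <- newEmptyMVar
    _ <- forkIO (loop >> putMVar done ())
    return done
  where
    loop = do
        item <- readChan queue
        case item of
            Nothing -> return ()
            Just x -> handle x >> loop

-- Run jobs with n main workers and n cleanup workers (up to 2n threads busy).
runJobs :: Int -> [Job] -> IO ()
runJobs n jobs = do
    mainQ <- newChan
    cleanupQ <- newChan
    -- A main worker runs the job, then hands its cleanup to the other queue.
    mainDone <- startPool n mainQ (\job -> job >>= writeChan cleanupQ . Just)
    cleanupDone <- startPool n cleanupQ id
    forM_ jobs (writeChan mainQ . Just)
    replicateM_ n (writeChan mainQ Nothing)
    mapM_ takeMVar mainDone
    -- All cleanups are queued by now, so the stop markers come last.
    replicateM_ n (writeChan cleanupQ Nothing)
    mapM_ takeMVar cleanupDone

-- Example job: a pretend transfer followed by a pretend cleanup.
exampleJob :: Int -> Job
exampleJob i = do
    putStrLn ("transfer " ++ show i)
    return (putStrLn ("cleanup " ++ show i))

main :: IO ()
main = runJobs 2 (map exampleJob [1 .. 4])
```

With `runJobs 2`, a main worker can start the next transfer as soon as it has queued the previous cleanup, which is where the time saving and the extra visible jobs both come from.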

It might also make sense to enable background cleanup without the -J,
for at least one cleanup action. Indeed, that's the behavior that -J1
has now. At some point in the future, it may make sense to make the
behavior with no -J the same as -J1. The only reason it's not currently
is that git-annex can build w/o concurrent-output, and also any bugs
in concurrent-output (such as perhaps misbehaving on non-VT100 compatible
terminals) are avoided by default by only using it when -J is used.
2019-06-05 17:54:35 -04:00


{- git-annex concurrent state
-
- Copyright 2015 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
module Annex.Concurrent where
import Annex
import Annex.Common
import Annex.Action
import qualified Annex.Queue
import qualified Data.Map as M
{- Allows forking off a thread that uses a copy of the current AnnexState
- to run an Annex action.
-
- The returned IO action can be used to start the thread.
- It returns an Annex action that must be run in the original
- calling context to merge the forked AnnexState back into the
- current AnnexState.
-}
forkState :: Annex a -> Annex (IO (Annex a))
forkState a = do
    st <- dupState
    return $ do
        (ret, newst) <- run st a
        return $ do
            mergeState newst
            return ret
{- Returns a copy of the current AnnexState that is safe to be
- used when forking off a thread.
-
- After an Annex action is run using this AnnexState, it
- should be merged back into the current Annex's state,
- by calling mergeState.
-}
dupState :: Annex AnnexState
dupState = do
    st <- Annex.getState id
    return $ st
        -- each thread has its own repoqueue
        { Annex.repoqueue = Nothing
        -- avoid sharing eg, open file handles
        , Annex.catfilehandles = M.empty
        , Annex.checkattrhandle = Nothing
        , Annex.checkignorehandle = Nothing
        }
{- Merges the passed AnnexState into the current Annex state.
- Also closes various handles in it. -}
mergeState :: AnnexState -> Annex ()
mergeState st = do
    st' <- liftIO $ snd <$> run st stopCoProcesses
    forM_ (M.toList $ Annex.cleanup st') $
        uncurry addCleanup
    Annex.Queue.mergeFrom st'
    changeState $ \s -> s { errcounter = errcounter s + errcounter st' }
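
The three-stage shape of forkState (copy the state, run in a thread, merge back in the caller) can be tried outside git-annex with a toy stand-in. The sketch below is an assumption-laden simplification, not the module's code: `ToyState`, `dupToy`, `mergeToy`, and `forkToy` are hypothetical names, an IORef replaces the Annex monad, and only a pending-item list is merged.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for AnnexState.
data ToyState = ToyState
    { pending :: [String]        -- plays the role of the per-thread queue
    , openHandle :: Maybe String -- plays the role of a handle that is not shared
    } deriving (Show)

-- Like dupState: copy the state, giving the thread an empty queue and no handle.
dupToy :: IORef ToyState -> IO (IORef ToyState)
dupToy ref = do
    st <- readIORef ref
    newIORef st { pending = [], openHandle = Nothing }

-- Like mergeState: fold what the thread queued back into the parent.
mergeToy :: IORef ToyState -> IORef ToyState -> IO ()
mergeToy parent child = do
    c <- readIORef child
    modifyIORef' parent $ \p -> p { pending = pending p ++ pending c }

-- Like forkState: the outer action copies the state, the middle action is
-- what the thread runs, and the inner action merges in the caller's context.
forkToy :: IORef ToyState -> (IORef ToyState -> IO a) -> IO (IO (IO a))
forkToy parent action = do
    child <- dupToy parent
    return $ do
        ret <- action child
        return $ do
            mergeToy parent child
            return ret

main :: IO ()
main = do
    parent <- newIORef (ToyState ["a"] (Just "h"))
    start <- forkToy parent $ \st -> do
        modifyIORef' st $ \s -> s { pending = pending s ++ ["b"] }
        return (42 :: Int)
    result <- newEmptyMVar
    _ <- forkIO (start >>= putMVar result)  -- run the middle stage in a thread
    merge <- takeMVar result                -- wait for the thread to finish
    n <- merge                              -- merge back in the calling thread
    final <- readIORef parent
    print (n, pending final)                -- prints (42,["a","b"])
```

Returning the merge step as a separate action keeps all writes to the parent state in the calling thread, so the forked thread never touches it directly.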