git-annex/Database/Keys/Handle.hs
Joey Hess 6fbd337e34
avoid uncessary keys db writes; doubled speed!
When running eg git-annex get, for each file it has to read from and
write to the keys database. But it's reading exclusively from one table,
and writing to a different table. So, it is not necessary to flush the
write to the database before reading. This avoids writing the database
once per file, instead it will buffer 1000 changes before writing.

Benchmarking getting 1000 small files from a local origin,
git-annex get now takes 13.62s, down from 22.41s!
git-annex drop now takes 9.07s, down from 18.63s!
Wowowowowowowow!

(It would perhaps have been better if there were separate databases for
the two tables. At least it would have avoided this complexity. Ah well,
this is better than splitting the table in a annex.version upgrade.)

Sponsored-by: Dartmouth College's Datalad project
2022-10-12 15:33:16 -04:00

69 lines
1.7 KiB
Haskell

{- Handle for the Keys database.
-
- Copyright 2015 Joey Hess <id@joeyh.name>
-:
- Licensed under the GNU AGPL version 3 or higher.
-}
module Database.Keys.Handle (
DbHandle,
newDbHandle,
DbState(..),
withDbState,
flushDbQueue,
closeDbHandle,
) where
import qualified Database.Queue as H
import Database.Keys.Tables
import Utility.Exception
import Utility.DebugLocks
import Control.Concurrent
import Control.Monad.IO.Class (liftIO, MonadIO)
import Control.Applicative
import Prelude
-- The MVar is always left full except when actions are run
-- that access the database.
newtype DbHandle = DbHandle (MVar DbState)
-- The database can be closed or open, but it also may have been
-- tried to open (for read) and didn't exist yet or is not readable.
data DbState = DbClosed | DbOpen (H.DbQueue, DbTablesChanged) | DbUnavailable
newDbHandle :: IO DbHandle
newDbHandle = DbHandle <$> newMVar DbClosed
-- Runs an action on the state of the handle, which can change its state.
-- The MVar is empty while the action runs, which blocks other users
-- of the handle from running.
withDbState
:: (MonadIO m, MonadCatch m)
=> DbHandle
-> (DbState -> m (v, DbState))
-> m v
withDbState (DbHandle mvar) a = do
st <- liftIO $ debugLocks $ takeMVar mvar
go st `onException` (liftIO $ debugLocks $ putMVar mvar st)
where
go st = do
(v, st') <- a st
liftIO $ debugLocks $ putMVar mvar st'
return v
flushDbQueue :: DbHandle -> IO ()
flushDbQueue h = withDbState h go
where
go (DbOpen (qh, _)) = do
H.flushDbQueue qh
return ((), DbOpen (qh, mempty))
go st = return ((), st)
closeDbHandle :: DbHandle -> IO ()
closeDbHandle h = withDbState h go
where
go (DbOpen (qh, _)) = do
H.closeDbQueue qh
return ((), DbClosed)
go st = return ((), st)