6fbd337e34
When running e.g. git-annex get, for each file it has to read from and write to the keys database. But it reads exclusively from one table and writes to a different table, so it is not necessary to flush the writes to the database before reading. Rather than writing to the database once per file, this buffers up to 1000 changes before writing.

Benchmarking getting 1000 small files from a local origin: git-annex get now takes 13.62s, down from 22.41s! git-annex drop now takes 9.07s, down from 18.63s! Wowowowowowowow!

(It would perhaps have been better to use separate databases for the two tables; at least that would have avoided this complexity. Ah well, this is better than splitting the table in an annex.version upgrade.)

Sponsored-by: Dartmouth College's Datalad project
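The buffering idea can be sketched outside of git-annex's actual database layer. This is a minimal illustration, not git-annex's implementation: the names `Buffer`, `queueWrite`, and `flushBeforeRead` are hypothetical. Writes are queued in memory and each queued write marks its table dirty; a read then flushes the queue only when the table it reads from is dirty, so the read-one-table/write-another pattern described above never flushes per file.

```haskell
import Data.IORef

-- Which table a query touches (mirrors DbTable in the module below).
data Table = Associated | Content deriving (Eq, Show)

-- Per-table pending-change flags (mirrors DbTablesChanged).
data Changed = Changed { assocChanged :: Bool, contentChanged :: Bool }

dirtyFor :: Table -> Changed -> Bool
dirtyFor Associated = assocChanged
dirtyFor Content = contentChanged

markDirty :: Table -> Changed -> Changed
markDirty Associated c = c { assocChanged = True }
markDirty Content c = c { contentChanged = True }

-- Hypothetical in-memory write buffer: queued statements plus dirty flags.
data Buffer = Buffer { queued :: [String], changed :: Changed }

emptyBuffer :: Buffer
emptyBuffer = Buffer [] (Changed False False)

-- Queue a write without touching the database yet.
queueWrite :: IORef Buffer -> Table -> String -> IO ()
queueWrite ref t stmt = modifyIORef' ref $ \b ->
	Buffer (stmt : queued b) (markDirty t (changed b))

-- Before reading table t, flush only if t itself has buffered writes.
-- Reading the content table while only the associated table is dirty
-- skips the flush entirely.
flushBeforeRead :: IORef Buffer -> Table -> IO ()
flushBeforeRead ref t = do
	b <- readIORef ref
	if dirtyFor t (changed b)
		then writeIORef ref emptyBuffer -- a real version would write the queue out here
		else return ()
```

In a real implementation the queue would also be flushed once it reaches a size limit (the 1000 changes mentioned above) and when the database is closed.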
38 lines
1 KiB
Haskell
{- Keeping track of which tables in the keys database have changed
 -
 - Copyright 2022 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Database.Keys.Tables where

import Data.Monoid
import qualified Data.Semigroup as Sem
import Prelude

data DbTable = AssociatedTable | ContentTable
	deriving (Eq, Show)

data DbTablesChanged = DbTablesChanged
	{ associatedTable :: Bool
	, contentTable :: Bool
	}
	deriving (Show)

instance Sem.Semigroup DbTablesChanged where
	a <> b = DbTablesChanged
		{ associatedTable = associatedTable a || associatedTable b
		, contentTable = contentTable a || contentTable b
		}

instance Monoid DbTablesChanged where
	mempty = DbTablesChanged False False

addDbTable :: DbTablesChanged -> DbTable -> DbTablesChanged
addDbTable ts AssociatedTable = ts { associatedTable = True }
addDbTable ts ContentTable = ts { contentTable = True }

isDbTableChanged :: DbTablesChanged -> DbTable -> Bool
isDbTableChanged ts AssociatedTable = associatedTable ts
isDbTableChanged ts ContentTable = contentTable ts