avoid unnecessary keys db writes; doubled speed!

When running e.g. git-annex get, for each file it has to read from and
write to the keys database. But it's reading exclusively from one table,
and writing to a different table. So, it is not necessary to flush the
write to the database before reading. This avoids writing the database
once per file; instead it will buffer 1000 changes before writing.

Benchmarking getting 1000 small files from a local origin,
git-annex get now takes 13.62s, down from 22.41s!
git-annex drop now takes 9.07s, down from 18.63s!
Wowowowowowowow!

(It would perhaps have been better if there were separate databases for
the two tables. At least it would have avoided this complexity. Ah well,
this is better than splitting the table in an annex.version upgrade.)

Sponsored-by: Dartmouth College's Datalad project
Joey Hess 2022-10-12 15:21:19 -04:00
parent ba7ecbc6a9
commit 6fbd337e34
GPG key ID: DB12DB0FF05F8F38
7 changed files with 117 additions and 59 deletions

Database/Keys/Tables.hs

@@ -0,0 +1,38 @@
{- Keeping track of which tables in the keys database have changed
 -
 - Copyright 2022 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Database.Keys.Tables where

import Data.Monoid
import qualified Data.Semigroup as Sem
import Prelude

data DbTable = AssociatedTable | ContentTable
	deriving (Eq, Show)

data DbTablesChanged = DbTablesChanged
	{ associatedTable :: Bool
	, contentTable :: Bool
	}
	deriving (Show)

instance Sem.Semigroup DbTablesChanged where
	a <> b = DbTablesChanged
		{ associatedTable = associatedTable a || associatedTable b
		, contentTable = contentTable a || contentTable b
		}

instance Monoid DbTablesChanged where
	mempty = DbTablesChanged False False

addDbTable :: DbTablesChanged -> DbTable -> DbTablesChanged
addDbTable ts AssociatedTable = ts { associatedTable = True }
addDbTable ts ContentTable = ts { contentTable = True }

isDbTableChanged :: DbTablesChanged -> DbTable -> Bool
isDbTableChanged ts AssociatedTable = associatedTable ts
isDbTableChanged ts ContentTable = contentTable ts
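
A rough sketch of how tracking changed tables could let a reader skip the
flush described in the commit message. This is not the code from the other
changed files. DbTable, DbTablesChanged, addDbTable and isDbTableChanged are
repeated from the module above so the sketch stands alone; WriteQueue,
emptyQueue, queueWrite and needsFlushBeforeRead are hypothetical names
invented for illustration.

```haskell
-- Repeated from Database.Keys.Tables so this sketch compiles on its own.
data DbTable = AssociatedTable | ContentTable
    deriving (Eq, Show)

data DbTablesChanged = DbTablesChanged
    { associatedTable :: Bool
    , contentTable :: Bool
    }
    deriving (Show)

instance Semigroup DbTablesChanged where
    a <> b = DbTablesChanged
        { associatedTable = associatedTable a || associatedTable b
        , contentTable = contentTable a || contentTable b
        }

instance Monoid DbTablesChanged where
    mempty = DbTablesChanged False False

addDbTable :: DbTablesChanged -> DbTable -> DbTablesChanged
addDbTable ts AssociatedTable = ts { associatedTable = True }
addDbTable ts ContentTable = ts { contentTable = True }

isDbTableChanged :: DbTablesChanged -> DbTable -> Bool
isDbTableChanged ts AssociatedTable = associatedTable ts
isDbTableChanged ts ContentTable = contentTable ts

-- Hypothetical: a pure model of buffered writes that have not yet
-- been committed to the database.
data WriteQueue = WriteQueue
    { pendingWrites :: Int
    , pendingTables :: DbTablesChanged
    }
    deriving (Show)

-- Hypothetical: a queue with nothing buffered.
emptyQueue :: WriteQueue
emptyQueue = WriteQueue 0 mempty

-- Hypothetical: buffer one write and remember which table it touches.
queueWrite :: DbTable -> WriteQueue -> WriteQueue
queueWrite t q = q
    { pendingWrites = pendingWrites q + 1
    , pendingTables = addDbTable (pendingTables q) t
    }

-- Hypothetical: a read only needs the queue flushed first when the
-- table being read has buffered writes.
needsFlushBeforeRead :: DbTable -> WriteQueue -> Bool
needsFlushBeforeRead t q = isDbTableChanged (pendingTables q) t
```

With this model, queueing a write to ContentTable leaves
needsFlushBeforeRead AssociatedTable False, so a read of the other table
can proceed while the writes stay buffered. The Monoid instance means the
changes recorded by several queues can be combined with <>.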