git-annex/Annex/WorkTree.hs
Joey Hess 6fbd337e34
avoid uncessary keys db writes; doubled speed!
When running eg git-annex get, for each file it has to read from and
write to the keys database. But it's reading exclusively from one table,
and writing to a different table. So, it is not necessary to flush the
write to the database before reading. This avoids writing the database
once per file, instead it will buffer 1000 changes before writing.

Benchmarking getting 1000 small files from a local origin,
git-annex get now takes 13.62s, down from 22.41s!
git-annex drop now takes 9.07s, down from 18.63s!
Wowowowowowowow!

(It would perhaps have been better if there were separate databases for
the two tables. At least it would have avoided this complexity. Ah well,
this is better than splitting the table in a annex.version upgrade.)

Sponsored-by: Dartmouth College's Datalad project
2022-10-12 15:33:16 -04:00

58 lines
1.8 KiB
Haskell

{- git-annex worktree files
-
- Copyright 2013-2021 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
module Annex.WorkTree where
import Annex.Common
import Annex.Link
import Annex.CatFile
import Annex.CurrentBranch
import qualified Database.Keys
{- Looks up the key corresponding to an annexed file in the work tree,
- by examining what the file links to.
-
- An unlocked file will not have a link on disk, so fall back to
- looking for a pointer to a key in git.
-
- When in an adjusted branch that may have hidden the file, looks for a
- pointer to a key in the original branch.
-}
lookupKey :: RawFilePath -> Annex (Maybe Key)
lookupKey = lookupKey' catkeyfile
where
catkeyfile file =
ifM (liftIO $ doesFileExist $ fromRawFilePath file)
( catKeyFile file
, catKeyFileHidden file =<< getCurrentBranch
)
lookupKeyNotHidden :: RawFilePath -> Annex (Maybe Key)
lookupKeyNotHidden = lookupKey' catkeyfile
where
catkeyfile file =
ifM (liftIO $ doesFileExist $ fromRawFilePath file)
( catKeyFile file
, return Nothing
)
lookupKey' :: (RawFilePath -> Annex (Maybe Key)) -> RawFilePath -> Annex (Maybe Key)
lookupKey' catkeyfile file = isAnnexLink file >>= \case
Just key -> return (Just key)
Nothing -> catkeyfile file
{- Modifies an action to only act on files that are already annexed,
- and passes the key on to it. -}
whenAnnexed :: (RawFilePath -> Key -> Annex (Maybe a)) -> RawFilePath -> Annex (Maybe a)
whenAnnexed a file = ifAnnexed file (a file) (return Nothing)
ifAnnexed :: RawFilePath -> (Key -> Annex a) -> Annex a -> Annex a
ifAnnexed file yes no = maybe no yes =<< lookupKey file
{- Find all annexed files and update the keys database for them. -}
scanAnnexedFiles :: Annex ()
scanAnnexedFiles = Database.Keys.updateDatabase