cache one more log file for metadata

My worry was that a preferred content expression that matches on metadata
would have removed the location log from cache, causing an expensive
re-read when a Seek action later checked the location log.

Especially when the --all optimisation in the previous commit
pre-cached the location log.

This also means that the --all optimisation could cache the metadata log
too, if it wanted to, but that is not currently done.

The cache is a list, with the most recently accessed file first. That
optimises it for the common case of reading the same file twice, eg a
get, followed by examining the result and then a set, reads the file
twice. And sync --content commonly reads the location log 3 times in a row.

But, as a list, it should not be made to be too long. I thought about
expanding it to 5 items, but that seemed unlikely to be a win commonly
enough to outweigh the extra time spent checking the cache.
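
A minimal sketch of the kind of short, most-recent-first list cache described above. This is not the code from this commit: the names lookupCache, insertCache, maxCachedFiles and the FileName type are hypothetical, and the cap of 2 is an assumption based only on the commit title ("one more").

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical stand-in for the RawFilePath used in BranchState.
type FileName = String

-- A short association list, most recently read file first.
type Cache = [(FileName, L.ByteString)]

-- Assumed cap on the list length; the real value is not shown here.
maxCachedFiles :: Int
maxCachedFiles = 2

-- Linear scan from the head, so re-reading the most recent file
-- costs a single comparison.
lookupCache :: FileName -> Cache -> Maybe L.ByteString
lookupCache = lookup

-- Put a freshly read file at the front, drop any stale entry for the
-- same file, and trim the list so it never grows past the cap.
insertCache :: FileName -> L.ByteString -> Cache -> Cache
insertCache file content cache =
    take maxCachedFiles ((file, content) : filter ((/= file) . fst) cache)
```

With a cap of 2, reading the location log and then the metadata log leaves both cached, so a later check of the location log is still a hit. Reading a third file would evict the oldest of the two.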

Clearly there could be some further benchmarking and tuning here.
Joey Hess 2020-07-07 14:18:55 -04:00
parent d010ab04be
commit 9483b10469
3 changed files with 35 additions and 20 deletions

@@ -19,10 +19,8 @@ data BranchState = BranchState
     , journalIgnorable :: Bool
     -- ^ can reading the journal be skipped, while still getting
     -- sufficiently up-to-date information from the branch?
-    , cachedFile :: Maybe RawFilePath
-    -- ^ a file recently read from the branch
-    , cachedContent :: L.ByteString
-    -- ^ content of the cachedFile
+    , cachedFileContents :: [(RawFilePath, L.ByteString)]
+    -- ^ contents of a few files recently read from the branch
     , needInteractiveAccess :: Bool
     -- ^ do new changes written to the journal or branch by another
     -- process need to be noticed while the current process is running?
@@ -31,4 +29,4 @@ data BranchState = BranchState
     }
 
 startBranchState :: BranchState
-startBranchState = BranchState False False False Nothing mempty False
+startBranchState = BranchState False False False [] False