fix branch precaching bug by checking journal

Fix bug caused by recent optimisations that could make git-annex not see
recently recorded status information when configured with
annex.alwayscommit=false.

When not using --all, precaching only gets triggered when the
command actually needs location logs, and so there's no speed hit there.

There is a minor speed hit for --all, because it precaches even when the
location log is not actually going to be used, in which case checking the
journal is wasted work. It would have been possible to defer checking the journal
until the cache gets used. But that would complicate the usual Branch.get
code path with two different kinds of caches, and the speed hit is really
minimal. A better way to speed up --all, later, would be to avoid
precaching at all when the location log is not going to be used.
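
The difference between caching the branch content blindly and consulting the journal first can be sketched with a small pure model. This is only an illustration under stated assumptions: BranchModel, precacheUnchecked, precacheChecked and exampleState are hypothetical names, the journal and cache are plain association lists, and none of this is git-annex's real Annex monad or BranchState API.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical stand-ins for illustration; not git-annex's real types.
type Path = String
type Content = String

data BranchModel = BranchModel
  { journalIgnorable :: Bool          -- True when the journal is known to be irrelevant
  , journal :: [(Path, Content)]      -- writes that have not reached the branch yet
  , cache :: [(Path, Content)]        -- newest entry first
  }

-- Model of the buggy behaviour: cache whatever was read from the branch.
precacheUnchecked :: Path -> Content -> BranchModel -> BranchModel
precacheUnchecked file branchcontent m =
  m { cache = (file, branchcontent) : cache m }

-- Model of the fix: prefer a journalled version of the file when one exists.
precacheChecked :: Path -> Content -> BranchModel -> BranchModel
precacheChecked file branchcontent m =
  m { cache = (file, content) : cache m }
  where
    content
      | journalIgnorable m = branchcontent
      | otherwise = fromMaybe branchcontent (lookup file (journal m))

-- A state like the one annex.alwayscommit=false can leave behind:
-- the journal holds a newer version than the branch.
exampleState :: BranchModel
exampleState = BranchModel
  { journalIgnorable = False
  , journal = [("abc.log", "new status")]
  , cache = []
  }

main :: IO ()
main = do
  -- prints: Just "old status"
  print (lookup "abc.log" (cache (precacheUnchecked "abc.log" "old status" exampleState)))
  -- prints: Just "new status"
  print (lookup "abc.log" (cache (precacheChecked "abc.log" "old status" exampleState)))
```

In this model the unchecked variant leaves stale content in the cache, which is the symptom described above, while the checked variant pays for one extra journal lookup per precached file.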
Joey Hess 2021-04-21 14:02:15 -04:00
parent b470673e50
commit 6eb3c0a6b4
3 changed files with 19 additions and 4 deletions

@@ -1,6 +1,6 @@
 {- management of the git-annex branch
  -
- - Copyright 2011-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -30,6 +30,7 @@ module Annex.Branch (
 rememberTreeish,
 performTransitions,
 withIndex,
+precache,
) where
import qualified Data.ByteString as B
@@ -260,6 +261,18 @@ get file = getCache file >>= \case
 setCache file content
 return content

+{- Used to cache the value of a file, which has been read from the branch
+ - using some optimised method. The journal has to be checked, in case
+ - it has a newer version of the file that has not reached the branch yet.
+ -}
+precache :: RawFilePath -> L.ByteString -> Annex ()
+precache file branchcontent = do
+	st <- getState
+	content <- if journalIgnorable st
+		then pure branchcontent
+		else fromMaybe branchcontent <$> getJournalFileStale file
+	Annex.BranchState.setCache file content

{- Like get, but does not merge the branch, so the info returned may not
- reflect changes in remotes.
- (Changing the value this returns, and then merging is always the

@@ -12,6 +12,9 @@ git-annex (8.20210331) UNRELEASED; urgency=medium
* directory: When cp supports reflinks, use it.
* init: Fix a crash when the repo was cloned from a repo that had an
adjusted branch checked out, and the origin remote is not named "origin".
+  * Fix bug caused by recent optimisations that could make git-annex not
+    see recently recorded status information when configured with
+    annex.alwayscommit=false.
-- Joey Hess <id@joeyh.name> Thu, 01 Apr 2021 12:17:26 -0400

@@ -44,7 +44,6 @@ import Annex.Concurrent
 import Annex.CheckIgnore
 import Annex.Action
 import qualified Annex.Branch
-import qualified Annex.BranchState
import qualified Database.Keys
import qualified Utility.RawFilePath as R
import Utility.Tuple
@@ -288,7 +287,7 @@ withKeyOptions' ko auto mkkeyaction fallbackaction worktreeitems = do
 let go reader = liftIO reader >>= \case
 Nothing -> return ()
 Just ((k, f), content) -> checktimelimit (discard reader) $ do
-maybe noop (Annex.BranchState.setCache f) content
+maybe noop (Annex.Branch.precache f) content
keyaction Nothing (SeekInput [], k, mkActionItem k)
go reader
catObjectStreamLsTree l (getk . getTopFilePath . LsTree.file) g go
@@ -395,7 +394,7 @@ seekFilteredKeys seeker listfs = do
 precachefinisher mi lreader checktimelimit = liftIO lreader >>= \case
 Just ((logf, (si, f), k), logcontent) -> checktimelimit discard $ do
-maybe noop (Annex.BranchState.setCache logf) logcontent
+maybe noop (Annex.Branch.precache logf) logcontent
checkMatcherWhen mi
(matcherNeedsLocationLog mi && not (matcherNeedsFileName mi))
(MatchingFile $ FileInfo f f (Just k))