info: Added calculation of combined annex size of all repositories

Factored out overLocationLogs from CmdLine.Seek, which can calculate this
pretty fast even in a large repo. In my big repo, the time to run git-annex
info went up from 1.33s to 8.5s.

Note that the "backend usage" stats are for annexed files in the working
tree only, not all annexed files. This new data source would let that be
changed, but that would be a confusing behavior change. And I cannot
retitle it either, out of fear something uses the current title (eg parsing
the json).

Also note that, while time says "402108maxresident" in my big repo now,
up from "54092maxresident", top shows the RES constant at 64mb, and it
was 48mb before. So I don't think there is a memory leak. I tried using
deepseq to force full evaluation of addKeyCopies and memory use didn't
change, which also says no memory leak. And indeed, not even calling
addKeyCopies resulted in the same memory use. Probably the increased memory
usage is buffering the stream of data from git in overLocationLogs.

Sponsored-by: Brett Eisenberg on Patreon
This commit is contained in:
Joey Hess 2023-11-08 13:15:00 -04:00
parent 8768966d97
commit 11cc9f1933
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 90 additions and 24 deletions

View file

@ -276,24 +276,14 @@ withKeyOptions' ko auto mkkeyaction fallbackaction worktreeitems = do
-- those. This significantly speeds up typical operations
-- that need to look at the location log for each key.
runallkeys = do
checktimelimit <- mkCheckTimeLimit
keyaction <- mkkeyaction
config <- Annex.getGitConfig
let getk = locationLogFileKey config
checktimelimit <- mkCheckTimeLimit
let discard reader = reader >>= \case
Nothing -> noop
Just _ -> discard reader
let go reader = reader >>= \case
Just (k, f, content) -> checktimelimit (discard reader) $ do
maybe noop (Annex.Branch.precache f) content
unlessM (checkDead k) $
keyaction Nothing (SeekInput [], k, mkActionItem k)
go reader
Nothing -> return ()
Annex.Branch.overBranchFileContents getk go >>= \case
Just r -> return r
Nothing -> giveup "This repository is read-only, and there are unmerged git-annex branches, which prevents operating on all keys. (Set annex.merge-annex-branches to false to ignore the unmerged git-annex branches.)"
overLocationLogs' ()
(\reader cont -> checktimelimit (discard reader) cont)
(\k _ () -> keyaction Nothing (SeekInput [], k, mkActionItem k))
runkeyaction getks = do
keyaction <- mkkeyaction