Joey Hess 2020-10-19 14:48:39 -04:00
parent 72644d919a
commit 5009c1ce68
GPG key ID: DB12DB0FF05F8F38
4 changed files with 66 additions and 0 deletions

@@ -0,0 +1,29 @@
[[!comment format=mdwn
username="joey"
subject="""comment 8"""
date="2020-10-19T18:27:56Z"
content="""
Update after some recent optimisations involving seekFilteredKeys.

    Mon Oct 19 14:36 2020 Time and Allocation Profiling Report (Final)
    git-annex +RTS -p -RTS find
    total time = 1.32 secs (1316 ticks @ 1000 us, 1 processor)
    total alloc = 583,359,800 bytes (excludes profiling overheads)

    COST CENTRE       MODULE                           SRC                                                  %time  %alloc
    keyFile           Annex.Locations                  Annex/Locations.hs:(602,1)-(612,30)                   13.2    32.7
    readObjectContent Git.CatFile                      Git/CatFile.hs:(122,1)-(131,42)                       10.9     2.3
    copyAndFreeze     Data.ByteArray.Methods           Data/ByteArray/Methods.hs:(237,1)-(240,21)             5.2     0.8
    withPtr           Basement.Block.Base              Basement/Block/Base.hs:(402,1)-(411,31)                5.2     4.1
    parseLinkTarget   Annex.Link                       Annex/Link.hs:(263,1)-(271,25)                         4.5    12.3
    fileKey           Annex.Locations                  Annex/Locations.hs:(618,1)-(628,41)                    4.5     8.1
    decimal           Data.Attoparsec.ByteString.Char8 Data/Attoparsec/ByteString/Char8.hs:(447,1)-(448,49)   4.3     4.5
    doesPathExist     Utility.RawFilePath              Utility/RawFilePath.hs:32:1-25                         3.9     0.6
    ifM               Utility.Monad                    Utility/Monad.hs:(54,1)-(56,44)                        3.0     3.2
    catObjectStream   Git.CatFile                      Git/CatFile.hs:(322,1)-(330,45)                        2.6     2.5
    </>               System.FilePath.Posix.ByteString System/FilePath/Posix/../Internal.hs:841:1-15          2.4     3.4
    respParser        Git.CatFile                      Git/CatFile.hs:(175,1)-(182,39)                        2.4     1.3
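
To put the `%time` column in absolute terms, here is a minimal sketch (not part of git-annex) that multiplies each percentage by the report's total time. The helper name `seconds_per_cost_centre` is hypothetical, and only three rows are copied in for illustration; the estimates are only as precise as the rounded percentages.

```python
TOTAL_SECS = 1.32  # "total time" from the report header

# A few rows copied from the report: COST CENTRE, MODULE, SRC, %time, %alloc.
ROWS = [
    "keyFile Annex.Locations Annex/Locations.hs:(602,1)-(612,30) 13.2 32.7",
    "readObjectContent Git.CatFile Git/CatFile.hs:(122,1)-(131,42) 10.9 2.3",
    "parseLinkTarget Annex.Link Annex/Link.hs:(263,1)-(271,25) 4.5 12.3",
]


def seconds_per_cost_centre(rows, total_secs):
    # Hypothetical helper: map each cost centre name to its estimated seconds.
    estimates = {}
    for row in rows:
        # None of the five fields contains whitespace, so a plain split works.
        name, _module, _src, pct_time, _pct_alloc = row.split()
        estimates[name] = total_secs * float(pct_time) / 100
    return estimates


estimates = seconds_per_cost_centre(ROWS, TOTAL_SECS)
for name, secs in estimates.items():
    print(f"{name}: ~{secs:.3f}s")
```

On these numbers, `keyFile` accounts for a bit under 0.2 seconds of the 1.32 second run.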
"""]]

@@ -0,0 +1,18 @@
[[!comment format=mdwn
username="joey"
subject="""comment 9"""
date="2020-10-19T18:35:02Z"
content="""
Also, `/usr/bin/time git-annex find`:

    1.70user 0.27system 0:01.55elapsed 126%CPU (0avgtext+0avgdata 97352maxresident)k
    0inputs+0outputs (0major+9303minor)pagefaults 0swaps

The maxresident seems high, but a stack profile does not show a memory
leak, or anywhere near that much memory in use. My current guess is
that the memory is being preallocated by the ghc runtime,
or something like that. (See [[todo/memory_use_increase]].)

This is with ghc 8.8.4.
Should keep an eye on this with newer ghc versions.
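
For scale, the maxresident figure can be converted to MiB. This is a minimal sketch, assuming the output comes from GNU time in its default format, where the number before `maxresident)k` is in kilobytes; the helper name `max_resident_mib` is hypothetical.

```python
import re

# First line of the /usr/bin/time output quoted in this comment.
TIME_LINE = (
    "1.70user 0.27system 0:01.55elapsed 126%CPU "
    "(0avgtext+0avgdata 97352maxresident)k"
)


def max_resident_mib(time_line):
    # Hypothetical helper: extract the peak resident set size and convert
    # it to MiB, assuming the reported unit is kilobytes (GNU time default).
    match = re.search(r"(\d+)maxresident\)k", time_line)
    if match is None:
        raise ValueError("no maxresident field found")
    return int(match.group(1)) / 1024


print(f"peak resident memory: ~{max_resident_mib(TIME_LINE):.1f} MiB")
```

That comes to roughly 95 MiB of peak resident memory.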
"""]]

@@ -30,3 +30,5 @@ Version 8.20200720 (cat-file --buffer)
Smoking gun. And probably reasonable. But, why exactly does that optimisation
change the memory use in this way?
[[done]]

@@ -0,0 +1,17 @@
[[!comment format=mdwn
username="joey"
subject="""comment 10"""
date="2020-10-19T17:33:49Z"
content="""
[[/profiling]] has a history of `+RTS -p` profiles in the same repo.
Compared against the 10 month old one there, current `git-annex find`
runs in the same time, and actually allocates slightly less memory, 583357880
bytes down from 608475328. That's memory churn, not max memory usage,
so doesn't rule out a memory leak. But if there is one, it's memory that
was allocated before, so it would need to be a laziness bug I think.
And the profiles are not showing another such leak.
My feeling is, what's left now is all due to a change to the Haskell runtime's
memory management, or a library. So it's not worth keeping this open, since I
can't do anything about it except keep an eye on it.
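
To quantify "slightly less", here is a small sketch of the arithmetic on the two allocation totals quoted above. The variable and function names are hypothetical, and as noted, these are total bytes allocated over the run (churn), not peak memory.

```python
# Total allocation reported by the two +RTS -p profiles, in bytes.
previous_alloc = 608_475_328  # the 10 month old profile
current_alloc = 583_357_880   # the current profile


def relative_change_percent(old, new):
    # Signed percentage change from old to new; negative means a reduction.
    return (new - old) / old * 100


saved_bytes = previous_alloc - current_alloc
change = relative_change_percent(previous_alloc, current_alloc)
print(f"{saved_bytes} fewer bytes allocated ({change:.2f}%)")
```

That is a reduction of about 25 MB, or roughly 4%.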
"""]]