Joey Hess 2020-10-19 14:48:39 -04:00
parent 72644d919a
commit 5009c1ce68
4 changed files with 66 additions and 0 deletions

@ -0,0 +1,29 @@
[[!comment format=mdwn
username="joey"
subject="""comment 8"""
date="2020-10-19T18:27:56Z"
content="""
Update after some recent optimisations involving seekFilteredKeys.

    Mon Oct 19 14:36 2020 Time and Allocation Profiling Report  (Final)

       git-annex +RTS -p -RTS find

    total time  =        1.32 secs   (1316 ticks @ 1000 us, 1 processor)
    total alloc = 583,359,800 bytes  (excludes profiling overheads)

    COST CENTRE        MODULE                            SRC                                                   %time %alloc

    keyFile            Annex.Locations                   Annex/Locations.hs:(602,1)-(612,30)                    13.2   32.7
    readObjectContent  Git.CatFile                       Git/CatFile.hs:(122,1)-(131,42)                        10.9    2.3
    copyAndFreeze      Data.ByteArray.Methods            Data/ByteArray/Methods.hs:(237,1)-(240,21)              5.2    0.8
    withPtr            Basement.Block.Base               Basement/Block/Base.hs:(402,1)-(411,31)                 5.2    4.1
    parseLinkTarget    Annex.Link                        Annex/Link.hs:(263,1)-(271,25)                          4.5   12.3
    fileKey            Annex.Locations                   Annex/Locations.hs:(618,1)-(628,41)                     4.5    8.1
    decimal            Data.Attoparsec.ByteString.Char8  Data/Attoparsec/ByteString/Char8.hs:(447,1)-(448,49)    4.3    4.5
    doesPathExist      Utility.RawFilePath               Utility/RawFilePath.hs:32:1-25                          3.9    0.6
    ifM                Utility.Monad                     Utility/Monad.hs:(54,1)-(56,44)                         3.0    3.2
    catObjectStream    Git.CatFile                       Git/CatFile.hs:(322,1)-(330,45)                         2.6    2.5
    </>                System.FilePath.Posix.ByteString  System/FilePath/Posix/../Internal.hs:841:1-15           2.4    3.4
    respParser         Git.CatFile                       Git/CatFile.hs:(175,1)-(182,39)                         2.4    1.3
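
To read the table by module rather than by cost centre, the rows can be summed per module. A minimal sketch in Python (not part of git-annex; `ROWS` and `totals_by_module` are hypothetical names, and the values are copied by hand from the table above, so this covers only the top cost centres listed, not the whole profile):

```python
from collections import defaultdict

# (cost centre, module, %time, %alloc), copied by hand from the profile above.
ROWS = [
    ("keyFile", "Annex.Locations", 13.2, 32.7),
    ("readObjectContent", "Git.CatFile", 10.9, 2.3),
    ("copyAndFreeze", "Data.ByteArray.Methods", 5.2, 0.8),
    ("withPtr", "Basement.Block.Base", 5.2, 4.1),
    ("parseLinkTarget", "Annex.Link", 4.5, 12.3),
    ("fileKey", "Annex.Locations", 4.5, 8.1),
    ("decimal", "Data.Attoparsec.ByteString.Char8", 4.3, 4.5),
    ("doesPathExist", "Utility.RawFilePath", 3.9, 0.6),
    ("ifM", "Utility.Monad", 3.0, 3.2),
    ("catObjectStream", "Git.CatFile", 2.6, 2.5),
    ("</>", "System.FilePath.Posix.ByteString", 2.4, 3.4),
    ("respParser", "Git.CatFile", 2.4, 1.3),
]


def totals_by_module(rows):
    # Sum %time and %alloc per module, rounded to one decimal place.
    totals = defaultdict(lambda: [0.0, 0.0])
    for _name, module, time_pct, alloc_pct in rows:
        totals[module][0] += time_pct
        totals[module][1] += alloc_pct
    return {m: (round(t, 1), round(a, 1)) for m, (t, a) in totals.items()}


if __name__ == "__main__":
    # Print modules ordered by their share of time, largest first.
    by_time = sorted(totals_by_module(ROWS).items(), key=lambda kv: -kv[1][0])
    for module, (time_pct, alloc_pct) in by_time:
        print(f"{module:34s} {time_pct:5.1f} {alloc_pct:6.1f}")
```

On these rows, Annex.Locations (keyFile plus fileKey) accounts for 17.7% of time and 40.8% of allocation, and Git.CatFile for 15.9% of time.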
"""]]

@ -0,0 +1,18 @@
[[!comment format=mdwn
username="joey"
subject="""comment 9"""
date="2020-10-19T18:35:02Z"
content="""
Also, `/usr/bin/time git-annex find`:

    1.70user 0.27system 0:01.55elapsed 126%CPU (0avgtext+0avgdata 97352maxresident)k
    0inputs+0outputs (0major+9303minor)pagefaults 0swaps

The maxresident seems high, but a stack profile does not show a memory
leak, or such a large amount of memory use at all. Currently, I
think that memory is being preallocated by the ghc runtime,
or something like that. (See [[todo/memory_use_increase]].)
ghc 8.8.4
Should keep an eye on this with newer ghc versions.
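
For keeping an eye on it across ghc versions, the maxresident figure can be pulled out of the `/usr/bin/time` output and converted to MiB. A minimal sketch in Python, assuming GNU time's default output format, where maxresident is reported in kilobytes (`TIME_OUTPUT` and `max_resident_kib` are hypothetical names, not anything in git-annex):

```python
import re

# First line of the /usr/bin/time output quoted above.
TIME_OUTPUT = (
    "1.70user 0.27system 0:01.55elapsed 126%CPU "
    "(0avgtext+0avgdata 97352maxresident)k"
)


def max_resident_kib(line):
    # Extract the maximum resident set size (in KiB) from GNU time output.
    match = re.search(r"(\d+)maxresident\)k", line)
    if match is None:
        raise ValueError("no maxresident field found")
    return int(match.group(1))


if __name__ == "__main__":
    kib = max_resident_kib(TIME_OUTPUT)
    print(f"{kib} KiB = {kib / 1024:.1f} MiB")
```

For the run above this works out to roughly 95 MiB of peak resident memory.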
"""]]

@ -30,3 +30,5 @@ Version 8.20200720 (cat-file --buffer)
Smoking gun. And probably reasonable. But, why exactly does that optimisation
change the memory use in this way?
[[done]]

@ -0,0 +1,17 @@
[[!comment format=mdwn
username="joey"
subject="""comment 10"""
date="2020-10-19T17:33:49Z"
content="""
[[/profiling]] has a history of `+RTS -p` profiles in the same repo.
Comparing against the 10 month old one there, current `git-annex find`
runs in the same time, and actually allocates slightly less memory: 583357880
bytes, down from 608475328. That's memory churn, not max memory usage,
so doesn't rule out a memory leak. But if there is one, it's memory that
was allocated before, so it would need to be a laziness bug I think.
And the profiles are not showing another such leak.
My feeling is, what's left now is all due to a change to the Haskell runtime's
memory management, or to a library. So it's not worth keeping this open, since I
can't do anything about it except keep an eye on it.
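
As a rough check on "slightly less", the difference between the two allocation totals quoted above can be worked out directly. A small sketch in Python (the variable and function names are made up for illustration):

```python
# Total allocation figures quoted in the comment above, in bytes.
OLD_ALLOC = 608475328  # the 10 month old profile
NEW_ALLOC = 583357880  # the current profile


def alloc_reduction(old, new):
    # Return (bytes saved, percentage reduction relative to the old total).
    saved = old - new
    return saved, 100.0 * saved / old


if __name__ == "__main__":
    saved, pct = alloc_reduction(OLD_ALLOC, NEW_ALLOC)
    print(f"{saved} bytes less, a {pct:.1f}% reduction")
```

That is about 25 million bytes, or roughly 4% less total allocation. As noted above, this measures churn, not peak memory use.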
"""]]