comment

This commit was sponsored by Ethan Aubin.
2020-10-12 15:47:46 -04:00 · 2020-10-12 15:47:46 -04:00 · db1def72ee
commit db1def72ee
parent 8e7eeb753d
1 changed files with 32 additions and 0 deletions
--- a/doc/todo/memory_use_increase/comment_4_774d540ce6f5c3ffda924159e146721e._comment
+++ b/doc/todo/memory_use_increase/comment_4_774d540ce6f5c3ffda924159e146721e._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2020-10-12T19:19:40Z"
+ content="""
+Thinking a little more about this, the lazy bytestring it reads probably
+arrives in chunks of around 32 KB. The git ls-files --stage output segment
+for a file is 50 bytes plus the filename, so probably under 200 bytes.
+
+The lazy bytestring is split into those segments, and then each segment
+is copied to a strict bytestring with L.toStrict.
+
+How does a lazy bytestring get split on NUL? L.split uses L.take.
+L.take uses S.take on the chunk. S.take simply updates the length of
+the bytestring, but the result still keeps the rest of it allocated.
+(And similarly for drop, I assume.)
+
+So, if L.toStrict is run on a lazy bytestring consisting of a single chunk,
+a strict bytestring whose length has been reduced by L.split, the rest of
+that chunk is still allocated. And L.toStrict has a special case for
+single-chunk input that bypasses the usual copying:
+
+    goLen1 _   bs Empty = bs
+
+That keeps the original strict bytestring, not copying it. And so
+the rest of it, after the NUL, remains allocated for as long as the result
+is in use.
+
+Hmm, this doesn't explain the memory leak (throwing in an S.copy didn't fix
+it either) or why profiling doesn't show the full memory use, but it does
+explain the PINNED memory use, probably.
+"""]]
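
The buffer sharing described in the comment can be checked with a small standalone program. This is only a sketch, not code from git-annex: `payloadAddr` and `sharesBufferWith` are hypothetical helpers introduced here, and the 32768-byte `chunk` and 100-byte `segment` are stand-ins for one lazy chunk and one ls-files segment. It compares the address of the first payload byte, on the assumption that two bytestrings starting at the same address share one underlying buffer.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Unsafe (unsafeUseAsCString)
import Data.Word (Word8)
import Foreign.Ptr (Ptr, castPtr)

-- Hypothetical helper: address of the first payload byte.
-- The pointer is only compared, never dereferenced.
payloadAddr :: S.ByteString -> IO (Ptr Word8)
payloadAddr bs = unsafeUseAsCString bs (return . castPtr)

-- Hypothetical helper: True when both bytestrings start at the same
-- address, which is taken here to mean they share one buffer.
sharesBufferWith :: S.ByteString -> S.ByteString -> IO Bool
sharesBufferWith a b = (==) <$> payloadAddr a <*> payloadAddr b

main :: IO ()
main = do
  let chunk   = S.replicate 32768 0x61             -- stand-in for one lazy chunk
      segment = S.take 100 chunk                   -- short segment cut from it
      viaLazy = L.toStrict (L.fromStrict segment)  -- single-chunk toStrict
      copied  = S.copy segment                     -- fresh 100-byte buffer
  -- Does S.take reuse the chunk's buffer?
  print =<< segment `sharesBufferWith` chunk
  -- Does the single-chunk L.toStrict hand back the same buffer?
  print =<< viaLazy `sharesBufferWith` chunk
  -- Does S.copy allocate a separate buffer?
  print =<< copied `sharesBufferWith` chunk
  -- Keep the original chunk referenced until the end.
  print (S.length chunk)
```

If the first two checks report sharing and the third does not, that would match the explanation above: only an explicit S.copy detaches the short segment from the full chunk.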