convert Key to ShortByteString

This adds the overhead of a copy when serializing and deserializing keys.
I have not benchmarked much, but runtimes seem barely changed by it.

When a lot of keys are in memory, it improves memory use.

It also prevents keys from sometimes getting PINNED in memory and failing
to be garbage collected, which is a problem ByteString sometimes has. In
particular, git-annex sync from a borg special remote had that problem, and
this reduced its memory use by a large amount.
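
A minimal sketch of the trade-off described above, not the code from this
commit. The `Key` newtype and the `serializeKey`/`deserializeKey` names are
hypothetical stand-ins; the real Key type has more structure. It assumes only
the `bytestring` package, where `toShort` and `fromShort` each copy the bytes.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import qualified Data.ByteString.Short as S

-- Hypothetical stand-in for a key type backed by ShortByteString.
newtype Key = Key S.ShortByteString
  deriving (Eq, Ord, Show)

-- Deserializing copies the bytes out of the ByteString (whose storage is
-- pinned) into an unpinned ShortByteString that the GC is free to move.
deserializeKey :: B.ByteString -> Key
deserializeKey = Key . S.toShort

-- Serializing copies the bytes back into a ByteString, e.g. for I/O.
serializeKey :: Key -> B.ByteString
serializeKey (Key s) = S.fromShort s

keyLength :: Key -> Int
keyLength (Key s) = S.length s

main :: IO ()
main = do
  -- A large buffer, such as a listing read from a remote (made-up data).
  let buf = B8.pack (concat (replicate 1000 "SHA256E-s1024--deadbeef\n"))
      -- B.take shares buf's storage: keeping the slice alive keeps all
      -- of buf alive, and that storage is pinned.
      slice = B.take 23 buf
      -- The copy made here does not reference buf at all.
      key = deserializeKey slice
  print (serializeKey key == slice)
  print (keyLength key)
```

The extra copy in each direction is the overhead mentioned above; in exchange,
a long-lived Key no longer holds a reference to a larger pinned buffer.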

Sponsored-by: Shae Erisson on Patreon
Joey Hess 2021-10-05 20:20:08 -04:00
parent 012b71e471
commit 19e78816f0
GPG key ID: DB12DB0FF05F8F38
15 changed files with 65 additions and 36 deletions

@@ -4,6 +4,20 @@
 date="2021-10-05T23:00:18Z"
 content="""
 I tried converting Ref to use ShortByteString. Memory use did not improve
-and the -hc profile is unchanged. So the pinned memory is not in refs. My
-guess is it must be filenames in the tree then.
+and the -hc profile is unchanged. So the pinned memory is not in refs.
+Also tried converting Key to use ShortByteString. That was a win!
+My 20 borg archive test case is down from 320 mb to 242 mb.
+Looking at Command.SyncpullThirdPartyPopulated,
+it calls listContents, which calls borg's listImportableContents,
+and produces an `ImportableContents (ContentIdentifier, ByteSize)`
+then that gets passed through importKeys to produce
+an `ImportableContents (Either Sha Key)`. Probably
+double memory is used while doing that conversion, unless
+the GC manages to free the first one while it's traversed.
+If borg's listImportableContents included a Key (which it does
+produce already only to throw away!) that might
+eliminate the big spike just before treeItemsToTree.
 """]]