convert Key to ShortByteString

This adds the overhead of a copy when serializing and deserializing keys.
I have not benchmarked much, but runtimes seem barely changed at all by that.

When a lot of keys are in memory, it improves memory use.

And, it prevents keys sometimes getting PINNED in memory and failing to GC,
which is a problem ByteString has sometimes. In particular, git-annex sync
from a borg special remote had that problem and this improved its memory
use by a large amount.

Sponsored-by: Shae Erisson on Patreon
This commit is contained in:
Joey Hess 2021-10-05 20:20:08 -04:00
parent 012b71e471
commit 19e78816f0
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
15 changed files with 65 additions and 36 deletions

View file

@ -17,6 +17,7 @@ import Utility.Metered
import qualified Data.ByteString.Char8 as S8
import qualified Utility.RawFilePath as R
import qualified Data.ByteString.Short as S (toShort, fromShort)
backends :: [Backend]
backends = [backend]
@ -53,12 +54,13 @@ keyValue source _ = do
{- Old WORM keys could contain spaces and carriage returns,
- and can be upgraded to remove them. -}
needsUpgrade :: Key -> Bool
needsUpgrade key = any (`S8.elem` fromKey keyName key) [' ', '\r']
needsUpgrade key =
any (`S8.elem` S.fromShort (fromKey keyName key)) [' ', '\r']
removeProblemChars :: Key -> Backend -> AssociatedFile -> Annex (Maybe Key)
removeProblemChars oldkey newbackend _
| migratable = return $ Just $ alterKey oldkey $ \d -> d
{ keyName = encodeBS $ reSanitizeKeyName $ decodeBS $ keyName d }
{ keyName = S.toShort $ encodeBS $ reSanitizeKeyName $ decodeBS $ S.fromShort $ keyName d }
| otherwise = return Nothing
where
migratable = oldvariety == newvariety