update based on profiling

While L.toStrict copies, profiling showed it accounts for only around 0.3%
of git-annex find runtime. That does not seem worth optimising, since doing
so would probably involve either a major refactoring or a use of
unsafeInterleaveIO.

Also, it seems to me that the latter would need to read chunks and
prepend the leftover part to the next chunk. But a strict ByteString
append is itself a copy, so I'm not convinced that would be faster than
L.toStrict.
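
A minimal sketch of the trade-off described above, not the code in git-annex: `splitViaToStrict` mirrors the approach this commit keeps, while `splitViaChunks` is a hypothetical chunk-walking alternative (the name and structure are assumptions made for illustration). It shows why the alternative still copies: the leftover tail of one chunk has to be appended to the next chunk before splitting.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L

-- Split on NUL, copying each lazy part into a strict ByteString.
splitViaToStrict :: L.ByteString -> [S.ByteString]
splitViaToStrict = map L.toStrict . filter (not . L.null) . L.split 0

-- Hypothetical alternative: walk the strict chunks of the lazy
-- ByteString, carrying the unterminated tail of each chunk over
-- to the next one.
splitViaChunks :: L.ByteString -> [S.ByteString]
splitViaChunks = go S.empty . L.toChunks
  where
    -- End of input: whatever is left over is the final part.
    go leftover [] = [leftover | not (S.null leftover)]
    go leftover (c:cs) =
        -- S.append allocates a new buffer and copies both arguments
        -- whenever neither is empty, so this path is not copy-free.
        case S.split 0 (S.append leftover c) of
            [] -> go S.empty cs
            parts -> filter (not . S.null) (init parts) ++ go (last parts) cs

main :: IO ()
main = do
    -- Three chunks holding "a\0b", "c\0" and "\0d".
    let input = L.fromChunks (map S.pack [[97, 0, 98], [99, 0], [0, 100]])
    -- Both functions are expected to produce the same parts.
    print (splitViaToStrict input == splitViaChunks input)
```

Parts that lie wholly inside one chunk are slices and need no copy, but every part that spans a chunk boundary is copied by the append, so any gain over L.toStrict depends on how often parts cross chunk boundaries.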
Joey Hess 2019-11-27 14:09:11 -04:00
parent c914058bf9
commit d830386ab2
GPG key ID: DB12DB0FF05F8F38
2 changed files with 4 additions and 7 deletions


@@ -101,10 +101,8 @@ pipeNullSplit params repo = do
 	(s, cleanup) <- pipeReadLazy params repo
 	return (filter (not . L.null) $ L.split 0 s, cleanup)

-{- Reads lazily, but converts each part to a strict ByteString for
+{- Reads lazily, but copies each part to a strict ByteString for
  - convenience.
- -
- - FIXME the L.toStrict makes a copy, more expensive than ideal.
  -}
 pipeNullSplit' :: [CommandParam] -> Repo -> IO ([S.ByteString], IO Bool)
 pipeNullSplit' params repo = do


@@ -14,8 +14,9 @@ the `bs` branch has quite a lot of things still needing work, including:
   decodeBS conversions. Or at least most of them. There are likely
   quite a few places where a value is converted back and forth several times.

-  It would be good to instrument them with Debug.Trace and find out which
-  are the hot ones that get called, and focus on those.
+  As a first step, profile and look for the hot spots. For example, keyFile
+  uses fromRawFilePath and that adds around 3% overhead in `git-annex find`.
+  Converting it to a RawFilePath needs a version of `</>` for RawFilePaths.

 * System.FilePath is not available for RawFilePath, and many of the
   conversions are to get a FilePath in order to use that library.
@@ -29,6 +30,4 @@ the `bs` branch has quite a lot of things still needing work, including:
   windows, so a compatability shim will be needed.
   (I can't seem to find any library that provides one.)

-* Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy.
-
 * Use ByteString for parsing git config to speed up startup.