update based on profiling

While L.toStrict copies, profiling showed it was only around 0.3% of
git-annex find runtime. That does not seem worth optimising, since doing
so would probably involve either a major refactoring or the use of
unsafeInterleaveIO.

Also, it seems to me that the latter would need to read chunks, and
prepend the leftover part to the next chunk. But a strict ByteString
append itself is a copy, so I'm not convinced that would be faster than
L.toStrict.
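
To make the comparison above concrete, here is a minimal sketch of both
approaches. The names splitViaToStrict and splitViaChunks are hypothetical
and are not part of git-annex; the chunk-walking version is only one way
the alternative could look, not the approach that was actually measured.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L

-- Split the lazy input on NUL, then copy each part into a strict
-- ByteString with L.toStrict (the copy discussed above).
splitViaToStrict :: L.ByteString -> [S.ByteString]
splitViaToStrict = map L.toStrict . filter (not . L.null) . L.split 0

-- Hypothetical alternative: walk the strict chunks of the lazy input and
-- carry the unterminated tail of each chunk over to the next one.
-- S.append has to copy whenever both operands are non-empty, which is the
-- cost the message above is worried about.
splitViaChunks :: L.ByteString -> [S.ByteString]
splitViaChunks = go S.empty . L.toChunks
  where
    -- End of input: whatever is left over is the final part, if any.
    go leftover [] = [leftover | not (S.null leftover)]
    go leftover (c:cs) =
      case S.split 0 (S.append leftover c) of
        -- S.split returns [] only for empty input.
        [] -> go S.empty cs
        -- The last piece may continue into the next chunk, so hold it back.
        parts -> filter (not . S.null) (init parts) ++ go (last parts) cs
```

Both functions return the same parts for the same input. The second one
avoids L.toStrict but still copies whenever a part spans a chunk boundary,
so any gain would have to be shown by profiling.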
Joey Hess 2019-11-27 14:09:11 -04:00
parent c914058bf9
commit d830386ab2
GPG key ID: DB12DB0FF05F8F38
2 changed files with 4 additions and 7 deletions

@@ -101,10 +101,8 @@ pipeNullSplit params repo = do
 	(s, cleanup) <- pipeReadLazy params repo
 	return (filter (not . L.null) $ L.split 0 s, cleanup)
 
-{- Reads lazily, but converts each part to a strict ByteString for
+{- Reads lazily, but copies each part to a strict ByteString for
   - convenience.
- -
- - FIXME the L.toStrict makes a copy, more expensive than ideal.
   -}
 pipeNullSplit' :: [CommandParam] -> Repo -> IO ([S.ByteString], IO Bool)
 pipeNullSplit' params repo = do

@@ -14,8 +14,9 @@ the `bs` branch has quite a lot of things still needing work, including:
   decodeBS conversions. Or at least most of them. There are likely
   quite a few places where a value is converted back and forth several times.
 
-  It would be good to instrument them with Debug.Trace and find out which
-  are the hot ones that get called, and focus on those.
+  As a first step, profile and look for the hot spots. For example, keyFile
+  uses fromRawFilePath and that adds around 3% overhead in `git-annex find`.
+  Converting it to a RawFilePath needs a version of `</>` for RawFilePaths.
 
 * System.FilePath is not available for RawFilePath, and many of the
   conversions are to get a FilePath in order to use that library.
@@ -29,6 +30,4 @@ the `bs` branch has quite a lot of things still needing work, including:
   windows, so a compatibility shim will be needed.
   (I can't seem to find any library that provides one.)
 
-* Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy.
-
 * Use ByteString for parsing git config to speed up startup.
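
The todo text above notes that converting keyFile to a RawFilePath needs a
version of `</>` for RawFilePaths. A minimal sketch of what such an
operator could look like on POSIX follows. The name combineRaw is
hypothetical, RawFilePath is redefined locally as a ByteString synonym
(the same shape as the one in the unix package) so the example only
depends on bytestring, and Windows drive handling is deliberately left out.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8

-- Local stand-in for System.Posix.ByteString.RawFilePath (an assumption
-- made to keep the sketch self-contained).
type RawFilePath = S.ByteString

-- Hypothetical POSIX-only analogue of System.FilePath's (</>).
combineRaw :: RawFilePath -> RawFilePath -> RawFilePath
combineRaw a b
  | S.null a = b
  | S.null b = a
  -- An absolute second path replaces the first one.
  | S8.head b == '/' = b
  -- Avoid doubling the separator when the first path already ends in one.
  | S8.last a == '/' = S.append a b
  | otherwise = S.concat [a, S8.singleton '/', b]
```

For example, combineRaw applied to "foo" and "bar" yields "foo/bar", and
no String conversion is involved at any point. Whether this removes the
3% overhead mentioned above would still need to be checked by profiling.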