fix lack of laziness streaming large diffs
A commit last year that made a partial function use Maybe unfortunately caused the whole input to need to be consumed, breaking streaming. So, revert it. This commit was sponsored by Nick Daly on Patreon.
This commit is contained in:
parent
a130bc4f9b
commit
dbaea98836
2 changed files with 32 additions and 14 deletions
|
@ -89,7 +89,7 @@ commitDiff ref = getdiff (Param "show")
|
|||
getdiff :: CommandParam -> [CommandParam] -> Repo -> IO ([DiffTreeItem], IO Bool)
|
||||
getdiff command params repo = do
|
||||
(diff, cleanup) <- pipeNullSplit ps repo
|
||||
return (fromMaybe (error $ "git " ++ show (toCommand ps) ++ " parse failed") (parseDiffRaw diff), cleanup)
|
||||
return (parseDiffRaw diff, cleanup)
|
||||
where
|
||||
ps =
|
||||
command :
|
||||
|
@ -100,24 +100,23 @@ getdiff command params repo = do
|
|||
params
|
||||
|
||||
{- Parses --raw output used by diff-tree and git-log. -}
|
||||
parseDiffRaw :: [String] -> Maybe [DiffTreeItem]
|
||||
parseDiffRaw :: [String] -> [DiffTreeItem]
|
||||
parseDiffRaw l = go l []
|
||||
where
|
||||
go [] c = Just c
|
||||
go (info:f:rest) c = case mk info f of
|
||||
Nothing -> Nothing
|
||||
Just i -> go rest (i:c)
|
||||
go (_:[]) _ = Nothing
|
||||
go [] c = c
|
||||
go (info:f:rest) c = go rest (mk info f : c)
|
||||
go (s:[]) _ = error $ "diff-tree parse error near \"" ++ s ++ "\""
|
||||
|
||||
mk info f = DiffTreeItem
|
||||
<$> readmode srcm
|
||||
<*> readmode dstm
|
||||
<*> extractSha ssha
|
||||
<*> extractSha dsha
|
||||
<*> pure s
|
||||
<*> pure (asTopFilePath $ fromInternalGitPath $ Git.Filename.decode f)
|
||||
{ srcmode = readmode srcm
|
||||
, dstmode = readmode dstm
|
||||
, srcsha = fromMaybe (error "bad srcsha") $ extractSha ssha
|
||||
, dstsha = fromMaybe (error "bad dstsha") $ extractSha dsha
|
||||
, status = s
|
||||
, file = asTopFilePath $ fromInternalGitPath $ Git.Filename.decode f
|
||||
}
|
||||
where
|
||||
readmode = fst <$$> headMaybe . readOct
|
||||
readmode = fst . Prelude.head . readOct
|
||||
|
||||
-- info = :<srcmode> SP <dstmode> SP <srcsha> SP <dstsha> SP <status>
|
||||
-- All fields are fixed, so we can pull them out of
|
||||
|
|
|
@ -0,0 +1,19 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2017-01-31T20:24:04Z"
|
||||
content="""
|
||||
The heap profile has multiple spikes (so not an accumulating memory leak).
|
||||
The diff parsing code is indeed what's using so much memory. Looks like
|
||||
data is failing to stream through that code and instead the whole
|
||||
diff output gets buffered.
|
||||
|
||||
Aha.. Git.DiffTree.parseDiffRaw used to return a list, but changed
|
||||
in [[!commit 8d124beba8]]
|
||||
to a Maybe list in order to avoid being a partial function. But
|
||||
that change destroyed laziness, since the whole input has to be parsed
|
||||
in order to determine if Nothing should be returned.
|
||||
|
||||
However, fixing that only eliminated part of the spike. There's something
|
||||
else keeping data from streaming.
|
||||
"""]]
|
Loading…
Add table
Add a link
Reference in a new issue