more efficient union merges

Tries to avoid generating a new object when the merged content has the same
lines that were in the old object.

I've noticed some merge commits that only move lines around, like this:

- 1323478057.181191s 1 be23c3ac-0ee5-11e0-b185-3b0f9b5b00c5
  1323204972.062151s 1 87e06c7a-7388-11e0-ba07-03cdf300bd87
++1323478057.181191s 1 be23c3ac-0ee5-11e0-b185-3b0f9b5b00c5

Unsure if this will really save anything in practice, since it only looks
at one of the two old objects, and maybe I didn't pick the best one.
Joey Hess 2011-12-11 23:02:25 -04:00
parent d3d9c8a9a6
commit b9ac585454

@@ -104,14 +104,17 @@ mergeFile :: String -> FilePath -> CatFileHandle -> Repo -> IO (Maybe String)
 mergeFile info file h repo = case filter (/= nullsha) [Ref asha, Ref bsha] of
 	[] -> return Nothing
 	(sha:[]) -> return $ Just $ update_index_line sha file
-	shas -> do
-		content <- L.concat <$> mapM (catObject h) shas
-		sha <- hashObject (unionmerge content) repo
-		return $ Just $ update_index_line sha file
+	(sha:shas) -> do
+		origcontent <- L.lines <$> catObject h sha
+		content <- map L.lines <$> mapM (catObject h) shas
+		let newcontent = nub $ concat $ origcontent:content
+		newsha <- if (newcontent == origcontent)
+			then return sha
+			else hashObject (L.unlines $ newcontent) repo
+		return $ Just $ update_index_line newsha file
 	where
 		[_colonamode, _bmode, asha, bsha, _status] = words info
 		nullsha = Ref $ replicate shaSize '0'
-		unionmerge = L.unlines . nub . L.lines
 
 {- Injects some content into git, returning its Sha. -}
 hashObject :: L.ByteString -> Repo -> IO Sha