bup: Avoid memory leak when transferring encrypted data.

This was a most surprising leak. It occurred in the process that is forked
off to feed data to gpg. That process was passed a lazy ByteString of
input, and ghc seemed to not GC the ByteString as it was lazily read
and consumed, so memory slowly leaked as the file was read and passed
through gpg to bup.

To fix it, I simply changed the feeder to take an IO action that returns
the lazy ByteString, and fed the result directly to hPut.
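
A minimal sketch of the shape of the fix, not the actual git-annex code.
Assumptions: the helper name pipeThrough is hypothetical, any external
command (for example cat) stands in for gpg, and the feeder is a forkIO
thread rather than a forked process. The point it illustrates is that the
feeder receives an IO action and produces the lazy ByteString itself, so
the caller never holds a reference to the head of the input.

```haskell
import qualified Data.ByteString.Lazy as L
import Control.Concurrent (forkIO)
import System.IO (Handle, hClose)
import System.Process
  (CreateProcess(..), StdStream(CreatePipe), createProcess, proc, waitForProcess)

-- Hypothetical helper: run an external command, feed it input from a
-- separate thread, and let the caller read the command's stdout.
--
-- The input is an IO action, not an already-built lazy ByteString. It is
-- run inside the feeder and its result goes straight to L.hPut, so no
-- other code keeps a pointer to the start of the ByteString.
pipeThrough :: FilePath -> [String] -> IO L.ByteString -> (Handle -> IO a) -> IO a
pipeThrough cmd args getInput reader = do
  (Just toh, Just fromh, _, pid) <- createProcess (proc cmd args)
    { std_in = CreatePipe, std_out = CreatePipe }
  -- Feeder: produce the input here and write it chunk by chunk.
  _ <- forkIO $ do
    L.hPut toh =<< getInput
    hClose toh
  -- The reader must consume the output fully before returning,
  -- otherwise the command can block on a full pipe.
  result <- reader fromh
  _ <- waitForProcess pid
  return result
```

With this shape, a call site passes the action itself, for example
pipeThrough "cat" [] (L.readFile src) reader, instead of first binding
content <- L.readFile src and passing content.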

AFAICS, this should change nothing WRT buffering. But somehow it makes
ghc's GC do the right thing. Probably I triggered some weakness in ghc's
GC (version 6.12.1).

(Note that S3 still has this leak, and others too. Fixing it will involve
another dance with the type system.)

Update: One theory I have is that this has something to do with
the forking of the feeder process. Perhaps, when the ByteString
is produced before the fork, ghc decides it needs to hold a pointer
to the start of it, for some reason -- maybe it doesn't realize that
it is only used in the forked process.
Joey Hess 2011-04-19 15:26:50 -04:00
parent b1274b6378
commit 5985acdfad
5 changed files with 18 additions and 21 deletions

@@ -92,8 +92,7 @@ storeEncrypted d (cipher, enck) k = do
     liftIO $ catchBool $ storeHelper dest $ encrypt src dest
     where
         encrypt src dest = do
-            content <- L.readFile src
-            withEncryptedContent cipher content $ L.writeFile dest
+            withEncryptedContent cipher (L.readFile src) $ L.writeFile dest
             return True
 storeHelper :: FilePath -> IO Bool -> IO Bool
@@ -113,8 +112,7 @@ retrieve d k f = liftIO $ copyFile (dirKey d k) f
 retrieveEncrypted :: FilePath -> (Cipher, Key) -> FilePath -> Annex Bool
 retrieveEncrypted d (cipher, enck) f =
     liftIO $ catchBool $ do
-        content <- L.readFile (dirKey d enck)
-        withDecryptedContent cipher content $ L.writeFile f
+        withDecryptedContent cipher (L.readFile (dirKey d enck)) $ L.writeFile f
         return True
 remove :: FilePath -> Key -> Annex Bool