Added a comment

This commit is contained in:
http://joeyh.name/ 2014-07-04 19:26:00 +00:00 committed by admin
parent 4b43f4e481
commit c85d20d02e

View file

@ -0,0 +1,27 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.55"
subject="comment 3"
date="2014-07-04T19:26:00Z"
content="""
Looking at the code, it's pretty clear why this is using a lot of memory:
<pre>
fs <- getJournalFiles jl
liftIO $ do
h <- hashObjectStart g
Git.UpdateIndex.streamUpdateIndex g
[genstream dir h fs]
hashObjectStop h
return $ liftIO $ mapM_ (removeFile . (dir </>)) fs
</pre>
So the big list in `fs` has to be retained in memory after the files are streamed to update-index, in order for them to be deleted!
Fixing is a bit tricky.. New journal files can appear while this is going on, so it can't just run getJournalFiles a second time to get the files to clean.
Maybe delete the file after it's been sent to git-update-index? But git-update-index is going to want to read the file, and we don't really know when it will choose to do so. It could wait a while after we've sent the filename to it, potentially.
Also, per [[!commit 750c4ac6c282d14d19f79e0711f858367da145e4]], we cannot delete the journal files until *after* the commit, or another git-annex process would see inconsistent data!
I actually think I am going to need to use a temp file to hold the list of files..
"""]]