From d0036b91004d023df9d44cfb2e931792724799be Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Wed, 27 Apr 2016 17:15:02 -0400
Subject: [PATCH] todo

---
 ...ge_chunks_without_buffering_in_memory.mdwn | 80 ++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)
 create mode 100644 doc/todo/upload_large_chunks_without_buffering_in_memory.mdwn

diff --git a/doc/todo/upload_large_chunks_without_buffering_in_memory.mdwn b/doc/todo/upload_large_chunks_without_buffering_in_memory.mdwn
new file mode 100644
index 0000000000..2c5c96c1f6
--- /dev/null
+++ b/doc/todo/upload_large_chunks_without_buffering_in_memory.mdwn
@@ -0,0 +1,80 @@
+Currently, when chunk=100MiB is used with a special remote, git-annex
+will use 100 MiB of RAM when uploading a large file to it: the current
+chunk is buffered entirely in memory.
+
+See this comment in Remote.Helper.Chunked:
+
+    - More optimal versions of this can be written, that rely
+    - on L.toChunks to split the lazy bytestring into chunks (typically
+    - smaller than the ChunkSize), and eg, write those chunks to a Handle.
+    - But this is the best that can be done with the storer interface that
+    - writes a whole L.ByteString at a time.
+
+The buffering is probably fine for smaller chunks, in the < 10 MiB or
+so range. Reading such a chunk from gpg and converting it into an HTTP
+request will generally need to buffer it all in memory anyway. But for,
+eg, external special remotes, the content is going to be written to a
+file anyway, so there's no point in buffering it in memory first.
+
+So, something like this is needed:
+
+    prefersFileContent :: Storer -> Bool
+
+Then, when storeChunks is given such a Storer, it can avoid buffering,
+and instead write the chunk data to a temp file to pass to the Storer.
+Sketches of both ideas follow.
+
+--[[Joey]]
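+
+As a first illustration, here is a minimal standalone sketch of the
+streaming approach the Remote.Helper.Chunked comment describes. The
+names are hypothetical, not git-annex's real interface: L.toChunks
+yields the lazy ByteString's strict chunks, which are written to a
+Handle one at a time, so only one small chunk is in memory at once.
+
+    import qualified Data.ByteString as S
+    import qualified Data.ByteString.Lazy as L
+    import System.IO
+
+    -- Hypothetical helper, not the actual Storer code: write a lazy
+    -- ByteString to a Handle one strict chunk at a time, keeping
+    -- memory use bounded by the chunk size, not the total size.
+    streamToHandle :: Handle -> L.ByteString -> IO ()
+    streamToHandle h b = mapM_ (S.hPut h) (L.toChunks b)
+
+(Data.ByteString.Lazy.hPut is implemented this way, so a storer
+that can write to a Handle need not hold a whole chunk in memory.)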
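+
+And a sketch of how the proposed prefersFileContent flag could be
+wired in. All names below are hypothetical and only loosely modeled
+on git-annex's types; withSystemTempFile comes from the temporary
+package.
+
+    import qualified Data.ByteString.Lazy as L
+    import System.IO (hClose)
+    import System.IO.Temp (withSystemTempFile)
+
+    -- Simplified stand-ins for git-annex's real types.
+    data ContentSource
+        = FileContent FilePath
+        | ByteContent L.ByteString
+
+    data Storer = Storer
+        { runStorer :: ContentSource -> IO Bool
+        , wantsFile :: Bool
+        }
+
+    -- The interface proposed above.
+    prefersFileContent :: Storer -> Bool
+    prefersFileContent = wantsFile
+
+    -- Store one chunk, writing it to a temp file instead of
+    -- buffering it in memory when the Storer wants a file anyway.
+    storeChunk :: Storer -> L.ByteString -> IO Bool
+    storeChunk storer b
+        | prefersFileContent storer =
+            withSystemTempFile "chunk" $ \f h -> do
+                L.hPut h b -- streams; no whole-chunk buffering
+                hClose h
+                runStorer storer (FileContent f)
+        | otherwise = runStorer storer (ByteContent b)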