aws library now supports multipart; initial design
This commit is contained in:
parent
04e920d715
commit
dcaf89da2f
1 changed files with 31 additions and 0 deletions
|
@ -0,0 +1,31 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 9"""
|
||||
date="2014-10-28T16:42:21Z"
|
||||
content="""
|
||||
The aws library now supports multipart uploads, using its
|
||||
S3.Commands.Multipart module.
|
||||
|
||||
I don't think that multipart and chunking fit together: Typically the
|
||||
chunks are too small to need multipart for individual chunks. And the
|
||||
chunks shouldn't be combined together into a complete object at the end (at
|
||||
least not if we care about using chunking to obscure object size).
|
||||
Individual chunks sizes can vary when encryption is used, so combining them
|
||||
all into one file wouldn't work.
|
||||
|
||||
Also, multipart uploads require at least 3 http calls, so there's no point
|
||||
using it for small objects, as it would only add overhead.
|
||||
|
||||
So, multipart uploads should be used when not chunking, when the object to
|
||||
upload exceeds some size, which should probably defaut to something in the
|
||||
range of 100 mb to 1 gb.
|
||||
|
||||
It might be possible to support resuming of interrupted multipart uploads.
|
||||
It seems that git-annex would need to store, locally, the UploadId,
|
||||
as well as the list of uploaded parts, including the Etag for the upload
|
||||
(which is needed when completing the multipart upload too).
|
||||
|
||||
Also it should probably set Expires when initiating the multipart upload,
|
||||
so that incomplete ones get cleaned up after some period of time.
|
||||
Otherwise, users would probably be billed for them.
|
||||
"""]]
|
Loading…
Reference in a new issue