aws library now supports multipart; initial design
This commit is contained in:
parent 04e920d715
commit dcaf89da2f
1 changed file with 31 additions and 0 deletions
@@ -0,0 +1,31 @@
[[!comment format=mdwn
 username="joey"
 subject="""comment 9"""
 date="2014-10-28T16:42:21Z"
 content="""
The aws library now supports multipart uploads, using its
S3.Commands.Multipart module.

I don't think that multipart and chunking fit together: Typically the
chunks are too small to need multipart for individual chunks. And the
chunks shouldn't be combined together into a complete object at the end (at
least not if we care about using chunking to obscure object size).
Individual chunk sizes can vary when encryption is used, so combining them
all into one file wouldn't work.

Also, multipart uploads require at least 3 http calls, so there's no point
using it for small objects, as it would only add overhead.
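
For reference, the minimum sequence is just the standard S3 multipart
protocol, nothing specific to the aws library:

    POST /<key>?uploads                      initiate; returns an UploadId
    PUT  /<key>?partNumber=N&uploadId=<id>   one call per part; returns an ETag
    POST /<key>?uploadId=<id>                complete; sends the (part number, ETag) list
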
So, multipart uploads should be used when not chunking, and when the object to
upload exceeds some size, which should probably default to something in the
range of 100 mb to 1 gb.
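
A minimal sketch of that decision; shouldUseMultipart, multipartThreshold,
and the 1 gb default are made-up names and values, not settled design:

    import Data.Maybe (isNothing)

    type ChunkSize = Integer

    -- Use multipart only when chunking is disabled and the object is
    -- larger than the threshold; otherwise a single PUT is cheaper.
    shouldUseMultipart :: Maybe ChunkSize -> Integer -> Bool
    shouldUseMultipart chunkconfig objectsize =
        isNothing chunkconfig && objectsize >= multipartThreshold

    -- Default threshold, somewhere in the 100 mb to 1 gb range.
    multipartThreshold :: Integer
    multipartThreshold = 1000000000
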
It might be possible to support resuming of interrupted multipart uploads.
It seems that git-annex would need to store, locally, the UploadId,
as well as the list of uploaded parts, including the ETag returned for
each part (which is also needed when completing the multipart upload).
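
Roughly the per-key state that would have to be stored locally (just a
sketch; the type and field names are made up):

    -- State needed to resume an interrupted multipart upload.
    data MultipartState = MultipartState
        { multipartUploadId :: String
        -- ^ UploadId returned when the upload was initiated
        , multipartParts :: [(Integer, String)]
        -- ^ (part number, ETag) for each part uploaded so far
        }
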
Also it should probably set Expires when initiating the multipart upload,
so that incomplete ones get cleaned up after some period of time.
Otherwise, users would probably be billed for them.
"""]]