Most [[special_remotes]] have support for breaking large files up into
chunks that are stored on the remote.

This can be useful to work around limitations on the size of files
on the remote.

Chunking also allows for resuming interrupted downloads and uploads.

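
As a rough illustration of why chunking makes resuming possible, here is a
minimal sketch. It assumes a transfer can only restart at a chunk boundary,
so complete chunks are kept and the partially transferred one is fetched
again. `resume_offset` is a hypothetical helper, not part of git-annex.

```python
MiB = 1024 * 1024


def resume_offset(bytes_transferred: int, chunk_size: int) -> int:
    """Return the byte offset at which an interrupted transfer restarts.

    Assumption: only whole chunks survive an interruption, so the
    offset is rounded down to the nearest chunk boundary.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if bytes_transferred < 0:
        raise ValueError("bytes_transferred must not be negative")
    return (bytes_transferred // chunk_size) * chunk_size


# Interrupted 3.5 MiB into a file stored in 1 MiB chunks:
# the first three chunks are kept, the fourth is transferred again.
interrupted_at = 3 * MiB + 512 * 1024
print(resume_offset(interrupted_at, MiB) // MiB)
```
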
Note that git-annex has to buffer chunks in memory before they are sent to
a remote. So, using a large chunk size will make it use more memory.

To enable chunking, pass a `chunk=nnMiB` parameter to `git annex
initremote`, specifying the chunk size.

Good chunk sizes will depend on the remote, but a good starting place
is probably `1MiB`. Very large chunks are problematic, both because
git-annex needs to buffer one chunk in memory when uploading, and because
a larger chunk will make resuming interrupted transfers less efficient.
On the other hand, when a file is split into a great many chunks,
there can be increased overhead of making many requests to the remote.

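
To get a feel for this tradeoff, the following sketch estimates how many
chunks a file is split into and how much memory one buffered chunk needs.
It assumes one request per chunk and a buffer of exactly one chunk; the real
overhead depends on the remote. `chunk_plan` is a hypothetical helper, not
part of git-annex.

```python
MiB = 1024 * 1024


def chunk_plan(file_size: int, chunk_size: int) -> tuple[int, int]:
    """Return (number of chunks, bytes buffered per chunk) for a file.

    Assumption: every chunk has chunk_size bytes except possibly the
    last one, and one chunk at a time is held in memory.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    chunks = -(-file_size // chunk_size)  # ceiling division
    buffer_bytes = min(file_size, chunk_size)
    return chunks, buffer_bytes


# Compare a small and a large chunk size for a 1 GiB file.
for size in (1 * MiB, 100 * MiB):
    chunks, buffer_bytes = chunk_plan(1024 * MiB, size)
    print(f"chunk={size // MiB}MiB: {chunks} chunks, "
          f"{buffer_bytes // MiB} MiB buffered")
```

Smaller chunks keep the buffer small and limit how much is repeated after an
interruption, at the cost of more requests to the remote.
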
To disable chunking of a remote that was using chunking,
pass `chunk=0` to `git annex enableremote`. Any content already stored on
the remote using chunks will continue to be accessed via chunks; this
only prevents using chunks when storing new content.

To change the chunk size, pass a `chunk=nnMiB` parameter to
`git annex enableremote`. This only affects the chunk size used when
storing new content.

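
The commands described above can be assembled in a script. The sketch below
only builds the argument lists; the remote name `mydir`, the path
`/mnt/backup` and both helper functions are hypothetical, and the
`type=directory` parameters are just one example of a special remote's
configuration.

```python
def initremote_args(name: str, remote_params: list[str], chunk: str) -> list[str]:
    """Build the argv for creating a special remote with chunking enabled."""
    return ["git", "annex", "initremote", name, *remote_params, f"chunk={chunk}"]


def enableremote_args(name: str, chunk: str) -> list[str]:
    """Build the argv for changing (or, with "0", disabling) chunking."""
    return ["git", "annex", "enableremote", name, f"chunk={chunk}"]


# Hypothetical directory special remote using 1 MiB chunks.
create = initremote_args(
    "mydir",
    ["type=directory", "directory=/mnt/backup", "encryption=none"],
    "1MiB",
)
resize = enableremote_args("mydir", "5MiB")   # new content uses 5 MiB chunks
disable = enableremote_args("mydir", "0")     # new content is not chunked

for argv in (create, resize, disable):
    print(" ".join(argv))
    # To actually run a command inside a git-annex repository:
    # subprocess.run(argv, check=True)
```
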
# old-style chunking

Note that older versions of git-annex used a different chunk method, which
was configured by passing `chunksize=nnMiB` when initializing a remote.

The old-style chunking had a number of problems, including being less
efficient, and not allowing resumes of encrypted uploads.

It's not possible to change a remote using that old chunking method to the
new one, but git-annex continues to support old-style chunking so that
such remotes remain usable.

See also: [[design document|design/assistant/chunks]]