git-annex

Author	SHA1	Message	Date
Joey Hess	f5af470875	add ContentSource type, for remotes that act on files rather than ByteStrings Note that currently nothing cleans up a ContentSource's file, when eg, retrieving chunks.	2014-07-29 15:16:12 -04:00
Joey Hess	216fdbd6bd	fix non-checked hasKeyChunks	2014-07-29 15:07:32 -04:00
Joey Hess	58f727afdd	resume interrupted chunked uploads Leverage the new chunked remotes to automatically resume uploads. Sort of like rsync, although of course not as efficient since this needs to start at a chunk boundary. But, unlike rsync, this method will work for S3, WebDAV, external special remotes, etc, etc. Only directory special remotes so far, but many more soon! This implementation will also allow starting an upload from one repository, interrupting it, and then resuming the upload to the same remote from an entirely different repository. Note that I added a comment that storeKey should atomically move the content into place once it's all received. This was already an undocumented requirement -- it's necessary for hasKey to work reliably. This resume code just uses hasKey to find the first chunk that's missing. Note that if there are two uploads of the same key to the same chunked remote, one might resume at the point the other had gotten to, but both will then redundantly upload. As before. In the non-resume case, this adds one hasKey call per storeKey, and only if the remote is configured to use chunks. Future work: Try to eliminate that hasKey. Notice that eg, `git annex copy --to` checks if the key is present before sending it, so is already running hasKey.. which could perhaps be cached and reused. However, this additional overhead is not very large compared with transferring an entire large file, and the ability to resume is certainly worth it. There is an optimisation in place for small files, that avoids trying to resume if the whole file fits within one chunk. This commit was sponsored by Georg Bauer.	2014-07-28 14:35:52 -04:00
Joey Hess	80cc554c82	add ChunkMethod type and make Logs.Chunk use it, rather than assuming fixed size chunks (so eg, rolling hash chunks can be supported later) If a newer git-annex starts logging something else in the chunk log, it won't be used by this version, but it will be preserved when updating the log.	2014-07-28 13:19:08 -04:00
Joey Hess	9d4a766cd7	resume interrupted chunked downloads Leverage the new chunked remotes to automatically resume downloads. Sort of like rsync, although of course not as efficient since this needs to start at a chunk boundary. But, unlike rsync, this method will work for S3, WebDAV, external special remotes, etc, etc. Only directory special remotes so far, but many more soon! This implementation will also properly handle starting a download from one remote, interrupting, and resuming from another one, and so on. (Resuming interrupted chunked uploads is similarly doable, although slightly more expensive.) This commit was sponsored by Thomas Djärv.	2014-07-27 18:56:32 -04:00
Joey Hess	2996f0eb05	use existing chunks even when chunk=0 When chunk=0, always try the unchunked key first. This avoids the overhead of needing to read the git-annex branch to find the chunkcount. However, if the unchunked key is not present, go on and try the chunks. Also, when removing a chunked key, update the chunkcounts even when chunk=0.	2014-07-27 02:13:51 -04:00
Joey Hess	7afb057d60	reorg	2014-07-27 01:24:34 -04:00
Joey Hess	c3af4897c0	faster storeChunks No need to process each L.ByteString chunk, instead ask it to split. Doesn't seem to have really sped things up much, but it also made the code simpler. Note that this does (and already did) buffer in memory. It seems that only the directory special remote could take advantage of streaming chunks to files w/o buffering, so probably won't add an interface to allow for that.	2014-07-27 01:18:38 -04:00
Joey Hess	9a8c4bb21f	improve exception handling Push it down from needing to be done in every Storer, to being checked once inside ChunkedEncryptable. Also, catch exceptions from PrepareStorer and PrepareRetriever, just in case..	2014-07-26 23:26:10 -04:00
Joey Hess	867fd116a7	better exception display	2014-07-26 23:01:44 -04:00
Joey Hess	93be3296fc	fix another fallback bug	2014-07-26 22:47:52 -04:00
Joey Hess	86e8532c0a	allM has slightly better memory use	2014-07-26 22:34:40 -04:00
Joey Hess	67975bf50d	fix fallback to other chunk size when first does not have it	2014-07-26 22:25:50 -04:00
Joey Hess	d4d68f57e5	finish up basic chunked remote groundwork Chunk retrieval and reassembly, removal, and checking if all necessary chunks are present. This commit was sponsored by Damien Raude-Morvan.	2014-07-26 20:11:41 -04:00
Joey Hess	cf83697c33	reorg	2014-07-26 12:04:35 -04:00
Joey Hess	ab4cce4114	core implementation of new style chunking Not yet used by any special remotes, but should not be too hard to add it to most of them. storeChunks is the hairy bit! It's loosely based on Remote.Directory.storeLegacyChunked. The object is read in using a lazy bytestring, which is streamed through, creating chunks as needed, without ever buffering more than 1 chunk in memory. Getting the progress meter update to work right was also fun, since progress meter values are absolute. Finessed by constructing an offset meter. This commit was sponsored by Richard Collins.	2014-07-25 16:20:32 -04:00
Joey Hess	ceea04e77f	move meteredWriteFileChunks out of legacy	2014-07-24 16:42:35 -04:00
Joey Hess	e2c44bf656	implement chunk logs Slightly tricky as they are not normal UUIDBased logs, but are instead maps from (uuid, chunksize) to chunkcount. This commit was sponsored by Frank Thomas.	2014-07-24 16:23:36 -04:00
Joey Hess	bbdb2c04d5	improve chunk data types	2014-07-24 15:08:07 -04:00
Joey Hess	9e2d49d441	prepare for new style chunking Moved old legacy chunking code, and cleaned up the directory and webdav remotes use of it, so when no chunking is configured, that code is not used. The config for new style chunking will be chunk=1M instead of chunksize=1M. There should be no behavior changes from this commit. This commit was sponsored by Andreas Laas.	2014-07-24 14:49:22 -04:00
Joey Hess	5756636486	directory, webdav: Fix bug introduced in version 4.20131002 that caused the chunkcount file to not be written. Work around repositories without such a file, so files can still be retrieved from them.	2013-10-26 15:03:12 -04:00
Joey Hess	06ea92282f	fix inverted logic when determining whether to write a chunkcount file late-night hlint bit me on this one.. Reviewed `c1990702e9` and the rest of it seems ok	2013-10-26 14:08:29 -04:00
Joey Hess	c1990702e9	hlint	2013-09-25 23:19:01 -04:00
Joey Hess	cf07a2c412	webapp: Progress bar fixes for many types of special remotes. There was confusion in different parts of the progress bar code about whether an update contained the total number of bytes transferred, or the number of bytes transferred since the last update. One way this bug showed up was progress bars that seemed to stick at zero for a long time. In order to fix it comprehensively, I add a new BytesProcessed data type, that is explicitly a total quantity of bytes, not a delta. Note that this doesn't necessarily fix every problem with progress bars. Particularly, buffering can now cause progress bars to seem to run ahead of transfers, reaching 100% when data is still being uploaded.	2013-03-28 17:04:37 -04:00
Joey Hess	24c6eae1b5	show errors	2013-01-02 13:50:16 -04:00
Joey Hess	020a25abe1	avoid unnecessary Maybe	2012-11-30 00:55:59 -04:00
Joey Hess	5f977cc725	directory special remote: Made more efficient and robust. Files are now written to a tmp directory in the remote, and once all chunks are written, etc, it's moved into the final place atomically. For now, checkpresent still checks every single chunk of a file, because the old method could leave partially transferred files with some chunks present and others not.	2012-11-19 13:18:23 -04:00
Joey Hess	7df1e71fe3	S3: Added progress display for uploading and downloading.	2012-11-18 22:49:07 -04:00
Joey Hess	c8751be151	simplify	2012-11-18 18:27:53 -04:00
Joey Hess	1fe76b57d6	webdav now checks presence of and receives chunked content Note that receiving encrypted chunked content currently involves buffering. (So does doing so with the directory special remote.)	2012-11-16 23:16:18 -04:00
Joey Hess	92d5d81c2c	generic chunked content helper However, directory still uses its optimized chunked file writer, as it uses less memory than the generic one in the helper.	2012-11-16 17:58:08 -04:00
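
The resume behaviour described in 58f727afdd (probe with hasKey for the first missing chunk, then continue from that chunk boundary) can be sketched roughly as follows. This is an illustrative Python sketch, not git-annex's Haskell implementation: `split_chunks`, `chunk_name`, `store_resumable`, the chunk naming scheme, and the dict-backed remote are all assumptions made for the example, and the callbacks `has_key` and `store_key` only stand in for a remote's hasKey and storeKey.

```python
from typing import Callable, Dict, List


def split_chunks(content: bytes, chunk_size: int) -> List[bytes]:
    """Split content into fixed-size chunks; the last one may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]


def chunk_name(key: str, chunk_size: int, index: int) -> str:
    # Hypothetical naming scheme, chosen only for this sketch.
    return f"{key}-S{chunk_size}-C{index}"


def store_resumable(
    key: str,
    content: bytes,
    chunk_size: int,
    has_key: Callable[[str], bool],
    store_key: Callable[[str, bytes], None],
) -> int:
    """Upload the chunks of content, skipping the leading run of chunks the
    remote already has. Returns the number of chunks skipped."""
    chunks = split_chunks(content, chunk_size)
    start = 0
    # Small-file optimisation: do not probe when everything fits in one chunk.
    if len(chunks) > 1:
        # Find the first missing chunk. In the non-resume case this costs
        # exactly one has_key call, because chunk 1 is absent.
        while start < len(chunks) and has_key(chunk_name(key, chunk_size, start + 1)):
            start += 1
    for index in range(start, len(chunks)):
        store_key(chunk_name(key, chunk_size, index + 1), chunks[index])
    return start


# Usage: simulate an upload that was interrupted after two of three chunks.
remote: Dict[str, bytes] = {}
payload = b"abcdefghij"
for i, chunk in enumerate(split_chunks(payload, 4)[:2], start=1):
    remote[chunk_name("KEY", 4, i)] = chunk
skipped = store_resumable("KEY", payload, 4, remote.__contains__, remote.__setitem__)
```

The sketch relies on the same assumption the commit message states: a chunk only becomes visible to `has_key` once it has been stored completely, so a partially written chunk is never mistaken for a finished one.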

81 commits