incremental checksum for local remotes

This benchmarks only slightly faster than the old git-annex. Eg, for a 1 GB file, 14.56s vs 15.57s. (On a ram disk; there would certainly be more of an effect if the file was written to disk and didn't stay in cache.) Commenting out the updateIncremental calls makes the same run take 6.31s. It may be that overhead in the implementation, other than the actual checksumming, is slowing it down. Eg, MVar access. (I also tried using 10x larger chunks, which did not change the speed.)
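
As an illustration of the pattern being benchmarked, here is a minimal, self-contained sketch of checksumming a file incrementally while it is copied in chunks, with the running state held in an MVar. It is not git-annex's code: Adler-32 stands in for the real hash backends so that only base and bytestring are needed, and the names `Adler`, `initAdler`, `updateAdler`, `finalizeAdler`, `copyWithChecksum` and the file names in `main` are all hypothetical.

```haskell
import Control.Concurrent.MVar (modifyMVar_, newMVar, readMVar)
import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word32)
import System.IO (IOMode (..), withBinaryFile)

-- Running Adler-32 state: the two sums, both kept modulo 65521.
-- (Stand-in for a real incremental hash context.)
data Adler = Adler !Word32 !Word32

initAdler :: Adler
initAdler = Adler 1 0

-- Fold one chunk into the running state.
updateAdler :: Adler -> B.ByteString -> Adler
updateAdler = B.foldl' step
  where
    step (Adler a b) w =
      let a' = (a + fromIntegral w) `mod` 65521
          b' = (b + a') `mod` 65521
      in Adler a' b'

finalizeAdler :: Adler -> Word32
finalizeAdler (Adler a b) = (b `shiftL` 16) .|. a

-- Copy src to dst chunk by chunk, feeding each chunk to the checksum
-- as it is written, so the content is only read once.
copyWithChecksum :: Int -> FilePath -> FilePath -> IO Word32
copyWithChecksum chunkSize src dst = do
  st <- newMVar initAdler
  withBinaryFile src ReadMode $ \hin ->
    withBinaryFile dst WriteMode $ \hout ->
      let loop = do
            chunk <- B.hGet hin chunkSize
            if B.null chunk
              then return ()
              else do
                B.hPut hout chunk
                -- One MVar access per chunk; the strict fields of
                -- Adler keep thunks from piling up in the MVar.
                modifyMVar_ st (\s -> return $! updateAdler s chunk)
                loop
      in loop
  finalizeAdler <$> readMVar st

main :: IO ()
main = do
  B.writeFile "src.tmp" (BC.pack "Wikipedia")
  incremental <- copyWithChecksum 4 "src.tmp" "dst.tmp"
  copied <- B.readFile "dst.tmp"
  -- The chunked checksum should match a one-shot pass over the copy.
  print (incremental == finalizeAdler (updateAdler initAdler copied))
```

In a sketch like this, the per-chunk MVar update is the only extra work beyond the copy and the checksum arithmetic, so timing it with and without the `modifyMVar_` line is one way to separate checksumming cost from bookkeeping overhead.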
2021-02-10 16:05:24 -04:00 · f44d4704c6
commit f44d4704c6
parent 48f63c2798
4 changed files with 42 additions and 19 deletions
--- a/doc/todo/OPT58_34bundle34_get_+_check_40of_checksum41_in_a_single_operation/comment_10_695d1269ab20c66630ddfa2d8cbabbef._comment
+++ b/doc/todo/OPT58_34bundle34_get_+_check_40of_checksum41_in_a_single_operation/comment_10_695d1269ab20c66630ddfa2d8cbabbef._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2021-02-10T19:48:58Z"
+ content="""
+Incremental hashing implemented for local git remotes.
+
+Next step should be a special remote, such as directory,
+that uses byteRetriever. Chunking and encryption will complicate them..
+"""]]
--- a/doc/todo/OPT58_34bundle34_get_+_check_40of_checksum41_in_a_single_operation/comment_9_4f4f4a42adafe52207ed32a5d20e94be._comment
+++ b/doc/todo/OPT58_34bundle34_get_+_check_40of_checksum41_in_a_single_operation/comment_9_4f4f4a42adafe52207ed32a5d20e94be._comment
@@ -20,6 +20,6 @@ checksum.

 Urk: Using rsync currently protects against
 [[bugs/URL_key_potential_data_loss]], so the replacement would also need to
-deal with that. Probably by refusing to resume a partial transfer of an
-affected key. (Or it could just fall back to rsync for such keys.)
+deal with that. Eg, by comparing the temp file content with the start of
+the object when resuming.
 """]]