incremental hashing for fileRetriever

It uses tailVerify to hash the file while it's being written.

This can sometimes avoid a separate checksum step. However, if the
file gets written quickly enough, tailVerify may not see it get created
before the write finishes, and the separate checksum still happens.

Testing with the directory special remote, incremental checksumming did
not happen. But when I disabled the copy CoW probing, it did work.
The cause is that the CoW probe creates an empty file on failure, then
deletes it, and then the file is created again. tailVerify opens the
first, empty file, and so fails to read the content that gets written
to the file that replaces it.
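
The effect described above can be reproduced outside git-annex. What
follows is only a minimal sketch, assuming POSIX semantics where an open
file can be unlinked; it does not use tailVerify or the CoW probe, and
staleHandleDemo is a made-up name for illustration. A reader opens an
empty file, the file is deleted and recreated, and the reader never sees
the content written to the replacement.

```haskell
import Control.Exception (evaluate)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical demo, not git-annex code. Returns what the early reader
-- saw, and what is actually in the file at the end.
staleHandleDemo :: IO (String, String)
staleHandleDemo = do
    tmpdir <- getTemporaryDirectory
    -- Stand-in for the failed probe: it leaves an empty file behind.
    (path, h) <- openTempFile tmpdir "stale-handle-demo"
    hClose h
    -- Stand-in for the tail-style reader: it opens that empty file.
    reader <- openFile path ReadMode
    -- The probe cleans up, and the real transfer recreates the file.
    removeFile path
    writeFile path "real content"
    -- The reader still refers to the old, deleted, empty file.
    seen <- hGetContents reader
    _ <- evaluate (length seen)
    hClose reader
    -- Reading by name reaches the replacement file.
    onDisk <- readFile path
    _ <- evaluate (length onDisk)
    removeFile path
    return (seen, onDisk)

main :: IO ()
main = staleHandleDemo >>= print
```

On a POSIX system the first component should come back empty while the
second holds the written content, which is the same reason the
incremental checksum never sees any data here.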

The directory special remote really ought to be able to avoid needing to
use tailVerify, and while other special remotes could do things that
cause similar problems, they probably don't. And if they do, it just
means the checksum doesn't get done incrementally.

Sponsored-by: Dartmouth College's DANDI project
Joey Hess 2021-08-13 15:43:29 -04:00
parent ff2dc5eb18
commit dadbb510f6
GPG key ID: DB12DB0FF05F8F38
8 changed files with 80 additions and 49 deletions

@@ -52,4 +52,6 @@ data IncrementalVerifier = IncrementalVerifier
-- if the hash verified.
, failIncremental :: IO ()
-- ^ Call if the incremental verification needs to fail.
, descVerify :: String
-- ^ A description of what is done to verify the content.
}
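
To illustrate how a record of IO actions like this one can drive an
incremental check, here is a rough sketch. It is not the git-annex
type: SketchVerifier, feedChunk, finishCheck, failSketch, descSketch and
mkSketchVerifier are all hypothetical names, and a toy additive checksum
stands in for a real cryptographic hash.

```haskell
import Data.Char (ord)
import Data.IORef

-- Hypothetical record, loosely modelled on the fields visible in the
-- hunk above. None of these names come from git-annex.
data SketchVerifier = SketchVerifier
    { feedChunk :: String -> IO ()
    -- ^ Feed the next chunk of content as it arrives.
    , finishCheck :: IO Bool
    -- ^ True if the accumulated checksum matches the expected one.
    , failSketch :: IO ()
    -- ^ Mark the verification as failed, whatever was fed so far.
    , descSketch :: String
    -- ^ A description of what is done to verify the content.
    }

-- Toy checksum: sum of character codes modulo 65536.
toyChecksum :: String -> Int
toyChecksum s = sum (map ord s) `mod` 65536

mkSketchVerifier :: Int -> IO SketchVerifier
mkSketchVerifier expected = do
    -- Nothing means the verifier was told to fail.
    acc <- newIORef (Just 0)
    let step chunk n = (n + sum (map ord chunk)) `mod` 65536
    return SketchVerifier
        { feedChunk = \chunk -> modifyIORef' acc (fmap (step chunk))
        , finishCheck = (== Just expected) <$> readIORef acc
        , failSketch = writeIORef acc Nothing
        , descSketch = "checked toy checksum"
        }

main :: IO ()
main = do
    v <- mkSketchVerifier (toyChecksum "hello world")
    feedChunk v "hello "
    feedChunk v "world"
    ok <- finishCheck v
    putStrLn (descSketch v ++ ": " ++ show ok)
```

Feeding the content in chunks gives the same result as checksumming it
in one pass, which is what lets the separate checksum step be skipped
when every chunk was seen.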