Joey Hess 2018-03-13 12:17:24 -04:00
parent e16b069331
commit 9930b1f140

@@ -20,3 +20,41 @@ gets changed while it's transferred so some bad bytes are sent, then the
transfer is interrupted, and later is resumed from a different remote
that has the correct content. How can it tell that the bad data was sent
in this case?

----

git-annex-shell recvkey handles this by having the client tell it whether
the file being sent is unlocked. An unlocked file forces verification;
otherwise, verification can be skipped.

In the case where an upload is started from one repository and later
resumed by another, rsync wipes out any differences, so if the first
repository was unlocked and the second is locked, it's safe for recvkey to
treat it as locked and skip verification.

It seems the best we could do with the P2P protocol, barring adding
rsync-style rolling hashing to it, is to allow skipping verification
when the sender is locked. But not when resuming, since we don't know
where the resumed data came from.
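
The rule above boils down to a two-input decision. A minimal sketch in Python, where `p2p_can_skip_verification` is a hypothetical helper name used only for illustration, not anything in git-annex:

```python
def p2p_can_skip_verification(sender_locked: bool, resuming: bool) -> bool:
    """Hypothetical helper: True when the receiver may skip verifying content."""
    # A locked file cannot change while it is being sent, so a fresh
    # transfer from a locked sender can be trusted. Resumed data may have
    # come from anywhere, so it is never trusted without verification.
    return sender_locked and not resuming


# Print the full truth table for the two inputs.
for sender_locked in (True, False):
    for resuming in (False, True):
        print(sender_locked, resuming,
              p2p_can_skip_verification(sender_locked, resuming))
```

Only the combination of a locked sender and a fresh (not resumed) transfer allows skipping.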

This is not really unique to the P2P protocol -- special remotes
can be written to support resuming. The web special remote does; there may
be external special remotes that do too. While the content of a key on
a special remote is not allowed to change, a download could start from
an unlocked git repo, and then be resumed from such a special remote.
When verification is disabled, this can result in bad content getting into
the repository.
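
To make the failure concrete, here is a toy model of it. Everything in it is an assumption made for illustration: the key is modelled as a plain SHA-256 of the content, and `resume_download` is a made-up helper that simply appends the source's bytes from the offset already received, which is how a naive resume behaves.

```python
import hashlib


def sha256_key(data: bytes) -> str:
    """Toy stand-in for a key: the SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


def resume_download(partial: bytes, source: bytes) -> bytes:
    """Naive resume: keep what was received, fetch the rest from source."""
    return partial + source[len(partial):]


good = b"0123456789abcdef"
key = sha256_key(good)

# The unlocked file was modified while it was being sent, and the
# transfer was interrupted after 8 bytes.
modified = b"01XX456789abcdef"
partial = modified[:8]

# The download is resumed from a remote that has the correct content.
result = resume_download(partial, good)

# The bad bytes at the start survive the resume, so the hash no longer
# matches the key. Only verification would notice this.
corrupted = sha256_key(result) != key
print(result, corrupted)
```

The second remote never sent anything wrong, yet the assembled file is bad, which is why the resume itself has to be what triggers verification.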

So, let's solve this broadly. Whenever a download is resumed, force
AlwaysVerify, unless the remote returns Verified. This can be done in
Annex.Content.getViaTmp, so it will affect all downloads involving the tmp
key for a file. (The P2P protocol still needs to prevent skipping
verification when a download is not being resumed, if the sender is
unlocked.)
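
A rough model of that rule, written in Python rather than the Haskell of the real code so it is not mistaken for it. `AlwaysVerify`, `Verified` and `UnVerified` are the names used in these notes; `NO_VERIFY`, `effective_config` and `must_check_content` are hypothetical and exist only for this sketch.

```python
from enum import Enum


class VerifyConfig(Enum):
    ALWAYS_VERIFY = "AlwaysVerify"  # named in the notes
    NO_VERIFY = "NoVerify"          # hypothetical: verification disabled


class Verification(Enum):
    UNVERIFIED = "UnVerified"
    VERIFIED = "Verified"


def effective_config(configured: VerifyConfig, resumed: bool) -> VerifyConfig:
    """A resumed download overrides whatever was configured."""
    return VerifyConfig.ALWAYS_VERIFY if resumed else configured


def must_check_content(configured: VerifyConfig, resumed: bool,
                       remote_result: Verification) -> bool:
    """True when the downloaded content still has to be checked."""
    # If the remote already verified the content, nothing more to do.
    if remote_result is Verification.VERIFIED:
        return False
    return effective_config(configured, resumed) is VerifyConfig.ALWAYS_VERIFY


# Verification disabled, fresh download: skipped as before.
print(must_check_content(VerifyConfig.NO_VERIFY, False, Verification.UNVERIFIED))
# Verification disabled, but the download was resumed: now forced.
print(must_check_content(VerifyConfig.NO_VERIFY, True, Verification.UNVERIFIED))
```

Putting the override in one place means every remote that supports resuming gets the same treatment without each having to know about it.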

This would change the handling of resumed downloads that use rsync too.
But it is always safe to skip verification of those, although they don't
quite do a full verification of the key's hash. To still allow verification
to be disabled for those, a third state could be added in between UnVerified
and Verified, meaning that the transfer is sure to have gotten exactly the
same bytes as are on the remote.
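
One way the three-state idea could play out, again as a Python sketch under assumptions: `SAME_AS_REMOTE` is a made-up name for the proposed in-between state, and `must_check_content` is a hypothetical helper, not an existing function.

```python
from enum import Enum


class Verification(Enum):
    UNVERIFIED = 1
    SAME_AS_REMOTE = 2  # hypothetical name for the proposed third state
    VERIFIED = 3


def must_check_content(verification_enabled: bool, resumed: bool,
                       result: Verification) -> bool:
    """True when the downloaded content still has to be checked."""
    if result is Verification.VERIFIED:
        return False
    if result is Verification.SAME_AS_REMOTE:
        # The bytes are known to match the remote, so a resume cannot
        # have mixed in foreign data. Only the configuration matters.
        return verification_enabled
    # UnVerified: a resume forces the check even when it is disabled.
    return verification_enabled or resumed


# rsync-style resume with verification disabled: may still be skipped.
print(must_check_content(False, True, Verification.SAME_AS_REMOTE))
# Plain resume with verification disabled: forced.
print(must_check_content(False, True, Verification.UNVERIFIED))
```

The middle state does not claim the key's hash was checked, only that the local bytes equal the remote's, which is exactly the guarantee a resume otherwise lacks.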