deal with unlocked files

P2P protocol version 1 adds VALID|INVALID after DATA; INVALID means the
file was detected to change content while it was being sent and so we
may not have received the valid content of the file.

Added new MustVerify constructor for Verification, which forces
verification even when annex.verify=false etc. This is used when INVALID
and in protocol version 0.

As well as changing git-annex-shell p2psdio, this makes git-annex tor
remotes always force verification, since they don't yet use protocol
version 1. Previously, annex.verify=false could skip verification when
using tor remotes, and let bad data into the repository.

This commit was sponsored by Jack Hill on Patreon.
This commit is contained in:
Joey Hess 2018-03-13 14:18:30 -04:00
parent 9930b1f140
commit 31e1adc005
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
10 changed files with 141 additions and 78 deletions

View file

@ -16,7 +16,7 @@ module Annex.Content (
lockContentForRemoval, lockContentForRemoval,
ContentRemovalLock, ContentRemovalLock,
getViaTmp, getViaTmp,
getViaTmp', getViaTmpFromDisk,
checkDiskSpaceToGet, checkDiskSpaceToGet,
prepTmp, prepTmp,
withTmp, withTmp,
@ -295,13 +295,13 @@ lockContentUsing locker key a = do
- the key and moves the file into the annex as a key's content. -} - the key and moves the file into the annex as a key's content. -}
getViaTmp :: VerifyConfig -> Key -> (FilePath -> Annex (Bool, Verification)) -> Annex Bool getViaTmp :: VerifyConfig -> Key -> (FilePath -> Annex (Bool, Verification)) -> Annex Bool
getViaTmp v key action = checkDiskSpaceToGet key False $ getViaTmp v key action = checkDiskSpaceToGet key False $
getViaTmp' v key action getViaTmpFromDisk v key action
{- Like getViaTmp, but does not check that there is enough disk space {- Like getViaTmp, but does not check that there is enough disk space
- for the incoming key. For use when the key content is already on disk - for the incoming key. For use when the key content is already on disk
- and not being copied into place. -} - and not being copied into place. -}
getViaTmp' :: VerifyConfig -> Key -> (FilePath -> Annex (Bool, Verification)) -> Annex Bool getViaTmpFromDisk :: VerifyConfig -> Key -> (FilePath -> Annex (Bool, Verification)) -> Annex Bool
getViaTmp' v key action = do getViaTmpFromDisk v key action = do
tmpfile <- prepTmp key tmpfile <- prepTmp key
(ok, verification) <- action tmpfile (ok, verification) <- action tmpfile
if ok if ok
@ -331,12 +331,15 @@ getViaTmp' v key action = do
- it is checked. - it is checked.
-} -}
verifyKeyContent :: VerifyConfig -> Verification -> Key -> FilePath -> Annex Bool verifyKeyContent :: VerifyConfig -> Verification -> Key -> FilePath -> Annex Bool
verifyKeyContent _ Verified _ _ = return True verifyKeyContent v verification k f = case verification of
verifyKeyContent v UnVerified k f = ifM (shouldVerify v) Verified -> return True
( verifysize <&&> verifycontent UnVerified -> ifM (shouldVerify v)
, return True ( verify
) , return True
)
MustVerify -> verify
where where
verify = verifysize <&&> verifycontent
verifysize = case keySize k of verifysize = case keySize k of
Nothing -> return True Nothing -> return True
Just size -> do Just size -> do

View file

@ -213,7 +213,7 @@ storeReceived f = do
warning $ "Received a file " ++ f ++ " that is not a git-annex key. Deleting this file." warning $ "Received a file " ++ f ++ " that is not a git-annex key. Deleting this file."
liftIO $ nukeFile f liftIO $ nukeFile f
Just k -> void $ Just k -> void $
getViaTmp' AlwaysVerify k $ \dest -> unVerified $ getViaTmpFromDisk AlwaysVerify k $ \dest -> unVerified $
liftIO $ catchBoolIO $ do liftIO $ catchBoolIO $ do
rename f dest rename f dest
return True return True

View file

@ -79,11 +79,11 @@ perform file oldkey newkey = do
linkKey :: FilePath -> Key -> Key -> Annex Bool linkKey :: FilePath -> Key -> Key -> Annex Bool
linkKey file oldkey newkey = ifM (isJust <$> isAnnexLink file) linkKey file oldkey newkey = ifM (isJust <$> isAnnexLink file)
{- If the object file is already hardlinked to elsewhere, a hard {- If the object file is already hardlinked to elsewhere, a hard
- link won't be made by getViaTmp', but a copy instead. - link won't be made by getViaTmpFromDisk, but a copy instead.
- This avoids hard linking to content linked to an - This avoids hard linking to content linked to an
- unlocked file, which would leave the new key unlocked - unlocked file, which would leave the new key unlocked
- and vulnerable to corruption. -} - and vulnerable to corruption. -}
( getViaTmp' DefaultVerify newkey $ \tmp -> unVerified $ do ( getViaTmpFromDisk DefaultVerify newkey $ \tmp -> unVerified $ do
oldobj <- calcRepo (gitAnnexLocation oldkey) oldobj <- calcRepo (gitAnnexLocation oldkey)
linkOrCopy' (return True) newkey oldobj tmp Nothing linkOrCopy' (return True) newkey oldobj tmp Nothing
, do , do

View file

@ -51,14 +51,10 @@ runLocal runst runner a = case a of
ReadContent k af o sender next -> do ReadContent k af o sender next -> do
v <- tryNonAsync $ prepSendAnnex k v <- tryNonAsync $ prepSendAnnex k
case v of case v of
-- The check can detect if the file Right (Just (f, checkchanged)) -> do
-- changed while it was transferred, but we don't
-- use it. Instead, the receiving peer must
-- AlwaysVerify the content it receives.
Right (Just (f, _check)) -> do
v' <- tryNonAsync $ v' <- tryNonAsync $
transfer upload k af $ transfer upload k af $
sinkfile f o sender sinkfile f o checkchanged sender
case v' of case v' of
Left e -> return (Left (show e)) Left e -> return (Left (show e))
Right (Left e) -> return (Left (show e)) Right (Left e) -> return (Left (show e))
@ -66,16 +62,16 @@ runLocal runst runner a = case a of
-- content not available -- content not available
Right Nothing -> runner (next False) Right Nothing -> runner (next False)
Left e -> return (Left (show e)) Left e -> return (Left (show e))
StoreContent k af o l getb next -> do StoreContent k af o l getb validitycheck next -> do
ok <- flip catchNonAsync (const $ return False) $ ok <- flip catchNonAsync (const $ return False) $
transfer download k af $ \p -> transfer download k af $ \p ->
getViaTmp AlwaysVerify k $ \tmp -> getViaTmp DefaultVerify k $ \tmp -> do
unVerified $ storefile tmp o l getb p storefile tmp o l getb validitycheck p
runner (next ok)
StoreContentTo dest o l getb next -> do
ok <- flip catchNonAsync (const $ return False) $
storefile dest o l getb nullMeterUpdate
runner (next ok) runner (next ok)
StoreContentTo dest o l getb validitycheck next -> do
res <- flip catchNonAsync (const $ return (False, UnVerified)) $
storefile dest o l getb validitycheck nullMeterUpdate
runner (next res)
SetPresent k u next -> do SetPresent k u next -> do
v <- tryNonAsync $ logChange k u InfoPresent v <- tryNonAsync $ logChange k u InfoPresent
case v of case v of
@ -120,6 +116,7 @@ runLocal runst runner a = case a of
UpdateMeterTotalSize m sz next -> do UpdateMeterTotalSize m sz next -> do
liftIO $ setMeterTotalSize m sz liftIO $ setMeterTotalSize m sz
runner next runner next
RunValidityCheck check next -> runner . next =<< check
where where
transfer mk k af ta = case runst of transfer mk k af ta = case runst of
-- Update transfer logs when serving. -- Update transfer logs when serving.
@ -129,20 +126,31 @@ runLocal runst runner a = case a of
-- a client. -- a client.
Client _ -> ta nullMeterUpdate Client _ -> ta nullMeterUpdate
storefile dest (Offset o) (Len l) getb p = do storefile dest (Offset o) (Len l) getb validitycheck p = do
let p' = offsetMeterUpdate p (toBytesProcessed o) let p' = offsetMeterUpdate p (toBytesProcessed o)
v <- runner getb v <- runner getb
case v of case v of
Right b -> liftIO $ do Right b -> do
withBinaryFile dest ReadWriteMode $ \h -> do liftIO $ withBinaryFile dest ReadWriteMode $ \h -> do
when (o /= 0) $ when (o /= 0) $
hSeek h AbsoluteSeek o hSeek h AbsoluteSeek o
meteredWrite p' h b meteredWrite p' h b
sz <- getFileSize dest rightsize <- do
return (toInteger sz == l + o) sz <- liftIO $ getFileSize dest
return (toInteger sz == l + o)
runner validitycheck >>= \case
Right (Just Valid) ->
return (rightsize, UnVerified)
_ -> do
-- Invalid, or old protocol
-- version. Validity is not
-- known. Force content
-- verification.
return (rightsize, MustVerify)
Left e -> error e Left e -> error e
sinkfile f (Offset o) sender p = bracket setup cleanup go sinkfile f (Offset o) checkchanged sender p = bracket setup cleanup go
where where
setup = liftIO $ openBinaryFile f ReadMode setup = liftIO $ openBinaryFile f ReadMode
cleanup = liftIO . hClose cleanup = liftIO . hClose
@ -151,4 +159,8 @@ runLocal runst runner a = case a of
when (o /= 0) $ when (o /= 0) $
liftIO $ hSeek h AbsoluteSeek o liftIO $ hSeek h AbsoluteSeek o
b <- liftIO $ hGetContentsMetered h p' b <- liftIO $ hGetContentsMetered h p'
runner (sender b) let validitycheck = local $ runValidityCheck $
checkchanged >>= return . \case
False -> Invalid
True -> Valid
runner (sender b validitycheck)

View file

@ -14,8 +14,10 @@
module P2P.Protocol where module P2P.Protocol where
import qualified Utility.SimpleProtocol as Proto import qualified Utility.SimpleProtocol as Proto
import Types (Annex)
import Types.Key import Types.Key
import Types.UUID import Types.UUID
import Types.Remote (Verification(..), unVerified)
import Utility.AuthToken import Utility.AuthToken
import Utility.Applicative import Utility.Applicative
import Utility.PartialPrelude import Utility.PartialPrelude
@ -48,12 +50,15 @@ defaultProtocolVersion :: ProtocolVersion
defaultProtocolVersion = ProtocolVersion 0 defaultProtocolVersion = ProtocolVersion 0
maxProtocolVersion :: ProtocolVersion maxProtocolVersion :: ProtocolVersion
maxProtocolVersion = ProtocolVersion 0 maxProtocolVersion = ProtocolVersion 1
-- | Service as used by the connect message in gitremote-helpers(1) -- | Service as used by the connect message in gitremote-helpers(1)
data Service = UploadPack | ReceivePack data Service = UploadPack | ReceivePack
deriving (Show) deriving (Show)
data Validity = Valid | Invalid
deriving (Show)
-- | Messages in the protocol. The peer that makes the connection -- | Messages in the protocol. The peer that makes the connection
-- always initiates requests, and the other peer makes responses to them. -- always initiates requests, and the other peer makes responses to them.
data Message data Message
@ -76,6 +81,7 @@ data Message
| SUCCESS | SUCCESS
| FAILURE | FAILURE
| DATA Len -- followed by bytes of data | DATA Len -- followed by bytes of data
| VALIDITY Validity
| ERROR String | ERROR String
deriving (Show) deriving (Show)
@ -98,6 +104,8 @@ instance Proto.Sendable Message where
formatMessage ALREADY_HAVE = ["ALREADY-HAVE"] formatMessage ALREADY_HAVE = ["ALREADY-HAVE"]
formatMessage SUCCESS = ["SUCCESS"] formatMessage SUCCESS = ["SUCCESS"]
formatMessage FAILURE = ["FAILURE"] formatMessage FAILURE = ["FAILURE"]
formatMessage (VALIDITY Valid) = ["VALID"]
formatMessage (VALIDITY Invalid) = ["INVALID"]
formatMessage (DATA len) = ["DATA", Proto.serialize len] formatMessage (DATA len) = ["DATA", Proto.serialize len]
formatMessage (ERROR err) = ["ERROR", Proto.serialize err] formatMessage (ERROR err) = ["ERROR", Proto.serialize err]
@ -122,6 +130,8 @@ instance Proto.Receivable Message where
parseCommand "FAILURE" = Proto.parse0 FAILURE parseCommand "FAILURE" = Proto.parse0 FAILURE
parseCommand "DATA" = Proto.parse1 DATA parseCommand "DATA" = Proto.parse1 DATA
parseCommand "ERROR" = Proto.parse1 ERROR parseCommand "ERROR" = Proto.parse1 ERROR
parseCommand "VALID" = Proto.parse0 (VALIDITY Valid)
parseCommand "INVALID" = Proto.parse0 (VALIDITY Invalid)
parseCommand _ = Proto.parseFail parseCommand _ = Proto.parseFail
instance Proto.Serializable ProtocolVersion where instance Proto.Serializable ProtocolVersion where
@ -226,25 +236,26 @@ data LocalF c
| ContentSize Key (Maybe Len -> c) | ContentSize Key (Maybe Len -> c)
-- ^ Gets size of the content of a key, when the full content is -- ^ Gets size of the content of a key, when the full content is
-- present. -- present.
| ReadContent Key AssociatedFile Offset (L.ByteString -> Proto Bool) (Bool -> c) | ReadContent Key AssociatedFile Offset (L.ByteString -> Proto Validity -> Proto Bool) (Bool -> c)
-- ^ Reads the content of a key and sends it to the callback. -- ^ Reads the content of a key and sends it to the callback.
-- Note that the content may change while it's being sent.
-- If the content is not available, sends L.empty to the callback. -- If the content is not available, sends L.empty to the callback.
| StoreContent Key AssociatedFile Offset Len (Proto L.ByteString) (Bool -> c) -- Note that the content may change while it's being sent.
-- The callback is passed a validity check that it can run after
-- sending the content to detect when this happened.
| StoreContent Key AssociatedFile Offset Len (Proto L.ByteString) (Proto (Maybe Validity)) (Bool -> c)
-- ^ Stores content to the key's temp file starting at an offset. -- ^ Stores content to the key's temp file starting at an offset.
-- Once the whole content of the key has been stored, moves the -- Once the whole content of the key has been stored, moves the
-- temp file into place as the content of the key, and returns True. -- temp file into place as the content of the key, and returns True.
-- --
-- Note: The ByteString may not contain the entire remaining content -- If the validity check is provided and fails, the content was
-- of the key. Only once the temp file size == Len has the whole -- changed while it was being sent, so verificiation of the
-- content been transferred. -- received content should be forced.
| StoreContentTo FilePath Offset Len (Proto L.ByteString) (Bool -> c)
-- ^ Stores the content to a temp file starting at an offset.
-- Once the whole content of the key has been stored, returns True.
-- --
-- Note: The ByteString may not contain the entire remaining content -- Note: The ByteString may not contain the entire remaining content
-- of the key. Only once the temp file size == Len has the whole -- of the key. Only once the temp file size == Len has the whole
-- content been transferred. -- content been transferred.
| StoreContentTo FilePath Offset Len (Proto L.ByteString) (Proto (Maybe Validity)) ((Bool, Verification) -> c)
-- ^ Like StoreContent, but stores the content to a temp file.
| SetPresent Key UUID c | SetPresent Key UUID c
| CheckContentPresent Key (Bool -> c) | CheckContentPresent Key (Bool -> c)
-- ^ Checks if the whole content of the key is locally present. -- ^ Checks if the whole content of the key is locally present.
@ -261,6 +272,8 @@ data LocalF c
| UpdateMeterTotalSize Meter Integer c | UpdateMeterTotalSize Meter Integer c
-- ^ Updates the total size of a Meter, for cases where the size is -- ^ Updates the total size of a Meter, for cases where the size is
-- not known until the data is being received. -- not known until the data is being received.
| RunValidityCheck (Annex Validity) (Validity -> c)
-- ^ Runs a deferred validity check.
deriving (Functor) deriving (Functor)
type Local = Free LocalF type Local = Free LocalF
@ -326,7 +339,7 @@ remove key = do
net $ sendMessage (REMOVE key) net $ sendMessage (REMOVE key)
checkSuccess checkSuccess
get :: FilePath -> Key -> AssociatedFile -> Meter -> MeterUpdate -> Proto Bool get :: FilePath -> Key -> AssociatedFile -> Meter -> MeterUpdate -> Proto (Bool, Verification)
get dest key af m p = get dest key af m p =
receiveContent (Just m) p sizer storer (\offset -> GET offset af key) receiveContent (Just m) p sizer storer (\offset -> GET offset af key)
where where
@ -436,8 +449,9 @@ serveAuthed servermode myuuid = void $ serverLoop handler
then net $ sendMessage ALREADY_HAVE then net $ sendMessage ALREADY_HAVE
else do else do
let sizer = tmpContentSize key let sizer = tmpContentSize key
let storer = storeContent key af let storer = \o l b v -> unVerified $
ok <- receiveContent Nothing nullMeterUpdate sizer storer PUT_FROM storeContent key af o l b v
(ok, _v) <- receiveContent Nothing nullMeterUpdate sizer storer PUT_FROM
when ok $ when ok $
local $ setPresent key myuuid local $ setPresent key myuuid
return ServerContinue return ServerContinue
@ -468,20 +482,29 @@ serveAuthed servermode myuuid = void $ serverLoop handler
sendContent :: Key -> AssociatedFile -> Offset -> MeterUpdate -> Proto Bool sendContent :: Key -> AssociatedFile -> Offset -> MeterUpdate -> Proto Bool
sendContent key af offset@(Offset n) p = go =<< local (contentSize key) sendContent key af offset@(Offset n) p = go =<< local (contentSize key)
where where
go Nothing = sender (Len 0) L.empty go Nothing = sender (Len 0) L.empty (return Valid)
go (Just (Len totallen)) = do go (Just (Len totallen)) = do
let len = totallen - n let len = totallen - n
if len <= 0 if len <= 0
then sender (Len 0) L.empty then sender (Len 0) L.empty (return Valid)
else local $ readContent key af offset $ else local $ readContent key af offset $
sender (Len len) sender (Len len)
sender len content = do sender len content validitycheck = do
let p' = offsetMeterUpdate p (toBytesProcessed n) let p' = offsetMeterUpdate p (toBytesProcessed n)
net $ sendMessage (DATA len) net $ sendMessage (DATA len)
net $ sendBytes len content p' net $ sendBytes len content p'
ver <- net getProtocolVersion
when (ver >= ProtocolVersion 1) $
net . sendMessage . VALIDITY =<< validitycheck
checkSuccess checkSuccess
receiveContent :: Maybe Meter -> MeterUpdate -> Local Len -> (Offset -> Len -> Proto L.ByteString -> Local Bool) -> (Offset -> Message) -> Proto Bool receiveContent
:: Maybe Meter
-> MeterUpdate
-> Local Len
-> (Offset -> Len -> Proto L.ByteString -> Proto (Maybe Validity) -> Local (Bool, Verification))
-> (Offset -> Message)
-> Proto (Bool, Verification)
receiveContent mm p sizer storer mkmsg = do receiveContent mm p sizer storer mkmsg = do
Len n <- local sizer Len n <- local sizer
let p' = offsetMeterUpdate p (toBytesProcessed n) let p' = offsetMeterUpdate p (toBytesProcessed n)
@ -493,13 +516,22 @@ receiveContent mm p sizer storer mkmsg = do
local $ case mm of local $ case mm of
Nothing -> return () Nothing -> return ()
Just m -> updateMeterTotalSize m (n+l) Just m -> updateMeterTotalSize m (n+l)
ok <- local $ storer offset len ver <- net getProtocolVersion
let validitycheck = if ver >= ProtocolVersion 1
then net receiveMessage >>= \case
Just (VALIDITY v) -> return (Just v)
_ -> do
net $ sendMessage (ERROR "expected VALID or INVALID")
return Nothing
else return Nothing
(ok, v) <- local $ storer offset len
(net (receiveBytes len p')) (net (receiveBytes len p'))
validitycheck
sendSuccess ok sendSuccess ok
return ok return (ok, v)
_ -> do _ -> do
net $ sendMessage (ERROR "expected DATA") net $ sendMessage (ERROR "expected DATA")
return False return (False, UnVerified)
checkSuccess :: Proto Bool checkSuccess :: Proto Bool
checkSuccess = do checkSuccess = do

View file

@ -470,13 +470,13 @@ copyFromRemote' forcersync r (State connpool _) key file dest meterupdate
file forwardRetry file forwardRetry
(\p -> copier object dest (combineMeterUpdate p meterupdate) checksuccess) (\p -> copier object dest (combineMeterUpdate p meterupdate) checksuccess)
| Git.repoIsSsh (repo r) = if forcersync | Git.repoIsSsh (repo r) = if forcersync
then unVerified $ fallback meterupdate then fallback meterupdate
else P2PHelper.retrieve else P2PHelper.retrieve
(\p -> Ssh.runProto r connpool False (fallback p)) (\p -> Ssh.runProto r connpool (False, UnVerified) (fallback p))
key file dest meterupdate key file dest meterupdate
| otherwise = giveup "copying from non-ssh, non-http remote not supported" | otherwise = giveup "copying from non-ssh, non-http remote not supported"
where where
fallback p = feedprogressback $ \p' -> do fallback p = unVerified $ feedprogressback $ \p' -> do
oh <- mkOutputHandlerQuiet oh <- mkOutputHandlerQuiet
Ssh.rsyncHelper oh (Just (combineMeterUpdate p' p)) Ssh.rsyncHelper oh (Just (combineMeterUpdate p' p))
=<< Ssh.rsyncParamsRemote False r Download key dest file =<< Ssh.rsyncParamsRemote False r Download key dest file

View file

@ -37,10 +37,11 @@ store runner k af p = do
fromMaybe False fromMaybe False
<$> runner p' (P2P.put k af p') <$> runner p' (P2P.put k af p')
retrieve :: (MeterUpdate -> ProtoRunner Bool) -> Key -> AssociatedFile -> FilePath -> MeterUpdate -> Annex (Bool, Verification) retrieve :: (MeterUpdate -> ProtoRunner (Bool, Verification)) -> Key -> AssociatedFile -> FilePath -> MeterUpdate -> Annex (Bool, Verification)
retrieve runner k af dest p = unVerified $ retrieve runner k af dest p =
metered (Just p) k (return Nothing) $ \m p' -> fromMaybe False metered (Just p) k (return Nothing) $ \m p' ->
<$> runner p' (P2P.get dest k af m p') fromMaybe (False, UnVerified)
<$> runner p' (P2P.get dest k af m p')
remove :: ProtoRunner Bool -> Key -> Annex Bool remove :: ProtoRunner Bool -> Key -> Annex Bool
remove runner k = fromMaybe False <$> runner (P2P.remove k) remove runner k = fromMaybe False <$> runner (P2P.remove k)

View file

@ -149,9 +149,15 @@ instance Ord (RemoteA a) where
instance ToUUID (RemoteA a) where instance ToUUID (RemoteA a) where
toUUID = uuid toUUID = uuid
-- Use Verified when the content of a key is verified as part of a data Verification
-- transfer, and so a separate verification step is not needed. = UnVerified
data Verification = UnVerified | Verified -- ^ content was not verified during transfer, but is probably
-- ok, so if verification is disabled, don't verify it
| Verified
-- ^ content of key was verified during transfer
| MustVerify
-- ^ content likely to have been altered during transfer,
-- verify even if verification is normally disabled
unVerified :: Monad m => m Bool -> m (Bool, Verification) unVerified :: Monad m => m Bool -> m (Bool, Verification)
unVerified a = do unVerified a = do

View file

@ -138,6 +138,13 @@ the client to start. This allows resuming transfers.
The client then sends a DATA message with content of the file from The client then sends a DATA message with content of the file from
the offset to the end of file. the offset to the end of file.
In protocol version 1, after the data, the client sends an additional
message, to indicate if the content of the file has changed while it
was being sent.
INVALID
VALID
If the server successfully receives the data and stores the content, If the server successfully receives the data and stores the content,
it replies with SUCCESS. Otherwise, FAILURE. it replies with SUCCESS. Otherwise, FAILURE.
@ -154,6 +161,13 @@ See description of AssociatedFile above.
The server then sends a DATA message with the content of the file The server then sends a DATA message with the content of the file
from the offset to end of file. from the offset to end of file.
In protocol version 1, after the data, the server sends an additional
message, to indicate if the content of the file has changed while it
was being sent.
INVALID
VALID
The client replies with SUCCESS or FAILURE. The client replies with SUCCESS or FAILURE.
## Connection to services ## Connection to services

View file

@ -3,19 +3,15 @@ git-annex-shell recvkey has a speed optimisation, when it's told the file
being sent is locked, it can avoid an expensive verification, when being sent is locked, it can avoid an expensive verification, when
annex.verify=false. (Similar for transfers in the other direction.) annex.verify=false. (Similar for transfers in the other direction.)
The P2P protocol does not have a way to communicate when that happens, The P2P protocol does not have a way to communicate when that happens.
and forces AlwaysVerify. File content can be modified while it's sent, and if annex.verify=false
is allowed to take effect, bad data will get into the repository.
It would be nice to support that, but if it added an extra round trip It would be nice to support annex.verify=false when it's safe but not
when the file got modified, but if it added an extra round trip
to the P2P protocol, that could lose some of the speed gains. to the P2P protocol, that could lose some of the speed gains.
The best way seems to be to add a new protocol version, where DATA
has an extra byte at the end that is "1" when the file didn't change
as it was transferred, and "0" when it did.
My first attempt to implement this failed miserably due to a Free monad Resumes make this difficult. What if a file starts to be transferred,
type check problem I could not see a way around.
Also, resumes make this difficult. What if a file starts to be transferred,
gets changed while it's transferred so some bad bytes are sent, then the gets changed while it's transferred so some bad bytes are sent, then the
transfer is interrupted, and later is resumed from a different remote transfer is interrupted, and later is resumed from a different remote
that has the correct content. How can it tell that the bad data was sent that has the correct content. How can it tell that the bad data was sent
@ -33,9 +29,10 @@ repository was unlocked, and the second is locked, it's safe for recvkey to
treat it locked and skip verification. treat it locked and skip verification.
Seems the best we could do with the P2P protocol, barring adding Seems the best we could do with the P2P protocol, barring adding
rsync-style rolling hashing to it, is to allow skipping verification rsync-style rolling hashing to it, is to detect when a file got modified
when the sender is locked.. But not when resuming, since we don't know as it was being sent, and inform the peer that the data it got is bad.
where that resumed data comes from. It can then throw it away rather than putting the bad data into the
repository.
This is not really unique to the P2P protocol -- special remotes This is not really unique to the P2P protocol -- special remotes
can be written to support resuming. The web special remote does; there may can be written to support resuming. The web special remote does; there may
@ -48,9 +45,7 @@ the repository.
So, let's solve this broadly. Whenever a download is resumed, force So, let's solve this broadly. Whenever a download is resumed, force
AlwaysVerify, unless the remote returns Verified. This can be done in AlwaysVerify, unless the remote returns Verified. This can be done in
Annex.Content.getViaTmp, so it will affect all downloads involving the tmp Annex.Content.getViaTmp, so it will affect all downloads involving the tmp
key for a file. (The P2P protocol still needs to prevent skipping key for a file.
verification when a download is not being resumed, if the sender is
locked.)
This would change handling of resumes of downloads using rsync too. This would change handling of resumes of downloads using rsync too.
But those are always safe to skip verification of, although they don't But those are always safe to skip verification of, although they don't