From 0b053b96119c1a2d6c2d03ba7f98f032e89b8b66 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 2 Nov 2018 13:41:50 -0400 Subject: [PATCH] Fix a P2P protocol hang When readContent got Nothing from prepSendAnnex, it did not run its callback, and the callback is what sends the DATA reply. sendContent checks with contentSize that the object file is present, but that doesn't really guarantee that prepSendAnnex won't return Nothing. So, it was possible for a P2P protocol GET to not receive a response, and appear to hang. When what it's really doing is waiting for the next protocol command. This seems most likely to happen when the annex is in direct mode, and the file being requested has been modified. It could also happen in an indirect mode repository if genInodeCache somehow failed. Perhaps due to a race with a drop of the content file. Fixed by making readContent behave the way its spec said it should, and run the callback with L.empty in this case. Note that, it's finee for readContent to send any amount of data to the callback, including L.empty. sendBytes deals with that by making sure it sends exactly the specified number of bytes, aborting the protocol if it's too short. So, when L.empty is sent, the protocol will end up aborting. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project. --- CHANGELOG | 1 + P2P/Annex.hs | 21 +++++++++++---------- P2P/Protocol.hs | 5 +++-- 3 files changed, 15 insertions(+), 12 deletions(-) diff --git a/CHANGELOG b/CHANGELOG index 41196c28d8..a4b255dce7 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1,6 +1,7 @@ git-annex (7.20181032) UNRELEASED; urgency=medium * Increase minimum QuickCheck version. + * Fix a P2P protocol hang. -- Joey Hess Wed, 31 Oct 2018 15:53:03 -0400 diff --git a/P2P/Annex.hs b/P2P/Annex.hs index c89ef93a1e..a08b923176 100644 --- a/P2P/Annex.hs +++ b/P2P/Annex.hs @@ -50,18 +50,19 @@ runLocal runst runner a = case a of size <- inAnnex' isJust Nothing getsize k runner (next (Len <$> size)) ReadContent k af o sender next -> do + let proceed c = do + r <- tryNonAsync c + case r of + Left e -> return $ Left $ ProtoFailureException e + Right (Left e) -> return $ Left e + Right (Right ok) -> runner (next ok) v <- tryNonAsync $ prepSendAnnex k case v of - Right (Just (f, checkchanged)) -> do - v' <- tryNonAsync $ - transfer upload k af $ - sinkfile f o checkchanged sender - case v' of - Left e -> return $ Left $ ProtoFailureException e - Right (Left e) -> return $ Left e - Right (Right ok) -> runner (next ok) - -- content not available - Right Nothing -> runner (next False) + Right (Just (f, checkchanged)) -> proceed $ + transfer upload k af $ + sinkfile f o checkchanged sender + Right Nothing -> proceed $ + runner (sender mempty (return Invalid)) Left e -> return $ Left $ ProtoFailureException e StoreContent k af o l getb validitycheck next -> do -- This is the same as the retrievalSecurityPolicy of diff --git a/P2P/Protocol.hs b/P2P/Protocol.hs index 49a3d5bf6f..6e2d7fc81e 100644 --- a/P2P/Protocol.hs +++ b/P2P/Protocol.hs @@ -238,8 +238,9 @@ data LocalF c -- present. | ReadContent Key AssociatedFile Offset (L.ByteString -> Proto Validity -> Proto Bool) (Bool -> c) -- ^ Reads the content of a key and sends it to the callback. - -- If the content is not available, sends L.empty to the callback. - -- Note that the content may change while it's being sent. + -- May send any amount of data, including L.empty if the content is + -- not available. The callback must deal with that. + -- And the content may change while it's being sent. -- The callback is passed a validity check that it can run after -- sending the content to detect when this happened. | StoreContent Key AssociatedFile Offset Len (Proto L.ByteString) (Proto (Maybe Validity)) (Bool -> c)