clean shut down of cluster connection when PUT is interrupted
An interrupted `git-annex copy --to` a cluster via the http server, when repeated, failed. The http server output "transfer already in progress, or unable to take transfer lock". Apparently a second connection was opened to the cluster, because the first connection never got shut down.

Turned out the problem was that when proxying to a cluster, it would read a short ByteString from the client, and send that to the nodes. But that left the nodes wanting more. Meanwhile, the proxy was expecting a SUCCESS/FAILURE message from the nodes. So it didn't return, and so the cluster connection stayed open.

Now, when less data than expected is received from the client, the connections to all the nodes are closed.
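The decision at the heart of this fix can be illustrated with a small pure model. This is only a sketch, not the code in `P2P/Proxy.hs`: the names `relay`, `Outcome`, `AwaitResponses` and `CloseRemotes` are hypothetical, and the real proxy performs network I/O per chunk and calls `closeRemoteSide` rather than returning a value. It assumes a positive chunk size.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- | Hypothetical summary of what the proxy does once the client's
-- stream is exhausted.
data Outcome
    = AwaitResponses -- ^ all expected bytes relayed; wait for SUCCESS/FAILURE
    | CloseRemotes   -- ^ stream ended short; the nodes still expect data
    deriving (Show, Eq)

-- | Walk a lazy ByteString in chunks, counting the bytes relayed,
-- starting from the given offset. Returns the final byte count and
-- the resulting decision. The chunk size is assumed to be positive.
relay
    :: Int64        -- ^ chunk size
    -> Int64        -- ^ total length the nodes were told to expect
    -> Int64        -- ^ starting offset
    -> L.ByteString -- ^ data actually received from the client
    -> (Int64, Outcome)
relay chunksize totallen = go
  where
    go n b
        | L.null rest = (n', decide n')
        | otherwise = go n' rest
      where
        (chunk, rest) = L.splitAt chunksize b
        n' = n + L.length chunk

    -- A short read means the nodes are still waiting for the rest,
    -- so no SUCCESS/FAILURE will ever arrive from them.
    decide n'
        | n' /= totallen = CloseRemotes
        | otherwise = AwaitResponses
```

For example, `relay 4 10 0 (L.replicate 6 0)` ends after 6 of the 10 expected bytes and yields `CloseRemotes`, which corresponds to the interrupted PUT described above.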
parent bdde6d829c
commit 5e205f215d
2 changed files with 13 additions and 10 deletions
17
P2P/Proxy.hs
@@ -558,7 +558,6 @@ proxyRequest proxydone proxyparams requestcomplete requestmessage protoerrhandle
 			(const protoerr)
 
 	relayPUTMulti minoffset remotes k (Len datalen) _ = do
-		let totallen = datalen + minoffset
 		-- Tell each remote how much data to expect, depending
 		-- on the remote's offset.
 		rs <- forMC (proxyConcurrencyConfig proxyparams) remotes $ \r@(remoteside, remoteoffset) ->
@@ -569,6 +568,8 @@ proxyRequest proxydone proxyparams requestcomplete requestmessage protoerrhandle
 		protoerrhandler (send (catMaybes rs) minoffset) $
 			client $ net $ receiveBytes (Len datalen) nullMeterUpdate
 	  where
+		totallen = datalen + minoffset
+
 		chunksize = fromIntegral defaultChunkSize
 
 		-- Stream the lazy bytestring out to the remotes in chunks.
@@ -593,13 +594,21 @@ proxyRequest proxydone proxyparams requestcomplete requestmessage protoerrhandle
 							return r
 						else return (Just r)
 			if L.null b'
-				then sent (catMaybes rs')
+				then do
+					-- If we didn't receive as much
+					-- data as expected, close
+					-- connections to all the remotes,
+					-- because they are still waiting
+					-- on the rest of the data.
+					when (n' /= totallen) $
+						mapM_ (closeRemoteSide . fst) rs
+					sent (catMaybes rs')
 				else send (catMaybes rs') n' b'
 
 		sent [] = proxydone
 		sent rs = relayDATAFinishMulti k (map fst rs)
 
 		runRemoteSideOrSkipFailed remoteside a =
 			runRemoteSide remoteside a >>= \case
 				Right v -> return (Just v)
 				Left _ -> do
@@ -640,7 +649,7 @@ proxyRequest proxydone proxyparams requestcomplete requestmessage protoerrhandle
 			net receiveMessage
 	  where
 		finish a = do
 			storeduuids <- forMC (proxyConcurrencyConfig proxyparams) rs $ \r ->
 				runRemoteSideOrSkipFailed r a >>= \case
 					Just (Just resp) ->
 						relayPUTRecord k r resp
@@ -28,12 +28,6 @@ Planned schedule of work:
 
 ## work notes
 
-* An interrupted `git-annex copy --to` a cluster via the http server,
-  when repeated, fails. The http server outputs "transfer already in
-  progress, or unable to take transfer lock". Apparently a second
-  connection gets opened to the cluster, because the first connection
-  never got shut down.
-
 * When part of a file has been sent to a cluster via the http server,
   the transfer interrupted, and another node is added to the cluster,
   and the transfer of the file performed again, there is a failure