proxying GET now working

Memory use is small and constant; receiveBytes returns a lazy bytestring
and it does stream.

Comparing speed of a get of a 500 mb file over proxy from origin-origin,
vs from the same remote over a direct ssh:

joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from origin-origin
get bigfile (from origin-origin...)
ok
(recording state in git...)
1.89user 0.67system 0:10.79elapsed 23%CPU (0avgtext+0avgdata 68716maxresident)k
0inputs+984320outputs (0major+10779minor)pagefaults 0swaps

joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from direct-ssh
get bigfile (from direct-ssh...)
ok
1.79user 0.63system 0:10.49elapsed 23%CPU (0avgtext+0avgdata 65776maxresident)k
0inputs+1024312outputs (0major+9773minor)pagefaults 0swaps

So the proxy doesn't add much overhead even when run on the same machine as
the client and remote.

Still, piping receiveBytes into sendBytes like this does suggest that the proxy
could be made to use less CPU resouces by using `sendfile()`.
This commit is contained in:
Joey Hess 2024-06-11 15:01:14 -04:00
parent 09b5e53f49
commit a2f4a8eddf
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
3 changed files with 59 additions and 24 deletions

View file

@ -79,7 +79,6 @@ getRepoUUID r = do
if c /= u && u /= NoUUID if c /= u && u /= NoUUID
then do then do
liftIO $ print (r, "setting cache", c, u)
updatecache u updatecache u
return u return u
else return c else return c

View file

@ -13,6 +13,7 @@ import Annex.Common
import P2P.Protocol import P2P.Protocol
import P2P.IO import P2P.IO
import qualified Remote import qualified Remote
import Utility.Metered (nullMeterUpdate)
data ClientSide = ClientSide RunState P2PConnection data ClientSide = ClientSide RunState P2PConnection
data RemoteSide = RemoteSide RunState P2PConnection data RemoteSide = RemoteSide RunState P2PConnection
@ -79,47 +80,57 @@ proxy proxydone servermode clientside remoteside othermessage protoerrhandler =
case othermessage of case othermessage of
Just message -> proxyclientmessage (Just message) Just message -> proxyclientmessage (Just message)
Nothing -> do Nothing -> do
-- Send client the VERSION from the remote. v <- protocolversion
proxyprotocolversion <-
either (const defaultProtocolVersion) id
<$> toremote (net getProtocolVersion)
protoerrhandler proxynextclientmessage $ protoerrhandler proxynextclientmessage $
toclient $ net $ sendMessage client $ net $ sendMessage $ VERSION v
(VERSION proxyprotocolversion)
where where
ClientSide clientrunst clientconn = clientside ClientSide clientrunst clientconn = clientside
RemoteSide remoterunst remoteconn = remoteside RemoteSide remoterunst remoteconn = remoteside
toremote = liftIO . runNetProto remoterunst remoteconn remote = liftIO . runNetProto remoterunst remoteconn
toclient = liftIO . runNetProto clientrunst clientconn client = liftIO . runNetProto clientrunst clientconn
protocolversion = either (const defaultProtocolVersion) id
<$> remote (net getProtocolVersion)
proxynextclientmessage () = protoerrhandler proxyclientmessage $ proxynextclientmessage () = protoerrhandler proxyclientmessage $
toclient (net receiveMessage) client (net receiveMessage)
-- Send a message to the remote and then -- Send a message to the remote and then
-- send its response back to the client. -- send its response back to the client.
proxyresponse message = proxyresponse message =
protoerrhandler handleresp $ protoerrhandler (withresp handleresp) $
toremote $ net $ do remote $ net $ do
sendMessage message sendMessage message
receiveMessage receiveMessage
where where
handleresp (Just resp) = handleresp resp =
protoerrhandler proxynextclientmessage $ protoerrhandler proxynextclientmessage $
toclient $ net $ sendMessage resp client $ net $ sendMessage resp
-- Remote hung up
handleresp Nothing = proxydone withresp a (Just resp) = a resp
-- Whichever of the remote or client the message was read from
-- hung up.
withresp _ Nothing = proxydone
-- Send a message to the remote, that it will not respond to. -- Send a message to the remote, that it will not respond to.
proxynoresponse message = proxynoresponse message =
protoerrhandler proxynextclientmessage $ protoerrhandler proxynextclientmessage $
toremote $ net $ sendMessage message remote $ net $ sendMessage message
-- Read a message from one party, send it to the other,
-- and then call the continuation.
relayonemessage from to cont =
flip protoerrhandler (from $ net $ receiveMessage) $
withresp $ \resp ->
protoerrhandler cont $
to $ net $ sendMessage resp
servermodechecker c a = c servermode $ \case servermodechecker c a = c servermode $ \case
Nothing -> a Nothing -> a
Just notallowed -> Just notallowed ->
protoerrhandler proxynextclientmessage $ protoerrhandler proxynextclientmessage $
toclient notallowed client notallowed
proxyclientmessage Nothing = proxydone proxyclientmessage Nothing = proxydone
proxyclientmessage (Just message) = case message of proxyclientmessage (Just message) = case message of
@ -129,8 +140,10 @@ proxy proxydone servermode clientside remoteside othermessage protoerrhandler =
REMOVE _ -> REMOVE _ ->
servermodechecker checkREMOVEServerMode $ servermodechecker checkREMOVEServerMode $
proxyresponse message proxyresponse message
GET offset af k -> giveup "TODO GET" GET _ _ _ ->
PUT af k -> protoerrhandler (handleGET message) $
remote $ net $ sendMessage message
PUT _ _ ->
servermodechecker checkPUTServerMode $ servermodechecker checkPUTServerMode $
giveup "TODO PUT" giveup "TODO PUT"
-- These messages involve the git repository, not the -- These messages involve the git repository, not the
@ -144,8 +157,8 @@ proxy proxydone servermode clientside remoteside othermessage protoerrhandler =
-- the messages above. -- the messages above.
SUCCESS -> protoerr SUCCESS -> protoerr
FAILURE -> protoerr FAILURE -> protoerr
DATA len -> protoerr DATA _ -> protoerr
VALIDITY v -> protoerr VALIDITY _ -> protoerr
-- If the client errors out, give up. -- If the client errors out, give up.
ERROR msg -> giveup $ "client error: " ++ msg ERROR msg -> giveup $ "client error: " ++ msg
-- Messages that only the server should send. -- Messages that only the server should send.
@ -159,6 +172,29 @@ proxy proxydone servermode clientside remoteside othermessage protoerrhandler =
AUTH _ _ -> protoerr AUTH _ _ -> protoerr
VERSION _ -> protoerr VERSION _ -> protoerr
handleGET message () =
protoerrhandler (withresp handleresp) $
remote $ net $ do
sendMessage message
receiveMessage
where
handleresp resp@(DATA len) =
protoerrhandler (receive len) $
client $ net $ sendMessage resp
handleresp _ = protoerr
receive len () = protoerrhandler (send len) $
remote $ net $ receiveBytes len nullMeterUpdate
send len b = protoerrhandler finishget $
client $ net $ sendBytes len b nullMeterUpdate
finishget () = protocolversion >>= \case
ProtocolVersion 0 -> sendclientsuccessfailure ()
-- Protocol version 1 has the remote send a a VALID or
-- INVALID message after the data.
_ -> relayonemessage remote client sendclientsuccessfailure
sendclientsuccessfailure () =
relayonemessage client remote proxynextclientmessage
protoerr = do protoerr = do
_ <- toclient $ net $ sendMessage (ERROR "protocol error") _ <- client $ net $ sendMessage (ERROR "protocol error")
giveup "protocol error" giveup "protocol error"

View file

@ -37,7 +37,7 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
2. Remote instantiation for proxies. (done) 2. Remote instantiation for proxies. (done)
3. Implement proxying in git-annex-shell. 3. Implement proxying in git-annex-shell.
(Partly done, still need it for GET, PUT, CONNECT, and NOTIFYCHANGES (Partly done, still need it for PUT, CONNECT, and NOTIFYCHANGES
messages.) messages.)
4. Either implement proxying for local path remotes, or prevent 4. Either implement proxying for local path remotes, or prevent