proxy stores received keys to known export locations

This handles the workflow where the branch is first pushed to the proxy,
and then files in the exported tree are later copied to the proxied remote.

It turns out that, given the way the export log is structured, nothing
needs to be done to finalize the export once the last key is sent to it,
which is great because that would have added a lot of complication. On
receiving the push, Command.Export runs and calls recordExportBeginning,
does as much as it can to update the export with the files currently
on it, and then calls recordExportUnderway. At that point, the
export.log records the export as "complete", but it's not really. And
that's fine. The same happens when using `git-annex export` when some
files are not available to send. Other repositories that have
access to the special remote can already retrieve files from it. As
the missing files get copied to the exported remote, all that needs
to be done is record each in the export db.
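
The bookkeeping described above can be sketched with a small pure model. This is only an illustration under stated assumptions: `ExportDb`, `stillMissing` and `recordExported` are hypothetical names, and plain strings stand in for git-annex's key and export location types. It is not how Database.Export is implemented.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical stand-ins for git-annex's Key and ExportLocation types.
type Key = String
type Loc = FilePath

-- Simplified model of the export db: the locations the exported tree
-- has for each key, and the (key, location) pairs stored so far.
data ExportDb = ExportDb
    { treeLocs :: M.Map Key [Loc]
    , exportedLocs :: S.Set (Key, Loc)
    } deriving (Show, Eq)

-- Locations in the tree whose content has not reached the remote yet.
-- The export counts as recorded even while this list is non-empty.
stillMissing :: ExportDb -> [(Key, Loc)]
stillMissing db =
    [ (k, loc)
    | (k, locs) <- M.toList (treeLocs db)
    , loc <- locs
    , (k, loc) `S.notMember` exportedLocs db
    ]

-- The only step needed when a missing file arrives: record its location.
recordExported :: Key -> Loc -> ExportDb -> ExportDb
recordExported k loc db =
    db { exportedLocs = S.insert (k, loc) (exportedLocs db) }

-- An export where one of two files has been sent so far.
exampleDb :: ExportDb
exampleDb = ExportDb
    { treeLocs = M.fromList [("k1", ["a.txt"]), ("k2", ["b.txt"])]
    , exportedLocs = S.fromList [("k1", "a.txt")]
    }

main :: IO ()
main = do
    print (stillMissing exampleDb)
    print (stillMissing (recordExported "k2" "b.txt" exampleDb))
```

In this model there is no separate finalization step: once the last location is recorded, `stillMissing` is simply empty.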

At this point, proxying to exporttree=yes annexobjects=yes special remotes
is fully working, except in the case where multiple files in the
tree use the same key, and the files are sent to the proxied remote
before pushing the tree.

It seems that even special remotes without annexobjects=yes will work if
used with the workflow where the git-annex branch is pushed before
copying files. But not with the `git-annex push` workflow.
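
Why the order of operations matters can be sketched as a small decision function. This is a simplified model, assuming hypothetical names (`ExportInfo`, `Target`, `chooseTarget`) that are not part of git-annex; it only mirrors the shape of the choice the proxy makes when a key arrives.

```haskell
-- Hypothetical, simplified model; none of these names are git-annex's own.
data ExportInfo = ExportInfo
    { wantedLocs :: [FilePath] -- locations the known tree has for the key
    , storedLocs :: [FilePath] -- locations already recorded as exported
    }

data Target
    = StoreByKey
    | StoreToLocations [FilePath]
    deriving (Show, Eq)

-- Nothing models a remote that does not support exports at all.
chooseTarget :: Maybe ExportInfo -> Target
chooseTarget Nothing = StoreByKey
chooseTarget (Just info) = case wantedLocs info of
    -- The tree is not known yet, or does not contain the key.
    [] -> StoreByKey
    -- Store to every location of the tree that still lacks the content.
    locs -> StoreToLocations (filter (`notElem` storedLocs info) locs)

main :: IO ()
main = do
    -- Tree pushed first: both files sharing the key get a copy.
    print (chooseTarget (Just (ExportInfo ["a.txt", "b.txt"] [])))
    -- Key sent before the tree is pushed: it can only be stored by key.
    print (chooseTarget (Just (ExportInfo [] [])))
```

In the second case the model falls back to storing by key, which matches the commit message's observation that the `git-annex push` ordering is the one that does not work for remotes without annexobjects=yes.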
Joey Hess 2024-08-07 09:38:15 -04:00
parent bb9b02b723
commit 1038567881
3 changed files with 36 additions and 16 deletions


@@ -27,6 +27,7 @@ import Logs.Location
 import Utility.Tmp.Dir
 import Utility.Metered
 import Git.Types
+import qualified Database.Export as Export
 import Control.Concurrent.STM
 import Control.Concurrent.Async
@@ -65,8 +66,12 @@ proxySpecialRemoteSide clientmaxversion r = mkRemoteSide r $ do
 	owaitv <- liftIO newEmptyTMVarIO
 	iclosedv <- liftIO newEmptyTMVarIO
 	oclosedv <- liftIO newEmptyTMVarIO
+	exportdb <- ifM (Remote.isExportSupported r)
+		( Just <$> Export.openDb (Remote.uuid r)
+		, pure Nothing
+		)
 	worker <- liftIO . async =<< forkState
-		(proxySpecialRemote protoversion r ihdl ohdl owaitv oclosedv)
+		(proxySpecialRemote protoversion r ihdl ohdl owaitv oclosedv exportdb)
 	let remoteconn = P2PConnection
 		{ connRepo = Nothing
 		, connCheckAuth = const False
@@ -77,6 +82,7 @@ proxySpecialRemoteSide clientmaxversion r = mkRemoteSide r $ do
 	let closeremoteconn = do
 		liftIO $ atomically $ putTMVar oclosedv ()
 		join $ liftIO (wait worker)
+		maybe noop Export.closeDb exportdb
 	return $ Just
 		( remoterunst
 		, remoteconn
@@ -91,8 +97,9 @@ proxySpecialRemote
 	-> TMVar (Either L.ByteString Message)
 	-> TMVar ()
 	-> TMVar ()
+	-> Maybe Export.ExportHandle
 	-> Annex ()
-proxySpecialRemote protoversion r ihdl ohdl owaitv oclosedv = go
+proxySpecialRemote protoversion r ihdl ohdl owaitv oclosedv mexportdb = go
   where
 	go :: Annex ()
 	go = liftIO receivemessage >>= \case
@@ -169,7 +176,7 @@ proxySpecialRemote protoversion r ihdl ohdl owaitv oclosedv = go
 	proxyput af k = do
 		liftIO $ sendmessage $ PUT_FROM (Offset 0)
 		withproxytmpfile k $ \tmpfile -> do
-			let store = tryNonAsync (Remote.storeKey r k af (Just (decodeBS tmpfile)) nullMeterUpdate) >>= \case
+			let store = tryNonAsync (storeput k af (decodeBS tmpfile)) >>= \case
 				Right () -> liftIO $ sendmessage SUCCESS
 				Left err -> liftIO $ propagateerror err
 			liftIO receivemessage >>= \case
@@ -193,6 +200,25 @@ proxySpecialRemote protoversion r ihdl ohdl owaitv oclosedv = go
 				_ -> giveup "protocol error"
 			liftIO $ removeWhenExistsWith removeFile (fromRawFilePath tmpfile)
+	storeput k af tmpfile = case mexportdb of
+		Just exportdb -> liftIO (Export.getExportTree exportdb k) >>= \case
+			[] -> storeputkey k af tmpfile
+			locs -> do
+				havelocs <- liftIO $ S.fromList
+					<$> Export.getExportedLocation exportdb k
+				let locs' = filter (`S.notMember` havelocs) locs
+				forM_ locs' $ \loc ->
+					storeputexport exportdb k loc tmpfile
+				liftIO $ Export.flushDbQueue exportdb
+		Nothing -> storeputkey k af tmpfile
+	storeputkey k af tmpfile =
+		Remote.storeKey r k af (Just tmpfile) nullMeterUpdate
+	storeputexport exportdb k loc tmpfile = do
+		Remote.storeExport (Remote.exportActions r) tmpfile k loc nullMeterUpdate
+		liftIO $ Export.addExportedLocation exportdb k loc
 	receivetofile iv h n = liftIO receivebytestring >>= \case
 		Just b -> do
 			liftIO $ atomically $


@@ -34,7 +34,7 @@ import Control.DeepSeq
 -- PINNED in memory which caused memory fragmentation and excessive memory
 -- use.
 newtype ExportLocation = ExportLocation S.ShortByteString
-	deriving (Show, Eq, Generic)
+	deriving (Show, Eq, Generic, Ord)
 instance NFData ExportLocation
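
The added `Ord` instance is presumably what lets export locations be collected into a `Set`, as the new proxy code does with `S.fromList` and `S.notMember`. A minimal sketch of that requirement, using a hypothetical `Loc` newtype in place of `ExportLocation`:

```haskell
import qualified Data.Set as S

-- Hypothetical stand-in for ExportLocation, which really wraps a
-- ShortByteString rather than a String.
newtype Loc = Loc String
    deriving (Show, Eq, Ord)

-- Both S.fromList and S.notMember require Ord on the element type, so
-- this function would not compile if Loc derived only Show and Eq.
notYetExported :: [Loc] -> [Loc] -> [Loc]
notYetExported have want = filter (`S.notMember` haveSet) want
  where
    haveSet = S.fromList have

main :: IO ()
main = print (notYetExported [Loc "a"] [Loc "a", Loc "b"])
```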


@@ -41,21 +41,15 @@ Planned schedule of work:
 export not supported
 failed
-* These are only needed to support workflows other than `git-annex push`.
-(Since a push sends all content to the proxied remote and then pushes
-to the proxy, it happens to do things in an order where these are not
-necessary.)
-* `git-annex post-receive` of a proxied exporttree=yes special remote's
-annex-tracking-branch should check if the special remote contains all
-keys in the tree. If so, it can exporttree. If not, record
-the keys that are needed. (It could always exporttree,
-but better to avoid leaving it incomplete.)
-* After a key is received, the proxy should check if it's the *last* key
-that is needed to complete the export, and exporttree when so.
 * Prevent `enableproxy` from enabling an exporttree=yes special remote
 that does not have annexobjects=yes, to avoid foot shooting.
 * Handle cases where a single key is used by multiple files in the exported
 tree. Need to download from the special remote in order to export
-multiple copies to it.
+multiple copies to it. (In particular, this is needed when using
+`git-annex push`. When using first `git push` followed by
+`git-annex copy --to` the proxied remote, the received key is stored
+to all export locations.)
 * Handle case where the special remote does not support renameExport.
 Each key will need to be downloaded from it in order to export the key
 back to it, if the proxy is to support such a remote.