add thirdPartyPopulated interface

This is to support, eg a borg repo as a special remote, which is
populated not by running git-annex commands, but by using borg. Then
git-annex sync lists the content of the remote, learns which files are
annex objects, and treats those as present in the remote.

So, most of the import machinery is reused, to a new purpose. While
normally importtree maintains a remote tracking branch, this does not,
because the files stored in the remote are annex object files, not
user-visible filenames. But, internally, a git tree is still generated,
of the files on the remote that are annex objects. This tree is used
by retrieveExportWithContentIdentifier, etc. As with other import/export
remotes, that  the tree is recorded in the export log, and gets grafted
into the git-annex branch.

importKey changed to be able to return Nothing, to indicate when an
ImportLocation is not an annex object and so should be skipped from
being included in the tree.

It did not seem to make sense to have git-annex import do this, since
from the user's perspective, it's not like other imports. So only
git-annex sync does it.

Note that, git-annex sync does not yet download objects from such
remotes that are preferred content. importKeys is run with
content downloading disabled, to avoid getting the content of all
objects. Perhaps what's needed is for seekSyncContent to be run with these
remotes, but I don't know if it will just work (in particular, it needs
to avoid trying to transfer objects to them), so I skipped that for now.

(Untested and unused as of yet.)

This commit was sponsored by Jochen Bartl on Patreon.
This commit is contained in:
Joey Hess 2020-12-18 14:52:57 -04:00
parent 037f8b6863
commit 9a2c8757f3
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
23 changed files with 176 additions and 77 deletions

View file

@ -67,7 +67,7 @@ import Annex.UpdateInstead
import Annex.Export
import Annex.TaggedPush
import Annex.CurrentBranch
import Annex.Import (canImportKeys)
import Annex.Import
import Annex.CheckIgnore
import Types.FileMatcher
import qualified Database.Export as Export
@ -77,6 +77,7 @@ import Utility.Process.Transcript
import Utility.Tuple
import Control.Concurrent.MVar
import Control.Concurrent.STM
import qualified Data.Map as M
import qualified Data.ByteString as S
import Data.Char
@ -463,6 +464,9 @@ pullRemote o mergeconfig remote branch = stopUnless (pure $ pullOption o && want
importRemote :: Bool -> SyncOptions -> [Git.Merge.MergeConfig] -> Remote -> CurrBranch -> CommandSeek
importRemote importcontent o mergeconfig remote currbranch
| not (pullOption o) || not wantpull = noop
| Remote.thirdPartyPopulated (Remote.remotetype remote) =
when (canImportKeys remote importcontent) $
importThirdPartyPopulated remote
| otherwise = case remoteAnnexTrackingBranch (Remote.gitconfig remote) of
Nothing -> noop
Just tb -> do
@ -479,6 +483,34 @@ importRemote importcontent o mergeconfig remote currbranch
where
wantpull = remoteAnnexPull (Remote.gitconfig remote)
{- Import from a remote that is populated by a third party, by listing
- the contents of the remote, and then adding only the files on it that
- importKey identifies to a tree. The tree is only used to keep track
- of where keys are located on the remote, no remote tracking branch is
- updated, because the filenames are the names of annex object files,
- not suitable for a tracking branch. Does not transfer any content. -}
importThirdPartyPopulated :: Remote -> CommandSeek
importThirdPartyPopulated remote = do
importabletvar <- liftIO $ newTVarIO Nothing
void $ includeCommandAction (Command.Import.listContents remote ImportTree (CheckGitIgnore False) importabletvar)
liftIO (atomically (readTVar importabletvar)) >>= \case
Nothing -> return ()
Just importable ->
importKeys remote ImportTree False False importable >>= \case
Just importablekeys -> go importablekeys
Nothing -> warning $ concat
[ "Failed to import from"
, Remote.name remote
]
where
go importablekeys = void $ includeCommandAction $ starting "pull" ai si $ do
(_imported, updatestate) <- recordImportTree remote ImportTree importablekeys
next $ do
updatestate
return True
ai = ActionItemOther (Just (Remote.name remote))
si = SeekInput []
{- The remote probably has both a master and a synced/master branch.
- Which to merge from? Well, the master has whatever latest changes
- were committed (or pushed changes, if this is a bare remote),