borg: Support importing files that are hard linked in the borg backup

Note that a key with no size field that is hard linked will
result in listImportableContents reporting a file size of 0,
rather than the actual size of the file. One result is that
the progress meter when getting the file will seem to get stuck
at 100%. Another is that the remote's preferred content expression,
if it tries to match against file size, will treat it as an empty file.
I don't see a way to improve the latter behavior, and the former behavior
is a minor enough problem.

This commit was sponsored by Jake Vosloo on Patreon.
This commit is contained in:
Joey Hess 2021-03-26 13:29:34 -04:00
parent 31eb5fddf3
commit f085ae4937
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 42 additions and 15 deletions

View file

@ -27,6 +27,7 @@ git-annex (8.20210311) UNRELEASED; urgency=medium
* Fix build with attoparsec-0.14.
* Improved display of errors when accessing a git http remote fails.
* borg: Fix a bug that prevented importing keys of type URL and WORM.
* borg: Support importing files that are hard linked in the borg backup.
-- Joey Hess <id@joeyh.name> Fri, 12 Mar 2021 12:06:37 -0400

View file

@ -1,10 +1,12 @@
{- Using borg as a remote.
-
- Copyright 2020 Joey Hess <id@joeyh.name>
- Copyright 2020,2021 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
{-# LANGUAGE OverloadedStrings #-}
module Remote.Borg (remote) where
import Annex.Common
@ -26,6 +28,7 @@ import Types.ProposedAccepted
import Utility.Metered
import Logs.Export
import qualified Remote.Helper.ThirdPartyPopulated as ThirdPartyPopulated
import Utility.Env
import Data.Either
import Text.Read
@ -151,18 +154,19 @@ listImportableContentsM :: UUID -> BorgRepo -> ParsedRemoteConfig -> Annex (Mayb
listImportableContentsM u borgrepo c = prompt $ do
imported <- getImported u
ls <- withborglist borgrepo Nothing formatarchivelist $ \as ->
forM as $ \archivename ->
forM (filter (not . S.null) as) $ \archivename ->
case M.lookup archivename imported of
Just getfast -> return $ Left (archivename, getfast)
Nothing -> Right <$>
let archive = borgArchive borgrepo archivename
in withborglist archive subdir formatfilelist $
liftIO . evaluate . force . parsefilelist archivename
liftIO . evaluate . force $ parsefilelist archivename
if all isLeft ls && M.null (M.difference imported (M.fromList (lefts ls)))
then return Nothing -- unchanged since last time, avoid work
else Just . mkimportablecontents <$> mapM (either snd pure) ls
where
withborglist what addparam format a = do
environ <- liftIO getEnvironment
let p = proc "borg" $ toCommand $ catMaybes
[ Just (Param "list")
, Just (Param "--format")
@ -171,9 +175,13 @@ listImportableContentsM u borgrepo c = prompt $ do
, addparam
]
(Nothing, Just h, Nothing, pid) <- liftIO $ createProcess $ p
{ std_out = CreatePipe }
{ std_out = CreatePipe
-- Run in C locale because the file list can
-- include some possibly translatable text in the
-- "extra" field.
, env = Just (addEntry "LANG" "C" environ)
}
l <- liftIO $ map L.toStrict
. filter (not . L.null)
. L.split 0
<$> L.hGetContents h
let cleanup = liftIO $ do
@ -183,21 +191,31 @@ listImportableContentsM u borgrepo c = prompt $ do
formatarchivelist = "{barchive}{NUL}"
formatfilelist = "{size}{NUL}{path}{NUL}"
formatfilelist = "{size}{NUL}{path}{NUL}{extra}{NUL}"
subdir = File <$> getRemoteConfigValue subdirField c
parsefilelist archivename (bsz:f:rest) = case readMaybe (fromRawFilePath bsz) of
parsefilelist archivename (bsz:f:extra:rest) = case readMaybe (fromRawFilePath bsz) of
Nothing -> parsefilelist archivename rest
Just sz ->
let loc = genImportLocation archivename f
-- borg list reports hard links as 0 byte files,
-- with the extra field set to " link to ".
-- When the annex object is a hard link to
-- something else, we'll assume it has not been
-- modified, since usually git-annex does prevent
-- this. Since the 0 byte size is not the actual
-- size, report the key size instead, when available.
(reqsz, retsz) = case extra of
" link to " -> (Nothing, fromMaybe sz . fromKey keySize)
_ -> (Just sz, const sz)
-- This does a little unncessary work to parse the
-- key, which is then thrown away. But, it lets the
-- file list be shrank down to only the ones that are
-- importable keys, so avoids needing to buffer all
-- the rest of the files in memory.
in case ThirdPartyPopulated.importKey' loc sz of
Just _k -> (loc, (borgContentIdentifier, sz))
in case ThirdPartyPopulated.importKey' loc reqsz of
Just k -> (loc, (borgContentIdentifier, retsz k))
: parsefilelist archivename rest
Nothing -> parsefilelist archivename rest
parsefilelist _ _ = []

View file

@ -47,10 +47,10 @@ fromThirdPartyImportLocation =
-- find only those ImportLocations that are annex object files.
-- All other ImportLocations are ignored.
importKey :: ImportLocation -> ContentIdentifier -> ByteSize -> MeterUpdate -> Annex (Maybe Key)
importKey loc _cid sz _ = return $ importKey' loc sz
importKey loc _cid sz _ = return $ importKey' loc (Just sz)
importKey' :: ImportLocation -> ByteSize -> Maybe Key
importKey' loc sz = case fileKey f of
importKey' :: ImportLocation -> Maybe ByteSize -> Maybe Key
importKey' loc msz = case fileKey f of
Just k
-- Annex objects always are in a subdirectory with the same
-- name as the filename. If this is not the case for the file
@ -75,11 +75,11 @@ importKey' loc sz = case fileKey f of
-- (eg, wrong data read off disk during backup, or the object
-- was corrupt in the git-annex repo and that bad object got
-- backed up), they can fsck the remote.
| otherwise -> case fromKey keySize k of
Just sz'
| otherwise -> case (msz, fromKey keySize k) of
(Just sz, Just sz')
| sz' == sz -> Just k
| otherwise -> Nothing
Nothing -> Just k
_ -> Just k
Nothing -> Nothing
where
p = fromImportLocation loc

View file

@ -13,3 +13,4 @@ on NixOS 20.09
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Thank you for including borg, and for fixing gcrypt remotes! This is great!
> [[fixed|done]] --[[Joey]]

View file

@ -0,0 +1,7 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2021-03-26T16:14:56Z"
content="""
Fixed the hard link bug.
"""]]