borg: Support importing files that are hard linked in the borg backup

Note that a key with no size field that is hard linked will
result in listImportableContents reporting a file size of 0,
rather than the actual size of the file. One result is that
the progress meter when getting the file will seem to get stuck
at 100%. Another is that the remote's preferred content expression,
if it tries to match against file size, will treat it as an empty file.
I don't see a way to improve the latter behavior, and the former behavior
is a minor enough problem.

This commit was sponsored by Jake Vosloo on Patreon.
This commit is contained in:
Joey Hess 2021-03-26 13:29:34 -04:00
parent 31eb5fddf3
commit f085ae4937
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
5 changed files with 42 additions and 15 deletions

View file

@ -27,6 +27,7 @@ git-annex (8.20210311) UNRELEASED; urgency=medium
* Fix build with attoparsec-0.14. * Fix build with attoparsec-0.14.
* Improved display of errors when accessing a git http remote fails. * Improved display of errors when accessing a git http remote fails.
* borg: Fix a bug that prevented importing keys of type URL and WORM. * borg: Fix a bug that prevented importing keys of type URL and WORM.
* borg: Support importing files that are hard linked in the borg backup.
-- Joey Hess <id@joeyh.name> Fri, 12 Mar 2021 12:06:37 -0400 -- Joey Hess <id@joeyh.name> Fri, 12 Mar 2021 12:06:37 -0400

View file

@ -1,10 +1,12 @@
{- Using borg as a remote. {- Using borg as a remote.
- -
- Copyright 2020 Joey Hess <id@joeyh.name> - Copyright 2020,2021 Joey Hess <id@joeyh.name>
- -
- Licensed under the GNU AGPL version 3 or higher. - Licensed under the GNU AGPL version 3 or higher.
-} -}
{-# LANGUAGE OverloadedStrings #-}
module Remote.Borg (remote) where module Remote.Borg (remote) where
import Annex.Common import Annex.Common
@ -26,6 +28,7 @@ import Types.ProposedAccepted
import Utility.Metered import Utility.Metered
import Logs.Export import Logs.Export
import qualified Remote.Helper.ThirdPartyPopulated as ThirdPartyPopulated import qualified Remote.Helper.ThirdPartyPopulated as ThirdPartyPopulated
import Utility.Env
import Data.Either import Data.Either
import Text.Read import Text.Read
@ -151,18 +154,19 @@ listImportableContentsM :: UUID -> BorgRepo -> ParsedRemoteConfig -> Annex (Mayb
listImportableContentsM u borgrepo c = prompt $ do listImportableContentsM u borgrepo c = prompt $ do
imported <- getImported u imported <- getImported u
ls <- withborglist borgrepo Nothing formatarchivelist $ \as -> ls <- withborglist borgrepo Nothing formatarchivelist $ \as ->
forM as $ \archivename -> forM (filter (not . S.null) as) $ \archivename ->
case M.lookup archivename imported of case M.lookup archivename imported of
Just getfast -> return $ Left (archivename, getfast) Just getfast -> return $ Left (archivename, getfast)
Nothing -> Right <$> Nothing -> Right <$>
let archive = borgArchive borgrepo archivename let archive = borgArchive borgrepo archivename
in withborglist archive subdir formatfilelist $ in withborglist archive subdir formatfilelist $
liftIO . evaluate . force . parsefilelist archivename liftIO . evaluate . force $ parsefilelist archivename
if all isLeft ls && M.null (M.difference imported (M.fromList (lefts ls))) if all isLeft ls && M.null (M.difference imported (M.fromList (lefts ls)))
then return Nothing -- unchanged since last time, avoid work then return Nothing -- unchanged since last time, avoid work
else Just . mkimportablecontents <$> mapM (either snd pure) ls else Just . mkimportablecontents <$> mapM (either snd pure) ls
where where
withborglist what addparam format a = do withborglist what addparam format a = do
environ <- liftIO getEnvironment
let p = proc "borg" $ toCommand $ catMaybes let p = proc "borg" $ toCommand $ catMaybes
[ Just (Param "list") [ Just (Param "list")
, Just (Param "--format") , Just (Param "--format")
@ -171,9 +175,13 @@ listImportableContentsM u borgrepo c = prompt $ do
, addparam , addparam
] ]
(Nothing, Just h, Nothing, pid) <- liftIO $ createProcess $ p (Nothing, Just h, Nothing, pid) <- liftIO $ createProcess $ p
{ std_out = CreatePipe } { std_out = CreatePipe
-- Run in C locale because the file list can
-- include some possibly translatable text in the
-- "extra" field.
, env = Just (addEntry "LANG" "C" environ)
}
l <- liftIO $ map L.toStrict l <- liftIO $ map L.toStrict
. filter (not . L.null)
. L.split 0 . L.split 0
<$> L.hGetContents h <$> L.hGetContents h
let cleanup = liftIO $ do let cleanup = liftIO $ do
@ -183,21 +191,31 @@ listImportableContentsM u borgrepo c = prompt $ do
formatarchivelist = "{barchive}{NUL}" formatarchivelist = "{barchive}{NUL}"
formatfilelist = "{size}{NUL}{path}{NUL}" formatfilelist = "{size}{NUL}{path}{NUL}{extra}{NUL}"
subdir = File <$> getRemoteConfigValue subdirField c subdir = File <$> getRemoteConfigValue subdirField c
parsefilelist archivename (bsz:f:rest) = case readMaybe (fromRawFilePath bsz) of parsefilelist archivename (bsz:f:extra:rest) = case readMaybe (fromRawFilePath bsz) of
Nothing -> parsefilelist archivename rest Nothing -> parsefilelist archivename rest
Just sz -> Just sz ->
let loc = genImportLocation archivename f let loc = genImportLocation archivename f
-- borg list reports hard links as 0 byte files,
-- with the extra field set to " link to ".
-- When the annex object is a hard link to
-- something else, we'll assume it has not been
-- modified, since usually git-annex does prevent
-- this. Since the 0 byte size is not the actual
-- size, report the key size instead, when available.
(reqsz, retsz) = case extra of
" link to " -> (Nothing, fromMaybe sz . fromKey keySize)
_ -> (Just sz, const sz)
-- This does a little unncessary work to parse the -- This does a little unncessary work to parse the
-- key, which is then thrown away. But, it lets the -- key, which is then thrown away. But, it lets the
-- file list be shrank down to only the ones that are -- file list be shrank down to only the ones that are
-- importable keys, so avoids needing to buffer all -- importable keys, so avoids needing to buffer all
-- the rest of the files in memory. -- the rest of the files in memory.
in case ThirdPartyPopulated.importKey' loc sz of in case ThirdPartyPopulated.importKey' loc reqsz of
Just _k -> (loc, (borgContentIdentifier, sz)) Just k -> (loc, (borgContentIdentifier, retsz k))
: parsefilelist archivename rest : parsefilelist archivename rest
Nothing -> parsefilelist archivename rest Nothing -> parsefilelist archivename rest
parsefilelist _ _ = [] parsefilelist _ _ = []

View file

@ -47,10 +47,10 @@ fromThirdPartyImportLocation =
-- find only those ImportLocations that are annex object files. -- find only those ImportLocations that are annex object files.
-- All other ImportLocations are ignored. -- All other ImportLocations are ignored.
importKey :: ImportLocation -> ContentIdentifier -> ByteSize -> MeterUpdate -> Annex (Maybe Key) importKey :: ImportLocation -> ContentIdentifier -> ByteSize -> MeterUpdate -> Annex (Maybe Key)
importKey loc _cid sz _ = return $ importKey' loc sz importKey loc _cid sz _ = return $ importKey' loc (Just sz)
importKey' :: ImportLocation -> ByteSize -> Maybe Key importKey' :: ImportLocation -> Maybe ByteSize -> Maybe Key
importKey' loc sz = case fileKey f of importKey' loc msz = case fileKey f of
Just k Just k
-- Annex objects always are in a subdirectory with the same -- Annex objects always are in a subdirectory with the same
-- name as the filename. If this is not the case for the file -- name as the filename. If this is not the case for the file
@ -75,11 +75,11 @@ importKey' loc sz = case fileKey f of
-- (eg, wrong data read off disk during backup, or the object -- (eg, wrong data read off disk during backup, or the object
-- was corrupt in the git-annex repo and that bad object got -- was corrupt in the git-annex repo and that bad object got
-- backed up), they can fsck the remote. -- backed up), they can fsck the remote.
| otherwise -> case fromKey keySize k of | otherwise -> case (msz, fromKey keySize k) of
Just sz' (Just sz, Just sz')
| sz' == sz -> Just k | sz' == sz -> Just k
| otherwise -> Nothing | otherwise -> Nothing
Nothing -> Just k _ -> Just k
Nothing -> Nothing Nothing -> Nothing
where where
p = fromImportLocation loc p = fromImportLocation loc

View file

@ -13,3 +13,4 @@ on NixOS 20.09
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders) ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Thank you for including borg, and for fixing gcrypt remotes! This is great! Thank you for including borg, and for fixing gcrypt remotes! This is great!
> [[fixed|done]] --[[Joey]]

View file

@ -0,0 +1,7 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2021-03-26T16:14:56Z"
content="""
Fixed the hard link bug.
"""]]