global numcopies setting

* numcopies: New command, sets global numcopies value that is seen by all
  clones of a repository.
* The annex.numcopies git config setting is deprecated. Once the numcopies
  command is used to set the global number of copies, any annex.numcopies
  git configs will be ignored.
* assistant: Make the prefs page set the global numcopies.

This global numcopies setting is needed to let preferred content
expressions operate on numcopies.

It's also convenient, because typically if you want git-annex to preserve N
copies of files in a repo, you want it to do that no matter which repo it's
running in. Making it global avoids needing to warn the user about gotchas
involving inconsistent annex.numcopies settings.
(See changes to doc/numcopies.mdwn.)

Added a new variety of git-annex branch log file, that holds only 1 value.
Will probably be useful for other stuff later.

This commit was sponsored by Nicolas Pouillard.
This commit is contained in:
Joey Hess 2014-01-20 16:47:56 -04:00
parent ecd4c35d7e
commit d66535f065
25 changed files with 246 additions and 45 deletions

View file

@ -94,7 +94,7 @@ data AnnexState = AnnexState
, checkattrhandle :: Maybe CheckAttrHandle
, checkignorehandle :: Maybe (Maybe CheckIgnoreHandle)
, forcebackend :: Maybe String
, forcenumcopies :: Maybe Int
, globalnumcopies :: Maybe Int
, limit :: Matcher (MatchInfo -> Annex Bool)
, uuidmap :: Maybe UUIDMap
, preferredcontentmap :: Maybe PreferredContentMap
@ -129,7 +129,7 @@ newState c r = AnnexState
, checkattrhandle = Nothing
, checkignorehandle = Nothing
, forcebackend = Nothing
, forcenumcopies = Nothing
, globalnumcopies = Nothing
, limit = Left []
, uuidmap = Nothing
, preferredcontentmap = Nothing

View file

@ -41,6 +41,7 @@ dropDead f content trustmap = case getLogVariety f of
in if null newlog
then RemoveFile
else ChangeFile $ Presence.showLog newlog
Just SingleValueLog -> PreserveFile
Nothing -> PreserveFile
dropDeadFromUUIDBasedLog :: TrustMap -> UUIDBased.Log String -> UUIDBased.Log String

View file

@ -8,7 +8,6 @@
module Annex.Drop where
import Common.Annex
import Logs.Location
import Logs.Trust
import Types.Remote (uuid)
import qualified Remote
@ -18,7 +17,6 @@ import Annex.Wanted
import Annex.Exception
import Config
import Annex.Content.Direct
import RunCommand
import qualified Data.Set as S
import System.Log.Logger (debugM)

View file

@ -17,6 +17,7 @@ import Logs.UUID
import Logs.Trust
import Logs.PreferredContent
import Logs.Group
import Logs.NumCopies
import Remote.List (remoteListRefresh)
import qualified Git.LsTree as LsTree
import Git.FilePath
@ -59,6 +60,7 @@ configFilesActions =
, (remoteLog, void $ liftAnnex remoteListRefresh)
, (trustLog, void $ liftAnnex trustMapLoad)
, (groupLog, void $ liftAnnex groupMapLoad)
, (numcopiesLog, void $ liftAnnex numCopiesLoad)
, (scheduleLog, void updateScheduleLog)
-- Preferred content settings depend on most of the other configs,
-- so will be reloaded whenever any configs change.

View file

@ -21,6 +21,7 @@ import Utility.DataUnits
import Git.Config
import Types.Distribution
import qualified Build.SysConfig
import Logs.NumCopies
import qualified Data.Text as T
@ -81,7 +82,7 @@ prefsAForm def = PrefsForm
getPrefs :: Annex PrefsForm
getPrefs = PrefsForm
<$> (T.pack . roughSize storageUnits False . annexDiskReserve <$> Annex.getGitConfig)
<*> (annexNumCopies <$> Annex.getGitConfig)
<*> (maybe deprecatedNumCopies return =<< getGlobalNumCopies)
<*> inAutoStartFile
<*> (annexAutoUpgrade <$> Annex.getGitConfig)
<*> (annexDebug <$> Annex.getGitConfig)
@ -89,7 +90,8 @@ getPrefs = PrefsForm
storePrefs :: PrefsForm -> Annex ()
storePrefs p = do
setConfig (annexConfig "diskreserve") (T.unpack $ diskReserve p)
setConfig (annexConfig "numcopies") (show $ numCopies p)
setGlobalNumCopies (numCopies p)
unsetConfig (annexConfig "numcopies") -- deprecated
setConfig (annexConfig "autoupgrade") (fromAutoUpgrade $ autoUpgrade p)
unlessM ((==) <$> pure (autoStart p) <*> inAutoStartFile) $ do
here <- fromRepo Git.repoPath

View file

@ -37,6 +37,7 @@ import Checks as ReExported
import Usage as ReExported
import RunCommand as ReExported
import Logs.Trust
import Logs.NumCopies
import Config
import Annex.CheckAttr
@ -88,8 +89,8 @@ isBareRepo = fromRepo Git.repoIsLocalBare
numCopies :: FilePath -> Annex (Maybe Int)
numCopies file = do
forced <- Annex.getState Annex.forcenumcopies
case forced of
global <- getGlobalNumCopies
case global of
Just n -> return $ Just n
Nothing -> readish <$> checkAttr "annex.numcopies" file

View file

@ -139,7 +139,7 @@ notEnoughCopies key need have skip bad = do
return False
where
unsafe = showNote "unsafe"
hint = showLongNote "(Use --force to override this check, or adjust annex.numcopies.)"
hint = showLongNote "(Use --force to override this check, or adjust numcopies.)"
{- In auto mode, only runs the action if there are enough
- copies on other semitrusted repositories.

View file

@ -64,7 +64,7 @@ showMoveAction False key Nothing = showStart "copy" (key2file key)
- If the remote already has the content, it is still removed from
- the current repository.
-
- Note that unlike drop, this does not honor annex.numcopies.
- Note that unlike drop, this does not honor numcopies.
- A file's content can be moved even if there are insufficient copies to
- allow it to be dropped.
-}

56
Command/NumCopies.hs Normal file
View file

@ -0,0 +1,56 @@
{- git-annex command
-
- Copyright 2014 Joey Hess <joey@kitenet.net>
-
- Licensed under the GNU GPL version 3 or higher.
-}
module Command.NumCopies where
import Common.Annex
import qualified Annex
import Command
import Logs.NumCopies
import Types.Messages
def :: [Command]
def = [command "numcopies" paramNumber seek
SectionSetup "configure desired number of copies"]
seek :: CommandSeek
seek = withWords start
start :: [String] -> CommandStart
start [] = startGet
start [s] = do
case readish s of
Nothing -> error $ "Bad number: " ++ s
Just n
| n > 0 -> startSet n
| n == 0 -> ifM (Annex.getState Annex.force)
( startSet n
, error "Setting numcopies to 0 is very unsafe. You will lose data! If you really want to do that, specify --force."
)
| otherwise -> error "Number cannot be negative!"
start _ = error "Specify a single number."
startGet :: CommandStart
startGet = next $ next $ do
Annex.setOutput QuietOutput
v <- getGlobalNumCopies
case v of
Just n -> liftIO $ putStrLn $ show n
Nothing -> do
liftIO $ putStrLn $ "global numcopies is not set"
old <- annexNumCopies <$> Annex.getGitConfig
case old of
Nothing -> liftIO $ putStrLn "(default is 1)"
Just n -> liftIO $ putStrLn $ "(deprecated git config annex.numcopies is set to " ++ show n ++ " locally)"
return True
startSet :: Int -> CommandStart
startSet n = do
showStart "numcopies" (show n)
next $ next $ do
setGlobalNumCopies n
return True

View file

@ -71,7 +71,10 @@ setRemoteAvailability r c = setConfig (remoteConfig r "availability") (show c)
getNumCopies :: Maybe Int -> Annex Int
getNumCopies (Just v) = return v
getNumCopies Nothing = annexNumCopies <$> Annex.getGitConfig
getNumCopies Nothing = deprecatedNumCopies
deprecatedNumCopies :: Annex Int
deprecatedNumCopies = fromMaybe 1 . annexNumCopies <$> Annex.getGitConfig
isDirect :: Annex Bool
isDirect = annexDirect <$> Annex.getGitConfig

View file

@ -50,6 +50,7 @@ import qualified Command.Info
import qualified Command.Status
import qualified Command.Migrate
import qualified Command.Uninit
import qualified Command.NumCopies
import qualified Command.Trust
import qualified Command.Untrust
import qualified Command.Semitrust
@ -117,6 +118,7 @@ cmds = concat
, Command.Unannex.def
, Command.Uninit.def
, Command.PreCommit.def
, Command.NumCopies.def
, Command.Trust.def
, Command.Untrust.def
, Command.Semitrust.def

View file

@ -63,7 +63,7 @@ options = Option.common ++
where
trustArg t = ReqArg (Remote.forceTrust t) paramRemote
setnumcopies v = maybe noop
(\n -> Annex.changeState $ \s -> s { Annex.forcenumcopies = Just n })
(\n -> Annex.changeState $ \s -> s { Annex.globalnumcopies = Just n })
(readish v)
setuseragent v = Annex.changeState $ \s -> s { Annex.useragent = Just v }
setgitconfig v = inRepo (Git.Config.store v)

13
Logs.hs
View file

@ -11,7 +11,11 @@ import Common.Annex
import Types.Key
{- There are several varieties of log file formats. -}
data LogVariety = UUIDBasedLog | NewUUIDBasedLog | PresenceLog Key
data LogVariety
= UUIDBasedLog
| NewUUIDBasedLog
| PresenceLog Key
| SingleValueLog
deriving (Show)
{- Converts a path from the git-annex branch into one of the varieties
@ -20,6 +24,7 @@ getLogVariety :: FilePath -> Maybe LogVariety
getLogVariety f
| f `elem` topLevelUUIDBasedLogs = Just UUIDBasedLog
| isRemoteStateLog f = Just NewUUIDBasedLog
| f == numcopiesLog = Just SingleValueLog
| otherwise = PresenceLog <$> firstJust (presenceLogs f)
{- All the uuid-based logs stored in the top of the git-annex branch. -}
@ -43,6 +48,9 @@ presenceLogs f =
uuidLog :: FilePath
uuidLog = "uuid.log"
numcopiesLog :: FilePath
numcopiesLog = "numcopies.log"
remoteLog :: FilePath
remoteLog = "remote.log"
@ -118,6 +126,7 @@ prop_logs_sane dummykey = all id
, expect isPresenceLog (getLogVariety $ locationLogFile dummykey)
, expect isPresenceLog (getLogVariety $ urlLogFile dummykey)
, expect isNewUUIDBasedLog (getLogVariety $ remoteStateLogFile dummykey)
, expect isSingleValueLog (getLogVariety $ numcopiesLog)
]
where
expect = maybe False
@ -127,3 +136,5 @@ prop_logs_sane dummykey = all id
isNewUUIDBasedLog _ = False
isPresenceLog (PresenceLog k) = k == dummykey
isPresenceLog _ = False
isSingleValueLog SingleValueLog = True
isSingleValueLog _ = False

33
Logs/NumCopies.hs Normal file
View file

@ -0,0 +1,33 @@
{- git-annex numcopies log
-
- Copyright 2014 Joey Hess <joey@kitenet.net>
-
- Licensed under the GNU GPL version 3 or higher.
-}
{-# OPTIONS_GHC -fno-warn-orphans #-}
module Logs.NumCopies where
import Common.Annex
import qualified Annex
import Logs
import Logs.SingleValue
instance Serializable Int where
serialize = show
deserialize = readish
setGlobalNumCopies :: Int -> Annex ()
setGlobalNumCopies = setLog numcopiesLog
{- Cached for speed. -}
getGlobalNumCopies :: Annex (Maybe Int)
getGlobalNumCopies = maybe numCopiesLoad (return . Just)
=<< Annex.getState Annex.globalnumcopies
numCopiesLoad :: Annex (Maybe Int)
numCopiesLoad = do
v <- getLog numcopiesLog
Annex.changeState $ \s -> s { Annex.globalnumcopies = v }
return v

65
Logs/SingleValue.hs Normal file
View file

@ -0,0 +1,65 @@
{- git-annex single-value log
-
- This is used to store a value in a way that can be union merged.
-
- A line of the log will look like: "timestamp value"
-
- The line with the newest timestamp wins.
-
- Copyright 2014 Joey Hess <joey@kitenet.net>
-
- Licensed under the GNU GPL version 3 or higher.
-}
module Logs.SingleValue where
import Common.Annex
import qualified Annex.Branch
import qualified Data.Set as S
import Data.Time.Clock.POSIX
import Data.Time
import System.Locale
class Serializable v where
serialize :: v -> String
deserialize :: String -> Maybe v
data LogEntry v = LogEntry
{ changed :: POSIXTime
, value :: v
} deriving (Eq, Show, Ord)
type Log v = S.Set (LogEntry v)
showLog :: (Serializable v) => Log v -> String
showLog = unlines . map showline . S.toList
where
showline (LogEntry t v) = unwords [show t, serialize v]
parseLog :: (Ord v, Serializable v) => String -> Log v
parseLog = S.fromList . mapMaybe parse . lines
where
parse line = do
let (ts, s) = splitword line
date <- utcTimeToPOSIXSeconds <$> parseTime defaultTimeLocale "%s%Qs" ts
v <- deserialize s
Just (LogEntry date v)
splitword = separate (== ' ')
newestValue :: Log v -> Maybe v
newestValue s
| S.null s = Nothing
| otherwise = Just (value $ S.findMax s)
readLog :: (Ord v, Serializable v) => FilePath -> Annex (Log v)
readLog = parseLog <$$> Annex.Branch.get
getLog :: (Ord v, Serializable v) => FilePath -> Annex (Maybe v)
getLog = newestValue <$$> readLog
setLog :: (Serializable v) => FilePath -> v -> Annex ()
setLog f v = do
now <- liftIO getPOSIXTime
let ent = LogEntry now v
Annex.Branch.change f $ \_old -> showLog (S.singleton ent)

View file

@ -292,6 +292,9 @@ test_drop_withremote :: TestEnv -> Assertion
test_drop_withremote env = intmpclonerepo env $ do
git_annex env "get" [annexedfile] @? "get failed"
annexed_present annexedfile
git_annex env "numcopies" ["2"] @? "numcopies config failed"
not <$> git_annex env "drop" [annexedfile] @? "drop succeeded although numcopies is not satisfied"
git_annex env "numcopies" ["1"] @? "numcopies config failed"
git_annex env "drop" [annexedfile] @? "drop failed though origin has copy"
annexed_notpresent annexedfile
inmainrepo env $ annexed_present annexedfile
@ -511,9 +514,9 @@ test_trust env = intmpclonerepo env $ do
test_fsck_basic :: TestEnv -> Assertion
test_fsck_basic env = intmpclonerepo env $ do
git_annex env "fsck" [] @? "fsck failed"
boolSystem "git" [Params "config annex.numcopies 2"] @? "git config failed"
git_annex env "numcopies" ["2"] @? "numcopies config failed"
fsck_should_fail env "numcopies unsatisfied"
boolSystem "git" [Params "config annex.numcopies 1"] @? "git config failed"
git_annex env "numcopies" ["1"] @? "numcopies config failed"
corrupt annexedfile
corrupt sha1annexedfile
where
@ -542,7 +545,7 @@ test_fsck_localuntrusted env = intmpclonerepo env $ do
test_fsck_remoteuntrusted :: TestEnv -> Assertion
test_fsck_remoteuntrusted env = intmpclonerepo env $ do
boolSystem "git" [Params "config annex.numcopies 2"] @? "git config failed"
git_annex env "numcopies" ["2"] @? "numcopies config failed"
git_annex env "get" [annexedfile] @? "get failed"
git_annex env "get" [sha1annexedfile] @? "get failed"
git_annex env "fsck" [] @? "fsck failed with numcopies=2 and 2 copies"

View file

@ -24,7 +24,7 @@ import Types.Availability
- such as annex.foo -}
data GitConfig = GitConfig
{ annexVersion :: Maybe String
, annexNumCopies :: Int
, annexNumCopies :: Maybe Int
, annexDiskReserve :: Integer
, annexDirect :: Bool
, annexBackends :: [String]
@ -52,7 +52,7 @@ data GitConfig = GitConfig
extractGitConfig :: Git.Repo -> GitConfig
extractGitConfig r = GitConfig
{ annexVersion = notempty $ getmaybe (annex "version")
, annexNumCopies = get (annex "numcopies") 1
, annexNumCopies = getmayberead (annex "numcopies")
, annexDiskReserve = fromMaybe onemegabyte $
readSize dataUnits =<< getmaybe (annex "diskreserve")
, annexDirect = getbool (annex "direct") False

6
debian/changelog vendored
View file

@ -8,6 +8,12 @@ git-annex (5.20140118) UNRELEASED; urgency=medium
* list: Fix specifying of files to list.
* Allow --all to be mixed with matching options like --copies and --in
(but not --include and --exclude).
* numcopies: New command, sets global numcopies value that is seen by all
clones of a repository.
* The annex.numcopies git config setting is deprecated. Once the numcopies
command is used to set the global number of copies, any annex.numcopies
git configs will be ignored.
* assistant: Make the prefs page set the global numcopies.
-- Joey Hess <joeyh@debian.org> Sat, 18 Jan 2014 11:54:17 -0400

View file

@ -6,8 +6,8 @@ command. So, git-annex can be configured to try
to keep N copies of a file's content available across all repositories.
(Although [[untrusted_repositories|trust]] don't count toward this total.)
By default, N is 1; it is configured by annex.numcopies. This default
can be overridden on a per-file-type basis by the annex.numcopies
By default, N is 1; it is configured by running `git annex numcopies N`.
This default can be overridden on a per-file-type basis by the annex.numcopies
setting in `.gitattributes` files. The --numcopies switch allows
temporarily using a different value.
@ -30,9 +30,3 @@ refuse to do so.
With N=2, in order to drop the file content from Laptop, it would need access
to both USB and Server.
Note that different repositories can be configured with different values of
N. So just because Laptop has N=2, this does not prevent the number of
copies falling to 1, when USB and Server have N=1. To avoid this,
configure it in `.gitattributes`, which is shared between repositories
using git.

View file

@ -403,6 +403,20 @@ subdirectories).
keyid+= and keyid-= with such remotes should be used with care, and
make little sense except in cases like the revoked key example above.
* `numcopies [N]`
Tells git-annex how many copies it should preserve of files, over all
repositories. The default is 1.
Run without a number to get the current value.
When git-annex is asked to drop a file, it first verifies that the
required number of copies can be satisfied amoung all the other
repositories that have a copy of the file.
This can be overridden on a per-file basis by the annex.numcopies setting
in .gitattributes files.
* `trust [repository ...]`
Records that a repository is trusted to not unexpectedly lose
@ -828,7 +842,7 @@ subdirectories).
* `--auto`
Enable automatic mode. Commands that get, drop, or move file contents
will only do so when needed to help satisfy the setting of annex.numcopies,
will only do so when needed to help satisfy the setting of numcopies,
and preferred content configuration.
* `--all`
@ -883,7 +897,7 @@ subdirectories).
* `--numcopies=n`
Overrides the `annex.numcopies` setting, forcing git-annex to ensure the
Overrides the numcopies setting, forcing git-annex to ensure the
specified number of copies exist.
Note that setting numcopies to 0 is very unsafe.
@ -1117,12 +1131,6 @@ Here are all the supported configuration settings.
A unique UUID for this repository (automatically set).
* `annex.numcopies`
Number of copies of files to keep across all repositories. (default: 1)
Note that setting numcopies to 0 is very unsafe.
* `annex.backends`
Space-separated list of names of the key-value backends to use.
@ -1151,6 +1159,17 @@ Here are all the supported configuration settings.
annex.largefiles = largerthan=100kb and not (include=*.c or include=*.h)
* `annex.numcopies`
This is a deprecated setting. You should instead use the
`git annex numcopies` command to configure how many copies of files
are kept acros all repositories.
This config setting is only looked at when `git annex numcopies` has
never been configured.
Note that setting numcopies to 0 is very unsafe.
* `annex.queuesize`
git-annex builds a queue of git commands, in order to combine similar
@ -1456,10 +1475,12 @@ but the SHA256E backend for ogg files:
The numcopies setting can also be configured on a per-file-type basis via
the `annex.numcopies` attribute in `.gitattributes` files. This overrides
any value set using `annex.numcopies` in `.git/config`.
For example, this makes two copies be needed for wav files:
other numcopies settings.
For example, this makes two copies be needed for wav files and 3 copies
for flac files:
*.wav annex.numcopies=2
*.flac annex.numcopies=3
Note that setting numcopies to 0 is very unsafe.

View file

@ -56,8 +56,11 @@ space and then the description, followed by a timestamp. Example:
e605dca6-446a-11e0-8b2a-002170d25c55 laptop timestamp=1317929189.157237s
26339d22-446b-11e0-9101-002170d25c55 usb disk timestamp=1317929330.769997s
If there are multiple lines for the same uuid, the one with the most recent
timestamp wins. git-annex union merges this and other files.
## `numcopies.log`
Records the global numcopies setting.
The file format is simply a timestamp followed by a number.
## `remote.log`

View file

@ -34,7 +34,7 @@ With the result that it will hang onto files:
Could only verify the existence of 0 out of 1 necessary copies
Also these untrusted repositories may contain the file:
00000000-0000-0000-0000-000000000001 -- web
(Use --force to override this check, or adjust annex.numcopies.)
(Use --force to override this check, or adjust numcopies.)
failed
## attaching urls to existing files

View file

@ -47,10 +47,10 @@ work tree, which gitattributes settings do not.
Conclusion:
* Add to the git-annex branch a numcopies file that holds the global
numcopies default if present.
* Modify the assistant to use it when configuring numcopies.
numcopies default if present. **done**
* Modify the assistant to use it when configuring numcopies. **done**
* To deprecate .git/config's annex.numcopies, only make it take effect
when there is no numcopies file in the git-annex branch.
when there is no numcopies file in the git-annex branch. **done**
* Add "numcopiesneeded=N" preferred content expression using the git-annex
branch numcopies setting, overridden by any .gitattributes numcopies setting
for a particular file. It should ignore the other ways to specify

View file

@ -2,7 +2,7 @@ You can use the fsck subcommand to check for problems in your data. What
can be checked depends on the key-value [[backend|backends]] you've used
for the data. For example, when you use the SHA1 backend, fsck will verify
that the checksums of your files are good. Fsck also checks that the
annex.numcopies setting is satisfied for all files.
[[numcopies|copies]] setting is satisfied for all files.
# git annex fsck
fsck some_file (checksum...) ok

View file

@ -10,12 +10,12 @@ you'll see something like this.
Try making some of these repositories available:
58d84e8a-d9ae-11df-a1aa-ab9aa8c00826 -- portable USB drive
ca20064c-dbb5-11df-b2fe-002170d25c55 -- backup SATA drive
(Use --force to override this check, or adjust annex.numcopies.)
(Use --force to override this check, or adjust numcopies.)
failed
drop other.iso (unsafe)
Could only verify the existence of 0 out of 1 necessary copies
No other repository is known to contain the file.
(Use --force to override this check, or adjust annex.numcopies.)
(Use --force to override this check, or adjust numcopies.)
failed
Here you might --force it to drop `important_file` if you [[trust]] your backup.