Replace "in=" with "present" in preferred content expressions

in= was problematic in two ways. First, it referred to a remote by name,
but preferred content expressions can be evaluated elsewhere, where that
remote doesn't exist, or a different remote has the same name. This name
lookup code could error out at runtime. Secondly, in= seemed pretty useless.
in=here did not cause content to be gotten, but it did let present content
be dropped.

present is more useful, although "not present" is unstable and should be
avoided.
Joey Hess 2012-10-19 16:09:21 -04:00
parent 3417c55189
commit 40aab719df
4 changed files with 73 additions and 14 deletions


@ -113,6 +113,20 @@ limitIn name = Right $ \notpresent -> check $
then return False
else inAnnex key
{- Limit to content that is currently present on a uuid. -}
limitPresent :: Maybe UUID -> MkLimit
limitPresent u name = Right $ const $ check $ \key -> do
    hereu <- getUUID
    if u == Just hereu || u == Nothing
        then inAnnex key
        else do
            us <- Remote.keyLocations key
            return $ maybe False (`elem` us) u
  where
    check a = lookupFile >=> handle a
    handle _ Nothing = return False
    handle a (Just (key, _)) = a key
{- Adds a limit to skip files not believed to have the specified number
- of copies. -}
addCopies :: String -> Annex ()
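
The decision that `limitPresent` makes can be looked at in isolation. The following is only a minimal pure sketch, not code from this commit: `presentOn` is a hypothetical name, UUIDs are assumed to be plain strings, and the results of `inAnnex` and `Remote.keyLocations` are assumed to be passed in as ordinary arguments rather than looked up in the Annex monad.

```haskell
-- Hypothetical stand-in for git-annex's UUID type.
type UUID = String

-- | Pure model of the check in limitPresent (an illustration only).
--
-- target      : the repository the expression is evaluated for, if known
-- here        : the UUID of the repository running the check
-- inAnnexHere : whether the content is in the local annex right now
-- locations   : UUIDs of the repositories believed to have the content
presentOn :: Maybe UUID -> UUID -> Bool -> [UUID] -> Bool
presentOn target here inAnnexHere locations
    -- No target, or the target is this repository: trust the local annex.
    | target == Just here || target == Nothing = inAnnexHere
    -- Some other repository: consult the recorded locations instead.
    | otherwise = maybe False (`elem` locations) target
```

For instance, `presentOn (Just "b") "a" True ["b"]` consults the location list because the target is not the local repository, while `presentOn Nothing "a" True []` falls back to the local annex.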


@ -88,7 +88,7 @@ makeMatcher groupmap u s
| null (lefts tokens) = Utility.Matcher.generate $ rights tokens
| otherwise = matchAll
where
tokens = map (parseToken groupmap) (tokenizeMatcher s)
tokens = map (parseToken (Just u) groupmap) (tokenizeMatcher s)
{- Standard matchers are pre-defined for some groups. If none is defined,
- or a repository is in multiple groups with standard matchers, match all. -}
@ -103,26 +103,26 @@ matchAll = Utility.Matcher.generate []
checkPreferredContentExpression :: String -> Maybe String
checkPreferredContentExpression s
| s == "standard" = Nothing
| otherwise = case lefts $ map (parseToken emptyGroupMap) (tokenizeMatcher s) of
| otherwise = case lefts $ map (parseToken Nothing emptyGroupMap) (tokenizeMatcher s) of
[] -> Nothing
l -> Just $ unwords $ map ("Parse failure: " ++) l
parseToken :: GroupMap -> String -> Either String (Utility.Matcher.Token MatchFiles)
parseToken groupmap t
parseToken :: (Maybe UUID) -> GroupMap -> String -> Either String (Utility.Matcher.Token MatchFiles)
parseToken mu groupmap t
| any (== t) Utility.Matcher.tokens = Right $ Utility.Matcher.token t
| otherwise = maybe (Left $ "near " ++ show t) use $ M.lookup k m
where
(k, v) = separate (== '=') t
m = M.fromList
| t == "present" = use $ limitPresent mu
| otherwise = maybe (Left $ "near " ++ show t) use $ M.lookup k $
M.fromList
[ ("include", limitInclude)
, ("exclude", limitExclude)
, ("in", limitIn)
, ("copies", limitCopies)
, ("inbackend", limitInBackend)
, ("largerthan", limitSize (>))
, ("smallerthan", limitSize (<))
, ("inallgroup", limitInAllGroup groupmap)
]
where
(k, v) = separate (== '=') t
use a = Utility.Matcher.Operation <$> a v
{- This is really dumb tokenization; there's no support for quoted values.

debian/changelog

@ -2,6 +2,8 @@ git-annex (3.20121018) UNRELEASED; urgency=low
* Fix handling of GIT_DIR when it refers to a git submodule.
* Preferred content path matching bugfix.
* Preferred content expressions cannot use "in=".
* Preferred content expressions can use "present".
-- Joey Hess <joeyh@debian.org> Wed, 17 Oct 2012 14:24:10 -0400


@ -20,17 +20,18 @@ The expressions are very similar to the file matching options documented
on the [[git-annex]] man page. At the command line, you can use those
options in commands like this:
git annex get --include='*.mp3' --and -'(' --not --in=archive -')'
git annex get --include='*.mp3' --and -'(' --not --largerthan=100mb -')'
The equivalent preferred content expression looks like this:
include=*.mp3 and (not in=archive)
include=*.mp3 and (not largerthan=100mb)
So, just remove the dashes, basically.
So, just remove the dashes, basically. However, there are some differences
from the command line options to keep in mind:
## file matching
### difference: file matching
Note that while --include and --exclude match files relative to the current
While --include and --exclude match files relative to the current
directory, preferred content expressions always match files relative to the
top of the git repository. Perhaps you put files into `archive` directories
when you're done with them. Then you could configure your laptop to prefer
@ -38,6 +39,48 @@ to not retain those files, like this:
exclude=*/archive/*
### difference: no "in="
Preferred content expressions have no direct equivalent to `--in`.
Often, it's best to add repositories to groups, and match against
the groups in a preferred content expression. So rather than
`--in=usbdrive`, put all the USB drives into a "transfer" group,
and use "copies=transfer:1".
### difference: dropping
To decide if content should be dropped, git-annex evaluates the preferred
content expression under the assumption that the content has *already* been
dropped. If the content would not be preferred then, the drop can be done.
So, for example, `copies=2` in a preferred content expression lets
content be dropped only when there are currently 3 copies of it, including
the repo it's being dropped from. This is different than running `git annex
drop --copies=2`, which will drop files that currently have 2 copies.
A wrinkle of this approach is how `in=` is handled. When deciding if
content should be dropped, git-annex looks at the current status, not
the status if the content would be dropped. So `in=here` means that
any currently present content is preferred, which can be useful if you
want manual control over content. Meanwhile `not (in=here)` should be
avoided -- it will cause content that's not here to be preferred,
but once the content arrives, it'll stop being preferred and will be
dropped again!
### difference: "present"
There's a special "present" keyword you can use in a preferred content
expression. This means that content is preferred if it's present,
and not otherwise. This leaves it up to you to use git-annex manually
to move content around. You can use this to keep preferred content
settings from affecting a subdirectory. For example:
include=auto/* or (include=ad-hoc/* and present)
Note that `not present` is a very bad thing to put in a preferred content
expression. It'll make it prefer to get content that's not present, and
drop content that is present! Don't go there.
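
To see why `not present` never settles, here is a small simulation. It is a deliberately simplified sketch, not git-annex code: `Expr`, `preferred`, `step`, and `simulate` are hypothetical names, the only fact an expression can test is whether the content is present right now, and each automatic pass is assumed to get content that is preferred and drop content that is not.

```haskell
-- | Toy expression language with just the pieces needed here
-- (an assumption for illustration, not the real matcher).
data Expr = Present | Not Expr

-- | Is the content preferred, given whether it is present right now?
preferred :: Expr -> Bool -> Bool
preferred Present here = here
preferred (Not e) here = not (preferred e here)

-- | One automatic pass: preferred content is fetched (or kept),
-- content that is not preferred is dropped (or stays absent).
-- The result is whether the content is present afterwards.
step :: Expr -> Bool -> Bool
step = preferred

-- | Presence of the content after each of the first n passes.
simulate :: Expr -> Bool -> Int -> [Bool]
simulate e start n = take n (drop 1 (iterate (step e) start))
```

In this model, `simulate Present True 4` stays at `True` and `simulate Present False 4` stays at `False`, so `present` leaves things wherever you put them. `simulate (Not Present) False 4` alternates between `True` and `False`, which is the get-then-drop loop described above.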
## standard expressions
git-annex comes with some standard preferred content expressions, that can