treat "not present" in preferred content as invalid
Detect when a preferred content expression contains "not present", which would lead to repeatedly getting and then dropping files, and make it never match. This also applies to "not balanced" and "not sizebalanced".

--explain will tell the user when this happens.

Note that getMatcher calls matchMrun' and does not check for unstable negated limits. While there is no --present anyway, if there was, it would not make sense for --not --present to complain about instability and fail to match.
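The detection described above can be illustrated with a small sketch. This is not git-annex's implementation (which is written in Haskell); the `Term`, `Not`, `And`, `Or` classes and the `has_unstable_negation` function are hypothetical names invented for this illustration. The sketch assumes that a term counts as negated when it sits under an odd number of `not`s, which is consistent with the note later in this commit that `not (not present)` is not flagged while `include=* or (not present)` is.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical mini-AST for preferred content expressions.
# Real git-annex parses and matches expressions differently.
@dataclass(frozen=True)
class Term:
    name: str


@dataclass(frozen=True)
class Not:
    inner: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Term, Not, And, Or]

# Terms the commit message names as unstable when negated.
UNSTABLE_WHEN_NEGATED = {"present", "balanced", "sizebalanced"}


def has_unstable_negation(expr: Expr, negated: bool = False) -> bool:
    """Return True if an unstable term appears under an odd number of Nots."""
    if isinstance(expr, Term):
        return negated and expr.name in UNSTABLE_WHEN_NEGATED
    if isinstance(expr, Not):
        # Each Not flips the negation state of everything inside it.
        return has_unstable_negation(expr.inner, not negated)
    # And / Or: flagged if either side is flagged, with no attempt to
    # reason about whether the other side makes the result stable.
    return (has_unstable_negation(expr.left, negated)
            or has_unstable_negation(expr.right, negated))
```

Under these assumptions, `Not(Term("present"))` is flagged, `Not(Not(Term("present")))` is not, and `Or(Term("include=*"), Not(Term("present")))` is flagged even though that expression is always stable, because the check is purely syntactic.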
parent 8b2bd42540
commit 340bdd0dac
8 changed files with 121 additions and 47 deletions
@@ -172,9 +172,11 @@ elsewhere to allow removing it).
 settings from affecting a subdirectory. For example:
 `auto/* or (include=ad-hoc/* and present)`
 
-Note that `not present` is a very bad thing to put in a preferred content
-expression. It'll make it want to get content that's not present, and
-drop content that is present! Don't go there..
+Note that `not present` is not a reasonable thing to put in a preferred
+content expression. It says to get content that's not present, but then
+drop it! If that somehow gets into a preferred content expression,
+git-annex will recognize that the preferred content expression is not stable,
+and make it never match.
 
 * `inpreferreddir`
 
@@ -304,8 +306,8 @@ elsewhere to allow removing it).
 a lot of files. When this causes git-annex to do a lot of work, it will
 display "(calculating repository sizes)".
 
-Note that `not balanced` is a bad thing to put in a preferred content
-expression for the same reason `not present` is.
+Note that `not balanced` not a reasonable thing to use in a preferred
+content expression for the same reasons as `not present`.
 
 * `fullybalanced=groupname[:number]`
 
@@ -365,6 +367,9 @@ elsewhere to allow removing it).
 will make repositories want to move files around as necessary in order to
 get fully balanced.
 
+Note that `not sizebalanced` not a reasonable thing to use in a preferred
+content expression for the same reasons as `not present`.
+
 * `fullysizebalanced=groupname:number`
 
 This is like `sizebalanced`, but allows moving content between repositories

@@ -10,19 +10,17 @@ It would be very handy to provide some way to prove things about behavior
 of preferred content expressions, or a way to simulate the behavior of a
 network of git-annex repositories with a given preferred content configuration
 
-For example, consider two reposities A and B. A is in group M and B is in
-group N. A has preferred content `not inallgroup=N` and B has `not inallgroup=M`.
-
-If A contains a file, then B will want to also get a copy. And things
-stabilize there. But if the file is removed from A, then B also wants to
-remove it. And once B has removed it, A wants a copy of it. And then B also
-wants a copy of it. So the result is that the file got transferred twice,
-to end up right back where we started.
-
 The worst case of this is `not present`, where the file gets dropped and
 transferred over and over again. The docs warn against using that one. But
 they can't warn about every bad preferred content expression.
 
+Mostly, git-annex manages to keep things stable that seem like they would
+not be. Consider repo A that is not in group foo, and B is in group foo. A
+has preferred content "onlyingroup=foo". This will make A want a file that
+is in B. And once it has it, it will not want to drop it. That's because
+when dropping, it considers if it would be preferred content after the
+drop. In this case it would, so it doesn't drop it.
+
 ## balanced preferred content
 
 When [[design/balanced_preferred_content]] is added, a whole new level of
@@ -35,7 +33,7 @@ matter the sizes of the underlying repositories, but balanced preferred
 content does take repository fullness into account, which further
 complicates fully understanding the behavior.
 
-Notice that `fullbalanced()` is not stable when used
+Notice that `fullybalanced()` is not stable when used
 on its own, and so `balanced()` adds an "or present" to stabilize it.
 And so `not balanced()` includes `not present`, which is bad!
 
@@ -53,16 +51,31 @@ would be good if git-annex warned and/or refused to set such an expression
 if it could detect it. Similarly `not groupwanted` could be detected as a
 problem when the group's preferred content expression contains `present`.
 
-Is there is a more general purpose and not expensive way to detect such
-problematic expressions, that can find problems such as the
-`not inallgroup=N` example above?
+> This is now detected and such an unstable expression never matches.
+> --debug explains why too.
+>
+> Note that the detection will not be trigged by `"not (not present)"`,
+> but it will by `"include=* or (not present)"` even though that is always
+> stable, because `"include=*"` always matches and so what it's ORed with
+> doesn't matter. Probably noone will set something like that in real life
+> though.
+>
+> It's problimatic to make `git-annex wanted` warn about it. Consider
+> if in one repository, groupwanted is set to "present". In another
+> repository, which is disconnected, wanted is set to "not groupwanted".
+> Both operations are ok, but upon merging the two repositories,
+> the combined effect is that "not present" has been set.
+>
+> So while it could warn sometimes on setting "not present",
+> it would sometimes not be able to. Better to not warn inconsistently.
+> --[[Joey]]
 
 ## simulation
 
 Simulation seems fairly straightforward, just simulate the network of
 git-annex repositories with random files with different sizes and
-metadata. Be sure to enforce invariants like numcopies the same as
-git-annex does.
+metadata. Or use the current files and metadata.
+Be sure to enforce invariants like numcopies the same as git-annex does.
 
 Since users can write preferred content expressions, this should be
 targeted at being used by end users.