3ce2e95a5f
This all works fine. But it doesn't check repository sizes yet, and without repository size checking, once a repository gets full, there will be no other repository that will want its files. Use of sha2 seems unncessary, probably alder2 or md5 or crc would have been enough. Possibly just summing up the bytes of the key mod the number of repositories would have sufficed. But sha2 is there, and probably hardware accellerated. I doubt very much there is any security benefit to using it though. If someone wants to construct a key that will be balanced onto a given repository, sha2 is certianly not going to stop them.
72 lines
3 KiB
Markdown
72 lines
3 KiB
Markdown
git-annex tries to ensure that the configured number of [[copies]] of your
|
|
data always exist, and leaves it up to you to use commands like `git annex
|
|
get` and `git annex drop` to move the content to the repositories you want
|
|
to contain it. But often, it can be good to have more fine-grained
|
|
control over which content is wanted by which repositories. Configuring
|
|
this allows the git-annex assistant as well as
|
|
`git annex get --auto`, `git annex drop --auto`, `git annex sync --content`,
|
|
etc to do smarter things.
|
|
|
|
Preferred content settings can be edited using `git
|
|
annex vicfg`, or viewed and set at the command line with `git annex wanted`.
|
|
Each repository can have its own settings, and other repositories will
|
|
try to honor those settings when interacting with it.
|
|
(So there's no local `.git/config` for preferred content settings.)
|
|
|
|
The idea is that you write an expression that files are matched against.
|
|
If a file matches, the repository wants to store its content.
|
|
If it doesn't, the repository wants to drop its content
|
|
(if there are enough copies elsewhere to allow removing it).
|
|
|
|
## writing expressions
|
|
|
|
[[!template id=note text="""
|
|
### [[quickstart|standard_groups]]
|
|
|
|
Rather than writing your own preferred content expression, you can use
|
|
several standard ones included in git-annex that are tuned to cover different
|
|
common use cases.
|
|
|
|
You do this by putting a repository in a group,
|
|
and simply setting its preferred content to "standard" to match whatever
|
|
is standard for that group. See [[standard_groups]] for a list.
|
|
"""]]
|
|
|
|
See the man page [[git-annex-preferred-content]] for details on the syntax
|
|
of preferred content expressions.
|
|
|
|
An example:
|
|
|
|
include=*.mp3 and (not largerthan=100mb) and exclude=old/*
|
|
|
|
This makes all .mp3 files, and all other files that are less than 100 mb in
|
|
size be preferred content. It excludes all files under the "old" directory.
|
|
|
|
## upgrades
|
|
|
|
It's important that all clones of a repository can understand one-another's
|
|
preferred content expressions, especially when using the git-annex
|
|
assistant. So using newly added keywords can cause a problem if
|
|
an older version of git-annex is in use elsewhere.
|
|
|
|
Before git-annex version 5.20140320, when git-annex saw a keyword it
|
|
did not understand, it defaulted to assuming *all* files were
|
|
preferred content. From version 5.20140320, git-annex has a nicer fallback
|
|
behavior: When it is unable to parse a preferred content expression,
|
|
it assumes all files that are currently present are preferred content.
|
|
|
|
Here are recent changes to preferred content expressions, and the version
|
|
they were added in.
|
|
|
|
* "balanced=", "fullybalanced=" 10.20240831
|
|
* "securehash" 6.20170228
|
|
* "nothing" 6.201600202
|
|
* "anything" 5.20150616
|
|
* "standard" 5.20140314
|
|
(only when used in a more complicated expression; "standard" by
|
|
itself has been supported for a long time)
|
|
* "groupwanted=" 5.20140314
|
|
* "metadata=" 5.20140221
|
|
* "lackingcopies=", "approxlackingcopies=", "unused=" 5.20140127
|
|
* "inpreferreddir=" 4.20130501
|
|
* "metadata=field<number" etc 6.20160227
|