31a5b58b2c
Code change should be trivial, but not yet implemented. This significantly complicated the task of documenting how git-annex works. I'm not sure how useful the annex.gitaddtoannex configuration is after this change; seems that if a user has an annex.largefiles they will want it applied consistently. But the last thing I want to hear is more complaining from users about git add doing something they don't want it to. There's a pretty high risk users who got used to the git add behavior and don't have annex.largefiles configured will miss the NEWS and complain bitterly about their suddenly bloated repositories. Oh well. Removed outdated comments about the old behavior to avoid confusion. I don't know if I've found all the places that griping spread to.
157 lines
5.3 KiB
Markdown
[[!meta title="annex.largefiles: configuring mixed content repositories"]]

Normally commands like `git annex add` always add files to the annex,
while `git add` adds files to git.

Let's suppose you're developing a video game, written in C. You have
source code, and some large game assets. You want to ensure the source
code is stored in git -- that's what git's for! And you want to store
the game assets in the git annex -- to avoid bloating your git repos with
possibly enormous files, but still version control them.

You could take care to use `git annex add` after changes to the assets,
but it would be easy to slip up and `git commit -a` (which runs `git add`),
checking your large assets into git. Configuring annex.largefiles
saves you the bother of keeping things straight when adding files.
Once you've told git-annex what files are large, both `git annex add`
and `git add`/`git commit -a` will add the large files to the annex and the
small files to git.

Other commands that use the annex.largefiles configuration include
`git annex import`, `git annex addurl`, `git annex importfeed`, and
the assistant.

## examples

For example, let's make only files larger than 100 kb be added to the annex,
and never `*.c` and `*.h` source code files.

Write this to the `.gitattributes` file:

    * annex.largefiles=(largerthan=100kb)
    *.c annex.largefiles=nothing
    *.h annex.largefiles=nothing

Or, set the git configuration instead:

    git config annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'

Both of these settings do the same thing. Setting it in the
`.gitattributes` file makes any checkout of the repository share that
configuration, so it is often a good choice. Setting the annex.largefiles git
configuration lets different checkouts behave differently. The git
configuration overrides the `.gitattributes` configuration.

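
To see which annex.largefiles attribute value applies to a given path, git itself can be asked. The following is a minimal sketch, not part of git-annex: it assumes a throwaway repository created with `mktemp`, and the file names `main.c`, `game.h` and `assets/level1.dat` are hypothetical. It only uses `git check-attr`, so it shows the attribute value that git-annex would read, not the final large/small decision.

```shell
#!/bin/sh
set -e

# Throwaway repository (assumption: nothing here touches a real project).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The .gitattributes example from this page.
cat > .gitattributes <<'EOF'
* annex.largefiles=(largerthan=100kb)
*.c annex.largefiles=nothing
*.h annex.largefiles=nothing
EOF

# Ask git which value of the attribute applies to each (hypothetical) path.
# The files do not need to exist for check-attr to answer.
attrs=$(git check-attr annex.largefiles -- main.c game.h assets/level1.dat)
printf '%s\n' "$attrs"
```

Later lines in `.gitattributes` take precedence over earlier ones, which is why the `*.c` and `*.h` lines win over the catch-all `*` line for source files.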
Or, perhaps you just want all files to be added to the annex, no matter
what. Just write "* annex.largefiles=anything" to the `.gitattributes`
file, or run:

    git config annex.largefiles anything

## syntax

The value of annex.largefiles is similar to a
[[preferred content expression|git-annex-preferred-content]].
The following terms can be used in annex.largefiles:

* `include=glob` / `exclude=glob`

  Specify files to include or exclude.

  The glob can contain `*` and `?` to match arbitrary characters.

* `smallerthan=size` / `largerthan=size`

  Matches only files smaller than, or larger than the specified size.

  The size can be specified with any commonly used units, for example,
  "0.5 gb" or "100 KiloBytes".

* `mimetype=glob`

  Looks up the MIME type of a file, and checks if the glob matches it.

  For example, `"mimetype=text/*"` will match many varieties of text files,
  including "text/plain", but also "text/x-shellscript", "text/x-makefile",
  etc.

  The MIME types are the same that are displayed by running `file --mime-type`.

  This is only available to use when git-annex was built with the
  MagicMime build flag.

* `mimeencoding=glob`

  Looks up the MIME encoding of a file, and checks if the glob matches it.

  For example, `"mimeencoding=binary"` will match many kinds of binary
  files.

  The MIME encodings are the same that are displayed by running `file --mime-encoding`.

  This is only available to use when git-annex was built with the
  MagicMime build flag.

* `anything`

  Matches any file.

* `nothing`

  Matches no files. (Same as "not anything")

* `not expression`

  Inverts what the expression matches.

* `and` / `or` / `( expression )`

  These can be used to build up more complicated expressions.

The way the `.gitattributes` example above works is, `*.c` and `*.h` files
have the annex.largefiles attribute set to "nothing",
and so those files are never treated as large files. All other files use
the other value, which checks the file size.

Note that, since git attribute values cannot contain whitespace,
it's useful to instead parenthesize the terms of the annex.largefiles
attribute. This trick allows for more complicated expressions.
For example, this is the same as the git config shown earlier, shoehorned
into a git attribute:

    * annex.largefiles=(largerthan=100kb)and(not((include=*.c)or(include=*.h)))

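
To make the logic of that expression concrete, here is a rough plain-shell model of `largerthan=100kb and not (include=*.c or include=*.h)`. It is only an illustration and not how git-annex evaluates expressions: the function name `is_large` is hypothetical, the globs are approximated with shell `case` patterns, and it assumes that "100kb" is read as 100000 bytes.

```shell
#!/bin/sh
# Hypothetical helper modelling:
#   largerthan=100kb and not (include=*.c or include=*.h)
# Returns 0 (true) when the file would count as large under that model.
is_large() {
    file=$1

    # not (include=*.c or include=*.h): source files are never large.
    case "$file" in
        *.c|*.h) return 1 ;;
    esac

    # largerthan=100kb: compare the byte count with the threshold.
    # Assumption: 100kb means 100000 bytes.
    # The arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -gt 100000 ]
}

# Usage: report where each file named on the command line would go.
for f in "$@"; do
    if is_large "$f"; then
        echo "$f: annex"
    else
        echo "$f: git"
    fi
done
```

The order of the two checks does not change the result, since both halves of an `and` must hold; testing the name first simply avoids reading the size of source files.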
## temporarily override

If you've set up an annex.largefiles configuration but want to force a file to
be stored in the annex, you can temporarily override the configuration like
this:

    git annex add -c annex.largefiles=anything smallfile

## converting git to annexed

When you have a file that is currently stored in git, and you want to
convert that to be stored in the annex, here's how to accomplish that:

    git rm --cached file
    git annex add -c annex.largefiles=anything file
    git commit file

This first removes the file from git's index cache, and then adds it back
using git-annex. You can modify the file before the `git annex add` step,
perhaps replacing it with new larger content that necessitates git-annex.

## converting annexed to git

When you have a file that is currently stored in the annex, and you want to
convert that to be stored in git, here's how to accomplish that:

    git annex unlock file
    git rm --cached file
    git -c annex.largefiles=nothing add file
    git commit file

You can modify the file after unlocking it and before adding it to
git. And this is probably a good idea if it was really a big file,
so that you can replace its content with something smaller.

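
The `git -c annex.largefiles=nothing add file` step relies on git's `-c` option, which sets a configuration value for a single invocation only. A small sketch of that behavior follows, assuming a throwaway repository created with `mktemp`; it uses `git config --get` purely to inspect the value and does not involve git-annex.

```shell
#!/bin/sh
# Throwaway repository (assumption: nothing here touches a real project).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# The -c value is visible to this one git invocation...
during=$(git -c annex.largefiles=nothing config --get annex.largefiles)

# ...but nothing was written to the repository's config, so a later
# lookup there finds no value and falls through to "unset".
after=$(git config --local --get annex.largefiles || echo unset)

echo "during: $during"
echo "after: $after"
```

Because the override is never saved, the repository's usual annex.largefiles configuration applies again to every later command.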