2016-02-02 20:50:58 +00:00
[[!meta title="annex.largefiles: configuring mixed content repositories"]]

Normally commands like `git annex add` always add files to the annex,
while `git add` adds files to git.

Let's suppose you're developing a video game, written in C. You have
source code, and some large game assets. You want to ensure the source
code is stored in git -- that's what git's for! And you want to store
the game assets in the git annex -- to avoid bloating your git repos with
possibly enormous files, but still version control them.

You could take care to use `git annex add` after changes to the assets,
but it would be easy to slip up and `git commit -a` (which runs `git add`),
checking your large assets into git. Configuring annex.largefiles
saves you the bother of keeping things straight when adding files.
Once you've told git-annex what files are large, both `git annex add`
and `git add`/`git commit -a` will add the large files to the annex and the
small files to git.

Other commands that use the annex.largefiles configuration include
`git annex import`, `git annex addurl`, `git annex importfeed`, and
the assistant.

## examples

For example, let's add only files larger than 100 kb to the annex,
and never `*.c` and `*.h` source code files.

Write this to the `.gitattributes` file:

    * annex.largefiles=(largerthan=100kb)
    *.c annex.largefiles=nothing
    *.h annex.largefiles=nothing

Or, set the git configuration instead:

    git config annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'

Both of these settings do the same thing. Setting it in the
`.gitattributes` file makes any checkout of the repository share that
configuration, so it is often a good choice. Setting the annex.largefiles git
configuration lets different checkouts behave differently. The git
configuration overrides the `.gitattributes` configuration.

Or, perhaps you just want all files to be added to the annex, no matter
what. Just write `* annex.largefiles=anything` to the `.gitattributes`
file, or run:

    git config annex.largefiles anything

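To check which annex.largefiles value git will report for a given path,
`git check-attr` can be run against the `.gitattributes` example above.
This is a minimal sketch, not something git-annex requires: the scratch
repository and the paths `src/main.c` and `assets/level1.dat` are made up
for illustration, and the files do not need to exist for the lookup to work.

```shell
# Scratch repository, used only for this illustration.
repo=$(mktemp -d)
git -C "$repo" init -q

# The .gitattributes rules from the example above.
cat > "$repo/.gitattributes" <<'EOF'
* annex.largefiles=(largerthan=100kb)
*.c annex.largefiles=nothing
*.h annex.largefiles=nothing
EOF

# Ask git which annex.largefiles value applies to each (hypothetical) path.
attrs=$(git -C "$repo" check-attr annex.largefiles -- src/main.c assets/level1.dat)
printf '%s\n' "$attrs"
```

For a given path, the last matching line in `.gitattributes` wins, which is
why the `*.c` and `*.h` lines come after the `*` line.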
## syntax

The value of annex.largefiles is similar to a
[[preferred content expression|git-annex-preferred-content]].
The following terms can be used in annex.largefiles:

* `include=glob` / `exclude=glob`

  Specify files to include or exclude.

  The glob can contain `*` and `?` to match arbitrary characters.

* `smallerthan=size` / `largerthan=size`

  Matches only files smaller than, or larger than, the specified size.

  The size can be specified with any commonly used units, for example,
  "0.5 gb" or "100 KiloBytes".

* `mimetype=glob`

  Looks up the MIME type of a file, and checks if the glob matches it.

  For example, `"mimetype=text/*"` will match many varieties of text files,
  including "text/plain", but also "text/x-shellscript", "text/x-makefile",
  etc.

  The MIME types are the same ones that are displayed by running
  `file --mime-type`.

  This is only available to use when git-annex was built with the
  MagicMime build flag.

* `mimeencoding=glob`

  Looks up the MIME encoding of a file, and checks if the glob matches it.

  For example, `"mimeencoding=binary"` will match many kinds of binary
  files.

  The MIME encodings are the same ones that are displayed by running
  `file --mime-encoding`.

  This is only available to use when git-annex was built with the
  MagicMime build flag.

* `anything`

  Matches any file.

* `nothing`

  Matches no files. (Same as "not anything")

* `not expression`

  Inverts what the expression matches.

* `and` / `or` / `( expression )`

  These can be used to build up more complicated expressions.

The way the `.gitattributes` example above works is, `*.c` and `*.h` files
have the annex.largefiles attribute set to "nothing",
and so those files are never treated as large files. All other files use
the other value, which checks the file size.

Note that, since git attribute values cannot contain whitespace,
it's useful to instead parenthesize the terms of the annex.largefiles
attribute. This trick allows for more complicated expressions.
For example, this is the same as the git config shown earlier, shoehorned
into a git attribute:

    * annex.largefiles=(largerthan=100kb)and(not((include=*.c)or(include=*.h)))

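The whitespace limitation can be observed with `git check-attr`. This is a
sketch, assuming a scratch repository and a hypothetical path `game.dat`;
the expressions are simplified versions of the ones above. When the
expression is written with spaces, git splits the line into several
separate attributes, so annex.largefiles only receives the first word.

```shell
# Scratch repository, used only for this illustration.
repo=$(mktemp -d)
git -C "$repo" init -q

# Expression written with spaces: git treats each whitespace-separated
# word as its own attribute, truncating the annex.largefiles value.
echo '* annex.largefiles=largerthan=100kb and not include=*.c' > "$repo/.gitattributes"
broken=$(git -C "$repo" check-attr annex.largefiles -- game.dat)

# Parenthesized expression without whitespace: the whole value survives.
echo '* annex.largefiles=(largerthan=100kb)and(not(include=*.c))' > "$repo/.gitattributes"
fixed=$(git -C "$repo" check-attr annex.largefiles -- game.dat)

printf '%s\n%s\n' "$broken" "$fixed"
```

The first lookup reports only `largerthan=100kb`, while the second reports
the full parenthesized expression.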
## temporarily override

If you've set up an annex.largefiles configuration but want to force a file to
be stored in the annex, you can temporarily override the configuration like
this:

    git annex add -c annex.largefiles=anything smallfile

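One way to confirm that the override took effect is to ask git-annex which
files it is tracking. The following is a sketch under assumptions: it needs
git-annex installed, it works in a throwaway repository, and the file name
`smallfile`, the placeholder identity, and the 100kb threshold are only for
illustration.

```shell
if git annex version >/dev/null 2>&1; then
    # Throwaway repository with a placeholder identity (assumption).
    repo=$(mktemp -d)
    git -C "$repo" init -q
    git -C "$repo" config user.name "Example User"
    git -C "$repo" config user.email "user@example.com"
    git -C "$repo" annex init >/dev/null

    # Configure a size threshold that a tiny file does not meet.
    git -C "$repo" config annex.largefiles 'largerthan=100kb'
    echo 'tiny' > "$repo/smallfile"

    # Force the small file into the annex despite the configuration.
    git -C "$repo" annex add -c annex.largefiles=anything smallfile >/dev/null

    # git annex find lists annexed files whose content is present.
    found=$(git -C "$repo" annex find smallfile)
    echo "annexed: $found"
else
    echo "git-annex is not installed; skipping"
fi
```

If `smallfile` is listed, it was stored in the annex rather than in git.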
## converting git to annexed

When you have a file that is currently stored in git, and you want to
convert that to be stored in the annex, here's how to accomplish that:

    git rm --cached file
    git annex add -c annex.largefiles=anything file
    git commit file

This first removes the file from git's index, and then adds it back
using git-annex. You can modify the file before the `git annex add` step,
perhaps replacing it with new larger content that necessitates git-annex.

## converting annexed to git

When you have a file that is currently stored in the annex, and you want to
convert that to be stored in git, here's how to accomplish that:

    git annex unlock file
    git rm --cached file
    git -c annex.largefiles=nothing add file
    git commit file

You can modify the file after unlocking it and before adding it to
git. This is probably a good idea if it was a really big file,
so that you can replace its content with something smaller.