git-annex/doc/tips/largefiles.mdwn
Joey Hess 37467a008f
annex.addunlocked expressions
* annex.addunlocked can be set to an expression with the same format used by
  annex.largefiles, in case you want to default to unlocking some files but
  not others.
* annex.addunlocked can be configured by git-annex config.

Added a git-annex-matching-expression man page, broken out from
tips/largefiles.

A tricky consequence of this is that git-annex add --relaxed
honors annex.addunlocked, but an expression might want to know the size
or content of an url, which it's not going to download. I decided it was
better not to fail, and just dummy up some plausible data in that case.

Performance impact should be negligible. The global config is already
loaded for annex.largefiles. The expression only has to be parsed once,
and in the simple true/false case, it should not do any additional work
matching it.
2019-12-20 15:56:25 -04:00

119 lines
4.4 KiB
Markdown

[[!meta title="annex.largefiles: configuring mixed content repositories"]]
Normally commands like `git annex add` always add files to the annex,
while `git add` adds files to git.
Let's suppose you're developing a video game, written in C. You have
source code, and some large game assets. You want to ensure the source
code is stored in git -- that's what git's for! And you want to store
the game assets in the git annex -- to avod bloating your git repos with
possibly enormous files, but still version control them.
You could take care to use `git annex add` after changes to the assets,
but it would be easy to slip up and `git commit -a` (which runs `git add`),
checking your large assets into git. Configuring annex.largefiles
saves you the bother of keeping things straight when adding files.
Once you've told git-annex what files are large, both `git annex add`
and `git add`/`git commit -a` will add the large files to the annex and the
small files to git.
Other commands that use the annex.largefiles configuration include
`git annex import`, git annex addurl`, `git annex importfeed`, and
the assistant.
## examples
For example, let's make only files larger than 100 kb be added to the annex,
and never `*.c` and `*.h` source code files.
git config annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'
That is a local configuration, so will only apply to your clone of the
repository. To set a default that will apply to all clones, unless
overridden, do this instead:
git annex config --set annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'
There's one other way to configure the same thing, you can put this in
the `.gitattributes` file:
* annex.largefiles=largerthan=100kb
*.c annex.largefiles=nothing
*.h annex.largefiles=nothing
The syntax in .gitattributes is a bit different, because the .gitattributes
matches files itself, and the values of attributes cannot contain spaces.
So using .gitattributes for this is not recommended (but it does work for
older versions of git-annex, where the `git annex config` setting does
not). Any .gitattributes setting overrides the `git annex config` setting,
but will be overridden by the `git config` setting.
Another example. If you wanted `git add` to put all files the annex
in your local repository:
git config annex.largefiles anything
Or in all clones:
git annex config --set annex.largefiles anything
## syntax
See [[git-annex-matching-expression]] for details about the syntax.
## gitattributes format
Here's that example `.gitattributes` again:
* annex.largefiles=largerthan=100kb
*.c annex.largefiles=nothing
*.h annex.largefiles=nothing
The way that works is, `*.c` and `*.h` files have the annex.largefiles
attribute set to "nothing", and so those files are never treated as large
files. All other files use the other value, which checks the file size.
Since git attribute values cannot contain whitespace, when you need
a more complicated annex.largefiles expression, you can instead
parenthesize the terms of the annex.largefiles attribute.
For example, this is the same as the git config shown earlier, shoehorned
into a single git attribute:
* annex.largefiles=(largerthan=100kb)and(not((include=*.c)or(include=*.h)))
It's generally a better idea to use `git annex config` instead.
## temporarily override
If you've set up an annex.largefiles configuration but want to force a file to
be stored in the annex, you can temporarily override the configuration like
this:
git annex add -c annex.largefiles=anything smallfile
## converting git to annexed
When you have a file that is currently stored in git, and you want to
convert that to be stored in the annex, here's how to accomplish that:
git rm --cached file
git annex add -c annex.largefiles=anything file
git commit file
This first removes the file from git's index cache, and then adds it back
using git-annex. You can modify the file before the `git-annex add` step,
perhaps replacing it with new larger content that necessitates git-annex.
## converting annexed to git
When you have a file that is currently stored in the annex, and you want to
convert that to be stored in git, here's how to accomplish that:
git annex unlock file
git rm --cached file
git -c annex.largefiles=nothing add file
git commit file
You can modify the file after unlocking it and before adding it to
git. And this is probably a good idea if it was really a big file,
so that you can replace its content with something smaller.