documentation for making git add only annex when configured by annex.largefiles

Code change should be trivial, but not yet implemented. This
significantly complicated the task of documenting how git-annex works.

I'm not sure how useful the annex.gitaddtoannex configuration is after
this change; seems that if a user has an annex.largefiles they will want
it applied consistently. But the last thing I want to hear is more
complaining from users about git add doing something they don't want it
to.

There's a pretty high risk users who got used to the git add behavior
and don't have annex.largefiles configured will miss the NEWS and
complain bitterly about their suddenly bloated repositories. Oh well.

Removed outdated comments about the old behavior to avoid confusion.
I don't know if I've found all the places that griping spread to.
Joey Hess 2019-10-24 13:50:44 -04:00
parent 64d4a35523
commit 31a5b58b2c
14 changed files with 111 additions and 138 deletions

@@ -1,8 +1,7 @@
[[!meta title="annex.largefiles: configuring mixed content repositories"]]
Normally commands like `git annex add` always add files to the annex.
And when using the v7 repository mode, even `git add` and `git commit -a`
will add files to the annex.
Normally commands like `git annex add` always add files to the annex,
while `git add` adds files to git.
Let's suppose you're developing a video game, written in C. You have
source code, and some large game assets. You want to ensure the source
@@ -10,14 +9,17 @@ code is stored in git -- that's what git's for! And you want to store
the game assets in the git annex -- to avoid bloating your git repos with
possibly enormous files, but still version control them.
The annex.largefiles configuration is useful for such mixed content
repositories. It's checked by `git annex add`, by `git add` and `git commit -a`
(in v7 repositories), by `git annex import` and the assistant. It's
also used by `git annex addurl` and `git annex importfeed` when downloading
files. When a file does not match annex.largefiles, these commands will
add its content to git instead of to the annex.
You could take care to use `git annex add` after changes to the assets,
but it would be easy to slip up and `git commit -a` (which runs `git add`),
checking your large assets into git. Configuring annex.largefiles
saves you the bother of keeping things straight when adding files.
Once you've told git-annex what files are large, both `git annex add`
and `git add`/`git commit -a` will add the large files to the annex and the
small files to git.
This saves you the bother of keeping things straight when adding files.
Other commands that use the annex.largefiles configuration include
`git annex import`, `git annex addurl`, `git annex importfeed`, and
the assistant.
## examples
@@ -34,11 +36,17 @@ Or, set the git configuration instead:
git config annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'
Both of these settings do the same thing. Setting it in the `.gitattributes`
file makes any checkout of the repository share that configuration, so is often
a good choice. Setting the annex.largefiles git configuration lets different
checkouts behave differently. The git configuration overrides the
`.gitattributes` configuration.
Both of these settings do the same thing. Setting it in the
`.gitattributes` file makes any checkout of the repository share that
configuration, so is often a good choice. Setting the annex.largefiles git
configuration lets different checkouts behave differently. The git
configuration overrides the `.gitattributes` configuration.
Or, perhaps you just want all files to be added to the annex, no matter
what. Just write "* annex.largefiles=anything" to the `.gitattributes`
file, or run:
git config annex.largefiles anything
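The `.gitattributes` variant of that catch-all setting can be sketched as follows. This is only an illustration: `demo-repo` is a hypothetical scratch directory standing in for the top of your working tree, and the only line written is the one quoted above.

```shell
# "demo-repo" is a hypothetical scratch directory standing in for the
# top level of a repository's working tree.
mkdir -p demo-repo

# Append the catch-all rule quoted in the text above.
printf '%s\n' '* annex.largefiles=anything' >> demo-repo/.gitattributes

# Show the resulting file.
cat demo-repo/.gitattributes
```

Because the rule lives in a tracked file, committing `.gitattributes` is what lets other checkouts share it.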
## syntax

@@ -1,7 +0,0 @@
[[!comment format=mdwn
username="joey"
subject="""comment 12"""
date="2019-09-16T18:43:02Z"
content="""
[[forum/lets_discuss_git_add_behavior]]
"""]]

@@ -1,10 +0,0 @@
[[!comment format=mdwn
username="hoxu"
avatar="http://cdn.libravatar.org/avatar/95e33a0073f6c06477b3a202f0301dde"
subject="v6 & manual annexation"
date="2017-06-29T07:25:31Z"
content="""
With v6, is there any way to retain old usage of `git add` and `git annex add` to manually choose which files are kept under plain git and which annexed?
I'm aware of the `-c annex.largefiles=foo` parameter, but that's pretty cumbersome.
"""]]

@@ -14,41 +14,45 @@ They are stored in the git repository differently, and they appear as
regular files in the working tree, instead of the symbolic links used for
locked files.
## adding unlocked files
## using unlocked files
Instead of using `git annex add`, use `git add`, and the file will be
stored in git-annex, but left unlocked.
You can unlock any annexed file:
[[!template id=note text="""
Want `git add` to add some file contents to the annex, but store the contents of
smaller files in git itself? Configure annex.largefiles to match the former.
See [[largefiles]].
"""]]
# git annex unlock my_cool_big_file
# cp ~/my_cool_big_file .
# git add my_cool_big_file
# git commit -m "added my_cool_big_file to the annex"
[master (root-commit) 92f2725] added my_cool_big_file to the annex
1 file changed, 1 insertion(+)
create mode 100644 my_cool_big_file
# git annex find
my_cool_big_file
That changes what's stored in git between a git-annex symlink
(locked) and a git-annex pointer file (unlocked). You can commit
the change, if you want that file to be unlocked in other clones of the
repository. To lock the file again, use `git annex lock`.
You can make whatever modifications you want to unlocked files, and commit
your changes.
The nice thing about an unlocked file is that you can modify it
in place -- it's a regular file. And you can commit your changes.
# echo more stuff >> my_cool_big_file
# git mv my_cool_big_file my_cool_bigger_file
# git commit -a -m "some changes"
[master 196c0e2] some changes
2 files changed, 1 insertion(+), 1 deletion(-)
delete mode 100644 my_cool_big_file
create mode 100644 my_cool_bigger_file
1 files changed, 1 insertion(+), 1 deletion(-)
Under the hood, this uses git's [[todo/smudge]] filter interface, and
git-annex converts between the content of the big file and a pointer file,
which is what gets committed to git. All the regular git-annex commands
(get, drop, etc) can be used on unlocked files too.
Notice that `git commit -a` added the new content of the file to the annex,
and only committed a change to the pointer. That happened because git-annex
knows this was an annexed file before. Git leaves the file unlocked, so
you can continue to make modifications to it.
By default, using git to add a file that has not been annexed before will
still add its contents to git, not to the annex. If you tell git-annex what
files are large, it will arrange for the large files to be added to the
annex, and the small ones to be added to git. This is done by configuring
annex.largefiles. See [[largefiles]] for full documentation of that.
All the regular git-annex commands (find, get, drop, etc) can be used on
unlocked files as well as locked files. When you drop the content of
an unlocked file, it will be replaced by a pointer file, which
looks like "/annex/objects/...". So if you open a file and see
that, you'll need to use `git annex get`.
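As a rough sketch of that check, the shell function below tests whether a file starts with the `/annex/objects/` prefix mentioned above. The name `is_pointer` is hypothetical and not part of git-annex, and the prefix test is only a heuristic based on the description here, not the check git-annex itself performs.

```shell
# is_pointer: hypothetical helper, not a git-annex command.
# Succeeds when the file's first 15 bytes are "/annex/objects/",
# which is how the text above says a pointer file looks.
is_pointer() {
    case "$(head -c 15 "$1" 2>/dev/null)" in
        /annex/objects/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

You might use it as `is_pointer my_cool_big_file && git annex get my_cool_big_file`, so the content is only fetched when the working tree holds a pointer.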
Under the hood, unlocked files use git's [[todo/smudge]] filter interface,
and git-annex converts between the content of the big file and a pointer
file, which is what gets committed to git.
[[!template id=note text="""
By default, git-annex commands will add files in locked mode,
@@ -57,14 +61,7 @@ mode is used. To make them always use unlocked mode, run:
`git config annex.addunlocked true`
"""]]
## mixing locked and unlocked files
A repository can contain both locked and unlocked files. You can switch
a file back and forth using the `git annex lock` and `git annex unlock`
commands. This changes what's stored in git between a git-annex symlink
(locked) and a git-annex pointer file (unlocked). To add a file to
the repository in locked mode, use `git annex add`; to add a file in
unlocked mode, use `git add`.
## adjusted branches
If you want to mostly keep files locked, but be able to locally switch
to having them all unlocked, you can do so using `git annex adjust
@@ -73,6 +70,15 @@ useful when using filesystems like FAT, and OS's like Windows that don't
support symlinks. Indeed, `git-annex init` detects such filesystems and
automatically sets up a repository to use all unlocked files.
## finding unlocked files
While it's easy to see when a file is a git-annex symlink, unlocked files
look the same as files stored in git. To see what files are unlocked or
locked, many git-annex commands support `--unlocked` and `--locked`
options.
git annex find --unlocked
## imperfections
Unlocked files mostly work very well, but there are a

@@ -1,15 +0,0 @@
[[!comment format=mdwn
username="ginquistador@86f226616ead98d2733e249429918f241f928064"
nickname="ginquistador"
avatar="http://cdn.libravatar.org/avatar/f0ef7d68c0ff5d4948a9b0d282987195"
subject="Disappointed with `git add`"
date="2019-09-03T07:30:28Z"
content="""
I first have to say, I have been following and using git annex for ages (5+ years at least), and is my trusted source for all my data. However, for the first time in all these years, I'm seeing a decision that I do not agree with or understand.
Specifically, using `git add .` to add a file to git annex as the default pattern just seems a fundamentally wrong design to me (at least for my usage pattern). I want to be able to use git normally, and have git-annex only get involved when I explicitly request it to, and not for all files. AFAIK, git-lfs does do it right. I understand [annex.largefiles: configuring mixed content repositories](http://git-annex.branchable.com/tips/largefiles/) can be configured to get the behavior I want. However, the default behavior should add it to vanilla git, and any other desired behavior can be obtained by the user via annex attributes, or extra command line flags to `git annex add`
Knowing Joey, I assume there's a strong rationale as always, and would love to hear it, but I would still like to STRONGLY REQUEST changing the default behavior.
"""]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
username="joey"
subject="""comment 16"""
date="2019-09-16T17:36:06Z"
content="""
@ginquistador it may or may not have been the best decision, but this tip
is not a good place to discuss it. A bug would be a good place.
"""]]

@@ -1,7 +0,0 @@
[[!comment format=mdwn
username="joey"
subject="""comment 17"""
date="2019-09-16T18:44:33Z"
content="""
[[forum/lets_discuss_git_add_behavior]]
"""]]