31a5b58b2c
Code change should be trvial, but not yet implemented. This significantly complicated the task of documenting how git-annex works. I'm not sure how useful the annex.gitaddtoannex confguration is after this change; seems that if a user has an annex.largefiles they will want it applied consistently. But the last thing I want to hear is more complaining from users about git add doing something they don't want it to. There's a pretty high risk users who got used to the git add behavior and don't have annex.largefiles configured will miss the NEWS and complain bitterly about their suddenly bloated repositories. Oh well. Removed outdated comments about the old behavior to avoid confusion. I don't know if I've found all the places that griping spread to.
156 lines
6.3 KiB
Markdown
156 lines
6.3 KiB
Markdown
Normally, git-annex stores annexed files in the repository, locked down,
|
|
which prevents the content of the file from being modified.
|
|
That's a good thing, because it might be the only copy, you wouldn't
|
|
want to lose it in a fumblefingered mistake.
|
|
|
|
# git annex add some_file
|
|
add some_file
|
|
# echo oops > some_file
|
|
bash: some_file: Permission denied
|
|
|
|
Sometimes though you want to modify a file. Maybe once, or maybe
|
|
repeatedly. To support this, git-annex also supports unlocked files.
|
|
They are stored in the git repository differently, and they appear as
|
|
regular files in the working tree, instead of the symbolic links used for
|
|
locked files.
|
|
|
|
## using unlocked files
|
|
|
|
You can unlock any annexed file:
|
|
|
|
# git annex unlock my_cool_big_file
|
|
|
|
That changes what's stored in git between a git-annex symlink
|
|
(locked) and a git-annex pointer file (unlocked). You can commit
|
|
the change, if you want that file to be unlocked in other clones of the
|
|
repository. To lock the file again, use `git annex lock`.
|
|
|
|
The nice thing about an unlocked file is that you can modify it
|
|
in place -- it's a regular file. And you can commit your changes.
|
|
|
|
# echo more stuff >> my_cool_big_file
|
|
# git commit -a -m "some changes"
|
|
[master 196c0e2] some changes
|
|
1 files changed, 1 insertion(+), 1 deletion(-)
|
|
|
|
Notice that `git commit -a` added the new content of the file to the annex,
|
|
and only committed a change to the pointer. That happened because git-annex
|
|
knows this was an annexed file before. Git leaves the file unlocked, so
|
|
you can continue to make modifications to it.
|
|
|
|
By default, using git to add a file that has not been annexed before will
|
|
still add its contents to git, not to the annex. If you tell git-annex what
|
|
files are large, it will arrange for the large files to be added to the
|
|
annex, and the small ones to be added to git. This is done by configuring
|
|
annex.largefiles. See [[largefiles]] for full documentation of that.
|
|
|
|
All the regular git-annex commands (find, get, drop, etc) can be used on
|
|
unlocked files as well as locked files. When you drop the content of
|
|
an unlocked file, it will be replaced by a pointer file, which
|
|
looks like "/annex/objects/...". So if you open a file and see
|
|
that, you'll need to use `git annex get`.
|
|
|
|
Under the hood, unlocked files use git's [[todo/smudge]] filter interface,
|
|
and git-annex converts between the content of the big file and a pointer
|
|
file, which is what gets committed to git.
|
|
|
|
[[!template id=note text="""
|
|
By default, git-annex commands will add files in locked mode,
|
|
unless used on a filesystem that does not support symlinks, when unlocked
|
|
mode is used. To make them always use unlocked mode, run:
|
|
`git config annex.addunlocked true`
|
|
"""]]
|
|
|
|
## adjusted branches
|
|
|
|
If you want to mostly keep files locked, but be able to locally switch
|
|
to having them all unlocked, you can do so using `git annex adjust
|
|
--unlock`. See [[git-annex-adjust]] for details. This is particularly
|
|
useful when using filesystems like FAT, and OS's like Windows that don't
|
|
support symlinks. Indeed, `git-annex init` detects such filesystems and
|
|
automatically sets up a repository to use all unlocked files.
|
|
|
|
## finding unlocked files
|
|
|
|
While it's easy to see when a file is a git-annex symlink, unlocked files
|
|
look the same as files stored in git. To see what files are unlocked or
|
|
locked, many git-annex commands support `--unlocked` and `--locked`
|
|
options.
|
|
|
|
git annex find --unlocked
|
|
|
|
## imperfections
|
|
|
|
Unlocked files mostly work very well, but there are a
|
|
few imperfections which you should be aware of when using them.
|
|
|
|
1. `git stash`, `git cherry-pick` and `git reset --hard` don't update
|
|
the working tree with the content of unlocked files. The files
|
|
will contain pointers, the same as if the content was not in the
|
|
repository. So after running these commands, you will need to manually
|
|
run `git annex smudge --update`.
|
|
|
|
2. When git-annex is running a command that gets or drops the content
|
|
of an unlocked file, git's index will briefly be locked, which might
|
|
prevent you from running a `git commit` at the same time.
|
|
|
|
3. Conversely, if you have a git commit in progress, running git-annex may
|
|
complain that the index is locked, though this will not prevent it from
|
|
working.
|
|
|
|
4. When an operation such as a checkout or merge needs to update a large
|
|
number of unlocked files, it can become slow. So can be `git add` of
|
|
a large number of files (`git annex add` is faster).
|
|
|
|
(The technical reasons behind these imperfections are explained in
|
|
detail in [[todo/git_smudge_clean_interface_suboptiomal]].)
|
|
|
|
## using less disk space
|
|
|
|
Unlocked files are handy, but they have one significant disadvantage
|
|
compared with locked files: They use more disk space.
|
|
|
|
While only one copy of a locked file has to be stored, often
|
|
two copies of an unlocked file are stored on disk. One copy is in
|
|
the git work tree, where you can use and modify it,
|
|
and the other is stashed away in `.git/annex/objects` (see [[internals]]).
|
|
|
|
The reason for that second copy is to preserve the old version of the file,
|
|
when you modify the unlocked file in the work tree. Being able to access
|
|
old versions of files is an important part of git after all!
|
|
|
|
That's a good safe default. But there are ways to use git-annex that
|
|
make the second copy not be worth keeping:
|
|
|
|
* When you're using git-annex to sync the current version of files across
|
|
devices, and don't care much about previous versions.
|
|
* When you have set up a backup repository, and use git-annex to copy
|
|
your files to the backup.
|
|
|
|
In situations like these, you may want to avoid the overhead of the second
|
|
local copy of unlocked files. There's a config setting for that.
|
|
|
|
[[!template id=note text="""
|
|
Note that setting annex.thin only has any effect on systems that support
|
|
hard links. It is supported on Windows, but not on FAT filesystems.
|
|
"""]]
|
|
|
|
git config annex.thin true
|
|
|
|
After changing annex.thin, you'll want to fix up the work tree to
|
|
match the new setting:
|
|
|
|
git annex fix
|
|
|
|
[[!template id=note text="""
|
|
When a [[direct_mode]] repository is upgraded, annex.thin is automatically
|
|
set, because direct mode made the same single-copy tradeoff.
|
|
"""]]
|
|
|
|
Setting annex.thin can save a lot of disk space, but it's a tradeoff
|
|
between disk usage and safety.
|
|
|
|
Keeping files locked is safer and also avoids using unnecessary
|
|
disk space, but trades off easy modification of files.
|
|
|
|
Pick the tradeoff that's right for you.
|