159 lines
6.4 KiB
Markdown
159 lines
6.4 KiB
Markdown
Normally, git-annex stores annexed files in the repository, locked down,
|
|
which prevents the content of the file from being modified.
|
|
That's a good thing, because it might be the only copy, you wouldn't
|
|
want to lose it in a fumblefingered mistake.
|
|
|
|
# git annex add some_file
|
|
add some_file
|
|
# echo oops > some_file
|
|
bash: some_file: Permission denied
|
|
|
|
Sometimes though you want to modify a file. Maybe once, or maybe
|
|
repeatedly. To support this, git-annex also supports unlocked files.
|
|
They are stored in the git repository differently, and they appear as
|
|
regular files in the working tree, instead of the symbolic links used for
|
|
locked files.
|
|
|
|
## using unlocked files
|
|
|
|
You can unlock any annexed file:
|
|
|
|
# git annex unlock my_cool_big_file
|
|
|
|
That changes what's stored in git between a git-annex symlink
|
|
(locked) and a git-annex pointer file (unlocked). You can commit
|
|
the change, if you want that file to be unlocked in other clones of the
|
|
repository. To lock the file again, use `git annex lock`.
|
|
|
|
The nice thing about an unlocked file is that you can modify it
|
|
in place -- it's a regular file. And you can commit your changes.
|
|
|
|
# echo more stuff >> my_cool_big_file
|
|
# git commit -a -m "some changes"
|
|
[master 196c0e2] some changes
|
|
1 files changed, 1 insertion(+), 1 deletion(-)
|
|
|
|
Notice that `git commit -a` added the new content of the file to the annex,
|
|
and only committed a change to the pointer. That happened because git-annex
|
|
knows this was an annexed file before. Git leaves the file unlocked, so
|
|
you can continue to make modifications to it.
|
|
|
|
By default, using git to add a file that has not been annexed before will
|
|
still add its contents to git, not to the annex. If you tell git-annex what
|
|
files are large, it will arrange for the large files to be added to the
|
|
annex, and the small ones to be added to git. This is done by configuring
|
|
annex.largefiles. See [[largefiles]] for full documentation of that.
|
|
|
|
All the regular git-annex commands (find, get, drop, etc) can be used on
|
|
unlocked files as well as locked files. When you drop the content of
|
|
an unlocked file, it will be replaced by a pointer file, which
|
|
looks like "/annex/objects/...". So if you open a file and see
|
|
that, you'll need to use `git annex get`.
|
|
|
|
Under the hood, unlocked files use git's [[todo/smudge]] filter interface,
|
|
and git-annex converts between the content of the big file and a pointer
|
|
file, which is what gets committed to git.
|
|
|
|
[[!template id=note text="""
|
|
By default, git-annex commands will add files in locked mode,
|
|
unless used on a filesystem that does not support symlinks, when unlocked
|
|
mode is used. To make them always use unlocked mode, run:
|
|
`git config annex.addunlocked true`
|
|
"""]]
|
|
|
|
## adjusted branches
|
|
|
|
If you want to mostly keep files locked, but be able to locally switch
|
|
to having them all unlocked, you can do so using `git annex adjust
|
|
--unlock`. See [[git-annex-adjust]] for details. This is particularly
|
|
useful when using filesystems like FAT, and OS's like Windows that don't
|
|
support symlinks. Indeed, `git-annex init` detects such filesystems and
|
|
automatically sets up a repository to use all unlocked files.
|
|
|
|
## finding unlocked files
|
|
|
|
While it's easy to see when a file is a git-annex symlink, unlocked files
|
|
look the same as files stored in git. To see what files are unlocked or
|
|
locked, many git-annex commands support `--unlocked` and `--locked`
|
|
options.
|
|
|
|
git annex find --unlocked
|
|
|
|
## imperfections
|
|
|
|
Unlocked files mostly work very well, but there are a
|
|
few imperfections which you should be aware of when using them.
|
|
|
|
1. `git stash`, `git cherry-pick` and `git reset --hard` don't update
|
|
the working tree with the content of unlocked files. The files
|
|
will contain pointers, the same as if the content was not in the
|
|
repository. So after running these commands, you will need to manually
|
|
run `git annex smudge --update`.
|
|
|
|
2. When git-annex is running a command that gets or drops the content
|
|
of an unlocked file, git's index will briefly be locked, which might
|
|
prevent you from running a `git commit` at the same time.
|
|
|
|
3. Conversely, if you have a git commit in progress, running git-annex may
|
|
complain that the index is locked, though this will not prevent it from
|
|
working.
|
|
|
|
4. When an operation such as a checkout or merge needs to update a large
|
|
number of unlocked files, it can become slow. So can be `git add` of
|
|
a large number of files (`git annex add` is faster).
|
|
|
|
(The technical reasons behind these imperfections are explained in
|
|
detail in [[todo/git_smudge_clean_interface_suboptiomal]].)
|
|
|
|
## using less disk space
|
|
|
|
Unlocked files are handy, but they have one significant disadvantage
|
|
compared with locked files: On most filesystems, they use more disk space.
|
|
|
|
While only one copy of a locked file has to be stored, often
|
|
two copies of an unlocked file are stored on disk. One copy is in
|
|
the git work tree, where you can use and modify it,
|
|
and the other is stashed away in `.git/annex/objects` (see [[internals]]).
|
|
|
|
The reason for that second copy is to preserve the old version of the file,
|
|
when you modify the unlocked file in the work tree. Being able to access
|
|
old versions of files is an important part of git after all!
|
|
|
|
(Some filesystems including btrfs and xfs support reflinks, and on those,
|
|
the extra copy is a reflink, and takes up no additional space.)
|
|
|
|
So two copies is a good safe default. But there are ways to use git-annex that
|
|
make the second copy not be worth keeping:
|
|
|
|
* When you're using git-annex to sync the current version of files across
|
|
devices, and don't care much about previous versions.
|
|
* When you have set up a backup repository, and use git-annex to copy
|
|
your files to the backup.
|
|
|
|
In situations like these, you may want to avoid the overhead of the second
|
|
local copy of unlocked files. There's a config setting for that.
|
|
|
|
[[!template id=note text="""
|
|
Note that setting annex.thin only has any effect on systems that support
|
|
hard links. It is supported on Windows, but not on FAT filesystems.
|
|
"""]]
|
|
|
|
git config annex.thin true
|
|
|
|
After changing annex.thin, you'll want to fix up the work tree to
|
|
match the new setting:
|
|
|
|
git annex fix
|
|
|
|
[[!template id=note text="""
|
|
When a [[direct_mode]] repository is upgraded, annex.thin is automatically
|
|
set, because direct mode made the same single-copy tradeoff.
|
|
"""]]
|
|
|
|
Setting annex.thin can save a lot of disk space, but it's a tradeoff
|
|
between disk usage and safety.
|
|
|
|
Keeping files locked is safer and also avoids using unnecessary
|
|
disk space, but trades off easy modification of files.
|
|
|
|
Pick the tradeoff that's right for you.
|