Normally, git-annex stores annexed files in the repository, locked down,
which prevents the content of the file from being modified.
That's a good thing, because it might be the only copy, and you wouldn't
want to lose it in a fumblefingered mistake.

    # git annex add some_file
    add some_file
    # echo oops > some_file
    bash: some_file: Permission denied

Sometimes, though, you want to modify a file. Maybe once, or maybe
repeatedly. To allow this, git-annex also supports unlocked files.
They are stored in the git repository differently, and they appear as
regular files in the working tree, instead of the symbolic links used for
locked files.

## adding unlocked files

Instead of using `git annex add`, use `git add`, and the file will be
stored in git-annex, but left unlocked.
[[!template id=note text="""
Want `git add` to add some file contents to the annex, but store the contents of
smaller files in git itself? Configure annex.largefiles to match the former.
See [[largefiles]].
"""]]

    # cp ~/my_cool_big_file .
    # git add my_cool_big_file
    # git commit -m "added my_cool_big_file to the annex"
    [master (root-commit) 92f2725] added my_cool_big_file to the annex
     1 file changed, 1 insertion(+)
     create mode 100644 my_cool_big_file
    # git annex find
    my_cool_big_file

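To illustrate the note about `annex.largefiles`, here is a minimal sketch that sets it in a scratch repository. The `largerthan=100kb` expression and its threshold are assumptions chosen for illustration, not values from this page; see [[largefiles]] for the expressions that are supported.

```shell
# Sketch: work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Assumed example expression: match files larger than 100 kilobytes.
# Files that do not match would have their contents stored in git itself.
git config annex.largefiles 'largerthan=100kb'

# Read the setting back to confirm what is configured.
git config --get annex.largefiles
```

In a real repository only the `git config` line is needed; the scratch directory is there to keep the sketch harmless.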
You can make whatever modifications you want to unlocked files, and commit
your changes.

    # echo more stuff >> my_cool_big_file
    # git mv my_cool_big_file my_cool_bigger_file
    # git commit -a -m "some changes"
    [master 196c0e2] some changes
     2 files changed, 1 insertion(+), 1 deletion(-)
     delete mode 100644 my_cool_big_file
     create mode 100644 my_cool_bigger_file

Under the hood, this uses git's [[todo/smudge]] filter interface, and
git-annex converts between the content of the big file and a pointer file,
which is what gets committed to git. All the regular git-annex commands
(get, drop, etc) can be used on unlocked files too.
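A pointer file is a small text file whose content starts with `/annex/objects/` followed by the key of the annexed content. The following is a rough sketch of a check for that prefix; the helper name `is_annex_pointer` is hypothetical and not part of git-annex.

```shell
# Hypothetical helper: succeeds if the file's first 15 bytes are
# "/annex/objects/", the prefix of a git-annex pointer file.
is_annex_pointer() {
    # Read only the prefix so large files are not slurped; drop NUL
    # bytes so binary content does not trigger shell warnings.
    prefix=$(head -c 15 "$1" | tr -d '\000')
    [ "$prefix" = "/annex/objects/" ]
}
```

For example, `is_annex_pointer my_cool_bigger_file && echo "content not present"` should print the message only when the work tree file holds a pointer rather than the real content. To see what was committed for a file, `git cat-file -p HEAD:my_cool_bigger_file` can be used.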
[[!template id=note text="""
By default, git-annex commands will add files in locked mode,
unless used on a filesystem that does not support symlinks, when unlocked
mode is used. To make them always use unlocked mode, run:
`git config annex.addunlocked true`
"""]]

## mixing locked and unlocked files

A repository can contain both locked and unlocked files. You can switch
a file back and forth using the `git annex lock` and `git annex unlock`
commands. This changes what's stored in git between a git-annex symlink
(locked) and a git-annex pointer file (unlocked). To add a file to
the repository in locked mode, use `git annex add`; to add a file in
unlocked mode, use `git add`.
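Since a locked file is stored in git as a symlink and an unlocked file as a regular (pointer) file, the mode recorded in git's index tells the two apart. Here is a small sketch of that check; the helper name `index_kind` is hypothetical, and it reports only how git records the path, not whether the path is annexed at all.

```shell
# Hypothetical helper: report how git's index records a path.
# Mode 120000 is a symlink (how locked annexed files are stored);
# 100644 and 100755 are regular files (how unlocked pointer files
# are stored). Anything else, including untracked paths, is "other".
index_kind() {
    mode=$(git ls-files -s -- "$1" | cut -d' ' -f1)
    case "$mode" in
        120000) echo symlink ;;
        100644|100755) echo regular ;;
        *) echo other ;;
    esac
}
```

After `git annex lock` on a file, `index_kind` on that file should report `symlink`; after `git annex unlock`, it should report `regular`.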

If you want to mostly keep files locked, but be able to locally switch
to having them all unlocked, you can do so using `git annex adjust --unlock`.
See [[git-annex-adjust]] for details. This is particularly
useful when using filesystems like FAT, and OSes like Windows, that don't
support symlinks. Indeed, `git annex init` detects such filesystems and
automatically sets up a repository to use all unlocked files.

## imperfections

Unlocked files mostly work very well, but there are a
few imperfections which you should be aware of when using them.

1. `git stash`, `git cherry-pick` and `git reset --hard` don't update
   the working tree with the content of unlocked files. The files
   will contain pointers, the same as if the content was not in the
   repository. So after running these commands, you will need to manually
   run `git annex smudge --update`.

2. When git-annex is running a command that gets or drops the content
   of an unlocked file, git's index will briefly be locked, which might
   prevent you from running a `git commit` at the same time.

3. Conversely, if you have a git commit in progress, running git-annex may
   complain that the index is locked, though this will not prevent it from
   working.

4. When an operation such as a checkout or merge needs to update a large
   number of unlocked files, it can become slow. So can `git add` of
   a large number of files (`git annex add` is faster).

(The technical reasons behind these imperfections are explained in
detail in [[todo/git_smudge_clean_interface_suboptiomal]].)
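
The manual step from the first item can be folded into a small wrapper so it is not forgotten. This is only a sketch: the function name `annex_reset_hard` is hypothetical, and the same pattern would apply to `git stash` and `git cherry-pick`.

```shell
# Hypothetical wrapper: run `git reset --hard` with the given arguments,
# and only if that succeeds, ask git-annex to update the unlocked files
# in the working tree.
annex_reset_hard() {
    git reset --hard "$@" && git annex smudge --update
}
```

It is used like the plain command, for example `annex_reset_hard HEAD~1`.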

## using less disk space

Unlocked files are handy, but they have one significant disadvantage
compared with locked files: They use more disk space.

While only one copy of a locked file has to be stored, often
two copies of an unlocked file are stored on disk. One copy is in
the git work tree, where you can use and modify it,
and the other is stashed away in `.git/annex/objects` (see [[internals]]).

The reason for that second copy is to preserve the old version of the file
when you modify the unlocked file in the work tree. Being able to access
old versions of files is an important part of git after all!

That's a good safe default. But there are ways to use git-annex that
make the second copy not worth keeping:

* When you're using git-annex to sync the current version of files across
  devices, and don't care much about previous versions.
* When you have set up a backup repository, and use git-annex to copy
  your files to the backup.

In situations like these, you may want to avoid the overhead of the second
local copy of unlocked files. There's a config setting for that.
[[!template id=note text="""
Note that setting annex.thin only has any effect on systems that support
hard links. It is supported on Windows, but not on FAT filesystems.
"""]]

    git config annex.thin true

After changing annex.thin, you'll want to fix up the work tree to
match the new setting:

    git annex fix

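One rough way to check whether work tree files now share storage with the annex is to look for files with more than one hard link. This sketch assumes that annex.thin hard-links the work tree file to its object under `.git/annex/objects`; the helper name `list_hardlinked_files` is hypothetical, and it lists any multiply linked file, annexed or not.

```shell
# Hypothetical helper: list regular files under a directory (default:
# the current one) that have more than one hard link, without
# descending into .git itself.
list_hardlinked_files() {
    find "${1:-.}" -name .git -prune -o -type f -links +1 -print
}
```

Run from the top of the work tree, `list_hardlinked_files` should list the unlocked files that no longer need a second copy of their content.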
[[!template id=note text="""
When a [[direct_mode]] repository is upgraded, annex.thin is automatically
set, because direct mode made the same single-copy tradeoff.
"""]]

Setting annex.thin can save a lot of disk space, but it's a tradeoff
between disk usage and safety.

Keeping files locked is safer and also avoids using unnecessary
disk space, but trades off easy modification of files.

Pick the tradeoff that's right for you.