git-annex/doc/internals/lockdown.mdwn
Joey Hess 4b1b9d7a83
Added annex.freezecontent-command and annex.thawcontent-command configs
Freeze first sets the file perms, and then runs
freezecontent-command. Thaw runs thawcontent-command before
restoring file permissions. This is in case the freeze command
prevents changing file perms, as e.g. setting a file immutable does.
Also, changing file perms tends to mess up previously set ACLs.

git-annex init's probe for crippled filesystem uses them, so if file perms
don't work, but freezecontent-command manages to prevent write to a file,
it won't treat the filesystem as crippled.

When the filesystem has been probed as crippled, the hooks are not
used, because there seems to be no point then; git-annex won't be relying
on locking annex objects down. Also, this avoids them being run when the
file perms have not been changed, in case they somehow rely on
git-annex's setting of the file perms in order to work.

Sponsored-by: Dartmouth College's Datalad project
2021-06-21 14:40:52 -04:00


Object files stored in `.git/annex/objects` are each put in their own directory.
This allows the write bit to be removed from both the file and its directory,
which prevents accidentally deleting or changing the file contents.
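
A minimal sketch of the resulting permissions, using a scratch directory rather than a real repository. The names `objdir` and `content` are placeholders, not git-annex's actual object layout, and the explicit modes 444 and 555 are chosen only to make the illustration deterministic:

```shell
# Illustration only: a scratch directory stands in for .git/annex/objects.
tmp=$(mktemp -d)
mkdir -p "$tmp/objects/objdir"           # one directory per object (hypothetical name)
echo "annexed data" > "$tmp/objects/objdir/content"

chmod 444 "$tmp/objects/objdir/content"  # file: no write bit, so no in-place edits
chmod 555 "$tmp/objects/objdir"          # directory: no write bit, so no unlink or rename

# Keep the first 10 characters of `ls -ld` output (file type plus permission bits).
file_mode=$(ls -ld "$tmp/objects/objdir/content" | cut -c1-10)
dir_mode=$(ls -ld "$tmp/objects/objdir" | cut -c1-10)
echo "file: $file_mode"
echo "dir:  $dir_mode"

# Clean up: the write bits have to be restored before the tree can be removed.
chmod -R u+w "$tmp"
rm -rf "$tmp"
```

The cleanup step mirrors what is needed for a real repository: a locked-down tree cannot be removed until its write bits are put back.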

The reasoning for doing this follows:

Normally with git, once you have committed a file, editing the file in the
working tree cannot cause you to lose the committed version. This is an
important property of git. Of course you can `rm -rf .git` and delete
commits if you like (before you've pushed them). But you can't lose a
committed version of the file because of something you do with the working
tree version.

It's easy for git to do this, because committing a file makes a copy of it.
But git-annex does not make a local copy of a file added to it, because
the file could be very large.

So, it's important for git-annex to find another way to preserve the expected
property that once committed, you cannot accidentally lose a file.

The most important protection is simply removing the write bit from the
file, which prevents programs from modifying it.

But that does not stop a program from following the symlink and deleting
the file it points to. This might seem an unlikely thing for a program to
do at first, but consider a command like:

`tar cf foo.tar foo --remove-files --dereference`

When I tested this, I didn't know whether it would remove the file that foo
was symlinked to! It turned out that my tar doesn't remove it. But it could
easily have gone the other way.

Rather than needing to worry about every possible program that might
decide to do something like this, git-annex removes the write bit from the
directory containing the annexed object, as well as removing the write
bit from the file. (The only bad consequence of this is that `rm -rf .git`
doesn't work unless you first run `chmod -R +w .git`.)
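
Why the directory matters can be sketched as follows: unlinking a file is governed by write permission on its containing directory, not on the file itself. The paths here are made up for the demonstration, and a plain `rm` on the resolved symlink target stands in for any program that dereferences the symlink before deleting:

```shell
# Illustration only: scratch paths, not a real annex.
tmp=$(mktemp -d)
mkdir "$tmp/objdir"
echo "annexed data" > "$tmp/objdir/content"
ln -s "$tmp/objdir/content" "$tmp/foo"   # working-tree symlink to the object

chmod 444 "$tmp/objdir/content"          # write bit removed from the file only

# Stand-in for a program that dereferences the symlink and deletes the target.
# The directory is still writable, so this succeeds despite the file's mode.
target=$(readlink "$tmp/foo")
rm -f "$target"

if [ -e "$tmp/objdir/content" ]; then
    target_survived=yes
else
    target_survived=no
fi
echo "target survived with only the file locked: $target_survived"

rm -rf "$tmp"
```

With the directory's write bit removed as well, the same `rm` fails with a permission error for an unprivileged user. Root bypasses permission bits either way.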
----
It's known that this lockdown mechanism is incomplete. The worst hole in
it is that if you explicitly run `chmod +w` on an annexed file in the working
tree, this follows the symlink and allows writing to the file. It would be
better to make the files fully immutable. But most systems either don't
support immutable attributes, or only let root make files immutable.
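
The `chmod` hole can be reproduced in a scratch directory. This is only a sketch with made-up paths. It uses `chmod u+w` instead of a bare `+w` so that the result does not depend on the umask:

```shell
# Illustration only: scratch paths, not a real annex.
tmp=$(mktemp -d)
echo "annexed data" > "$tmp/content"
chmod 444 "$tmp/content"                 # locked-down object file
ln -s "$tmp/content" "$tmp/foo"          # working-tree symlink to it

before=$(ls -ld "$tmp/content" | cut -c1-10)
chmod u+w "$tmp/foo"                     # chmod acts on the symlink's target
after=$(ls -ld "$tmp/content" | cut -c1-10)

echo "object mode before: $before"
echo "object mode after:  $after"

rm -rf "$tmp"
```

The symlink itself is never changed. The mode change lands on the object file, which is what makes it writable again.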

The git configs `annex.freezecontent-command` and `annex.thawcontent-command`
can be used to run additional commands that further lock down, and later thaw,
the annex object and its directory.
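
As a rough sketch of what such a configuration might look like, the following uses the Linux immutable attribute. It rests on several assumptions that do not come from the text above: that `%path` in the command is replaced with the file or directory being operated on (check the git-annex man page for your version), that `chattr` is available on the filesystem in use, and that the user may run it through `sudo` without a password prompt:

```shell
# Assumption: git-annex substitutes %path with the object file or directory.
# Assumption: Linux with chattr, and sudo rights to change the immutable attribute.
git config annex.freezecontent-command 'sudo chattr +i %path'
git config annex.thawcontent-command 'sudo chattr -i %path'
```

The freeze command runs after git-annex has set the file permissions, and the thaw command runs before the permissions are restored. That ordering matters for a setup like this one, because the permissions of an immutable file cannot be changed until the attribute is cleared.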