some thoughts for madduck

This commit is contained in:
Joey Hess 2013-11-04 14:34:34 -04:00
parent 58db042033
commit 70f3f22d8c
2 changed files with 45 additions and 1 deletions

View file

@ -16,7 +16,7 @@ Each subdirectory has the [[name_of_a_key|key_format]] in one of the
[[key-value_backends|backends]]. The file inside also has the name of the key.
This two-level structure is used because it allows the write bit to be removed
from the subdirectories as well as from the files. That prevents accidentially
deleting or changing the file contents.
deleting or changing the file contents. See [[permissions]] for details.
In [[direct_mode]], file contents are not stored in here, and instead
are stored directly in the file. However, the same symlinks are still

View file

@ -0,0 +1,44 @@
Object files stored in `.git/annex/objects` are each put in their own directory.
This allows the write bit to be removed from both the file, and its directory,
which prevents accidentially deleting or changing the file contents.
The reasoning for doing this follows:
Normally with git, once you have committed a file, editing the file in the
working tree cannot cause you to lose the committed version. This is an
important property of git. Of course you can `rm -rf .git` and delete
commits if you like (before you've pushed them). But you can't lose a
committed version of the file because of something you do with the working
tree version.
It's easy for git to do this, because committing a file makes a copy of it.
But git-annex does not make a local copy of a file added to it, because
the file could be very large.
So, it's important for git-annex to find another way to preserve the expected
property that once committed, you cannot accidentially lose a file.
The most important protection it makes is just to remove the write bit of
the file. Thus preventing programs from modifying it.
But, that does not prevent any program that might follow the symlink and
delete the symlinked file. This might seem an unlikely thing for a program to
do at first, but consider a command like:
`tar cf foo.tar foo --remove-files --dereference`
When I tested this, I didn't know if it would remove the file foo symlinked
to or not! It turned out that my tar doesn't remove it. But it could
have easily went the other way.
Rather than needing to worry about every possible program that might
decide to do something like this, git-annex removes the write bit from the
directory containing the annexed object, as well as removing the write
bit from the file. (The only bad consequence of this is that `rm -rf .git`
doesn't work unless you first run `chmod -R +w .git`)
----
It's known that this lockdown mechanism is incomplete. The worst hole in
it is that if you explicitly run `chmod +w` on an annexed file in the working
tree, this follows the symlink and allows writing to the file. It would be
better to make the files fully immutable. But most systems either don't
support immutable attributes, or only let root make files immutable.