document annex.thin risk to locked files pointing at same content

Joey Hess 2023-06-21 15:39:15 -04:00
parent 928b2a4839
commit 6c84aabe63
GPG key ID: DB12DB0FF05F8F38
3 changed files with 38 additions and 1 deletions

@@ -7,3 +7,8 @@ Also, with `annex.thin`, the [[invariant|internals]] that `.git/annex/objects/aa
[[`git-annex-drop`|git-annex-drop]] succeeds but does not actually drop the file.
Also, even if the current repo is trusted, with `annex.thin`, an unlocked file should not count as a trusted copy.
[[!tag confirmed]]
> documentation updated to mention this unfixable wart. Ugh. [[done]]
> --[[Joey]]

@@ -0,0 +1,29 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2023-06-21T19:28:52Z"
content="""
Confirmed this is still a problem. `git-annex fsck` does detect and deal
with it, by eg deleting the corrupted object.

It seems like it would be hard for git-annex to make other files that use
the same object be unlocked. Consider a repo with one file, that is
unlocked and uses an object. Then a `git merge` adds another, locked file,
using the same object. git-annex didn't have a chance to run at all, and
now the stage is set for this problem to happen if the user appends to the
unlocked file.

In a way, the docs for annex.thin do warn the user about this. If you squint
just right:

    ... but when a modification is made to a
    file, you will lose the local (and possibly only) copy of the
    old version

But, git-annex goes out of its way to avoid 2 unlocked files being hardlinked
when using annex.thin. So it seems wrong that a locked file and an unlocked file
will be hard linked, and that the locked file can get corrupted.

Ok, I've made the docs warn about it, and I think that is just the best that
can be done. The only real fix would be to remove annex.thin.
"""]]
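
The failure mode described in the comment can be reproduced without git-annex at all, since it follows from how hard links behave. The sketch below is only an illustration under stated assumptions: the content-addressed "object" file, the name `unlocked.txt`, and the function `simulate_thin_corruption` are hypothetical stand-ins, not git-annex's real layout or API. It needs a filesystem that supports hard links.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def simulate_thin_corruption(workdir):
    """Hard-link an 'object' to an 'unlocked' name, append to the latter,
    and return (checksum the object was stored under, checksum it has now)."""
    content = b"original content\n"
    key = hashlib.sha256(content).hexdigest()

    # Hypothetical stand-in for an annex object, named after its checksum.
    # A locked file would be a symlink that points at this path.
    obj = os.path.join(workdir, key)
    with open(obj, "wb") as f:
        f.write(content)

    # Stand-in for an unlocked file under annex.thin: a hard link, not a copy.
    unlocked = os.path.join(workdir, "unlocked.txt")
    os.link(obj, unlocked)

    # The user appends to the unlocked file; both names share one inode,
    # so the object changes too.
    with open(unlocked, "ab") as f:
        f.write(b"appended line\n")

    return key, sha256_of(obj)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        expected, actual = simulate_thin_corruption(d)
        print("object intact:", expected == actual)
```

Because the append goes through the shared inode, the object no longer matches the checksum it is named after, which is the kind of mismatch a checksum verification pass would flag.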

@@ -1156,7 +1156,10 @@ repository, using [[git-annex-config]]. See its man page for a list.)
Set this to `true` to make unlocked files be a hard link to their content
in the annex, rather than a second copy. This can save considerable
disk space, but when a modification is made to a file, you will lose the
-local (and possibly only) copy of the old version. So, enable with care.
+local (and possibly only) copy of the old version. Any other, locked
+files in the repository that pointed to that content will get broken
+as well (`git-annex fsck` will detect and clean up after that).
+So, enable this with care.
After setting (or unsetting) this, you should run `git annex fix` to
fix up the annexed files in the work tree to be hard links (or copies).
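
To make the "hard link rather than a second copy" distinction concrete, here is a minimal sketch that contrasts the two at the filesystem level. The file names and the helper `compare_copy_and_link` are hypothetical and chosen only for illustration; the sketch does not reflect how git-annex itself creates or inspects these files.

```python
import os
import shutil
import tempfile


def compare_copy_and_link(workdir):
    """Create one file, a copy of it, and a hard link to it, then report
    which names share storage with the original."""
    original = os.path.join(workdir, "object")
    with open(original, "wb") as f:
        f.write(b"some annexed content\n")

    copied = os.path.join(workdir, "copy.txt")
    shutil.copy(original, copied)   # second copy: separate inode, extra disk space

    linked = os.path.join(workdir, "thin.txt")
    os.link(original, linked)       # hard link: same inode, no extra disk space

    return {
        "copy_shares_inode": os.path.samefile(original, copied),
        "link_shares_inode": os.path.samefile(original, linked),
        # Number of names that refer to the original's inode.
        "original_link_count": os.stat(original).st_nlink,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, value in compare_copy_and_link(d).items():
            print(name, value)
```

A link count above 1 on a file is a quick hint that some other name shares its storage, so an in-place modification through any of those names affects all of them.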