avoid creating content directory when locking content

If the content directory does not exist, then it does not make sense to
lock the content file, as it also does not exist, and so it's ok for the
lock operation to fail.

This avoids potential races where the content file exists but is then
deleted/renamed, while another process sees that it exists and goes to
lock it, resulting in a dangling lock file in an otherwise empty object
directory.

Also renamed modifyContent to modifyContentDir, since it is used to
modify not only content files but also other files in the content
directory.

Sponsored-by: Dartmouth College's Datalad project
Joey Hess 2022-05-16 12:34:56 -04:00
parent b6c7819803
commit 5a98f2d509
7 changed files with 52 additions and 16 deletions

@@ -0,0 +1,26 @@
[[!comment format=mdwn
username="joey"
subject="""comment 6"""
date="2022-05-16T15:24:43Z"
content="""
If fsck locks the content for removal, then moves it to the preferred
location, how is that any different from git-annex first dropping content
and then very quickly retrieving another copy and storing it in the other
location? The only difference is timing, but things like being suspended
and resumed can affect timing.

So, if there is a problem with fsck doing that, there would also be a more
general problem, that could occur in other circumstances, even if only
rarely.

One way to see the general problem happen would be to have two processes
trying to drop the same object. One process finds the object location, then
stalls. Meanwhile, the second process drops the object. Then the first
process resumes, and locks for removal. Per comment #5 this will result in
a dangling lock file in the object directory. I have not managed to get
this to happen yet though.

A fix for the general problem is to make it not create the
object directory when opening the object lock file. So I've made that
change.
"""]]
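
The race and the fix described in the comment can be illustrated with a small sketch. This is not git-annex's Haskell implementation; it is a Python approximation for Unix systems, and `lock_content`, `create_dir`, `demo`, and the `content.lck` file name are all hypothetical names chosen for the example. It assumes the lock is an ordinary file opened inside the object directory and locked with `flock`.

```python
import fcntl
import os
import tempfile


def lock_content(lock_file, create_dir=False):
    """Try to take an exclusive lock on lock_file.

    Returns a file descriptor holding the lock, or None when the
    containing directory does not exist (nothing there to lock).
    create_dir=True mimics the assumed old behaviour for comparison.
    """
    if create_dir:
        # Assumed old behaviour: recreate the object directory first.
        os.makedirs(os.path.dirname(lock_file), exist_ok=True)
    try:
        fd = os.open(lock_file, os.O_RDWR | os.O_CREAT, 0o644)
    except FileNotFoundError:
        # The object directory is gone, so the content is gone too.
        # Let the lock operation fail instead of creating anything.
        return None
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


def demo():
    """Simulate a process that locks after another process dropped the object."""
    with tempfile.TemporaryDirectory() as top:
        objdir = os.path.join(top, "objdir")
        lock_file = os.path.join(objdir, "content.lck")
        # At this point the object directory has already been removed
        # by the other process, so objdir does not exist.

        fixed = lock_content(lock_file)
        dir_exists_after_fixed = os.path.exists(objdir)

        old = lock_content(lock_file, create_dir=True)
        leftovers_after_old = os.listdir(objdir)
        os.close(old)
        return fixed, dir_exists_after_fixed, leftovers_after_old


if __name__ == "__main__":
    print(demo())
```

With `create_dir=False` the lock attempt simply fails and nothing is left on disk, whereas the `create_dir=True` path leaves a lock file as the only entry in an otherwise empty object directory, which is the dangling lock file the comment describes.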