require the side lock be held to take pidlock

This is less portable, since currently sidelocks rely on /dev/shm. But, I've seen crazy lustre inconsistencies that make me not trust the link() method at all, so what can you do.
2015-11-13 14:44:53 -04:00 · 2015-11-13 14:44:53 -04:00 · 60a9c7f5c6
commit 60a9c7f5c6
parent 85345abe8b
2 changed files with 68 additions and 15 deletions
--- a/Utility/LockFile/PidLock.hs
+++ b/Utility/LockFile/PidLock.hs
@ -38,7 +38,9 @@ import Data.Hash.MD5

 type LockFile = FilePath

-data LockHandle = LockHandle FilePath Fd (Maybe Posix.LockHandle)
+data LockHandle = LockHandle FilePath Fd SideLockHandle
+
+type SideLockHandle = Maybe Posix.LockHandle

 data PidLock = PidLock
 	{ lockingPid :: ProcessID
@ -58,7 +60,7 @@ readPidLock lockfile = (readish =<<) <$> catchMaybeIO (readFile lockfile)
 -- This is a regular posix exclusive lock. The side lock is put in
 -- /dev/shm. This will work on most any Linux system, even if its whole
 -- root filesystem doesn't support posix locks.
-trySideLock :: LockFile -> (Maybe Posix.LockHandle -> IO a) -> IO a
+trySideLock :: LockFile -> (SideLockHandle -> IO a) -> IO a
 trySideLock lockfile a = do
 	sidelock <- sideLockFile lockfile
 	mlck <- catchDefaultIO Nothing $ 
@ -82,8 +84,6 @@ sideLockFile lockfile = do

 -- | Tries to take a lock; does not block when the lock is already held.
 --
-- The method used is atomic even on NFS without needing O_EXCL support.
--
 -- Note that stale locks are automatically detected and broken.
 -- However, if the lock file is on a networked file system, and was
 -- created on a different host than the current host (determined by hostname),
@ -99,7 +99,7 @@ tryLock lockfile = trySideLock lockfile $ \sidelock -> do
 		nukeFile tmp
 		return Nothing
 	let tooklock = return $ Just $ LockHandle lockfile fd sidelock
-	ifM (linkToLock tmp lockfile)
+	ifM (linkToLock sidelock tmp lockfile)
 		( do
 			nukeFile tmp
 			tooklock
@ -118,16 +118,27 @@ tryLock lockfile = trySideLock lockfile $ \sidelock -> do
 				_ -> failedlock
 		)

-- Linux man pages recommend linking a pid lock into place,
+-- Linux's open(2) man page recommends linking a pid lock into place,
 -- as the most portable atomic operation that will fail if
-- it already exists. However, on some network filesystems,
-- link will return success sometimes despite having failed,
-- so we have to stat both files to check if it actually worked.
-linkToLock :: FilePath -> FilePath -> IO Bool
-linkToLock src dest = ifM (isJust <$> catchMaybeIO (createLink src dest))
-	( catchDefaultIO False checklink
-	, return False
-	)
+-- it already exists. 
+--
+-- open(2) suggests that link can sometimes appear to fail
+-- on NFS but have actually succeeded, and the way to find out is to stat
+-- the file and check its link count etc.
+--
+-- On a Lustre filesystem, link has been observed to incorrectly *succeed*,
+-- despite the dest already existing. A subsequent stat of the dest
+-- looked like it had been replaced with the src. The process proceeded to
+-- run and then deleted the dest, and after the process was done, the
+-- original file was observed to still be in place. This is horrible and we
+-- can't do anything about such a lying filesystem.
+-- At least the side lock file will prevent git-annex's running on the same
+-- host from running concurrently even on such a lying filesystem.
+linkToLock :: SideLockHandle -> FilePath -> FilePath -> IO Bool
+linkToLock Nothing _ _ = return False
+linkToLock (Just _) src dest = do
+	_ <- tryIO $ createLink src dest
+	catchDefaultIO False checklink
  where
 	checklink = do
 		x <- getSymbolicLinkStatus src
@ -136,13 +147,14 @@ linkToLock src dest = ifM (isJust <$> catchMaybeIO (createLink src dest))
 			[ deviceID x == deviceID y
 			, fileID x == fileID y
 			, fileMode x == fileMode y
-			, linkCount x == linkCount y
 			, fileOwner x == fileOwner y
 			, fileGroup x == fileGroup y
 			, specialDeviceID x == specialDeviceID y
 			, fileSize x == fileSize y
 			, modificationTime x == modificationTime y
 			, isRegularFile x == isRegularFile y
+			, linkCount x == linkCount y
+			, linkCount x == 2
 			]

 -- | Waits as necessary to take a lock.
--- a/doc/bugs/git-annex_doesn39t_work_on_lustre:_waitToSetLock:_unsupported_operation_40Function_not_implemented41/comment_13_5a1191042b32a85b4299cff3004d29de._comment
+++ b/doc/bugs/git-annex_doesn39t_work_on_lustre:_waitToSetLock:_unsupported_operation_40Function_not_implemented41/comment_13_5a1191042b32a85b4299cff3004d29de._comment
@ -0,0 +1,41 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 13"""
+ date="2015-11-13T18:23:37Z"
+ content="""
+While testing a git-annex that used pid locks, on the Lustre 
+system I've been given access to, I observed something most
+strange:
+
+	link(".git/annex/locktmp12011", ".git/annex/pidlock") = 0
+	lstat64(".git/annex/locktmp12011", {st_mode=S_IFREG|0444, st_size=70, ...}) = 0
+	lstat64(".git/annex/pidlock", {st_mode=S_IFREG|0444, st_size=70, ...}) = 0
+	...
+	unlink(".git/annex/pidlock")            = 0
+
+Seeing that strace, it would make sense that the pidlock file didn't exist,
+since a hard link was successfully made by that name, and link() never,
+ever overwrites an existing file. The stats of the 2 files are of course
+identical since they're hard links. And, since the pidlock is unlinked at
+the end, we'd expect the file to be gone then.
+
+But, none of that has anything to do with the reality. Which was:
+The pidlock file already existed, with size=72, and had existed for some
+hours at the point the strace begins. The link didn't replace it
+at all, and the unlink didn't delete it. When the program exited,
+the pidlock file still existed, with contents unaltered.
+
+All I can guess is happening is that different processes on a Lustre
+filesystem, running on the same host, somehow see inconsistent realities.
+
+I do think that, despite this being completely insane, the locking will
+actually work ok, when all git-annex processes in a given repo on Lustre
+are running *on the same computer*. That because git-annex actually will
+drop a proper lock into a proper filesystem (/dev/shm), and so avoid all
+this Lustre nonsense.
+
+But in general, I can make no warantee express or implied as to the
+suitability of Lustre as a platform to use git-annex. If it's this
+inconsistent, and modifications made to files are somehow silently rolled
+back, anything could happen.
+"""]]