2015-05-18 18:16:49 +00:00
|
|
|
{- STM implementation of lock pools.
|
|
|
|
-
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
- Copyright 2015-2021 Joey Hess <id@joeyh.name>
|
2015-05-18 18:16:49 +00:00
|
|
|
-
|
|
|
|
- License: BSD-2-clause
|
|
|
|
-}
|
|
|
|
|
|
|
|
module Utility.LockPool.STM (
|
|
|
|
LockPool,
|
|
|
|
lockPool,
|
|
|
|
LockFile,
|
|
|
|
LockMode(..),
|
|
|
|
LockHandle,
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
FirstLock(..),
|
|
|
|
FirstLockSemVal(..),
|
2015-05-18 18:16:49 +00:00
|
|
|
waitTakeLock,
|
|
|
|
tryTakeLock,
|
|
|
|
getLockStatus,
|
|
|
|
releaseLock,
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
CloseLockFile,
|
|
|
|
registerCloseLockFile,
|
2021-12-03 21:20:21 +00:00
|
|
|
registerPostReleaseLock,
|
2015-05-18 18:16:49 +00:00
|
|
|
) where
|
|
|
|
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
import Utility.Monad
|
|
|
|
|
2015-05-18 18:16:49 +00:00
|
|
|
import System.IO.Unsafe (unsafePerformIO)
|
2020-10-29 14:33:12 +00:00
|
|
|
import System.FilePath.ByteString (RawFilePath)
|
2015-05-18 18:16:49 +00:00
|
|
|
import qualified Data.Map.Strict as M
|
|
|
|
import Control.Concurrent.STM
|
|
|
|
import Control.Exception
|
|
|
|
|
2020-10-29 14:33:12 +00:00
|
|
|
type LockFile = RawFilePath
|
2015-05-18 18:16:49 +00:00
|
|
|
|
|
|
|
data LockMode = LockExclusive | LockShared
|
|
|
|
deriving (Eq)
|
|
|
|
|
|
|
|
-- This TMVar is full when the handle is open, and is emptied when it's
|
|
|
|
-- closed.
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
type LockHandle = TMVar (LockPool, LockFile, CloseLockFile)
|
2015-05-18 18:16:49 +00:00
|
|
|
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
-- When a shared lock is taken, this will only be true for the first
|
|
|
|
-- process, not subsequent processes. The first process should
|
|
|
|
-- fill the FirstLockSem after doing any IO actions to finish lock setup
|
|
|
|
-- and subsequent processes can block on that getting filled to know
|
|
|
|
-- when the lock is fully set up.
|
|
|
|
data FirstLock = FirstLock Bool FirstLockSem
|
|
|
|
|
|
|
|
type FirstLockSem = TMVar FirstLockSemVal
|
|
|
|
|
|
|
|
data FirstLockSemVal = FirstLockSemWaited Bool | FirstLockSemTried Bool
|
|
|
|
|
2015-05-18 18:16:49 +00:00
|
|
|
type LockCount = Integer
|
|
|
|
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
-- Action that closes the underlying lock file. When this is used
|
|
|
|
-- in a LockHandle, it closes a resource that is specific to that
|
|
|
|
-- LockHandle (such as eg a file handle), but does not release
|
|
|
|
-- any other shared locks. When this is used in a LockStatus,
|
|
|
|
-- it closes a resource that should only be closed when there are no
|
|
|
|
-- other shared locks.
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
type CloseLockFile = IO ()
|
2015-05-18 18:16:49 +00:00
|
|
|
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
data LockStatus = LockStatus LockMode LockCount FirstLockSem CloseLockFile
|
2021-12-03 21:20:21 +00:00
|
|
|
|
2015-05-18 18:16:49 +00:00
|
|
|
-- This TMVar is normally kept full.
|
|
|
|
type LockPool = TMVar (M.Map LockFile LockStatus)
|
|
|
|
|
|
|
|
-- A shared global variable for the lockPool. Avoids callers needing to
|
|
|
|
-- maintain state for this implementation detail.
|
2017-05-25 20:02:17 +00:00
|
|
|
{-# NOINLINE lockPool #-}
|
2015-05-18 18:16:49 +00:00
|
|
|
lockPool :: LockPool
|
|
|
|
lockPool = unsafePerformIO (newTMVarIO M.empty)
|
|
|
|
|
|
|
|
-- Updates the LockPool, blocking as necessary if another thread is holding
|
|
|
|
-- a conflicting lock.
|
|
|
|
--
|
|
|
|
-- Note that when a shared lock is held, an exclusive lock will block.
|
|
|
|
-- While that blocking is happening, another call to this function to take
|
|
|
|
-- the same shared lock should not be blocked on the exclusive lock.
|
|
|
|
-- Keeping the whole Map in a TMVar accomplishes this, at the expense of
|
|
|
|
-- sometimes retrying after unrelated changes in the map.
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
waitTakeLock :: LockPool -> LockFile -> LockMode -> STM (LockHandle, FirstLock)
|
2017-05-25 20:02:17 +00:00
|
|
|
waitTakeLock pool file mode = maybe retry return =<< tryTakeLock pool file mode
|
2015-05-18 18:16:49 +00:00
|
|
|
|
|
|
|
-- Avoids blocking if another thread is holding a conflicting lock.
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
tryTakeLock :: LockPool -> LockFile -> LockMode -> STM (Maybe (LockHandle, FirstLock))
|
2017-05-25 20:02:17 +00:00
|
|
|
tryTakeLock pool file mode = do
|
|
|
|
m <- takeTMVar pool
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
let success firstlock v = do
|
2017-05-25 20:02:17 +00:00
|
|
|
putTMVar pool (M.insert file v m)
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
tmv <- newTMVar (pool, file, noop)
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
return (Just (tmv, firstlock))
|
2017-05-25 20:02:17 +00:00
|
|
|
case M.lookup file m of
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
Just (LockStatus mode' n firstlocksem postreleaselock)
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
| mode == LockShared && mode' == LockShared -> do
|
|
|
|
fl@(FirstLock _ firstlocksem') <- if n == 0
|
|
|
|
then FirstLock True <$> newEmptyTMVar
|
|
|
|
else pure (FirstLock False firstlocksem)
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
success fl $ LockStatus mode (succ n) firstlocksem' postreleaselock
|
2017-05-25 20:02:17 +00:00
|
|
|
| n > 0 -> do
|
|
|
|
putTMVar pool m
|
|
|
|
return Nothing
|
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.
The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.
Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.
The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.
Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.
Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.
Sponsored-by: Dartmouth College's Datalad project
2021-12-01 19:22:31 +00:00
|
|
|
_ -> do
|
|
|
|
firstlocksem <- newEmptyTMVar
|
2021-12-03 21:20:21 +00:00
|
|
|
success (FirstLock True firstlocksem) $
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
LockStatus mode 1 firstlocksem noop
|
2015-05-18 18:16:49 +00:00
|
|
|
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
-- Call after waitTakeLock or tryTakeLock, to register a CloseLockFile
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
-- action to run when releasing the lock. This action should only
|
|
|
|
-- close the lock file associated with the LockHandle, while
|
|
|
|
-- leaving any other shared locks of the same file open.
|
2020-07-21 19:30:47 +00:00
|
|
|
registerCloseLockFile :: LockHandle -> CloseLockFile -> STM ()
|
|
|
|
registerCloseLockFile h closelockfile = do
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
(p, f, c) <- takeTMVar h
|
|
|
|
putTMVar h (p, f, c >> closelockfile)
|
2021-12-03 21:20:21 +00:00
|
|
|
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
-- Register an action that should be run only once a lock has been
|
|
|
|
-- released. When there are multiple shared locks of the same file,
|
|
|
|
-- the action will only be run after all are released.
|
|
|
|
registerPostReleaseLock :: LockHandle -> CloseLockFile -> STM ()
|
2021-12-03 21:20:21 +00:00
|
|
|
registerPostReleaseLock h postreleaselock = do
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
(p, f, _) <- readTMVar h
|
|
|
|
m <- takeTMVar p
|
|
|
|
case M.lookup f m of
|
|
|
|
Nothing -> putTMVar p m
|
|
|
|
Just (LockStatus mode cnt firstlocksem c) -> do
|
|
|
|
let c' = c >> postreleaselock
|
|
|
|
putTMVar p $ M.insert f (LockStatus mode cnt firstlocksem c') m
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
|
2015-05-18 18:16:49 +00:00
|
|
|
-- Checks if a lock is being held. If it's held by the current process,
|
|
|
|
-- runs the getdefault action; otherwise runs the checker action.
|
|
|
|
--
|
|
|
|
-- Note that the lock pool is left empty while the checker action is run.
|
|
|
|
-- This allows checker actions that open/close files, and so would be in
|
2015-05-18 20:23:07 +00:00
|
|
|
-- danger of conflicting with locks created at the same time this is
|
|
|
|
-- running. With the lock pool empty, anything that attempts
|
|
|
|
-- to take a lock will block, avoiding that race.
|
2015-05-20 03:35:24 +00:00
|
|
|
getLockStatus :: LockPool -> LockFile -> IO v -> IO v -> IO v
|
2015-05-18 18:16:49 +00:00
|
|
|
getLockStatus pool file getdefault checker = do
|
|
|
|
v <- atomically $ do
|
|
|
|
m <- takeTMVar pool
|
|
|
|
let threadlocked = case M.lookup file m of
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
Just (LockStatus _ n _ _) | n > 0 -> True
|
2015-05-18 18:16:49 +00:00
|
|
|
_ -> False
|
|
|
|
if threadlocked
|
|
|
|
then do
|
|
|
|
putTMVar pool m
|
|
|
|
return Nothing
|
|
|
|
else return $ Just $ atomically $ putTMVar pool m
|
|
|
|
case v of
|
2015-05-20 03:35:24 +00:00
|
|
|
Nothing -> getdefault
|
2015-05-18 18:16:49 +00:00
|
|
|
Just restore -> bracket_ (return ()) restore checker
|
|
|
|
|
2021-12-06 16:48:37 +00:00
|
|
|
-- Releases the lock. When it is a shared lock, it may remain locked by
|
|
|
|
-- other LockHandles.
|
2015-05-18 18:16:49 +00:00
|
|
|
--
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
-- Note that the lock pool is left empty while the CloseLockFile action
|
2015-05-18 18:16:49 +00:00
|
|
|
-- is run, to avoid race with another thread trying to open the same lock
|
2021-12-03 21:20:21 +00:00
|
|
|
-- file. However, the pool is full again when the PostReleaseLock action
|
|
|
|
-- runs.
|
Fix shared lock file FD leak.
This fixes behavior in this situation:
l1 <- lockShared Nothing "lck"
l2 <- lockShared Nothing "lck"
dropLock l1
dropLock l2
Before, the lock was dropped upon the second dropLock call, but the fd
remained open, and would never be closed while the program was running.
Fixed by a rather round-about method, but it should work well enough.
It would have been simpler to open open the shared lock once, and not open
it again in the second call to lockShared. But, that's difficult to do
atomically.
This also affects Windows and PID locks, not just posix locks.
In the case of pid locks, multiple calls to waitLock within the same
process are allowed because the side lock is locked using a posix lock,
and so multiple exclusive locks can be taken in the same process. So,
this change fixes a similar problem with pid locks.
l1 <- waitLock (Seconds 1) "lck"
l2 <- waitLock (Seconds 1) "lck"
dropLock l1
dropLock l2
Here the l2 side lock fd remained open but not locked,
although the pid lock file was removed. After this change, the second
dropLock will close both fds to the side lock, and delete the pidlock.
2016-03-01 19:31:39 +00:00
|
|
|
releaseLock :: LockHandle -> IO ()
|
|
|
|
releaseLock h = go =<< atomically (tryTakeTMVar h)
|
2015-05-18 18:16:49 +00:00
|
|
|
where
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
go (Just (pool, file, closelockfile)) = do
|
|
|
|
(m, postreleaselock) <- atomically $ do
|
2015-05-18 18:16:49 +00:00
|
|
|
m <- takeTMVar pool
|
|
|
|
return $ case M.lookup file m of
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
Just (LockStatus mode n firstlocksem postreleaselock)
|
|
|
|
| n == 1 -> (M.delete file m, postreleaselock)
|
2015-05-18 18:16:49 +00:00
|
|
|
| otherwise ->
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
(M.insert file (LockStatus mode (pred n) firstlocksem postreleaselock) m, noop)
|
|
|
|
Nothing -> (m, noop)
|
2021-12-06 16:48:37 +00:00
|
|
|
() <- closelockfile
|
2015-05-18 18:16:49 +00:00
|
|
|
atomically $ putTMVar pool m
|
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.
registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.
Sponsored-by: Dartmouth College's Datalad project
2021-12-06 19:01:39 +00:00
|
|
|
-- This action may access the pool, so run it only
|
|
|
|
-- after the pool is restored.
|
2021-12-06 17:00:40 +00:00
|
|
|
postreleaselock
|
2015-05-18 18:16:49 +00:00
|
|
|
-- The LockHandle was already closed.
|
|
|
|
go Nothing = return ()
|