Commit graph

55 commits

Author SHA1 Message Date
Joey Hess
ed0afbc36b
avoid concurrent threads trying to take pid lock at same time
It seems there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, probably due
to linkToLock failing when a second thread overwrites the lock file.

The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions were failing
even though the process as a whole had the pid lock held.

Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.

The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.

Fixed by noticing when the shared lock is already held, and in that case not
calling PidLock.tryLock again, just using the pid lock that already exists.

Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.

Sponsored-by: Dartmouth College's Datalad project
2021-12-01 17:14:39 -04:00
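The behavior described in ed0afbc36b (only the first thread takes the real pid lock, only the last one drops it) can be sketched roughly as follows. This is a minimal illustration, not git-annex's code: `PoolState`, `withSharedPidLock`, `takeReal` and `dropReal` are hypothetical names, and the real lock pool is more involved.

```haskell
import Control.Concurrent.STM
import Control.Exception (bracket_, finally, onException)
import Control.Monad (when)

-- Hypothetical per-process state of the pid lock:
--   Free    nobody holds it
--   Busy    one thread is taking or dropping the real lock right now
--   Held n  n threads currently share it
data PoolState = Free | Busy | Held Int

-- Run an action while sharing the pid lock. 'takeReal' is called only by
-- the first sharer and 'dropReal' only by the last, so they never run
-- concurrently within one process.
withSharedPidLock :: TVar PoolState -> IO () -> IO () -> IO a -> IO a
withSharedPidLock st takeReal dropReal = bracket_ acquire release
  where
    acquire = do
      first <- atomically $ readTVar st >>= \s -> case s of
        Free   -> writeTVar st Busy >> return True
        Busy   -> retry  -- wait for the thread that is taking/dropping
        Held n -> writeTVar st (Held (n + 1)) >> return False
      when first $ do
        -- If taking the real lock fails, go back to Free and rethrow.
        takeReal `onException` atomically (writeTVar st Free)
        atomically (writeTVar st (Held 1))
    release = do
      lastOne <- atomically $ readTVar st >>= \s -> case s of
        Held 1 -> writeTVar st Busy >> return True
        Held n -> writeTVar st (Held (n - 1)) >> return False
        _      -> retry  -- releasing without holding is a caller bug
      when lastOne $
        dropReal `finally` atomically (writeTVar st Free)
```

The `Busy` state is what keeps a second thread from assuming the lock is held while the first thread is still in the middle of taking it.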
Joey Hess
a6699be79d
catch error statting pid lock file if it somehow does not exist
It ought to exist, since linkToLock has just created it. However,
Lustre seems to have a rather probabilistic view of the contents of a
directory, so catching the error if it somehow does not exist and
running the same code path that would be run if linkToLock failed
might avoid this fun Lustre failure.

Sponsored-by: Dartmouth College's Datalad project
2021-11-29 14:53:07 -04:00
Joey Hess
e853ef3095
decorate openTempFile errors with the template name
This is to track down what file in .git/annex/ is being written to via a
temp file when the repository is read-only.

Sponsored-by: Dartmouth College's Datalad project
2021-08-30 13:05:02 -04:00
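One way to decorate such errors, as in e853ef3095, is to append the template to the location field of the IOError. The sketch below uses only `System.IO.Error` from base; the wrapper name `openTempFileDecorated` is made up for illustration and the actual git-annex helper may differ.

```haskell
import System.IO (Handle, openTempFile)
import System.IO.Error (ioeGetLocation, ioeSetLocation, modifyIOError)

-- Hypothetical wrapper: behaves like openTempFile, but any IOError it
-- throws also names the template, so the message shows which temp file
-- was being created.
openTempFileDecorated :: FilePath -> String -> IO (FilePath, Handle)
openTempFileDecorated dir template =
  modifyIOError decorate (openTempFile dir template)
  where
    decorate e =
      ioeSetLocation e (ioeGetLocation e ++ " template " ++ template)
```

Changing the location rather than the description keeps the original error text from the operating system intact.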
Joey Hess
5e39b7eb8d
Windows: Work around win32 length limits when dealing with lock files 2021-01-13 14:38:35 -04:00
Joey Hess
804808d569
squash build warnings on windows 2020-11-23 14:00:17 -04:00
Joey Hess
ed7afabdb1
fix build on windows 2020-11-13 13:34:28 -04:00
Joey Hess
e505c03bcc
more RawFilePath conversion
nukeFile replaced with removeWhenExistsWith removeLink, which allows
using RawFilePath. Utility.Directory cannot use RawFilePath since setup
does not depend on posix.

This commit was sponsored by Graham Spencer on Patreon.
2020-10-29 10:50:29 -04:00
Joey Hess
08cbaee1f8
more RawFilePath conversion
Most of Git/ builds now.

A notable win is that toTopFilePath no longer double converts.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-10-28 15:55:30 -04:00
Joey Hess
b68f214312
Display a message when git-annex has to wait for a pid lock file held by another process 2020-08-26 13:05:34 -04:00
Joey Hess
7bdb0cdc0d
add gitAnnexChildProcess and use instead of incorrect use of runsGitAnnexChildProcess
Fixes regression in 8.20200617 that made annex.pidlock being enabled result
in some commands stalling, particularly those needing to autoinit.

Renamed runsGitAnnexChildProcess to make clearer where it should be
used.

Arguably, it would be better to have a way to make any process git-annex
runs have the env var set. But then it would need to take the pid lock
when running any and all processes, and that would be a problem when
git-annex runs two processes concurrently. So, I'm left doing it ad-hoc
in places where git-annex really does run a child process, directly
or indirectly via a particular git command.
2020-08-25 14:57:49 -04:00
Joey Hess
82448bdf39
fix an annex.pidlock issue
That made eg git-annex get of an unlocked file hang until the
annex.pidlocktimeout and then fail.

This fix should be fully thread safe no matter what else git-annex is
doing.

Only using runsGitAnnexChildProcess in the one place it's known to be a
problem. Could audit for all places where git-annex runs itself as a child
and add it to all of them, later.
2020-06-17 15:30:59 -04:00
Joey Hess
24ff5e2b29
use uninterruptibleMask
Some recent changes to use mask missed that async exceptions can still
be thrown inside it. The goal is to make sure a block of cleanup code
runs entirely, w/o being interrupted by an async exception, so use
uninterruptibleMask.

Also, converted a few to bracket, which is nicer.
2020-06-09 15:02:56 -04:00
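The point made in 24ff5e2b29 is that plain `mask` still lets async exceptions through during interruptible (blocking) operations, while `uninterruptibleMask` does not. A minimal sketch of a bracket-like helper whose cleanup cannot be interrupted follows. `bracketUninterruptible` is a hypothetical name, not something taken from git-annex.

```haskell
import Control.Exception (mask, onException, uninterruptibleMask_)

-- Like bracket, except the cleanup runs under uninterruptibleMask_, so an
-- async exception cannot cut it short even while it blocks.
bracketUninterruptible :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketUninterruptible acquire cleanup use = mask $ \restore -> do
  a <- acquire
  -- Async exceptions are allowed again only while 'use' runs.
  r <- restore (use a) `onException` uninterruptibleMask_ (cleanup a)
  _ <- uninterruptibleMask_ (cleanup a)
  return r
```

The trade-off is that a cleanup action that blocks forever under `uninterruptibleMask_` cannot be killed, so this is only appropriate for cleanup that is known to finish.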
Joey Hess
0210e81d83
async exception safety for openFd
Audited for openFile and openFd, and this fixes all the ones I found
where an async exception could prevent the file getting closed.

Except for the lock pool, which is a whole other can of worms.
2020-06-05 15:48:00 -04:00
Joey Hess
8ea5f3ff99
explicit export lists
Eliminated some dead code. In other cases, exported a currently unused
function, since it was a logical part of the API.

Of course this improves the API documentation. It may also sometimes
let ghc optimize code better, since it can know a function is internal
to a module.

364 modules still to go, according to
git grep -E 'module [A-Za-z.]+ where'
2019-11-21 16:08:37 -04:00
Joey Hess
b3c69eaaf8
strict bytestring encoders and decoders
Only had lazy ones before.

Already sped up a few parts of the code.
2019-01-01 14:55:15 -04:00
Joey Hess
19a6227e6e
remove temp file in failure case 2017-06-06 14:23:33 -04:00
Joey Hess
ed639c140d
Fix bug that prevented transfer locks from working when run on SMB or other filesystem that does not support fcntl locks and hard links.
This commit was sponsored by Ethan Aubin.
2017-06-06 14:22:03 -04:00
Joey Hess
6dd806f1ad
stop using MissingH for MD5
Cryptonite is faster and allocates less, and I want to get rid of
MissingH use.

Note that the new dependency on memory is free; it's a dependency of
cryptonite.

This commit was supported by the NSF-funded DataLad project.
2017-05-15 21:36:03 -04:00
Edward Betts
0750913136
correct spelling mistakes 2017-02-12 17:30:23 -04:00
Joey Hess
5e6ced7d0f
Improve pid locking code to work on filesystems that don't support hard links.
Probing for hard link support in the pid locking code is redundant since
git-annex init already probes that. But, it didn't seem worth threading
that data through; the pid locking code runs at most once per git-annex
process, and only on unusual filesystems. Optimising a single hard link
and unlink isn't worth it.

This commit was sponsored by Francois Marier on Patreon.
2017-02-10 15:22:28 -04:00
Joey Hess
0a4479b8ec
Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for an error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.

Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.

This commit was sponsored by Ethan Aubin.
2016-11-15 21:29:54 -04:00
Joey Hess
b22409db38
avoid warnings about not exported System.Directory.isSymbolicLink 2016-04-28 15:18:11 -04:00
Joey Hess
5fe450514b
Fix build with directory-1.2.6.2.
It started exporting an isSymbolicLink which supports windows. But,
git-annex does not use symlinks on windows yet and this conflicts with the
function by the same name from unix-compat, so hide it.
2016-04-28 13:18:44 -04:00
Joey Hess
f219ffc33b
comment typo fix 2016-03-01 12:24:22 -04:00
Joey Hess
b6ac443b60
fix build warnings under ghc 7.10
Caused by AMP.. Since I've finally upgraded my dev laptop to 7.10,
I may start missing imports that are not needed with it but are with older
versions..
2015-12-19 17:42:45 -04:00
Joey Hess
c670a0642c
fix warning 2015-11-16 15:37:27 -04:00
Joey Hess
e2b4861bff
store abspath to the lock file
Avoids problems if the program chdirs
2015-11-16 15:25:04 -04:00
Joey Hess
2e44da5c46
clean up side lock files when we're done with them
There's a potential race, but it's detected and just results in the other
process failing to take the side lock, so possibly retrying one second
later on. The race window is quite narrow so the extra delay is minor.

Left the side lock files mode 666 because an interruption can leave a side
lock file created by another user for a shared repository. When this
happens, the non-owning user can't delete it (+t) but can still lock it,
and so the code falls back to acting as it did before this commit.
2015-11-16 11:36:11 -04:00
Joey Hess
8efd3d71c8
starting to get a handle on how to detect that mad gleam in lustre's eye 2015-11-13 16:18:44 -04:00
Joey Hess
70bfe218f5
one more try to get sane behavior out of lustre 2015-11-13 15:51:45 -04:00
Joey Hess
389c6c7d37
fixed a fd double-close 2015-11-13 15:43:09 -04:00
Joey Hess
b0155d9093
also compare lock file contents to double-check link worked
And it closes the tmp file before this. I don't know if this will help
avoid lustre's craziness, but it can't hurt..
2015-11-13 15:20:52 -04:00
Joey Hess
1aba23ab4e
use /tmp for sidelock file when no /dev/shm 2015-11-13 14:49:30 -04:00
Joey Hess
60a9c7f5c6
require the side lock be held to take pidlock
This is less portable, since currently sidelocks rely on /dev/shm.
But, I've seen crazy lustre inconsistencies that make me not trust the
link() method at all, so what can you do.
2015-11-13 14:44:53 -04:00
Joey Hess
85345abe8b
avoid over-long filenames for side lock files 2015-11-13 14:04:29 -04:00
Joey Hess
c2cbe5619b
add stat check
I have a strace taken on a lustre filesystem on which link() returned 0,
but didn't actually succeed, since the file already existed.

One of the linux man pages recommended using link followed by checking like
this. I was reading it yesterday, but cannot find it now.
2015-11-13 13:22:45 -04:00
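The link-then-stat check from c2cbe5619b might look roughly like this. It is a sketch under assumptions: `linkAndVerify` is a hypothetical name, the temp file is assumed to have had exactly one link before the call, and the real linkToLock does more than this.

```haskell
import Control.Exception (IOException, try)
import System.Posix.Files
  (createLink, deviceID, fileID, getFileStatus, linkCount)

-- Hard link tmp to lck, then stat both to confirm the link really
-- happened, rather than trusting link()'s return value alone.
linkAndVerify :: FilePath -> FilePath -> IO Bool
linkAndVerify tmp lck = do
  linked <- try (createLink tmp lck) :: IO (Either IOException ())
  case linked of
    Left _ -> return False
    Right () -> do
      checked <- try sameFile :: IO (Either IOException Bool)
      return (either (const False) id checked)
  where
    sameFile = do
      t <- getFileStatus tmp
      l <- getFileStatus lck
      -- Both names must refer to one inode that now has two links.
      return $ linkCount t == 2
        && fileID t == fileID l
        && deviceID t == deviceID l
```

If the lock file already existed and belonged to someone else, the inode comparison fails even when link() claimed success, which is the case the commit message describes.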
Joey Hess
88d94e674c
clean up temp file 2015-11-13 12:52:24 -04:00
Joey Hess
e31a51c5bb
better lock dropping order 2015-11-13 12:36:37 -04:00
Joey Hess
aa4192aea6
pid locking configuration and abstraction layer for git-annex
(not actually used anywhere yet)
2015-11-12 17:50:34 -04:00
Joey Hess
77b490bfba
add timeout for pid lock waiting 2015-11-12 17:12:54 -04:00
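Waiting for a pid lock with a timeout, as added in 77b490bfba, amounts to a bounded retry loop. The following is only an illustration: `waitWithTimeout` and the one-second retry interval are assumptions, and `attempt` stands in for whatever action actually tries to take the lock.

```haskell
import Control.Concurrent (threadDelay)

-- Retry 'attempt' about once per second until it succeeds or the given
-- number of seconds has been used up. Returns whether it succeeded.
waitWithTimeout :: Int -> IO Bool -> IO Bool
waitWithTimeout secs attempt = go secs
  where
    go n = do
      ok <- attempt
      case () of
        _ | ok        -> return True
          | n <= 0    -> return False
          | otherwise -> threadDelay 1000000 >> go (n - 1)
```

A caller would typically turn a `False` result into an error message naming the lock file that could not be taken.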
Joey Hess
0f25a7365a
module for PidLocks in LockPool 2015-11-12 16:31:34 -04:00
Joey Hess
710d1eeeac
module for pid lock files with atomic stale lock file takeover when possible 2015-11-12 15:39:49 -04:00
Joey Hess
c8fad345f2
add tryLockShared 2015-10-08 13:40:23 -04:00
Joey Hess
9461019e9a
open lock file ReadOnly when taking shared lock
It's only necessary to open a file for write when taking an exclusive lock.
2015-10-08 13:34:49 -04:00
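The rule stated in 9461019e9a can be written down as a small mapping from the kind of lock wanted to the mode the file is opened with. The type and function names here are hypothetical; only `OpenMode` comes from the unix package.

```haskell
import System.Posix.IO (OpenMode(..))

-- Hypothetical request type, for illustration only.
data LockRequest = WantShared | WantExclusive

-- A shared (read) lock only needs the file open for reading;
-- an exclusive (write) lock needs it open for writing too.
openModeFor :: LockRequest -> OpenMode
openModeFor WantShared    = ReadOnly
openModeFor WantExclusive = ReadWrite
```

Opening read-only for shared locks also means a shared lock can be taken on a file the user has no permission to write.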
Joey Hess
6c3cea7699
need more polymorphism 2015-05-22 13:50:37 -04:00
Joey Hess
9de5cd2966
fix crash in stale transfer lockfile cleanup code
Need to differentiate between the lockfile not being locked, and it not
existing.
2015-05-19 23:35:24 -04:00
Joey Hess
1312e721ed
convert lockContent to use new LockPools
Also cleaned up the code, avoiding creating a lock file if we're going to
open it for create later anyway.

And, if there's an exception while preparing to lock the file, but not at
the point of actually taking the lock, throw an exception, instead of
silently not locking and pretending to succeed.

And, on Windows, always use lock file, even if the repo somehow got into
indirect mode (maybe with cygwin git..)
2015-05-19 14:12:23 -04:00
Joey Hess
6915b71c57
lock pools to work around non-concurrency/composition safety of POSIX fcntl 2015-05-18 15:57:17 -04:00
Joey Hess
e9172263e5
comment typos 2015-05-17 14:22:14 -04:00
Joey Hess
8c2dd7d8ee
Fix an unlikely race that could result in two transfers of the same key running at once.
As discussed in bug report.
2015-05-12 19:39:28 -04:00