git-annex

Author	SHA1	Message	Date
Joey Hess	4ed82e5328	fsck: Work around bug in persistent that broke display of problematically encoded filenames on stderr when using --incremental.	2015-09-09 17:02:00 -04:00
Joey Hess	bc4129cc77	fsck: Commit incremental fsck database after every 1000 files fscked, or every 5 minutes, whichever comes first. Previously, commits were made every 1000 files fscked. Also, improve docs	2015-07-31 16:42:15 -04:00
Joey Hess	ecb0d5c087	use lock pools throughout git-annex The one exception is in Utility.Daemon. As long as a process only daemonizes once, which seems reasonable, and as long as it avoids calling checkDaemon once it's already running as a daemon, the fcntl locking gotchas won't be a problem there. Annex.LockFile has its own separate lock pool layer, which has been renamed to LockCache. This is a persistent cache of locks that persist until closed. This is not quite done; lockContent still needs to be converted.	2015-05-19 14:09:52 -04:00
Joey Hess	ec267aa1ea	rejigger imports for clean build with ghc 7.10's AMP changes The explicit import Prelude after import Control.Applicative is a trick to avoid a warning.	2015-05-10 16:20:30 -04:00
Joey Hess	addc82dab7	removed all uses of undefined from code base It's a code smell, can lead to hard to diagnose error messages.	2015-04-19 00:38:29 -04:00
Joey Hess	5d974b26fc	generated TH uses forall	2015-02-22 16:57:19 -04:00
Joey Hess	e143d5e7d1	avoid closing db handle when reconnecting to do a write	2015-02-22 14:21:39 -04:00
Joey Hess	bf80a16c2e	complete work around for sqlite SELECT ErrorBusy on new connection bug	2015-02-22 14:08:26 -04:00
Joey Hess	b541a5e38b	WIP	2015-02-18 17:46:58 -04:00
Joey Hess	a01285ff33	more extensions needed by newer version of persistent	2015-02-18 17:30:07 -04:00
Joey Hess	80683871ee	deal with rare SELECT ErrorBusy failures I think they might be a sqlite bug. In discussions with sqlite devs.	2015-02-18 16:56:52 -04:00
Joey Hess	af254615b2	use WAL mode to ensure read from db always works, even when it's being written to Also, moved the database to a subdir, as there are multiple files. This seems to work well with concurrent fscks, although they still do redundant work due to the commit granularity. Occasionally two writes will conflict, and one is then deferred and happens later. Except, with 3 concurrent fscks, I got failures: git-annex: user error (SQLite3 returned ErrorBusy while attempting to perform prepare "SELECT \"fscked\".\"key\"\nFROM \"fscked\"\nWHERE \"fscked\".\"key\" = ?\n": database is locked) Argh!!!	2015-02-18 15:54:24 -04:00
Joey Hess	17cb219231	more robust handling of deferred commits Still not robust enough. I have 3 fscks running concurrently, and am seeing: ("commit deferred",user error (SQLite3 returned ErrorBusy while attempting to perform step.)) and git-annex: user error (SQLite3 returned ErrorBusy while attempting to perform prepare "SELECT \"fscked\".\"key\"\nFROM \"fscked\"\nWHERE \"fscked\".\"key\" = ?\n": database is locked)	2015-02-18 14:11:27 -04:00
Joey Hess	3414229354	fsck: Multiple incremental fscks of different repos (some remote) can now be in progress at the same time in the same repo without it getting confused about which files have been checked for which remotes.	2015-02-17 17:08:11 -04:00
Joey Hess	a3370ac459	allow for concurrent incremental fsck processes again (sorta) Sqlite doesn't support multiple concurrent writers at all. One of them will fail to write. It's not even possible to have two processes building up separate transactions at the same time. Before using sqlite, incremental fsck could work perfectly well with multiple fsck processes running concurrently. I'd like to keep that working. My partial solution, so far, is to make git-annex buffer writes, and every so often send them all to sqlite at once, in a transaction. So most of the time, nothing is writing to the database. (And if it gets unlucky and a write fails due to a collision with another writer, it can just wait and retry the write later.) This lets multiple processes write to the database successfully. But, for the purposes of concurrent, incremental fsck, it's not ideal. Each process doesn't immediately learn of files that another process has checked. So they'll tend to do redundant work. Only way I can see to improve this is to use some other mechanism for short-term IPC between the fsck processes. Not yet done. ---- Also, make addDb check if an item is in the database already, and not try to re-add it. That fixes an intermittent crash with "SQLite3 returned ErrorConstraint while attempting to perform step." I am not 100% sure why; it only started happening when I moved write buffering into the queue. It seemed to generally happen on the same file each time, so could just be due to multiple files having the same key. However, I doubt my sound repo has many duplicate keys, and I suspect something else is going on. ---- Updated benchmark, with the 1000 item queue: 6m33.808s	2015-02-17 16:56:12 -04:00
Joey Hess	afb3e3e472	avoid crash when starting fsck --incremental when one is already running Turns out sqlite does not like having its database deleted out from underneath it. It might suffice to empty the table, but I would rather start each fsck over with a new database, so I added a lock file, and running incremental fscks use a shared lock. This leaves one concurrency bug left; running two concurrent fsck --more will lead to: "SQLite3 returned ErrorBusy while attempting to perform step." and one or both will fail. This is a concurrent writers problem.	2015-02-17 13:30:24 -04:00
Joey Hess	ea76d04e15	show error when sqlite crashes worker thread Better than "blocked indefinitely in MVar"..	2015-02-17 13:03:57 -04:00
Joey Hess	99a1287f4f	avoid fromIntegral overhead	2015-02-16 17:22:00 -04:00
Joey Hess	7d36e7d18d	commit new transaction after 60 seconds Database.Handle can now be given a CommitPolicy, making it easy to specify transaction granularity. Benchmarking the old git-annex incremental fsck that flips sticky bits to the new that uses sqlite, running in a repo with 37000 annexed files, both from cold cache: old: 6m6.906s new: 6m26.913s This commit was sponsored by TasLUG.	2015-02-16 17:05:42 -04:00
Joey Hess	d2766df914	commit more transactions when fscking This makes interrupt and resume work, robustly. But, incremental fsck is slowed down by all those transactions..	2015-02-16 16:07:36 -04:00
Joey Hess	91e9146d1b	convert incremental fsck to using sqlite database Did not keep backwards compat for sticky bit records. An incremental fsck that is already in progress will start over on upgrade to this version. This is not yet ready for merging. The autobuilders need to have sqlite installed. Also, interrupting a fsck --incremental does not commit the database. So, resuming with fsck --more restarts from beginning. Memory: Constant during a fsck of tens of thousands of files. (But, it does seem to buffer whole transaction in memory, so may really scale with number of files.) CPU: ?	2015-02-16 15:35:26 -04:00

21 commits
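
The write buffering described in a3370ac459, combined with the commit policy from bc4129cc77 (flush after 1000 files or 5 minutes, whichever comes first) and the WAL mode from af254615b2, can be sketched roughly as follows. This is an illustration in Python, not git-annex's Haskell code: the class name `FsckedQueue`, its methods, and its parameters are hypothetical, and `INSERT OR IGNORE` stands in for the "check if an item is in the database already" step mentioned in the log. Only the `fscked` table and its `key` column are taken from the error messages quoted above.

```python
import sqlite3
import time


class FsckedQueue:
    """Hypothetical write buffer: queue keys, flush them in one transaction."""

    def __init__(self, path, max_items=1000, max_age=300.0, clock=time.monotonic):
        self.db = sqlite3.connect(path)
        # WAL mode lets readers proceed while another process is writing.
        self.db.execute("PRAGMA journal_mode=WAL")
        self.db.execute("CREATE TABLE IF NOT EXISTS fscked (key TEXT PRIMARY KEY)")
        self.db.commit()
        self.max_items = max_items  # flush after this many queued keys...
        self.max_age = max_age      # ...or after this many seconds
        self.clock = clock
        self.pending = []
        self.last_flush = clock()

    def add(self, key):
        """Queue a key; flush if either commit threshold has been reached."""
        self.pending.append(key)
        too_many = len(self.pending) >= self.max_items
        too_old = self.clock() - self.last_flush >= self.max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Write all pending keys in one transaction; return False if deferred."""
        if not self.pending:
            self.last_flush = self.clock()
            return True
        try:
            with self.db:  # commits on success, rolls back on error
                self.db.executemany(
                    "INSERT OR IGNORE INTO fscked (key) VALUES (?)",
                    [(k,) for k in self.pending],
                )
        except sqlite3.OperationalError as e:
            if "locked" not in str(e):
                raise
            # Another writer holds the lock: keep the queue and retry later.
            return False
        self.pending.clear()
        self.last_flush = self.clock()
        return True

    def contains(self, key):
        """True if the key is queued here or already committed."""
        if key in self.pending:
            return True
        row = self.db.execute(
            "SELECT 1 FROM fscked WHERE key = ?", (key,)
        ).fetchone()
        return row is not None
```

As the log notes, a buffer like this keeps the database unlocked most of the time, but other processes only learn of checked files after a flush, so concurrent fscks can still do redundant work.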