git-annex

Author	SHA1	Message	Date
Joey Hess	ec4d974dcf	assistant: Fix deadlock that could occur when adding a lot of files at once in indirect mode. This is a laziness problem. Despite the bang pattern on newfiles, the list was not being fully evaluated before cleanup was called. Moving cleanup out to after the list is actually used fixes this. More evidence that I should be using ResourceT or pipes, if any was needed.	2013-07-26 18:42:22 -04:00
Joey Hess	dba1e29949	webapp: Better display of added files.	2013-07-10 15:37:40 -04:00
Joey Hess	82a6db8fe8	committer tweak to wait for Watcher to resume after a max-size commit Without this, a very large batch add has commits of sizes approx 5000, 2500, 1250, etc down to 10, and then starts over at 5000. This fixes it so it's 5000+ every time.	2013-04-25 00:48:09 -04:00
Joey Hess	ebee93a837	get rid of need to run pre-commit hook when assistant commits in direct mode That hook updates associated file bookkeeping info for direct mode. But, everything already called addAssociatedFile when adding/changing a file. It only needed to also call removeAssociatedFile when deleting a file, or a directory. This should make bulk adds faster, by some possibly significant amount. Bulk removals may be a little slower, since it has to use catKeyFile now on each removed file, but will still be faster than adds.	2013-04-24 18:04:59 -04:00
Joey Hess	cd7055631f	batch commit every 5 thousand changes, not 10 thousand There's a tradeoff between making less frequent commits, and needing to use memory to store all the changes that are coming in. At 10 thousand, it needs 150 mb of memory. 5 thousand drops that down to 90 mb or so. This also turns out to have significant impact on total run time. I benchmarked 10k changes taking 27 minutes. But two 5k batches took only 21 minutes.	2013-04-24 16:40:35 -04:00
Joey Hess	bda237f14a	convert PendingAddChange back to Change when an add fails If an add failed, we should lose the KeySource, since it, presumably, differs due to a change that was made to the file. (The locked down file is already deleted.)	2013-04-24 16:29:25 -04:00
Joey Hess	a929e6641a	show one alert when bulk adding files Turns out that a lot of the time spent in a bulk add was just updating the add alert to rotate through each file that was added. Showing one alert makes for a significant speedup. Also, when the webapp is open, this makes it take quite a lot less cpu during bulk adds. Also, it lets the user know when a bulk add happened, which is sorta nice..	2013-04-24 13:04:46 -04:00
Joey Hess	ca72b1ac7b	assistant: when an add fails, requeue it for later See analysis in bug report for one way this could happen.	2013-04-23 18:23:04 -04:00
Joey Hess	090a69f00c	assistant: Work around misfeature in git 1.8.2 that makes `git commit --allow-empty -m ""` run an editor. See http://git-annex.branchable.com/bugs/assistant_hangs_during_commit/	2013-04-18 16:27:17 -04:00
Joey Hess	602baae12e	Bugfix: Direct mode no longer repeatedly checksums duplicated files. Fixed by storing a list of cached inodes for a key, instead of just one. Backwards compatibility note: An old git-annex version will fail to parse an inode cache file that has been written by a new version, and has multiple items. It will succeed if just one. So old git-annexes will have even worse behavior when there are duplicated files, if that is possible. I don't think it will be a problem. (Famous last words.) Also, note that it doesn't expire old and unused inode caches for a key. It would be possible to add this if needed; just look through the associated files for a key and if there are more cached inodes, throw out any not corresponding to associated files. Unless a file is being copied repeatedly and the old copy deleted, this lack of expiry should not be a problem.	2013-04-06 16:07:25 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	35a0ae334c	assistant: Fix OSX bug that prevented committing changed files to a repository when in indirect mode.	2013-03-17 17:01:43 -04:00
Joey Hess	393340dc3b	better handling of batch renames Rather than wait a full second, which may be longer than needed, or too short to get all the rename events, we start a mode where we wait 1/10th of a second, and if there are Changes received, wait again. Basically we're back in batch mode when this happens.	2013-03-11 15:46:09 -04:00
Joey Hess	14fcfced48	detect directory rename and wait up to 1 second to get all the changes	2013-03-11 15:24:13 -04:00
Joey Hess	f340fd324c	synthesize RmChange when a directory is deleted This gets directory renames closer to being fully detected. There's close to no extra overhead to doing it this way.	2013-03-11 15:14:42 -04:00
Joey Hess	06046a0d2b	finish fast direct mode rename handling. wow, it's fast	2013-03-11 14:14:45 -04:00
Joey Hess	87cba71d5a	fix changeFile to not be partial That led to runtime crashes, without even a warning from -Wall. Yipes!	2013-03-11 13:55:36 -04:00
Joey Hess	61c5e8736c	detect renames during commit, and .. um, do nothing special because it's lunch time But I'm well set up to fast-track direct mode adds for renames now.	2013-03-11 12:56:47 -04:00
Joey Hess	2762ab03b4	assistant: generate better commits for renames	2013-03-10 22:10:26 -04:00
Joey Hess	b2c7ee5551	tweak	2013-03-10 20:20:58 -04:00
Joey Hess	65a4c7966f	moved transfer queueing out of watcher and into committer This cleaned up the code quite a bit; now the committer just looks at the Change to see if it's a change that needs to have a transfer queued for it. If I later want to add dropping keys for files that were removed, or something like that, this should make it straightforward. This also fixes a bug. In direct mode, moving a file out of an archive directory failed to start a transfer to get its content. The problem was that the file had not been committed to git yet, and so the transfer code didn't want to touch it, since fileKey failed to get its key. Only starting transfers after a commit avoids this problem.	2013-03-10 18:16:03 -04:00
Joey Hess	724711e4b7	fix	2013-03-03 15:18:24 -04:00
Joey Hess	789ca15012	better prevention of auto repack Looking through the git sources (documentation is unclear), it seems commit doesn't ever trigger git-gc, mostly fetching and merging seems to. I cannot easily override the setting in all those places, so instead set gc.auto in git config when initializing a repository with the assistant. This does mean that the user cannot set gc.auto=0 and completely avoid repacks, as the assistant does it daily. But, it only does it after there are 100x the default number of loose objects, so this is probably not going to be too annoying.	2013-03-03 14:20:07 -04:00
Joey Hess	6dea43831e	assistant: Prevent automatic commits from causing git-gc runs, as that can make things quite slow. Instead, git-gc --auto is run once a day. (This can be disabled by the usual gc.auto=0 setting.)	2013-03-03 13:44:35 -04:00
Joey Hess	0c13d3065e	git subcommand cleanup Pass subcommand as a regular param, which allows passing git parameters like -c before it. This was already done in the piping set of functions, but not the command running set.	2013-03-03 13:39:07 -04:00
Joey Hess	4d33423067	assistant: Avoid noise in logs from git commit about typechanged files in direct mode repositories.	2013-03-01 16:21:29 -04:00
Joey Hess	46c9cbeb1e	add additional debug info about reasons for transfers	2013-03-01 15:23:59 -04:00
Joey Hess	2a4dad8bd4	remove debug prints	2013-02-19 23:18:15 -04:00
Joey Hess	d7c93b8913	fully support core.symlinks=false in all relevant symlink handling code Refactored annex link code into nice clean new library. Audited and dealt with calls to createSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ createSymbolicLink linktarget file only when core.symlinks=true Assistant/WebApp/Configurators/Local.hs: createSymbolicLink link link test if symlinks can be made Command/Fix.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/FromKey.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/Indirect.hs: liftIO $ createSymbolicLink l f refuses to run if core.symlinks=false Init.hs: createSymbolicLink f f2 test if symlinks can be made Remote/Directory.hs: go [file] = catchBoolIO $ createSymbolicLink file f >> return True fast key linking; catches failure to make symlink and falls back to copy Remote/Git.hs: liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True ditto Upgrade/V1.hs: liftIO $ createSymbolicLink link f v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to readSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ catchMaybeIO $ readSymbolicLink file only when core.symlinks=true Assistant/Threads/Watcher.hs: ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) code that fixes real symlinks when inotify sees them It's ok to not fix pseudo-symlinks. Assistant/Threads/Watcher.hs: mlink <- liftIO (catchMaybeIO $ readSymbolicLink file) ditto Command/Fix.hs: stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do command only works in indirect mode Upgrade/V1.hs: getsymlink = takeFileName <$> readSymbolicLink file v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to isSymbolicLink. (Typically used with getSymbolicLinkStatus, but that is just used because getFileStatus is not as robust; it also works on pseudolinks.) Remaining calls are all safe, because: Assistant/Threads/SanityChecker.hs: | isSymbolicLink s -> addsymlink file ms only handles staging of symlinks that were somehow not staged (might need to be updated to support pseudolinks, but this is only a belt-and-suspenders check anyway, and I've never seen the code run) Command/Add.hs: if isSymbolicLink s || not (isRegularFile s) avoids adding symlinks to the annex, so not relevant Command/Indirect.hs: | isSymbolicLink s -> void $ flip whenAnnexed f $ only allowed on systems that support symlinks Command/Indirect.hs: whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do ditto Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f used to find unlocked files, only relevant in indirect mode Utility/FSEvents.hs: | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s Utility/FSEvents.hs: | Files.isSymbolicLink s -> Utility/INotify.hs: | Files.isSymbolicLink s -> Utility/INotify.hs: checkfiletype Files.isSymbolicLink addSymlinkHook f Utility/Kqueue.hs: | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change all above are lower-level, not relevant Audited and dealt with calls to isSymLink. Remaining calls are all safe, because: Annex/Direct.hs: | isSymLink (getmode item) = This is looking at git diff-tree objects, not files on disk Command/Unused.hs: | isSymLink (LsTree.mode l) = do This is looking at git ls-tree, not file on disk Utility/FileMode.hs:isSymLink :: FileMode -> Bool Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode low-level Done!!	2013-02-17 16:43:14 -04:00
Joey Hess	630f4531a7	fix assistant's use of lsof in crippled filesystem mode	2013-02-15 13:08:22 -04:00
Joey Hess	47477b2807	crippled filesystem support, probing and initial support git annex init probes for crippled filesystems, and sets direct mode, as well as `annex.crippledfilesystem`. Avoid manipulating permissions of files on crippled filesystems. That would likely cause an exception to be thrown. Very basic support in Command.Add for crippled filesystems; avoids the lock down entirely since doing it needs both permissions and hard links. Will make this better soon.	2013-02-14 14:15:26 -04:00
Joey Hess	5737c49804	support Android's crippled lsof	2013-02-11 17:33:45 -04:00
Joey Hess	547d7745fb	pre-commit: Update direct mode mappings. Making the pre-commit hook look at git diff-index to find changed direct mode files and update the mappings works pretty well. One case where it does not work is when a file is git annex added, and then git rmed, and then this is committed. That's a no-op commit, so the hook probably doesn't even run, and it certainly never notices that the file was deleted, so the mapping will still have the original filename in it. For this and other reasons, it's important that the mappings still be treated as possibly inconsistent. Also, the assistant now allows the pre-commit hook to run when in direct mode, so the mappings also get updated there.	2013-02-06 12:44:19 -04:00
Joey Hess	b19c2e6122	assistant: Fix location log when adding new file in direct mode.	2013-02-05 13:41:48 -04:00
Joey Hess	76ddf9b6d3	webapp: Now allows restarting any threads that crash.	2013-01-26 17:09:33 +11:00
Joey Hess	f51ad2a00c	assistant: Avoid committer crashing if a file is deleted at the wrong instant.	2013-01-14 15:02:13 -04:00
Joey Hess	4008590c68	type based git config handling for remotes Still a couple of places that use git config ad-hoc, but this is most of it done.	2013-01-01 13:58:14 -04:00
Joey Hess	7f7c31df1c	type based git config handling Now there's a Config type, that's extracted from the git config at startup. Note that laziness means that individual config values are only looked up and parsed on demand, and so we get implicit memoization for all of them. So this is not only prettier and more type safe, it optimises several places that didn't have explicit memoization before. As well as getting rid of the ugly explicit memoization code. Not yet done for annex.<remote>.* configuration settings.	2012-12-29 23:10:18 -04:00
Joey Hess	c0f9810f0b	OSX assistant: Uses direct mode by default when setting up a new local repository.	2012-12-28 16:42:11 -04:00
Joey Hess	cc5140d295	assistant adding of modified files in direct mode Works with inotify, but I think in kqueue we don't get events for existing files that get modified.	2012-12-24 14:42:19 -04:00
Joey Hess	c6d2bbe402	assistant adding of files in direct mode	2012-12-24 13:37:29 -04:00
Joey Hess	d2df2e52b4	remove hard link when sanity check failed See http://git-annex.branchable.com/forum/dot_git_slash_annex_slash_tmp/	2012-11-29 16:54:51 -04:00
Joey Hess	93ffd47d76	finished pushing Assistant monad into all relevant files All temporary and old functions are removed.	2012-10-30 17:14:51 -04:00
Joey Hess	68118b8986	split remaining assistant types	2012-10-30 14:34:48 -04:00
Joey Hess	42babf5012	split Commits and lifted	2012-10-29 19:35:18 -04:00
Joey Hess	d2294f0dfa	split Changes and lifted	2012-10-29 19:30:23 -04:00
Joey Hess	1852eddce6	lift alertWhile	2012-10-29 16:49:47 -04:00
Joey Hess	e18b733c81	move alert display functions	2012-10-29 16:34:11 -04:00
Joey Hess	76768ad977	converted 6 more threads	2012-10-29 11:40:22 -04:00
Joey Hess	4dbdc2b666	Assistant monad, stage 2.5 Converted several threads to run in the monad. Added a lot of useful combinators for working with the monad. Now the monad includes the name of the thread. Some debugging messages are disabled pending converting other threads.	2012-10-29 02:21:04 -04:00
Joey Hess	4ac2fd0a22	ensure that git-annex branch is pushed after a successful transfer I now have this topology working: assistant ---> {bare repo, special remote} <--- assistant And, I think, also this one: +----------- bare repo --------+ v v assistant ---> special remote <--- assistant While before with assistant <---> assistant connections, both sides got location info updated after a transfer, in this topology, the bare repo might get its location info updated, but the other assistant has no way to know that it did. And a special remote doesn't record location info, so transfers to it won't propagate out location log changes at all. So, for these to work, after a transfer succeeds, the git-annex branch needs to be pushed. This is done by recording that a synthetic commit has occurred, which lets the pusher handle pushing out the change (which will include actually committing any still journalled changes to the git-annex branch). Of course, this means rather a lot more syncing action than happened before. At least the pusher bundles together very close together pushes, somewhat. Currently it just waits 2 seconds between each push.	2012-10-28 16:05:34 -04:00
Joey Hess	452e6819d0	!! removal	2012-10-21 00:51:42 -04:00
Joey Hess	8eb1ba4cfe	revert bad change	2012-10-09 13:49:27 -04:00
Joey Hess	5ac15149cc	assistant: Now honors preferred content settings when deciding what to transfer. Both when queueing downloads, and uploads, consults the preferred content settings. I didn't make it check yet when requeuing failed transfers or queuing deferred downloads; dealing with the preferred content settings (or indeed, other settings) changing while the assistant is running still needs work.	2012-10-09 12:18:41 -04:00
Joey Hess	47314c0fad	fix last zombies in the assistant Made Git.LsFiles return cleanup actions, and everything waits on processes now, except of course for Seek.	2012-10-04 19:56:32 -04:00
Joey Hess	9a3471971b	avoid crashing committer if it fails to stage changes Just retry later.	2012-10-02 18:04:06 -04:00
Joey Hess	9aab70de66	always check with ls-files before adding new files Makes it safe to use git annex unlock with the watcher/assistant. And also to mix use of the watcher/assistant with regular files stored in git. Long ago, I had avoided doing this check, except during the startup scan, because it would be slow to run ls-files repeatedly. But then I added the lsof check, and to make that fast, got it to detect batch file adds. So let's move the ls-files check to also occur when it'll have a batch, and can check them all with one call. This does slow down adding a single file by just a bit, but really only a little bit. (The lsof check is probably more expensive.) It also speeds up the startup scan, especially when there are lots of new files found by the scan. Also, fixed the sleep for annex.delayadd to not run while the threadstate lock is held, so it doesn't unnecessarily freeze everything else. Also, --force no longer makes it skip the lsof check, which was not documented, and seems never a good idea.	2012-10-02 17:41:23 -04:00
Joey Hess	64514a3db3	close unreproducible bug and remove expensive code added to debug it	2012-09-28 12:56:58 -04:00
Joey Hess	d50d89eb6f	support old versions of git that do not have --allow-empty-message	2012-09-19 12:58:53 -04:00
Joey Hess	c4e8591351	add missing --no-verify to prevent the pre-commit hook's git annex fix	2012-09-19 12:48:32 -04:00
Joey Hess	ba27483c6a	avoid making empty commits This doesn't avoid it sometimes attempting to commit when there are no changes. Typically that happens when a change is pushed in from another repo; the watcher sees the file and tries to stage it, resulting in an empty commit. Really fixing that would probably use more CPU than occasionally trying to make an empty commit. However, this does save a lot of unnecessary work, as those empty commits had to be synced out, which no longer happens.	2012-09-18 14:43:56 -04:00
Joey Hess	adf5195082	run current branch merge in annex monad I was seeing some interesting crashes after the previous commit, when making file changes slightly faster than the assistant could keep up. error: Ref refs/heads/master is at 7074f8e0a11110c532d06746e334f2fec6af6ab4 but expected 95ea86008d72a40d97a81cfc8fb47a0da92166bd fatal: cannot lock HEAD ref Committer crashed: git commit [Param "--allow-empty-message",Param "-m",Param "",Param "--allow-empty",Param "--quiet"] failed Pusher crashed: thread blocked indefinitely in an STM transaction Clearly the merger ended up running at the same time as the committer, and with both modifying HEAD the committer crashed. I fixed that by making the Merger run its merge inside the annex monad, which avoids it running concurrently with other git operations. Also by making the committer not crash if git fails. What I don't understand is why the pusher then crashed with a STM deadlock. That must be in either the DaemonStatusHandle or the FailedPushMap, and the latter is only used by the pusher. Did the committer's crash somehow break STM? The BlockedIndefinitelyOnSTM exception is described as: -- |The thread is waiting to retry an STM transaction, but there are no -- other references to any @TVar@s involved, so it can't ever continue. If the Committer had a reference to a TVar and crashed, I can sort of see this leading to that exception.. The crash was quite easy to reproduce after the previous commit, but after making the above change, I have yet to see it again. Here's hoping.	2012-09-17 22:04:43 -04:00
Joey Hess	df337bb63b	hlint	2012-09-13 00:57:52 -04:00
Joey Hess	a00f1d26bc	display errors when any named thread crashes	2012-09-06 14:56:04 -04:00
Joey Hess	8f1a9ef8b5	added an alert after a file transfer	2012-08-06 17:09:23 -04:00
Joey Hess	74fc9fcbe6	add alert when committing	2012-08-02 14:02:35 -04:00
Joey Hess	e21a32627f	avoid bogus alert errors	2012-08-02 13:57:34 -04:00
Joey Hess	191ee3b697	awesome alert combining Now an alert tracks files that have recently been added. As a large file is added, it will have its own alert, that then combines with the tracker when done. Also used for combining sanity checker alerts, as it could possibly want to display a lot.	2012-08-02 09:03:04 -04:00
Joey Hess	ce7889ba86	debuggery	2012-07-29 14:10:17 -04:00
Joey Hess	c4023f7858	probably fixes http://git-annex.branchable.com/bugs/lsof__47__committer_thread_loops_occassionally/	2012-07-29 13:55:07 -04:00
Joey Hess	a9dbfdf28d	better transfer queue management Allow transfers to be added with blocking until the queue is sufficiently small. Better control over which end of the queue to add a transfer to.	2012-07-25 13:12:34 -04:00
Joey Hess	b48d7747a3	debugging improvements add timestamps to debug messages Add lots of debug output in the assistant's threads.	2012-07-20 19:29:59 -04:00
Joey Hess	83c66ccaf8	queue Uploads of newly added files to remotes Added knownRemotes to DaemonStatus. This list is not entirely trivial to calculate, and having it here should make it easier to add/remove remotes on the fly later on. It did require plumbing the daemonstatus through to some more threads.	2012-07-05 10:21:22 -06:00
Joey Hess	0b146f9ecc	reorg threads	2012-06-25 16:10:24 -04:00


124 commits