git-annex

Author	SHA1	Message	Date
Joey Hess	58c7b0a56d	assistant: Always batch changes found in startup scan. Batch detection is heuristic, so can sometimes fail. I observed one such failure while starting up in a repository with 87000 files. After the first several batches of ~5000 files, it fell out of batch mode, and never re-entered it, and so made many more commits of a few files at a time than necessary. So, let's always use batch mode when in the startup scan. This avoids the heuristic there, at least. There is clearly also room to improve the heuristic. Possibly 10 files is too high a bar to be found during a commit, on a system that can commit quickly.	2013-12-16 16:16:19 -04:00
Joey Hess	2066e90421	avoid needing --force on windows despite no lsof Note that I still need to think this through and make sure handling of open files is safe. This is just for testing purposes.	2013-12-09 16:56:15 -04:00
Joey Hess	13108b7196	assistant: Notice on startup when the index file is corrupt, and auto-repair.	2013-11-13 14:27:17 -04:00
Joey Hess	a1b1b5ef52	moved code out of webapp No code changes, aside from some changes to lifting in code that turned out to be able to run in Assistant rather than Handler.	2013-10-26 16:58:16 -04:00
Joey Hess	4f871f89ba	git-recover-repository 1/2 done	2013-10-20 17:50:51 -04:00
Joey Hess	635c9a1549	assistant: Detect stale git lock files at startup time, and remove them. Extends the index.lock handling to other git lock files. I surveyed all lock files used by git, and found more than I expected. All are handled the same in git; it leaves them open while doing the operation, possibly writing the new file content to the lock file, and then closes them when done. The gc.pid file is excluded because it won't affect the normal operation of the assistant, and waiting for a gc to finish on startup wouldn't be good. All threads except the webapp thread wait on the new startup sanity checker thread to complete, so they won't try to do things with git that fail due to stale lock files. The webapp thread mostly avoids doing that kind of thing itself. A few configurators might fail on lock files, but only if the user is explicitly trying to run them. The webapp needs to start immediately when the user has opened it, even if there are stale lock files. Arranging for the threads to wait on the startup sanity checker was a bit of a bear. Have to get all the NotificationHandles set up before the startup sanity checker runs, or they won't see its signal. Perhaps the NotificationBroadcaster is not the best interface to have used for this. Oh well, it works. This commit was sponsored by Michael Jakl	2013-10-05 17:04:21 -04:00
Joey Hess	93dbb7842e	watcher: Detect at startup time when there is a stale .git/lock, and remove it so it does not interfere with the automatic commits of changed files.	2013-10-03 16:57:21 -04:00
Joey Hess	3ac9c4e672	hlint	2013-10-02 22:59:07 -04:00
Joey Hess	b191d5c595	gitignore support for the assistant and watcher Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!	2013-08-02 20:37:03 -04:00
Joey Hess	ed4febb170	remove debug print	2013-05-25 00:02:39 -04:00
Joey Hess	b8e5b9c645	test suite passes in direct mode This fixes a bug with git annex add in direct mode. If some files already existed in the tree pointing at the same key as a file that was just added, and their content was not present, add neglected to copy the content to those files. I also changed the behavior of moveAnnex slightly: When content is moved into the annex in direct mode, it does not overwrite any content already present in direct mode files. That content may be modified after all.	2013-05-17 15:59:37 -04:00
Joey Hess	a9081ae473	optimise direct mode startup scan A recent change made existing symlinks be re-staged. That does not need to be done during the startup scan though.	2013-04-24 21:20:29 -04:00
Joey Hess	ebee93a837	get rid of need to run pre-commit hook when assistant commits in direct mode That hook updates associated file bookkeeping info for direct mode. But, everything already called addAssociatedFile when adding/changing a file. It only needed to also call removeAssociatedFile when deleting a file, or a directory. This should make bulk adds faster, by some possibly significant amount. Bulk removals may be a little slower, since it has to use catKeyFile now on each removed file, but will still be faster than adds.	2013-04-24 18:04:59 -04:00
Joey Hess	b8e45ec9d7	refactoring and minor performance tweak	2013-04-24 17:46:46 -04:00
Joey Hess	04a27ad926	assistant: Bug fix to avoid annexing the files that git uses to stand in for symlinks on FAT and other filesystem not supporting symlinks. also, blog for the day..	2013-04-10 19:57:26 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	6e7842475b	convert "./file" from inotify to just "file" This just prettifies some display.	2013-04-02 16:20:23 -04:00
Joey Hess	38d61f934d	Update working tree files fully atomically This avoids commit churn by the assistant when eg, replacing a file with a symlink. But, just as importantly, it prevents the working tree being left with a deleted file if git-annex, or perhaps the whole system, crashes at the wrong time. (It also probably avoids confusing displays in file managers.)	2013-04-02 15:02:00 -04:00
Joey Hess	8c52b20cc7	optimise last commit Rather than re-adding a direct mode file unnecessarily when it's not changed, just re-stage the symlink.	2013-04-02 12:58:56 -04:00
Joey Hess	31cbde8190	assistant: Fix bug that could cause direct mode files to be unstaged from git. My test case for this bug is to have the assistant running and syncing to a remote, and create a file in the annex. Then at the command line run git annex drop. The assistant sees that the file is gone, sees it's a wanted file, and downloads it from the remote. With a directory special remote and a small file, I was seeing around 1 time in 3, a race where the file got unstaged from git after it got downloaded. Looking at what direct mode content managing code does in this case, it deletes the symlink, and then adds the file content back. It would be possible, sometimes, to avoid removing the symlink and do this atomically. And I probably should.. but in some cases, particularly where the file needs to be run through `cp` (multiple direct mode files with same content), there's no way to atomically replace the symlink with the content. Anyway, the bug turns out to be something that the watcher does right for indirect mode, but not for direct mode. When it got an add event, it checked to see if this was a new file, or one we've already added. In the latter case, no add event was queued. But that means that only the rm event is queued, and so it unstages the file. Fixed by queueing an add event even when the file is already in git. Tested by running hundreds of drops in a loop; file remained staged.	2013-04-02 12:45:31 -04:00
Joey Hess	5771cfce02	assistant: Check small files into git directly.	2013-03-29 16:54:59 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	f340fd324c	synthesize RmChange when a directory is deleted This gets directory renames closer to being fully detected. There's close to no extra overhead to doing it this way.	2013-03-11 15:14:42 -04:00
Joey Hess	74f723bb50	let's put type modules under the parent module, not in a Types directory	2013-03-10 22:24:13 -04:00
Joey Hess	65a4c7966f	moved transfer queueing out of watcher and into committer This cleaned up the code quite a bit; now the committer just looks at the Change to see if it's a change that needs to have a transfer queued for it. If I later want to add dropping keys for files that were removed, or something like that, this should make it straightforward. This also fixes a bug. In direct mode, moving a file out of an archive directory failed to start a transfer to get its content. The problem was that the file had not been committed to git yet, and so the transfer code didn't want to touch it, since fileKey failed to get its key. Only starting transfers after a commit avoids this problem.	2013-03-10 18:16:03 -04:00
Joey Hess	c908672f3d	fix another potential race with the watcher and direct mode Watcher wants to rewrite symlink to fix it. But in direct mode, the symlink could be replaced at any time with file content that has finished being transferred by some other process. So, just don't touch it. FWIW, I audited the rest of the assistant for places where it removes files, and the rest is ok. I have not audited the rest of git-annex.	2013-03-04 15:09:32 -04:00
Joey Hess	1d388d5579	fixed the race breaking moving files from archive in direct mode assistant: Fix bug in direct mode that could occur when a symlink is moved out of an archive directory, and resulted in the file not being set to direct mode when it was transferred. The bug was that the direct mode mapping was not up-to-date when the transferrer finished. So, finding no direct mode place to store the object, it was put into .git/annex in indirect mode. To fix this, just make the watcher update the direct mode mapping to include the new file before it starts the transfer. (Seems we don't need to update it to remove the old file if the link was moved, because the direct mode code will notice it's not present and the mapping gets updated for its removal later.) The reason this was a race, and was probably not seen often is because the committer came along and updated the direct mode mapping as part of adding the moved symlink. But when the file was sufficiently small or the remote sufficiently fast, this could happen after the transfer finished.	2013-03-04 14:25:22 -04:00
Joey Hess	a733271a9c	add additional debug info about reasons for drops	2013-03-01 15:58:44 -04:00
Joey Hess	46c9cbeb1e	add additional debug info about reasons for transfers	2013-03-01 15:23:59 -04:00
Joey Hess	08854afa10	fix inverted logic	2013-02-22 17:01:48 -04:00
Joey Hess	d7c93b8913	fully support core.symlinks=false in all relevant symlink handling code Refactored annex link code into nice clean new library. Audited and dealt with calls to createSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ createSymbolicLink linktarget file only when core.symlinks=true Assistant/WebApp/Configurators/Local.hs: createSymbolicLink link link test if symlinks can be made Command/Fix.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/FromKey.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/Indirect.hs: liftIO $ createSymbolicLink l f refuses to run if core.symlinks=false Init.hs: createSymbolicLink f f2 test if symlinks can be made Remote/Directory.hs: go [file] = catchBoolIO $ createSymbolicLink file f >> return True fast key linking; catches failure to make symlink and falls back to copy Remote/Git.hs: liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True ditto Upgrade/V1.hs: liftIO $ createSymbolicLink link f v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to readSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ catchMaybeIO $ readSymbolicLink file only when core.symlinks=true Assistant/Threads/Watcher.hs: ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) code that fixes real symlinks when inotify sees them It's ok to not fix pseudo-symlinks. Assistant/Threads/Watcher.hs: mlink <- liftIO (catchMaybeIO $ readSymbolicLink file) ditto Command/Fix.hs: stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do command only works in indirect mode Upgrade/V1.hs: getsymlink = takeFileName <$> readSymbolicLink file v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to isSymbolicLink. (Typically used with getSymbolicLinkStatus, but that is just used because getFileStatus is not as robust; it also works on pseudolinks.) Remaining calls are all safe, because: Assistant/Threads/SanityChecker.hs: | isSymbolicLink s -> addsymlink file ms only handles staging of symlinks that were somehow not staged (might need to be updated to support pseudolinks, but this is only a belt-and-suspenders check anyway, and I've never seen the code run) Command/Add.hs: if isSymbolicLink s || not (isRegularFile s) avoids adding symlinks to the annex, so not relevant Command/Indirect.hs: | isSymbolicLink s -> void $ flip whenAnnexed f $ only allowed on systems that support symlinks Command/Indirect.hs: whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do ditto Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f used to find unlocked files, only relevant in indirect mode Utility/FSEvents.hs: | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s Utility/FSEvents.hs: | Files.isSymbolicLink s -> Utility/INotify.hs: | Files.isSymbolicLink s -> Utility/INotify.hs: checkfiletype Files.isSymbolicLink addSymlinkHook f Utility/Kqueue.hs: | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change all above are lower-level, not relevant Audited and dealt with calls to isSymLink. Remaining calls are all safe, because: Annex/Direct.hs: | isSymLink (getmode item) = This is looking at git diff-tree objects, not files on disk Command/Unused.hs: | isSymLink (LsTree.mode l) = do This is looking at git ls-tree, not file on disk Utility/FileMode.hs:isSymLink :: FileMode -> Bool Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode low-level Done!!	2013-02-17 16:43:14 -04:00
Joey Hess	a261412c25	close	2013-01-28 15:39:51 +11:00
Joey Hess	a8bb2749b2	assistant: Ignore .DS_Store on OSX.	2013-01-28 15:13:22 +11:00
Joey Hess	5cd152b8a9	annex.autocommit New setting, can be used to disable autocommit of changed files by the assistant, while it still does data syncing and other tasks. Also wired into webapp UI	2013-01-27 22:43:05 +11:00
Joey Hess	76ddf9b6d3	webapp: Now allows restarting any threads that crash.	2013-01-26 17:09:33 +11:00
Joey Hess	d7ca6fb856	webapp: Now always logs to .git/annex/daemon.log It used to not log to daemon.log when a repository was first created, and when starting the webapp. Now both do. Redirecting stdout and stderr to the log is tricky when starting the webapp, because the web browser may want to communicate with the user. (Either a console web browser, or web.browser = echo) This is handled by restoring the original fds when running the browser.	2013-01-15 13:34:59 -04:00
Joey Hess	aedfcde969	guard readSymbolicLink throws an exception if the file is not a symlink	2013-01-05 16:07:27 -04:00
Joey Hess	1cdf2b923d	assistant: Make expensive transfer scan work fully in direct mode. The expensive scan uses lookupFile, but in direct mode, that doesn't work for files that are present. So the scan was not finding things that are present that need to be uploaded. (It did find things not present that needed to be downloaded.) Now lookupFile also works in direct mode. Note that it still prefers symlinks on disk to info committed to git, in direct mode. This is necessary to make things like Assistant.Threads.Watcher.onAddSymlink work correctly, when given a new symlink not yet checked into git (or replacing a file checked into git).	2013-01-05 15:57:53 -04:00
Joey Hess	8cc27b8afc	avoid double commits with inotify when direct mode file is created	2012-12-29 14:58:13 -04:00
Joey Hess	bf3270c5b7	add missing modifyHook for watcher Needed for FSEvents, which calls that hook for modified files. inotify seems to call the add hook, so I didn't notice it before.	2012-12-28 16:00:45 -04:00
Joey Hess	eb40227d15	assistant direct mode file add/change bookkeeping When a file is changed in direct mode, the old content is probably lost (at least from the local repo), and bookkeeping needs to be updated to reflect this. Also, synthetic add events are generated at assistant startup, so make it detect when the file has not really changed, and avoid re-adding it. This does add the overhead of querying the running git cat-file for the key that's recorded in git for the file, each time a file is added or modified in direct mode.	2012-12-25 15:48:15 -04:00
Joey Hess	cc5140d295	assistant adding of modified files in direct mode Works with inotify, but I think in kqueue we don't get events for existing files that get modified.	2012-12-24 14:42:19 -04:00
Joey Hess	95db595e91	make startup scan for deleted files work in direct mode git add --update cannot be used, because it'll stage typechanged direct mode files. Instead, use ls-files to find deleted files, and stage them ourselves. It seems that no commit was made before when the scan staged deleted files. (Probably masked since if files were added, a commit happened then..) Now that I'm doing the staging, I was also able to fix that bug.	2012-12-24 14:24:13 -04:00
Joey Hess	c6d2bbe402	assistant adding of files in direct mode	2012-12-24 13:37:29 -04:00
Joey Hess	82617b92e9	move thirdparty program installation for standalone bundle into haskell program This allows it to use Build.SysConfig to always install the programs configure detected. Among other fixes, this ensures the right uuid generator and checksum programs are installed. I also cleaned up the handling of lsof's path; configure now checks for it in PATH, but falls back to looking for it in sbin directories.	2012-12-14 16:07:59 -04:00
Joey Hess	463cf58140	webapp and assistant glacier support	2012-11-24 16:30:15 -04:00
Joey Hess	c282c8b492	queue uploads when a new or renamed symlink is handled	2012-11-24 15:38:24 -04:00
Joey Hess	93ffd47d76	finished pushing Assistant monad into all relevant files All temporary and old functions are removed.	2012-10-30 17:14:51 -04:00
Joey Hess	47d94eb9a4	pushed Assistant monad down into DaemonStatus code Currently have three old versions of functions that more reworking is needed to remove: getDaemonStatusOld, modifyDaemonStatusOld_, and modifyDaemonStatusOld	2012-10-30 15:39:15 -04:00
Joey Hess	ea8df8fe9f	cleanup daemonStatus accessors	2012-10-30 14:44:18 -04:00

80 commits