git-annex

Author	SHA1	Message	Date
Joey Hess	ec98581112	notice deleted files on startup	2012-06-04 18:14:42 -04:00
Joey Hess	5b4e5ce7e5	deletion When a new file is annexed, a deletion event occurs when it's moved away to be replaced by a symlink. Most of the time, there is no problematic race, because the same thread runs the add event as the deletion event. So, once the symlink is in place, the deletion code won't run at all, due to existing checks that a deleted file is really gone. But there is a race at startup, as then the inotify thread is running at the same time as the main thread, which does the initial tree walking and annexing. It would be possible for the deletion inotify to run in a perfect race with the addition, and remove the newly added symlink from the git cache. To solve this race, added event serialization via an MVar. We putMVar before running each event, which blocks if an event is already running. And when an event finishes (or crashes!), we takeMVar to free the lock. Also, make rm -rf not spew warnings by passing --ignore-unmatch when deleting directories.	2012-06-04 18:09:18 -04:00
Joey Hess	659e6b1324	suppress "recording state in git" message during add	2012-06-04 17:18:54 -04:00
Joey Hess	677ad74687	add handling of symlink addition events And just like that, annexed files can be moved and copied around within the tree, and are automatically fixed to point to the content, and staged in git. Huzzah! Delete still remains TODO, with its troublesome race during add..	2012-06-04 15:10:43 -04:00
Joey Hess	7053f5f947	handle directory deletion When a directory is deleted, or moved away, git rm -r it to stage the deletion.	2012-06-04 13:30:30 -04:00
Joey Hess	23dbff4b43	add events for symlink creation and directory removal Improved the inotify code, so it will also notice directory removal and symlink creation. In the watch code, optimised away a stat of a file that's being added, that's done by Command.Add.start. This is the reason symlink creation is handled separately from file creation, since during initial tree walk at startup, a stat was already done, and can be reused.	2012-06-04 13:22:56 -04:00
Joey Hess	eab3872d91	Merge branch 'master' into watch	2012-06-04 12:07:59 -04:00
Joey Hess	3a10095d40	import: New subcommand, pulls files from a directory outside the annex and adds them Use case for this was developed somewhere on the Trans-Siberian Railroad.	2012-05-31 19:47:18 -04:00
Joey Hess	65977a5584	lock: Reset unlocked file to index, rather than to branch head. Resetting an unlocked file to the branch head failed if it had just been added, not committed, and unlocked, since the branch didn't have it. The code was concerned about dropping any changes that might be staged in the index, but I cannot see why.	2012-05-30 17:01:22 -04:00
Joey Hess	6e213d04f1	sync: Show a nicer message if a user tries to sync to a special remote.	2012-05-27 20:55:56 -04:00
Joey Hess	bb4f31a0ee	Clean up handling of git directory and git worktree. Baked into the code was an assumption that a repository's git directory could be determined by adding ".git" to its work tree (or nothing for bare repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are used to separate the two. This was attacked at the type level, by storing the gitdir and worktree separately, so Nothing for the worktree means a bare repo. A complication arose because we don't learn whether a repository is bare until its configuration is read. So another Location type handles repositories that have not had their config read yet. I am not entirely happy with this being a Location type, rather than representing them entirely separate from the Git type. The new code is not worse than the old, but better types could enforce more safety. Added support for core.worktree. Overriding it with -c isn't supported because it's not really clear what to do if a git repo's config is read, is not bare, and is then overridden to bare. What is the right git directory in this case? I will worry about this if/when someone has a use case for overriding core.worktree with -c. (See Git.Config.updateLocation) Also removed and renamed some functions like gitDir and workTree that misused git's terminology. One minor regression is known: git annex add in a bare repository does not print a nice error message, but runs git ls-files in a way that fails earlier with a less nice error message. This is because before --work-tree was always passed to git commands, even in a bare repo, while now it's not.	2012-05-18 17:03:12 -04:00
Joey Hess	f7d8982672	Fix use of several config settings annex.ssh-options, annex.rsync-options, annex.bup-split-options. And adjust types to avoid the bugs that broke several config settings recently. Now "annex." prefixing is enforced at the type level.	2012-05-05 20:16:56 -04:00
Joey Hess	392931eca9	addunused: New command, the opposite of dropunused, it relinks unused content into the git repository.	2012-05-02 14:59:05 -04:00
Joey Hess	8f45300479	dropunused: Allow specifying ranges to drop. Sort of by popular demand, but the last straw for not using seq was that it can run into command line length limits.	2012-05-02 13:15:19 -04:00
Joey Hess	0c9c14b52f	percentage library	2012-04-29 17:48:07 -04:00
Joey Hess	d2bfba6324	show percent the bloom filter is full	2012-04-29 16:10:47 -04:00
Joey Hess	eedde34549	show amount of reserved space	2012-04-23 10:37:05 -04:00
Joey Hess	84ac8c58db	Add annex.httpheaders and annex.httpheader-command config settings Allow custom headers to be sent with all HTTP requests. (Requested by the Internet Archive)	2012-04-22 01:13:09 -04:00
Joey Hess	ed79596b75	noop	2012-04-21 23:32:33 -04:00
Joey Hess	7e45712d19	better file mode setting code	2012-04-21 16:01:56 -04:00
Joey Hess	b4a5e39ee6	Support git's core.sharedRepository configuration This is incomplete, it does not honor it yet for hash directories and other annex bookkeeping files. Some of that is not needed for a bare repo; some of it may be.	2012-04-21 15:36:52 -04:00
Joey Hess	262017e17d	export a more generalized checkDiskSpace	2012-04-20 16:06:10 -04:00
Joey Hess	d5ffd2d99d	watch subcommand So far this only handles auto-annexing new files that are created inside the repository while it's running. To make this really useful, it needs to at least: - notice deleted files and stage the deletion (tricky; there's a race with add..) - notice renamed files, auto-fix the symlink, and stage the new file location - periodically auto-commit staged changes - honor .gitignore, not adding files it excludes Also nice to have would be: - Somehow sync remotes, possibly using a push sync like dvcs-autosync does, so they are immediately updated. - Somehow get content that is unavailable. This is problematic with inotify, since we only get an event once the user has tried (and failed) to read from the file. Perhaps instead, automatically copy content that is added out to remotes, with the goal of all repos eventually getting a copy, if df allows. - Drop files that have not been used lately, or meet some other criteria (as long as there's a copy elsewhere). - Perhaps automatically dropunused files that have been deleted, although I cannot see a way to do that, since by the time the inotify deletion event arrives, the file is deleted, and we cannot see what its symlink pointed to! Alternatively, perhaps automatically do an expensive unused/dropunused cleanup process. Some of this probably needs the currently stateless threads to maintain a common state.	2012-04-12 17:42:05 -04:00
Joey Hess	fcc08c59ec	use unabbreviated size units in status	2012-04-06 14:54:41 -04:00
Joey Hess	e38a839a80	Rewrote free disk space checking code Moving the portability handling into a small C library cleans up things a lot, avoiding the pain of unpacking structs from inside haskell code.	2012-03-22 17:32:47 -04:00
Joey Hess	f1398b5583	use new getConfig	2012-03-22 17:32:47 -04:00
Joey Hess	4eb5112681	rationalize getConfig getConfig got a remote-specific config, and this confusing name caused it to be used in a couple of places that were only interested in global configs. Rename to getRemoteConfig and make getConfig only get global configs. There are no behavior changes here, but remote.<name>.annex-web-options never actually worked (and per-remote web options are very unlikely to be useful, so I didn't make it work), so fix the documentation for it.	2012-03-22 17:32:47 -04:00
Joey Hess	52b90e5d4c	tweak	2012-03-22 17:32:47 -04:00
Joey Hess	188e2edc41	status: Prints available local disk space, or shows if git-annex doesn't know.	2012-03-21 21:55:02 -04:00
Joey Hess	a362c46b70	fun with symbols Nothing at all on hackage is using <&&> or <||>. (Also, <&&> should short-circuit on failure.)	2012-03-17 00:38:40 -04:00
Joey Hess	771052a85e	optimize monadic || (||) used applicative style runs both conditions rather than short circuiting. Add an orM that properly short-circuits.	2012-03-16 12:28:17 -04:00
Joey Hess	60ab3d84e1	added ifM and nuked 11 lines of code no behavior changes	2012-03-14 17:43:34 -04:00
Joey Hess	342fc28437	Merge branch 'master' into bloom Conflicts: Command/Commit.hs debian/changelog	2012-03-14 12:41:48 -04:00
Joey Hess	6cb4743cfb	ignore hook exit status	2012-03-14 12:41:00 -04:00
Joey Hess	5b869eef91	git-annex-shell: Runs hooks/annex-content after content is received or dropped.	2012-03-14 12:18:10 -04:00
Joey Hess	caf97fcffd	git-annex-shell: Runs hooks/annex-content after content is received or dropped.	2012-03-14 12:01:56 -04:00
Joey Hess	94aff8b878	Merge branch 'master' into bloom Conflicts: debian/changelog	2012-03-12 16:32:29 -04:00
Joey Hess	25809ce2e0	finish bloom filters Add tuning, docs, etc. Not sure if status is the right place to report size.. perhaps unused should report the size and also warn if it sees more keys than the bloom filter allows?	2012-03-12 16:18:35 -04:00
Joey Hess	faf3a94fa7	added second stage bloom filter	2012-03-12 15:21:58 -04:00
Joey Hess	32f9742a88	fixed bloom filter creation space leak it works!	2012-03-12 14:09:43 -04:00
Joey Hess	160715166b	try at using bloom filters leaks memory	2012-03-12 02:39:25 -04:00
Joey Hess	89ee70c43a	status: More accurate display of sizes of tmp and bad keys. Can't trust the key size to be accurate for tmp and bad keys, so check actual file size. In the wild I saw the old code be wrong by a factor of about 100! If all tmp/bad keys are empty, they're not shown in status at all. Showing 0 bytes and suggesting to clean it up seemed weird..	2012-03-12 00:41:48 -04:00
Joey Hess	83bbb3bc93	prettify	2012-03-11 21:21:51 -04:00
Joey Hess	5df18b311a	avoid needing to keep list of present keys Stale and bad files are rare, so it's more efficient to use inAnnex to see if they can be deleted, rather than keeping the list of all present keys around for them.	2012-03-11 20:46:03 -04:00
Joey Hess	ff3644ad38	status: Fixed to run in nearly constant space. Before, it leaked space due to caching lists of keys. Now all necessary data about keys is calculated as they stream in. The "nearly constant" is due to getKeysPresent, which builds up a lot of [] thunks as it traverses .git/annex/objects/. Will deal with it later.	2012-03-11 17:15:58 -04:00
Joey Hess	b086e32c63	unused: Reduce memory usage significantly. Much of the memory bloat turned out to be due to getKeysReferenced containing a mapM, which is strict and buffered the whole list rather than streaming it. The other half of the bloat was due to building a temporary Set in order to call S.difference. While that is more cpu efficient, I switched to successive S.delete, since with it, I can run a whole git annex unused in less than 8 mb of memory. The whole Set of keys with content available is still stored in memory, so running unused in a repo with a whole lot of file content will still use more memory. In a repo containing 6000 files, it needed 40 mb. Note that the status command still uses the bloatful getKeysReferenced.	2012-03-11 16:24:07 -04:00
Joey Hess	997e29f294	sync: Sync to lower cost remotes first. This has two benefits. 1. When a lot of refs are going to be received, get them via lower cost connection when possible. 2. Allows ctrl-c of sync after the cheaper remotes have been pulled from (or pushed to).	2012-03-10 15:37:38 -04:00
Joey Hess	5ab82230f7	fsck: Fix up any broken links and misplaced content caused by the directory hash calculation bug fixed in the last release.	2012-03-10 14:46:21 -04:00
Joey Hess	dc9049373e	cleanup	2012-03-06 14:12:15 -04:00
Joey Hess	1098bc37ab	"here" can be used to refer to the current repository, which can read better than the old "." (which still works too).	2012-03-01 22:35:10 -04:00
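
The event serialization described in commit 5b4e5ce7e5 (putMVar before each event, takeMVar when it finishes or crashes) can be sketched roughly as follows. This is a minimal illustration, not the actual git-annex code: the names `serialized`, `handled`, and `worker` are hypothetical, and `bracket_` is one assumed way of making sure the lock is freed when a handler throws.

```haskell
import Control.Concurrent (forkIO, yield)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (bracket_)
import Control.Monad (replicateM_)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Hypothetical helper: run one event handler at a time.
-- putMVar blocks while another event holds the lock (the MVar is full);
-- takeMVar empties it again, and bracket_ runs it even if the handler throws.
serialized :: MVar () -> IO a -> IO a
serialized lock = bracket_ (putMVar lock ()) (takeMVar lock)

main :: IO ()
main = do
    lock <- newEmptyMVar          -- empty MVar means "no event running"
    done <- newEmptyMVar          -- used only to wait for the workers
    handled <- newIORef (0 :: Int)
    let event = serialized lock $ do
            -- Deliberately non-atomic read/modify/write; it is only safe
            -- because the lock keeps two events from interleaving.
            n <- readIORef handled
            yield
            writeIORef handled (n + 1)
        worker = replicateM_ 100 event >> putMVar done ()
    _ <- forkIO worker            -- stands in for the inotify thread
    _ <- forkIO worker            -- stands in for the startup tree walk
    replicateM_ 2 (takeMVar done)
    readIORef handled >>= print   -- every one of the 200 events is counted
```

Starting from an empty MVar follows the commit message's wording: the lock is taken by filling the MVar and released by emptying it, the reverse of the more common "take to lock, put to unlock" idiom.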

512 commits
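
The short-circuiting issue behind commit 771052a85e can be illustrated with a small sketch. The `orM` below is an assumed two-argument definition written for this example, and `check` is a hypothetical helper; the real git-annex function may have a different signature.

```haskell
import Control.Applicative (liftA2)

-- Assumed definition for illustration: only run the second action
-- when the first one returns False.
orM :: Monad m => m Bool -> m Bool -> m Bool
orM a b = do
    x <- a
    if x then return True else b

main :: IO ()
main = do
    -- Hypothetical check that announces when its effect actually runs.
    let check name result = putStrLn ("ran " ++ name) >> return result
    -- Applicative style: both effects run before (||) sees either result.
    _ <- liftA2 (||) (check "first" True) (check "second" False)
    putStrLn "--"
    -- Monadic style: the second check is skipped once the first is True.
    _ <- orM (check "first" True) (check "second" False)
    return ()
```

In the applicative version both "ran" lines are printed, while `orM` prints only the first, which matters when the conditions are expensive or have side effects.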