git-annex

Author	SHA1	Message	Date
Joey Hess	cb2255e93a	do fewer commits during long batch jobs 10 thousand queue size does not use appreciable memory in my testing.	2012-06-12 16:25:56 -04:00
Joey Hess	b240418acc	better optimisation of add check Now really only done in the startup scan. It turns out to be quite hard for event handlers to know when the startup scan is complete. I tried to make addWatch pass that info, but found threading the state very difficult. For now, a quick hack, using the fast flag. Note that it's actually possible for inotify events to come in while the startup scan is still ongoing. Due to my hack, the expensive check will be done for files added in such inotify events.	2012-06-12 16:24:06 -04:00
Joey Hess	7d2c813396	fix bug that turned files already in git into symlinks This requires a relatively expensive test at file add time to see if it's in git already. But it can be optimised to only happen during the startup scan.	2012-06-12 15:57:24 -04:00
Joey Hess	535d9e4998	add a flag indicating if an event was synthesized during initial dir scan	2012-06-12 14:34:09 -04:00
Joey Hess	d3b9b32f21	cleanup	2012-06-12 13:54:00 -04:00
Joey Hess	942d8f7298	hlint	2012-06-12 11:32:06 -04:00
Joey Hess	d3a6f04abf	update	2012-06-11 15:41:26 -04:00
Joey Hess	7f3934520a	avoid using STM while the MVar is held I thought this might be a lock conflict that explains the deadlock when built with -threaded, but it seems not.. it still locks! It even locks without the committer thread. Indeed, it locks when running "git annex add"! -threaded is exposing some other problem. Still, this seems conceptually cleaner and did not add any inefficiencies. Also added some high-level documentation about the threads used.	2012-06-11 15:29:11 -04:00
Joey Hess	f7dbcd58ff	tweak	2012-06-11 14:24:13 -04:00
Joey Hess	a5a3cd55ac	Merge branch 'master' into watch Conflicts: debian/changelog	2012-06-11 12:13:07 -04:00
Joey Hess	7f70767bfb	uninit: Refuse to run in a subdirectory. Closes: #677076	2012-06-11 10:33:58 -04:00
Joey Hess	d0a0a6ae21	git annex watch --stop	2012-06-11 02:01:20 -04:00
Joey Hess	0b3e2bed78	add a pid file Writes pid to a file. Is supposed to take an exclusive lock, but that's not working, and it's too late for me to understand why.	2012-06-11 01:20:19 -04:00
Joey Hess	d5884388b0	daemonize git annex watch	2012-06-11 00:39:09 -04:00
Joey Hess	ca9ee21bd7	crazy optimisation Crazy like a fox..	2012-06-10 19:58:34 -04:00
Joey Hess	c1b432ee54	run git add --update after inotify is started This way, there's no window where deleted files won't be noticed.	2012-06-10 19:10:18 -04:00
Joey Hess	aae0ba1995	fixed the double commits problem	2012-06-10 18:41:05 -04:00
Joey Hess	fc0dd79774	avoid running pre-commit hook from watch commits	2012-06-10 17:53:17 -04:00
Joey Hess	cda6c4dff5	tweak	2012-06-10 17:40:35 -04:00
Joey Hess	2de50f733a	smart commit thread The commit thread now has access to a channel containing the times of all uncommitted changes. This lets it be smart about detecting busy times when a batch job is running (such as rm -rf, or untarring something, etc), and avoid committing until it's done. While at the same time, instantly committing one-off changes that the user is going to expect to see immediately. I had to use STM to implement the channel, because of http://hackage.haskell.org/trac/ghc/ticket/4154 While this adds a dependency, I always wanted to use STM, so this actually makes me happy. ;) Also happy that shouldCommit is a pure function, so other commit smartness strategies can easily be played with. Although the current one seems pretty good. There is one bug, for some reason it does double commits, every time.	2012-06-10 16:07:48 -04:00
Joey Hess	6e54907e35	add a thread to commit changes Currently the stupidest possible version, just wakes up every second, and may make empty commits sometimes.	2012-06-10 13:56:39 -04:00
Joey Hess	e5f855b7f8	generalize and improve state MVar code	2012-06-10 13:23:10 -04:00
Joey Hess	5308b51ec0	stage deletions directly using update-index no need to run git-rm separately	2012-06-10 13:05:58 -04:00
Joey Hess	7f823b56af	fix non-linux build	2012-06-09 14:06:56 -04:00
Joey Hess	d45a9a7831	refactor and function name cleanup (oops, I had a calcMerge and a calc_merge!)	2012-06-08 00:29:39 -04:00
Joey Hess	7d78cbf97c	use git queue for rm too	2012-06-07 21:17:10 -04:00
Joey Hess	20f425be19	make watch use the queue May not work. Certainly needs to flush the queue from time to time when only symlink changes are being made.	2012-06-07 15:40:44 -04:00
Joey Hess	0a11b35d89	extend Git.Queue to be able to queue more than simple git commands While I was in there, I noticed and fixed a bug in the queue size calculations. It was never encountered only because Queue.add was only ever run with 1 file in the list.	2012-06-07 15:19:44 -04:00
Joey Hess	727158ff55	Merge branch 'master' into watch	2012-06-07 13:48:55 -04:00
Joey Hess	4d1c114e4d	initremote: Automatically describe a remote when creating it. This ensures that all special remotes show up in git annex status. Before, a special remote that was not manually described, and was not a current git remote, did not show up there, although initremote did list it.	2012-06-07 11:16:48 -04:00
Joey Hess	d5de27ff40	tweak	2012-06-06 23:30:38 -04:00
Joey Hess	b8ae9528ab	refactor	2012-06-06 23:20:09 -04:00
Joey Hess	b8f85f7a82	build watch on non-linux, just don't do anything	2012-06-06 22:49:32 -04:00
Joey Hess	c5b11561f0	handle running out of watch descriptors	2012-06-06 16:50:28 -04:00
Joey Hess	db8effb8f3	ignore .gitignore and .gitattributes	2012-06-06 15:50:12 -04:00
Joey Hess	b819f644ad	close the git add race There's a race adding a new file to the annex: The file is moved to the annex and replaced with a symlink, and then we git add the symlink. If someone comes along in the meantime and replaces the symlink with something else, such as a new large file, we add that instead. Which could be bad.. This race is fixed by avoiding using git add, instead the symlink is directly staged into the index. It would be nice to make `git annex add` use this same technique. I have not done so yet because it currently runs git update-index once per file, which would slow down `git annex add`. A future enhancement would be to extend the Git.Queue to include the ability to run update-index with a list of Streamers.	2012-06-06 14:29:10 -04:00
Joey Hess	993e6459a3	factor out nukeFile	2012-06-06 13:13:13 -04:00
Joey Hess	723eb19bbf	split out utility functions	2012-06-06 13:07:30 -04:00
Joey Hess	a7a729bce4	Merge branch 'master' into watch	2012-06-05 20:30:37 -04:00
Joey Hess	c981ccc077	add: Prevent (most) modifications from being made to a file while it is being added to the annex. Anything that tries to open the file for write, or delete the file, or replace it with something else, will not affect the add. Only if a process has the file open for write before add starts can it still change it while (or after) it's added to the annex. (fsck will catch this later of course)	2012-06-05 20:28:34 -04:00
Joey Hess	5809f33f8b	use createAnnexDirectory when setting up tmp dir	2012-06-05 20:25:32 -04:00
Joey Hess	d3cee987ca	separate source of content from the filename associated with the key when generating a key This already made migrate's code a lot simpler.	2012-06-05 19:51:03 -04:00
Joey Hess	cbdaccd44a	run event handlers all in the same Annex monad Uses a MVar again, as there seems no other way to thread the state through inotify events. This is a rather unsatisfactory result. I had wanted to run them in the same monad so that the git queue could be used to coalesce git commands and speed things up. But, that led to fragility: If several files are added, and one is removed before queue flush, git add will fail to add any of them. So, the queue is still explicitly flushed after each add for now. TODO: Investigate using git add --ignore-errors. This would need to be done in Command.Add. And, git add still exits nonzero with it, so would need to avoid crashing on queue flush.	2012-06-04 21:21:52 -04:00
Joey Hess	48efa2d2d3	avoid explicit queue flush The queue is still flushed on add, because each add event is handled by a separate Annex monad. That needs to be fixed to speed up add a lot.	2012-06-04 20:44:15 -04:00
Joey Hess	bd7857d903	ignore-unmatch when removing a staged file When a file is added, and then deleted before the add action runs, the delete event was unhappy that the file never did get staged.	2012-06-04 20:13:25 -04:00
Joey Hess	cbf16f1967	refactor	2012-06-04 19:43:29 -04:00
Joey Hess	ec98581112	notice deleted files on startup	2012-06-04 18:14:42 -04:00
Joey Hess	5b4e5ce7e5	deletion When a new file is annexed, a deletion event occurs when it's moved away to be replaced by a symlink. Most of the time, there is no problematic race, because the same thread runs the add event as the deletion event. So, once the symlink is in place, the deletion code won't run at all, due to existing checks that a deleted file is really gone. But there is a race at startup, as then the inotify thread is running at the same time as the main thread, which does the initial tree walking and annexing. It would be possible for the deletion inotify to run in a perfect race with the addition, and remove the newly added symlink from the git cache. To solve this race, added event serialization via a MVar. We putMVar before running each event, which blocks if an event is already running. And when an event finishes (or crashes!), we takeMVar to free the lock. Also, make rm -rf not spew warnings by passing --ignore-unmatch when deleting directories.	2012-06-04 18:09:18 -04:00
Joey Hess	659e6b1324	suppress "recording state in git" message during add	2012-06-04 17:18:54 -04:00
Joey Hess	677ad74687	add handling of symlink addition events And just like that, annexed files can be moved and copied around within the tree, and are automatically fixed to point to the content, and staged in git. Huzzah! Delete still remains TODO, with its troublesome race during add..	2012-06-04 15:10:43 -04:00
Joey Hess	7053f5f947	handle directory deletion When a directory is deleted, or moved away, git rm -r it to stage the deletion.	2012-06-04 13:30:30 -04:00
Joey Hess	23dbff4b43	add events for symlink creation and directory removal Improved the inotify code, so it will also notice directory removal and symlink creation. In the watch code, optimised away a stat of a file that's being added, that's done by Command.Add.start. This is the reason symlink creation is handled separately from file creation, since during initial tree walk at startup, a stat was already done, and can be reused.	2012-06-04 13:22:56 -04:00
Joey Hess	eab3872d91	Merge branch 'master' into watch	2012-06-04 12:07:59 -04:00
Joey Hess	3a10095d40	import: New subcommand, pulls files from a directory outside the annex and adds them Use case for this was developed somewhere on the Trans-Siberian Railroad.	2012-05-31 19:47:18 -04:00
Joey Hess	65977a5584	lock: Reset unlocked file to index, rather than to branch head. Resetting an unlocked file to the branch head failed if it had just been added, not committed, and unlocked, since the branch didn't have it. The code was concerned about dropping any changes that might be staged in the index, but I cannot see why.	2012-05-30 17:01:22 -04:00
Joey Hess	6e213d04f1	sync: Show a nicer message if a user tries to sync to a special remote.	2012-05-27 20:55:56 -04:00
Joey Hess	bb4f31a0ee	Clean up handling of git directory and git worktree. Baked into the code was an assumption that a repository's git directory could be determined by adding ".git" to its work tree (or nothing for bare repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are used to separate the two. This was attacked at the type level, by storing the gitdir and worktree separately, so Nothing for the worktree means a bare repo. A complication arose because we don't learn whether a repository is bare until its configuration is read. So another Location type handles repositories that have not had their config read yet. I am not entirely happy with this being a Location type, rather than representing them entirely separate from the Git type. The new code is not worse than the old, but better types could enforce more safety. Added support for core.worktree. Overriding it with -c isn't supported because it's not really clear what to do if a git repo's config is read, is not bare, and is then overridden to bare. What is the right git directory in this case? I will worry about this if/when someone has a use case for overriding core.worktree with -c. (See Git.Config.updateLocation) Also removed and renamed some functions like gitDir and workTree that misused git's terminology. One minor regression is known: git annex add in a bare repository does not print a nice error message, but runs git ls-files in a way that fails earlier with a less nice error message. This is because before --work-tree was always passed to git commands, even in a bare repo, while now it's not.	2012-05-18 17:03:12 -04:00
Joey Hess	f7d8982672	Fix use of several config settings annex.ssh-options, annex.rsync-options, annex.bup-split-options. And adjust types to avoid the bugs that broke several config settings recently. Now "annex." prefixing is enforced at the type level.	2012-05-05 20:16:56 -04:00
Joey Hess	392931eca9	addunused: New command, the opposite of dropunused, it relinks unused content into the git repository.	2012-05-02 14:59:05 -04:00
Joey Hess	8f45300479	dropunused: Allow specifying ranges to drop. Sort of by popular demand, but the last straw for not using seq was that it can run into command line length limits.	2012-05-02 13:15:19 -04:00
Joey Hess	0c9c14b52f	percentage library	2012-04-29 17:48:07 -04:00
Joey Hess	d2bfba6324	show percent the bloom filter is full	2012-04-29 16:10:47 -04:00
Joey Hess	eedde34549	show amount of reserved space	2012-04-23 10:37:05 -04:00
Joey Hess	84ac8c58db	Add annex.httpheaders and annex.httpheader-command config settings Allow custom headers to be sent with all HTTP requests. (Requested by the Internet Archive)	2012-04-22 01:13:09 -04:00
Joey Hess	ed79596b75	noop	2012-04-21 23:32:33 -04:00
Joey Hess	7e45712d19	better file mode setting code	2012-04-21 16:01:56 -04:00
Joey Hess	b4a5e39ee6	Support git's core.sharedRepository configuration This is incomplete, it does not honor it yet for hash directories and other annex bookkeeping files. Some of that is not needed for a bare repo; some of it may be.	2012-04-21 15:36:52 -04:00
Joey Hess	262017e17d	export a more generalized checkDiskSpace	2012-04-20 16:06:10 -04:00
Joey Hess	d5ffd2d99d	watch subcommand So far this only handles auto-annexing new files that are created inside the repository while it's running. To make this really useful, it needs to at least: - notice deleted files and stage the deletion (tricky; there's a race with add..) - notice renamed files, auto-fix the symlink, and stage the new file location - periodically auto-commit staged changes - honor .gitignore, not adding files it excludes Also nice to have would be: - Somehow sync remotes, possibly using a push sync like dvcs-autosync does, so they are immediately updated. - Somehow get content that is unavailable. This is problematic with inotify, since we only get an event once the user has tried (and failed) to read from the file. Perhaps instead, automatically copy content that is added out to remotes, with the goal of all repos eventually getting a copy, if df allows. - Drop files that have not been used lately, or meet some other criteria (as long as there's a copy elsewhere). - Perhaps automatically dropunused files that have been deleted, although I cannot see a way to do that, since by the time the inotify deletion event arrives, the file is deleted, and we cannot see what its symlink pointed to! Alternatively, perhaps automatically do an expensive unused/dropunused cleanup process. Some of this probably needs the currently stateless threads to maintain a common state.	2012-04-12 17:42:05 -04:00
Joey Hess	fcc08c59ec	use unabbreviated size units in status	2012-04-06 14:54:41 -04:00
Joey Hess	e38a839a80	Rewrote free disk space checking code Moving the portability handling into a small C library cleans up things a lot, avoiding the pain of unpacking structs from inside haskell code.	2012-03-22 17:32:47 -04:00
Joey Hess	f1398b5583	use new getConfig	2012-03-22 17:32:47 -04:00
Joey Hess	4eb5112681	rationalize getConfig getConfig got a remote-specific config, and this confusing name caused it to be used a couple of places that only were interested in global configs. Rename to getRemoteConfig and make getConfig only get global configs. There are no behavior changes here, but remote.<name>.annex-web-options never actually worked (and per-remote web options is a very unlikely to be useful case so I didn't make it work), so fix the documentation for it.	2012-03-22 17:32:47 -04:00
Joey Hess	52b90e5d4c	tweak	2012-03-22 17:32:47 -04:00
Joey Hess	188e2edc41	status: Prints available local disk space, or shows if git-annex doesn't know.	2012-03-21 21:55:02 -04:00
Joey Hess	a362c46b70	fun with symbols Nothing at all on hackage is using <&&> or <||>. (Also, <&&> should short-circuit on failure.)	2012-03-17 00:38:40 -04:00
Joey Hess	771052a85e	optimize monadic || (||) used applicative style runs both conditions rather than short circuiting. Add an orM that properly short-circuits.	2012-03-16 12:28:17 -04:00
Joey Hess	60ab3d84e1	added ifM and nuked 11 lines of code no behavior changes	2012-03-14 17:43:34 -04:00
Joey Hess	342fc28437	Merge branch 'master' into bloom Conflicts: Command/Commit.hs debian/changelog	2012-03-14 12:41:48 -04:00
Joey Hess	6cb4743cfb	ignore hook exit status	2012-03-14 12:41:00 -04:00
Joey Hess	5b869eef91	git-annex-shell: Runs hooks/annex-content after content is received or dropped.	2012-03-14 12:18:10 -04:00
Joey Hess	caf97fcffd	git-annex-shell: Runs hooks/annex-content after content is received or dropped.	2012-03-14 12:01:56 -04:00
Joey Hess	94aff8b878	Merge branch 'master' into bloom Conflicts: debian/changelog	2012-03-12 16:32:29 -04:00
Joey Hess	25809ce2e0	finish bloom filters Add tuning, docs, etc. Not sure if status is the right place to report size.. perhaps unused should report the size and also warn if it sees more keys than the bloom filter allows?	2012-03-12 16:18:35 -04:00
Joey Hess	faf3a94fa7	added second stage bloom filter	2012-03-12 15:21:58 -04:00
Joey Hess	32f9742a88	fixed bloom filter creation space leak it works!	2012-03-12 14:09:43 -04:00
Joey Hess	160715166b	try at using bloom filters leaks memory	2012-03-12 02:39:25 -04:00
Joey Hess	89ee70c43a	status: More accurate display of sizes of tmp and bad keys. Can't trust the key size to be accurate for tmp and bad keys, so check actual file size. In the wild I saw the old code be wrong by a factor of about 100! If all tmp/bad keys are empty, they're not shown in status at all. Showing 0 bytes and suggesting to clean it up seemed weird..	2012-03-12 00:41:48 -04:00
Joey Hess	83bbb3bc93	prettify	2012-03-11 21:21:51 -04:00
Joey Hess	5df18b311a	avoid needing to keep list of present keys Stale and bad files are rare, so it's more efficient to use inAnnex to see if they can be deleted, rather than keeping the list of all present keys around for them.	2012-03-11 20:46:03 -04:00
Joey Hess	ff3644ad38	status: Fixed to run in nearly constant space. Before, it leaked space due to caching lists of keys. Now all necessary data about keys is calculated as they stream in. The "nearly constant" is due to getKeysPresent, which builds up a lot of [] thunks as it traverses .git/annex/objects/. Will deal with it later.	2012-03-11 17:15:58 -04:00
Joey Hess	b086e32c63	unused: Reduce memory usage significantly. Much of the memory bloat turned out to be due to getKeysReferenced containing a mapM, which is strict and buffered the whole list rather than streaming it. The other half of the bloat was due to building a temporary Set in order to call S.difference. While that is more cpu efficient, I switched to successive S.delete, since with it, I can run a whole git annex unused in less than 8 mb of memory. The whole Set of keys with content available is still stored in memory, so running unused in a repo with a whole lot of file content will still use more memory. In a repo containing 6000 files, it needed 40 mb. Note that the status command still uses the bloatful getKeysReferenced.	2012-03-11 16:24:07 -04:00
Joey Hess	997e29f294	sync: Sync to lower cost remotes first. This has two benefits. 1. When a lot of refs are going to be received, get them via lower cost connection when possible. 2. Allows ctrl-c of sync after the cheaper remotes have been pulled from (or pushed to).	2012-03-10 15:37:38 -04:00
Joey Hess	5ab82230f7	fsck: Fix up any broken links and misplaced content caused by the directory hash calculation bug fixed in the last release.	2012-03-10 14:46:21 -04:00
Joey Hess	dc9049373e	cleanup	2012-03-06 14:12:15 -04:00
Joey Hess	1098bc37ab	"here" can be used to refer to the current repository, which can read better than the old "." (which still works too).	2012-03-01 22:35:10 -04:00
Joey Hess	2fd294d06f	move --from, copy --from: 10 times faster scanning remote on local disk Rather than go through the location log to see which files are present on the remote, it simply looks at the disk contents directly. I benchmarked this speeding up scanning 834 files, from an annex on my phone's SSD, from 11.39 seconds to 1.31 seconds. (No files actually moved.) Also benchmarked 8139 files, from an annex on spinning storage, speeding up from 103.17 to 13.39 seconds. Note that benchmarking with an encrypted annex on flash actually showed a minor slowdown with this optimisation -- from 13.93 to 14.50 seconds. Seems the overhead of doing the crypto needed to get the filenames to directly check can be higher than the overhead of looking up data in the location log. (Which says good things about how well the location log and git have been optimised!) It may make sense to make encrypted local remotes not have hasKeyCheap set; further benchmarking is called for.	2012-02-26 14:59:48 -04:00
Joey Hess	a3c9d06a26	add git-annex-shell commit Eventually, git-annex might try running this after making changes to a remote. I have not yet thought of a good way for it to tell which remotes it needs to run it on though. It can't just do it when shutting down a cached ssh connection, because ssh connection caching is optional, and that would not handle local remotes not accessed over ssh either.	2012-02-25 16:47:28 -04:00
Joey Hess	1f73db3469	improve alwayscommit=false mode Now changes are staged into the branch's index, but not committed, which avoids growing a large journal. And sync and merge always explicitly commit, ensuring that even when they do nothing else, they commit the staged changes. Added a flag file to indicate that the branch's journal contains uncommitted changes. (Could use git ls-files, but don't want to run that every time.) In the future, this ability to have uncommitted changes staged in the journal might be used on remotes after a series of oneshot commands.	2012-02-25 16:18:55 -04:00
Joey Hess	779ec91908	more robustness fixes	2012-02-18 12:08:02 -04:00
Joey Hess	abd50e01fb	don't fail with --pathdepth when file already exists	2012-02-18 12:05:13 -04:00
Joey Hess	00340dfe49	don't error out entirely if an url cannot be downloaded	2012-02-18 11:44:21 -04:00
Joey Hess	1ed5e4d9e3	variable name	2012-02-17 00:21:35 -04:00
Joey Hess	f3c75b601f	reorg	2012-02-17 00:19:47 -04:00
Joey Hess	ba5515d422	reorder for clarity	2012-02-16 22:38:08 -04:00
Joey Hess	156a631f63	make Migrate use ReKey rather than the other way around as ReKey is plumbing, this makes sense	2012-02-16 22:36:56 -04:00
Joey Hess	69a0161c3a	fix filename limit when using --pathdepth	2012-02-16 19:37:02 -04:00
Joey Hess	db6b4cdfcf	rekey: New plumbing level command, can be used to change the keys used for files en masse.	2012-02-16 16:36:35 -04:00
Joey Hess	d05550e803	zero still bad	2012-02-16 14:28:54 -04:00
Joey Hess	346c934409	allow pathdepth to drop from the front or take from the end (negative)	2012-02-16 14:26:53 -04:00
Joey Hess	c2245260b1	improve usage	2012-02-16 12:37:30 -04:00
Joey Hess	39c3f56b33	addurl: Add --pathdepth option.	2012-02-16 12:25:19 -04:00
Joey Hess	a86d937b5b	avoid too long filename when making up a filename for addurl too	2012-02-16 02:09:09 -04:00
Joey Hess	a1e52f0ce5	hlint	2012-02-16 00:44:51 -04:00
Joey Hess	e7aaa55c53	create parent directories as needed for addurl --file	2012-02-16 00:05:49 -04:00
Joey Hess	90a8b38ac0	set oneshot mode on a per-command basis Avoids ugly (and test suite failing) hack in Command.Version	2012-02-14 12:40:40 -04:00
Joey Hess	2f1f1e6b13	avoid version saving state This is not the place to commit journal files.	2012-02-14 10:59:48 -04:00
Joey Hess	cb631ce518	whereis: Prints the urls of files that the web special remote knows about.	2012-02-14 03:49:48 -04:00
Joey Hess	cbaebf538a	rework git check-attr interface Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file. The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that `cad8824852` was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones. The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplifying the types for several other Seek methods that had a Backend tupled in.	2012-02-13 23:52:21 -04:00
Joey Hess	a3ebf16e62	also verify new urls when adding them to existing files	2012-02-10 19:40:54 -04:00
Joey Hess	17fed709c8	addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key.	2012-02-10 19:23:46 -04:00
Joey Hess	1c0bd81ba6	addurl: Normalize badly encoded urls.	2012-02-09 14:19:58 -04:00
Joey Hess	ac97454659	improve error message	2012-02-08 15:49:42 -04:00
Joey Hess	ef013506cb	addurl: Added a --file option Can be used to specify what file the url is added to. This can be used to override the default filename that is used when adding an url, which is based on the url. Or, when the file already exists, the url is recorded as another location of the file.	2012-02-08 15:35:29 -04:00
Joey Hess	a81297065d	use "known" instead of "visible" I think it's clearer, also it's the same length as "local" :)	2012-02-06 20:42:49 -04:00
Joey Hess	90ab17e153	remove old comment	2012-02-04 16:34:13 -04:00
Joey Hess	f1c7dc1212	fix touch and statfs to work on any files in any locale Use withCAString rather than withCString. XXX Actually, this only works in non-unicode locales when presented with unicode characters. Help?	2012-02-04 12:44:51 -04:00
Joey Hess	44b115e0b1	Merge branch 'master' into ghc7.4 Conflicts: Utility/Misc.hs	2012-02-03 16:48:40 -04:00
Joey Hess	146c36ca54	IO exception rework ghc 7.4 complains about use of System.IO.Error to catch exceptions. Ok, use Control.Exception, with variants specialized to only catch IO exceptions.	2012-02-03 16:47:24 -04:00
Joey Hess	d8fb97806c	support all filename encodings with ghc 7.4 Under ghc 7.4, this seems to be able to handle all filename encodings again. Including filename encodings that do not match the LANG setting. I think this will not work with earlier versions of ghc, it uses some ghc internals. Turns out that ghc 7.4 has a special filesystem encoding that it uses when reading/writing filenames (as FilePaths). This encoding is documented to allow "arbitrary undecodable bytes to be round-tripped through it". So, to get FilePaths from eg, git ls-files, set the Handle that is reading from git to use this encoding. Then things basically just work. However, I have not found a way to make Text read using this encoding. Text really does assume unicode. So I had to switch back to using String when reading/writing data to git. Which is a pity, because it's some percent slower, but at least it works. Note that stdout and stderr also have to be set to this encoding, or printing out filenames that contain undecodable bytes causes a crash. IMHO this is a misfeature in ghc, that the user can pass you a filename, which you can readFile, etc, but that by default, putStr of filename may cause a crash! Git.CheckAttr gave me special trouble, because the filenames I got back from git, after feeding them in, had further encoding breakage. Rather than try to deal with that, I just zip up the input filenames with the attributes. Which must be returned in the same order queried for this to work. Also of note is an apparent GHC bug I worked around in Git.CheckAttr. It used to forkProcess and feed git from the child process. Unfortunately, after this forkProcess, accessing the `files` variable from the parent returns []. Not the value that was passed into the function. This screams of a bad bug, that's clobbering a variable, but for now I just avoid forkProcess there to work around it. That forkProcess was itself only added because of a ghc bug, #624389. I've confirmed that the test case for that bug doesn't reproduce it with ghc 7.4. So that's ok, except for the new ghc bug I have not isolated and reported. Why does this simple bit of code magnet the ghc bugs? :) Also, the symlink touching code is currently broken, when used on utf-8 filenames in a non-utf-8 locale, or probably on any filename containing undecodable bytes, and I temporarily commented it out.	2012-02-03 16:23:20 -04:00
Joey Hess	3d49258e5b	attempt at a quick, utf-8 only fix to the ghc 7.4 problem If you have only utf-8 filenames, and need to build git-annex with ghc 7.4, this will work. But, it will crash on non-utf-8 filenames.	2012-02-01 16:16:08 -04:00
Joey Hess	a964012fc3	switch to the strict state monad I had not realized what a memory leak the lazy state monad could be, although I have not seen much evidence of actual leaking in git-annex. However, if running git-annex on a great many files, this could matter. The additional Utility.State.changeState adds even more strictness, avoiding a problem I saw in github-backup where repeatedly modifying state built up a huge pile of thunks.	2012-01-29 22:55:06 -04:00
Joey Hess	b81d662cbf	Avoid repeated location log commits when a remote is receiving files. Done by adding a oneshot mode, in which location log changes are written to the journal, but not committed. Taking advantage of git-annex's existing ability to recover in this situation. This is used by git-annex-shell and other places where changes are made to a remote's location log.	2012-01-28 15:41:52 -04:00
Joey Hess	61dbad505d	fsck --from remote --fast Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.	2012-01-20 13:23:11 -04:00
Joey Hess	f35a84fac7	use a different tmp file when fscking remote data Since the content might be symlinked into place, it's not appropriate to use withTmp here.	2012-01-19 16:56:07 -04:00
Joey Hess	06b0cb6224	add tmp flag parameter to retrieveKeyFile	2012-01-19 16:07:36 -04:00
Joey Hess	90319afa41	fsck --from Fscking a remote is now supported. It's done by retrieving the contents of the specified files from the remote, and checking them, so can be an expensive operation. (Several optimisations are possible, to speed it up, of course.. This is the slow and stupid remote fsck to start with.) Still, if the remote is a special remote, or a git repository that you cannot run fsck in locally, it's nice to have the ability to fsck it. If you have any directory special remotes, now would be a good time to fsck them, in case you were hit by the data loss bug fixed in the previous release!	2012-01-19 15:24:05 -04:00
Joey Hess	d36525e974	convert fsckKey to a Maybe This way it's clear when a backend does not implement its own fsck checks.	2012-01-19 13:51:30 -04:00
Joey Hess	abdacf58ed	tweaks	2012-01-11 00:06:54 -04:00
Joey Hess	16e7178f20	reorg	2012-01-10 15:29:10 -04:00
Joey Hess	07cacbeee9	break module dependency loop A PITA but worth it to clean up the trust configuration code.	2012-01-10 13:32:38 -04:00
Joey Hess	7675b83efa	map: Fix display of remote repos A change to break local cycles made remote repos be dropped entirely.	2012-01-08 16:05:57 -04:00
Joey Hess	a35278430a	log: Add --gource mode, which generates output usable by gource. As part of this, I fixed up how log was getting the descriptions of remotes.	2012-01-07 18:18:09 -04:00
Joey Hess	bdc49ddbdb	typo	2012-01-07 00:45:01 -04:00
Joey Hess	dfa76069d4	reap zombies	2012-01-07 00:22:16 -04:00
Joey Hess	b8966433ef	sped up git annex log rather a lot See comment! Isn't git fun, always interesting approaches to optimise things that seemed unfixably slow.	2012-01-07 00:15:01 -04:00
Joey Hess	945f56f348	cleanup Broke out pure general functions etc.	2012-01-07 00:11:15 -04:00
Joey Hess	24b35113cf	tweak	2012-01-06 23:43:18 -04:00
Joey Hess	64f9d00bed	tweak	2012-01-06 21:51:39 -04:00
Joey Hess	2557bb8764	complete set of log options	2012-01-06 21:48:30 -04:00

658 commits