git-annex

Author	SHA1	Message	Date
Joey Hess	25dba9da24	fix windows build	2013-05-21 13:07:43 -04:00
Joey Hess	369fb69fe7	fix warning	2013-05-20 18:01:27 -04:00
Joey Hess	25cb9a48da	fix the day's Windows permissions damage	2013-05-14 20:15:14 -04:00
Joey Hess	959536ef03	fill in a few windows stubs	2013-05-14 16:32:03 -05:00
Joey Hess	306a36260f	typo	2013-05-14 15:44:49 -04:00
Joey Hess	7b92ffc3a1	more leaning toothpick fixes	2013-05-14 15:43:23 -04:00
Joey Hess	dc66b1f27d	Merge branch 'master' into windows Conflicts: Annex/Environment.hs Build/Configure.hs Git/Construct.hs Utility/FileMode.hs	2013-05-14 15:37:24 -04:00
Joey Hess	81cded2b9d	detect local urls on DOS	2013-05-14 15:27:39 -04:00
Joey Hess	03e8594369	fix the day's windows permissions damage	2013-05-12 19:09:48 -04:00
Joey Hess	73d2f8b280	deal with git using / internally, even on DOS	2013-05-12 17:29:49 -05:00
Joey Hess	06551ad86b	set raw mode for git check-attr	2013-05-12 16:37:06 -05:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	5e1458152f	refactoring	2013-05-11 23:11:56 -04:00
Joey Hess	1e2ddcb68a	use setCurrentDirectory On POSIX, this just calls changeWorkingDirectory.	2013-05-11 19:14:30 -04:00
Joey Hess	18bdff3fae	clean up from windows porting	2013-05-11 18:23:41 -04:00
Joey Hess	dc22549ab3	git annex init works on Windows! git hash-object and cat-file both only use \n at ends of line, even on Windows.	2013-05-11 16:02:35 -05:00
Joey Hess	c45a723876	catFile expects no \r, even on Windows	2013-05-11 15:32:34 -05:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	763cbda14f	fixup #if 0 stubs to use #ifndef mingw32_HOST_OS That's needed in files used to build the configure program. For the other files, I'm keeping my __WINDOWS__ define, as I find that much easier to type. I may search and replace it to use the mingw32_HOST_OS thing later.	2013-05-10 16:57:21 -05:00
Joey Hess	6c74a42cc6	stub out POSIX stuff	2013-05-10 16:29:59 -05:00
Joey Hess	8a2d1988d3	expose Control.Monad.join I think I've been looking for that function for some time. Ie, I remember wanting to collapse Just Nothing to Nothing.	2013-04-22 20:24:53 -04:00
Joey Hess	a5dded0401	assistant: The ConfigMonitor left one zombie behind each time it checked for changes, now fixed.	2013-03-18 22:09:51 -04:00
Joey Hess	2c05c85437	webapp: DTRT when told to create a git repo that already exists.	2013-03-12 08:09:31 -04:00
Joey Hess	ea672b7c77	Bugfix: git annex add, when run without any file or directory specified, should add files in the current directory, but not act on unlocked files elsewhere in the tree.	2013-03-07 19:03:06 -04:00
Joey Hess	82f639c70f	fix typo introduced in `0c13d306` Doubled command name broke show-ref, which broke git annex sync. Re-read all of `0c13d306` to check for other problems.	2013-03-07 11:09:30 -04:00
Joey Hess	0c13d3065e	git subcommand cleanup Pass subcommand as a regular param, which allows passing git parameters like -c before it. This was already done in the piping set of functions, but not the command running set.	2013-03-03 13:39:07 -04:00
Joey Hess	4d33423067	assistant: Avoid noise in logs from git commit about typechanged files in direct mode repositories.	2013-03-01 16:21:29 -04:00
Joey Hess	8d9c2afd89	Additional GIT_DIR support bugfixes. May actually work now. Two fixes. First, and most importantly, relax the isLinkToAnnex check to only look for /annex/objects/, not [^\|/].git/annex/objects. If GIT_DIR is used with a detached work tree, the git directory is not necessarily named .git. There are important caveats with doing that at all, since git-annex will make symlinks that point at GIT_DIR, which means that the relative path between GIT_DIR and GIT_WORK_TREE needs to remain stable across all clones of the repository. ---- The other fix is just fixing crazy and wrong code that, when GIT_DIR is set, expects to still find a git repository in the path below the work tree, and uses some of its configuration, and some of GIT_DIR. What was I thinking, and why can't I seem to get this code right?	2013-02-23 12:41:22 -04:00
Joey Hess	52902c0945	make adding modified files work on crippled filesystems	2013-02-20 14:12:55 -04:00
Joey Hess	547d7745fb	pre-commit: Update direct mode mappings. Making the pre-commit hook look at git diff-index to find changed direct mode files and update the mappings works pretty well. One case where it does not work is when a file is git annex added, and then git rmed, and then this is committed. That's a no-op commit, so the hook probably doesn't even run, and it certainly never notices that the file was deleted, so the mapping will still have the original filename in it. For this and other reasons, it's important that the mappings still be treated as possibly inconsistent. Also, the assistant now allows the pre-commit hook to run when in direct mode, so the mappings also get updated there.	2013-02-06 12:44:19 -04:00
Joey Hess	5cd152b8a9	annex.autocommit New setting, can be used to disable autocommit of changed files by the assistant, while it still does data syncing and other tasks. Also wired into webapp UI	2013-01-27 22:43:05 +11:00
Joey Hess	0214e0fb17	union merge bugfix Union merges involving two or more repositories could sometimes result in data from one repository getting lost. This could result in the location log data becoming wrong, and fsck being needed to fix it. NB: I audited for any other occurrences of this problem. There are other places than union merge where multiple changes are fed into update-index in a stream, but they all involve working copy files being staged, or their deletion being staged, and in this case it's fine for the later changes to override the earlier ones.	2013-01-16 21:31:06 -04:00
Joey Hess	95db595e91	make startup scan for deleted files work in direct mode git add --update cannot be used, because it'll stage typechanged direct mode files. Instead, use ls-files to find deleted files, and stage them ourselves. It seems that no commit was made before when the scan staged deleted files. (Probably masked since if files were added, a commit happened then..) Now that I'm doing the staging, I was also able to fix that bug.	2012-12-24 14:24:13 -04:00
Joey Hess	92bd889e61	unused	2012-12-18 17:15:11 -04:00
Joey Hess	53dbcce645	direct mode merging works! Automatic merge resolution code needs to be fixed to preserve objects from direct mode files.	2012-12-18 15:04:44 -04:00
Joey Hess	ffdd08fd2e	Merge branch 'master' into desymlink	2012-12-13 00:46:10 -04:00
Joey Hess	0d50a6105b	whitespace fixes	2012-12-13 00:45:27 -04:00
Joey Hess	b080a58b76	Merge branch 'master' into desymlink Conflicts: Annex/CatFile.hs Annex/Content.hs Git/LsFiles.hs Git/LsTree.hs	2012-12-13 00:29:06 -04:00
Joey Hess	f87a781aa6	finished where indentation changes	2012-12-13 00:24:19 -04:00
Joey Hess	e7b8cb0063	direct mode committing	2012-12-12 19:20:38 -04:00
Joey Hess	b0c5cbfde2	add notStaged	2012-12-12 13:25:26 -04:00
Joey Hess	e8a74e9493	where indentation	2012-12-12 13:20:58 -04:00
Joey Hess	0714b0bd03	remove unused function	2012-12-12 13:17:41 -04:00
Joey Hess	715c67a3e5	git diff-tree interface	2012-12-10 14:36:57 -04:00
Joey Hess	444e984727	don't treat foo::bar as a ssh url It's a git-remote-helper location, and will be stored as just an url.	2012-11-09 13:50:23 -04:00
Joey Hess	39e82b1af8	webapp: Generate better git remote names. Wrote a better git remote name sanitizer. Git blows up on lots of weird stuff, especially if it starts the remote name, but I managed to get some common punctuation working.	2012-10-31 15:26:19 -04:00
Joey Hess	7ee0ffaeb9	Use USER and HOME environment when set, and only fall back to getpwent, which doesn't work with LDAP or NIS.	2012-10-25 18:17:54 -04:00
Joey Hess	c7c2015435	add ConfigMonitor thread Monitors git-annex branch for changes, which are noticed by the Merger thread whenever the branch ref is changed (either due to an incoming push, or a local change), and refreshes cached config values for modified config files. Rate limited to run no more often than once per minute. This is important because frequent git-annex branch changes happen when files are being added, or transferred, etc. A primary use case is that, when preferred content changes are made, and get pushed to remotes, the remotes start honoring those settings. Other use cases include propagating repository description and trust changes to remotes, and learning when a remote has added a new special remote, so the webapp can present the GUI to enable that special remote locally. Also added a uuid.log cache. All other config files already had caches.	2012-10-20 16:43:35 -04:00
Joey Hess	b281584422	remove some more !!	2012-10-20 16:21:43 -04:00
Joey Hess	e6b1f36e1d	Fix handling of GIT_DIR when it refers to a git submodule. The old code was just wrong in taking fromPath of GIT_DIR -- that made an localUnknown location with the GIT_DIR in it, which only worked by accident, and failed in submodules.	2012-10-17 14:28:05 -04:00
Joey Hess	919fec85cd	better fix for zombie problem, which turns out to be a zombie ssh started by rsync When rsyncProgress pipes rsync's stdout, this turns out to cause a ssh process started by rsync to be left behind as a zombie. I don't know why, but my recent zombie reaping cleanup was correct, it's just that this other zombie, that's not directly started by git-annex, was no longer reaped due to changes in the cleanup. Make rsyncProgress reap the zombie started by rsync, as a workaround. FWIW, the process tree looks like this. It seems like the rsync child is for some reason starting but not waiting on this extra ssh process. Ssh connection caching may be involved -- disabling it seemed to change the shape of the tree, but did not eliminate the zombie. 9378 pts/14 S+ 0:00 \| \_ rsync -p --progress --inplace -4 -e 'ssh' '-S' ... 9379 pts/14 S+ 0:00 \| \| \_ ssh ... 9380 pts/14 S+ 0:00 \| \| \_ rsync -p --progress --inplace -4 -e 'ssh' '-S' ... 9381 pts/14 Z+ 0:00 \| \_ [ssh] <defunct>	2012-10-17 00:47:52 -04:00
Joey Hess	4f95cc8ef1	ensure that gitdir is absolute calcGitLink turns out to need it to be absolute, and it normally is, but not if it's read from a .git file in a submodule, or perhaps from GIT_DIR. I should look into dropping this invariant.	2012-10-16 16:25:45 -04:00
Joey Hess	8fec62d299	A relative core.worktree is relative to the gitdir. Now that this is handled correctly, git-annex can be used in git submodules. Also, fixed infelicity where Git.CurrentRepo and Git.Config.updateLocation were both dealing with core.worktree. Now updateLocation handles it for Local as well as for LocalUnknown repos.	2012-10-16 00:08:39 -04:00
Joey Hess	148d9f0088	simplify	2012-10-15 23:12:50 -04:00
Joey Hess	429b77844e	drop old config when rereading repo config Before, the new config was merged into the old, so if eg, a remote was renamed, it would have both the new and the old remote name.	2012-10-14 17:23:40 -04:00
Joey Hess	06831e7754	fix slightly incorrect comment	2012-10-12 12:20:45 -04:00
Joey Hess	e05c21cb73	Fix a crash when merging files in the git-annex branch that contain invalid utf8. The crash actually occurred when writing out the file, which was done to a handle that had not had fileSystemEncoding applied to it.	2012-10-12 12:19:30 -04:00
Joey Hess	47314c0fad	fix last zombies in the assistant Made Git.LsFiles return cleanup actions, and everything waits on processes now, except of course for Seek.	2012-10-04 19:56:32 -04:00
Joey Hess	f7f1d25df8	bugfix	2012-10-04 19:41:58 -04:00
Joey Hess	de3ea4adb6	remove now-unnecessary manual reaps	2012-10-04 18:58:57 -04:00
Joey Hess	5594bf0643	more zombie fighting I'm down to 9 places in the code that can produce unwaited for zombies. Most of these are pretty innocuous, at least for now, are only used in short-running commands, or commands that run a set of actions and explicitly reap zombies after each one. The one from Annex.Branch.files could be trouble later, since both Command.Fsck and Command.Unused can trigger it, and the assistant will be doing those eventually. Ditto the one in Git.LsTree.lsTree, which Command.Unused uses. The only ones currently affecting the assistant though, are in Git.LsFiles. Several threads use several of those. (And yeah, using pipes or ResourceT would be a less ad-hoc approach, but I don't really feel like ripping my entire code base apart right now to change a foundation monad. Maybe one of these days..)	2012-10-04 18:47:31 -04:00
Joey Hess	f67b54e5e3	make a pipeReadStrict, that properly waits on the process Nearly everything that's reading from git is operating on a small amount of output and has been switched to use that. Only pipeNullSplit stuff continues using the lazy version that yields zombies.	2012-10-04 18:04:09 -04:00
Joey Hess	582316f66f	avoid webapp crash on startup when there's no ~/.gitconfig git config --list --global exits nonzero when there's no global config	2012-09-23 12:43:14 -04:00
Joey Hess	e8188ea611	flip catchDefaultIO	2012-09-17 00:18:07 -04:00
Joey Hess	ba744c84a4	better name for fallback sync refs Don't expose these as branches in refs/heads/. Instead hide them away in refs/synced/ where only show-ref will find them. Make unused only look at branches and tags, not these other things, so it won't care if some stale sync ref used to use a file. This means they don't need to be deleted, which could have led to an incoming sync being missed.	2012-09-16 23:09:08 -04:00
Joey Hess	6cddda4143	make the merger merge any equivalent sync branch into the current branch Not just synced/master, but synced/UUID/master, for example	2012-09-16 19:41:26 -04:00
Joey Hess	da63b7e96c	Support repositories created with --separate-git-dir. Closes: #684405	2012-09-15 22:40:04 -04:00
Joey Hess	ca45cea113	Revert "add catFileIndex" This interface is not a good idea, because a running git cat-file --batch does not notice when existing files in the index are changed.	2012-09-15 18:30:53 -04:00
Joey Hess	0b63ee6cd5	run git coprocesses with gitEnv	2012-09-15 17:43:37 -04:00
Joey Hess	e1baf48d88	add catFileIndex	2012-09-15 17:06:10 -04:00
Joey Hess	c9b3b8829d	thread safe git-annex index file use	2012-08-24 20:50:39 -04:00
Joey Hess	fb4b19deed	make the webapp honor the web.browser git config	2012-08-08 13:15:35 -04:00
Joey Hess	5ae1f75a39	handle case of adding populated drive to just created repo The just created repo has no master branch commits yet. This is now handled, merging in the master branch from the populated drive.	2012-08-05 16:35:30 -04:00
Joey Hess	34fc0d358e	fix crashes when run in a git repo that has been initted but has no master branch yet	2012-08-05 15:53:47 -04:00
Joey Hess	9fc94d780b	better readProcess	2012-07-19 00:57:40 -04:00
Joey Hess	1db7d27a45	add back debug logging Make Utility.Process wrap the parts of System.Process that I use, and add debug logging to them. Also wrote some higher-level code that allows running an action with handles to a processes stdin or stdout (or both), and checking its exit status, all in a single function call. As a bonus, the debug logging now indicates whether the process is being run to read from it, feed it data, chat with it (writing and reading), or just call it for its side effect.	2012-07-19 00:46:52 -04:00
Joey Hess	d1da9cf221	switch from System.Cmd.Utils to System.Process Test suite now passes with -threaded! I traced back all the hangs with -threaded to System.Cmd.Utils. It seems it's just crappy/unsafe/outdated, and should not be used. System.Process seems to be the cool new thing, so converted all the code to use it instead. In the process, --debug stopped printing commands it runs. I may try to bring that back later. Note that even SafeSystem was switched to use System.Process. Since that was a modified version of code from System.Cmd.Utils, it needed to be converted too. I also got rid of nearly all calls to forkProcess, and all calls to executeFile, which I'm also doubtful about working well with -threaded.	2012-07-18 18:00:24 -04:00
Joey Hess	fc5652c811	Merge branch 'master' into threaded	2012-07-18 13:31:28 -04:00
Joey Hess	05310538ef	more debugging	2012-07-18 13:31:00 -04:00
Joey Hess	0962d50ad2	typo	2012-07-17 14:51:42 -04:00
Joey Hess	4db09814e4	avoid --no-edit with older git versions	2012-07-17 14:50:37 -04:00
Joey Hess	182526ff68	add debugging	2012-07-17 14:40:05 -04:00
Joey Hess	048b64024a	sync: Automatically resolves merge conflicts. untested, but it compiles :)	2012-06-27 13:08:32 -04:00
Joey Hess	051c68041b	properly handle deleted files when processing ls-files --unmerged	2012-06-27 12:11:03 -04:00
Joey Hess	8e8439a519	add ls-files --unmerged support	2012-06-27 09:27:59 -04:00
Joey Hess	6f45827fe0	git-config fileEncoding Accept arbitrarily encoded repository filepaths etc when reading git config output. This fixes support for remotes with unusual characters in their names. For example, a remote with a url of /tmp/çüş was previously skipped, because the filename wasn't encoded right so it didn't think it was available. And when setting the annex-uuid of a remote named "çüş", it used to add it under a mis-encoded form of the remote's name. Both these cases now work ok in my testing.	2012-06-26 23:07:11 -04:00
Joey Hess	1093d82f6b	Got rid of the last place that did utf8 decoding. Probably fixes bugs/git-annex:_Cannot_decode_byte___39____92__xfc__39__/ although I don't know how to reproduce that bug.	2012-06-26 22:58:44 -04:00
Joey Hess	c79e3b67e9	sync: Avoid recent git's interactive merge.	2012-06-23 10:22:56 -04:00
Joey Hess	75b6ee81f9	avoid ByteString.Char8 where not needed Its truncation behavior is a red flag, so avoid using it in these places where only raw ByteStrings are used, without looking at the data inside.	2012-06-20 13:13:40 -04:00
Joey Hess	da62edb42a	optimisation and memory leak fix	2012-06-12 21:13:15 -04:00
Joey Hess	ca9ee21bd7	crazy optimisation Crazy like a fox..	2012-06-10 19:58:34 -04:00
Joey Hess	c5707c84d3	queue size fix Increase queue size for update-index actions, because otherwise they'll never be flushed.	2012-06-10 13:56:04 -04:00
Joey Hess	5308b51ec0	stage deletions directly using update-index no need to run git-rm separately	2012-06-10 13:05:58 -04:00
Joey Hess	7f39415600	force thunk for precalculated value	2012-06-10 12:50:15 -04:00
Joey Hess	d45a9a7831	refactor and function name cleanup (oops, I had a calcMerge and a calc_merge!)	2012-06-08 00:29:39 -04:00
Joey Hess	20f425be19	make watch use the queue May not work. Certainly needs to flush the queue from time to time when only symlink changes are being made.	2012-06-07 15:40:44 -04:00
Joey Hess	0a11b35d89	extend Git.Queue to be able to queue more than simple git commands While I was in there, I noticed and fixed a bug in the queue size calculations. It was never encountered only because Queue.add was only ever run with 1 file in the list.	2012-06-07 15:19:44 -04:00
Joey Hess	91db540769	add support for staging other types of blobs, like symlinks, into the index Also added a utility TopFilePath type, which could stand to be used more widely.	2012-06-06 14:26:15 -04:00
Joey Hess	4b32ea793d	Merge branch 'master' into watch	2012-06-06 12:52:21 -04:00
Joey Hess	f596084a59	move hashObject to HashObject library and generalize it to support all git object types	2012-06-06 02:31:31 -04:00
Joey Hess	27cfeca4ea	Merge branch 'master' into watch	2012-06-06 02:16:21 -04:00
Joey Hess	f1bd72ea54	factor out generic update-index code from unionmerge code	2012-06-06 00:10:34 -04:00
Joey Hess	7a6fb8ae4e	flush the git queue when a new type of action is being added to it This allows the queue to be used in a single process for multiple possibly conflicting commands, like add and rm, without running them out of order. This assumes that running the same git subcommand with different parameters cannot itself conflict.	2012-06-04 20:41:22 -04:00
Joey Hess	ebbd24e5ed	more worktree improvements Avoid more expensive code path when no core.worktree is configured. Don't change worktree when reading config if one is already set. This could happen if GIT_CORE_WORKTREE is set, and the repo also has core.worktree, and the config is reread. Now GIT_CORE_WORKTREE will prevail.	2012-05-19 11:08:50 -04:00
Joey Hess	9d98144776	avoid chdir when already inside worktree	2012-05-19 10:37:28 -04:00
Joey Hess	0093a456e8	test suite saved my bacon git config reading memoization shouldn't be used when changing config	2012-05-19 10:22:43 -04:00
Joey Hess	a1885bd116	make GIT_DIR, GIT_WORK_TREE absolute GIT_DIR is set to something relative, like ".git" in the pre-commit hook. But internally all the directories are assumed to be absolute.	2012-05-18 18:32:19 -04:00
Joey Hess	eb6cb1b87f	Add support for core.worktree, and fix support for GIT_WORK_TREE and GIT_DIR. The environment needs to override git-config. Changed when git config is read, and avoid rereading it once it's been read. chdir for both worktree settings.	2012-05-18 18:20:53 -04:00
Joey Hess	bb4f31a0ee	Clean up handling of git directory and git worktree. Baked into the code was an assumption that a repository's git directory could be determined by adding ".git" to its work tree (or nothing for bare repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are used to separate the two. This was attacked at the type level, by storing the gitdir and worktree separately, so Nothing for the worktree means a bare repo. A complication arose because we don't learn whether a repository is bare until its configuration is read. So another Location type handles repositories that have not had their config read yet. I am not entirely happy with this being a Location type, rather than representing them entirely separate from the Git type. The new code is not worse than the old, but better types could enforce more safety. Added support for core.worktree. Overriding it with -c isn't supported because it's not really clear what to do if a git repo's config is read, is not bare, and is then overridden to bare. What is the right git directory in this case? I will worry about this if/when someone has a use case for overriding core.worktree with -c. (See Git.Config.updateLocation) Also removed and renamed some functions like gitDir and workTree that misused git's terminology. One minor regression is known: git annex add in a bare repository does not print a nice error message, but runs git ls-files in a way that fails earlier with a less nice error message. This is because before --work-tree was always passed to git commands, even in a bare repo, while now it's not.	2012-05-18 17:03:12 -04:00
Joey Hess	84ac8c58db	Add annex.httpheaders and annex.httpheader-command config settings Allow custom headers to be sent with all HTTP requests. (Requested by the Internet Archive)	2012-04-22 01:13:09 -04:00
Joey Hess	ed79596b75	noop	2012-04-21 23:32:33 -04:00
Joey Hess	b4a5e39ee6	Support git's core.sharedRepository configuration This is incomplete, it does not honor it yet for hash directories and other annex bookkeeping files. Some of that is not needed for a bare repo; some of it may be.	2012-04-21 15:36:52 -04:00
Joey Hess	70538dac84	compute distance in correct direction	2012-04-14 16:01:08 -04:00
Joey Hess	52a158a7c6	autocorrection git-annex (but not git-annex-shell) supports the git help.autocorrect configuration setting, doing fuzzy matching using the restricted Damerau-Levenshtein edit distance, just as git does. This adds a build dependency on the haskell edit-distance library.	2012-04-12 15:37:21 -04:00
Joey Hess	c924542e61	bup: Properly handle key names with spaces or other things that are not legal git refs. Continue using the key name as bup ref name, to preserve backwards compatibility, unless it is an illegal git ref. In that case, use a sha256 of the key name instead.	2012-04-11 12:45:49 -04:00
Joey Hess	378f61d0ef	nicer style; also empty refs are implicitly not allowed	2012-04-11 12:29:31 -04:00
Joey Hess	0be6ebb0aa	added a git ref legality checker git-check-ref-format is .. wow. Good design on one level, but what a mess.	2012-04-11 12:21:54 -04:00
Joey Hess	184a69171d	removed another 10 lines via ifM	2012-03-16 01:59:07 -04:00
Joey Hess	00d814aecc	fix filename encoding for git cat-file The filename sent to git cat-file needs to be sent on a File encoded handle. Also set the read handle to use the File encoding, so that any error message mentioning the filename is received properly. The actual file content is read using Data.ByteString.Char8, which will ignore the read handle's encoding, so this won't change that. (Whether that is entirely correct remains to be seen.)	2012-02-26 14:11:50 -04:00
Joey Hess	cac130b205	cleanup	2012-02-21 00:16:24 -04:00
Joey Hess	6c0155efb7	refactor	2012-02-20 15:22:21 -04:00
Joey Hess	f0f07db01d	reorder params and put -- after attributes, for compatibility with old git (cherry picked from commit `c8ec0e233e`)	2012-02-15 14:01:06 -04:00
Joey Hess	52c5b164d8	Added an annex.queuesize setting useful when adding hundreds of thousands of files on a system with plenty of memory. git add gets quite slow in such a large repository, so if the system has more than the ~32 mb of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.	2012-02-15 11:14:19 -04:00
Joey Hess	7ebd98d8d8	fix memory leak when staging the journal The list of files had to be retained until the end so it could be deleted. Also, a list of update-index lines was generated and only then fed into it. Now everything streams in constant space.	2012-02-14 14:37:59 -04:00
Joey Hess	a40ec5e03e	Fixed a memory leak due to excessive strictness when committing journal files. When hashing the files, the entire list of shas was read strictly. That was entirely unnecessary, since there's a cleanup action run after they're consumed.	2012-02-14 11:20:34 -04:00
Joey Hess	8f76d66f32	set fileEncoding on CheckAttr handles Seemed to work without it, but this is correct.	2012-02-14 04:31:39 -04:00
Joey Hess	a2f241d503	fix LsFiles.typeChanged paths Passing absolute paths to Command.Add used to work, but after recent changes doesn't. All LsFiles should use relative paths anyway, so fix it there.	2012-02-14 00:22:42 -04:00
Joey Hess	cbaebf538a	rework git check-attr interface Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file. The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that `cad8824852` was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones. The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplifying the types for several other Seek methods that had a Backend tupled in.	2012-02-13 23:52:21 -04:00
Joey Hess	d35a8d85b5	another place hGetBoth was used without a writer thread	2012-02-13 20:23:45 -04:00
Joey Hess	cad8824852	thinko I removed the now unnecessary forkProcess, but forgot to change back to pipeBoth, so there was no writer thread.	2012-02-13 20:01:37 -04:00
Joey Hess	3ac2677e00	comment typo	2012-02-13 16:58:26 -04:00
Joey Hess	e4d0923544	wording	2012-02-09 17:35:36 -04:00
Joey Hess	dc682e53a2	use fileEncoding for git-update-index input handle	2012-02-04 13:03:33 -04:00
Joey Hess	586be39952	fix file encoding of HashObject	2012-02-04 13:01:00 -04:00
Joey Hess	d8fb97806c	support all filename encodings with ghc 7.4 Under ghc 7.4, this seems to be able to handle all filename encodings again. Including filename encodings that do not match the LANG setting. I think this will not work with earlier versions of ghc, it uses some ghc internals. Turns out that ghc 7.4 has a special filesystem encoding that it uses when reading/writing filenames (as FilePaths). This encoding is documented to allow "arbitrary undecodable bytes to be round-tripped through it". So, to get FilePaths from eg, git ls-files, set the Handle that is reading from git to use this encoding. Then things basically just work. However, I have not found a way to make Text read using this encoding. Text really does assume unicode. So I had to switch back to using String when reading/writing data to git. Which is a pity, because it's some percent slower, but at least it works. Note that stdout and stderr also have to be set to this encoding, or printing out filenames that contain undecodable bytes causes a crash. IMHO this is a misfeature in ghc, that the user can pass you a filename, which you can readFile, etc, but that default, putStr of filename may cause a crash! Git.CheckAttr gave me special trouble, because the filenames I got back from git, after feeding them in, had further encoding breakage. Rather than try to deal with that, I just zip up the input filenames with the attributes. Which must be returned in the same order queried for this to work. Also of note is an apparent GHC bug I worked around in Git.CheckAttr. It used to forkProcess and feed git from the child process. Unfortunately, after this forkProcess, accessing the `files` variable from the parent returns []. Not the value that was passed into the function. This screams of a bad bug, that's clobbering a variable, but for now I just avoid forkProcess there to work around it. That forkProcess was itself only added because of a ghc bug, #624389. I've confirmed that the test case for that bug doesn't reproduce it with ghc 7.4. So that's ok, except for the new ghc bug I have not isolated and reported. Why does this simple bit of code magnet the ghc bugs? :) Also, the symlink touching code is currently broken, when used on utf-8 filenames in a non-utf-8 locale, or probably on any filename containing undecodable bytes, and I temporarily commented it out.	2012-02-03 16:23:20 -04:00
Joey Hess	3d49258e5b	attempt at a quick, utf-8 only fix to the ghc 7.4 problem If you have only utf-8 filenames, and need to build git-annex with ghc 7.4, this will work. But, it will crash on non-utf-8 filenames.	2012-02-01 16:16:08 -04:00
Joey Hess	a964012fc3	switch to the strict state monad I had not realized what a memory leak the lazy state monad could be, although I have not seen much evidence of actual leaking in git-annex. However, if running git-annex on a great many files, this could matter. The additional Utility.State.changeState adds even more strictness, avoiding a problem I saw in github-backup where repeatedly modifying state built up a huge pile of thunks.	2012-01-29 22:55:06 -04:00
Joey Hess	97209ac08d	fix error message	2012-01-25 20:43:01 -04:00
Joey Hess	3ca7cf5db1	export fromPath Not used in git-annex, but I am using it in git-backup	2012-01-25 20:42:05 -04:00
Joey Hess	ce5637498f	remove Utility.Conditional and use IfElse This drops the >>! and >>? with the nice low fixity. IfElse does have undocumented >>=>>! and >>=>>? operators, but I deem that too fishy. Anyway, using whenM and unlessM is easier; I sometimes mixed the operators up.	2012-01-24 16:22:07 -04:00
Joey Hess	ba6088b249	rename readMaybe to readish a stricter (but also partial) readMaybe is getting added to base	2012-01-23 17:00:10 -04:00
Joey Hess	8c87293b48	avoid unnecessary stats when traversing to parent	2012-01-14 11:48:10 -04:00
Joey Hess	92a4af8b20	avoid unnecessary chdir	2012-01-14 11:42:51 -04:00
Joey Hess	1f66af2b53	optimize away 3 stats	2012-01-14 11:28:49 -04:00
Joey Hess	ff5703ce77	tweak	2012-01-13 21:06:00 -04:00
Joey Hess	66aac77467	support relative GIT_DIR	2012-01-13 14:40:36 -04:00
Joey Hess	1ae780ee79	git-annex, git-union-merge: Support GIT_DIR and GIT_WORK_TREE. Note that GIT_WORK_TREE cannot influence GIT_DIR; that is necessary for git-fake-bare and vcsh type things to work.	2012-01-13 12:52:09 -04:00
Joey Hess	0d5c402210	Add annex-trustlevel configuration settings, which can be used to override the trust level of a remote. This overrides the trust.log, and is overridden by the command-line trust parameters. It would have been nicer to have Logs.Trust.trustMap just look up the configuration for all remotes, but a dependency loop prevented that (Remotes depends on Logs.Trust in several ways). So instead, look up the configuration when building remotes, storing it in the same forcetrust field used for the command-line trust parameters.	2012-01-09 23:31:44 -04:00
Joey Hess	9fb5f3edc7	log --after=date	2012-01-06 17:24:03 -04:00
Joey Hess	0b27e6baa0	Support unescaped repository urls, like git does. Turns out that git will accept a .git/config containing an url with eg, spaces in its name. Handle this by escaping the url if it's not valid. This also fixes support for urls containing escaped characters like %20 for space. Before, the path from the url was not unescaped properly.	2012-01-05 14:32:20 -04:00
Joey Hess	f0957426c5	skip local remotes that are not available (ie, not mounted) With --fast, unavailable local remotes are filtered out of the fast set. This way, if there are local remotes, --fast always acts only on them, and if none are mounted, acts on nothing. This consistency is better than --fast acting on different remotes depending on what's mounted.	2011-12-31 04:50:39 -04:00
Joey Hess	a2ec2d3760	refactor and check for a detached HEAD	2011-12-31 03:38:58 -04:00
Joey Hess	52104dae6f	refactor	2011-12-30 18:36:40 -04:00
Joey Hess	26040d6419	add base, under The describe function was only intended to generate a human-visible description of a branch, but taking the base of a branch is a useful operation to be able to do no matter the human-visible representation. Converting a branch like refs/heads/master to refs/heads/origin/master is also a useful operation, and under can do that.	2011-12-30 16:48:26 -04:00
Joey Hess	5287d1dc3f	fixed behavior when multiple insteadOf configs are provided for the same url base Consider this git config --list case: url.git+ssh://git@example.com/.insteadOf=gl url.git+ssh://git@example.com/.insteadOf=shared Since config is stored in a Map, only the last of the values for this key was stored and available for use by the insteadOf code. But that is wrong; git allows either "gl" or "shared" to be used in an url and the insteadOf value to be substituted in. To support this, it seems best to keep the existing config map as-is, and add a second map that accumulates a list of multiple values for config keys. This new fullconfig map can be used in the rare places where multiple values for a key make sense, without needing to complicate everything else. Haskell's laziness and data sharing keep the overhead of adding this second map low.	2011-12-30 14:07:46 -04:00
Joey Hess	cba3ce08df	handle C-style escapes in Format I was happily able to repurpose some code from Git.Filename to handle this. I remember writing that code... a whole afternoon at a coffee shop, after which I felt I'd struggled with Haskell and git, and sorta lost, in needing to write this nasty piece of code. But was also pleased at the use of a pair of functions and quickcheck that allowed me to get it 100% right. So, turns out I not only got it right, but the code wasn't as special-purpose as I'd feared. Yay!	2011-12-23 01:05:16 -04:00
Joey Hess	5a275a3f5d	Can now be built with older git versions (before 1.7.7); the resulting binary should only be used with old git. Remove git old version check from configure, and use the git version it was built against in the git check-attr code.	2011-12-22 15:01:13 -04:00
Joey Hess	6bffe509d7	Add --include, which is the same as --not --exclude.	2011-12-22 14:00:17 -04:00
Joey Hess	ee3b5b2a42	use Common in a few more modules	2011-12-20 14:37:53 -04:00
Joey Hess	95d2391f58	more partial function removal Left a few Prelude.head's in where it was checked not null and too hard to remove, etc.	2011-12-15 18:19:36 -04:00
Joey Hess	fbc3d32f7d	avoid partial function, and parse git-ref output better It's possible that a ref name might contain a space, this properly preserves the space.	2011-12-15 16:58:04 -04:00
Joey Hess	eb132a854e	avoid partial head function (although it was used safely)	2011-12-15 16:04:08 -04:00
Joey Hess	111b6937ec	avoid partial functions, and added check for correct sha content	2011-12-15 15:57:47 -04:00
Joey Hess	a8643ca44c	refactor	2011-12-15 13:05:47 -04:00
Joey Hess	09cd042775	Properly handle multiline git config values. A crash on parsing was fixed a while ago. This adds support for fully correctly parsing multiline git config values, using git config --null. Since git-annex-shell configlist uses normal git config output, I left in support for that too; the two forms of config output can be easily identified by the parser. Since configlist only prints the annex.uuid config, there's no risk of multiline values there, so no need to change it.	2011-12-15 12:48:27 -04:00
Joey Hess	ef28b3fef7	split out Git/Command.hs	2011-12-14 15:56:11 -04:00
Joey Hess	02f1bd2bf4	split more stuff out of Git.hs	2011-12-14 15:43:13 -04:00
Joey Hess	9db8ec210f	split out two more Git modules	2011-12-13 15:24:23 -04:00
Joey Hess	25b2cc4148	move commit to Git.Branch	2011-12-13 15:08:44 -04:00
Joey Hess	13fff71f20	split out three modules from Git Constructors and configuration make sense in separate modules. A separate Git.Types is needed to avoid cycles.	2011-12-13 15:06:49 -04:00
Joey Hess	46588674b0	avoid closing pipe before all the shas are read from it Could have just used hGetContentsStrict here, but that would require storing all the shas in memory. Since this is called at the end of a git-annex run, it may have created a lot of shas, so I avoid that memory use and stream them out like before.	2011-12-12 21:41:37 -04:00
Joey Hess	0e45b762a0	broke out Git/HashObject.hs	2011-12-12 21:24:55 -04:00
Joey Hess	31a0c07ee9	broke out Git/Branch.hs and reorganized	2011-12-12 21:12:51 -04:00
Joey Hess	543d0d2501	split out Git/Ref.hs	2011-12-12 18:30:33 -04:00
Joey Hess	acd7a52dfd	always find optimal merge Testing `b9ac585454`, it didn't find the optimal union merge, the second sha was the one to use, at least in the case I tried. Let's just try all shas to see if any can be reused. I stopped using the expensive nub, so despite the use of sets to sort/uniq file contents, this is probably as fast or faster than it was before.	2011-12-12 01:59:29 -04:00
Joey Hess	0cbab5de65	refactor	2011-12-12 00:48:25 -04:00
Joey Hess	b9ac585454	more efficient union merges Tries to avoid generating a new object when the merged content has the same lines that were in the old object. I've noticed some merge commits that only move lines around, like this: - 1323478057.181191s 1 be23c3ac-0ee5-11e0-b185-3b0f9b5b00c5 1323204972.062151s 1 87e06c7a-7388-11e0-ba07-03cdf300bd87 ++1323478057.181191s 1 be23c3ac-0ee5-11e0-b185-3b0f9b5b00c5 Unsure if this will really save anything in practice, since it only looks at one of the two old objects, and maybe I didn't pick the best one.	2011-12-11 23:02:25 -04:00
Joey Hess	d64132a43a	hslint	2011-12-09 01:57:13 -04:00
Joey Hess	9290095fc2	improve type signatures with a Ref newtype In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for those. Note that this does not prevent mixing up of eg, refs and branches at the type level. Since git really doesn't care, except rare cases like git update-ref, or git tag -d, that seems ok for now. There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref may or may not be a tree-ish, depending on the object type, so there seems no point in trying to represent it at the type level.	2011-11-16 02:41:46 -04:00
Joey Hess	272a67921c	better name	2011-11-16 01:46:46 -04:00
Joey Hess	e83b966eb5	cleanup	2011-11-15 23:51:24 -04:00
Joey Hess	21a925dcf1	merge: Now runs in constant space. Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.	2011-11-15 23:28:01 -04:00
Joey Hess	922e9af528	cleanup	2011-11-15 22:40:40 -04:00
Joey Hess	b76dc2d210	avoid space leak writing merge This reduces the memory use of a merge by 1/3rd. The space leak was apparently because the whole update-index input was generated strictly, not lazily. I wondered if the change to ByteStrings contributed to this, due to the need to convert with L.pack here. But going back to the old code, I still see a much similar leak, and worse performance besides due to it not using ByteStrings. The fix is to just hPutStr the lines repeatedly. (Note the \0 is written separately, to avoid allocation overheads in adding it to the string.) The Git.pipeWrite interface is probably just wrong for any large inputs to git. This was the only place using it for input of any size. There is still at least one other space leak in the merge code.	2011-11-15 22:19:12 -04:00
Joey Hess	04edae6791	Optimised union merging; now only runs git cat-file once.	2011-11-12 17:45:12 -04:00
Joey Hess	637b5feb45	lint	2011-11-11 01:52:58 -04:00
Joey Hess	bf460a0a98	reorder repo parameters last Many functions took the repo as their first parameter. Changing it consistently to be the last parameter allows doing some useful things with currying, that reduce boilerplate. In particular, g <- gitRepo is almost never needed now, instead use inRepo to run an IO action in the repo, and fromRepo to get a value from the repo. This also provides more opportunities to use monadic and applicative combinators.	2011-11-08 16:27:20 -04:00
Joey Hess	3acdba3995	faster union merge of multiple branches into index only write index once	2011-10-07 13:36:48 -04:00
Joey Hess	7ff89ccfee	convert all git read/write functions to use ByteStrings This yields a second or so speedup in unused, find, etc. Seems that even when the ByteString is immediately split and then converted to Strings, it's faster. I may try to push ByteStrings out into more of git-annex gradually, although I suspect most of the time-critical parts are already covered now, and many of the rest rely on libraries that only support Strings.	2011-09-29 23:48:57 -04:00
Joey Hess	949ef94d5e	layout	2011-09-29 22:31:20 -04:00
Joey Hess	67f2b7cb3e	use ByteStrings when reading content of files didn't bother to benchmark this	2011-09-29 19:19:28 -04:00
Joey Hess	a91c8a15d5	Sped up unused. Added Git.ByteString which replaces Git IO methods with ones using lazy ByteStrings. This can be more efficient when large quantities of data are being read from git. In Git.LsTree, parse git ls-tree output more efficiently, thanks to ByteString. This benchmarks 25% faster, in a benchmark that includes (probably predominately) the run time for git ls-tree itself. In real world numbers, this makes git annex unused 2 seconds faster for each branch it needs to check, in my usual large repo.	2011-09-29 19:04:24 -04:00
Joey Hess	297bc648b9	make unused check branches and tags too needs time and space optimisation	2011-09-28 16:43:10 -04:00
Joey Hess	ad245a6375	refactor catfile code split into generic IO code, and a thin Annex wrapper	2011-09-28 15:17:36 -04:00
Joey Hess	a3cb5c47e5	use FileMode	2011-09-28 14:14:52 -04:00
Joey Hess	93807564d0	add ls-tree interface This parser should be fast. I hope.	2011-09-28 14:03:59 -04:00
Joey Hess	7724f895a8	tweak	2011-09-25 14:37:13 -04:00
Joey Hess	203148363f	split groups of related functions out of Utility	2011-08-22 16:14:12 -04:00
Joey Hess	e784757376	hlint tweaks Did all sources except Remotes/* and Command/*	2011-07-15 03:12:05 -04:00
Joey Hess	ded2591124	unannex: Clean up use of git commit -a. This was more complex than would be expected. unannex has to use git commit -a since it's removing files from git; git commit filelist won't do. Allow commands to be added to the Git queue that have no associated files, and run such commands once.	2011-07-14 17:15:37 -04:00

353 commits