git-annex

Author	SHA1	Message	Date
Joey Hess	7dc6804154	unannex, uninit: Avoid committing after every file is unannexed, for massive speedup. pre-commit hook lock added, so unannex can prevent the hook from running in a confusing state. This commit was sponsored by Fredrik Hammar	2014-03-21 14:41:05 -04:00
Joey Hess	d0fce426c4	pre-commit-annex hook script to automatically extract metadata from lots of types of files Using the extract(1) program to do the heavy lifting. Decided to make git-annex run pre-commit-annex when committing. Since git-annex pre-commit also runs it, it'll be run when git commit is run too, via the pre-commit hook. This basically gives back the pre-commit hook that git-annex took away. The implementation avoids repeatedly looking for the hook script when the assistant is running and committing repeatedly; only checks if the hook is available once. To make the script simpler, made git-annex metadata -s field?=value only set a field when it's not already got a value. This commit was sponsored by bak.	2014-03-02 20:11:58 -04:00
Joey Hess	1435c4f149	factor out new module	2014-02-22 13:35:50 -04:00
Joey Hess	39ebfa1a2e	pre-commit: Update metadata when committing changes to annexed files within a view. So the user can now switch to a view and then move files around within it to manage metadata. For example, moving a file into a new directory when in the tags=* view adds a tag to it. Implementation is fairly efficient. One diff-index, which is no more expensive than the first stage of a git commit, followed by possibly some cat-file --batch traffic to find the key (when deleting a file). Very similar to what's done in direct mode when committing. And like direct mode when updating the WC after a merge, it has to buffer the diff-tree values in order to make 2 passes over them. When not in a view, pre-commit now does one extra git symbolic-ref, which is tiny overhead. This commit was sponsored by Andrew Eskridge.	2014-02-19 14:17:58 -04:00
Joey Hess	cce69eee4d	avoid using function name that conflicts with name used in newer version of process library	2014-01-29 13:44:53 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	03932212ec	Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree. The assistant's commit code also always avoids git commit, for simplicity. Indirect mode sync still does a git commit -a to catch unstaged changes. Note that this means that direct mode sync no longer runs the pre-commit hook or any other hooks git commit might call. The git annex pre-commit hook action for direct mode is however explicitly run. (The assistant already ran git commit with hooks disabled, so no change there.)	2013-12-01 13:59:45 -04:00
Joey Hess	7206dcb376	update for DiffTree change This actually fixes a bug; if pre-commit was run in a subdir, it would pass relative files when updating the associated file maps, and so the maps wouldn't update. I don't think this bug happened in practice, due to the way pre-commit is called by the hook. It happened to chdir to the top of the work tree.	2013-10-17 14:52:12 -04:00
Joey Hess	b405295aee	hlint test suite still passes	2013-09-25 03:09:06 -04:00
Joey Hess	006cf7976f	more completely solve catKey memory leak Done using a mode witness, which ensures it's fixed everywhere. Fixing catFileKey was a bear, because git cat-file does not provide a nice way to query for the mode of a file and there is no other efficient way to do it. Oh, for libgit2.. Note that I am looking at tree objects from HEAD, rather than the index. Because cat-file cannot show a tree object for the index. So this fix is technically incomplete. The only cases where it matters are: 1. A new large file has been directly staged in git, but not committed. 2. A file that was committed to HEAD as a symlink has been staged directly in the index. This could be fixed a lot better using libgit2.	2013-09-19 16:41:21 -04:00
Joey Hess	eb42bde19a	sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory.	2013-09-19 14:48:42 -04:00
guilhem	f15fda60ed	Speed up the 'unused' command. Instead of populating the second-level Bloom filter with every key referenced in every Git reference, consider only those which differ from what's referenced in the index. Incidentally, unlike with its old behavior, staged modifications/deletion/... will now be detected by 'unused'. Credits to joeyh for the algorithm. :-)	2013-08-25 21:02:13 -04:00
Joey Hess	cfd3b16fe1	add section metadata to all commands Not yet used .. mindless train work.	2013-03-24 18:28:21 -04:00
Joey Hess	547d7745fb	pre-commit: Update direct mode mappings. Making the pre-commit hook look at git diff-index to find changed direct mode files and update the mappings works pretty well. One case where it does not work is when a file is git annex added, and then git rmed, and then this is committed. That's a no-op commit, so the hook probably doesn't even run, and it certainly never notices that the file was deleted, so the mapping will still have the original filename in it. For this and other reasons, it's important that the mappings still be treated as possibly inconsistent. Also, the assistant now allows the pre-commit hook to run when in direct mode, so the mappings also get updated there.	2013-02-06 12:44:19 -04:00
Joey Hess	7272179979	avoid running pre-commit hook in direct mode The code that handles committing unlocked files in indirect mode did something unexpected and data lossy.	2013-01-17 14:11:01 -04:00
Joey Hess	f12202f771	optimize pre-commit in direct mode	2013-01-06 16:56:55 -04:00
Joey Hess	20fafc6a2d	avoid pre-commit in direct mode It was a no-op until my recent change that made lookupFile work in direct mode.	2013-01-05 16:06:20 -04:00
Joey Hess	60ab3d84e1	added ifM and nuked 11 lines of code no behavior changes	2012-03-14 17:43:34 -04:00
Joey Hess	cbaebf538a	rework git check-attr interface Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file. The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that `cad8824852` was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones. The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplifying the types for several other Seek methods that had a Backend tupled in.	2012-02-13 23:52:21 -04:00
Joey Hess	b327227ba5	better limiting of start actions to only run whenAnnexed Mostly only refactoring, but this does remove one redundant stat of the symlink by copy.	2011-11-10 23:45:14 -04:00
Joey Hess	f97c783283	clean up check selection code This new approach allows filtering out checks from the default set that are not appropriate for a command, rather than having to list every check that is appropriate. It also reduces some boilerplate. Haskell does not define Eq for functions, so I had to go a long way around with each check having a unique id. Meh.	2011-10-29 15:19:05 -04:00
Joey Hess	b955238ec7	Fail if --from or --to is passed to commands that do not support them.	2011-10-27 18:56:54 -04:00
Joey Hess	5b74b130a3	refactored and generalized pre-command sanity checking	2011-10-27 16:31:35 -04:00
Joey Hess	5ff04bf2af	tweak	2011-09-15 16:59:52 -04:00
Joey Hess	35145202d2	remove command type definitions These were a mistake, they make the type signatures harder to read and less flexible. The CommandSeek, CommandStart, CommandPerform, and CommandCleanup types were a good idea, but composing them with the parameters expected is going too far.	2011-09-15 16:50:49 -04:00
Joey Hess	9fe3c6d211	clean up params in usage display	2011-09-15 14:33:37 -04:00
Joey Hess	869cb82f49	remove unnecessary imports	2011-06-01 11:53:43 -04:00
Joey Hess	038da52bdd	Somewhat sped up `git commit` of modifications to unlocked files. Avoid git reset here too, so I no longer need to care that it's much more expensive than seems wise (but I asked the git list about that anyway). It's not necessary to reset the staged file content from the index, as the `git add` of the symlink will replace it anyway. `git commit` of unlocked files is still slow, since git still has to shove their entire content into the index, only to have it be thrown away. So it's still better to use `git annex add`	2011-05-31 16:08:37 -04:00
Joey Hess	56bc3e95ca	refactor some boilerplate	2011-05-15 02:02:46 -04:00
Joey Hess	bc51387e6d	Periodically flush git command queue, to avoid bloating memory usage too much. Since the queue is flushed in between subcommand actions being run, there should be no issues with actions that expect to queue up some stuff and have it run after they do other stuff. So I didn't have to audit for such assumptions.	2011-04-07 13:59:31 -04:00
Joey Hess	140a351fc5	avoid version check before running version and upgrade commands There are two types of commands; those that access the repository and those that don't. Sorted.	2011-03-19 18:58:49 -04:00
Joey Hess	bc5c54c987	symlink touching fun When adding files to the annex, the symlinks pointing at the annexed content are made to have the same mtime as the original file. While git does not preserve that information, this allows a tool like metastore to be used with annexed files.	2011-03-14 23:00:23 -04:00
Joey Hess	72d2684016	Rethink filename encoding handling for display. Since filename encoding may or may not match locale settings, any attempt to decode filenames will fail for some files. So instead, do all output in binary mode.	2011-03-12 15:30:17 -04:00
Joey Hess	fcdc4797a9	use ShellParam type So, I have a type checked safe handling of filenames starting with dashes, throughout the code.	2011-02-28 16:18:55 -04:00
Joey Hess	5a50a7cf13	update unicode FilePath handling Based on http://hackage.haskell.org/trac/ghc/ticket/3307 , whether FilePath contains decoded unicode varies by OS. So, add a configure check for it. Also, renamed showFile to filePathToString	2011-02-11 15:37:37 -04:00
Michael Kenney	285fb2bb08	Fixed missing import of Messages module	2011-02-10 21:06:00 -04:00
Joey Hess	fe55b4644e	Fix display of unicode filenames. Internally, the filenames are stored as un-decoded unicode. I tried decoding them, but then haskell tries to access the wrong files. Hmm. So, I've unhappily chosen option "B", which is to decode filenames before they are displayed.	2011-02-10 14:21:44 -04:00
Joey Hess	a89a6f2114	refactor in preparation for adding a git-annex-shell command	2010-12-30 15:06:26 -04:00
Joey Hess	6a5be9d53c	rename some stuff and prepare to break out more into Command/*	2010-12-30 14:19:16 -04:00
Joey Hess	92e5d28ca8	precommit: Optimise to avoid calling git-check-attr more than once.	2010-11-28 14:21:30 -04:00
Joey Hess	eeae910242	finished hlinting	2010-11-22 17:51:55 -04:00
Joey Hess	da0de293d1	refactor param seeking	2010-11-11 18:54:52 -04:00
Joey Hess	ce62f5abf1	rework command dispatching for add and pre-commit Both subcommands do two different operations on different sets of files, so allowing a subcommand to perform a list of operations cleans things up.	2010-11-11 17:58:55 -04:00
Joey Hess	dffe949963	Optimize both pre-commit and lock subcommands. isLocked was doing the expensive check before the cheap one. Let's not fork git diff twice per file when committing, especially. git diff is still run more than strictly necessary (ie, more than once) if multiple unlocked files are being committed. But much better now.	2010-11-11 14:54:29 -04:00
Joey Hess	d0886a9ac7	explicitly run queue to git add files	2010-11-10 13:32:46 -04:00
Joey Hess	361d28e138	Unlocked files will now automatically be added back into the annex when committed (and the updated symlink committed), by some magic in the pre-commit hook.	2010-11-10 13:01:17 -04:00
Joey Hess	91c5fe71af	add	2010-11-10 10:52:43 -04:00

47 commits