git-annex

Author	SHA1	Message	Date
Joey Hess	6f66b53a30	newtype Group to ByteString This may speed up queries for things in groups, due to Eq and Ord being faster.	2019-01-09 15:05:49 -04:00
Joey Hess	029ae8d4db	support findred and --branch with file matching options * findref: Support file matching options: --include, --exclude, --want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin * Commands supporting --branch now apply file matching options --include, --exclude, --want-get, --want-drop to filenames from the branch. Previously, combining --branch with those would fail to match anything. * add, import, findref: Support --time-limit. This commit was sponsored by Jake Vosloo on Patreon.	2018-12-09 13:38:35 -04:00
Joey Hess	6e6c9cc6d3	Added --accessedwithin matching option. Useful for dropping old objects from cache repositories. But also, quite a genrally useful thing to have.. Rather than imitiating find's -atime and other options, all of which are pretty horrible to use, I made this match files accessed within a time period, using the same duration format used by git-annex schedule and --limit-time In passing, changed the --limit-time option parser to parse the duration, instead of having it later throw an error. This commit was supported by the NSF-funded DataLad project.	2018-08-01 15:34:03 -04:00
Joey Hess	95f7295b67	followup	2018-06-04 12:12:56 -04:00
Joey Hess	f56594af9e	finish fixing inverted Ord for TrustLevel Flipped all comparisons. When a TrustLevel list was wanted from Trusted downwards, used Down to compare it in that order. This commit was sponsored by mo on Patreon.	2018-04-13 15:17:54 -04:00
Joey Hess	a0e4b9678b	fix inverted Ord for TrustLevel (intermediate commit) This commit removes the Ord and Enum instances, commenting out all code that depends on them, to make sure that all code effected by the inversion fix has been identified. (Assuming no ifdefs involve TrustLevel.) The next commit will fix up all the identified code.	2018-04-13 14:50:14 -04:00
Joey Hess	49114cf4ea	securehash matching Added --securehash option to match files using a secure hash function, and corresponding securehash preferred content expression. This commit was sponsored by Ethan Aubin.	2017-02-27 15:02:44 -04:00
Joey Hess	9c4650358c	add KeyVariety type Where before the "name" of a key and a backend was a string, this makes it a concrete data type. This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all. Benchmarks ran in my big repo: old git-annex info: real 0m3.338s user 0m3.124s sys 0m0.244s new git-annex info: real 0m3.216s user 0m3.024s sys 0m0.220s new git-annex find: real 0m7.138s user 0m6.924s sys 0m0.252s old git-annex find: real 0m7.433s user 0m7.240s sys 0m0.232s Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that. This commit was supported by the NSF-funded DataLad project.	2017-02-24 15:16:56 -04:00
Joey Hess	9eb10caa27	Some optimisations to string splitting code. Turns out that Data.List.Utils.split is slow and makes a lot of allocations. Here's a much simpler single character splitter that behaves the same (even in wacky corner cases) while running in half the time and 75% the allocations. As well as being an optimisation, this helps move toward eliminating use of missingh. (Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and allocates even more.) I have not benchmarked the effect on git-annex, but would not be surprised to see some parsing of eg, large streams from git commands run twice as fast, and possibly in less memory. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-01-31 19:06:22 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for a error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	eac26f13db	Fix bug in annex.largefiles mimetype= matching when git-annex is run in a subdirectory of the repository.	2016-04-12 14:19:34 -04:00
Joey Hess	b946ca44c3	Support --metadata field<number, --metadata field>number etc to match ranges of numeric values. Similarly (well, for free), support preferred content expressions like metadata=field<number and metadata=field>number	2016-02-27 10:55:02 -04:00
Joey Hess	a5bf674bec	Avoid crashing when built with MagicMime support, but when the magic database cannot be loaded.	2016-02-23 14:39:56 -04:00
Joey Hess	23cc315c38	matchexpression: Added --largefiles option to parse an annex.largefiles expression.	2016-02-03 16:58:36 -04:00
Joey Hess	5127cb59cc	annex.largefiles: Add support for mimetype=text/* etc, when git-annex is linked with libmagic.	2016-02-03 16:29:34 -04:00
Joey Hess	cdf5977053	simplify	2016-02-03 13:23:34 -04:00
Joey Hess	d37fe6a547	annex.largefiles can be configured in .gitattributes too This is particulary useful for v6 repositories, since the .gitattributes configuration will apply in all clones of the repository.	2016-02-02 15:18:17 -04:00
Joey Hess	d3ba9fe5c8	matchexpression: New plumbing command to check if a preferred content expression matches some data.	2016-01-25 16:16:18 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	cdd27b8920	reorg	2015-12-15 15:34:28 -04:00
Joey Hess	983c1894eb	avoid unnecessary reading of git-annex branch data when matching on annex.largefiles This makes git annex clean not look at the git-annex branch at all, and so speeds it up by 50% or more.	2015-12-04 15:06:41 -04:00
Joey Hess	9dfe03dbcd	Improve shutdown due to --time-limit, especially for fsck * Perform a clean shutdown when --time-limit is reached. This includes running queued git commands, and cleanup actions normally run when a command is finished. * fsck: Commit incremental fsck database when --time-limit is reached. Previously, some of the last files fscked did not make it into the database when using --time-limit. Note that this changes Annex.addCleanup hooks, to run after --time-limit expires. Fsck was using such a hook to clean up after a --incremental-schedule, and that shouldn't run when --time-limit exipires it. So, instead, moved that cleanup code to be run by cleanupIncremental. Resulted in some data type juggling.	2015-07-31 16:01:54 -04:00
Joey Hess	8c46ea22c2	Added new "anything" preferred content expression, which matches all versions of all files.	2015-06-16 17:03:34 -04:00
Joey Hess	38c458b407	refactor	2015-04-30 14:02:56 -04:00
Joey Hess	b94eb9b22c	relFile does not have to be relative; rename to currFile	2015-02-06 16:03:02 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	4f657aa14e	add getFileSize, which can get the real size of a large file on Windows Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows. Note that the InodeCache code only needs to compare a file size, so it doesn't matter it the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize. This commit was sponsored by Christian Dietrich.	2015-01-20 17:09:24 -04:00
Joey Hess	b61c6bc2ff	hlint	2014-10-09 15:46:05 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	fe19e15040	reorg matcher types; no non-type code changes	2014-03-29 14:43:34 -04:00
Joey Hess	ed30b81e2c	Improve behavior when unable to parse a preferred content expression (thanks, ion). Fall back to "present" as the preferred conent expression, which will not result in any content movement.	2014-03-20 00:10:12 -04:00
Joey Hess	83ccce68a2	theoretical optimisation of --in Avoids looking up the remote each time, but in practice, does not result in a measurable speedup.	2014-03-13 18:51:44 -04:00
Joey Hess	24f8136504	--metadata field=value can now use globs to match, and matches case insensatively, the same as git annex view field=value does. Also refactored glob code into its own module.	2014-02-21 18:34:34 -04:00
Joey Hess	2075cdeb59	limiting files based on metadata Note that there is currently no caching, so --metadata foo=bar --metadata tag=blah will currently read the log 2x per file.	2014-02-13 02:24:30 -04:00
Joey Hess	40cec65ace	more hlint	2014-02-11 10:48:52 -04:00
Joey Hess	a44e01c29c	--in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}"	2014-02-06 12:43:56 -04:00
Joey Hess	1669e80e85	Windows: Avoid using unix-compat's rename, which refuses to rename directories. Opened a bug about this: https://github.com/jystic/unix-compat/issues/10	2014-01-29 15:19:03 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	f2713a3bb9	benchmarked numcopies .gitattributes in preferred content Checking .gitattributes adds a full minute to a git annex find looking for files that don't have enough copies. 2:25 increasts to 3:27. I feel this is too much of a slowdown to justify making it the default. So, exposed two versions of the preferred content expression, a slow one and a fast but approximate one. I'm using the approximate one in the default preferred content expressions to avoid slowing down the assistant.	2014-01-21 18:49:25 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	3159da2693	Add and use numcopiesneeded preferred content expression. * Add numcopiesneeded preferred content expression. * Client, transfer, incremental backup, and archive repositories now want to get content that does not yet have enough copies. This means the asssistant will make copies of files that don't yet meet the configured numcopies, even to places that would not normally want the file. For example, if numcopies is 4, and there are 2 client repos and 2 transfer repos, and 2 removable backup drives, the file will be sent to both transfer repos in order to make 4 copies. Once a removable drive get a copy of the file, it will be dropped from one transfer repo or the other (but not both). Another example, numcopies is 3 and there is a client that has a backup removable drive and two small archive repos. Normally once one of the small archives has a file, it will not be put into the other one. But, to satisfy numcopies, the assistant will duplicate it into the other small archive too, if the backup repo is not available to receive the file. I notice that these examples are fairly unlikely setups .. the old behavior was not too bad, but it's nice to finally have it really correct. .. Almost. I have skipped checking the annex.numcopies .gitattributes out of fear it will be too slow. This commit was sponsored by Florian Schlegel.	2014-01-20 17:35:29 -04:00
Joey Hess	8ce515ffe4	improve matcher data type to allow matching Keys, instead of just files (no behavior changes)	2014-01-18 14:51:55 -04:00
Joey Hess	049e80e865	refactor	2013-10-28 14:05:55 -04:00
Joey Hess	2b6747b6a2	update for Duration type change	2013-10-08 17:36:55 -04:00
Joey Hess	06db8e0bd9	squash compiler warnings on Windows	2013-08-04 13:18:05 -04:00
Joey Hess	749beb5899	fix Android build, broken for 2 days	2013-05-26 18:14:03 -04:00
Joey Hess	2b14fe2c98	refactor	2013-05-24 23:07:26 -04:00

1 2

100 commits