git-annex

Author	SHA1	Message	Date
Joey Hess	3066bdb1fb	fix annex.largefiles largerthan/smallerthan bug Fix bug in handling of annex.largefiles that use largerthan/smallerthan. When adding a modified file, it incorrectly used the file size of the old version of the file, not the current size. That was the only largefiles limit that didn't directly look at the file on disk already. Added a new type to keep straight the two different ways such a limit can be matched. I kind of wanted to extend MatchingFile or FileInfo to indicate that the matcher is supposed to operate on files from disk or annex, but it turned out to be too complex to implement it that way. This also changes the LimitAnnexFiles case when lookupFileKey does not find a key. It used to fall back to statting the file, now it always returns False. I doubt the old code could really get to that point, but if it somehow does, it's better for preferred content matching to be consistent.	2019-09-30 17:15:08 -04:00
Joey Hess	b13a350556	added --unlocked and --locked	2019-09-19 12:33:13 -04:00
Joey Hess	fda1bdd679	Added --mimetype and --mimeencoding file matching options. Already had these for largefiles matching, but I forgot to add them as command-line options.	2019-09-19 12:09:59 -04:00
Joey Hess	d1a0c7b16f	make --in=here fast Use the same optimisation for --in=here as has always been used for --in=. rather than the slow code path that unnecessarily queries the git-annex branch. It looks like when "here" got added as an alias for "." back in 2012, I forgot about this place. Also sped up some very unlikely ways of referring to the current repository. Note that, this could in some rare corner case cause a behavior change, if the git-annex branch and inAnnex disagree about whether content is present in the local repository. But --in=. already behaved that way, and the truth on the ground should win also.	2019-08-01 00:29:47 -04:00
Joey Hess	aa7710982b	avoid list lookup by parseToken Minor optimisation to parsing of a preferred content expression.	2019-05-14 13:11:29 -04:00
Joey Hess	9dd764e6f7	Added mimeencoding= term to annex.largefiles expressions. * Added mimeencoding= term to annex.largefiles expressions. This is probably mostly useful to match non-text files with eg "mimeencoding=binary" * git-annex matchexpression: Added --mimeencoding option.	2019-04-30 12:17:22 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of source files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	467c3b393d	refactor magic	2019-01-23 12:40:59 -04:00
Joey Hess	727767e1e2	make everything build again after ByteString Key changes	2019-01-11 16:39:46 -04:00
Joey Hess	6f66b53a30	newtype Group to ByteString This may speed up queries for things in groups, due to Eq and Ord being faster.	2019-01-09 15:05:49 -04:00
Joey Hess	029ae8d4db	support findref and --branch with file matching options * findref: Support file matching options: --include, --exclude, --want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin * Commands supporting --branch now apply file matching options --include, --exclude, --want-get, --want-drop to filenames from the branch. Previously, combining --branch with those would fail to match anything. * add, import, findref: Support --time-limit. This commit was sponsored by Jake Vosloo on Patreon.	2018-12-09 13:38:35 -04:00
Joey Hess	6e6c9cc6d3	Added --accessedwithin matching option. Useful for dropping old objects from cache repositories. But also, quite a generally useful thing to have. Rather than imitating find's -atime and other options, all of which are pretty horrible to use, I made this match files accessed within a time period, using the same duration format used by git-annex schedule and --time-limit. In passing, changed the --time-limit option parser to parse the duration, instead of having it later throw an error. This commit was supported by the NSF-funded DataLad project.	2018-08-01 15:34:03 -04:00
Joey Hess	95f7295b67	followup	2018-06-04 12:12:56 -04:00
Joey Hess	f56594af9e	finish fixing inverted Ord for TrustLevel Flipped all comparisons. When a TrustLevel list was wanted from Trusted downwards, used Down to compare it in that order. This commit was sponsored by mo on Patreon.	2018-04-13 15:17:54 -04:00
Joey Hess	a0e4b9678b	fix inverted Ord for TrustLevel (intermediate commit) This commit removes the Ord and Enum instances, commenting out all code that depends on them, to make sure that all code affected by the inversion fix has been identified. (Assuming no ifdefs involve TrustLevel.) The next commit will fix up all the identified code.	2018-04-13 14:50:14 -04:00
Joey Hess	49114cf4ea	securehash matching Added --securehash option to match files using a secure hash function, and corresponding securehash preferred content expression. This commit was sponsored by Ethan Aubin.	2017-02-27 15:02:44 -04:00
Joey Hess	9c4650358c	add KeyVariety type Where before the "name" of a key and a backend was a string, this makes it a concrete data type. This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all. Benchmarks ran in my big repo: old git-annex info: real 0m3.338s user 0m3.124s sys 0m0.244s new git-annex info: real 0m3.216s user 0m3.024s sys 0m0.220s new git-annex find: real 0m7.138s user 0m6.924s sys 0m0.252s old git-annex find: real 0m7.433s user 0m7.240s sys 0m0.232s Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that. This commit was supported by the NSF-funded DataLad project.	2017-02-24 15:16:56 -04:00
Joey Hess	9eb10caa27	Some optimisations to string splitting code. Turns out that Data.List.Utils.split is slow and makes a lot of allocations. Here's a much simpler single character splitter that behaves the same (even in wacky corner cases) while running in half the time and 75% the allocations. As well as being an optimisation, this helps move toward eliminating use of missingh. (Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and allocates even more.) I have not benchmarked the effect on git-annex, but would not be surprised to see some parsing of eg, large streams from git commands run twice as fast, and possibly in less memory. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-01-31 19:06:22 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	eac26f13db	Fix bug in annex.largefiles mimetype= matching when git-annex is run in a subdirectory of the repository.	2016-04-12 14:19:34 -04:00
Joey Hess	b946ca44c3	Support --metadata field<number, --metadata field>number etc to match ranges of numeric values. Similarly (well, for free), support preferred content expressions like metadata=field<number and metadata=field>number	2016-02-27 10:55:02 -04:00
Joey Hess	a5bf674bec	Avoid crashing when built with MagicMime support, but when the magic database cannot be loaded.	2016-02-23 14:39:56 -04:00
Joey Hess	23cc315c38	matchexpression: Added --largefiles option to parse an annex.largefiles expression.	2016-02-03 16:58:36 -04:00
Joey Hess	5127cb59cc	annex.largefiles: Add support for mimetype=text/* etc, when git-annex is linked with libmagic.	2016-02-03 16:29:34 -04:00
Joey Hess	cdf5977053	simplify	2016-02-03 13:23:34 -04:00
Joey Hess	d37fe6a547	annex.largefiles can be configured in .gitattributes too This is particularly useful for v6 repositories, since the .gitattributes configuration will apply in all clones of the repository.	2016-02-02 15:18:17 -04:00
Joey Hess	d3ba9fe5c8	matchexpression: New plumbing command to check if a preferred content expression matches some data.	2016-01-25 16:16:18 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	cdd27b8920	reorg	2015-12-15 15:34:28 -04:00
Joey Hess	983c1894eb	avoid unnecessary reading of git-annex branch data when matching on annex.largefiles This makes git annex clean not look at the git-annex branch at all, and so speeds it up by 50% or more.	2015-12-04 15:06:41 -04:00
Joey Hess	9dfe03dbcd	Improve shutdown due to --time-limit, especially for fsck * Perform a clean shutdown when --time-limit is reached. This includes running queued git commands, and cleanup actions normally run when a command is finished. * fsck: Commit incremental fsck database when --time-limit is reached. Previously, some of the last files fscked did not make it into the database when using --time-limit. Note that this changes Annex.addCleanup hooks, to run after --time-limit expires. Fsck was using such a hook to clean up after a --incremental-schedule, and that shouldn't run when --time-limit expires. So, instead, moved that cleanup code to be run by cleanupIncremental. Resulted in some data type juggling.	2015-07-31 16:01:54 -04:00
Joey Hess	8c46ea22c2	Added new "anything" preferred content expression, which matches all versions of all files.	2015-06-16 17:03:34 -04:00
Joey Hess	38c458b407	refactor	2015-04-30 14:02:56 -04:00
Joey Hess	b94eb9b22c	relFile does not have to be relative; rename to currFile	2015-02-06 16:03:02 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	4f657aa14e	add getFileSize, which can get the real size of a large file on Windows Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows. Note that the InodeCache code only needs to compare a file size, so it doesn't matter if the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize. This commit was sponsored by Christian Dietrich.	2015-01-20 17:09:24 -04:00
Joey Hess	b61c6bc2ff	hlint	2014-10-09 15:46:05 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	fe19e15040	reorg matcher types; no non-type code changes	2014-03-29 14:43:34 -04:00
Joey Hess	ed30b81e2c	Improve behavior when unable to parse a preferred content expression (thanks, ion). Fall back to "present" as the preferred content expression, which will not result in any content movement.	2014-03-20 00:10:12 -04:00
Joey Hess	83ccce68a2	theoretical optimisation of --in Avoids looking up the remote each time, but in practice, does not result in a measurable speedup.	2014-03-13 18:51:44 -04:00
Joey Hess	24f8136504	--metadata field=value can now use globs to match, and matches case insensitively, the same as git annex view field=value does. Also refactored glob code into its own module.	2014-02-21 18:34:34 -04:00
Joey Hess	2075cdeb59	limiting files based on metadata Note that there is currently no caching, so --metadata foo=bar --metadata tag=blah will currently read the log 2x per file.	2014-02-13 02:24:30 -04:00
Joey Hess	40cec65ace	more hlint	2014-02-11 10:48:52 -04:00
Joey Hess	a44e01c29c	--in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}"	2014-02-06 12:43:56 -04:00
Joey Hess	1669e80e85	Windows: Avoid using unix-compat's rename, which refuses to rename directories. Opened a bug about this: https://github.com/jystic/unix-compat/issues/10	2014-01-29 15:19:03 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	f2713a3bb9	benchmarked numcopies .gitattributes in preferred content Checking .gitattributes adds a full minute to a git annex find looking for files that don't have enough copies. 2:25 increases to 3:27. I feel this is too much of a slowdown to justify making it the default. So, exposed two versions of the preferred content expression, a slow one and a fast but approximate one. I'm using the approximate one in the default preferred content expressions to avoid slowing down the assistant.	2014-01-21 18:49:25 -04:00

109 commits