git-annex

Author	SHA1	Message	Date
Joey Hess	0a7c5a9982	dropdead per-remote metadata Had to refactor pure code into separate modules so it is accessible inside Annex.Branch.Transitions. This commit was sponsored by Peter on Patreon.	2018-09-05 13:52:46 -04:00
Joey Hess	24b76cb8e0	fix prefixing	2018-08-31 13:39:50 -04:00
Joey Hess	b3d42283ad	use per-remote metadata storage for S3 version ID Since the same key can be stored in a versioned S3 bucket multiple times with different version IDs, this allows tracking them all. Not currently needed, but if we ever want to drop from a versioned S3 bucket, we'll need to know them all. This commit was supported by the NSF-funded DataLad project.	2018-08-31 13:27:29 -04:00
Joey Hess	5c99f6247e	per-remote metadata storage Actually very straightforward reuse of the metadata log file code. Although I had to add a todo item as git-annex forget won't clean up dead remote's metadata yet. This would be worth adding to the external special remote interface sometime. Have not opened a todo though, guess I'll wait until something needs it. This commit was supported by the NSF-funded DataLad project.	2018-08-31 12:23:22 -04:00
Joey Hess	256d8f07e8	avoid insertWith' deprecation warning Switch to Data.Map.Strict everywhere that used it. There are still lots of lazy maps in git-annex. I think switching these is safe. The risk is that there might be a map that is used in a way that relies on the values not being evaluated to WHNF, and switching to strict might result in bad performance or memory use. So, I have not switched everything.	2018-04-22 13:28:31 -04:00
Joey Hess	89e1a05a8f	Fix mangling of --json output of utf-8 characters when not running in a utf-8 locale As long as all code imports Utility.Aeson rather than Data.Aeson, and no Strings that may contain utf-8 characters are used for eg, object keys via T.pack, this is guaranteed to fix the problem everywhere that git-annex generates json. It's kind of annoying to need to wrap ToJSON with a ToJSON', especially since every data type that has a ToJSON instance has to be ported over. However, that only took 50 lines of code, which is worth it to ensure full coverage. I initially tried an alternative approach of a newtype FileEncoded, which had to be used everywhere a String was fed into aeson, and chasing down all the sites would have been far too hard. Did consider creating an intentionally overlapping instance ToJSON String, and letting ghc fail to build anything that passed in a String, but am not sure that wouldn't pollute some library that git-annex depends on that happens to use ToJSON String internally. This commit was supported by the NSF-funded DataLad project.	2018-04-16 16:21:21 -04:00
Joey Hess	ef389722ae	don't copy old date metadata when adding new version of a file When adding a new version of a file, and annex.genmetadata is enabled, don't copy the date metadata from the old version of the file, instead use the mtime of the file. Rationale being that the user has requested to generate metadata and so would expect to get the new mtime into metadata. Also, avoid warning about copying metadata when all the old metadata is date metadata. Which was rather the harder part. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-04-04 13:58:16 -04:00
Joey Hess	812d90022b	metadata: Added --remove-all. Motivation is to remove all metadata when it gets copied from a previous version of the file, and that is not desirable. This commit was supported by the NSF-funded DataLad project.	2017-09-28 12:36:10 -04:00
Joey Hess	bf3327ff25	Added metadata --batch option, which allows getting, setting, deleting, and modifying metadata for multiple files/keys.	2016-07-27 10:46:25 -04:00
Joey Hess	8bc8469c38	saner format for metadata --json metadata --json output format has changed, adding an inner json object named "fields" which contains only the fields and their values. This should be easier to parse than the old format, which mixed up metadata fields with other keys in the json object. Any consumers of the old format will need to be updated. This adds a dependency on unordered-containers for parsing MetaData from JSON, but it's a free dependency; aeson pulls in that library.	2016-07-26 15:41:04 -04:00
Yaroslav Halchenko	64e844e1fe	minor typo fixes throughout problematic flexibility	2016-06-02 11:22:18 -04:00
Joey Hess	e520366c4d	metadata: Added -r to remove all current values of a field.	2016-02-29 13:00:46 -04:00
Joey Hess	b946ca44c3	Support --metadata field<number, --metadata field>number etc to match ranges of numeric values. Similarly (well, for free), support preferred content expressions like metadata=field<number and metadata=field>number	2016-02-27 10:55:02 -04:00
Joey Hess	4e4e11849a	fix test suite fail in LANG=C This was caused by `23e9d3bb77` an Arbitrary String is not necessarily encoded using the filesystem encoding, and in a non-utf8 locale, encodeBS throws an exception on such a string. All I could think to do is limit test data to ascii. This shouldn't be a problem in practice, because all Strings in git-annex that are not generated by Arbitrary should be loaded in a way that does apply the filesystem encoding.	2015-08-12 10:36:51 -04:00
Joey Hess	9b93278e8a	metadata: Fix encoding problem that led to mojibake when storing metadata strings that contained both unicode characters and a space (or '!') character. The fix is to stop using w82s, which does not properly reconstitute unicode strings. Instead, use utf8 bytestring to get the [Word8] to base64. This passes unicode through perfectly, including any invalid filesystem encoded characters. Note that toB64 / fromB64 are also used for creds and cipher embedding. It would be unfortunate if this change broke those uses. For cipher embedding, note that ciphers can contain arbitrary bytes (should really be using ByteString.Char8 there). Testing indicated it's not safe to use the new fromB64 there; I think that characters were incorrectly combined. For credpair embedding, the username or password could contain unicode. Before, that unicode would fail to round-trip through the b64. So, I guess this is not going to break any embedded creds that worked before. This bug may have affected some creds before, and if so, this change will not fix old ones, but should fix new ones at least.	2015-03-04 12:54:30 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	5c79fa0351	avoid generating arbitrary MetaData with illegal fields	2014-03-26 16:40:52 -04:00
Joey Hess	2f52f727c0	hilarious typo	2014-03-18 19:03:35 -04:00
Joey Hess	caa97d1271	For each metadata field, there's now an automatically maintained "$field-lastchanged" that gives the timestamp of the last change to that field. Note that this is a nearly entirely free feature. The data was already stored in the metadata log in an easily accessible way, and already was parsed to a time when parsing the log. The generation of the metadata fields may even be done lazily, although probably not entirely (the map has to be evaluated when queried).	2014-03-18 18:55:43 -04:00
Joey Hess	d0fce426c4	pre-commit-annex hook script to automatically extract metadata from lots of types of files Using the extract(1) program to do the heavy lifting. Decided to make git-annex run pre-commit-annex when committing. Since git-annex pre-commit also runs it, it'll be run when git commit is run too, via the pre-commit hook. This basically gives back the pre-commit hook that git-annex took away. The implementation avoids repeatedly looking for the hook script when the assistant is running and committing repeatedly; only checks if the hook is available once. To make the script simpler, made git-annex metadata -s field?=value only set a field when it's not already got a value. This commit was sponsored by bak.	2014-03-02 20:11:58 -04:00
Joey Hess	06e9080f01	metadata: Field names are now case insensitive.	2014-02-25 18:45:09 -04:00
Joey Hess	b437787eee	metadata: Field names limited to alphanumerics and a few whitelisted punctuation characters to avoid issues with views, etc.	2014-02-23 13:34:59 -04:00
Joey Hess	7498c5dd96	annex.genmetadata can be set to make git-annex automatically set metadata (year and month) when adding files	2014-02-23 00:08:29 -04:00
Joey Hess	39ebfa1a2e	pre-commit: Update metadata when committing changes to annexed files within a view. So the user can now switch to a view and then move files around within it to manage metadata. For example, moving a file into a new directory when in the tags=* view adds a tag to it. Implementation is fairly efficient. One diff-index, which is no more expensive than the first stage of a git commit, followed by possibly some cat-file --batch traffic to find the key (when deleting a file). Very similar to what's done in direct mode when committing. And like direct mode when updating the WC after a merge, it has to buffer the diff-tree values in order to make 2 passes over them. When not in a view, pre-commit now does one extra git symbolic-ref, which is tiny overhead. This commit was sponsored by Andrew Eskridge.	2014-02-19 14:17:58 -04:00
Joey Hess	67fd06af76	add git annex view command (And a vpop command, which is still a bit buggy.) Still need to do vadd and vrm, though this also adds their documentation. Currently not very happy with the view log data serialization. I had to lose the TDFA regexps temporarily, so I can have Read/Show instances of View. I expect the view log format will change in some incompatible way later, probably adding last known refs for the parent branch to View or something like that. Anyway, it basically works, although it's a bit slow looking up the metadata. The actual git branch construction is about as fast as it can be using the current git plumbing. This commit was sponsored by Peter Hogg.	2014-02-18 18:22:20 -04:00
Joey Hess	613f8f02e3	add another quickcheck property, and several edge cases handled	2014-02-16 21:00:12 -04:00
Joey Hess	9633c67842	filter branches (incomplete) Promising work toward metadata driven filter branches. A few methods to construct them are stubbed out; all the data types and pure code seems good. This commit was sponsored by Walter Somerville.	2014-02-16 17:39:54 -04:00
Joey Hess	2075cdeb59	limiting files based on metadata Note that there is currently no caching, so --metadata foo=bar --metadata tag=blah will currently read the log 2x per file.	2014-02-13 02:24:30 -04:00
Joey Hess	0e9a72b356	metadata command can now operate on many files at once	2014-02-13 01:49:38 -04:00
Joey Hess	8076530284	improve simplifier	2014-02-12 22:50:41 -04:00
Joey Hess	a05ac13e92	fix metadata log simplifier and additional quickcheck tests	2014-02-12 22:27:55 -04:00
Joey Hess	4d205b0fb9	fix Ord instance for MetaValue to work like Eq instance	2014-02-12 22:01:24 -04:00
Joey Hess	9f7e76130e	add metadata command to get/set metadata Adds metadata log, and command. Note that unsetting field values seems to currently be broken. And in general this has had all of 2 minutes worth of testing. This commit was sponsored by Julien Lefrique.	2014-02-12 21:30:33 -04:00
Joey Hess	1b79d18a40	data types and serialization for metadata A very haskell commit! Just data types, instances to serialize the metadata to a nice format, and QuickCheck tests. This commit was sponsored by Andreas Leha.	2014-02-12 17:57:32 -04:00

35 commits