git-annex

Author	SHA1	Message	Date
Joey Hess	7347e50123	add stage number to stagedDetails parser And convert parser to attoparsec, probably faster. Before, a parse failure threw the whole --stage output line in to the filename, which was certianly a bad idea, so fixed that.	2020-07-08 15:05:12 -04:00
Joey Hess	57b89c635f	support required groupwanted When the required content is set to "groupwanted", use whatever expression has been set in groupwanted as the required content of the repo, similar to how setting required content to "standard" already worked.	2020-04-28 13:31:26 -04:00
Joey Hess	f85ca7dc80	fix all remaining -Wincomplete-uni-patterns warnings A couple of these were probably actual bugs in edge cases. Most of the changes I'm fine with. The fact that aeson's object returns sometihng that we know will be an Object, but the type checker does not know is kind of annoying.	2020-04-15 13:55:08 -04:00
Joey Hess	0e4c92503e	fix warning I don't think the NoConfigValue case ever actually occurs here.	2020-04-15 13:04:00 -04:00
Joey Hess	9cb69dbb76	support boolean git configs that are represented by the name of the setting with no value Eg"core.bare" is the same as "core.bare = true". Note that git treats "core.bare =" the same as "core.bare = false", so the code had to become more complicated in order to treat the absense of a value differently than an empty value. Ugh.	2020-04-13 13:35:22 -04:00
Joey Hess	aeca7c2207	Sped up query commands that read the git-annex branch by around 5% The only price paid is one additional MVar read per write to the journal. Presumably writing a journal file dominiates over a MVar read time by several orders of magnitude. --batch does not get the speedup because then it needs to notice when another process has made a change. Also made the assistant and other damon modes bypass the optimisation, which would not help them anyway.	2020-04-09 13:54:43 -04:00
Joey Hess	c0cd07c36b	Ref ByteString conversion done Test suite passes.	2020-04-07 17:41:09 -04:00
Joey Hess	6c81e0c8f1	ByteString Ref continued Several nice speed wins I think. At 340/633 files converted.	2020-04-07 13:27:11 -04:00
Joey Hess	eaa49ab53d	convert replaceFile to createDirectoryUnder Since it was used on both worktree and .git/annex files, split into multiple functions. In passing, this also improves permissions of created directories in .git/annex, using createAnnexDirectory on those.	2020-03-06 11:31:01 -04:00
Joey Hess	879f52a116	annex.tune.branchhash1=true bugfix Fix support for repositories tuned with annex.tune.branchhash1=true, including --all not working and git-annex log not displaying anything for annexed files.	2020-02-14 15:22:48 -04:00
Joey Hess	71ecfbfccf	be stricter about rejecting invalid configurations for remotes This is a first step toward that goal, using the ProposedAccepted type in RemoteConfig lets initremote/enableremote reject bad parameters that were passed in a remote's configuration, while avoiding enableremote rejecting bad parameters that have already been stored in remote.log This does not eliminate every place where a remote config is parsed and a default value is used if the parse false. But, I did fix several things that expected foo=yes/no and so confusingly accepted foo=true but treated it like foo=no. There are still some fields that are parsed with yesNo but not not checked when initializing a remote, and there are other fields that are parsed in other ways and not checked when initializing a remote. This also lays groundwork for rejecting unknown/typoed config keys.	2020-01-10 14:52:48 -04:00
Joey Hess	37467a008f	annex.addunlocked expressions * annex.addunlocked can be set to an expression with the same format used by annex.largefiles, in case you want to default to unlocking some files but not others. * annex.addunlocked can be configured by git-annex config. Added a git-annex-matching-expression man page, broken out from tips/largefiles. A tricky consequence of this is that git-annex add --relaxed honors annex.addunlocked, but an expression might want to know the size or content of an url, which it's not going to download. I decided it was better not to fail, and just dummy up some plausible data in that case. Performance impact should be negligible. The global config is already loaded for annex.largefiles. The expression only has to be parsed once, and in the simple true/false case, it should not do any additional work matching it.	2019-12-20 15:56:25 -04:00
Joey Hess	ce3fb0b2e5	fixed an oversight that had always prevented annex.resolvemerge from being honored, when it was configured by git-annex config forgot to add it to the merge function	2019-12-20 11:00:08 -04:00
Joey Hess	686791c4ed	more RawFilePath Remove dup definitions and just use the RawFilePath one. </> etc are enough faster that it's probably faster than building a String directly, although I have not benchmarked.	2019-12-18 17:10:28 -04:00
Joey Hess	bdec7fed9c	convert TopFilePath to use RawFilePath Adds a dependency on filepath-bytestring, an as yet unreleased fork of filepath that operates on RawFilePath. Git.Repo also changed to use RawFilePath for the path to the repo. This does eliminate some RawFilePath -> FilePath -> RawFilePath conversions. And filepath-bytestring's </> is probably faster. But I don't expect a major performance improvement from this. This is mostly groundwork for making Annex.Location use RawFilePath, which will allow for a conversion-free pipleline.	2019-12-09 15:07:21 -04:00
Joey Hess	c20f4704a7	all commands building except for assistant also, changed ConfigValue to a newtype, and moved it into Git.Config.	2019-12-05 14:41:18 -04:00
Joey Hess	b88f89c1ef	get the most commonly used commands building again A quick benchmark of whereis shows not much speed improvement, maybe a few percent. Profiling it found a hotspot, adds to todo.	2019-12-04 13:45:18 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agoncy of making it build), but still very unmergable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certianly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	ce48eb797c	make DropDead transition minimize remote.log for dead sameas remotes All that needs to be retained in remote.log is the sameas-uuid. The rest of the config is eliminated. This doesn't save enough space to bother with, but it prevents anything sensitive in the config of the dead sameas remote from lingering around. Note that minimizesameasdead does not update the VectorClock when changing the log line. That's normally a no-no, but in this case, it makes each DropDead result in the exact same file contents, and vector clocks are not needed because the transition breaks the history chain.	2019-10-15 11:39:25 -04:00
Joey Hess	78f522e7c8	forgot to add this new file	2019-10-14 16:09:04 -04:00
Joey Hess	5e9a2cc37f	forget state of sameas remotes during DropDead transitions It would have been a lot less round-about to just make git annex dead also add the uuids of sameas remotes to the trust.log as dead. But, that would fail in the case where there's an unmerged other clone that has a sameas remote that the current repo does not know about. Then it would not get marked as dead. Handling it at transition time avoids that scenario. Note that the generation of trustmap' in dropDead should only happen once, due to the partial application.	2019-10-14 15:47:42 -04:00
Joey Hess	9828f45d85	add RemoteStateHandle This solves the problem of sameas remotes trampling over per-remote state. Used for: * per-remote state, of course * per-remote metadata, also of course * per-remote content identifiers, because two remote implementations could in theory generate the same content identifier for two different peices of content While chunk logs are per-remote data, they don't use this, because the number and size of chunks stored is a common property across sameas remotes. External special remote had a complication, where it was theoretically possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or EXPORTSUPPORTED. Since the uuid of the remote is typically generate in Remote.setup, it would only be possible to pass a Maybe RemoteStateHandle into it, and it would otherwise have to construct its own. Rather than go that route, I decided to send an ERROR in this case. It seems unlikely that any existing external special remote will be affected. They would have to make up a git-annex key, and set state for some reason during INITREMOTE. I can imagine such a hack, but it doesn't seem worth complicating the code in such an ugly way to support it. Unfortunately, both TestRemote and Annex.Import needed the Remote to have a new field added that holds its RemoteStateHandle.	2019-10-14 13:51:42 -04:00
Joey Hess	91eed85fd4	add sameas inherited configs to newConfig This makes initremote --sameas work with encryption inherited.	2019-10-11 13:05:20 -04:00
Joey Hess	c3975ff3b4	sameas RemoteConfig inheritance I found a way to avoid inheritance complicating anything outside of Logs.Remote. It seems fine to require all inherited values to be inherited and not set in the sameas remote's config. Since inherited values will be used for stuff like encryption and perhaps chunking, which control the actual content stored on the remote, it seems likely that there will not be any reason to need them to vary between two remotes that access the same underlying data store. The newer version of containers is free; the minimum ghc version is bundled with a newer version than that.	2019-10-10 15:58:22 -04:00
Joey Hess	84e729fda5	fix init default description reversion init: Fix a reversion in the last release that prevented automatically generating and setting a description for the repository. Seemed best to factor out uuidDescMapRaw that does not have the default mempty descrition behavior. I don't much like that behavior, but I know things depend on it. One thing in particular is `git annex info` which lists the uuids and descriptions; if the current repo has been initialized in some way that means it does not have a description, it would not show up w/o that. (Not only repos created due to this bug might lack that. For example a repo that was marked dead and had --drop-dead delete its git-annex branch info, and then came back from the dead would similarly not be in the uuid.log. Also there have been other versions of git-annex that didn't set a default description; for years there was no default description.)	2019-06-20 20:30:24 -04:00
Joey Hess	258a7c5cd1	add Key to all ActionItem constructors	2019-06-06 12:53:24 -04:00
Joey Hess	b646a85c4a	oops, fixed wrong change	2019-05-23 12:44:27 -04:00
Joey Hess	e1aa8b9001	avoid build failure with older base	2019-05-23 12:16:57 -04:00
Joey Hess	16a2bed710	avoid build warning on Windows about unused import	2019-05-23 12:15:33 -04:00
Joey Hess	e06feb7316	honor preferred content when importing Importing from a special remote honors its preferred content too; unwanted files are not imported. But, some preferred content expressions can't be checked before files are imported, and trying to import with such an expression will fail. Tested this with scenarios including changing the preferred content expression and making sure merging the import didn't delete files that were no longer wanted. There was one minor inefficiency mentioned in the todo that I punted on.	2019-05-21 14:38:06 -04:00
Joey Hess	97fd9da6e7	add back non-preferred files to imported tree Prevents merging the import from deleting the non-preferred files from the branch it's merged into. adjustTree previously appended the new list of items to the old, which could result in it generating a tree with multiple files with the same name. That is not good and confuses some parts of git. Gave it a function to resolve such conflicts. That allowed dealing with the problem of what happens when the import contains some files (or subtrees) with the same name as files that were filtered out of the export. The files from the import win.	2019-05-20 16:43:52 -04:00
Joey Hess	354c0eb57f	support standard and groupwanted in keyless mode Only when the preferred content expression includes them will a parse failure due to them needing keys result in the preferred content expression not parsing in keyless mode.	2019-05-14 14:59:03 -04:00
Joey Hess	9411a7c93c	matching preferred content before key is known This will let import try to match preferred content expressions before downloading the content and generating its key. If an expression needs a key, it preferredContentParser with preferredContentKeylessTokens will fail to parse it. standard and groupwanted are not in preferredContentKeylessTokens because they may refer to an expression that refers to a key. That needs further work to support them.	2019-05-14 14:28:23 -04:00
Joey Hess	aa7710982b	avoid list lookup by parseToken Minor optimisation to parsing of a preferred content expression.	2019-05-14 13:11:29 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	ee251b2e2e	implement updating the ContentIdentifier db with info from the git-annex branch untested This won't be super slow, but it does need to diff two likely large trees, and since the git-annex branch rarely sits still, it will most likely be run at the beginning of every import. A possible speed improvement would be to only run this when the database did not contain a ContentIdentifier. But that would only speed up imports when there is no new version of a file on the special remote, at most renames of existing files being imported. A better speed improvement would be to record something in the git-annex branch that indicates when an import has been run, and only do the diff if the git-annex branch has record of a newer import than we've seen before. Then, it would only run when there is in fact new ContentIdentifier information available from a remote. Certianly doable, but didn't want to complicate things yet.	2019-03-06 18:04:30 -04:00
Joey Hess	1b3d04979e	speed up slow quickcheck test Only test parsing of ContentIdentifier lists, not the whole log.	2019-03-06 16:43:41 -04:00
Joey Hess	6ef38df881	fix another parser bug	2019-03-06 14:51:31 -04:00
Joey Hess	c0bd202147	fix failing test case An empty list of [ContentIdenfier] serialized to the same thing as a single ContentIdentifier "". Avoid this ambiguity by requiring the list be non-empty.	2019-03-06 14:27:15 -04:00
Joey Hess	b0fe4916b7	forgot to change list delimiter in parser	2019-03-06 11:59:42 -04:00
Joey Hess	7af55de83c	optimisation: use graftTree to remember the export branch Sped up git-annex export in repositories with lots of keys. Old method read whole git-annex branch tree into memory.	2019-02-22 11:16:22 -04:00
Joey Hess	56137ce0d2	use colon not space to delimit content identifier list InodeCache serializes to a value with spaces, and seems likely other things will too, and want to avoid unncessary base64 of content identifiers when possible.	2019-02-21 13:45:16 -04:00
Joey Hess	1f6339ade7	fix parsing of empty content identifier Seems very unlikely an empty content identifier would be used, but quickcheck found this bug.	2019-02-21 13:44:09 -04:00
Joey Hess	9887a378fe	renamings to make clean when old-format logs are being used	2019-02-21 13:43:44 -04:00
Joey Hess	936aee6a60	quickcheck property for parsing of content identifier logs	2019-02-21 13:17:43 -04:00
Joey Hess	5a294f0dd7	add Logs.ContentIdentifier	2019-02-20 17:22:56 -04:00
Joey Hess	a818bc5e73	add Database.ContentIdentifier Does not yet have a way to update with new information from the git-annex branch, which will be needed when multiple repos are importing from the same remote.	2019-02-20 16:59:10 -04:00
Joey Hess	e8bfc3640b	storing ContentIdentifier in the git-annex branch	2019-02-20 15:40:07 -04:00
Joey Hess	ad1d422dd7	fix false positive in export conflict detection Like the earlier fixed one in Command.Export, it occurred when the same tree was exported by multiple clones. Previous fix was incomplete since several other places looked at the list of exported trees to detect when there was an export conflict. Added a single unified function to avoid missing any places it needed to be fixed. This commit was sponsored by mo on Patreon.	2019-01-30 12:36:30 -04:00
Joey Hess	d913961b7b	avoid shadowing warning	2019-01-21 20:40:30 -04:00
Joey Hess	b79ef475ac	still fixing this sigh	2019-01-21 18:39:25 -04:00
Joey Hess	f603fc252d	fixed a different way this test case could fail in LANG=C	2019-01-21 16:46:40 -04:00
Joey Hess	0aea1a7b93	fix inverted logic	2019-01-21 15:41:35 -04:00
Joey Hess	e15d18058b	add back exclusion of whitespace and =	2019-01-21 12:38:00 -04:00
Joey Hess	928e08904d	avoid two test failures with LANG=C Log.Remote.prop_parse_show_Config failed on an input of fromList [("\28162","")] in LANG=C, encodeBS "\28162" == "\STX=", while in UTF-8 locale, encodeBS "\28162" == "\230\184\130". So in the C locale, the String that's the parsed Map key ends up being encoded differently than it was in the input Map. Logs.Presence.Pure.prop_parse_build_log was failing in LANG=C because the Arbitrary LogLine for some reason sometimes generated LogInfo values containing \n or \r, despite using suchThat to prevent that. I don't understand why at all, but switching the suchThat to filter the ByteString instead of the String before conversion with encodeBS somehow avoids the problem. Both of these suggest something wonky with encodeBS in LANG=C, but I think it's not a problem except for with test data generated by Arbitrary.	2019-01-18 13:33:48 -04:00
Joey Hess	d3ab5e626b	rename key2file and file2key What these generate is not really suitable to be used as a filename, which is why keyFile and fileKey further escape it. These are just serializing Keys. Also removed a quickcheck test that was very unlikely to test anything useful, since it relied on random chance creating something that looks like a serialized key. The other test is sufficient for testing what that was intended to test anyway.	2019-01-14 13:03:35 -04:00
Joey Hess	2eadb6cd68	convert transitions.log to attoparsec and bytestring-builder Not likely to be any speed gain here, but this completes porting every log file over. And, it let me get rid of code copied from ghc and modified, so simplifying the licensing.	2019-01-10 17:13:30 -04:00
Joey Hess	591e4b145f	convert old uuid-based log parsers to attoparsec This preserves the workaround for the old bug that caused NoUUID items to be stored in the log, prefixing log lines with " ". It's now handled implicitly, by using takeWhile1 (/= ' ') to get the uuid. There is a behavior change from the old parser, which split the value into words and then recombined it. That meant that "foo bar" and "foo\tbar" came out as "foo bar". That behavior was not documented, and seems surprising; it meant that after a git-annex describe here "foo bar", you wouldn't get that same string back out when git-annex displayed repo descriptions. Otoh, some other parsers relied on the old behavior, and the attoparsec rewrites had to deal with the issue themselves... For group.log, there are some edge cases around the user providing a group name with a leading or trailing space. The old parser would ignore such excess whitespace. The new parser does too, because the alternative is to refuse to parse something like " group1 group2 " due to excess whitespace, which would be even more confusing behavior. The only git-annex branch log file that is not converted to attoparsec and bytestring-builder now is transitions.log.	2019-01-10 16:34:20 -04:00
Joey Hess	66603d6f75	attoparsec parsers for all new-format uuid-based logs There should be some speed gains here, especially for chunk and remote state logs, which are queried once per key. Now only old-format uuid-based logs still need to be converted to attoparsec.	2019-01-10 13:30:36 -04:00
Joey Hess	7e54c215b4	Remove redundant endOfInput parseLogLines consumes till endOfInput	2019-01-10 12:42:59 -04:00
Joey Hess	6f66b53a30	newtype Group to ByteString This may speed up queries for things in groups, due to Eq and Ord being faster.	2019-01-09 15:05:49 -04:00
Joey Hess	2fef43dd71	convert all per-uuid log files to use Builder Mostly didn't push the ByteStrings down very deep, but all of these log files are not written to frequently at all, so slight remaining innefficiency doesn't matter. In Logs.UUID, removed the fixBadUUID code that cleaned up after a bug in git-annex versions 3.20111105-3.20111110. In the unlikely event that a repo was last touched by that ancient git-annex version, the descriptions of remotes would appear missing when used with this version of git-annex. That is such minor breakage, and so unlikely to still be a problem for any repos, that it was not worth forward-porting that code to ByteString.	2019-01-09 14:00:35 -04:00
Joey Hess	de4980ef85	simplify Show instance by deriving	2019-01-09 13:13:31 -04:00
Joey Hess	11f4d993b5	fix format of show instance It's only used to show test suite failures..	2019-01-09 13:12:25 -04:00
Joey Hess	2d46038754	converting more log files to use Builder Probably not any particular speedup in this, since most of these logs are not written to often. Possibly chunk log writing is sped up, but writes to chunk logs are interleaved with expensive data transfers to remotes, so unlikely to be a noticiable speedup.	2019-01-09 13:06:37 -04:00
Joey Hess	cb375977a6	follow-on changes from MetaData type changes Including writing and parsing the metadata log files with bytestring-builder and attoparsec.	2019-01-07 15:51:05 -04:00
Joey Hess	ef8ddaa713	attoparsec parser for presence logs	2019-01-03 15:27:29 -04:00
Joey Hess	bfc9039ead	convert git-annex branch access to ByteStrings and Builders Most of the individual logs are not converted yet, only presense logs have an efficient ByteString Builder implemented so far. The rest convert to and from String.	2019-01-03 13:21:48 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	894716512d	add a UUIDDesc type containing a ByteString Groundwork for handling uuid.log using ByteString	2019-01-01 16:17:54 -04:00
Joey Hess	9cc6d5549b	convert UUID from String to ByteString This should make == comparison of UUIDs somewhat faster, and perhaps a few other operations around maps of UUIDs etc. FromUUID/ToUUID are used to convert String, which is still used for all IO of UUIDs. Eventually the hope is those instances can be removed, and all git-annex branch log files etc use ByteString throughout, for a real speed improvement. Note the use of fromRawFilePath / toRawFilePath -- while a UUID usually contains only alphanumerics and so could be treated as ascii, it's conceivable that some git-annex repository has been initialized using a UUID that is not only not a canonical UUID, but contains high unicode or invalid unicode. Using the filesystem encoding avoids any problems with such a thing. However, a NUL in a UUID seems extremely unlikely, so I didn't use encodeBS / decodeBS to avoid their extra overhead in handling NULs. The Read/Show instance for UUID luckily serializes the same way for ByteString as it did for String.	2019-01-01 14:45:33 -04:00
Joey Hess	9127fe4821	add DebugLocks build flag Using the method described in https://www.fpcomplete.com/blog/2018/05/pinpointing-deadlocks-in-haskell but my own code to implement it, and with callstacks added. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-11-19 15:02:43 -04:00
Joey Hess	2e9f128dea	moved module and relicensed	2018-10-29 23:13:36 -04:00
Joey Hess	917a2c6095	defer updating unlocked files until after smudge filter The smuge filter no longer provides git with annexed file content, to avoid a git memory leak, and because that did not honor annex.thin. git annex smudge --update has to be run after a checkout to update unlocked files in the working tree with annexed file contents. No hooks yet to run it. This commit was sponsored by Nick Piper on Patreon.	2018-10-25 15:08:20 -04:00
Joey Hess	451171b7c1	clean up url removal presence update * rmurl: Fix a case where removing the last url left git-annex thinking content was still present in the web special remote. * SETURLPRESENT, SETURIPRESENT, SETURLMISSING, and SETURIMISSING used to update the presence information of the external special remote that called them; this was not documented behavior and is no longer done. Done by making setUrlPresent and setUrlMissing only update presence info for the web, and only when the url is a web url. See the comment for reasoning about why that's the right thing to do. In AddUrl, had to make it update location tracking, to handle the non-web-url case. This commit was sponsored by Ewen McNeill on Patreon.	2018-10-04 17:35:49 -04:00
Joey Hess	0a7c5a9982	dropdead per-remote metadata Had to refactor pure code into separate modules so it is accessible inside Annex.Branch.Transitions. This commit was sponsored by Peter on Patreon.	2018-09-05 13:52:46 -04:00
Joey Hess	b3d42283ad	use per-remote metadata storage for S3 version ID Since the same key can be stored in a versioned S3 bucket multiple times with different version IDs, this allows tracking them all. Not currently needed, but if we ever want to drop from a versioned S3 bucket, we'll need to know them all. This commit was supported by the NSF-funded DataLad project.	2018-08-31 13:27:29 -04:00
Joey Hess	5c99f6247e	per-remote metadata storage Actually very straightforward reuse of the metadata log file code. Although I had to add a todo item as git-annex forget won't clean up dead remote's metadata yet. This would be worth adding to the external special remote interface sometime. Have not opened a todo though, guess I'll wait until something needs it. This commit was supported by the NSF-funded DataLad project.	2018-08-31 12:23:22 -04:00
Joey Hess	358178fbfb	don't untrust appendonly exports Make exporttree=yes remotes that are appendonly not be untrusted, and not force verification of content, since the usual concerns about losing data when an export is updated by someone else don't apply. Note that all the remote operations on keys are left as usual for appendonly export remotes, except for storing content. This commit was supported by the NSF-funded DataLad project.	2018-08-30 11:48:04 -04:00
Joey Hess	10138056dc	v6: avoid accidental conversion when annex.largefiles is not configured v6: When annex.largefiles is not configured for a file, running git add or git commit, or otherwise using git to stage a file will add it to the annex if the file was in the annex before, and to git otherwise. This is to avoid accidental conversion. Note that git-annex add's behavior has not changed, for reasons explained in the added comment. Performance: No added overhead when annex.largefiles is configured. When not configured, there is an added call to catObjectMetaData, which involves a round trip through git cat-file --batch. However, the earlier catKeyFile primes the cache for it. This commit was supported by the NSF-funded DataLad project.	2018-08-27 14:51:10 -04:00
Joey Hess	ae11394efa	added annex.commitmessage Added annex.commitmessage config that can specify a commit message for the git-annex branch instead of the usual "update". This commit was supported by the NSF-funded DataLad project.	2018-08-02 14:06:06 -04:00
Joey Hess	2fc768ce72	avoid git annex info remote buffering list of keys This leaves git annex unused --from remote still using loggedKeysFor and buffering more than ought to be necessary, but I can't see a way to improve that.	2018-04-26 16:13:05 -04:00
Joey Hess	bea0ad220a	avoid --all buffering list of all keys In Annex.Branch.branch, the (++) was killing laziness. Rewrote so it streams lazily. filterM also kills laziness, so made loggedKeys use a Unchecked type, and check if the key is dead in the seek loop. Note that loggedKeysFor still buffers, so git-annex info <remote> and git-annex unused --from remote still use more memory than necessary. Also removed some unused functions from Annex.Journal.	2018-04-26 16:00:20 -04:00
Joey Hess	256d8f07e8	avoid insertWith' depreaction warning Switch to Data.Map.Strict everywhere that used it. There are still lots of lazy maps in git-annex. I think switching these is safe. The risk is that there might be a map that is used in a way that relies on the values not being evaluated to WHNF, and switching to strict might result in bad performance or memory use. So, I have not switched everything.	2018-04-22 13:28:31 -04:00
Joey Hess	f56594af9e	finish fixing inverted Ord for TrustLevel Flipped all comparisons. When a TrustLevel list was wanted from Trusted downwards, used Down to compare it in that order. This commit was sponsored by mo on Patreon.	2018-04-13 15:17:54 -04:00
Joey Hess	a0e4b9678b	fix inverted Ord for TrustLevel (intermediate commit) This commit removes the Ord and Enum instances, commenting out all code that depends on them, to make sure that all code effected by the inversion fix has been identified. (Assuming no ifdefs involve TrustLevel.) The next commit will fix up all the identified code.	2018-04-13 14:50:14 -04:00
Joey Hess	10d3b7fc62	Fix reversion introduced in 6.20171214 that caused concurrent transfers to incorrectly fail with "transfer already in progress". Avoid creating transfer info file before transfer lock is created and locked. The wrong order for one thing caused transfer info to be overwritten when a transfer was already in progress. But worse, it caused checkTransfer to see the transfer info, and so lock the transfer lock in order to verify the transfer was not in progress. Which in a concurrent situation, prevented the transferrer from locking the transfer lock, so it failed with "transfer already in progress". Note that the transferinfo command does not lock the transfer lock before creating the transfer info. But, that's only run after recvkey is running, and recvkey does lock the transfer lock, so that seems more or less ok. (Other than being a super complicated legacy mess that the P2P code has mostly obsoleted now.) This commit was supported by the NSF-funded DataLad project.	2018-03-14 18:55:34 -04:00
Joey Hess	faf03ee4da	more core.sharedRepository perm fixes Fix more places where files in .git/annex/ were written with modes that did not take the core.sharedRepository config into account. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-01-04 14:46:58 -04:00
Joey Hess	24df95f0f6	Fix several places where files in .git/annex/ were written with modes that did not take the core.sharedRepository config into account. git grep writeFile finds some more that might also be problems, but for now I've concentrated on .git/annex/ log files. There are certianly cases where writeFile is not a problem too. This commit was sponsored by mo on Patreon.	2018-01-02 17:25:25 -04:00
Joey Hess	edd25f04d9	unused: Write .git/annex/unused etc files with appropriate permissions for the core.sharedRepository config. This commit was sponsored by an anonymous bitcoin donor.	2018-01-02 16:25:27 -04:00
Joey Hess	4e38c4f57f	Allow exporttree remotes to be marked as dead. Union with max so that DeadTrusted wins over UnTrusted. This commit was sponsored by Trenton Cronholm on Patreon.	2017-12-05 13:46:55 -04:00
Joey Hess	3febb79c8f	wip	2017-11-28 17:17:40 -04:00
Joey Hess	7ad1d3210f	remove dead code	2017-11-07 14:18:10 -04:00
Joey Hess	e1ac299ad0	better dup key with -J fix This avoids all the complication about redundant work discussed in the previous try at fixing this. At the expense of needing each command that could have the problem to be patched to simply wrap the action in onlyActionOn once the key is known. But there do not seem to be many such commands. onlyActionOn' should not be used with a CommandStart (or CommandPerform), although the types do allow it. onlyActionOn handles running the whole CommandStart chain. I couldn't immediately see a way to avoid mistken use of onlyActionOn'. This commit was supported by the NSF-funded DataLad project.	2017-10-17 18:48:53 -04:00
Joey Hess	68a49adcda	Improve behavior when -J transfers multiple files that point to the same key After a false start, I found a fairly non-intrusive way to deal with it. Although it only handles transfers -- there may be issues with eg concurrent dropping of the same key, or other operations. There is no added overhead when -J is not used, other than an added inAnnex check. When -J is used, it has to maintain and check a small Set, which should be negligible overhead. It could output some message saying that the transfer is being done by another thread. Or it could even display the same progress info for both files that are being downloaded since they have the same content. But I opted to keep it simple, since this is rather an edge case, so it just doesn't say anything about the transfer of the file until the other thread finishes. Since the deferred transfer action still runs, actions that do more than transfer content will still get a chance to do their other work. (An example of something that needs to do such other work is P2P.Annex, where the download always needs to receive the content from the peer.) And, if the first thread fails to complete a transfer, the second thread can resume it. But, this unfortunately means that there's a risk of redundant work being done to transfer a key that just got transferred. That's not ideal, but should never cause breakage; the same thing can occur when running two separate git-annex processes. The get/move/copy/mirror --from commands had extra inAnnex checks added, inside the download actions. Without those checks, the first thread downloaded the content, and then the second thread woke up and downloaded the same content redundantly. move/copy/mirror --to is left doing redundant uploads for now. It would need a second checkPresent of the remote inside the upload to avoid them, which would be expensive. A better way to avoid redundant work needs to be found.. This commit was supported by the NSF-funded DataLad project.	2017-10-17 17:10:50 -04:00
Joey Hess	099bfc2daf	remove redundant definition of lck	2017-10-03 13:23:10 -04:00
Joey Hess	3cd47f9978	info: Improve cleanup of stale transfer info files. In my git-annex repos, I found some stale transfer info files without lock files. Pass a mode to tryLockExclusive, so it will create the lock file if not present, and so not fail to clean up such transfer info files. Normally, transfer info files are accompanied by a lock file. But, when alwaysRunTransfer is used, the locking can fail and it will still write the transfer info file. Perhaps there are other cases too? Note that mkProgressUpdater's meter writes to the transfer info file too, and it might be possible for that meter to fire after runTransfer has cleaned up. This commit was sponsored by andrea rota.	2017-10-02 13:55:26 -04:00
Joey Hess	4d0e522b72	Warn when metadata is inherited from a previous version of a file to avoid the user being surprised in cases where that behavior is not desired or expected This commit was supported by the NSF-funded DataLad project.	2017-09-28 12:56:35 -04:00
Joey Hess	b31d682539	cleanup imports	2017-09-13 12:53:19 -04:00

1 2 3 4 5 ...

477 commits