git-annex

Author	SHA1	Message	Date
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agony of making it build), but still very unmergeable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certainly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	b8ef1bf3be	Fix find --json to output json once more. Reversion from commit `436f10771`, CustomOutput was forcing quiet output which overrode the json setting. find happened to be the only command that uses CustomOutput and also outputs json. (metadata --get does also use CustomOutput and --json does not enable json output for that, which may be an oversight, but was already the behavior before this regression.)	2019-07-05 09:58:37 -04:00
Joey Hess	ba2551da6f	add startingNoMessage Fixes the last wart in the StartMessage transition. A few commands include other CommandStart actions that generate output, and do not themselves need to display a start/end message.	2019-06-12 14:11:23 -04:00
Joey Hess	70bc30acb1	get rid of implicitMessages state Oh joyous day, this is probably git-annex's oldest implementation wart, source of much unnecessary bother. Now that we have a StartMessage, showEndResult' can look at it to know if it needs to display an end message or not. This is also going to be faster, because it avoids an unnecessary state lookup for each file processed.	2019-06-12 14:01:41 -04:00
Joey Hess	8e5ea28c26	finish CommandStart transition The hoped for optimisation of CommandStart with -J did not materialize. In fact, not running CommandStart in parallel is slower than -J3. So, CommandStart are still run in parallel. (The actual bad performance I've been seeing with -J in my big repo has to do with building the remoteList.) But, this is still progress toward making -J faster, because it gets rid of the onlyActionOn roadblock in the way of making CommandCleanup jobs run separate from CommandPerform jobs. Added OnlyActionOn constructor for ActionItem which fixes the onlyActionOn breakage in the last commit. Made CustomOutput include an ActionItem, so even things using it can specify OnlyActionOn. In Command.Move and Command.Sync, there were CommandStarts that used includeCommandAction, so output messages, which is no longer allowed. Fixed by using startingCustomOutput, but that's still not quite right, since it prevents message display for the includeCommandAction run inside it too.	2019-06-12 13:24:01 -04:00
Joey Hess	436f107715	make CommandStart return a StartMessage The goal is to be able to run CommandStart in the main thread when -J is used, rather than unnecessarily passing it off to a worker thread, which incurs overhead that is significant when the CommandStart is going to quickly decide to stop. To do that, the message it displays needs to be displayed in the worker thread, after the CommandStart has run. Also, the change will mean that CommandStart will no longer necessarily run with the same Annex state as CommandPerform. While its docs already said it should avoid modifying Annex state, I audited all the CommandStart code as part of the conversion. (Note that CommandSeek already sometimes runs with a different Annex state, and that has not been a source of any problems, so I am not too worried that this change will lead to breakage going forward.) The only modification of Annex state I found was it calling allowMessages in some Commands that default to noMessages. Dealt with that by adding a startCustomOutput and a startingUsualMessages. This lets a command start with noMessages and then select the output it wants for each CommandStart. One bit of breakage: onlyActionOn has been removed from commands that used it. The plan is that, since a StartMessage contains an ActionItem, when a Key can be extracted from that, the parallel job runner can run onlyActionOn' automatically. Then commands won't need to worry about this detail. Future work. Otherwise, this was a fairly straightforward process of making each CommandStart compile again. Hopefully other behavior changes were mostly avoided. In a few cases, a command had a CommandStart that called a CommandPerform that then called showStart multiple times. I have collapsed those down to a single start action. The main command to perhaps suffer from it is Command.Direct, which used to show a start for each file, and no longer does.
Another minor behavior change is that some commands used showStart before, but had an associated file and a Key available, so were changed to ShowStart with an ActionItemAssociatedFile. That will not change the normal output or behavior, but --json output will now include the key. This should not break it for anyone using a real json parser.	2019-06-06 17:13:54 -04:00
Joey Hess	258a7c5cd1	add Key to all ActionItem constructors	2019-06-06 12:53:24 -04:00
Joey Hess	82186ca58f	annex.jobs=cpus etc Added the ability to run one job per CPU (core), by setting annex.jobs=cpus, or using option --jobs=cpus or -Jcpus. Built with future expansion in mind, including not defaulting matching on Concurrency so more constructors can later be added, and using "cpu" instead of "0".	2019-05-10 13:27:08 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of source files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	9127fe4821	add DebugLocks build flag Using the method described in https://www.fpcomplete.com/blog/2018/05/pinpointing-deadlocks-in-haskell but my own code to implement it, and with callstacks added. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-11-19 15:02:43 -04:00
Joey Hess	872af2b2f1	avoid using concurrent-output at all when --quiet or --json Of course, it wasn't used much in those modes, because normal output is avoided. But it was still initialized and used in a few places, including a call to hideRegionsWhile.	2018-11-15 14:26:40 -04:00
Joey Hess	b2bafdb2fc	v6: Fix database inconsistency That could cause git-annex to get confused about whether a locked file's content was present, when the object file got touched. Unfortunately this means more work sometimes when annex.thin is set, since it has to checksum the file to tell if it's still got the right content. Had to suppress output when inAnnex calls isUnmodified, otherwise "(checksum...)" would be printed in places it ought not to be, eg "git annex get" could turn out not to need to get anything, and so only display that. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-10-16 13:51:37 -04:00
Joey Hess	7d9f0e0fbe	Added INFO to external special remote protocol. It's left up to the special remote to detect when git-annex is new enough to support the message; an old git-annex will blow up. This commit was supported by the NSF-funded DataLad project.	2018-02-06 13:03:55 -04:00
Joey Hess	4781ca297b	showStart variant for when there's no worktree file Clean up some uses of showStart with "" for the file, or in some cases, a non-filename description string. That would generate bad json, although none of the commands doing that supported --json. Using "" for the file resulted in output like "foo rest"; now the extra space is eliminated. This commit was sponsored by Fernando Jimenez on Patreon.	2017-11-28 15:14:16 -04:00
Joey Hess	1d45e47e3f	clear regions before ssh prompt When built with concurrent-output 1.9, ssh password prompts will no longer interfere with the -J display. To avoid flicker, only done when ssh actually does need to prompt; ssh is first run in batch mode and if that succeeds the connection is up and no need to clear regions. This commit was supported by the NSF-funded DataLad project.	2017-05-16 15:50:11 -04:00
Joey Hess	2c6cfbe503	also serialize ssh password prompting when json or quiet output is enabled	2017-05-13 13:13:13 -04:00
Joey Hess	6992fe133b	Ssh password prompting improved when using -J When ssh connection caching is enabled (and when GIT_ANNEX_USE_GIT_SSH is not set), only one ssh password prompt will be made per host, and only one ssh password prompt will be made at a time. This also fixes a race in prepSocket's stale ssh connection stopping when run with -J. It was possible for one thread to start a cached ssh connection, and another thread to immediately stop it, resulting in excess connections being made. This commit was supported by the NSF-funded DataLad project.	2017-05-11 17:36:03 -04:00
Joey Hess	8484c0c197	Always use filesystem encoding for all file and handle reads and writes. This is a big scary change. I have convinced myself it should be safe. I hope!	2016-12-24 14:46:31 -04:00
Joey Hess	d7ea6a5684	drop incremental json object display; clean up code This gets rid of quite a lot of ugly hacks around json generation. I doubt that any real-world json parsers can parse incomplete objects, so while it's not as nice to need to wait for the complete object, especially for commands like `git annex info` that take a while, it doesn't seem worth the added complexity. This also causes the order of fields within the json objects to be reordered. Since any real json parser shouldn't care, the only possible problem would be with ad-hoc parsers of the old json output.	2016-09-09 18:13:55 -04:00
Joey Hess	a108235565	better locking for json with -J Avoid threads emitting json at the same time and scrambling, which was still possible even with the buffering, just less likely. Converted json IO actions to JSONChunk data too.	2016-09-09 15:51:34 -04:00
Joey Hess	05d4438383	addurl, get: Added --json-progress option, which adds progress objects to the json output. This doesn't work right when used with -J yet, and there is some really ugly hand-crafting of part of the json output.	2016-09-09 15:06:54 -04:00
Joey Hess	4a09b4bbbd	make maybeShowJSON also add to the buffer	2016-09-09 14:21:06 -04:00
Joey Hess	089c592977	buffer json output until done when in concurrent mode	2016-09-09 13:21:38 -04:00
Joey Hess	8ef494a833	disentangle concurrency and message type This makes -Jn work with --json and --quiet, where before setting -Jn disabled those options. Concurrent json output is currently a mess though since threads output chunks over top of one-another.	2016-09-09 12:57:42 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	bf3327ff25	Added metadata --batch option, which allows getting, setting, deleting, and modifying metadata for multiple files/keys.	2016-07-27 10:46:25 -04:00
Joey Hess	a030d0a8b7	allow using Aeson for streaming JSON output Keeping Text.JSON use for now, because it seems a better fit for most of the commands, which don't use very structured JSON objects, but just output whatever fields suit them. But this lets Aeson be used when a more structured data type is available to serialize to JSON.	2016-07-26 13:30:07 -04:00
Joey Hess	d13194b230	--branch, stage 2 Show branch:file that is being operated on. I had to make ActionItem a type and not a type class because withKeyOptions' passed two different types of values when using the type class, and I could not get the type checker to accept that.	2016-07-20 15:23:43 -04:00
Joey Hess	847944e6b1	more generic showStart'	2016-07-20 14:03:54 -04:00
Joey Hess	acf74ae945	improve json when showStart' is given only a key Before, the json contained file:key; change that to key: If a file and a key are given, include both file: and key:	2016-03-06 12:57:24 -04:00
Joey Hess	0f18636c8a	Work around problem with concurrent-output when in a non-unicode locale by avoiding use of it in such a locale. Instead -J will behave as if it was built without concurrent-output support in this situation. Ie, it will be mostly quiet, except when there's an error. Note that it's not a problem for a filename to contain invalid utf-8 when in a utf-8 locale. That is handled ok by concurrent-output. It's only displaying unicode characters in a non-unicode locale that doesn't work.	2016-02-14 15:02:42 -04:00
Joey Hess	70b8cad9c8	make noMessages disable closing of json object in --json mode This allows things like Command.Find to use noMessages and generate their own complete json objects. Previously, Command.Find managed that only via a hack, which wasn't compatible with batch mode. Only Command.Find, Command.Smudge, and Command.Status use noMessages currently, and none except for Command.Find are impacted by this change. Fixes find --json --batch output	2016-01-20 14:10:13 -04:00
Joey Hess	249f7f4801	Force output to be line-buffered, even when it's not connected to the terminal. This is particularly important for commands with --batch output, which was not always being flushed at an appropriate time.	2016-01-18 13:01:23 -04:00
Joey Hess	b96cfdc094	whereis --json: Make url list be included in machine-parseable form.	2016-01-06 12:33:32 -04:00
Joey Hess	4224fae71f	optimise read and write for Keys database (untested) Writes are optimised by queueing up multiple writes when possible. The queue is flushed after the Annex monad action finishes. That makes it happen on program termination, and also whenever a nested Annex monad action finishes. Reads are optimised by checking once (per AnnexState) if the database exists. If the database doesn't exist yet, all reads return mempty. Reads also cause queued writes to be flushed, so reads will always be consistent with writes (as long as they're made inside the same Annex monad). A future optimisation path would be to determine when that's not necessary, which is probably most of the time, and avoid flushing unnecessarily. Design notes for this commit: - separate reads from writes - reuse a handle which is left open until program exit or until the MVar goes out of scope (and autoclosed then) - writes are queued - queue is flushed periodically - immediate queue flush before any read - auto-flush queue when database handle is garbage collected - flush queue on exit from Annex monad (Note that this may happen repeatedly for a single database connection; or a connection may be reused for multiple Annex monad actions, possibly even concurrent ones.) - if database does not exist (or is empty) the handle is not opened by reads; reads instead return empty results - writes open the handle if it was not open previously	2015-12-23 19:18:52 -04:00
Joey Hess	4b02af57b6	display a message in the unlikely scenario of fscking a dead repository	2015-11-10 14:44:58 -04:00
Joey Hess	468e52fbe3	add back missing newline to showRaw	2015-11-10 14:06:07 -04:00
Joey Hess	4fd03ccd7b	concurrent-output, first pass Output without -Jn should be unchanged from before. With -Jn, concurrent-output is used for messages, but regions are not used yet, so it's a mess.	2015-11-04 13:45:34 -04:00
Joey Hess	4ed82e5328	fsck: Work around bug in persistent that broke display of problematically encoded filenames on stderr when using --incremental.	2015-09-09 17:02:00 -04:00
Joey Hess	43aa881b47	--debug is passed along to git-annex-shell when git-annex is in debug mode.	2015-08-13 15:05:39 -04:00
Joey Hess	7584e47ba3	--debug log messages are now timestamped with fractional seconds.	2015-08-12 14:42:49 -04:00
Joey Hess	e27b97d364	Merge branch 'master' into concurrentprogress Conflicts: Command/Fsck.hs Messages.hs Remote/Directory.hs Remote/Git.hs Remote/Helper/Special.hs Types/Remote.hs debian/changelog git-annex.cabal	2015-05-12 13:23:22 -04:00
Joey Hess	efb37e7c78	Improve behavior when a git-annex command is told to operate on a file that doesn't exist. It will now continue to other files specified after that on the command line, and only error out at the end.	2015-04-30 15:28:17 -04:00
Joey Hess	5e9f0a3493	reuse strings	2015-04-14 16:46:06 -04:00
Joey Hess	f8e700ed06	use built-in progress meters for git when in parallel mode	2015-04-10 15:15:21 -04:00
Joey Hess	2343f99c85	well along the way to fully quiet --quiet Came up with a generic way to filter out progress messages while keeping errors, for commands that use stderr for both. --json mode will disable command outputs too.	2015-04-04 14:34:03 -04:00
Joey Hess	20fb91a7ad	WIP on making --quiet silence progress, and infra for concurrent progress bars	2015-04-03 16:48:30 -04:00
Joey Hess	bc0180da83	rename showProgress -> showProgressDots	2015-04-03 13:51:32 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	0ee09f05d2	lower case for consistency	2015-01-10 13:41:25 -04:00


110 commits