git-annex

Author	SHA1	Message	Date
Joey Hess	eb42cd4d46	more RawFilePath conversion 535/645 This commit was sponsored by Brett Eisenberg on Patreon.	2020-11-03 10:11:04 -04:00
Joey Hess	3a05d53761	add SeekInput (not yet used) No behavior changes (hopefully), just adding SeekInput and plumbing it through to the JSON display code for later use. Over the course of 2 grueling days. withFilesNotInGit reimplemented in terms of seekHelper should be the only possible behavior change. It seems to test as behaving the same. Note that seekHelper dummies up the SeekInput in the case where segmentPaths' gives up on sorting the expanded paths because there are too many input paths. When SeekInput later gets exposed as a json field, that will result in it being a little bit wrong in the case where 100 or more paths are passed to a git-annex command. I think this is a subtle enough problem to not matter. If it does turn out to be a problem, fixing it would require splitting up the input parameters into groups of < 100, which would make git ls-files run perhaps more than is necessary. May want to revisit this, because that fix seems fairly low-impact.	2020-09-15 15:41:13 -04:00
Joey Hess	d732ef1a89	move, copy: Sped up seeking for annexed files to operate on by a factor of nearly 2x.	2020-07-24 12:56:02 -04:00
Joey Hess	00865cdae8	Fix a bug in find --branch in the previous version inAnnex check was lost for that code path. To avoid more such mistakes, made withKeyOptions check it when the AnnexedFileSeeker specifies.	2020-07-24 12:05:28 -04:00
Joey Hess	1be92381ec	unify batch mode with non-batch by using AnnexedFileSeeker	2020-07-22 14:23:28 -04:00
Joey Hess	75aab72d23	mostly done with location log precaching Some nice wins.	2020-07-13 17:04:02 -04:00
Joey Hess	88a7fb5cbb	convert all applicable commands to new 2x faster annexed file seeking This removes all calls to inAnnex, except for some involving --batch. It may be that the batch code could get a similar speedup, but I don't know if people habitually pass a huge number of files through --batch that git-annex does not need to do anything to process, so I skipped it for now. A few calls to ifAnnexed remain, and might be worth doing more to convert. In particular, Command.Sync has one that would probably speed it up by a good amount. (also removed some dead code from Command.Lock)	2020-07-10 15:45:38 -04:00
Joey Hess	89b2542d3c	annex.skipunknown with transition plan Added annex.skipunknown git config, that can be set to false to change the behavior of commands like `git annex get foo*`, to not skip over files/dirs that are not checked into git and are explicitly listed in the command line. Significant complexity was needed to handle git-annex add, which uses some git ls-files calls, but needs to not use --error-unmatch because of course the files are not known to git. annex.skipunknown is planned to change to default to false in a git-annex release in early 2022. There's a todo for that.	2020-05-28 15:55:17 -04:00
Joey Hess	b88f89c1ef	get the most commonly used commands building again A quick benchmark of whereis shows not much speed improvement, maybe a few percent. Profiling it found a hotspot, adds to todo.	2019-12-04 13:45:18 -04:00
Joey Hess	53882ab4a7	make WorkerStage an open type Rather than limiting it to PerformStage and CleanupStage, this opens it up so any number of stages can be added as needed by commands. Each concurrent command has a set of stages that it uses, and only transitions between those can block waiting for a free slot in the worker pool. Calling enteringStage for some other stage does not block, and has very little overhead. Note that while before the Annex state was duplicated on the first call to commandAction, this now happens earlier, in startConcurrency. That means that seek stage actions that use startConcurrency and then modify Annex state won't modify the state of worker threads they then start. I audited all of them, and only Command.Seek did so; prepMerge changes the working directory and so has to come before startConcurrency. Also, the remote list is built before duplicating the state, which means that it gets built earlier now than it used to. This would only have the effect of making commands that end up not needing to perform any actions unnecessarily build the remote list (only when they're run with concurrency enabled), but that's a minor overhead compared to commands seeking through the work tree and determining they don't need to do anything.	2019-06-19 13:05:03 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of source files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	53526136e8	move commandAction out of CmdLine.Seek This is groundwork for nested seek loops, eg seeking over all files and then performing commandActions on a list of remotes, which can be done concurrently. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-10-01 14:12:06 -04:00
Joey Hess	1d1054faa6	added -z Added -z option to git-annex commands that use --batch, useful for supporting filenames containing newlines. It only controls input to --batch, the output will still be line delimited unless --json or etc is used to get some other output. While git often makes -z affect both input and output, I don't like tying them together, and making it affect output would have been a significant complication, and also git-annex output is generally not intended to be machine parsed, unless using --json or a format option. Commands that take pairs like "file key" still separate them with a space in --batch mode. All such commands take care to support filenames with spaces when parsing that, so there was no need to change it, and it would have needed significant changes to the batch machinery to separate those with a null. To make fromkey and registerurl support -z, I had to give them a --batch option. The implicit batch mode they enter when not provided with input parameters does not support -z as that would have complicated option parsing. Seemed better to move these toward using the same --batch as everything else, though the implicit batch mode can still be used. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-09-20 16:11:47 -04:00
Joey Hess	12460fcea6	make --batch honor matching options When --batch is used with matching options like --in, --metadata, etc, only operate on the provided files when they match those options. Otherwise, a blank line is output in the batch protocol. Affected commands: find, add, whereis, drop, copy, move, get In the case of find, the documentation for --batch already said it honored the matching options. The docs for the rest didn't, but it makes sense to have them honor them. While this is a behavior change, why specify the matching options with --batch if you didn't want them to apply? Note that the batch output for all of the affected commands could already output a blank line in other cases, so batch users should already be prepared to deal with it. git-annex metadata didn't seem worth making support the matching options, since all it does is output metadata or set metadata, the use cases for using it in combination with the matching options seem small. Made it refuse to run when they're combined, leaving open the possibility for later support if a use case develops. This commit was sponsored by Brett Eisenberg on Patreon.	2018-08-08 12:07:06 -04:00
Joey Hess	af8546990d	move: --safe/--unsafe and potential drop race fix move: Added --safe option, which makes move honor numcopies settings. Also --unsafe enables the default behavior, anticipating that the default may one day change. This commit was sponsored by Ethan Aubin.	2018-04-09 16:20:10 -04:00
Joey Hess	ae530f043e	disentangle copy and move option parsing	2018-04-09 14:38:46 -04:00
Joey Hess	6583448bab	add --json-error-messages (not yet implemented) Added --json-error-messages option, which includes error messages in the json output, rather than outputting them to stderr. The actual redirection of errors is not implemented yet, this is only the docs and option plumbing. This commit was supported by the NSF-funded DataLad project.	2018-02-19 14:32:15 -04:00
Joey Hess	85ed38a574	Avoid repeated checking that files passed on the command line exist. git annex add, git annex lock etc make multiple seek passes, and each seek pass checked that files existed. That was unnecessary redundant work. Fixed by adding a new WorkTreeItem type, make seek actions use it, and check that the files exist when constructing it. This commit was supported by the NSF-funded DataLad project.	2017-10-16 14:10:20 -04:00
Joey Hess	2eb6309d3e	move, copy: Support --batch.	2017-08-15 12:39:10 -04:00
Joey Hess	bb18026b2c	move --to=here * move --to=here moves from all reachable remotes to the local repository. The output of move --from remote is changed slightly, when the remote and local both have the content. It used to say: move foo ok Now: move foo (from theremote...) ok That was done so that, when move --to=here is used and the content is locally present and also in several remotes, it's clear which remotes the content gets dropped from. Note that move --to=here will report an error if a non-reachable remote contains the file, even if the local repository also contains the file. I think that's reasonable; the user may be intending to move all other copies of the file from remotes. OTOH, if a copy of the file is believed to be present in some repository that is not a configured remote, move --to=here does not report an error. So a little bit inconsistent, but erroring in this case feels wrong. copy --to=here came along for free, but it's basically the same behavior as git-annex get, and probably with not as good messages in edge cases (especially on failure), so I've not documented it. This commit was sponsored by Anthony DeRobertis on Patreon.	2017-05-31 17:00:18 -04:00
Joey Hess	5ee6912cf3	support parsing options like --to=here Reworked remote name parsing to allow things like that. Command.Move uses it for --to=here, although there's not yet an implementation of that option. This commit was sponsored by Ignacio on Patreon.	2017-05-31 16:49:28 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	3e22d60549	copy, move, mirror: Support --json and --json-progress.	2016-09-09 16:24:26 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	8ea594f565	missed adding allowConcurrentOutput here	2015-11-06 13:41:26 -04:00
Joey Hess	1ac79e6fe5	copy --auto was checking the wrong repo's preferred content. (--from was checking what --to should, and vice-versa.) Fixed this bug, which was introduced in version 5.20150727.	2015-10-06 17:29:44 -04:00
Joey Hess	b7a5d9c3e1	The last release accidentally removed a number of options from the copy command. (-J, file matching options, etc). These have been added back.	2015-07-30 13:33:35 -04:00
Joey Hess	a7f58634b8	wip	2015-07-09 16:05:45 -04:00
Joey Hess	8ad927dbc6	converted copy and move Got a little tricky..	2015-07-09 15:23:14 -04:00
Joey Hess	6e5c1f8db3	convert all commands to work with optparse-applicative Still no options though.	2015-07-08 15:08:02 -04:00
Joey Hess	a2ba701056	started converting to use optparse-applicative This is a work in progress. It compiles and is able to do basic command dispatch, including git autocorrection, while using optparse-applicative for the core commandline parsing. * Many commands are temporarily disabled before conversion. * Options are not wired in yet. * cmdnorepo actions don't work yet. Also, removed the [Command] list, which was only used in one place.	2015-07-08 13:36:25 -04:00
Joey Hess	38c458b407	refactor	2015-04-30 14:02:56 -04:00
Joey Hess	cd6b62f35e	--auto is no longer a global option; only get, drop, and copy accept it. Not a behavior change unless you were passing it to a command that ignored it.	2015-03-25 17:06:14 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	59f88558d5	don't use "def" for command definitions, it conflicts with Data.Default.def	2014-10-14 14:20:10 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	86ffeb73d1	reorganize some files and imports	2014-01-26 16:25:55 -04:00
Joey Hess	3518c586cf	fix transfers of key with no associated file Several places assumed this would not happen, and when the AssociatedFile was Nothing, did nothing. As part of this, preferred content checks pass the Key around. Note that checkMatcher is sometimes now called with Just Key and Just File. It currently constructs a FileMatcher, ignoring the Key. However, if it constructed a FileKeyMatcher, which contained both, then it might be possible to speed up parts of Limit, which currently call the somewhat expensive lookupFileKey to get the Key. I have not made this optimisation yet, because I am not sure if the key is always the same. Will need some significant checking to satisfy myself that's the case..	2014-01-23 16:44:02 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	0f921307e7	mirror: New command, makes two repositories contain the same set of files. This is a simple approach for setting up a mirroring repository. It will work with any type of remotes. Mirror --from is more expensive than mirror --to in general. OTOH, mirror --from will get the file from any remote that has it, not only the named mirror remote. And if the named mirror remote is not the fastest available remote with a file, that can speed things up. It would be possible to make the assistant or watch command do a more dynamic mirroring, that didn't need to scan every time.	2013-08-20 15:46:35 -04:00
Joey Hess	04d07f2c1f	--unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy.	2013-07-03 15:26:59 -04:00
Joey Hess	b337a8b4c7	--all for get, move, and copy	2013-07-03 13:55:50 -04:00
Joey Hess	cfd3b16fe1	add section metadata to all commands Not yet used .. mindless train work.	2013-03-24 18:28:21 -04:00
Joey Hess	921f29c004	two types of byName Clean up from `9769235d6b`. In some cases, looking up a remote by name even though it has no UUID is desirable. This includes git annex sync, which can operate on remotes without an annex, and XMPP pairing, which runs addRemote (which calls byName) before the UUID of the XMPP remote has been configured in git.	2013-03-05 15:43:56 -04:00
Joey Hess	b68eee625f	More commands work in direct mode repositories: find, whereis, move, copy, drop, log. These started working, for free, once lookupFile supported direct mode. yay!!	2013-01-05 17:17:04 -04:00
Joey Hess	2ce736ac50	block all commands that don't work in direct mode I left status working in direct mode, although it doesn't show correct stats for known annex keys.	2012-12-29 14:28:19 -04:00
Joey Hess	99a8a5297c	--auto fixes * get/copy --auto: Transfer data even if it would exceed numcopies, when preferred content settings want it. * drop --auto: Fix dropping content when there are no preferred content settings.	2012-12-06 13:22:16 -04:00


70 commits