git-annex

Author	SHA1	Message	Date
Joey Hess	9d36c826c0	use fine-grained WorkerStages when transferring and verifying This means that Command.Move and Command.Get don't need to manually set the stage, and is a lot cleaner conceptually. Also, this makes Command.Sync.syncFile use the worker pool better. In the scenario where it first downloads content and then uploads it to some other remotes, it will start in TransferStage, then enter VerifyStage and then go back to TransferStage for each transfer to the remotes. Before, it entered CleanupStage after the download, and stayed in it for the upload, so too many transfer jobs could run at the same time. Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent inside onLocal. That has an Annex state for the remote, with no worker pool. So the resulting calls to enteringStage won't block in there. While Remote.Git.copyToRemote does do checksum verification, I realized that should not use a verification slot in the WorkerPool to do it. Because, it's reading back from eg, a removable disk to checksum. That will contend with other writes to that disk. It's best to treat that checksum verification as just part of the transfer. So, removed the todo item about that, as there's nothing needing to be done.	2019-06-19 13:24:20 -04:00
Joey Hess	e19408ed9d	Merge branch 'master' of ssh://git-annex.branchable.com	2019-06-17 15:26:57 -04:00
Joey Hess	04cc470201	run download checksum verification in separate job pool get, move, copy, sync: When -J or annex.jobs has enabled concurrency, checksum verification uses a separate job pool than is used for downloads, to keep bandwidth saturated. Not yet done for upload checksum verification, but that only affects remotes on local disks.	2019-06-17 14:58:02 -04:00
Joey Hess	1a8d06d251	thought	2019-06-17 11:50:18 -04:00
jsag@f84637fe752e0235291a118b1cd007bafad0997e	ae9f2d5e6a		2019-06-17 12:43:17 +00:00
Joey Hess	e589a9b3fc	moving this to a bug	2019-06-12 15:00:14 -04:00
Joey Hess	8e5ea28c26	finish CommandStart transition The hoped for optimisation of CommandStart with -J did not materialize. In fact, not running CommandStart in parallel is slower than -J3. So, CommandStart are still run in parallel. (The actual bad performance I've been seeing with -J in my big repo has to do with building the remoteList.) But, this is still progress toward making -J faster, because it gets rid of the onlyActionOn roadblock in the way of making CommandCleanup jobs run separate from CommandPerform jobs. Added OnlyActionOn constructor for ActionItem which fixes the onlyActionOn breakage in the last commit. Made CustomOutput include an ActionItem, so even things using it can specify OnlyActionOn. In Command.Move and Command.Sync, there were CommandStarts that used includeCommandAction, so output messages, which is no longer allowed. Fixed by using startingCustomOutput, but that's still not quite right, since it prevents message display for the includeCommandAction run inside it too.	2019-06-12 13:24:01 -04:00
Joey Hess	3893d84764	todo	2019-06-06 12:02:27 -04:00
Joey Hess	3eac4e01a4	idea	2019-06-05 19:43:01 -04:00
Joey Hess	659640e224	separate queue for cleanup actions When running multiple concurrent actions, the cleanup phase is run in a separate queue than the main action queue. This can make some commands faster, because less time is spent on bookkeeping in between each file transfer. But as far as I can see, nothing will be sped up much by this yet, because all the existing cleanup actions are very light-weight. This is just groundwork for deferring checksum verification to cleanup time. This change does mean that if the user expects -J2 will mean that they see no more than 2 jobs running at a time, they may be surprised to see 4 in some cases (if the cleanup actions are slow enough to notice). It might also make sense to enable background cleanup without the -J, for at least one cleanup action. Indeed, that's the behavior that -J1 has now. At some point in the future, it may make sense to make the behavior with no -J the same as -J1. The only reason it's not currently is that git-annex can build w/o concurrent-output, and also any bugs in concurrent-output (such as perhaps misbehaving on non-VT100 compatible terminals) are avoided by default by only using it when -J is used.	2019-06-05 17:54:35 -04:00
Joey Hess	7dcc815c29	more thoughts	2019-06-04 14:38:55 -04:00
Joey Hess	cd20dc4158	thoughts	2019-06-04 14:13:15 -04:00
Joey Hess	cd5e8be2dc	comment	2019-05-23 13:36:59 -04:00
Joey Hess	e06feb7316	honor preferred content when importing Importing from a special remote honors its preferred content too; unwanted files are not imported. But, some preferred content expressions can't be checked before files are imported, and trying to import with such an expression will fail. Tested this with scenarios including changing the preferred content expression and making sure merging the import didn't delete files that were no longer wanted. There was one minor inefficiency mentioned in the todo that I punted on.	2019-05-21 14:38:06 -04:00
Joey Hess	ec11575d17	hairyness	2019-05-21 12:54:57 -04:00
Joey Hess	3b9a19171a	Merge branch 'master' into preferred	2019-05-21 11:34:45 -04:00
Joey Hess	5e1221ad53	Improve shape of commit tree when importing from unversioned special remotes Make the import have the previous import as a parent, so eg `git log --stat` displays a useful diff. Also a minor optimisation, only calculate the depth of the imported history once.	2019-05-21 11:32:54 -04:00
Joey Hess	5af9e7f3d0	break out a todo	2019-05-21 11:10:13 -04:00
Joey Hess	97fd9da6e7	add back non-preferred files to imported tree Prevents merging the import from deleting the non-preferred files from the branch it's merged into. adjustTree previously appended the new list of items to the old, which could result in it generating a tree with multiple files with the same name. That is not good and confuses some parts of git. Gave it a function to resolve such conflicts. That allowed dealing with the problem of what happens when the import contains some files (or subtrees) with the same name as files that were filtered out of the export. The files from the import win.	2019-05-20 16:43:52 -04:00
Joey Hess	7d177b78e4	docs for export preferred content This includes a note about how include= and exclude= match when exporting a subtree. I don't know if the note is prominent enough, but the behavior seems unsurprising enough.	2019-05-20 12:06:02 -04:00
Joey Hess	12451ea010	Merge branch 'master' into preferred	2019-05-20 10:00:03 -04:00
Joey Hess	8958556fe3	thought	2019-05-16 20:41:17 -04:00
Joey Hess	24c8b1b15a	update	2019-05-14 15:25:09 -04:00
Joey Hess	9411a7c93c	matching preferred content before key is known This will let import try to match preferred content expressions before downloading the content and generating its key. If an expression needs a key, preferredContentParser with preferredContentKeylessTokens will fail to parse it. standard and groupwanted are not in preferredContentKeylessTokens because they may refer to an expression that refers to a key. That needs further work to support them.	2019-05-14 14:28:23 -04:00
Joey Hess	a3e24ed533	more design work	2019-05-14 11:49:23 -04:00
Joey Hess	c5a61ee808	closing in on final design for this	2019-05-14 10:52:00 -04:00
Ilya_Shlyakhter	0610789285	Added a comment: checksums of remote data	2019-05-13 22:03:38 +00:00
Joey Hess	0c7569bb6f	close	2019-05-10 13:59:39 -04:00
Joey Hess	c77d79d343	close old todo	2019-05-10 13:54:32 -04:00
Joey Hess	ae562ad4d7	update old todo item with what still needs doing removed old comments that are no longer relevant	2019-05-10 13:52:40 -04:00
Joey Hess	daa0c6c1c6	close old todo	2019-05-10 13:35:55 -04:00
Joey Hess	d32143e7ad	close	2019-05-10 13:34:44 -04:00
Joey Hess	ccfb800fa6	Merge branch 'master' of ssh://git-annex.branchable.com	2019-05-10 13:31:49 -04:00
Joey Hess	82186ca58f	annex.jobs=cpus etc Added the ability to run one job per CPU (core), by setting annex.jobs=cpus, or using option --jobs=cpus or -Jcpus. Built with future expansion in mind, including not defaulting matching on Concurrency so more constructors can later be added, and using "cpu" instead of "0".	2019-05-10 13:27:08 -04:00
Ilya_Shlyakhter	e0c73c7f29	Added a comment	2019-05-09 21:07:39 +00:00
Ilya_Shlyakhter	5638ae9688	Added a comment	2019-05-07 00:59:31 +00:00
Joey Hess	b03e65d260	Improved locking when multiple git-annex processes are writing to the .git/index file	2019-05-06 15:15:12 -04:00
Joey Hess	4bc99e4c21	add todo	2019-05-06 14:58:59 -04:00
Joey Hess	6845c1e020	comment	2019-05-06 12:16:19 -04:00
Ilya_Shlyakhter	437fa438e3	Added a comment	2019-05-03 16:31:18 +00:00
Ilya_Shlyakhter	6535d0c1b2	Added a comment	2019-05-03 16:26:40 +00:00
Ilya_Shlyakhter	03a20b225a	Added a comment	2019-05-03 16:11:06 +00:00
Joey Hess	40c749387f	comment	2019-05-03 11:53:03 -04:00
Ilya_Shlyakhter	64bcaff016	added todo for speculate-can-get : extension of speculate-present	2019-05-03 15:34:41 +00:00
Joey Hess	700a3f2787	Merge branch 'master' into import-from-s3	2019-05-01 14:30:52 -04:00
Joey Hess	a405ae015d	remove simple fast-forward todo I think the history looks nice enough without that special case.	2019-05-01 14:29:52 -04:00
Joey Hess	a32f31235a	reuse old imported commits This avoids proliferation of different import commits for the same trees, and makes the resulting git history nice.	2019-05-01 14:20:26 -04:00
Joey Hess	83a420dd66	update todo	2019-04-30 16:31:46 -04:00
Joey Hess	1503b86a14	make import tree from remote generate a merge commit This way no history is lost, neither what was exported to the remote, nor the history of changes that is imported from it. No complicated correlation of two possibly very different histories is needed, just record what we know and then git merge will do a good job. Also, it notices when the remote tracking branch doesn't need to be updated, and avoids doing anything, so noop remotes are super cheap. The only catch here is that, since the commits generated for imports from the remote don't have a stable date or author/committer, each (non-noop) import generates different commits for the same imported trees. So, when the imported remote tracking branch is merged into master and then a change is imported again, there will be an extra series of commits, which will get more and more expensive each time. This seems to call for making stable commits for imports. Also that seems a good idea to make importing in several repositories have the same result.	2019-04-30 16:13:21 -04:00
Joey Hess	cd5e685fd1	comment	2019-04-26 10:18:55 -04:00

2597 commits