support --batch -J

--batch combined with -J now runs batch requests concurrently for many
commands. Before, the combination was accepted, but did not enable
concurrency. Since the output of batch requests can be in any order, using
--json with the new "input" field is recommended, to determine which
batch request each response corresponds to.
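
As a rough illustration of why the "input" field helps, here is a minimal
sketch of how a program driving --batch -J --json might pair responses with
the requests it sent. The Response type, its field names, and the file names
are hypothetical stand-ins for already-decoded JSON objects; the sketch also
assumes "input" decodes to a single string, which may not match the exact
JSON shape git-annex emits.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical, simplified stand-in for one decoded --json response.
-- Real responses carry more fields; only the "input" value matters here.
data Response = Response
    { respInput :: String   -- value of the "input" field (assumed a single string)
    , respSuccess :: Bool   -- hypothetical result flag
    } deriving (Show, Eq)

-- Pair every request, in the order it was sent, with the response whose
-- "input" matches it. Nothing means no response has been seen for it yet.
matchResponses :: [String] -> [Response] -> [(String, Maybe Response)]
matchResponses requests responses =
    [ (req, M.lookup req byInput) | req <- requests ]
  where
    -- If two responses share an input, the later one wins.
    byInput = M.fromList [ (respInput r, r) | r <- responses ]

main :: IO ()
main = mapM_ print (matchResponses sent received)
  where
    sent = ["a.dat", "b.dat", "c.dat"]
    -- Responses arrive in completion order, not request order.
    received = [Response "c.dat" True, Response "a.dat" False]
```

Keying on the input value, instead of on arrival position, is what makes the
result independent of the order in which concurrent jobs finish.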

If --json is not used, batch mode still runs concurrently, using the usual
concurrent-output. That will not be very useful for most batch mode users,
probably, but who knows.

If a program was using --batch -J before, and was parsing non-json output,
this could break it. But, it was relying on git-annex not supporting
concurrency despite it being enabled, so it should have expected concurrent
output. So, I think that's ok.

annex.jobs does not enable concurrency in --batch mode, because that would
confuse programs that use --batch but don't expect concurrency.
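
The rule described above can be sketched as a pure function. This is only an
illustration under stated assumptions: the two types below are simplified
stand-ins for the ones named in the diff, and batchConcurrency is a
hypothetical helper, not a function in git-annex (the actual change updates
state inside the Annex monad).

```haskell
-- Simplified stand-ins for the types named in the diff; the real
-- definitions live in git-annex and may differ.
data Concurrency = Concurrent Int
    deriving (Show, Eq)

data ConcurrencySetting
    = ConcurrencyCmdLine Concurrency
    | ConcurrencyGitConfig Concurrency
    deriving (Show, Eq)

-- Hypothetical pure version of the rule: a setting given on the command
-- line (-J) is kept as-is, while one that only comes from git config
-- (annex.jobs) is reduced to a single job in batch mode.
batchConcurrency :: ConcurrencySetting -> ConcurrencySetting
batchConcurrency s@(ConcurrencyCmdLine _) = s
batchConcurrency (ConcurrencyGitConfig _) = ConcurrencyGitConfig (Concurrent 1)

main :: IO ()
main = do
    print (batchConcurrency (ConcurrencyCmdLine (Concurrent 4)))
    print (batchConcurrency (ConcurrencyGitConfig (Concurrent 8)))
```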
Joey Hess 2020-09-16 11:58:19 -04:00
parent 77c42782d0
commit 877ef84a1b
GPG key ID: DB12DB0FF05F8F38
3 changed files with 21 additions and 1 deletions

@@ -2,6 +2,11 @@ git-annex (8.20200909) UNRELEASED; urgency=medium
* --json output now includes a new field "input" which is the input
value (filename, url, etc) that caused a json object to be output.
* --batch combined with -J now runs batch requests concurrently for many
commands. Before, the combination was accepted, but did not enable
concurrency. Since the output of batch requests can be in any order,
--json with the new "input" field is recommended to be used,
to determine which batch request each response corresponds to.
-- Joey Hess <id@joeyh.name> Mon, 14 Sep 2020 13:13:10 -0400

@@ -8,6 +8,7 @@
module CmdLine.Batch where
import Annex.Common
import qualified Annex
import Types.Command
import CmdLine.Action
import CmdLine.GitAnnex.Options
@@ -18,6 +19,8 @@ import Types.FileMatcher
import Annex.BranchState
import Annex.WorkTree
import Annex.Content
import Annex.Concurrent
import Types.Concurrency
data BatchMode = Batch BatchFormat | NoBatch
@@ -85,6 +88,7 @@ batchInput fmt parser a = go =<< batchLines fmt
batchLines :: BatchFormat -> Annex [String]
batchLines fmt = do
    checkBatchConcurrency
    enableInteractiveBranchAccess
    liftIO $ splitter <$> getContents
  where
@@ -92,8 +96,17 @@ batchLines fmt = do
        BatchLine -> lines
        BatchNull -> splitc '\0'
-- When concurrency is enabled at the command line, it is used in batch
-- mode. But, if it's only set in git config, don't use it, because the
-- program using batch mode may not expect interleaved output.
checkBatchConcurrency :: Annex ()
checkBatchConcurrency = Annex.getState Annex.concurrency >>= \case
    ConcurrencyCmdLine _ -> noop
    ConcurrencyGitConfig _ ->
        setConcurrency (ConcurrencyGitConfig (Concurrent 1))
batchCommandAction :: CommandStart -> Annex ()
-batchCommandAction = void . callCommandAction . batchCommandStart
+batchCommandAction = commandAction . batchCommandStart
-- The batch mode user expects to read a line of output, and it's up to the
-- CommandStart to generate that output as it succeeds or fails to do its

@@ -1014,6 +1014,8 @@ Like other git commands, git-annex is configured via `.git/config`.
Setting this to "cpus" will run one job per CPU core.
When the `--batch` option is used, this configuration is ignored.
* `annex.queuesize`
git-annex builds a queue of git commands, in order to combine similar