git-annex has good support for running commands in parallel, but there
are still some things that could be improved, tracked here:

* Maybe support -Jn in more commands. Just needs changing a few lines of code
  and testing each.

* Maybe extend --jobs/annex.jobs for more control. `--jobs=cpus` is already
  supported; it might be good to have `--jobs=cpus-1` to leave a spare
  cpu to avoid contention, or `--jobs=remotes*2` to run 2 jobs per remote.

* Parallelism is often used when the user wants to fully saturate the pipe
  to a remote, since having some extra transfers running avoids being
  delayed while git-annex runs cleanup actions, checksum verification,
  and other non-transfer stuff.

  But, the user will sometimes be disappointed, because every job
  can still end up stuck doing checksum verification at the same time,
  so the pipe to the remote is not saturated.

  Now that cleanup actions don't occupy space in the main worker queue,
  all that needs to be done is make checksum verification be done as the
  cleanup action. Currently, it's bundled into the same action that
  transfers content.

* onlyActionOn collapses the cleanup action into the start action,
  and so prevents use of the separate cleanup queue.

* Don't parallelize start stage actions. They are supposed to run fast,
  and often a huge number of them don't print out anything. The overhead of
  bookkeeping for parallelizing those seems to swamp the benefit of
  parallelizing them by a large margin. Compare `git annex get` in a directory
  where the first several thousand files are already present with and
  without -J.

  Only once the start stage has decided something needs to be done
  should a job be started up.

  This probably needs display of any output to be moved out of the start
  stage, because no console region will be allocated for it.
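
To make the `--jobs` extension idea above more concrete, here is a rough sketch of how values like `cpus-1` and `remotes*2` could be resolved to a job count. It is only an illustration in Python, not git-annex's Haskell code: the function name `resolve_jobs`, the exact grammar (a base word followed by at most one `-N` or `*N` adjustment), and clamping the result to at least 1 are all assumptions.

```python
import os
import re

# Hypothetical grammar: "cpus" or "remotes", optionally followed by a
# single "-N" or "*N" adjustment. Only `cpus` is described as already
# supported; the rest follows the proposals in this todo.
_SPEC = re.compile(r"(cpus|remotes)(?:([-*])([0-9]+))?")


def resolve_jobs(spec, remote_count, cpu_count=None):
    """Resolve a --jobs style value to a concrete job count."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    spec = spec.strip()

    # A plain number is used directly.
    if re.fullmatch(r"[0-9]+", spec):
        return max(1, int(spec))

    m = _SPEC.fullmatch(spec)
    if m is None:
        raise ValueError(f"unsupported jobs value: {spec!r}")

    base = cpu_count if m.group(1) == "cpus" else remote_count
    op, operand = m.group(2), m.group(3)
    if op == "-":
        base -= int(operand)
    elif op == "*":
        base *= int(operand)

    # Assumption: never resolve to fewer than one job, so that
    # `cpus-1` on a single-cpu machine still runs something.
    return max(1, base)
```

With 8 cpus and 3 remotes, `resolve_jobs("cpus-1", 3, 8)` gives 7 and `resolve_jobs("remotes*2", 3, 8)` gives 6.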