git-annex/doc/todo/parallel_possibilities.mdwn

git-annex has good support for running commands in parallel, but there
are still some things that could be improved, tracked here:
* Maybe support -Jn in more commands. Just needs changing a few lines of code
and testing each.
* Maybe extend --jobs/annex.jobs for more control. `--jobs=cpus` is already
supported; it might be good to have `--jobs=cpus-1` to leave a spare
cpu to avoid contention, or `--jobs=remotes*2` to run 2 jobs per remote.
* Parallelism is often used when the user wants to fully saturate the pipe
to a remote, since having some extra transfers running avoids being
delayed while git-annex runs cleanup actions, checksum verification,
and other non-transfer stuff.
But, the user will sometimes be disappointed, because every job
can still end up stuck doing checksum verification at the same time,
so the pipe to the remote is not saturated.
Running cleanup actions in a separate queue from the main job queue
wouldn't be sufficient for this, because verification is done as part
of the same action that transfers content. That needs to somehow be
refactored to a cleanup action that ingests the file, and then
the cleanup action can be run in a separate queue.
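
The last idea above, splitting verification out of the transfer action so it can run in its own queue, might look roughly like the sketch below. This is not git-annex code (which is Haskell); it is a minimal Python illustration, and the names `transfer`, `verify`, `get_all`, and `store` are all hypothetical. It also assumes, purely for the example, that a key is the SHA-256 hex digest of its content.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def transfer(key, store):
    """Hypothetical stand-in for fetching a key's content from a remote."""
    return store[key]


def verify(key, content):
    """Hypothetical stand-in for checksum verification.

    Assumes the key is the SHA-256 hex digest of the content.
    """
    return hashlib.sha256(content).hexdigest() == key


def get_all(keys, store, transfer_jobs=2, verify_jobs=1):
    """Transfer keys in one worker pool and verify them in another."""
    results = {}
    with ThreadPoolExecutor(max_workers=transfer_jobs) as transfers, \
         ThreadPoolExecutor(max_workers=verify_jobs) as verifiers:

        def transfer_then_hand_off(key):
            content = transfer(key, store)
            # Queue verification on the separate pool and return at once,
            # so this transfer worker is free to start the next transfer.
            return verifiers.submit(verify, key, content)

        pending = {k: transfers.submit(transfer_then_hand_off, k) for k in keys}
        for key, handed_off in pending.items():
            # The outer future yields the verification future;
            # the inner one yields the verification result.
            results[key] = handed_off.result().result()
    return results
```

With this shape, a call such as `get_all(keys, store, transfer_jobs=4, verify_jobs=2)` keeps four transfers in flight even while earlier files are still being checksummed, because the transfer workers never block on verification. Whether the real transfer action can be split this cleanly is exactly the open refactoring question described above.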