git-annex/doc/todo/parallel_possibilities.mdwn
Joey Hess 9d36c826c0
use fine-grained WorkerStages when transferring and verifying
This means that Command.Move and Command.Get don't need to
manually set the stage, and is a lot cleaner conceptually.

Also, this makes Command.Sync.syncFile use the worker pool better.
In the scenario where it first downloads content and then uploads it to
some other remotes, it will start in TransferStage, then enter VerifyStage
and then go back to TransferStage for each transfer to the remotes.
Before, it entered CleanupStage after the download, and stayed in it for
the upload, so too many transfer jobs could run at the same time.

Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent
inside onLocal. That has a Annex state for the remote, with no worker pool.
So the resulting calls to enteringStage won't block in there.

While Remote.Git.copyToRemote does do checksum verification, I
realized that should not use a verification slot in the WorkerPool
to do it. Because, it's reading back from eg, a removable disk to checksum.
That will contend with other writes to that disk. It's best to treat
that checksum verification as just part of the transer. So, removed the todo
item about that, as there's nothing needing to be done.
2019-06-19 13:24:20 -04:00

9 lines
453 B
Markdown

git-annex has good support for running commands in parallel, but there
are still some things that could be improved, tracked here:
* Maybe support -Jn in more commands. Just needs changing a few lines of code
and testing each.
* Maybe extend --jobs/annex.jobs for more control. `--jobs=cpus` is already
supported; it might be good to have `--jobs=cpus-1` to leave a spare
cpu to avoid contention, or `--jobs=remotes*2` to run 2 jobs per remote.