assistant: Add 1/200th second delay between checking each file in the full transfer scan, to avoid using too much CPU.
The slowdown is not going to be large in typical small-ish repos. And it does not seem to matter if the assistant reacts a little bit slower in situations involving the expensive scan, since:

a) Those situations typically involve getting back in sync after something has changed on a remote, often after a disconnect of some duration. So taking a few seconds more is not noticeable.

b) If the scan finds things that it needs to do, it will start blocking anyway after 10 transfers are queued (due to use of queueTransferWhenSmall). So, only the speed of finding the first 10 transfers will be impacted by this change.

This commit was sponsored by Jochen Bartl on Patreon.
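Point b) relies on the transfer queue blocking the scanner once it holds 10 items. The sketch below illustrates that general idea only, using a channel and a counting semaphore from base. It is not the implementation of queueTransferWhenSmall, and `BoundedQueue`, `newBoundedQueue`, `enqueueWhenSmall` and `dequeue` are hypothetical names made up for the example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Monad (forM_, replicateM)

-- | Hypothetical bounded queue: a channel plus a semaphore that
-- counts the free slots.
data BoundedQueue a = BoundedQueue (Chan a) QSem

newBoundedQueue :: Int -> IO (BoundedQueue a)
newBoundedQueue limit = BoundedQueue <$> newChan <*> newQSem limit

-- | Blocks while the queue already holds `limit` items.
enqueueWhenSmall :: BoundedQueue a -> a -> IO ()
enqueueWhenSmall (BoundedQueue chan slots) x = do
    waitQSem slots
    writeChan chan x

-- | Takes one item and frees its slot, which unblocks a waiting producer.
dequeue :: BoundedQueue a -> IO a
dequeue (BoundedQueue chan slots) = do
    x <- readChan chan
    signalQSem slots
    return x

main :: IO ()
main = do
    q <- newBoundedQueue 10
    -- The "scanner" stalls at the 11th item until something is consumed.
    _ <- forkIO $ forM_ [1 .. 25 :: Int] (enqueueWhenSmall q)
    -- The "transferrer" drains the queue, letting the scanner continue.
    xs <- replicateM 25 (dequeue q)
    print (length xs)
```

With a queue like this, a per-file delay in the producer only affects how quickly the first 10 items are found. After that the producer is paced by the consumer, not by its own speed.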
parent 113b48ba19
commit af2a6d578e
3 changed files with 44 additions and 0 deletions
@@ -25,6 +25,7 @@ import qualified Types.Remote as Remote
 import Utility.ThreadScheduler
 import Utility.NotificationBroadcaster
 import Utility.Batch
+import Utility.ThreadScheduler
 import qualified Git.LsFiles as LsFiles
 import Annex.WorkTree
 import Annex.Content
@@ -32,6 +33,7 @@ import Annex.Wanted
 import CmdLine.Action
 
 import qualified Data.Set as S
+import Control.Concurrent
 
 {- This thread waits until a remote needs to be scanned, to find transfers
 - that need to be made, to keep data in sync.
@@ -145,6 +147,10 @@ expensiveScan urlrenderer rs = batch <~> do
 (findtransfers f unwanted)
 =<< liftAnnex (lookupFile f)
 mapM_ (enqueue f) ts
+
+{- Delay for a short time to avoid using too much CPU. -}
+liftIO $ threadDelay $ fromIntegral $ oneSecond `div` 200
+
 scan unwanted' fs
 
 enqueue f (r, t) =
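For readers without the surrounding module at hand, the added lines amount to sleeping for a fixed interval after each file is checked. Here is a minimal, self-contained sketch of that pattern, assuming `oneSecond` is one second expressed in microseconds (the unit `threadDelay` takes). The definitions of `oneSecond`, `scanDelay` and `throttledScan` are local stand-ins for illustration, not git-annex code. The `fromIntegral` in the diff suggests the real `oneSecond` is not an `Int`; here it is declared as one, so no conversion is needed.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forM_)

-- | One second in microseconds. Local stand-in for the constant
-- of the same name used in the diff.
oneSecond :: Int
oneSecond = 1000000

-- | Per-file pause: 1/200th of a second, i.e. 5000 microseconds.
scanDelay :: Int
scanDelay = oneSecond `div` 200

-- | Run an action on every file, sleeping briefly after each one
-- so the loop does not monopolise a CPU core.
throttledScan :: (FilePath -> IO ()) -> [FilePath] -> IO ()
throttledScan check files = forM_ files $ \f -> do
    check f
    threadDelay scanDelay

main :: IO ()
main = throttledScan putStrLn ["a.txt", "b.txt", "c.txt"]
```

Note that `threadDelay` only guarantees a minimum pause, so the real per-file cost can be slightly higher than 5 ms.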
@@ -5,6 +5,8 @@ git-annex (6.20170301.2) UNRELEASED; urgency=medium
 * status: Propagate nonzero exit code from git status.
 * Linux standalone builds put the bundled ssh last in PATH,
 so any system ssh will be preferred over it.
+* assistant: Add 1/200th second delay between checking each file
+in the full transfer scan, to avoid using too much CPU.
 
 -- Joey Hess <id@joeyh.name> Thu, 02 Mar 2017 12:51:40 -0400
 
@@ -0,0 +1,36 @@
+[[!comment format=mdwn
+username="joey"
+subject="""comment 1"""
+date="2017-03-06T17:03:12Z"
+content="""
+The scan that is skipped is a scan of the files on disk in order to find
+changes that were made while the assistant was not running.
+
+What you are seeing is the full transfer scan. While annex.startupscan
+could be made to also skip that scan, a full transfer scan is not only run
+at startup, but after merging git-annex branch changes from a remote. So
+disabling it only at startup does not seem very useful.
+
+There could be an option to disable the full transfer scan ever running.
+However, this would make the assistant not notice certain transfers/drops
+that you would normally want it to do. For example, if a remote got a bunch
+of files in an archive/ directory from somewhere else, and the local
+repository contains those files, the full transfer scan is needed to notice
+that the archived files can now be removed from the local repository.
+In other situations, the local repository would not get files that it
+ought to contain.
+
+So, I think it might be better to make the expensive transfer scan run a
+little bit slower so it doesn't peg your CPU. I've added a 1/200th second
+delay after each file it checks.
+
+That will make it use something like
+5-10% of the CPU, instead of 100%. At the same time it doesn't slow down the
+total scan very much. In a repository with 5k files, it makes the scan 25
+seconds slower, which makes the assistant react that much slower -- but
+the expensive scan is only needed to make sure things turn out consistent,
+so its overall speed is not super important.
+
+Check it out, let me know if it's still using too much CPU. We could always
+make that 1/200th second tunable, or find a better value for it.
+"""]]
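The figures in the comment can be checked with a little arithmetic. The 25 seconds follows directly from 5000 files at 1/200th of a second each. The CPU percentage depends on how long the scan actually works on each file, which the comment does not state. The 0.5 ms figure below is purely an assumed value, chosen to show what per-file cost would land in the quoted 5-10% range. The function names are illustrative.

```haskell
-- | Per-file delay in seconds (1/200th of a second).
delayPerFile :: Double
delayPerFile = 1 / 200

-- | Total time the delay adds to a scan of n files, in seconds.
addedScanSeconds :: Int -> Double
addedScanSeconds n = fromIntegral n / 200

-- | Fraction of one CPU core used when each file costs `work` seconds
-- of computation followed by the fixed sleep. The value passed for
-- `work` is an assumption, not a measurement.
cpuFraction :: Double -> Double
cpuFraction work = work / (work + delayPerFile)

main :: IO ()
main = do
    -- 5000 files at 1/200 s each: the 25 seconds quoted in the comment.
    print (addedScanSeconds 5000)
    -- Assuming 0.5 ms of work per file: roughly 9% of one core.
    print (cpuFraction 0.0005)
```

Under this model, a scan that spends between about 0.26 ms and 0.56 ms of CPU per file would show up as 5-10% usage. A heavier per-file cost would raise the figure, which is one reason a tunable delay might be useful.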