linux-uconsole/kernel/time
Chris Redpath 0eb0f9eb4c UPSTREAM: cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
The way the schedutil governor uses the PELT metric causes it to
underestimate the CPU utilization in some cases.

That can be easily demonstrated by running kernel compilation on
a Sandy Bridge Intel processor, running turbostat in parallel with
it and looking at the values written to the MSR_IA32_PERF_CTL
register.  Namely, the expected result would be that when all CPUs
were 100% busy, all of them would be requested to run in the maximum
P-state, but observation shows that this clearly isn't the case.
The CPUs run in the maximum P-state for a while and then are
requested to run slower and go back to the maximum P-state after
a while again.  That causes the actual frequency of the processor to
visibly oscillate below the sustainable maximum in a jittery fashion
which clearly is not desirable.

That has been attributed to CPU utilization metric updates on task
migration that cause the total utilization value for the CPU to be
reduced by the utilization of the migrated task.  If that happens,
the schedutil governor may see a CPU utilization reduction and will
attempt to reduce the CPU frequency accordingly right away.  That
may be premature, though, for example if the system is generally
busy and there are other runnable tasks waiting to be run on that
CPU already.

This is unlikely to be an issue on systems where cpufreq policies are
shared between multiple CPUs, because in those cases the policy
utilization is computed as the maximum of the CPU utilization values
over the whole policy and if that turns out to be low, reducing the
frequency for the policy most likely is a good idea anyway.  On
systems with one CPU per policy, however, it may affect performance
adversely and even lead to increased energy consumption in some cases.

On those systems it may be addressed by taking another utilization
metric into consideration, like whether or not the CPU whose
frequency is about to be reduced has been idle recently, because if
that's not the case, the CPU is likely to be busy in the near future
and its frequency should not be reduced.

To that end, use the counter of idle calls in the timekeeping code.
Namely, make the schedutil governor look at that counter for the
current CPU every time before its frequency is about to be reduced.
If the counter has not changed since the previous iteration of the
governor computations for that CPU, the CPU has been busy for all
that time and its frequency should not be decreased, so if the new
frequency would be lower than the one set previously, the governor
will skip the frequency update.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Joel Fernandes <joelaf@google.com>
(cherry picked from commit b7eaf1aab9)
(simple CPUFREQ_RT_DL vs CPUFREQ_DL usage conflicts)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Change-Id: I531ec02c052944ee07a904dc2a25c59948ee762b
2017-08-11 19:31:04 +05:30
..
alarmtimer.c alarmtimer: don't rate limit one-shot timers 2017-07-27 15:06:10 -07:00
clockevents.c clockevents: Remove unused set_mode() callback 2015-09-14 11:00:55 +02:00
clocksource.c clocksource: Allow unregistering the watchdog 2016-09-15 08:27:47 +02:00
hrtimer.c Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2016-09-20 15:18:54 +08:00
itimer.c itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper 2016-02-25 12:01:25 -08:00
jiffies.c tick: Move clocksource related stuff to timekeeping.h 2015-04-01 14:22:58 +02:00
Kconfig rcu: Drop RCU_USER_QS in favor of NO_HZ_FULL 2015-07-06 13:52:18 -07:00
Makefile time: Remove development rules from Kbuild/Makefile 2015-07-01 09:57:35 +02:00
ntp.c ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO 2016-09-15 08:27:47 +02:00
ntp_internal.h ntp/pps: use timespec64 for hardpps() 2015-10-01 09:57:59 -07:00
posix-clock.c posix-clock: Fix return code on the poll method's error path 2016-03-03 15:07:15 -08:00
posix-cpu-timers.c posix_cpu_timer: Exit early when process has been reaped 2016-08-10 11:49:29 +02:00
posix-timers.c posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper 2016-02-25 12:01:25 -08:00
sched_clock.c timers, sched/clock: Clean up the code a bit 2015-03-27 08:34:01 +01:00
test_udelay.c time: Rename udelay_test.c to test_udelay.c 2014-11-21 11:59:55 -08:00
tick-broadcast-hrtimer.c kernel: broadcast-hrtimer: Migrate to new 'set-state' interface 2015-08-10 11:41:08 +02:00
tick-broadcast.c tick/broadcast: Prevent NULL pointer dereference 2017-01-12 11:22:51 +01:00
tick-common.c clockevents: Remove unused set_mode() callback 2015-09-14 11:00:55 +02:00
tick-internal.h timer: Minimize nohz off overhead 2015-06-19 15:18:28 +02:00
tick-oneshot.c clockevents: Provide functions to set and get the state 2015-06-02 14:40:47 +02:00
tick-sched.c UPSTREAM: cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely 2017-08-11 19:31:04 +05:30
tick-sched.h tick/broadcast: Make idle check independent from mode and config 2015-07-07 18:46:47 +02:00
time.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 14:04:50 -07:00
timeconst.bc timeconst: Update path in comment 2015-10-26 10:06:06 +09:00
timeconv.c
timecounter.c timecounter: keep track of accumulated fractional nanoseconds 2014-12-30 18:29:27 -05:00
timekeeping.c Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2017-07-11 16:22:22 +08:00
timekeeping.h hrtimer: Make offset update smarter 2015-04-22 17:06:49 +02:00
timekeeping_debug.c timekeeping: Cap array access in timekeeping_debug 2016-09-15 08:27:52 +02:00
timekeeping_internal.h clocksource: Move cycle_last validation to core code 2014-07-23 15:01:51 -07:00
timer.c BACKPORT: timer: convert timer_slack_ns from unsigned long to u64 2016-07-11 12:43:04 +05:30
timer_list.c hrtimer: Handle remaining time proper for TIME_LOW_RES 2016-02-17 12:30:57 -08:00
timer_stats.c timer: Stats: Simplify the flags handling 2015-06-19 15:18:27 +02:00