When this driver is built in, it may be probed before the cpufreq
driver is up and running. In that case, the cpu cooling device can be
initialized incorrectly.
Thus, this patch makes sure this driver is probed only
when the cpufreq driver is ready. Whenever there is no
cpufreq driver registered, the probe will return -EPROBE_DEFER.
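The deferral described above could look roughly like the following userspace sketch. The names mock_cpufreq_get_current_driver() and mock_thermal_probe() are hypothetical stand-ins, not the driver's actual functions; only the EPROBE_DEFER value is taken from the kernel.

```c
#include <stddef.h>

/* Same numeric value the kernel uses for EPROBE_DEFER. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the registered cpufreq driver (NULL = none yet). */
static const char *registered_cpufreq_driver;

/* Hypothetical stand-in for a "which cpufreq driver is registered" query. */
static const char *mock_cpufreq_get_current_driver(void)
{
	return registered_cpufreq_driver;
}

/* Hypothetical probe: defer until a cpufreq driver is registered. */
static int mock_thermal_probe(void)
{
	if (!mock_cpufreq_get_current_driver())
		return -EPROBE_DEFER;	/* the driver core retries the probe later */

	/* ... the cpu cooling device would be registered here ... */
	return 0;
}
```

With no driver registered the sketch returns -EPROBE_DEFER; once registered_cpufreq_driver is set, the probe proceeds and returns 0.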
Tested-by: J Keerthy <j-keerthy@ti.com>
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update the constants to the correct hotspot extrapolation
equation constants. The OMAP4 constants have been revisited and are correct.
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch defines and utilizes the extrapolation constants for OMAP4430.
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TC1/TC2 are not needed anymore, API has been upgraded.
This is a TODO left-over.
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Min/Max cooling state are defined by registration helper
function, if no specific limits are passed. No need to change
this code.
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch removes the already completed items from the TODO list.
Here is the status and why they are removed:
on ti-bandgap.c:
-- Add support to hwmon: REMOVED, no need to have hwmon interfaces as
the control is done via thermal framework.
-- Test every exposed API to userland: DONE, via thermal fw APIs
By now, no specific API is exposed by this driver
-- Revisit data structures and simplify them: DONE, all
unused fields are flagged for future removal.
-- Once SCM-core api settles, update this driver accordingly: DONE,
the BG driver can exist without SCM driver by ioremapping its own
registers and doing its own locking.
on ti-thermal-common.c/ti-thermal.h:
-- Revisit trips and its definitions: DONE, for now there is no
need to change current definition. Alert based policy will be added
in the future.
-- Revisit trending: DONE, OMAP5 history buffer support has been
implemented. Devices without history buffer will use thermal fw
trending capability.
on omap5-thermal.c
-- Add support for GPU cooling: REMOVED: this will not be part
of this driver. Must be done in a separate cooling device.
generally:
-- make checkpatch.pl and sparse happy: DONE, sparse remaining
warning is not an issue.
-- update documentation: DONE, kernel-doc for ti-bandgap is now
available.
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The string needs a '\0' at the end, otherwise it can cause issues.
The function is called by c4_ioctl in drivers/staging/cxt1e1/linux.c.
Everything needs to be initialized before it is provided to user mode,
so we cannot use strlcpy instead of strncpy.
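A minimal sketch of the pattern, assuming a hypothetical fixed-size field (NAME_LEN and fill_name() are illustrative names, not taken from the cxt1e1 driver). It relies on the standard behaviour of strncpy(): the rest of the destination is zero-filled when the source is short, but no terminator is written when the source is too long.

```c
#include <string.h>

#define NAME_LEN 8	/* hypothetical buffer size */

/* Copy src into a fixed-size field that will be handed to user mode. */
static void fill_name(char dst[NAME_LEN], const char *src)
{
	/*
	 * strncpy() zero-fills the remainder when src is shorter than the
	 * limit, so no stale bytes are left in the buffer.
	 */
	strncpy(dst, src, NAME_LEN - 1);
	/* It does not terminate when src is too long, so do that by hand. */
	dst[NAME_LEN - 1] = '\0';
}
```

strlcpy() would terminate the string but leave the bytes after the terminator untouched, which is why it is not a drop-in replacement for a buffer that is copied out to user mode.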
Code style:
all contents of the file use 4 spaces instead of '\t',
so this patch has to follow that for now.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'thisboard' macro relies on a local variable having a specific
name and yields a pointer derived from that local variable.
Replace the macro with local variables and use the comedi_board()
helper to get the pointer.
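The difference can be sketched in userspace as follows. The struct layouts and mock_comedi_board() are simplified, hypothetical stand-ins for the comedi types and the comedi_board() helper.

```c
struct board_info { const char *name; int n_chan; };	/* hypothetical */
struct comedi_device_mock { const void *board_ptr; };	/* hypothetical */

/* Old style: only compiles where a local variable named 'dev' exists. */
#define thisboard ((const struct board_info *)dev->board_ptr)

/* Stand-in for the comedi_board() helper: the dependency is an argument. */
static const void *mock_comedi_board(const struct comedi_device_mock *dev)
{
	return dev->board_ptr;
}

static int n_chan_old(const struct comedi_device_mock *dev)
{
	return thisboard->n_chan;	/* silently depends on the name 'dev' */
}

static int n_chan_new(const struct comedi_device_mock *dev)
{
	const struct board_info *board = mock_comedi_board(dev);

	return board->n_chan;
}
```

Both functions return the same value, but only the second one keeps working if the parameter is renamed.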
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'boardtype' macro relies on a local variable having a specific
name and yields a struct derived from that local variable.
Replace the macro with local variables and use the comedi_board()
helper to get the struct as a pointer. Use pointer access when
using the variable.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'boardtype' macro relies on a local variable having a specific
name and yields a struct derived from that local variable.
Replace the macro with local variables and use the comedi_board()
helper to get the struct as a pointer. Use pointer access when
using the variable.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'thisboard' macro relies on a local variable having a specific
name and yields a pointer derived from that local variable.
Replace the macro with local variables and use the comedi_board()
helper to get the pointer.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'thisboard' macro relies on a local variable having a specific
name and yields a pointer derived from that local variable.
Replace the macro with local variables and use the comedi_board()
helper to get the pointer.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function is not used. Just remove it.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This pr_err() is just added noise, the user can't do anything about it.
Just remove it.
Since this is the only pr_level() message in the driver, also remove
the pr_fmt() macro.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Many of the static functions in this driver have names that could
potentially clash with external symbols (tty_ioctl, tty_write, etc.).
Rename all the static functions so they have a 'serial2002_' prefix
to avoid any issues.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For aesthetic reasons, hook up the comedi_device (*open) and (*close)
functions after everything else in the attach has succeeded.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A return value of >=0 indicates a successful attach to the comedi core.
Return 0 since that is more common in the kernel.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is just added noise. Remove it.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To improve the readability, add some whitespace to the subdevice
init.
Also, for aesthetic reasons and to help with greps, rename the
(*insn_{read,write}) functions.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the number of subdevices allocated (dev->n_subdevices) in the
(*detach) instead of assuming a given number.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Define and document the bit shifts of the serial.data read from
the device that is used to configure the subdevice channels.
Use the new defines to tidy up the configuration process.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Split out the code that sets up the comedi subdevices that are
attached to the serial port.
There are actually two steps:
1) Read the configuration of the attached subdevices.
2) Use the configuration data to setup the comedi subdevices.
Step 1 is split out as serial2002_setup_subdevs().
Step 2 is split out as serial2002_setup_subdevice().
Cleanup the split out code to remove all the extra '{ }' and indents.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the unnecessary '{ }' around the code and the extra indents
in the switch().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rename the two local variables used to set the serial port speed
and latency so they are unique.
Remove the unnecessary '{ }' around the code and the extra indents.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Factor the (*poll) busy wait code out of tty_read() so the indent
level can be reduced and tty_read() is a bit cleaner.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The struct file_operations (*read) and (*write) operations expect the
buffer to be a __user space pointer.
Currently the (*write) operations in this driver cause this warning:
warning: incorrect type in argument 2 (different address spaces)
expected char const [noderef] <asn:1>*<noident>
got unsigned char [usertype] *buf
And the (*read) operations cause this warning:
warning: incorrect type in argument 2 (different address spaces)
expected char [noderef] <asn:1>*<noident>
got unsigned char *<noident>
Use __force to cast the buffer to a __user pointer to suppress the
warnings.
Consolidate the (*read) calls into a helper function, __tty_readb().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function tracing control loop used by perf spits out a warning
if the called function is not a control function. This is because
the control function references a per cpu allocated data structure
on struct ftrace_ops that is not allocated for other types of
functions.
Commit 0a016409e4 ("ftrace: Optimize the function tracer list loop")
optimized all function tracing loops for the case of a single
registered ops. Unfortunately, this allows for a slight
race when tracing starts or ends, where the stub function might be
called after the current registered ops is removed. In this case we
get the following dump:
root# perf stat -e ftrace:function sleep 1
[ 74.339105] WARNING: at include/linux/ftrace.h:209 ftrace_ops_control_func+0xde/0xf0()
[ 74.349522] Hardware name: PRIMERGY RX200 S6
[ 74.357149] Modules linked in: sg igb iTCO_wdt ptp pps_core iTCO_vendor_support i7core_edac dca lpc_ich i2c_i801 coretemp edac_core crc32c_intel mfd_core ghash_clmulni_intel dm_multipath acpi_power_meter pcspkr microcode vhost_net tun macvtap macvlan nfsd kvm_intel kvm auth_rpcgss nfs_acl lockd sunrpc uinput xfs libcrc32c sd_mod crc_t10dif sr_mod cdrom mgag200 i2c_algo_bit drm_kms_helper ttm qla2xxx mptsas ahci drm libahci scsi_transport_sas mptscsih libata scsi_transport_fc i2c_core mptbase scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
[ 74.446233] Pid: 1377, comm: perf Tainted: G W 3.9.0-rc1 #1
[ 74.453458] Call Trace:
[ 74.456233] [<ffffffff81062e3f>] warn_slowpath_common+0x7f/0xc0
[ 74.462997] [<ffffffff810fbc60>] ? rcu_note_context_switch+0xa0/0xa0
[ 74.470272] [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
[ 74.478117] [<ffffffff81062e9a>] warn_slowpath_null+0x1a/0x20
[ 74.484681] [<ffffffff81102ede>] ftrace_ops_control_func+0xde/0xf0
[ 74.491760] [<ffffffff8162f400>] ftrace_call+0x5/0x2f
[ 74.497511] [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
[ 74.503486] [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
[ 74.509500] [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
[ 74.516088] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[ 74.522268] [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
[ 74.528837] [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
[ 74.536696] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[ 74.542878] [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
[ 74.548869] [<ffffffff81105c67>] unregister_ftrace_function+0x27/0x50
[ 74.556243] [<ffffffff8111eadf>] perf_ftrace_event_register+0x9f/0x140
[ 74.563709] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[ 74.569887] [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
[ 74.575898] [<ffffffff8111e94e>] perf_trace_destroy+0x2e/0x50
[ 74.582505] [<ffffffff81127ba9>] tp_perf_event_destroy+0x9/0x10
[ 74.589298] [<ffffffff811295d0>] free_event+0x70/0x1a0
[ 74.595208] [<ffffffff8112a579>] perf_event_release_kernel+0x69/0xa0
[ 74.602460] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[ 74.608667] [<ffffffff8112a640>] put_event+0x90/0xc0
[ 74.614373] [<ffffffff8112a740>] perf_release+0x10/0x20
[ 74.620367] [<ffffffff811a3044>] __fput+0xf4/0x280
[ 74.625894] [<ffffffff811a31de>] ____fput+0xe/0x10
[ 74.631387] [<ffffffff81083697>] task_work_run+0xa7/0xe0
[ 74.637452] [<ffffffff81014981>] do_notify_resume+0x71/0xb0
[ 74.643843] [<ffffffff8162fa92>] int_signal+0x12/0x17
To fix this a new ftrace_ops flag is added that denotes the ftrace_list_end
ftrace_ops stub as just that, a stub. This flag is now checked in the
control loop and the function is not called if the flag is set.
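The idea can be sketched in userspace like this. OPS_FL_STUB, struct mock_ops and run_control_loop() are hypothetical simplifications; the real loop also deals with RCU and per cpu state, which are left out here. The do/while shape is an assumption chosen to show why the list head can be the stub itself.

```c
#include <stddef.h>

#define OPS_FL_STUB (1u << 0)	/* hypothetical flag value */

struct mock_ops {
	void (*func)(int *counter);
	unsigned int flags;
	struct mock_ops *next;
};

/* Stands in for the path that would trigger the warning. */
static void stub_func(int *counter) { *counter = -1; }
static void control_func(int *counter) { (*counter)++; }

/* List terminator, marked as a stub so the loop can recognise it. */
static struct mock_ops list_end = { stub_func, OPS_FL_STUB, NULL };

/* Call every registered ops, but never the stub. */
static int run_control_loop(struct mock_ops *head)
{
	int calls = 0;
	struct mock_ops *op = head;

	do {
		/* The body runs at least once, so head may be the stub itself. */
		if (!(op->flags & OPS_FL_STUB))
			op->func(&calls);
	} while ((op = op->next) != NULL && op != &list_end);

	return calls;
}
```

When the list head already points at list_end, as in the race window, the loop body still runs once but the flag check keeps the stub from being called.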
Thanks to Jovi for not just reporting the bug, but also pointing out
where the bug was in the code.
Link: http://lkml.kernel.org/r/514A8855.7090402@redhat.com
Link: http://lkml.kernel.org/r/1364377499-1900-15-git-send-email-jovi.zhangwei@huawei.com
Tested-by: WANG Chao <chaowang@redhat.com>
Reported-by: WANG Chao <chaowang@redhat.com>
Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
If we reenable ftrace via sysctl, we currently set ftrace_trace_function
based on the previous simplistic algorithm. This is inconsistent with
what update_ftrace_function does. So better call that helper instead.
Link: http://lkml.kernel.org/r/5151D26F.1070702@siemens.com
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The commit 34600f0e9 "tracing: Fix race with max_tr and changing tracers"
fixed the updating of the main buffers with the race of changing
tracers, but left out the fix to the updating of just a per cpu buffer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
It seems that the dev features were ignored because they were
enabled after registration.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some cases, the VM_PKT_COMP message can arrive later than RNDIS completion
message, which will free the packet memory. This may cause panic due to access
to freed memory in netvsc_send_completion().
This patch fixes this problem by removing rndis_filter_send_request_completion()
from the code path. The function was a no-op.
Reported-by: Long Li <longli@microsoft.com>
Tested-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The warning about local_bh_enable inside IRQ happens when disconnecting a
virtual NIC.
The reason for the warning is that netif_tx_disable() is called when the NIC
is disconnected, and it is called within irq context. netif_tx_disable() calls
local_bh_enable(), which displays a warning if in irq context.
The fix is to remove the unnecessary netif_tx_disable() and wake_queue() calls
in netvsc_linkstatus_callback().
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Long Li <longli@microsoft.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move might_sleep operations out of the rcu_read_lock() section.
Also fix iterating over ifa_dev->ifa_list.
Introduced by: commit 5c766d642b "ipv4: introduce address lifetime"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will result in calling check_lifetime at the nearest opportunity, and
that function will correctly adjust the next time to call check_lifetime.
Without this, check_lifetime is called at the time computed by the previous
run, which does not take the modified lifetime into account.
Introduced by: commit 5c766d642b "ipv4: introduce address lifetime"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
- A couple of mxs boards that run I2C at 400 kHz occasionally experience
instability. Slow down the clock speed to make I2C work
reliably.
Merge tag 'mxs-fixes-3.9-4' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
From Shawn Guo <shawn.guo@linaro.org>:
The mxs fixes for 3.9, take 4:
- A couple of mxs boards that run I2C at 400 kHz occasionally experience
instability. Slow down the clock speed to make I2C work
reliably.
* tag 'mxs-fixes-3.9-4' of git://git.linaro.org/people/shawnguo/linux-2.6:
ARM: mxs: Slow down the I2C clock speed
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Recent commit 6fac4829 ("cputime: Use accessors to read task
cputime stats") introduced a bug where we account the cputime
of the first thread many times, instead of the cputimes of all
the different threads.
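The shape of such a bug can be illustrated with a small userspace sketch. struct mock_task and both functions are hypothetical and much simpler than the kernel's thread group accounting.

```c
#include <stddef.h>

/* Hypothetical, simplified task with a link to the next thread in the group. */
struct mock_task {
	unsigned long utime;
	struct mock_task *next_thread;
};

/* Buggy shape: the loop walks the threads but always reads the first one. */
static unsigned long sum_utime_buggy(const struct mock_task *first)
{
	unsigned long total = 0;
	const struct mock_task *t;

	for (t = first; t; t = t->next_thread)
		total += first->utime;	/* wrong task */
	return total;
}

/* Fixed shape: read the cputime of the thread being visited. */
static unsigned long sum_utime_fixed(const struct mock_task *first)
{
	unsigned long total = 0;
	const struct mock_task *t;

	for (t = first; t; t = t->next_thread)
		total += t->utime;
	return total;
}
```

For three threads with utime 1, 10 and 100, the buggy variant reports 3 while the fixed one reports 111.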
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130404085740.GA2495@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For NUL terminated string we always need to set '\0' at the end.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: rostedt@goodmis.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/516243B7.9020405@asianux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For NUL terminated string we always need to set '\0' at the end.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: rostedt@goodmis.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/51624254.30301@asianux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For NUL terminated string, always make sure that there's '\0' at the end.
In our case we need a return value, so still use strncpy() and
fix up the tail explicitly.
(strlcpy() returns the size, not the pointer)
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: a.p.zijlstra@chello.nl <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org <paulus@samba.org>
Cc: acme@ghostprotocols.net <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/51623E0B.7070101@asianux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 201c373e8e ("sched/debug: Limit sd->*_idx range on
sysctl") was an incomplete bug fix.
This patch fixes sd->*_idx limit range to [0 ~ CPU_LOAD_IDX_MAX-1]
avoiding array overflow caused by setting sd->*_idx to CPU_LOAD_IDX_MAX
on sysctl.
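A minimal sketch of the range check, assuming a load array of CPU_LOAD_IDX_MAX entries. The value 5, set_idx() and load_at() are illustrative assumptions, not the scheduler's actual sysctl code.

```c
#define CPU_LOAD_IDX_MAX 5	/* assumed array size for this sketch */

static unsigned long cpu_load[CPU_LOAD_IDX_MAX];

/* Accept a sysctl-style write only if it is a valid index into cpu_load[]. */
static int set_idx(int *idx, int val)
{
	const int min_idx = 0;
	const int max_idx = CPU_LOAD_IDX_MAX - 1;	/* not CPU_LOAD_IDX_MAX */

	if (val < min_idx || val > max_idx)
		return -1;	/* rejected, *idx is left unchanged */
	*idx = val;
	return 0;
}

/* Safe only because set_idx() never stores an out-of-range index. */
static unsigned long load_at(int idx)
{
	return cpu_load[idx];
}
```

Allowing CPU_LOAD_IDX_MAX itself as the upper bound would let load_at() read one element past the end of the array.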
Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <jiang.liu@huawei.com>
Cc: <guohanjun@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51626610.2040607@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The sched_clock_remote() implementation has the following atomicity
problem on 32bit systems when accessing the remote scd->clock, which
is a 64bit value.
    CPU0                            CPU1
    sched_clock_local()             sched_clock_remote(CPU0)
    ...
                                    remote_clock = scd[CPU0]->clock
                                        read_low32bit(scd[CPU0]->clock)
    cmpxchg64(scd->clock,...)
                                        read_high32bit(scd[CPU0]->clock)
While the update of scd->clock is using an atomic64 mechanism, the
readout on the remote cpu is not, which can cause completely bogus
readouts.
It is a quite rare problem, because it requires the update to hit the
narrow race window between the low/high readout and the update must go
across the 32bit boundary.
The resulting misbehaviour is that CPU1 will see the sched_clock on
CPU0 ~4 seconds ahead of its own and update CPU1's sched_clock value
to this bogus timestamp. This stays that way due to the clamping
implementation for about 4 seconds until the synchronization with
CLOCK_MONOTONIC undoes the problem.
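The size of the jump follows from simple arithmetic, which the sketch below reproduces deterministically. torn_read() is a hypothetical helper that models the interleaving rather than racing real threads: the low half is taken from the value before the update and the high half from the value after it.

```c
#include <stdint.h>

/*
 * Value a 32bit reader assembles when the low half is read before an
 * update of the 64bit clock and the high half is read after it.
 */
static uint64_t torn_read(uint64_t before, uint64_t after)
{
	uint32_t low = (uint32_t)before;		/* read_low32bit() */
	uint32_t high = (uint32_t)(after >> 32);	/* read_high32bit() */

	return ((uint64_t)high << 32) | low;
}
```

With a nanosecond clock that moves from 0x1FFFFFFF0 to 0x200000010, the reader assembles 0x2FFFFFFF0, which is 0xFFFFFFE0 ns (roughly 4.29 seconds) ahead of the real value. If the update does not cross the 32bit boundary, the torn value is simply the old value and nothing bad happens.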
The issue is hard to observe, because it might only result in a less
accurate SCHED_OTHER timeslicing behaviour. To create observable
damage on realtime scheduling classes, it is necessary that the bogus
update of CPU1 sched_clock happens in the context of a realtime
thread, which then gets charged 4 seconds of RT runtime, which causes
the RT throttler mechanism to trigger and prevent scheduling of RT
tasks for a little less than 4 seconds. So this is quite unlikely as
well.
The issue was quite hard to decode as the reproduction time is between
2 days and 3 weeks and intrusive tracing makes it less likely, but the
following trace recorded with trace_clock=global, which uses
sched_clock_local(), gave the final hint:
<idle>-0 0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80
<idle>-0 0d..30 400269.477151: hrtimer_start: hrtimer=0xf7061e80 ...
irq/20-S-587 1d..32 400273.772118: sched_wakeup: comm= ... target_cpu=0
<idle>-0 0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80
What happens is that CPU0 goes idle and invokes
sched_clock_idle_sleep_event() which invokes sched_clock_local() and
CPU1 runs a remote wakeup for CPU0 at the same time, which invokes
sched_clock_remote(). The time jump gets propagated to CPU0 via
sched_clock_remote() and stays stale on both cores for ~4 seconds.
There are only two other possibilities, which could cause a stale
sched clock:
1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic
wrong value.
2) sched_clock() which reads the TSC returns a sporadic wrong value.
#1 can be excluded because sched_clock would continue to increase for
one jiffy and then go stale.
#2 can be excluded because it would not make the clock jump
forward. It would just result in a stale sched_clock for one jiffy.
After quite some brain twisting and finding the same pattern on other
traces, sched_clock_remote() remained the only place which could cause
such a problem and as explained above it's indeed racy on 32bit
systems.
So while on 64bit systems the readout is atomic, we need to verify the
remote readout on 32bit machines. We need to protect the local->clock
readout in sched_clock_remote() on 32bit as well because an NMI could
hit between the low and the high readout, call sched_clock_local() and
modify local->clock.
Thanks to Siegfried Wulsch for bearing with my debug requests and
going through the tedious tasks of running a bunch of reproducer
systems to generate the debug information which let me decode the
issue.
Reported-by: Siegfried Wulsch <Siegfried.Wulsch@rovema.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
This reverts commit 8761a3dc1f.
There are situations where the destruction path is called
with the bdev->bd_mutex already held, which then deadlocks in
loop_clr_fd(). The normal partition cleanup does a trylock()
on the mutex, but it'd be nice to have a more bullet proof
method in loop. So punt this more involved fix to the next
merge window, and just back out this buggy fix for now.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Some versions of pHyp will perform the adjunct partition test before the
ANDCOND test. As a result, H_RESOURCE can be returned and trigger the
BUG_ON condition, while the HPTE is not removed. So add a check for
H_RESOURCE. It is ok if this HPTE is not removed, as
pSeries_lpar_hpte_remove is looking for any HPTE to remove and not a
specific HPTE to remove. So it is ok to just move on to the next slot
and try again.
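The "move on to the next slot" logic could be sketched as follows. The return code values, mock_remove() and hpte_remove_any() are illustrative assumptions for a userspace model; the real function issues a hypervisor call per slot.

```c
/* Illustrative return codes for this sketch. */
enum { H_SUCCESS = 0, H_NOT_FOUND = -7, H_RESOURCE = -16 };

#define HPTES_PER_GROUP 8

/* Hypothetical per-slot results standing in for the hypervisor call. */
static long mock_results[HPTES_PER_GROUP];

static long mock_remove(int slot)
{
	return mock_results[slot];
}

/*
 * Remove any one HPTE from the group. Returns the slot that was removed,
 * -1 if no slot could be removed, or -2 where the kernel would hit BUG_ON.
 */
static int hpte_remove_any(int start)
{
	int i, slot = start;

	for (i = 0; i < HPTES_PER_GROUP; i++) {
		long rc = mock_remove(slot);

		if (rc == H_SUCCESS)
			return slot;
		/* Any HPTE will do, so a slot that cannot be removed is skipped. */
		if (rc != H_NOT_FOUND && rc != H_RESOURCE)
			return -2;
		slot = (slot + 1) % HPTES_PER_GROUP;
	}
	return -1;
}
```

Treating H_RESOURCE like H_NOT_FOUND means an unexpected refusal on one slot no longer ends the search.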
Cc: stable@vger.kernel.org
Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Check KR2 recovery time at the beginning of the work-around function.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>