linux-uconsole/kernel/trace
Steven Rostedt (Red Hat) f05e999f16 ftrace: Check module functions being traced on reload
commit 8c4f3c3fa9 upstream.

There's been a nasty bug that would show up and not give much info.
The bug displayed the following warning:

 WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230()
 Pid: 20903, comm: bash Tainted: G           O 3.6.11+ #38405.trunk
 Call Trace:
  [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230
  [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0
  [<ffffffff811401cc>] ? kfree+0x2c/0x110
  [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150
  [<ffffffff81149f1e>] __fput+0xae/0x220
  [<ffffffff8114a09e>] ____fput+0xe/0x10
  [<ffffffff8105fa22>] task_work_run+0x72/0x90
  [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0
  [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
  [<ffffffff815c0f88>] int_signal+0x12/0x17
 ---[ end trace 793179526ee09b2c ]---

It was finally narrowed down to unloading a module that was being traced.

It was actually more than that. When functions are being traced, there's
a table of all functions that have a ref count of the number of active
tracers attached to that function. When a function trace callback is
registered to a function, the function's record ref count is incremented.
When it is unregistered, the function's record ref count is decremented.
If an inconsistency is detected (ref count goes below zero) the above
warning is shown and the function tracing is permanently disabled until
reboot.

The ftrace callback ops holds a hash of functions that it filters on
(and/or filters off). If the hash is empty, the default means to filter
all functions (for the filter_hash) or to disable no functions (for the
notrace_hash).

When a module is unloaded, it frees the function records that represent
the module functions. These records exist on their own pages, that is
function records for one module will not exist on the same page as
function records for other modules or even the core kernel.

Now when a module unloads, the records that represents its functions are
freed. When the module is loaded again, the records are recreated with
a default ref count of zero (unless there's a callback that traces all
functions, then they will also be traced, and the ref count will be
incremented).

The problem is that if an ftrace callback hash includes functions of the
module being unloaded, those hash entries will not be removed. If the
module is reloaded in the same location, the hash entries still point
to the functions of the module but the module's ref counts do not reflect
that.

With the help of Steve and Joern, we found a reproducer:

 Using uinput module and uinput_release function.

 cd /sys/kernel/debug/tracing
 modprobe uinput
 echo uinput_release > set_ftrace_filter
 echo function > current_tracer
 rmmod uinput
 modprobe uinput
 # check /proc/modules to see if loaded in same addr, otherwise try again
 echo nop > current_tracer

 [BOOM]

The above loads the uinput module, which creates a table of functions that
can be traced within the module.

We add uinput_release to the filter_hash to trace just that function.

Enable function tracincg, which increments the ref count of the record
associated to uinput_release.

Remove uinput, which frees the records including the one that represents
uinput_release.

Load the uinput module again (and make sure it's at the same address).
This recreates the function records all with a ref count of zero,
including uinput_release.

Disable function tracing, which will decrement the ref count for uinput_release
which is now zero because of the module removal and reload, and we have
a mismatch (below zero ref count).

The solution is to check all currently tracing ftrace callbacks to see if any
are tracing any of the module's functions when a module is loaded (it already does
that with callbacks that trace all functions). If a callback happens to have
a module function being traced, it increments that records ref count and starts
tracing that function.

There may be a strange side effect with this, where tracing module functions
on unload and then reloading a new module may have that new module's functions
being traced. This may be something that confuses the user, but it's not
a big deal. Another approach is to disable all callback hashes on module unload,
but this leaves some ftrace callbacks that may not be registered, but can
still have hashes tracing the module's function where ftrace doesn't know about
it. That situation can cause the same bug. This solution solves that case too.
Another benefit of this solution, is it is possible to trace a module's
function on unload and load.

Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.com

Reported-by: Jörn Engel <joern@logfs.org>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Steve Hodgson <steve@purestorage.com>
Tested-by: Steve Hodgson <steve@purestorage.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-29 09:47:34 -07:00
..
blktrace.c Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-block 2013-05-08 11:51:05 -07:00
ftrace.c ftrace: Check module functions being traced on reload 2013-08-29 09:47:34 -07:00
Kconfig ring-buffer: Select IRQ_WORK 2013-05-03 19:24:17 -04:00
Makefile trace: Stop compiling in trace_clock unconditionally 2012-09-13 22:52:08 -04:00
power-traces.c PM / tracing: remove deprecated power trace API 2013-01-26 00:39:12 +01:00
ring_buffer.c ring-buffer: Do not poll non allocated cpu buffers 2013-05-28 10:53:20 -04:00
ring_buffer_benchmark.c tracing: Use NUMA allocation for per-cpu ring buffer pages 2011-06-14 22:04:39 -04:00
rpm-traces.c PM / Runtime: Introduce trace points for tracing rpm_* functions 2011-09-27 22:53:27 +02:00
trace.c tracing: Change tracing_fops/snapshot_fops to rely on tracing_get_cpu() 2013-08-29 09:47:32 -07:00
trace.h tracing: Add trace_array_get/put() to event handling 2013-07-25 14:07:43 -07:00
trace_branch.c tracing: Fix the branch tracer that broke with buffer change 2013-03-15 00:35:54 -04:00
trace_clock.c tracing: Add "uptime" trace clock that uses jiffies 2013-03-15 00:36:09 -04:00
trace_entries.h tracing: Add trace_puts() for even faster trace_printk() tracing 2013-03-15 00:35:55 -04:00
trace_event_perf.c perf/core improvements and fixes: 2012-08-21 11:27:00 +02:00
trace_events.c tracing: trace_remove_event_call() should fail if call/file is in use 2013-08-29 09:47:34 -07:00
trace_events_filter.c tracing: Change event_filter_read/write to verify i_private != NULL 2013-08-29 09:47:33 -07:00
trace_events_filter_test.h tracing/filter: Add startup tests for events filter 2011-08-19 14:35:59 -04:00
trace_export.c tracing: Fix some section mismatch warnings 2013-03-15 00:34:54 -04:00
trace_functions.c tracing: Add function probe to trigger stack traces 2013-03-15 00:36:05 -04:00
trace_functions_graph.c tracing: Consolidate max_tr into main trace_array structure 2013-03-15 00:35:40 -04:00
trace_irqsoff.c tracing: Use flag buffer_disabled for irqsoff tracer 2013-08-14 22:59:07 -07:00
trace_kdb.c tracing: Consolidate max_tr into main trace_array structure 2013-03-15 00:35:40 -04:00
trace_kprobe.c tracing/kprobes: Fail to unregister if probe event files are in use 2013-08-29 09:47:34 -07:00
trace_mmiotrace.c tracing: Consolidate max_tr into main trace_array structure 2013-03-15 00:35:40 -04:00
trace_nop.c
trace_output.c Tracing updates for Linux 3.10 2013-04-29 13:55:38 -07:00
trace_output.h tracing: Rename trace_event_mutex to trace_event_sem 2013-03-15 13:22:10 -04:00
trace_printk.c tracing: Add percpu buffers for trace_printk() 2012-04-23 21:15:55 -04:00
trace_probe.c tracing: Replace strict_strto* with kstrto* 2012-10-31 16:45:23 -04:00
trace_probe.h uprobes/tracing: Introduce is_trace_uprobe_enabled() 2013-02-08 18:24:30 +01:00
trace_sched_switch.c tracing: Consolidate max_tr into main trace_array structure 2013-03-15 00:35:40 -04:00
trace_sched_wakeup.c tracing: Add function-trace option to disable function tracing of latency tracers 2013-03-15 00:36:08 -04:00
trace_selftest.c tracing: Fix bad parameter passed in branch selftest 2013-05-29 16:00:03 -04:00
trace_selftest_dynamic.c ftrace: Add self-tests for multiple function trace users 2011-05-18 19:24:51 -04:00
trace_stack.c Tracing updates for Linux 3.10 2013-04-29 13:55:38 -07:00
trace_stat.c tracing: Check return value of tracing_init_dentry() 2013-04-12 23:02:32 -04:00
trace_stat.h
trace_syscalls.c tracing: Fix irqs-off tag display in syscall tracing 2013-07-25 14:07:43 -07:00
trace_uprobe.c tracing/uprobes: Fail to unregister if probe event files are in use 2013-08-29 09:47:34 -07:00