linux-uconsole/drivers/usb/core
Douglas Anderson 812e484133 USB: core: Avoid race of async_completed() w/ usbdev_release()
commit ed62ca2f4f upstream.

While running reboot tests w/ a specific set of USB devices (and
slub_debug enabled), I found that once every few hours my device would
be crashed with a stack that looked like this:

[   14.012445] BUG: spinlock bad magic on CPU#0, modprobe/2091
[   14.012460]  lock: 0xffffffc0cb055978, .magic: ffffffc0, .owner: cryption contexts: %lu/%lu
[   14.012460] /1025536097, .owner_cpu: 0
[   14.012466] CPU: 0 PID: 2091 Comm: modprobe Not tainted 4.4.79 #352
[   14.012468] Hardware name: Google Kevin (DT)
[   14.012471] Call trace:
[   14.012483] [<....>] dump_backtrace+0x0/0x160
[   14.012487] [<....>] show_stack+0x20/0x28
[   14.012494] [<....>] dump_stack+0xb4/0xf0
[   14.012500] [<....>] spin_dump+0x8c/0x98
[   14.012504] [<....>] spin_bug+0x30/0x3c
[   14.012508] [<....>] do_raw_spin_lock+0x40/0x164
[   14.012515] [<....>] _raw_spin_lock_irqsave+0x64/0x74
[   14.012521] [<....>] __wake_up+0x2c/0x60
[   14.012528] [<....>] async_completed+0x2d0/0x300
[   14.012534] [<....>] __usb_hcd_giveback_urb+0xc4/0x138
[   14.012538] [<....>] usb_hcd_giveback_urb+0x54/0xf0
[   14.012544] [<....>] xhci_irq+0x1314/0x1348
[   14.012548] [<....>] usb_hcd_irq+0x40/0x50
[   14.012553] [<....>] handle_irq_event_percpu+0x1b4/0x3f0
[   14.012556] [<....>] handle_irq_event+0x4c/0x7c
[   14.012561] [<....>] handle_fasteoi_irq+0x158/0x1c8
[   14.012564] [<....>] generic_handle_irq+0x30/0x44
[   14.012568] [<....>] __handle_domain_irq+0x90/0xbc
[   14.012572] [<....>] gic_handle_irq+0xcc/0x18c

Investigation using kgdb() found that the wait queue that was passed
into wake_up() had been freed (it was filled with slub_debug poison).

I analyzed and instrumented the code and reproduced.  My current
belief is that this is happening:

1. async_completed() is called (from IRQ).  Moves "as" onto the
   completed list.
2. On another CPU, proc_reapurbnonblock_compat() calls
   async_getcompleted().  Blocks on spinlock.
3. async_completed() releases the lock; keeps running; gets blocked
   midway through wake_up().
4. proc_reapurbnonblock_compat() => async_getcompleted() gets the
   lock; removes "as" from completed list and frees it.
5. usbdev_release() is called.  Frees "ps".
6. async_completed() finally continues running wake_up().  ...but
   wake_up() has a pointer to the freed "ps".

The instrumentation that led me to believe this was based on adding
some trace_printk() calls in a select few functions and then using
kdb's "ftdump" at crash time.  The trace follows (NOTE: in the trace
below I cheated a little bit and added a udelay(1000) in
async_completed() after releasing the spinlock because I wanted it to
trigger quicker):

<...>-2104   0d.h2 13759034us!: async_completed at start: as=ffffffc0cc638200
mtpd-2055    3.... 13759356us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3d..1 13759362us : async_getcompleted after list_del_init: as=ffffffc0cc638200
mtpd-2055    3.... 13759371us+: proc_reapurbnonblock_compat: free_async(ffffffc0cc638200)
mtpd-2055    3.... 13759422us+: async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759479us : usbdev_release at start: ps=ffffffc0cc042080
mtpd-2055    3.... 13759487us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759497us!: usbdev_release after kfree(ps): ps=ffffffc0cc042080
<...>-2104   0d.h2 13760294us : async_completed before wake_up(): as=ffffffc0cc638200

To fix this problem we can just move the wake_up() under the ps->lock.
There should be no issues there that I'm aware of.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-13 14:09:44 -07:00
..
buffer.c Usb: core: buffer: fixed the checkpatch warning 2015-05-10 15:44:10 +02:00
config.c usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk 2017-03-30 09:35:16 +02:00
devices.c usb: define USB_SPEED_SUPER_PLUS speed for SuperSpeedPlus USB3.1 devices 2016-09-07 08:32:39 +02:00
devio.c USB: core: Avoid race of async_completed() w/ usbdev_release() 2017-09-13 14:09:44 -07:00
driver.c usb: hub: Do not attempt to autosuspend disconnected devices 2017-05-20 14:26:59 +02:00
endpoint.c usb: endpoint: convert spaces to tabs 2015-08-14 16:50:36 -07:00
file.c USB: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously 2017-05-20 14:26:58 +02:00
generic.c staging: usbip: convert usbip-host driver to usb_device_driver 2014-02-07 10:54:30 -08:00
hcd-pci.c usb: define USB_SPEED_SUPER_PLUS speed for SuperSpeedPlus USB3.1 devices 2016-09-07 08:32:39 +02:00
hcd.c usb: core: unlink urbs from the tail of the endpoint's urb_list 2017-08-16 13:40:30 -07:00
hub.c USB: Check for dropped connection before switching to full speed 2017-08-16 13:40:30 -07:00
hub.h usb: hub: convert khubd into workqueue 2014-09-23 22:33:19 -07:00
Kconfig usb: kconfig: fix warning of select USB_OTG 2015-11-19 16:31:42 -08:00
Makefile USB: core: remove CONFIG_USB_DEBUG usage 2013-12-21 16:01:00 -08:00
message.c usb: message: remove redundant declaration 2015-10-04 10:45:11 +01:00
notify.c
otg_whitelist.h usb: otg_whitelist: remove whitespace 2015-08-14 16:50:36 -07:00
port.c usb: Quiet down false peer failure messages 2015-12-04 08:19:55 -08:00
quirks.c usb: Add device quirk for Logitech HD Pro Webcam C920-C 2017-09-13 14:09:44 -07:00
sysfs.c usb: core: lpm: fix usb3_hardware_lpm sysfs node 2016-01-31 11:28:58 -08:00
urb.c USB: core: replace %p with %pK 2017-05-25 14:30:07 +02:00
usb-acpi.c usb: optimize acpi companion search for usb port devices 2017-08-24 17:02:36 -07:00
usb.c usb: interface authorization: Use a flag for the default device authorization 2015-09-22 12:08:40 -07:00
usb.h usb: define USB_SPEED_SUPER_PLUS speed for SuperSpeedPlus USB3.1 devices 2016-09-07 08:32:39 +02:00