linux-uconsole/drivers
Lyude Paul 3e1a12754d drm/nouveau: Fix deadlocks in nouveau_connector_detect()
When we disable hotplugging on the GPU, we need to be able to
synchronize with each connector's hotplug interrupt handler before the
interrupt is finally disabled. This can be a problem however, since
nouveau_connector_detect() currently grabs a runtime power reference
when handling connector probing. This will deadlock the runtime suspend
handler like so:

[  861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds.
[  861.483290]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.486332] kworker/0:2     D    0    61      2 0x80000000
[  861.487044] Workqueue: events nouveau_display_hpd_work [nouveau]
[  861.487737] Call Trace:
[  861.488394]  __schedule+0x322/0xaf0
[  861.489070]  schedule+0x33/0x90
[  861.489744]  rpm_resume+0x19c/0x850
[  861.490392]  ? finish_wait+0x90/0x90
[  861.491068]  __pm_runtime_resume+0x4e/0x90
[  861.491753]  nouveau_display_hpd_work+0x22/0x60 [nouveau]
[  861.492416]  process_one_work+0x231/0x620
[  861.493068]  worker_thread+0x44/0x3a0
[  861.493722]  kthread+0x12b/0x150
[  861.494342]  ? wq_pool_ids_show+0x140/0x140
[  861.494991]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.495648]  ret_from_fork+0x3a/0x50
[  861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds.
[  861.496968]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.498341] kworker/6:2     D    0   320      2 0x80000080
[  861.499045] Workqueue: pm pm_runtime_work
[  861.499739] Call Trace:
[  861.500428]  __schedule+0x322/0xaf0
[  861.501134]  ? wait_for_completion+0x104/0x190
[  861.501851]  schedule+0x33/0x90
[  861.502564]  schedule_timeout+0x3a5/0x590
[  861.503284]  ? mark_held_locks+0x58/0x80
[  861.503988]  ? _raw_spin_unlock_irq+0x2c/0x40
[  861.504710]  ? wait_for_completion+0x104/0x190
[  861.505417]  ? trace_hardirqs_on_caller+0xf4/0x190
[  861.506136]  ? wait_for_completion+0x104/0x190
[  861.506845]  wait_for_completion+0x12c/0x190
[  861.507555]  ? wake_up_q+0x80/0x80
[  861.508268]  flush_work+0x1c9/0x280
[  861.508990]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[  861.509735]  nvif_notify_put+0xb1/0xc0 [nouveau]
[  861.510482]  nouveau_display_fini+0xbd/0x170 [nouveau]
[  861.511241]  nouveau_display_suspend+0x67/0x120 [nouveau]
[  861.511969]  nouveau_do_suspend+0x5e/0x2d0 [nouveau]
[  861.512715]  nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau]
[  861.513435]  pci_pm_runtime_suspend+0x6b/0x180
[  861.514165]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.514897]  __rpm_callback+0x7a/0x1d0
[  861.515618]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.516313]  rpm_callback+0x24/0x80
[  861.517027]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.517741]  rpm_suspend+0x142/0x6b0
[  861.518449]  pm_runtime_work+0x97/0xc0
[  861.519144]  process_one_work+0x231/0x620
[  861.519831]  worker_thread+0x44/0x3a0
[  861.520522]  kthread+0x12b/0x150
[  861.521220]  ? wq_pool_ids_show+0x140/0x140
[  861.521925]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.522622]  ret_from_fork+0x3a/0x50
[  861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds.
[  861.523977]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.525349] kworker/6:0     D    0  1329      2 0x80000000
[  861.526073] Workqueue: events nvif_notify_work [nouveau]
[  861.526751] Call Trace:
[  861.527411]  __schedule+0x322/0xaf0
[  861.528089]  schedule+0x33/0x90
[  861.528758]  rpm_resume+0x19c/0x850
[  861.529399]  ? finish_wait+0x90/0x90
[  861.530073]  __pm_runtime_resume+0x4e/0x90
[  861.530798]  nouveau_connector_detect+0x7e/0x510 [nouveau]
[  861.531459]  ? ww_mutex_lock+0x47/0x80
[  861.532097]  ? ww_mutex_lock+0x47/0x80
[  861.532819]  ? drm_modeset_lock+0x88/0x130 [drm]
[  861.533481]  drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper]
[  861.534127]  drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper]
[  861.534940]  nouveau_connector_hotplug+0x98/0x120 [nouveau]
[  861.535556]  nvif_notify_work+0x2d/0xb0 [nouveau]
[  861.536221]  process_one_work+0x231/0x620
[  861.536994]  worker_thread+0x44/0x3a0
[  861.537757]  kthread+0x12b/0x150
[  861.538463]  ? wq_pool_ids_show+0x140/0x140
[  861.539102]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.539815]  ret_from_fork+0x3a/0x50
[  861.540521]
               Showing all locks held in the system:
[  861.541696] 2 locks held by kworker/0:2/61:
[  861.542406]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.543071]  #1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620
[  861.543814] 1 lock held by khungtaskd/64:
[  861.544535]  #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[  861.545160] 3 locks held by kworker/6:2/320:
[  861.545896]  #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.546702]  #1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[  861.547443]  #2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau]
[  861.548146] 1 lock held by dmesg/983:
[  861.548889] 2 locks held by zsh/1250:
[  861.549605]  #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[  861.550393]  #1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[  861.551122] 6 locks held by kworker/6:0/1329:
[  861.551957]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.552765]  #1: 00000000ddb499ad ((work_completion)(&notify->work)#2){+.+.}, at: process_one_work+0x1b3/0x620
[  861.553582]  #2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper]
[  861.554357]  #3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper]
[  861.555227]  #4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper]
[  861.556133]  #5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm]

[  861.557864] =============================================

[  861.559507] NMI backtrace for cpu 2
[  861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[  861.561948] Call Trace:
[  861.562757]  dump_stack+0x8e/0xd3
[  861.563516]  nmi_cpu_backtrace.cold.3+0x14/0x5a
[  861.564269]  ? lapic_can_unplug_cpu.cold.27+0x42/0x42
[  861.565029]  nmi_trigger_cpumask_backtrace+0xa1/0xae
[  861.565789]  arch_trigger_cpumask_backtrace+0x19/0x20
[  861.566558]  watchdog+0x316/0x580
[  861.567355]  kthread+0x12b/0x150
[  861.568114]  ? reset_hung_task_detector+0x20/0x20
[  861.568863]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.569598]  ret_from_fork+0x3a/0x50
[  861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7:
[  861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
[  861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
[  861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
[  861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
[  861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
[  861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
[  861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120
[  861.572428] Kernel panic - not syncing: hung_task: blocked tasks

So: fix this by making it so that normal hotplug handling /only/ happens
so long as the GPU is currently awake without any pending runtime PM
requests. In the event that a hotplug occurs while the device is
suspending or resuming, we can simply defer our response until the GPU
is fully runtime resumed again.

Changes since v4:
- Use a new trick I came up with using pm_runtime_get() instead of the
  hackish junk we had before

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
..
accessibility
acpi libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
amba
android
ata Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2018-08-24 13:20:33 -07:00
atm
auxdisplay Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
base Driver core patches for 4.19-rc1 2018-08-18 11:44:53 -07:00
bcma
block Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax 2018-08-26 11:48:42 -07:00
bluetooth Bluetooth: mediatek: pass correct size to h4_recv_buf() 2018-08-13 15:59:39 +02:00
bus ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
cdrom
char RTC for 4.19 2018-08-20 16:30:27 -07:00
clk ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
clocksource RISC-V Updates for the 4.19 Merge Window 2018-08-19 09:56:38 -07:00
connector
cpufreq ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
cpuidle More power management updates for 4.19-rc1 2018-08-22 07:42:36 -07:00
crypto Merge branch 'akpm' (patches from Andrew) 2018-08-23 19:20:12 -07:00
dax libnvdimm-for-4.19_dax-memory-failure 2018-08-25 18:43:59 -07:00
dca
devfreq Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
dio
dma Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax 2018-08-26 11:48:42 -07:00
dma-buf
edac EDAC: Add missing MEM_LRDDR4 entry in edac_mem_types[] 2018-08-17 15:13:34 +02:00
eisa
extcon
firewire firewire: use 64-bit time_t based interfaces 2018-08-17 16:20:27 -07:00
firmware fbdev changes for v4.19: 2018-08-23 15:44:58 -07:00
fmc
fpga
fsi
gnss
gpio - New Drivers 2018-08-20 15:38:44 -07:00
gpu drm/nouveau: Fix deadlocks in nouveau_connector_detect() 2018-09-07 06:54:26 +10:00
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid 2018-08-20 15:59:01 -07:00
hsi
hv
hwmon ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
hwspinlock
hwtracing drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t 2018-08-23 18:48:43 -07:00
i2c i2c: don't use any __deprecated handling anymore 2018-08-24 17:26:43 +02:00
ide Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide 2018-08-22 07:40:33 -07:00
idle
iio treewide: convert ISO_8859-1 text comments to utf-8 2018-08-23 18:48:43 -07:00
infiniband Second merge window update 2018-08-23 15:34:48 -07:00
input ARM: 32-bit SoC platform updates 2018-08-23 13:44:43 -07:00
iommu ARM: SoC: late updates 2018-08-25 14:12:36 -07:00
ipack
irqchip Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-26 09:55:28 -07:00
isdn isdn: Disable IIOCDBGVAR 2018-08-16 12:26:24 -07:00
leds
lightnvm
macintosh macintosh: therm_windtunnel: drop using attach_adapter 2018-08-24 14:42:42 +02:00
mailbox mailbox: Add support for i.MX messaging unit 2018-08-15 09:53:07 +05:30
mcb
md libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
media Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax 2018-08-26 11:48:42 -07:00
memory ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
memstick
message
mfd Merge branch 'i2c/for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2018-08-21 17:40:46 -07:00
misc Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax 2018-08-26 11:48:42 -07:00
mmc
mtd This pull request contains updates for both UBI and UBIFS: 2018-08-23 15:58:04 -07:00
mux
net ARM: 32-bit SoC platform updates 2018-08-23 13:44:43 -07:00
nfc
ntb
nubus
nvdimm libnvdimm-for-4.19_dax-memory-failure 2018-08-25 18:43:59 -07:00
nvme Merge branch 'linus/master' into rdma.git for-next 2018-08-16 14:21:29 -06:00
nvmem
of Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
opp
oprofile
parisc
parport Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
pci Merge branch 'akpm' (patches from Andrew) 2018-08-22 12:34:08 -07:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
phy
pinctrl - New Drivers 2018-08-20 15:38:44 -07:00
platform platform-drivers-x86 for v4.19-1 2018-08-22 14:14:15 -07:00
pnp
power treewide: convert ISO_8859-1 text comments to utf-8 2018-08-23 18:48:43 -07:00
powercap
pps
ps3
ptp Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
pwm pwm: mediatek: Add MT7628 support 2018-08-20 11:36:07 +02:00
rapidio drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md 2018-08-22 10:52:51 -07:00
ras
regulator - New Drivers 2018-08-20 15:38:44 -07:00
remoteproc remoteproc/davinci: use the reset framework 2018-08-16 17:39:55 -07:00
reset ARM: SoC: late updates 2018-08-25 14:12:36 -07:00
rpmsg
rtc RTC for 4.19 2018-08-20 16:30:27 -07:00
s390 libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
sbus
scsi Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax 2018-08-26 11:48:42 -07:00
sfi
sh
siox
slimbus
sn
soc ARM: Device-tree updates 2018-08-23 14:02:22 -07:00
soundwire
spi hwspinlock updates for v4.19 2018-08-18 16:45:27 -07:00
spmi
ssb ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUG 2018-08-09 18:47:47 +03:00
staging ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
target Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax 2018-08-26 11:48:42 -07:00
tc
tee ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2018-08-24 13:03:51 -07:00
thunderbolt
tty powerpc fixes for 4.19 #2 2018-08-24 09:34:23 -07:00
uio Char/Misc fix for 4.19-rc1 2018-08-19 09:30:44 -07:00
usb ARM: SoC driver updates 2018-08-23 13:52:46 -07:00
uwb
vfio powerpc updates for 4.19 2018-08-17 11:32:50 -07:00
vhost virtio, vhost: fixes, tweaks 2018-08-24 08:45:19 -07:00
video fbdev changes for v4.19: 2018-08-23 15:44:58 -07:00
virt
virtio virtio, vhost: fixes, tweaks 2018-08-24 08:45:19 -07:00
visorbus
vlynq
vme
w1 power supply and reset changes for the v4.19 series 2018-08-21 18:06:27 -07:00
watchdog include/linux/compiler*.h: make compiler-*.h mutually exclusive 2018-08-22 17:31:34 -07:00
xen xen: fixes and cleanups for 4.19-rc1, second round 2018-08-23 14:52:23 -07:00
zorro
Kconfig
Makefile Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00