linux-uconsole/drivers/base
Zhang Wensheng a93f33aeef driver core: fix potential deadlock in __driver_attach
[ Upstream commit 70fe758352 ]

In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").

stack like commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
list below:
    In __driver_attach function, The lock holding logic is as follows:
    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* when fail or work limit, sync to execute func, but
                 __driver_attach_async_helper will get lock dev as
                 will, which will lead to A-A deadlock.  */
              if (!entry || atomic_read(&entry_count) > MAX_WORK) {
                func;
              else
                queue_work_node(node, system_unbound_wq, &entry->work)
      device_unlock(dev)

    As above show, when it is allowed to do async probes, because of
    out of memory or work limit, async work is not be allowed, to do
    sync execute instead. it will lead to A-A deadlock because of
    __driver_attach_async_helper getting lock dev.

Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:

[  370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid:
0 flags:0x00004000
[  370.788865] Call Trace:
[  370.789374]  <TASK>
[  370.789841]  __schedule+0x482/0x1050
[  370.790613]  schedule+0x92/0x1a0
[  370.791290]  schedule_preempt_disabled+0x2c/0x50
[  370.792256]  __mutex_lock.isra.0+0x757/0xec0
[  370.793158]  __mutex_lock_slowpath+0x1f/0x30
[  370.794079]  mutex_lock+0x50/0x60
[  370.794795]  __device_driver_lock+0x2f/0x70
[  370.795677]  ? driver_probe_device+0xd0/0xd0
[  370.796576]  __driver_attach_async_helper+0x1d/0xd0
[  370.797318]  ? driver_probe_device+0xd0/0xd0
[  370.797957]  async_schedule_node_domain+0xa5/0xc0
[  370.798652]  async_schedule_node+0x19/0x30
[  370.799243]  __driver_attach+0x246/0x290
[  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[  370.800548]  bus_for_each_dev+0x9d/0x130
[  370.801132]  driver_attach+0x22/0x30
[  370.801666]  bus_add_driver+0x290/0x340
[  370.802246]  driver_register+0x88/0x140
[  370.802817]  ? virtio_scsi_init+0x116/0x116
[  370.803425]  scsi_register_driver+0x1a/0x30
[  370.804057]  init_sd+0x184/0x226
[  370.804533]  do_one_initcall+0x71/0x3a0
[  370.805107]  kernel_init_freeable+0x39a/0x43a
[  370.805759]  ? rest_init+0x150/0x150
[  370.806283]  kernel_init+0x26/0x230
[  370.806799]  ret_from_fork+0x1f/0x30

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21 15:15:55 +02:00
..
firmware_loader firmware_loader: use kernel credentials when reading firmware 2022-05-18 10:23:46 +02:00
power PM: runtime: Redefine pm_runtime_release_supplier() 2022-07-12 16:32:18 +02:00
regmap regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips 2022-06-29 08:59:49 +02:00
test drivers/base: build kunit tests without structleak plugin 2021-03-17 17:06:24 +01:00
arch_topology.c arch_topology: Do not set llc_sibling if llc_id is invalid 2022-05-09 09:04:59 +02:00
attribute_container.c
base.h driver core: add deferring probe reason to devices_deferred property 2020-07-30 09:03:43 +02:00
bus.c driver: base: fix UAF when driver_attach failed 2022-06-14 18:32:34 +02:00
cacheinfo.c drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_show 2020-10-02 13:24:40 +02:00
class.c drivers core: Miscellaneous changes for sysfs_emit 2020-10-02 13:12:07 +02:00
component.c
container.c
core.c PM: runtime: Redefine pm_runtime_release_supplier() 2022-07-12 16:32:18 +02:00
cpu.c x86/bugs: Report AMD retbleed vulnerability 2022-07-25 11:26:40 +02:00
dd.c driver core: fix potential deadlock in __driver_attach 2022-08-21 15:15:55 +02:00
devcoredump.c drivers core: Miscellaneous changes for sysfs_emit 2020-10-02 13:12:07 +02:00
devres.c devres: provide devm_krealloc() 2020-09-08 13:32:06 +02:00
devtmpfs.c devtmpfs regression fix: reconfigure on each mount 2022-01-20 09:17:49 +01:00
driver.c drivers: base: Convert to printk alias functions 2020-07-10 14:16:44 +02:00
firmware.c
hypervisor.c
init.c
isa.c
Kconfig drivers: base: default KUNIT_* fragments to KUNIT_ALL_TESTS 2020-06-01 14:24:25 -06:00
Makefile device property: Move fwnode_connection_find_match() under drivers/base/property.c 2020-09-08 13:32:06 +02:00
map.c
memory.c drivers/base/memory: fix an unlikely reference counting issue in __add_memory_block() 2022-06-09 10:21:16 +02:00
module.c
node.c drivers/base/node.c: fix compaction sysfs file leak 2022-06-09 10:21:16 +02:00
pinctrl.c
platform-msi.c
platform.c drivers core: Miscellaneous changes for sysfs_emit 2020-10-02 13:12:07 +02:00
property.c device property: Fix fwnode_graph_devcon_match() fwnode leak 2022-01-27 10:54:25 +01:00
soc.c drivers core: Miscellaneous changes for sysfs_emit 2020-10-02 13:12:07 +02:00
swnode.c software node: fix wrong node passed to find nargs_prop 2022-01-27 10:53:59 +01:00
syscore.c syscore: Use pm_pr_dbg() for syscore_{suspend,resume}() 2020-09-08 13:32:06 +02:00
topology.c drivers core: Miscellaneous changes for sysfs_emit 2020-10-02 13:12:07 +02:00
transport_class.c