* pci/host-mvebu:
PCI: mvebu: Remove duplicate of_clk_get_by_name() call
PCI: mvebu: Support a bridge with no IO port window
PCI: mvebu: Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits
PCI: mvebu: Drop writes to bridge Secondary Status register
* pci/deletion:
PCI: Remove from bus_list and release resources in pci_release_dev()
PCI: Move pci_proc_attach_device() to pci_bus_add_device()
PCI: Use device_release_driver() in pci_stop_root_bus()
PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()
Conflicts:
drivers/pci/remove.c
Previously we removed the pci_dev from the bus_list and released its
resources in pci_destroy_dev(). But that's too early: it's possible to
call pci_destroy_dev() twice for the same device (e.g., via sysfs), and
that will cause an oops when we try to remove it from bus_list the second
time.
We should remove it from the bus_list only when the last reference to the
pci_dev has been released, i.e., in pci_release_dev().
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4f535093cf ("PCI: Put pci_dev in device tree as early as possible")
moved pci_proc_attach_device() from pci_bus_add_device() to
pci_device_add().
This moves it back to pci_bus_add_device(), essentially reverting that
part of 4f535093cf. This makes it symmetric with pci_stop_dev(),
where we call pci_proc_detach_device() and pci_remove_sysfs_dev_files()
and set dev->is_added = 0.
[bhelgaas: changelog, create sysfs then attach proc for symmetry]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
To be consistent with 4bff674990 ("PCI: Move device_del() from
pci_stop_dev() to pci_destroy_dev()"), this changes pci_stop_root_bus()
to use device_release_driver() instead of device_del().
This also changes pci_remove_root_bus() to use device_unregister()
instead of put_device() so it corresponds with the device_register()
call in pci_create_root_bus().
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After commit bcdde7e221 (sysfs: make __sysfs_remove_dir() recursive)
I'm seeing traces analogous to the one below in Thunderbolt testing:
WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0()
sysfs group ffffffff81c6c500 not found for kobject '0000:08'
Modules linked in: ...
CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76
Hardware name: Acer Aspire S5-391/Venus , BIOS V1.02 05/29/2012
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007
ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800
0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800
Call Trace:
[<ffffffff816b23bf>] dump_stack+0x4e/0x71
[<ffffffff81046607>] warn_slowpath_common+0x87/0xb0
[<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50
[<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80
[<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0
[<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50
[<ffffffff81495818>] device_del+0x58/0x1c0
[<ffffffff814959c8>] device_unregister+0x48/0x60
[<ffffffff813254fe>] pci_remove_bus+0x6e/0x80
[<ffffffff81325548>] pci_remove_bus_device+0x38/0x110
[<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110
[<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20
[<ffffffff813418d0>] disable_slot+0x20/0xe0
[<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0
[<ffffffff813427ad>] hotplug_event+0x17d/0x220
[<ffffffff81342880>] hotplug_event_work+0x30/0x70
[<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24
[<ffffffff81061331>] process_one_work+0x261/0x450
[<ffffffff81061a7e>] worker_thread+0x21e/0x370
[<ffffffff81061860>] ? rescuer_thread+0x300/0x300
[<ffffffff81068342>] kthread+0xd2/0xe0
[<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
[<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0
[<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
(Mika Westerberg sees them too in his tests).
Some investigation documented in kernel bug #65281 led me to the
conclusion that the source of the problem is the device_del() in
pci_stop_dev() as it now causes the sysfs directory of the device to be
removed recursively along with all of its subdirectories. That includes
the sysfs directory of the device's subordinate bus (dev->subordinate) and
its "power" group.
Consequently, when pci_remove_bus() is called for dev->subordinate in
pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this
point the sysfs directory of bus->dev doesn't exist any more and its
"power" group doesn't exist either. Thus, when dpm_sysfs_remove() called
from device_del() tries to remove that group, it triggers the above
warning.
That indicates a logical mistake in the design of
pci_stop_and_remove_bus_device(), which causes bus device objects to be
left behind their parents (bridge device objects) and can be fixed by
moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so
pci_remove_bus() can be called for the device's subordinate bus before the
device itself is unregistered from the hierarchy. Still, the driver, if
any, should be detached from the device in pci_stop_dev(), so use
device_release_driver() directly from there.
References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
aer_hest_parse() and aer_hest_parse_aff() are almost identical. We use
aer_hest_parse() to check the ACPI_HEST_FIRMWARE_FIRST flag for a specific
device, and we use aer_hest_parse_aff() to check to see if any device sets
the flag.
This drops aer_hest_parse_aff() and enhances aer_hest_parse() so it
collects the union of the PCIe ACPI_HEST_FIRMWARE_FIRST flag settings when
no specific device is supplied.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Betty Dall <betty.dall@hp.com>
aer_set_firmware_first() searches the HEST for an error source descriptor
matching the specified PCI device. It uses the apei_hest_parse() iterator
to call aer_hest_parse() for every descriptor in the HEST.
Previously, aer_hest_parse() incorrectly assumed every descriptor was for a
PCIe error source. This patch adds a check to avoid that error.
[bhelgaas: factor check into helper, use in aer_hest_parse_aff(), changelog]
Signed-off-by: Betty Dall <betty.dall@hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
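The two aer_hest_parse() changes above (checking the descriptor type
before treating it as a PCIe error source, and collecting the union of the
firmware-first flags) can be sketched in plain userspace C. This is only
an illustration: struct hest_entry, is_pcie_aer_source() and
any_firmware_first() are hypothetical names, and the type numbers and the
flag bit are assumed to follow the ACPI HEST layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed HEST source type numbers (hypothetical local names). */
enum {
	HEST_TYPE_AER_ROOT_PORT = 6,
	HEST_TYPE_AER_ENDPOINT  = 7,
	HEST_TYPE_AER_BRIDGE    = 8,
	HEST_TYPE_GENERIC_ERROR = 9,
};

#define HEST_FIRMWARE_FIRST 0x1	/* assumed flag bit */

/* Hypothetical, heavily simplified error source descriptor. */
struct hest_entry {
	uint16_t type;
	uint8_t flags;
};

/* Only these descriptor types describe PCIe AER error sources. */
static bool is_pcie_aer_source(const struct hest_entry *e)
{
	assert(e);
	return e->type == HEST_TYPE_AER_ROOT_PORT ||
	       e->type == HEST_TYPE_AER_ENDPOINT ||
	       e->type == HEST_TYPE_AER_BRIDGE;
}

/* Union of the firmware-first flag over PCIe AER descriptors only. */
static bool any_firmware_first(const struct hest_entry *tbl, size_t n)
{
	bool ff = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!is_pcie_aer_source(&tbl[i]))
			continue;	/* skip non-PCIe descriptors */
		if (tbl[i].flags & HEST_FIRMWARE_FIRST)
			ff = true;
	}
	return ff;
}
```

A generic error descriptor that happens to have bit 0 set is ignored by
any_firmware_first(), which is the point of the type check.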
Save one indentation level in aer_print_error() for the generic case where
we have info->status of an error. Disregard the 80-column rule a bit for
the sake of better readability, and fix alignment.
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
... and call it instead of duplicating the large printk format
statement.
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We need to give up the last reference to edev->dev, so we need to call
put_device().
Signed-off-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/yijing-dev_is_pci:
alpha/PCI: Use dev_is_pci() to identify PCI devices
arm/PCI: Use dev_is_pci() to identify PCI devices
arm/PCI: Use dev_is_pci() to identify PCI devices
parisc/PCI: Use dev_is_pci() to identify PCI devices
sparc/PCI: Use dev_is_pci() to identify PCI devices
ia64/PCI: Use dev_is_pci() to identify PCI devices
x86/PCI: Use dev_is_pci() to identify PCI devices
PCI: Use dev_is_pci() to identify PCI devices
* pci/misc:
PCI: Stop clearing bridge Secondary Status when setting up I/O aperture
PCI: Prevent bus conflicts while checking for bridge apertures
PCI: Drop "irq" param from *_restore_msi_irqs()
PCI/portdrv: Remove superfluous name cast
PCI: Clear NumVFs when disabling SR-IOV in sriov_init()
* for-linus:
MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
PCI: Disable Bus Master only on kexec reboot
PCI: mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin
PCI: Omit PCI ID macro strings to shorten quirk names
PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()
Revert "workqueue: allow work_on_cpu() to be called recursively"
PCI: Avoid unnecessary CPU switch when calling driver .probe() method
pci_setup_bridge_io() accessed PCI_IO_BASE and PCI_IO_LIMIT using dword
(32-bit) reads and writes, which also access the Secondary Status register.
Since the Secondary Status register is in the upper 16 bits of the dword,
and we preserved those upper 16 bits, this had the effect of clearing any
of the write-1-to-clear bits that happened to be set in the Secondary
Status register.
That's not what we want, so use word (16-bit) accesses to update only
PCI_IO_BASE and PCI_IO_LIMIT.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
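The pci_setup_bridge_io() problem described above can be reproduced with a
small userspace model. This is a sketch under simplifying assumptions:
struct fake_bridge and the fake_*() accessors are hypothetical, and every
bit of the modeled Secondary Status register is treated as
write-1-to-clear.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the dword at config offset 0x1c of a bridge:
 * the low 16 bits hold I/O Base and I/O Limit (plain read/write), the
 * high 16 bits hold Secondary Status (modeled as all RW1C).
 */
struct fake_bridge {
	uint16_t io_base_limit;
	uint16_t sec_status;
};

static uint32_t fake_read_dword(const struct fake_bridge *b)
{
	return ((uint32_t)b->sec_status << 16) | b->io_base_limit;
}

static void fake_write_dword(struct fake_bridge *b, uint32_t val)
{
	b->io_base_limit = (uint16_t)(val & 0xffff);
	/* RW1C: every 1 written to the status half clears that bit. */
	b->sec_status &= (uint16_t)~(val >> 16);
}

static void fake_write_word(struct fake_bridge *b, uint16_t val)
{
	b->io_base_limit = val;
}

/* Old approach: dword read-modify-write that "preserves" the top half. */
static void setup_io_dword(struct fake_bridge *b, uint16_t io)
{
	uint32_t l;

	assert(b);
	l = fake_read_dword(b);
	l = (l & 0xffff0000) | io;
	fake_write_dword(b, l);	/* writes back any set status bits */
}

/* New approach: a 16-bit access that never touches Secondary Status. */
static void setup_io_word(struct fake_bridge *b, uint16_t io)
{
	assert(b);
	fake_write_word(b, io);
}
```

Starting from sec_status = 0x4000, setup_io_dword() leaves the status at 0
while setup_io_word() leaves it at 0x4000; both store the same I/O base
and limit.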
pci_bridge_check_ranges() determines whether the bridge supports an I/O
aperture and a prefetchable memory aperture.
Previously, if the I/O aperture was unsupported, disabled, or configured at
[io 0x0000-0x0fff], we wrote 0xf0 to PCI_IO_BASE and PCI_IO_LIMIT, which,
if the bridge supports it, enables the I/O aperture at [io 0xf000-0xffff].
The enabled aperture may conflict with other devices in the system.
Similarly, we wrote 0xfff0 to PCI_PREF_MEMORY_BASE and
PCI_PREF_MEMORY_LIMIT, which enables the prefetchable memory aperture at
[mem 0xfff00000-0xffffffff], and that may also conflict with other devices.
All we need to know is whether the base and limit registers are writable,
so we can use values that leave the apertures disabled, e.g., PCI_IO_BASE =
0xf0, PCI_IO_LIMIT = 0xe0, PCI_PREF_MEMORY_BASE = 0xfff0,
PCI_PREF_MEMORY_LIMIT = 0xffe0.
Writing non-zero values to both the base and limit registers means we
detect whether either or both are writable, as we did before.
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Based-on-patch-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
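The probe values discussed above for pci_bridge_check_ranges() can be
checked with a little arithmetic. The sketch below decodes base/limit
register values into address ranges, assuming the usual bridge encoding
(I/O registers carry address bits 15:12 in their upper nibble,
prefetchable memory registers carry address bits 31:20 in bits 15:4).
struct aperture, decode_io() and decode_pref() are hypothetical helpers,
not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct aperture {
	uint32_t start;
	uint32_t end;
	bool enabled;	/* a bridge forwards only when start <= end */
};

/* Decode 8-bit I/O Base/Limit register values (4 KB granularity). */
static struct aperture decode_io(uint8_t base, uint8_t limit)
{
	struct aperture a;

	a.start = (uint32_t)(base & 0xf0) << 8;
	a.end = ((uint32_t)(limit & 0xf0) << 8) | 0xfff;
	a.enabled = a.start <= a.end;
	return a;
}

/* Decode 16-bit Prefetchable Memory Base/Limit values (1 MB granularity). */
static struct aperture decode_pref(uint16_t base, uint16_t limit)
{
	struct aperture a;

	a.start = (uint32_t)(base & 0xfff0) << 16;
	a.end = ((uint32_t)(limit & 0xfff0) << 16) | 0xfffff;
	a.enabled = a.start <= a.end;
	assert(a.end >= 0xfffff);	/* low 20 bits of the limit are all 1s */
	return a;
}
```

With the old probe values, decode_io(0xf0, 0xf0) yields the enabled range
[io 0xf000-0xffff]. With the new ones, decode_io(0xf0, 0xe0) yields a base
of 0xf000 above a limit of 0xefff, so the aperture stays disabled, and the
same holds for decode_pref(0xfff0, 0xffe0).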
Change x86_msi.restore_msi_irqs(struct pci_dev *dev, int irq) to
x86_msi.restore_msi_irqs(struct pci_dev *dev).
restore_msi_irqs() restores multiple MSI-X IRQs, so param 'int irq' is
unneeded. This makes code more consistent between VM and bare metal.
Dom0 MSI-X restore code can also be optimized as Xen only has a hypercall
to restore all MSI-X vectors at one time.
Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of equivalent local function.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use dev_is_pci() instead of checking bus type directly.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Probably due to a merge conflict resolution gone bad, the PCI clock is
obtained twice. Remove the redundant call to of_clk_get_by_name().
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
If runtime PM is enabled in the kernel config, the PCI clocks are not
forced on at start-up, and thus, are never enabled. Use
pm_runtime_get_sync() to enable the clocks.
While at it, use dev_info() instead of pr_info() since now we have the
device pointer available in the PCI setup callback.
Signed-off-by: Valentine Barshak <valentine.barshak@cogentembedded.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There is no need to use 'goto err' as we can directly return the errors.
No functional change.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
write_msi_msg() does exactly the same so there is no need to explicitly
call pci_write_config_word() and do the same twice.
Tested-by: Mohit Kumar <mohit.kumar@st.com>
Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marek Vasut <marex@denx.de>
Acked-by: Jingoo Han <jg1.han@samsung.com>
904d0e7889 ("PCI: designware: Add irq_create_mapping()") resulted in
pre-allocated irq descs. The problem was that in assign_irq() these descs were
explicitly allocated and hence also freed, resulting in a crash. We also
need to clear the entire irq range in teardown. With this commit the
teardown basically does exactly the opposite of what was done in setup.
The crash this fixes looks like:
Unable to handle kernel NULL pointer dereference at virtual address 00000020
PC is at dw_msi_teardown_irq+0x40/0x118
LR is at trace_hardirqs_on_caller+0xf4/0x1c0
Backtrace:
[<802c401c>] (dw_msi_teardown_irq+0x0/0x118) from [<802c1844>] (arch_teardown_msi_irq+0x3c/0x40)
[<802c1808>] (arch_teardown_msi_irq+0x0/0x40) from [<802c1a08>] (default_teardown_msi_irqs+0x68/0x84)
[<802c19a0>] (default_teardown_msi_irqs+0x0/0x84) from [<802c1a34>] (arch_teardown_msi_irqs+0x10/0x14)
[<802c1a24>] (arch_teardown_msi_irqs+0x0/0x14) from [<802c1ad0>] (free_msi_irqs+0x98/0x144)
[<802c1a38>] (free_msi_irqs+0x0/0x144) from [<802c2570>] (pci_disable_msi+0x48/0x60)
[<802c2528>] (pci_disable_msi+0x0/0x60) from [<7f0057d4>] (sxdma_irq_free+0x44/0x48 [sxdma])
[bhelgaas: add crash info]
Tested-by: Mohit Kumar <mohit.kumar@st.com>
Signed-off-by: Bjørn Erik Nilsen <ben@datarespons.no>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marek Vasut <marex@denx.de>
Acked-by: Jingoo Han <jg1.han@samsung.com>
When using devm_ioremap_resource(), we do not need to check the return
value of platform_get_resource(), so just remove it.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marek Vasut <marex@denx.de>
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
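Why a NULL test misses a devm_ioremap_resource() failure can be shown with
a userspace re-creation of the kernel's error-pointer helpers. The
ERR_PTR(), PTR_ERR() and IS_ERR() definitions below are simplified
stand-ins written for this sketch, and fake_ioremap_resource() is a
hypothetical substitute for the real call.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's err.h helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in: returns ERR_PTR() on failure, never NULL. */
static void *fake_ioremap_resource(int fail)
{
	static char regs[16];

	return fail ? ERR_PTR(-ENOMEM) : (void *)regs;
}

/* Buggy pattern: an ERR_PTR() value is non-NULL, so the error is missed. */
static int probe_null_check(int fail)
{
	void *base = fake_ioremap_resource(fail);

	if (!base)
		return -ENOMEM;
	return 0;
}

/* Fixed pattern: IS_ERR() catches the failure, PTR_ERR() recovers it. */
static int probe_is_err(int fail)
{
	void *base = fake_ioremap_resource(fail);

	if (IS_ERR(base))
		return (int)PTR_ERR(base);
	assert(base);
	return 0;
}
```

On a simulated failure, probe_null_check() reports success and would go on
to dereference a bogus pointer, while probe_is_err() returns -ENOMEM.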
Tegra20 and Tegra30 do not support gen2 PCIe, so correct the
register setting to disable it.
Signed-off-by: Eric Brower <ebrower@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Add a flag to tell the PCI subsystem that the kernel is shutting down in
preparation to kexec a kernel. Add code in the PCI subsystem to use this
flag to clear the Bus Master bit on PCI devices only in case of kexec reboot.
This fixes a power-off problem on Acer Aspire V5-573G and likely other
machines and avoids any other issues caused by clearing Bus Master bit on
PCI devices in normal shutdown path. The problem was introduced by
b566a22c23 ("PCI: disable Bus Master on PCI device shutdown").
This patch is based on discussion at
http://marc.info/?l=linux-pci&m=138425645204355&w=2
Link: https://bugzilla.kernel.org/show_bug.cgi?id=63861
Reported-by: Chang Liu <cl91tp@gmail.com>
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: stable@vger.kernel.org # v3.5+
Make pcie-io-aperture and the IO port MBUS ID in ranges optional. If not
provided the bridge reports to Linux that IO space mapping is not supported
and refuses to configure an IO MBUS window.
This allows both complete disable (do not specify pcie-io-aperture) and
per-port disable (do not specify an IO target ranges entry for the port).
Most PCIe devices these days do not require IO support to function, so
having an option to disable it in the driver is useful.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
When PCI_COMMAND_MEMORY/PCI_COMMAND_IO are cleared, the bridge should not
allocate windows or even look at the window limit/base registers.
Otherwise we may set up bogus windows while the PCI core code performs
discovery. The core will leave PCI_COMMAND_IO cleared if it doesn't need
an IO window.
Have mvebu_pcie_handle_*_change respect the bits, and call the change
function whenever the bits change.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
The emulated bridge does not support interrupts, so it should return the
value 0 for Interrupt Line and Interrupt Pin. This indicates that
interrupts are not supported.
Since Max_Lat and Min_Gnt are also in the same 32-bit word, we return
0 for them, which means "do not care."
This corrects an error message from the kernel:
pci 0000:00:01.0: of_irq_parse_pci() failed with rc=135
Which is due to the default return of 0xFFFFFFFF indicating that
interrupts are supported.
The error message regression was caused by 16b84e5a50 ("of/irq: Create
of_irq_parse_and_map_pci() to consolidate arch code.")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
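The effect of returning 0 instead of 0xFFFFFFFF for the Interrupt
Line/Interrupt Pin dword can be illustrated by decoding that dword. The
field layout below follows the one named in the text above (Interrupt
Line, Interrupt Pin, Min_Gnt, Max_Lat from the lowest byte up);
struct intr_regs, decode_intr() and claims_intx() are hypothetical names
used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of the 32-bit word at config offset 0x3c, lowest byte first. */
struct intr_regs {
	uint8_t line;		/* Interrupt Line */
	uint8_t pin;		/* Interrupt Pin: 0 = none, 1..4 = INTA..INTD */
	uint8_t min_gnt;	/* Min_Gnt */
	uint8_t max_lat;	/* Max_Lat */
};

static struct intr_regs decode_intr(uint32_t dword)
{
	struct intr_regs r;

	r.line    = (uint8_t)(dword & 0xff);
	r.pin     = (uint8_t)((dword >> 8) & 0xff);
	r.min_gnt = (uint8_t)((dword >> 16) & 0xff);
	r.max_lat = (uint8_t)((dword >> 24) & 0xff);
	return r;
}

/* A non-zero Interrupt Pin tells software the function uses INTx. */
static bool claims_intx(uint32_t dword)
{
	struct intr_regs r = decode_intr(dword);

	assert(r.pin == ((dword >> 8) & 0xff));
	return r.pin != 0;
}
```

claims_intx(0xFFFFFFFF) is true because the pin field reads 0xff, which is
what led to the interrupt parsing attempt and the error message;
claims_intx(0) is false.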
There are no writable bits in the secondary status register, only RO and
RW1C (write-1-to-clear) bits. The driver never sets any of the RW1C bits,
so the status register should always read as 0. Remove the assignment from
the write path.
Someday the RW1C bits should be copied/cleared directly from registers in
the HW.
[bhelgaas: changelog tweaks]
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Pasting the verbatim PCI_(VENDOR|DEVICE)_* macros in the __pci_fixup_*
symbol names results in insanely long names such as
__pci_fixup_resumePCI_VENDOR_ID_SERVERWORKSPCI_DEVICE_ID_SERVERWORKS_HT1000SBquirk_disable_broadcom_boot_interrupt
When Link-Time Optimization adds its numeric suffix to such a symbol, it
overflows the namebuf[KSYM_NAME_LEN] array in kernel/kallsyms.c. Use the
line number instead to create (nearly) unique symbol names.
Reported-by: Joe Mario <jmario@redhat.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
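The naming change for the __pci_fixup_* symbols comes down to preprocessor
token pasting. The sketch below is not the kernel's macro: LONG_NAME(),
SHORT_NAME() and the "__fixup_resume" prefix are hypothetical, and the
names are turned into strings so their lengths can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR_(x) #x
#define STR(x) STR_(x)
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)

/*
 * Old style: operands of ## are pasted verbatim, so the macro *names*
 * (not their numeric values) end up in the symbol.
 */
#define LONG_NAME(vendor, device, hook) \
	STR(__fixup_resume##vendor##device##hook)

/* New style: the line number replaces the vendor/device macro names. */
#define SHORT_NAME(hook) \
	STR(PASTE(PASTE(__fixup_resume, __LINE__), hook))

static const char *long_name(void)
{
	return LONG_NAME(PCI_VENDOR_ID_SERVERWORKS,
			 PCI_DEVICE_ID_SERVERWORKS_HT1000SB,
			 quirk_disable_broadcom_boot_interrupt);
}

static const char *short_name(void)
{
	return SHORT_NAME(quirk_disable_broadcom_boot_interrupt);
}

/* Number of characters saved by the line-number scheme. */
static size_t chars_saved(void)
{
	assert(strlen(long_name()) > strlen(short_name()));
	return strlen(long_name()) - strlen(short_name());
}
```

Two fixups declared on the same line number in different files would get
the same name under this scheme, which is why the result is only "nearly"
unique.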
This reverts commit c2fda50966.
c2fda50966 removed lockdep annotation from work_on_cpu() to work around
the PCI path that calls work_on_cpu() from within a work_on_cpu() work item
(PF driver .probe() method -> pci_enable_sriov() -> add VFs -> VF driver
.probe method).
961da7fb6b22 ("PCI: Avoid unnecessary CPU switch when calling driver
.probe() method") avoids that recursive work_on_cpu() use in a different
way, so this revert restores the work_on_cpu() lockdep annotation.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Tejun Heo <tj@kernel.org>