linux-uconsole/drivers/pci
Johannes Thumshirn ac6e42d7a7 PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)
commit e42010d820 upstream.

Per PCIe spec r3.0, sec 2.3.1.1, the Read Completion Boundary (RCB)
determines the naturally aligned address boundaries on which a Read Request
may be serviced with multiple Completions:

  - For a Root Complex, RCB is 64 bytes or 128 bytes.
    This value is reported in the Link Control Register.

    Note: Bridges and Endpoints may implement a corresponding command bit
    which may be set by system software to indicate the RCB value for the
    Root Complex, allowing the Bridge/Endpoint to optimize its behavior
    when the Root Complex’s RCB is 128 bytes.

  - For all other system elements, RCB is 128 bytes

Per sec 7.8.7, if a Root Port only supports a 64-byte RCB, the RCB bit of
all downstream devices must be clear, indicating an RCB of 64 bytes.  If
the Root Port supports a 128-byte RCB, we may optionally set the RCB bit of
downstream devices so they know they can generate larger Completions.

Some BIOSes supply an _HPX that tells us to set RCB, even though the Root
Port doesn't have RCB set, which may lead to Malformed TLP errors if the
Endpoint generates completions larger than the Root Port can handle.

The IBM x3850 X6 with BIOS version -[A8E120CUS-1.30]- 08/22/2016 supplies
such an _HPX and a Mellanox MT27500 ConnectX-3 device fails to initialize:

  mlx4_core 0000:41:00.0: command 0xfff timed out (go bit not cleared)
  mlx4_core 0000:41:00.0: device is going to be reset
  mlx4_core 0000:41:00.0: Failed to obtain HW semaphore, aborting
  mlx4_core 0000:41:00.0: Fail to reset HCA
  ------------[ cut here ]------------
  kernel BUG at drivers/net/ethernet/mellanox/mlx4/catas.c:193!

After 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
and 7a1562d4f2 ("PCI: Apply _HPX Link Control settings to all devices
with a link"), we apply _HPX settings to *all* devices, not just those
hot-added after boot.

Before 7a1562d4f2, we didn't touch the Mellanox RCB, and the device
worked.  After 7a1562d4f2, we set its RCB to 128, and it failed.

Set the RCB to 128 iff the Root Port supports a 128-byte RCB.  Otherwise,
set RCB to 64 bytes.  This effectively ignores what _HPX tells us about
RCB.
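
The rule above can be sketched in plain C as a rough illustration.  This is
not the code from the patch: struct sim_dev, root_rcb_set() and
apply_hpx_lnkctl() are hypothetical names, and the device model is a
userspace stand-in for what the kernel reads through its PCIe capability
accessors.  The only value taken from the real headers is the RCB bit in
Link Control (bit 3, 0x0008, PCI_EXP_LNKCTL_RCB in the kernel).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* RCB bit in the PCIe Link Control register (bit 3). */
#define LNKCTL_RCB 0x0008

/*
 * Hypothetical stand-in for a PCIe device: its current Link Control
 * value and a pointer to the Root Port above it (NULL if none is found).
 */
struct sim_dev {
	uint16_t lnkctl;
	const struct sim_dev *root_port;
};

/* True only if a Root Port exists and reports a 128-byte RCB. */
static bool root_rcb_set(const struct sim_dev *dev)
{
	const struct sim_dev *rp = dev->root_port;

	return rp != NULL && (rp->lnkctl & LNKCTL_RCB) != 0;
}

/*
 * Apply the _HPX-style AND/OR masks to Link Control, then override the
 * RCB bit: set it iff the Root Port has it set, otherwise clear it.
 * Whatever the masks say about RCB is ignored.
 */
static uint16_t apply_hpx_lnkctl(const struct sim_dev *dev,
				 uint16_t and_mask, uint16_t or_mask)
{
	uint16_t val = (uint16_t)((dev->lnkctl & and_mask) | or_mask);

	val &= (uint16_t)~LNKCTL_RCB;
	if (root_rcb_set(dev))
		val |= LNKCTL_RCB;

	return val;
}
```

In this sketch, an Endpoint below a Root Port whose RCB bit is clear ends up
with RCB clear even when the OR mask asks for it to be set, which is the
x3850 X6 case described above.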

Note that this change only affects _HPX handling.  If we have no _HPX, this
does nothing with RCB.

[bhelgaas: changelog, clear RCB if not set for Root Port]
Fixes: 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
Fixes: 7a1562d4f2 ("PCI: Apply _HPX Link Control settings to all devices with a link")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=187781
Tested-by: Frank Danapfel <fdanapfe@redhat.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08 07:15:24 +01:00
host PCI: keystone: Fix MSI code that retrieves struct pcie_port pointer 2016-03-09 15:34:49 -08:00
hotplug ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot() 2016-03-03 15:07:24 -08:00
pcie PCI: Export pcie_find_root_port 2016-12-08 07:15:24 +01:00
access.c PCI: Use function 0 VPD for identical functions, regular VPD for others 2015-09-24 17:06:32 -05:00
ats.c PCI: Remove pci_ats_enabled() 2015-08-13 15:59:59 -05:00
bus.c PCI: Fix minimum allocation address overwrite 2016-02-17 12:30:56 -08:00
host-bridge.c Merge branch 'pci/misc' into next 2015-04-10 08:27:18 -05:00
hotplug-pci.c
htirq.c x86/htirq: Use hierarchical irqdomain to manage Hypertransport interrupts 2015-04-24 15:36:50 +02:00
iov.c Merge branches 'pci/aer', 'pci/hotplug', 'pci/misc', 'pci/msi', 'pci/resource' and 'pci/virtualization' into next 2015-11-02 15:57:03 -06:00
irq.c
Kconfig PCI,parisc: Enable 64-bit bus addresses on PA-RISC 2015-09-08 15:30:47 +02:00
Makefile PCI: Build setup-irq.o for arm64 2015-08-20 12:02:49 -05:00
msi.c genirq/msi: Make sure PCI MSIs are activated early 2016-09-07 08:32:38 +02:00
of.c PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing 2015-10-16 13:07:14 +01:00
pci-acpi.c PCI / ACPI: Fix pci_acpi_optimize_delay() comment 2015-07-15 15:11:50 -05:00
pci-driver.c PCI / PM: Tune down retryable runtime suspend error messages 2015-12-02 15:24:21 +01:00
pci-label.c PCI: Make a shareable UUID for PCI firmware ACPI _DSM 2015-04-08 14:39:30 -05:00
pci-stub.c
pci-sysfs.c PCI: Support PCIe devices with short cfg_size 2016-09-07 08:32:37 +02:00
pci.c PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr() 2016-03-16 08:42:58 -07:00
pci.h ARM/PCI: Move align_resource function pointer to pci_host_bridge structure 2015-11-25 13:23:38 -06:00
probe.c PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX) 2016-12-08 07:15:24 +01:00
proc.c
quirks.c PCI: Mark Atheros AR9580 to avoid bus reset 2016-10-28 03:01:26 -04:00
remove.c PCI: Embed ATS info directly into struct pci_dev 2015-08-13 15:57:21 -05:00
rom.c PCI: Fix infinite loop with ROM image of size 0 2015-01-23 17:42:59 -06:00
search.c PCI: Delete unnecessary NULL pointer checks 2014-11-10 21:02:17 -07:00
setup-bus.c PCI: Handle IORESOURCE_PCI_FIXED when assigning resources 2015-10-29 17:35:39 -05:00
setup-irq.c PCI: Export symbols required for loadable host driver modules 2015-04-08 14:17:10 -05:00
setup-res.c Merge branches 'pci/aer', 'pci/hotplug', 'pci/misc', 'pci/msi', 'pci/resource' and 'pci/virtualization' into next 2015-11-02 15:57:03 -06:00
slot.c PCI: Hold pci_slot_mutex while searching bus->slots list 2015-07-30 16:19:53 -05:00
syscall.c
vc.c PCI: Use dev->has_secondary_link to find downstream PCIe links 2015-05-29 15:35:26 -05:00
vpd.c
xen-pcifront.c xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. 2016-03-03 15:07:30 -08:00