Commit graph

520 commits

Author SHA1 Message Date
Gavin Shan
027fa02f84 powerpc/powernv: Don't map M64 segments using M32DT
If M64 has been supported, the prefetchable 64-bits memory resources
shouldn't be mapped to the corresponding PE# via M32DT. Unfortunately,
we're doing that in pnv_ioda_setup_pe_seg() wrongly. The issue was
introduced by commit 262af55 ("powerpc/powernv: Enable M64 aperatus
for PHB3"). The patch fixes the issue by simply skipping M64 resources
when updating to M32DT.

Cc: <stable@vger.kernel.org>  # v3.17+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:10:40 +11:00
Wei Yang
250c7b277c powerpc/pci: Remove unused struct pci_dn.pcidev field
In struct pci_dn, the pcidev field is assigned but not used, so remove it.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:38 +11:00
Wei Yang
02639b0e13 powerpc/powernv: Group VF PE when IOV BAR is big on PHB3
When IOV BAR is big, each is covered by 4 M64 windows.  This leads to
several VF PE sits in one PE in terms of M64.

Group VF PEs according to the M64 allocation.

[bhelgaas: use dev_printk() when possible]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:38 +11:00
Wei Yang
5b88ec2284 powerpc/powernv: Reserve additional space for IOV BAR, with m64_per_iov supported
M64 aperture size is limited on PHB3.  When the IOV BAR is too big, this
will exceed the limitation and failed to be assigned.

Introduce a different mechanism based on the IOV BAR size:

  - if IOV BAR size is smaller than 64MB, expand to total_pe
  - if IOV BAR size is bigger than 64MB, roundup power2

[bhelgaas: make dev_printk() output more consistent, use PCI_SRIOV_NUM_BARS]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:38 +11:00
Wei Yang
781a868f31 powerpc/powernv: Shift VF resource with an offset
On PowerNV platform, resource position in M64 BAR implies the PE# the
resource belongs to. In some cases, adjustment of a resource is necessary
to locate it to a correct position in M64 BAR .

This patch adds pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR
address according to an offset.

Note:

    After doing so, there would be a "hole" in the /proc/iomem when offset
    is a positive value. It looks like the device return some mmio back to
    the system, which actually no one could use it.

[bhelgaas: rework loops, rework overlap check, index resource[]
conventionally, remove pci_regs.h include, squashed with next patch]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:38 +11:00
Wei Yang
5350ab3fd7 powerpc/powernv: Implement pcibios_iov_resource_alignment() on powernv
Implement pcibios_iov_resource_alignment() on powernv platform.

On PowerNV platform, there are 3 cases for the IOV BAR:
1. initial state, the IOV BAR size is multiple times of VF BAR size
2. after expanded, the IOV BAR size is expanded to meet the M64 segment size
3. sizing stage, the IOV BAR is truncated to 0

pnv_pci_iov_resource_alignment() handle these three cases respectively.

[bhelgaas: adjust to drop "align" parameter, return pci_iov_resource_size()
if no ppc_md machdep_call version]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Wei Yang
6e628c7d33 powerpc/powernv: Reserve additional space for IOV BAR according to the number of total_pe
On PHB3, PF IOV BAR will be covered by M64 BAR to have better PE isolation.
M64 BAR is a type of hardware resource in PHB3, which could map a range of
MMIO to PE numbers on powernv platform. And this range is divided equally
by the number of total_pe with each divided range mapping to a PE number.
Also, the M64 BAR must map a MMIO range with power-of-two size.

The total_pe number is usually different from total_VFs, which can lead to
a conflict between MMIO space and the PE number.

For example, if total_VFs is 128 and total_pe is 256, the second half of
M64 BAR will be part of other PCI device, which may already belong to other
PEs.

This patch prevents the conflict by reserving additional space for the PF
IOV BAR, which is total_pe number of VF's BAR size.

[bhelgaas: make dev_printk() output more consistent, index resource[]
conventionally]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Wei Yang
9e8d4a19ab powerpc/powernv: Allocate struct pnv_ioda_pe iommu_table dynamically
Previously the iommu_table had the same lifetime as a struct pnv_ioda_pe
and was embedded in it. The pnv_ioda_pe was assigned to a PE on the bootup
stage. Since PEs are based on the hardware layout which is static in the
system, they will never get released. This means the iommu_table in the
pnv_ioda_pe will never get released either.

This no longer works for VF PE. VF PEs are created and released dynamically
when VFs are created and released. So we need to assign pnv_ioda_pe to VF
PEs respectively when VFs are enabled and clean up those resources for VF
PE when VFs are disabled. And iommu_table is one of the resources we need
to handle dynamically.

Current iommu_table is a static field in pnv_ioda_pe, which will face a
problem when freeing it. During the disabling of a VF,
pnv_pci_ioda2_release_dma_pe will call iommu_free_table to release the
iommu_table for this PE. A static iommu_table will fail in
iommu_free_table.

According to these requirement, this patch allocates iommu_table
dynamically.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Gavin Shan
a8b2f8288a powerpc/pci: Create pci_dn for VFs
pci_dn is the extension of PCI device node and is created from device node.
Unfortunately, VFs are enabled dynamically by PF's driver and they don't
have corresponding device nodes and pci_dn, which is required to access
VFs' config spaces.

The patch creates pci_dn for VFs in pcibios_sriov_enable() on their PF,
and removes pci_dn for VFs in pcibios_sriov_disable() on their PF. When
VF's pci_dn is created, it's put to the child list of the pci_dn of PF's
upstream bridge. The pci_dn is linked to pci_dev during early fixup time
to setup the fast path.

[bhelgaas: add ifdef around add_one_dev_pci_info(), use dev_printk()]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Michael Ellerman
df60f57684 Merge branch 'next-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into test
Merge miscellaneous bits from benh. Fix a minor conflict with
OpalMessageType changing names to opal_msg_type.
2015-03-26 20:04:28 +11:00
Preeti U Murthy
605f302053 powerpc/powernv: Avoid explicit endian conversions while parsing device tree
We currently read the information about idle states from the device
tree, so as to find out the CPU idle states supported by the platform.

Use the of_property_read/count_xxx() APIs, which handle endian
conversions for us, and mean we don't need any endian annotations in the
code.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-26 15:23:18 +11:00
Neelesh Gupta
b921e90260 powerpc/powernv: Add OPAL message notifier unregister function
Provide an unregister interface for the opal message notifiers
to be called when not needed like during driver unload/remove.

Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-25 16:53:28 +11:00
Neelesh Gupta
792f96e9a7 powerpc/powernv: Fix the overflow of OPAL message notifiers head array
Fixes the condition check of incoming message type which can
otherwise shoot beyond the message notifiers head array.

Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-25 16:53:28 +11:00
Benjamin Herrenschmidt
3bf57561d4 powerpc/powernv: Support OPAL requested heartbeat
If OPAL requests it, call it back via opal_poll_events() at a
regular interval. Some versions of OPAL on some machines require
this to operate some internal timeouts properly.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-25 16:53:27 +11:00
Vasant Hegde
3f77df7f81 powerpc/powernv: Check image loaded or not before calling flash
Present code checks for update_flash_data in opal_flash_term_callback().
update_flash_data has been statically initialized to zero, and that
is the value of FLASH_IMG_READY. Also code update initialization happens
during subsys init.

So if reboot is issued before the subsys init stage then we endup displaying
"Flashing new firmware" message.. which may confuse end user.

This patch fixes above described issue by initializes update_flash status
to invalid state.

Reported-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-25 16:17:02 +11:00
Gavin Shan
0bd785873c powerpc/eeh: Replace device_node with pci_dn in eeh_ops
There are 3 EEH operations whose arguments contain device_node:
read_config(), write_config() and restore_config(). The patch
replaces device_node with pci_dn.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:52 +11:00
Gavin Shan
ff57b454dd powerpc/eeh: Do probe on pci_dn
Originally, EEH core probes on device_node or pci_dev to populate
EEH devices and PEs, which conflicts with the fact: SRIOV VFs are
usually enabled and created by PF's driver and they don't have the
corresponding device_nodes. Instead, SRIOV VFs have dynamically
created pci_dn, which can be used for EEH probe.

The patch reworks EEH probe for PowerNV and pSeries platforms to
do probing based on pci_dn, instead of pci_dev or device_node any
more.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:52 +11:00
Gavin Shan
3532a741f8 powerpc/powernv: Use pci_dn, not device_node, in PCI config accessor
The PCI config accessors previously relied on device_node.  Unfortunately,
VFs don't have a corresponding device_node, so change the accessors to use
pci_dn instead.

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:50 +11:00
Hari Bathini
f7618299b4 powerpc/powernv: Add pstore support on powernv
This patch extends pstore, a generic interface to platform dependent
persistent storage, support for powernv  platform to capture certain
useful information, during dying moments. Such support is already in
place for  pseries platform. This patch re-uses most of that code.

It is a common practice to compile kernels with both CONFIG_PPC_PSERIES=y
and CONFIG_PPC_POWERNV=y. The code in nvram_init_oops_partition() routine
still works as intended, as the caller is platform specific code which
passes the appropriate value for "rtas_partition_exists" parameter.
In all other places, where CONFIG_PPC_PSERIES or CONFIG_PPC_POWERNV
flag is used in this patchset, it is to reduce the kernel size in cases
where this flag is not set and doesn't have any impact logic wise.

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-23 14:06:10 +11:00
Paul Mackerras
755563bc79 powerpc/powernv: Fixes for hypervisor doorbell handling
Since we can now use hypervisor doorbells for host IPIs, this makes
sure we clear the host IPI flag when taking a doorbell interrupt, and
clears any pending doorbell IPI in pnv_smp_cpu_kill_self() (as we
already do for IPIs sent via the XICS interrupt controller).  Otherwise
if there did happen to be a leftover pending doorbell interrupt for
an offline CPU thread for any reason, it would prevent that thread from
going into a power-saving mode; it would instead keep waking up because
of the interrupt.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-20 14:51:53 +11:00
Gavin Shan
2f6cf79448 powerpc/powernv: Remove unused file
The patch removes unused file eeh-ioda.c and updates makefile
accordingly. Besides, the definition of "struct pnv_eeh_ops" and
the instances are all removed. Until now, the chip layer of EEH
implementation for PowerNV platform is removed completely.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:20 +11:00
Gavin Shan
cadf364d14 powerpc/powernv: Drop PHB operation reset()
The patch drops PHB EEH operation reset() and merges its logic to
eeh_ops::reset().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:19 +11:00
Gavin Shan
2a485ad7c8 powerpc/powernv: Drop PHB operation next_error()
The patch drops PHB EEH operation next_error() and merges its
logic to eeh_ops::next_error().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:19 +11:00
Gavin Shan
40ae5f693f powerpc/powernv: Drop PHB operation get_state()
The patch drops PHB EEH operation get_state() and merges its logic
to eeh_ops::get_state().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:19 +11:00
Gavin Shan
7e3e4f8d5e powerpc/powernv: Drop PHB operation set_option()
The patch drops PHB EEH operation set_option() and merges its
logic to eeh_ops::set_option().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:19 +11:00
Gavin Shan
bbe170ede1 powerpc/powernv: Drop PHB operation configure_bridge()
The patch drops PHB EEH operation configure_bridge() and merges
its logic to eeh_ops::configure_bridge().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:19 +11:00
Gavin Shan
95edcdeadf powerpc/powernv: Drop PHB operation get_log()
The patch drops PHB operation get_log() and merges its logic to
eeh_ops::get_log().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:19 +11:00
Gavin Shan
4cf1744558 powerpc/powernv: Drop PHB operation post_init()
The patch drops PHB EEH operation post_init() and merge its logic
to eeh_ops::post_init().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:18 +11:00
Gavin Shan
fa646c3cab powerpc/powernv: Drop PHB operation err_inject()
The patch drops PHB EEH operation err_inject() and merge its logic
to eeh_ops::err_inject().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:18 +11:00
Gavin Shan
01f3bfb780 powerpc/powernv: Shorten EEH function names
The patch shortens names of EEH functions in powernv-eeh.c and no
logic change introduced by this patch.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-17 10:31:18 +11:00
Michael Ellerman
d7cf83fcaf powerpc/powernv: Move opal-api.h closer to the Skiboot version
This commit gets opal-api.h to mostly match the version in Skiboot as of
commit ea7d806ab0ba.

The exceptions are things which are not (currently) used in Linux.

Most of this is just whitespace and a few things moving around. I think
the diff is readable.

Also OpalMessageType became opal_msg_type, requiring a change in the
Linux code.

Finally Skiboot and Linux disagree on CAPI vs CXL, because CAPI means
something else in Linux. To handle that we just point the Linux wrapper,
which is named "cxl" to the OPAL token OPAL_PCI_SET_PHB_CAPI_MODE.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-03-16 18:50:16 +11:00
Stewart Smith
7e73a3b7f3 powerpc/powernv: only call OPAL_RESEND_DUMP if firmware supports it
Not all OPAL platforms support resending system dumps, so check
that current firmware supports it first. Otherwise we get firmware
complaining:
"OPAL: Called with bad token 91 !"

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-16 18:50:14 +11:00
Stewart Smith
fc81de6310 powerpc/powernv: only call OPAL_ELOG_RESEND if firmware supports it
Otherwise firmware complains: "OPAL: Called with bad token 74 !"
as not all OPAL systems have the ability to resend error logs.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-16 18:50:14 +11:00
Stewart Smith
b962f5a446 powerpc/powernv: only register log if OPAL supports doing so
Correct use of REGISTER/UNREGISTER is to check if the token exists
before calling. If we don't we get a "OPAL: Called with bad token 101 !"
error, which is harmless but may be alarming to some.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-16 18:50:13 +11:00
Nishanth Aravamudan
4ad04e5987 powerpc/iommu: Remove IOMMU device references via bus notifier
After d905c5df9a ("PPC: POWERNV: move iommu_add_device earlier"), the
refcnt on the kobject backing the IOMMU group for a PCI device is
elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
set_iommu_table_base_and_group). When we go to dlpar a multi-function
PCI device out:

        iommu_reconfig_notifier ->
                iommu_free_table ->
                        iommu_group_put
                        BUG_ON(tbl->it_group)

We trip this BUG_ON, because there are still references on the table, so
it is not freed. Fix this by moving the powernv bus notifier to common
code and calling it for both powernv and pseries.

Fixes: d905c5df9a ("PPC: POWERNV: move iommu_add_device earlier")
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-04 13:19:33 +11:00
Linus Torvalds
d3f180ea1a powerpc updates for 3.20
Including:
 
 - Update of all defconfigs
 - Addition of a bunch of config options to modernise our defconfigs
 - Some PS3 updates from Geoff
 - Optimised memcmp for 64 bit from Anton
 - Fix for kprobes that allows 'perf probe' to work from Naveen
 - Several cxl updates from Ian & Ryan
 - Expanded support for the '24x7' PMU from Cody & Sukadev
 - Freescale updates from Scott:
   "Highlights include 8xx optimizations, some more work on datapath device
    tree content, e300 machine check support, t1040 corenet error reporting,
    and various cleanups and fixes."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU2/LSAAoJEFHr6jzI4aWATDAQAKPU6v2Mq0sLnGst69waHU/Q
 vvpIq9hqVeSr6znHhrnazc3iQTLk0acqIdxUl/dT+5ADhi9+FxGD5Ckk+BH1DDve
 g6mQelSMlVZF9hKonHsbr4iUuTUyZyx2vj2qjdgOaRiv9Xubq6vUFNeolq3AeHxv
 J33vqRTmowj3VJ52u+V1dmzXQGfUye7DG2jHpjXoBieZsroTvyuYm5GoIPblWFO6
 zbYRh6IitALnQRtXfwIManPyWMkJti9JX8PwDkmvacr+V+MXbrksHpIOITMhNlo1
 WsVnFMpxuk80XuUfhaKZgISgBSfCqBckvKDn2QwztF2/kBnV6Su5xiOKVgouzM6B
 myy+maiMZlNJlNjqdMK5v2bqMXICP048zgfMbDN2e1K25jSSlRawt0RngoCQO2EP
 7aWmEDAlL3shgzkl68pj1fevQokxC/40C1yExIgAa9C31+bjtMz4Xb1SfN1SSveW
 7uWEY/eG9eLsrSE1CeBDvh6B8BRdyuIHgPhux4Tgc/bUtBGFQ29NuXwKh3QCeEy9
 9wWrRGx3U69eP06Ey7P5js3jPTQs80bjJewyGaiPQF5XHB89To8Dg8VfXjEV49Dx
 Pa3OLL5QsQloKfEBiEhQeGfKYImC00pVYAxc0qpmnr9T+25Ri1TLdF1EBAwriSYE
 5p9kSW+ZIht0lvzsdPNm
 =xDU3
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc updates from Michael Ellerman:

 - Update of all defconfigs

 - Addition of a bunch of config options to modernise our defconfigs

 - Some PS3 updates from Geoff

 - Optimised memcmp for 64 bit from Anton

 - Fix for kprobes that allows 'perf probe' to work from Naveen

 - Several cxl updates from Ian & Ryan

 - Expanded support for the '24x7' PMU from Cody & Sukadev

 - Freescale updates from Scott:
    "Highlights include 8xx optimizations, some more work on datapath
     device tree content, e300 machine check support, t1040 corenet
     error reporting, and various cleanups and fixes"

* tag 'powerpc-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (102 commits)
  cxl: Add missing return statement after handling AFU errror
  cxl: Fail AFU initialisation if an invalid configuration record is found
  cxl: Export optional AFU configuration record in sysfs
  powerpc/mm: Warn on flushing tlb page in kernel context
  powerpc/powernv: Add OPAL soft-poweroff routine
  powerpc/perf/hv-24x7: Document sysfs event description entries
  powerpc/perf/hv-gpci: add the remaining gpci requests
  powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated
  powerpc/perf/hv-24x7: parse catalog and populate sysfs with events
  perf: define EVENT_DEFINE_RANGE_FORMAT_LITE helper
  perf: add PMU_EVENT_ATTR_STRING() helper
  perf: provide sysfs_show for struct perf_pmu_events_attr
  powerpc/kernel: Avoid initializing device-tree pointer twice
  powerpc: Remove old compile time disabled syscall tracing code
  powerpc/kernel: Make syscall_exit a local label
  cxl: Fix device_node reference counting
  powerpc/mm: bail out early when flushing TLB page
  powerpc: defconfigs: add MTD_SPI_NOR (new dependency for M25P80)
  perf/powerpc: reset event hw state when adding it to the PMU
  powerpc/qe: Use strlcpy()
  ...
2015-02-11 18:15:38 -08:00
Joel Stanley
7f43e71e8c powerpc/powernv: Add OPAL soft-poweroff routine
Register a notifier for a OPAL message indicating that the machine
should prepare itself for a graceful power off.

OPAL will tell us if the power off is a reboot or shutdown, but for now
we perform the same orderly_poweroff action.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-04 13:08:25 +11:00
Ryan Grimm
6f963ec2d6 cxl: Fix device_node reference counting
When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
call of_node_get() otherwise np's reference count isn't incremented and
it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
so it's clear it calls of_node_get().

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-02 14:51:31 +11:00
Gavin Shan
31494cf353 powerpc/powernv: Don't alloc IRQ map if necessary
On PowerNV platform, the OPAL interrupts are exported by firmware
through device-node property (/ibm,opal::opal-interrupts). Under
some extreme circumstances (e.g. simulator), we don't have this
property found from the device tree. For that case, we shouldn't
allocate the interrupt map. Otherwise, slab complains allocating
zero sized memory chunk.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-28 15:28:10 +11:00
Gavin Shan
c1c3a526bb powerpc/powernv: Separate function for OPAL IRQ setup
The patch put the OPAL interrupt setup logic in opal_init() into
seperate function opal_irq_init() for easier code maintaining. The
patch doesn't introduce logic changes except:

   * Rename variable names.
   * Release virtual IRQ upon error from request_irq().
   * Don't cache the virtual IRQ to opal_irqs[] upon error from
     request_irq().

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-28 15:28:04 +11:00
Michael Ellerman
0813513943 powerpc/powernv: Remove "opal" prefix from pr_xxx()s
In commit c8742f8512 "powerpc/powernv: Expose OPAL firmware symbol
map" I added pr_fmt() to opal.c. This left some existing pr_xxx()s with
duplicate "opal" prefixes, eg:

    opal: opal: Found 0 interrupts reserved for OPAL

Fix them all up. Also make the "Not not found" message a bit more
verbose.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-28 15:10:33 +11:00
Pranith Kumar
6501ab5e38 powerpc/powernv: Skip registering log region when CONFIG_PRINTK=n
When CONFIG_PRINTK=n, log_buf_addr_get() returns NULL and log_buf_len_get()
return 0. Check for these return values and skip registering the dump buffer.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-27 14:04:00 +11:00
Gavin Shan
a113de373b powerpc/powernv: Remove pnv_pci_probe_mode()
The callback (ppc_md.pci_probe_mode()) is used to determine if the
child PCI devices of the indicated PCI bus should be probed from
device-tree or hardware. On PowerNV platform, we always expect
probing PCI devices from hardware, which is PowerPC PCI core's
default behaviour. Also, the callback had some delay implemented
based on PHB's device node property "reset-clear-timestamp", which
wasn't exported from skiboot. So we don't need this function and
it's safe to remove it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23 14:02:54 +11:00
Gavin Shan
2aa5cf9e48 powerpc/eeh: Fix missed PE#0 on P7IOC
PE#0 should be regarded as valid for P7IOC, while it's invalid for
PHB3. The patch adds flag EEH_VALID_PE_ZERO to differentiate those
two cases. Without the patch, we possibly see frozen PE#0 state is
cleared without EEH recovery taken on P7IOC as following kernel logs
indicate:

[root@ltcfbl8eb ~]# dmesg
       :
pci 0000:00     : [PE# 000] Secondary bus 0 associated with PE#0
pci 0000:01     : [PE# 001] Secondary bus 1 associated with PE#1
pci 0001:00     : [PE# 000] Secondary bus 0 associated with PE#0
pci 0001:01     : [PE# 001] Secondary bus 1 associated with PE#1
pci 0002:00     : [PE# 000] Secondary bus 0 associated with PE#0
pci 0002:01     : [PE# 001] Secondary bus 1 associated with PE#1
pci 0003:00     : [PE# 000] Secondary bus 0 associated with PE#0
pci 0003:01     : [PE# 001] Secondary bus 1 associated with PE#1
pci 0003:20     : [PE# 002] Secondary bus 32..63 associated with PE#2
       :
EEH: Clear non-existing PHB#3-PE#0
EEH: PHB location: U78AE.001.WZS00M9-P1-002

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23 14:02:52 +11:00
Thadeu Lima de Souza Cascardo
4e28784024 powernv/iommu: disable IOMMU bypass with param iommu=nobypass
When IOMMU bypass is enabled, a PCI device can read and write memory
that was not mapped by the driver without causing an EEH. That might
cause memory corruption, for example.

When we disable bypass, DMA reads and writes to addresses not mapped by
the IOMMU will cause an EEH, allowing us to debug such issues.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23 14:02:46 +11:00
Wei Yang
e9863e687d powerpc/powernv: Print the M64 range information in bootup log
The M64 range information is missed in dmesg, which would be helpful in debug.

This patch prints the M64 range information in the same format as M32.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23 11:17:30 +11:00
Ryan Grimm
1212aa1c8c cxl: Enable CAPP recovery
Turning snoops on is the last step in CAPP recovery. Sapphire is expected to
have reinitialized the PHB and done the previous recovery steps.

Add mode argument to opal call to do this. Driver can turn snoops off although
it does not currently.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:52 +11:00
Shreyas B. Prabhu
0eb13208aa powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared
LPCR_PECE1 bit controls whether decrementer interrupts are allowed to
cause exit from power-saving mode. While waking up from winkle, restoring
LPCR with LPCR_PECE1 set (i.e Decrementer interrupts allowed) can cause
issue in the following scenario:

- All the threads in a core are offlined. The core enters deep winkle.
- Spurious interrupt wakes up a thread in the core. Here LPCR is restored
  with LPCR_PECE1 bit set.
- Since it was a spurious interrupt on a offline thread, the thread clears
  the interrupt and goes back to winkle.
- Here before the thread executes winkle and puts the core into deep winkle,
  if a decrementer interrupt occurs on any of the sibling threads in the core
  that thread wakes up.
- Since in offline loop we are flushing interrupt only in case of external
  interrupt, the decrementer interrupt does not get flushed. So at this stage
  the thread is stuck in this is loop of waking up at 0x100 due to decrementer
  interrupt, not flushing the interrupt as only external interrupts get flushed,
  entering winkle, waking up at 0x100 again.

Fix this by programming PORE to restore LPCR with LPCR_PECE1 bit
cleared when waking up from winkle.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:22:57 +11:00
Anton Blanchard
bfe5fda8e7 powernv: Fix OPAL tracepoint code
Patch c49f63530b ("powernv: Add OPAL tracepoints") has a spurious
store to the stack:

	ld      r12,opal_tracepoint_refcount@toc(r2);           \
	std     r12,32(r1);                                     \

The store was originally used to save the current tracepoint status
so the entry and the exit tracepoints were always balanced. In the
end I just created a separate path when tracepoints are enabled.

The offset on the stack used for this store is not valid for ABIv2
and it causes strange issues. I noticed it because OPAL console input
was broken.

Fixes: c49f63530b ("powernv: Add OPAL tracepoints")
Cc: <stable@vger.kernel.org> # v3.17+
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-12 16:40:02 +11:00
Linus Torvalds
34b85e3574 powerpc updates for 3.19 batch 2
The highlight is the series that reworks the idle management on powernv, which
 allows us to use deeper idle states on those machines.
 
 There's the fix from Anton for the "BUG at kernel/smpboot.c:134!" problem.
 
 An i2c driver for powernv. This is acked by Wolfram Sang, and he asked that we
 take it through the powerpc tree.
 
 A fix for audit from rgb at Red Hat, acked by Paul Moore who is one of the audit
 maintainers.
 
 A patch from Ben to export the symbol map of our OPAL firmware as a sysfs file,
 so that tools can use it.
 
 Also some CXL fixes, a couple of powerpc perf fixes, a fix for smt-enabled, and
 the patch to add __force to get_user() so we can use bitwise types.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUk+oCAAoJEFHr6jzI4aWADBAP/i/CJ+cu6o4mzNDdfs8bnxqn
 RGZCSV+SrkTZPcoLbLiM9iaqq34ORVIn7hwkhkTz2/koluMVfTsqtVulMoFf+hVd
 GTVt81MjMFzA3hM3bXEV58KRT79+64K54dLCe0F7OaD6f4AikKR4LLz/PY0EBMiZ
 2h13uQlfglaMeYTsaD9eeUpIIKs7+PwsNqUknmN9We07WWfxWqnRpiTR4TYTMXx4
 3lQPvCnnHokwDqjuKgwiqDVSaCfCl8laS1i+BPk0G0aRV1AnPDvR3MhgVb2IpNxX
 Joxy2D1HSawwDhqHOsId8dkGZXOM4vzo+Y658qnC1XfThqE0MhA+kCfa5/b6xlOR
 K7nDO5A41B6nXB3mMOQh/szTXSIa8KJRTR3ibbJJrMdF6F0TN0JLLQNUcmM4j/5D
 vvgZEzvFNZhWX98ktlQLde2E4ClWJg6mWESCGSgJeVjIXaxe/6GneIa8vLKm5QMu
 OoykNsASMDGqddYMGoYeX/mSsvjPjK0PDO2q19sPbkP8xpyDLx6J8xo+5hO4l8xc
 0Cdb38ECfeno+w5oKAnjidHnz0KYBsuYFLeS+rV0b8sUSWAzfdEjSn2AVIQ8gLOv
 IOCAqwZ5tL9EcUs+AKru5EHtBEV+2XB54xPRxfdFS/k+vYRE7MpS3ipxveIynN2l
 eRxf9hsSO7ASNDd0b3ID
 =GXdK
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull second batch of powerpc updates from Michael Ellerman:
 "The highlight is the series that reworks the idle management on
  powernv, which allows us to use deeper idle states on those machines.

  There's the fix from Anton for the "BUG at kernel/smpboot.c:134!"
  problem.

  An i2c driver for powernv.  This is acked by Wolfram Sang, and he
  asked that we take it through the powerpc tree.

  A fix for audit from rgb at Red Hat, acked by Paul Moore who is one of
  the audit maintainers.

  A patch from Ben to export the symbol map of our OPAL firmware as a
  sysfs file, so that tools can use it.

  Also some CXL fixes, a couple of powerpc perf fixes, a fix for
  smt-enabled, and the patch to add __force to get_user() so we can use
  bitwise types"

* tag 'powerpc-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/powernv: Ignore smt-enabled on Power8 and later
  powerpc/uaccess: Allow get_user() with bitwise types
  powerpc/powernv: Expose OPAL firmware symbol map
  powernv/powerpc: Add winkle support for offline cpus
  powernv/cpuidle: Redesign idle states management
  powerpc/powernv: Enable Offline CPUs to enter deep idle states
  powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode
  i2c: Driver to expose PowerNV platform i2c busses
  powerpc: add little endian flag to syscall_get_arch()
  power/perf/hv-24x7: Use kmem_cache_free() instead of kfree
  powerpc/perf/hv-24x7: Use per-cpu page buffer
  cxl: Unmap MMIO regions when detaching a context
  cxl: Add timeout to process element commands
  cxl: Change contexts_lock to a mutex to fix sleep while atomic bug
  powerpc: Secondary CPUs must set cpu_callin_map after setting active and online
2014-12-19 12:57:45 -08:00