This file is not modular, nor is it using modular functions. The
only thing close is the global THIS_MODULE which comes from export.h
so lets replace it appropriately and cut back on the amount of
header stuff we draw in by several thousand lines.
Cc: linux-gpio@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
gicv3_init_bases() is the only caller for its_init(),
also it is a __init function, so mark its_init() as __init too,
then recursively mark the functions called as __init.
This will help to introduce ITS initialization using ACPI tables as
we will use acpi_table_parse_entries family functions there which
belong to __init section as well.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The gic_root_node variable defined in ITS driver is not actually
used, so just remove it.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Following ACPI spec:
On systems supporting GICv3 and above, GICR Base Address in MADT GICC
structure holds the 64-bit physical address of the associated Redistributor.
If all of the GIC Redistributors are in the always-on power domain,
GICR structures should be used to describe the Redistributors instead,
and this field must be set to 0.
It means that we have two ways to initialize registirbutors map.
1. via GICD structure which can accommodate many redistributors as a region
2. via GICC which is able to describe single redistributor
This patch is going to add support for second option.
Considering redistributors, GICD and GICC subtables have be mutually
exclusive. While discovering and mapping redistributor, we need to know
its size in advance. For the GICC case, redistributor can be in
a power-domain that is off, thus we cannot relay on GICR TYPER register.
Therefore, we get GIC version from distributor register and map 2xSZ_64K
for GICv3 and 4xSZ_64K for GICv4.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
With the refator of gic_of_init(), GICv3/4 can be initialized
by gic_init_bases() with gic distributor base address and gic
redistributor region(s).
So get the redistributor region base addresses from MADT GIC
redistributor subtable, and the distributor base address from
GICD subtable to init GICv3 irqchip in ACPI way.
Note: GIC redistributor base address may also be provided in
GICC structures on systems supporting GICv3 and above if the GIC
Redistributors are not in the always-on power domain, this
patch didn't implement such feature yet.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Isolate hardware abstraction (FDT) code to gic_of_init().
Rest of the logic goes to gic_init_bases() and expects well
defined data to initialize GIC properly. The same solution
is used for GICv2 driver.
This is needed for ACPI initialization later.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Create a driver to support the hardware monitoring chip present in
the Zyxel NSA320 and some of the other Zyxel NAS devices.
The driver reads fan speed and temperature from a suitably
pre-programmed MCU on the device.
Signed-off-by: Adam Baker <linux@baker-net.org.uk>
[groeck: Dropped .owner field initialization]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The function __spi_pump_messages() is called by spi_pump_messages() and
__spi_sync(). The function __spi_sync() has an argument 'bus_locked'
that indicates if it is called with the SPI bus mutex held or not. If
'bus_locked' is false then __spi_sync() will acquire the mutex itself.
Commit 556351f14e ("spi: introduce accelerated read support for spi
flash devices") made a change to acquire the SPI bus mutex within
__spi_pump_messages(). However, this change did not check to see if the
mutex is already held. If __spi_sync() is called with the mutex held
(ie. 'bus_locked' is true), then a deadlock occurs when
__spi_pump_messages() is called.
Fix this deadlock by passing the 'bus_locked' state from __spi_sync() to
__spi_pump_messages() and only acquire the mutex if not already held. In
the case where __spi_pump_messages() is called from spi_pump_messages()
it is assumed that the mutex is not held and so call
__spi_pump_messages() with 'bus_locked' set to false. Finally, move the
unlocking of the mutex to the end of the __spi_pump_messages() function
to simplify the code and only call cond_resched() if there are no
errors.
Fixes: 556351f14e ("spi: introduce accelerated read support for spi flash devices")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The pxa2xxx fails some automated builds because of unexported
symbols.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Per the x86-specific footnote to PCI spec r3.0, sec 6.2.4, the value 255 in
the Interrupt Line register means "unknown" or "no connection."
Previously, when we couldn't derive an IRQ from the _PRT, we fell back to
using the value from Interrupt Line as an IRQ. It's questionable whether
we should do that at all, but the spec clearly suggests we shouldn't do it
for the value 255 on x86.
Calling request_irq() with IRQ 255 may succeed, but the driver won't
receive any interrupts. Or, if IRQ 255 is shared with another device, it
may succeed, and the driver's ISR will be called at random times when the
*other* device interrupts. Or it may fail if another device is using IRQ
255 with incompatible flags. What we *want* is for request_irq() to fail
predictably so the driver can fall back to polling.
On x86, assume 255 in the Interrupt Line means the INTx line is not
connected. In that case, set dev->irq to IRQ_NOTCONNECTED so request_irq()
will fail gracefully with -ENOTCONN.
We found this problem on a system where Secure Boot firmware assigned
Interrupt Line 255 to an i801_smbus device and another device was already
using MSI-X IRQ 255. This was in v3.10, where i801_probe() fails if
request_irq() fails:
i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
i801_smbus 0000:00:1f.3: PCI INT C: no GSI
genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5
Call Trace:
dump_stack+0x19/0x1b
__setup_irq+0x54a/0x570
request_threaded_irq+0xcc/0x170
i801_probe+0x32f/0x508 [i2c_i801]
local_pci_probe+0x45/0xa0
i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
i801_smbus: probe of 0000:00:1f.3 failed with error -16
After aeb8a3d16a ("i2c: i801: Check if interrupts are disabled"),
i801_probe() will fall back to polling if request_irq() fails. But we
still need this patch because request_irq() may succeed or fail depending
on other devices in the system. If request_irq() fails, i801_smbus will
work by falling back to polling, but if it succeeds, i801_smbus won't work
because it expects interrupts that it may not receive.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the function of_genpd_get_from_provider(), we never check to see if
the argument 'genpdspec' is NULL before dereferencing it. Add error
checking to handle any NULL pointers.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 30e7a65b3f (PM / Domains: Ensure subdomain is not in use
before removing) added a test to ensure that a subdomain is not a
master to another subdomain or if any devices are using the subdomain
before removing. This change incorrectly used the "slave_links" list to
determine if the subdomain is a master to another subdomain, where it
should have been using the "master_links" list instead. The
"slave_links" list will never be empty for a subdomain and so a
subdomain can never be removed. Fix this by testing if the
"master_links" list is empty instead.
Fixes: 30e7a65b3f (PM / Domains: Ensure subdomain is not in use before removing)
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During runtime resume the return values of the start and restore steps
are ignored. As a result drivers are not notified of runtime resume
failures and can't propagate them up. Fix it by returning an error if
either the start or restore step fails, and clean up properly in the
error path.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For low-power states, the state index is part of the state, hence join
them with a hyphen in the /sys/kernel/debug/pm_genpd/pm_genpd_summary
output. E.g. "off 0" becomes "off-0".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The slave domains are no longer aligned with the table header in the
/sys/kernel/debug/pm_genpd/pm_genpd_summary output. Worse, the alignment
differs depending on the actual name of the state.
Format the state name and index into a buffer, and print that like
before to restore alignment.
Use "%u" for unsigned int while we're at it.
Fixes: fc5cbf0c94 (PM / Domains: Support for multiple states)
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
RAPL driver operates on MSRs that are under package/socket
scope instead of core scope. However, the current code does not
keep track of which CPUs are available on each package for MSR
access. Therefore it has to search for an active CPU on a given
package each time.
This patch optimizes the package level operations by tracking a
per package lead CPU during initialization and CPU hotplug. The
runtime search for active CPU is avoided.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch adds to each rapl domain a reference of the package
it belongs to. At runtime, we can then avoid searching the package
data for each access. It simplifies the domain level operations
which depend on package level information.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reduce remote CPU calls for MSR access by combining read
modify write into one function.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Export cpumask_any_but() for module use. This will be used by drivers
such as intel_rapl to locate an active cpu on a socket.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit eade8f78f2aa21e8eabc3380a5728db47273bcf1
Revert commit ae90fbf562 (ACPICA: Parser: Fix for SuperName method
invocation).
Support for method invocations as part of super_name will be
removed from the ACPI specification, since no AML interpreter
supports it.
Fixes: ae90fbf562 (ACPICA: Parser: Fix for SuperName method invocation)
Link: eade8f78
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 0824ab90e03c2e4239e890615f447e7962b1daa2
Was not using the correct macro. Updated a comment in
acoutput.h
Link: 0824ab90
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It's always an ambivalent feeling to send a large pull request at the
late stage like this, especially when most of patches came from me.
In anyway, this is a collection of lots of small fixes that slipped
from the previous pull request.
All fixes are about ASoC, and the majority of changes are corrections
of the wrong access types in ALSA ctl enum items. They are mostly
harmless on 32bit architectures, but actually buggy on 64bit. So we
addressed all these now in a shot. The rest are various small ASoC
driver fixes.
Among them, only two changes have been done to ASoC core, and both of
them are trivial. The rest are all device-specific. So overall, they
should be safe to apply.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJW3rMXAAoJEGwxgFQ9KSmkVycP/3413WwXgXsqvAdbHiqvJQGf
B+KQw/tN7qXhut7+mubuOk+eKYpDH3Jj2hokb/BFH5AHADzOMdBfCnLvoewZUrTJ
GURasd0JvMmaoZaIB9Nk1G6QEaiobZLJjsjKLdaAu3yeQOyo2FDghiWAVkDMzMV8
73v+eStoGeDQX41vWnV63747Zpfd80q5c5qudKR0FMGphzkV0j7JUEkVNWiedMfL
MjchPbLBlxK6CDJh+cbjKMZK2Y/h+j05b4oLdqwdt98z00RJP4vJTJ3bSWyDyqil
HZb2F4Ugsv1nI9sQ8nHLb7PH4/u45PKuBxILNv7mGfS1WPOCuV7j+PHyMOBIr97P
seS38DjukIDU1Q+zpar+p5v3/PrshQuxKknwrI/Z+tRMNWPurbgSIWNeNOIUNIoF
HTg/pETlwr4zLMBy78lTot+7NqDJwLit1Yl4tI1+7ac0wSycJ4yYwWXv5mr++23G
QZxXznJELhhpYhUKT/b804STu2bT3dpSJxUYe/EYApYgoDx3TxWRhWhbKaILzPH9
8EYLJ3Xgd7AZcMsEpC5R2SNKRMnQvPQnnDncwcNUSA/bju8H0eqThvRs2n30xc2q
9Ris9m0iLJAXwi/bVcxwMvJddfzLSL1DWPBkpQsJ2yWmux9lohUA/cFXWDrGiDWs
0L6G1f8mH8iQuLmQcHWi
=GmQ0
-----END PGP SIGNATURE-----
Merge tag 'sound-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"It's always an ambivalent feeling to send a large pull request at the
late stage like this, especially when most of patches came from me.
Anyway, this is a collection of lots of small fixes that slipped from
the previous pull request.
All fixes are about ASoC, and the majority of changes are corrections
of the wrong access types in ALSA ctl enum items. They are mostly
harmless on 32bit architectures, but actually buggy on 64bit. So we
addressed all these now in a shot. The rest are various small ASoC
driver fixes.
Among them, only two changes have been done to ASoC core, and both of
them are trivial. The rest are all device-specific. So overall, they
should be safe to apply"
* tag 'sound-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits)
ASoC: wm_adsp: Fix enum ctl accesses in a wrong type
ASoC: wm9081: Fix enum ctl accesses in a wrong type
ASoC: wm8996: Fix enum ctl accesses in a wrong type
ASoC: wm8994: Fix enum ctl accesses in a wrong type
ASoC: wm8985: Fix enum ctl accesses in a wrong type
ASoC: wm8983: Fix enum ctl accesses in a wrong type
ASoC: wm8958: Fix enum ctl accesses in a wrong type
ASoC: wm8904: Fix enum ctl accesses in a wrong type
ASoC: wm8753: Fix enum ctl accesses in a wrong type
ASoC: wl1273: Fix enum ctl accesses in a wrong type
ASoC: tlv320dac33: Fix enum ctl accesses in a wrong type
ASoC: max98095: Fix enum ctl accesses in a wrong type
ASoC: max98088: Fix enum ctl accesses in a wrong type
ASoC: ab8500: Fix enum ctl accesses in a wrong type
ASoC: da732x: Fix enum ctl accesses in a wrong type
ASoC: cs42l51: Fix enum ctl accesses in a wrong type
ASoC: intel: mfld: Fix enum ctl accesses in a wrong type
ASoC: omap: rx51: Fix enum ctl accesses in a wrong type
ASoC: omap: n810: Fix enum ctl accesses in a wrong type
ASoC: pxa: tosa: Fix enum ctl accesses in a wrong type
...
during the merge window.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW3cVJAAoJEBLB8Bhh3lVKPxMP/0BKTlS7mp7FRjEhnUQPorEb
sW20KtpC372uD5eVREbC9dUA1q9FGQacOKs4sZvSZUxj8Nx+BGEkpTLH+sHwKnk0
mfI0roq0oWcXGWb19A9oI506V94Z0ejs2p1aEBA/iidUjpu1RoR1ZhOPMmkE6+PS
gMWJj10xonodP5y8e3ZlTg+2dDohdy1KA6GKbLr1145cQLvH6p01Sf+cvNncZwMk
r22e89I8CVSn96kXOuyiu2mXZ+9sDCkDvwc/uYECs34XfhGAoVJ6Qy3xy2ZvaakH
t6Idc4Z5eVbvXKErfRzrUbj12qBQ2zIs2h8feVtmyNM57Awjk/XQkTncBp3Kb5XG
szB/pPFM2DEffNMOeXGE1ZJyZVuErhVbcsZYPmBdXKITeUuvzAbOF47qZpXdsJZr
d65BzKCEHyoV7+M3lRti13KAestPQupvS7b++qyLhSnqKFP5pacVJPyhNDkMPDic
wSXiu5z7IdjJgH7GwTmcvMHrI1vwL2nlAqanfQNNYVAzGzJ2GSUm9ClK6KGtaHY9
QUtk8VGw3ZlemU+PT0PghmrgcPnGRbVLyaH/ufEb+IE3F1+9i0f1ZpkcHF42faMq
GKFMXCdQPfpy67y6oGql5PmMrwbaDwXSn2atoPb6gbbeJWTlSf0TnfRCeJe/S3/7
eW9geMHYK4QgTIuWodYp
=le3x
-----END PGP SIGNATURE-----
Merge tag 'edac_fix_for_4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC fix from Borislav Petkov:
"Last minute fix for sb_edac which fixes DIMM detection on certain Xeon
Phi configurations:
A single fix to the Xeon Phi section of sb_edac. The issue was
introduced during this merge window"
* tag 'edac_fix_for_4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, sb_edac: Fix logic when computing DIMM sizes on Xeon Phi
Make use of the EXTABLE_FAULT exception table entries to write
a kernel copy routine that doesn't crash the system if it
encounters a machine check. Prime use case for this is to copy
from large arrays of non-volatile memory used as storage.
We have to use an unrolled copy loop for now because current
hardware implementations treat a machine check in "rep mov"
as fatal. When that is fixed we can simplify.
Return type is a "bool". True means that we copied OK, false means
that it didn't.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@gmail.com>
Link: http://lkml.kernel.org/r/a44e1055efc2d2a9473307b22c91caa437aa3f8b.1456439214.git.tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Drop the quirk() function pointer in favor of a simple boolean which
says whether the quirk should be applied or not. Update comment while at
it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160308164041.GF16568@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When I fixed the dp rate selection in:
3b73b168cffd9c392584d3f665021fa2190f8612
drm/amdgpu: fix dp link rate selection (v2)
I accidently dropped the special handling for NUTMEG
DP bridge chips. They require a fixed link rate.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
When I fixed the dp rate selection in:
092c96a8ab
drm/radeon: fix dp link rate selection (v2)
I accidently dropped the special handling for NUTMEG
DP bridge chips. They require a fixed link rate.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Ken Moffat <zarniwhoop@ntlworld.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Commit e91467ecd1 ("bug in futex unqueue_me") introduced a barrier() in
unqueue_me() to prevent the compiler from rereading the lock pointer which
might change after a check for NULL.
Replace the barrier() with a READ_ONCE() for the following reasons:
1) READ_ONCE() is a weaker form of barrier() that affects only the specific
load operation, while barrier() is a general compiler level memory barrier.
READ_ONCE() was not available at the time when the barrier was added.
2) Aside of that READ_ONCE() is descriptive and self explainatory while a
barrier without comment is not clear to the casual reader.
No functional change.
[ tglx: Massaged changelog ]
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Cc: dave@stgolabs.net
Cc: peterz@infradead.org
Cc: linux@rasmusvillemoes.dk
Cc: akpm@linux-foundation.org
Cc: fengguang.wu@intel.com
Cc: bigeasy@linutronix.de
Link: http://lkml.kernel.org/r/1457314344-5685-1-git-send-email-nasa4836@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The pgtable.c file is quite big, before it grows any larger split it
into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related
header definitions into the new gmap.h header and all of the pgste
helpers from pgtable.h to pgtable.c.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The pmdp_xxx function are smaller than their ptep_xxx counterparts
but to keep things symmetrical unline them as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The code in the various ptep_xxx functions has grown quite large,
consolidate them to four out-of-line functions:
ptep_xchg_direct to exchange a pte with another with immediate flushing
ptep_xchg_lazy to exchange a pte with another in a batched update
ptep_modify_prot_start to begin a protection flags update
ptep_modify_prot_commit to commit a protection flags update
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It no longer has any users.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
x86_64 has very clean espfix handling on paravirt: espfix64 is set
up in native_iret, so paravirt systems that override iret bypass
espfix64 automatically. This is robust and straightforward.
x86_32 is messier. espfix is set up before the IRET paravirt patch
point, so it can't be directly conditionalized on whether we use
native_iret. We also can't easily move it into native_iret without
regressing performance due to a bizarre consideration. Specifically,
on 64-bit kernels, the logic is:
if (regs->ss & 0x4)
setup_espfix;
On 32-bit kernels, the logic is:
if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 &&
(regs->flags & X86_EFLAGS_VM) == 0)
setup_espfix;
The performance of setup_espfix itself is essentially irrelevant, but
the comparison happens on every IRET so its performance matters. On
x86_64, there's no need for any registers except flags to implement
the comparison, so we fold the whole thing into native_iret. On
x86_32, we don't do that because we need a free register to
implement the comparison efficiently. We therefore do espfix setup
before restoring registers on x86_32.
This patch gets rid of the explicit paravirt_enabled check by
introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE
to skip espfix on paravirt systems where iret != native_iret. This is
also messy, but it's at least in line with other things we do.
This improves espfix performance by removing a branch, but no one
cares. More importantly, it removes a paravirt_enabled user, which is
good because paravirt_enabled is ill-defined and is going away.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We can fit the 2k for the STFLE interpretation and the crypto
control block into one DMA page. As we now only have to allocate
one DMA page, we can clean up the code a bit.
As a nice side effect, this also fixes a problem with crycbd alignment in
case special allocation debug options are enabled, debugged by Sascha
Silbe.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Not setting the facility list designation disables STFLE interpretation,
this is what we want if the guest was told to not have it.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
When the VCPU cpu timer expires, we have to wake up just like when the ckc
triggers. For now, setting up a cpu timer in the guest and going into
enabled wait will never lead to a wakeup. This patch fixes this problem.
Just as for the ckc, we have to take care of waking up too early. We
have to recalculate the sleep time and go back to sleep.
Please note that the timer callback calls kvm_s390_get_cpu_timer() from
interrupt context. As the timer is canceled when leaving handle_wait(),
and we don't do any VCPU cpu timer writes/updates in that function, we can
be sure that we will never try to read the VCPU cpu timer from the same cpu
that is currentyl updating the timer (deadlock).
Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The cpu timer is a mean to measure task execution time. We want
to account everything for a VCPU for which it is responsible. Therefore,
if the VCPU wants to sleep, it shall be accounted for it.
We can easily get this done by not disabling cpu timer accounting when
scheduled out while sleeping because of enabled wait.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
For now, only the owning VCPU thread (that has loaded the VCPU) can get a
consistent cpu timer value when calculating the delta. However, other
threads might also be interested in a more recent, consistent value. Of
special interest will be the timer callback of a VCPU that executes without
having the VCPU loaded and could run in parallel with the VCPU thread.
The cpu timer has a nice property: it is only updated by the owning VCPU
thread. And speaking about accounting, a consistent value can only be
calculated by looking at cputm_start and the cpu timer itself in
one shot, otherwise the result might be wrong.
As we only have one writing thread at a time (owning VCPU thread), we can
use a seqcount instead of a seqlock and retry if the VCPU refreshed its
cpu timer. This avoids any heavy locking and only introduces a counter
update/check plus a handful of smp_wmb().
The owning VCPU thread should never have to retry on reads, and also for
other threads this might be a very rare scenario.
Please note that we have to use the raw_* variants for locking the seqcount
as lockdep will produce false warnings otherwise. The rq->lock held during
vcpu_load/put is also acquired from hardirq context. Lockdep cannot know
that we avoid potential deadlocks by disabling preemption and thereby
disable concurrent write locking attempts (via vcpu_put/load).
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Architecturally we should only provide steal time if we are scheduled
away, and not if the host interprets a guest exit. We have to step
the guest CPU timer in these cases.
In the first shot, we will step the VCPU timer only during the kvm_run
ioctl. Therefore all time spent e.g. in interception handlers or on irq
delivery will be accounted for that VCPU.
We have to take care of a few special cases:
- Other VCPUs can test for pending irqs. We can only report a consistent
value for the VCPU thread itself when adding the delta.
- We have to take care of STP sync, therefore we have to extend
kvm_clock_sync() and disable preemption accordingly
- During any call to disable/enable/start/stop we could get premeempted
and therefore get start/stop calls. Therefore we have to make sure we
don't get into an inconsistent state.
Whenever a VCPU is scheduled out, sleeping, in user space or just about
to enter the SIE, the guest cpu timer isn't stepped.
Please note that all primitives are prepared to be called from both
environments (cpu timer accounting enabled or not), although not completely
used in this patch yet (e.g. kvm_s390_set_cpu_timer() will never be called
while cpu timer accounting is enabled).
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We want to manually step the cpu timer in certain scenarios in the future.
Let's abstract any access to the cpu timer, so we can hide the complexity
internally.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
By storing the cpu id, we have a way to verify if the current cpu is
owning a VCPU.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
DIAG 0x288 may occur now. Let's add its code to the diag table in
sie.h.
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull nohz enhancements from Frederic Weisbecker:
"Currently in nohz full configs, the tick dependency is checked
asynchronously by nohz code from interrupt and context switch for each
concerned subsystem with a set of function provided by these. Such
functions are made of many conditions and details that can be heavyweight
as they are called on fastpath: sched_can_stop_tick(),
posix_cpu_timer_can_stop_tick(), perf_event_can_stop_tick()...
Thomas suggested a few months ago to make that tick dependency check
synchronous. Instead of checking subsystems details from each interrupt
to guess if the tick can be stopped, every subsystem that may have a tick
dependency should set itself a flag specifying the state of that
dependency. This way we can verify if we can stop the tick with a single
lightweight mask check on fast path.
This conversion from a pull to a push model to implement tick dependency
is the core feature of this patchset that is split into:
* Nohz wide kick simplification
* Improve nohz tracing
* Introduce tick dependency mask
* Migrate scheduler, posix timers, perf events and sched clock tick
dependencies to the tick dependency mask."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Define a binding for the hardware monitoring chip present in the Zyxel
NSA-320 and some of the other Zyxel NAS devices.
Signed-off-by: Adam Baker <linux@baker-net.org.uk>
Acked-by: Rob Herring <robh@kernel.org>
[groeck: Fixed whitespace error]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
ignore_nmis is used in two distinct places:
1. modified through {stop,restart}_nmi by alternative_instructions
2. read by do_nmi to determine if default_do_nmi should be called or not
thus the access pattern conforms to __read_mostly and do_nmi() is a fastpath.
Signed-off-by: Kostenzer Felix <fkostenzer@live.at>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With MACHINE_HAS_VX, we convert the floating point registers from the
vector registeres when storing the status. For other VCPUs, these are
stored to vcpu->run->s.regs.vrs, but we are using current->thread.fpu.vxrs,
which resolves to the currently loaded VCPU.
So kvm_s390_store_status_unloaded() currently writes the wrong floating
point registers (converted from the vector registers) when called from
another VCPU on a z13.
This is only the case for old user space not handling SIGP STORE STATUS and
SIGP STOP AND STORE STATUS, but relying on the kernel implementation. All
other calls come from the loaded VCPU via kvm_s390_store_status().
Fixes: 9abc2a08a7 (KVM: s390: fix memory overwrites when vx is disabled)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>