Pull perf fixes from Ingo Molnar:
"Two kprobes fixes and a handful of tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Make sparc64 arch point to sparc
perf symbols: Define EM_AARCH64 for older OSes
perf top: Fix SIGBUS on sparc64
perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag
perf tools: Fix pthread_attr_setaffinity_np build error
perf tools: Define _GNU_SOURCE on pthread_attr_setaffinity_np feature check
perf bench: Fix order of arguments to memcpy_alloc_mem
kprobes/x86: Check for invalid ftrace location in __recover_probed_insn()
kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace
Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
table levels folded. Usually, these defines are provided by
<asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.
But some architectures fold page table levels in a custom way. They
need to define these macros themself. This patch adds missing defines.
The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
and __pud_alloc() on architectures without these page table levels.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The smc91x driver traditionally gets configured at compile-time
for whichever hardware it runs on. This no longer works on
ARM as we continue to move to building all-in-one kernels.
Most ARM configurations with this driver already use run-time
configuration through DT or through platform_data, but a
few have not been converted yet.
I've checked all ARM boards that use this driver in their
legacy board files, and converted the ones that were using
compile-time configuration in smc91x.h to behave like the
other ones and provide the interrupt polarity along with
the MMIO configuration (width, stride) at platform device
creation time.
In particular, these combinations were previously selectable
in Kconfig but in fact broken:
- sa1100 assabet plus pleb
- msm combined with any other armv6/v7 platform
- pxa-idp combined with any non-DMA pxa variant
- LogicPD PXA270 combined with any other pxa
- nomadik combined with any other armv4/v5 platform,
e.g. versatile.
None of these seem critical enough to warrant a backport
to stable, but it would be nice to clean this up for good.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
----
I would like the patch to get merged through netdev, after
Robert and/or Linus have verified it on at least some hardware.
There are a few other non-ARM platforms using this driver,
I could do the same patch for those if we want to take
it further.
arch/arm/mach-msm/board-halibut.c | 8 ++++-
arch/arm/mach-msm/board-qsd8x50.c | 8 ++++-
arch/arm/mach-pxa/idp.c | 5 +++
arch/arm/mach-pxa/lpd270.c | 8 ++++-
arch/arm/mach-realview/core.c | 7 ++++
arch/arm/mach-realview/realview_eb.c | 2 +-
arch/arm/mach-sa1100/neponset.c | 6 ++++
arch/arm/mach-sa1100/pleb.c | 7 ++++
drivers/net/ethernet/smsc/smc91x.c | 9 +++--
drivers/net/ethernet/smsc/smc91x.h | 114 ++----------------------------------------------------------
10 files changed, 57 insertions(+), 117 deletions(-)
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit:
1e02ce4ccc ("x86: Store a per-cpu shadow copy of CR4")
added a shadow CR4 such that reads and writes that do not
modify the CR4 execute much faster than always reading the
register itself.
The change modified cpu_init() in common.c, so that the
shadow CR4 gets initialized before anything uses it.
Unfortunately, there's two cpu_init()s in common.c. There's
one for 64-bit and one for 32-bit. The commit only added
the shadow init to the 64-bit path, but the 32-bit path
needs the init too.
Link: http://lkml.kernel.org/r/20150227125208.71c36402@gandalf.local.home Fixes: 1e02ce4ccc "x86: Store a per-cpu shadow copy of CR4"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150227145019.2bdd4354@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The arm-soc bug fixes this time around are mostly for the omap
platform, coming from a pull request from Tony Lindgren and are
almost entirely fixing dts files.
The other two changes enable support for the shmobile platform
in generic armv7 kernels and change some properties in the
ARM64 reference board dts files.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAVPDl6WCrR//JCVInAQJHDBAAhrvnfpj2vz8mzkI38LrQrBbygu5/oVbK
6+mmHDBVgJwnK2Y2W8XWlKBJyWySLS+WKH0iJL1LArC32BXn83vt5yA4wVex1BoM
luX3NzoB4fC1DApbH4xQqp69gwiU8RnPoyASE3BzEENuIms2P3ymiEOY4ursCAdj
zhUe5m/ksbJCsce+zVXYKLwtDDNByo/JQPstfd7l3e43jzdBtV6IBs+xznUMyWND
B0OR62WSVQ8UOAxvtuLMxLuMlPlcJ6o51fiTyke1tjy2aXr1QXuphZnYmK5oAsuO
qihLdmfR5wbU8QmYDzoUomZ6aPAjMOW1O5J35ZfteO3WgB79GS2N/21WspbvDeFh
J6aq2VNTnpFaUqOcqgkUuOzO5PqPN5vEbt60gx918PROFjOrOJFidDQOw7YnKUOq
IIV/vp0v8xo3zx96b3mOgVtq8cZn4o8HyAvTx7+YnNw22aF4i7JndsIZtDOoNtZW
a5rkdQloyhpfAgrxmp8gS+o8VhuxZNutAMIQw8rtXCX1Z4EMrtN3FA3GBCkRvzPb
R7XNI+dyTHPe3KKlM4cJqKf+HU+bAvPQ40+l/jgd9Dipmt0+noKQin45ooB6NzOs
0Z6+vBB8AtgrLsJomqYTsXfCIskh9B+KQWrt53bwZnPOzAHHJzxcynwt+tfusz4I
L1/Ew6lJtaM=
=tYZw
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"The arm-soc bug fixes this time around are mostly for the omap
platform, coming from a pull request from Tony Lindgren and are almost
entirely fixing dts files.
The other two changes enable support for the shmobile platform in
generic armv7 kernels and change some properties in the ARM64
reference board dts files"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: multi_v7_defconfig: Enable shmobile platforms
arm64: Add L2 cache topology to ARM Ltd boards/models
ARM: dts: am335x-bone*: usb0 is hardwired for peripheral
ARM: dts: dra7x-evm: beagle-x15: Fix USB Host
ARM: omap2plus_defconfig: Fix SATA boot
ARM: omap2plus_defconfig: Enable OMAP NAND BCH driver
ARM: dts: dra7: Correct the dma controller's property names
ARM: dts: omap5: Correct the dma controller's property names
ARM: dts: omap4: Correct the dma controller's property names
ARM: dts: omap3: Correct the dma controller's property names
ARM: dts: omap2: Correct the dma controller's property names
ARM: dts: am437x-idk: fix sleep pinctrl state
ARM: omap2plus_defconfig: enable TPS62362 regulator
ARM: dts: am437x-idk: fix TPS62362 i2c bus
ARM: dts: n900: Fix offset for smc91x ethernet
ARM: dts: n900: fix i2c bus numbering
ARM: dts: Fix USB dts configuration for dm816x
ARM: dts: OMAP5: Fix SATA PHY node
ARM: dts: DRA7: Fix SATA PHY node
- ftrace branch generation fix
- branch instruction encoding fix
- include files, guards and unused prototypes clean-up
- minor VDSO ABI fix (clock_getres)
- PSCI functions moved to .S to avoid compilation error with gcc 5
- pte_modify fix to not ignore the mapping type
- crypto: AES interleaved increased to 4x (for performance reasons)
- text patching fix for modules
- swiotlb increased back to 64MB
- copy_siginfo_to_user32() fix for big endian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU8LqKAAoJEGvWsS0AyF7xE/IQALJa+y7Ow3vkXEaMmDwTWXDD
o2G/8ecZXgKhLnO/1XokHOOpRvkmWRlo4gQSTRmhEtDl4+4x0KNNOJ2CnctTZpvc
rIUS0PrjR3EWigU9WLdXCg4FtyO44w1hTBUimmPddzD1LcxsgMJd041HQcvsWhAL
gHq0Y+asB3cAHSEdRuiBevDFJ0Y+O+JocR911hPbfgv5do/rzcCoHlvOmgOaFe1S
voAq40XR0w2nRLxRj33EMYFngtkZgvzEEV2kCCSOn4yOhX91kX/Sg4gQL7AhcAJ/
Vla8Im7lmZiOSxFrqc536TQ5GG6x7dKaFh2OftC1vdMBtaS3oe104/2tF7IIpErP
t6OTkbdAUMImf0m+XhNpP4pUQneQIfl58WxqBOPk+H/YBoGOXzOrbozfxTp13OPS
LrOk5f8fdisgCYDvEODJZVtnRiqGNtkWZiNEmSFj+EpqBxJTlNTEsEAQe/ThuSY2
/Gha+Ar64zwKRY+KtqcuCAuNEvJpyg4z7PAGMMtznSXtBZ6Xn+H6ME8IB/B+y9j6
URqKlcMnPNiLJAE/ecEx7EoNsvg+DLO30fxs+LPhQ11zoEAWIQTMQiw0fTYdMgU5
KXXCelXTrIyX5uC/aJulDzknNWohXZQlMR/9hwKxRtNnGvMjQM19oRsgM3g3WVRT
U1Xnv69tFHJG3kSh2bY5
=IGAn
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Various arm64 fixes:
- ftrace branch generation fix
- branch instruction encoding fix
- include files, guards and unused prototypes clean-up
- minor VDSO ABI fix (clock_getres)
- PSCI functions moved to .S to avoid compilation error with gcc 5
- pte_modify fix to not ignore the mapping type
- crypto: AES interleaved increased to 4x (for performance reasons)
- text patching fix for modules
- swiotlb increased back to 64MB
- copy_siginfo_to_user32() fix for big endian"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpuidle: add asm/proc-fns.h inclusion
arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big endian
arm64: Increase the swiotlb buffer size 64MB
arm64: Fix text patching logic when using fixmap
arm64: crypto: increase AES interleave to 4x
arm64: enable PTE type bit in the mask for pte_modify
arm64: mm: remove unused functions and variable protoypes
arm64: psci: move psci firmware calls out of line
arm64: vdso: minor ABI fix for clock_getres
arm64: guard asm/assembler.h against multiple inclusions
arm64: insn: fix compare-and-branch encodings
arm64: ftrace: fix ftrace_modify_graph_caller for branch replace
ARM64 CPUidle driver requires the cpu_do_idle function so that it can
be used to enter the shallowest idle state, and it is declared in
asm/proc-fns.h.
The current ARM64 CPUidle driver does not include asm/proc-fns.h
explicitly and it has so far relied on implicit inclusion from other
header files.
Owing to some header dependencies reshuffling this currently triggers
build failures when CONFIG_ARM64_64K_PAGES=y:
drivers/cpuidle/cpuidle-arm64.c: In function "arm64_enter_idle_state"
drivers/cpuidle/cpuidle-arm64.c:42:3: error: implicit declaration of
function "cpu_do_idle" [-Werror=implicit-function-declaration]
cpu_do_idle();
^
This patch adds the explicit inclusion of the asm/proc-fns.h header file
in the arm64 asm/cpuidle.h header file, so that the build breakage is fixed
and the required header inclusion is added to the appropriate arch back-end
CPUidle header, already included by the CPUidle arm64 driver, where
CPUidle arch related function declarations belong.
Reported-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The native (64-bit) sigval_t union contains sival_int (32-bit) and
sival_ptr (64-bit). When a compat application invokes a syscall that
takes a sigval_t value (as part of a larger structure, e.g.
compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t
union is converted to the native sigval_t with sival_int overlapping
with either the least or the most significant half of sival_ptr,
depending on endianness. When the corresponding signal is delivered to a
compat application, on big endian the current (compat_uptr_t)sival_ptr
cast always returns 0 since sival_int corresponds to the top part of
sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int
is copied to the compat_siginfo_t structure.
Cc: <stable@vger.kernel.org>
Reported-by: Bamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Tested-by: Bamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With commit 3690951fc6 (arm64: Use swiotlb late initialisation), the
swiotlb buffer size is limited to MAX_ORDER_NR_PAGES. However, there are
platforms with 32-bit only devices that require bounce buffering via
swiotlb. This patch changes the swiotlb initialisation to an early 64MB
memblock allocation. In order to get the swiotlb buffer correctly
allocated (via memblock_virt_alloc_low_nopanic), this patch also defines
ARCH_LOW_ADDRESS_LIMIT to the maximum physical address capable of 32-bit
DMA.
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since commit 8cfc99b583 ("s390: add pci_iomap_range") we use
EXPORT_SYMBOL for pci_iomap but EXPORT_SYMBOL_GPL for pci_iounmap.
Change the related functions to use EXPORT_SYMBOL like the asm-generic
variants do.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit 8cfc99b583 ("s390: add pci_iomap_range") introduced counters
to keep track of the number of mappings created. This revealed that
we don't have our internal mappings in order when using hotunplug or
resume from hibernate. This patch addresses both issues.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit 054954eb05 ("xen: switch to
linear virtual mapped sparse p2m list") introduced an error.
During initialization of the p2m list a p2m identity area mapped by
a complete identity pmd entry has to be split up into smaller chunks
sometimes, if a non-identity pfn is introduced in this area.
If this non-identity pfn is not at index 0 of a p2m page the new
p2m page needed is initialized with wrong identity entries, as the
identity pfns don't start with the value corresponding to index 0,
but with the initial non-identity pfn. This results in weird wrong
mappings.
Correct the wrong initialization by starting with the correct pfn.
Cc: stable@vger.kernel.org # 3.19
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The old implementation assumed that SP at the time of __switch_to() is
right above pt_regs which is almost certainly not the case as there will
be some stack build up between entry into kernel and leading up to
__switch_to
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
/proc/<pid>/maps currently don't annotate stack vma with "[stack]"
This is because KSTK_ESP ie expected to return usermode SP of tsk while
currently it returns the kernel mode SP of a sleeping tsk.
While the fix is trivial, we also need to adjust the ARC kernel stack
unwinder to not use KSTK_SP and friends any more.
Cc: <stable@vger.kernel.org>
Reported-and-suggested-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The arc unwinder can also be used for perf callchains.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
During CPU shutdown the exynos_cpu_power_down() is called after
disabling cache coherency and it uses LDREX and STREX instructions (by
calling of_machine_is_compatible() -> kobject_get() -> kref_get()).
The LDREX and STREX should not be used after disabling the cache
coherency so just use soc_is_exynos().
Fixes: adc548d77c ("ARM: EXYNOS: Use MCPM call-backs to support S2R
on exynos5420")
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
The patch adds domain definition and references to it in appropriate devices.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
[mszyprow: rebased onto generic power domains dt bindings]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Mixed block needs to control hdmi clock to properly perform power on/off
operation, so add 'hdmi' clock also to mixer nodes.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This patch adds configuration of hw modules required to enable HDMI
support on Universal C210 board.
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This patch adds nodes specific to Exynos4412 based Odroid X/X2/U2/U3
boards required for enabling HDMI display.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
TV Mixer needs both TV and LCD0 domains enabled to be fully operational.
This dependency is modelled by making TV power domains a sub-domain of
LCD0 power domain.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This patch adds entries for HDMI, Mixer and i2c with hdmi-phy modules
found in Exynos 4210 and 4x12 SoCs.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This patch adds support for making one power domain a sub-domain of
other domain. This is useful for modeling power dependences for devices
like TV Mixer or Camera ISP, which needs to have more than one power
domain enabled to be operational.
Based on previous work by Amit Daniel Kachhap <amit.daniel@samsung.com>.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Presented device tree bindings provide data already hardcoded in the
exynos_tmu_data.c file.
After this commit, it should be possible to reuse common thermal core
framework in Exynos SoCs.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This commit provides information about Exynos5440 device configuration.
Previously this information was available in exynos_tmu_data.c file.
Now it is available in the device tree.
Such approach allows reusing some common code for thermal.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Trip points corresponding to the one defined in the exynos_tmu_data.c
for Exynos4 have been included.
This thermal-zones attribute is afterwards reused for Exynos4210,
Exynos4412 and Exynos5250.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This code groups in one place default settings of trip points.
It is used in SoCs with multiple instances of TMU sensor.
Separate device tree file prevents from multiple copying of the
same data.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Exynos 4 and 5 family of SoCs uses almost identical TMU sensor to
measure the on chip temperature. For this reason it is possible to
group TMU configuration parameters in one dts file.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Presented patch aims to move data necessary for correct CPU cooling device
configuration from exynos_tmu_data.c to device tree.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
This commit enables TMU IP block on the Exynos4412 Odroid based
devices such as Odroid U3.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
The thermal IP block (Thermal Management Unit) called TMU has been
enabled in this device.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Patch 2f896d5866 ("arm64: use fixmap for text patching") changed
the way we patch the kernel text, using a fixmap when the kernel or
modules are flagged as read only.
Unfortunately, a flaw in the logic makes it fall over when patching
modules without CONFIG_DEBUG_SET_MODULE_RONX enabled:
[...]
[ 32.032636] Call trace:
[ 32.032716] [<fffffe00003da0dc>] __copy_to_user+0x2c/0x60
[ 32.032837] [<fffffe0000099f08>] __aarch64_insn_write+0x94/0xf8
[ 32.033027] [<fffffe000009a0a0>] aarch64_insn_patch_text_nosync+0x18/0x58
[ 32.033200] [<fffffe000009c3ec>] ftrace_modify_code+0x58/0x84
[ 32.033363] [<fffffe000009c4e4>] ftrace_make_nop+0x3c/0x58
[ 32.033532] [<fffffe0000164420>] ftrace_process_locs+0x3d0/0x5c8
[ 32.033709] [<fffffe00001661cc>] ftrace_module_init+0x28/0x34
[ 32.033882] [<fffffe0000135148>] load_module+0xbb8/0xfc4
[ 32.034044] [<fffffe0000135714>] SyS_finit_module+0x94/0xc4
[...]
This is triggered by the use of virt_to_page() on a module address,
which ends to pointing to Nowhereland if you're lucky, or corrupt
your precious data if not.
This patch fixes the logic by mimicking what is done on arm:
- If we're patching a module and CONFIG_DEBUG_SET_MODULE_RONX is set,
use vmalloc_to_page().
- If we're patching the kernel and CONFIG_DEBUG_RODATA is set,
use virt_to_page().
- Otherwise, use the provided address, as we can write to it directly.
Tested on 4.0-rc1 as a KVM guest.
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch increases the interleave factor for parallel AES modes
to 4x. This improves performance on Cortex-A57 by ~35%. This is
due to the 3-cycle latency of AES instructions on the A57's
relatively deep pipeline (compared to Cortex-A53 where the AES
instruction latency is only 2 cycles).
At the same time, disable inline expansion of the core AES functions,
as the performance benefit of this feature is negligible.
Measured on AMD Seattle (using tcrypt.ko mode=500 sec=1):
Baseline (2x interleave, inline expansion)
------------------------------------------
testing speed of async cbc(aes) (cbc-aes-ce) decryption
test 4 (128 bit key, 8192 byte blocks): 95545 operations in 1 seconds
test 14 (256 bit key, 8192 byte blocks): 68496 operations in 1 seconds
This patch (4x interleave, no inline expansion)
-----------------------------------------------
testing speed of async cbc(aes) (cbc-aes-ce) decryption
test 4 (128 bit key, 8192 byte blocks): 124735 operations in 1 seconds
test 14 (256 bit key, 8192 byte blocks): 92328 operations in 1 seconds
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Caught during Trinity testing. The pte_modify does not allow
modification for PTE type bit. This cause the test to hang
the system. It is found that the PTE can't transit from an
inaccessible page (b00) to a valid page (b11) because the mask
does not allow it. This happens when a big block of mmaped
memory is set the PROT_NONE, then the a small piece is broken
off and set to PROT_WRITE | PROT_READ cause a huge page split.
Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The functions __cpu_flush_user_tlb_range and __cpu_flush_kern_tlb_range
were removed in commit fa48e6f780 'arm64: mm: Optimise tlb flush logic
where we have >4K granule'. Global variable cpu_tlb was never used in
arm64.
Remove them.
Signed-off-by: Yingjoe Chen <yingjoe.chen@mediatek.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
An arm64 allmodconfig fails to build with GCC 5 due to __asmeq
assertions in the PSCI firmware calling code firing due to mcount
preambles breaking our assumptions about register allocation of function
arguments:
/tmp/ccDqJsJ6.s: Assembler messages:
/tmp/ccDqJsJ6.s:60: Error: .err encountered
/tmp/ccDqJsJ6.s:61: Error: .err encountered
/tmp/ccDqJsJ6.s:62: Error: .err encountered
/tmp/ccDqJsJ6.s:99: Error: .err encountered
/tmp/ccDqJsJ6.s💯 Error: .err encountered
/tmp/ccDqJsJ6.s:101: Error: .err encountered
This patch fixes the issue by moving the PSCI calls out-of-line into
their own assembly files, which are safe from the compiler's meddling
fingers.
Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The vdso implementation of clock_getres currently returns 0 (success)
whenever a null timespec is provided by the caller, regardless of the
clock id supplied.
This behavior is incorrect. It should fall back to syscall when an
unrecognized clock id is passed, even when the timespec argument is
null. This ensures that clock_getres always returns an error for
invalid clock ids.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The hardware folks told me that for page clearing "when you exactly
know what to do, hand written xc+pfd is usally faster then mvcl for
page clearing, as it saves millicode overhead and parameter parsing
and checking" as long as you dont need the cache bypassing.
Turns out that gcc already does a proper xc,pfd loop.
A small test on z196 that does
buff = mmap(NULL, bufsize,PROT_EXEC|PROT_WRITE|PROT_READ,AP_PRIVATE| MAP_ANONYMOUS,0,0);
for ( i = 0; i < bufsize; i+= 256)
buff[i] = 0x5;
gets 20% faster (touches every cache line of a page)
and
buff = mmap(NULL, bufsize,PROT_EXEC|PROT_WRITE|PROT_READ,AP_PRIVATE| MAP_ANONYMOUS,0,0);
for ( i = 0; i < bufsize; i+= 4096)
buff[i] = 0x5;
is within noise ratio (touches one cache line of a page).
As the clear_page is usually called for first memory accesses
we can assume that at least one cache line is used afterwards,
so this change should be always better.
Another benchmark, a make -j 40 of my testsuite in tmpfs with
hot caches on a 32cpu system:
-- unpatched -- -- patched --
real 0m1.017s real 0m0.994s (~2% faster, but in noise)
user 0m5.339s user 0m5.016s (~6% faster)
sys 0m0.691s sys 0m0.632s (~8% faster)
Let use the same define to memset as the asm-generic variant
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make sure that even in error situations we do not use copy_to_user
on uninitialized kernel memory.
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix the output of the jump label sanity check and also print the
code pattern that is supposed to be written to the jump label.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixed hwmod data for pcie by having the correct module mode offset.
Previously this module mode offset was part of pcie PHY which was wrong.
Now this module mode offset was moved to pcie hwmod and removed the hwmod data
for pcie phy. While at that renamed pcie_hwmod to pciess_hwmod in order
to match with the name given in TRM.
This helps to get rid of the following warning
"omap_hwmod: pcie1: _wait_target_disable failed"
[Grygorii.Strashko@linaro.org: Found the issue that actually caused
"omap_hwmod: pcie1: _wait_target_disable failed"]
Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Add struct lock_class_key to omap_hwmod struct and use it to set unique
lockdep class per hwmod.
This will ensure that lockdep will know that each omap_hwmod->_lock should
be treated as separate class and will not give false warning about deadlock
or other issues due to nested use of hwmods.
DRA7x's ATL hwmod is one example for this since McASP can select ATL clock
as functional clock, which will trigger nested oh->_lock usage. This will
trigger false warning from lockdep validator as it is dealing with classes
and for it all hwmod clocks are the same class.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>