Use the generic idle loop and replace enable/disable_hlt with the
respective core functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Tested-by: Kevin Hilman <khilman@linaro.org> # OMAP
Link: http://lkml.kernel.org/r/20130321215233.826238797@linutronix.de
Many ARMv7 cores have hardware page table walkers that can read the L1
cache. This is discoverable from the ID_MMFR3 register, although this
can be expensive to access from the low-level set_pte functions and is a
pain to cache, particularly with multi-cluster systems.
A useful observation is that the multi-processing extensions for ARMv7
require coherent table walks, meaning that we can make use of ALT_SMP
patching in proc-v7-* to patch away the cache flush safely for these
cores.
Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
commit 91d1aa43 (context_tracking: New context tracking susbsystem)
generalized parts of the RCU userspace extended quiescent state into
the context tracking subsystem. Context tracking is then used
to implement adaptive tickless (a.k.a extended nohz)
To support the new context tracking subsystem on ARM, the user/kernel
boundary transtions need to be instrumented.
For exceptions and IRQs in usermode, the existing usr_entry macro is
used to instrument the user->kernel transition. For the return to
usermode path, the ret_to_user* path is instrumented. Using the
usr_entry macro, this covers interrupts in userspace, data abort and
prefetch abort exceptions in userspace as well as undefined exceptions
in userspace (which is where FP emulation and VFP are handled.)
For syscalls, the slow return path is covered by instrumenting the
ret_to_user path. In addition, the syscall entry point is
instrumented which covers the user->kernel transition for both fast
and slow syscalls, and an additional instrumentation point is added
for the fast syscall return path (ret_fast_syscall).
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To ease page table updates with 64-bit descriptors, CPUs implementing
LPAE are required to implement ldrd/strd as atomic operations.
This patch uses these accessors instead of the exclusive variants when
performing atomic64_{read,set} on LPAE systems.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The PCI specifications says that an I/O region must be aligned on a 4
KB boundary, and a memory region aligned on a 1 MB boundary.
However, the Marvell PCIe interfaces rely on address decoding windows
(which allow to associate a range of physical addresses with a given
device). For PCIe memory windows, those windows are defined with a 1
MB granularity (which matches the PCI specs), but PCIe I/O windows can
only be defined with a 64 KB granularity, so they have to be 64 KB
aligned. We therefore need to tell the PCI core about this special
alignement requirement.
The PCI core already calls pcibios_align_resource() in the ARM PCI
core, specifically for such purposes. So this patch extends the ARM
PCI core so that it calls a ->align_resource() hook registered by the
PCI driver, exactly like the existing ->map_irq() and ->swizzle()
hooks.
A particular PCI driver can register a align_resource() hook, and do
its own specific alignement, depending on the specific constraints of
the underlying hardware.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 70264367a2 ("ARM: 7653/2: do not scale loops_per_jiffy when
using a constant delay clock") fixed a problem with our timer-based
delay loop, where loops_per_jiffy is scaled by cpufreq yet used directly
by the timer delay ops.
This patch fixes the problem in a more elegant way by keeping a private
ticks_per_jiffy field in the delay ops, independent of loops_per_jiffy
and therefore not subject to scaling. The loop-based delay continues to
use loops_per_jiffy directly, as it should.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On Cortex-A15 (r0p0..r3p2) the TLBI/DSB are not adequately shooting down
all use of the old entries. This patch implements the erratum workaround
which consists of:
1. Dummy TLBIMVAIS and DSB on the CPU doing the TLBI operation.
2. Send IPI to the CPUs that are running the same mm (and ASID) as the
one being invalidated (or all the online CPUs for global pages).
3. CPU receiving the IPI executes a DMB and CLREX (part of the exception
return code already).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'gic' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
irqchip: gic: Perform the gic_secondary_init() call via CPU notifier
irqchip: gic: Call handle_bad_irq() directly
arm: Move chained_irq_(enter|exit) to a generic file
arm: Move the set_handle_irq and handle_arch_irq declarations to asm/irq.h
+ Linux 3.9-rc3
Signed-off-by: Olof Johansson <olof@lixom.net>
These functions have been introduced by commit 10a8c383 (irq: introduce
entry and exit functions for chained handlers) in asm/mach/irq.h. This
patch moves them to linux/irqchip/chained_irq.h so that generic irqchip
drivers do not rely on architecture specific header files.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
This patch prepares the removal of <asm/mach/irq.h> include in the
GIC and VIC irqchip drivers.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
This is only used by 740t, which is a v4 core and (by my reading of the
datasheet for the CPU) ignores CRm for the cp15 cache flush operation,
making the v4 cache implementation in cache-v4.S sufficient for this
CPU.
Tested with 740T core-tile on Integrator/AP baseboard.
Acked-by: Hyok S. Choi <hyok.choi@samsung.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
* Compile warnings and errors (one on x86, two on ARM)
* WARNING in xen-pciback
* Use the acpi_processor_get_performance_info instead of the 'register' version
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRP33uAAoJEFjIrFwIi8fJfxMIAJxK7jI7NghJ87b8uk3IrkDl
VDsY5Re86XWE0HjsFDJXfWZCGumhpMb8hI1RJBavLDWQmk4THBqsKbx9rDgRZUJO
HepvhwD/RdK+nnjVmbAzJKuc7aOKQmeEW3Weysb32DW9C5Wg27LRrto/oVJIUsyj
4HrLcsZZqCiEYUZjGwrkrMC1yXx+2XvkX9mXq81hGj0xU4X1hMizwBAJvSE5lKca
r4LaCoZ0SuayHYHjEtxyjAcu39wnMdlCkdW9DUlqrjK7L3X29ifVvnb+aS951O5E
t57DCHY8WkI1BkaoHaAQqI7/Y1HpYbT8BALVmxXoIInI9lc8VLHZOs8/fVVDppQ=
=MxoB
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
- Compile warnings and errors (one on x86, two on ARM)
- WARNING in xen-pciback
- Use the acpi_processor_get_performance_info instead of the 'register'
version
* tag 'stable/for-linus-3.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/acpi: remove redundant acpi/acpi_drivers.h include
xen: arm: mandate EABI and use generic atomic operations.
acpi: Export the acpi_processor_get_performance_info
xen/pciback: Don't disable a PCI device that is already disabled.
Rob Herring has observed that c81611c4e9 "xen: event channel arrays are
xen_ulong_t and not unsigned long" introduced a compile failure when building
without CONFIG_AEABI:
/tmp/ccJaIZOW.s: Assembler messages:
/tmp/ccJaIZOW.s:831: Error: even register required -- `ldrexd r5,r6,[r4]'
Will Deacon pointed out that this is because OABI does not require even base
registers for 64-bit values. We can avoid this by simply using the existing
atomic64_xchg operation and the same containerof trick as used by the cmpxchg
macros. However since this code is used on memory which is shared with the
hypervisor we require proper atomic instructions and cannot use the generic
atomic64 callbacks (which are based on spinlocks), therefore add a dependency
on !GENERIC_ATOMIC64. Since we already depend on !CPU_V6 there isn't much
downside to this.
While thinking about this we also observed that OABI has different struct
alignment requirements to EABI, which is a problem for hypercall argument
structs which are shared with the hypervisor and which must be in EABI layout.
Since I don't expect people to want to run OABI kernels on Xen depend on
CONFIG_AEABI explicitly too (although it also happens to be enforced by the
!GENERIC_ATOMIC64 requirement too).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robherring2@gmail.com>
Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Now that we have OF based init with CLKSRC_OF, convert smp_twd init
function to use it and covert all callers of
twd_local_timer_of_register.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Viresh Kumar <viresh.linux@gmail.com>
Cc: Shiraz Hashim <shiraz.hashim@st.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-omap@vger.kernel.org
Cc: spear-devel@list.st.com
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
v8 is capable of invalidating Stage-2 by IPA, but v7 is not.
Change kvm_tlb_flush_vmid() to take an IPA parameter, which is
then ignored by the invalidation code (and nuke the whole TLB
as it always did).
This allows v8 to implement a more optimized strategy.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
arm64 cannot represent the kernel VAs in HYP mode, because of the lack
of TTBR1 at EL2. A way to cope with this situation is to have HYP VAs
to be an offset from the kernel VAs.
Introduce macros to convert a kernel VA to a HYP VA, make the HYP
mapping functions use these conversion macros. Also change the
documentation to reflect the existence of the offset.
On ARM, where we can have an identity mapping between kernel and HYP,
the macros are without any effect.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to keep the VFP allocation code common, use an abstract type
for the VFP containers. Maps onto struct vfp_hard_struct on ARM.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Make the split of the pgd_ptr an implementation specific thing
by moving the init call to an inline function.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Move low level MMU-related operations to kvm_mmu.h. This makes
the MMU code reusable by the arm64 port.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This one got lost in the move to handle_exit, so let's reintroduce it
using an accessor to the immediate value field like the other ones.
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The exit handler selection code cannot be shared with arm64
(two different modes, more exception classes...).
Move it to a separate file (handle_exit.c).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Bit 8 is cache maintenance, bit 9 is external abort.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Instead of directly accessing the fault registers, use proper accessors
so the core code can be shared.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
On 32bit ARM, unsigned long is guaranteed to be a 32bit quantity.
On 64bit ARM, it is a 64bit quantity.
In order to be able to share code between the two architectures,
convert the registers to be unsigned long, so the core code can
be oblivious of the change.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The ARM architecture requires explicit branch predictor maintenance
when updating an instruction stream for a given virtual address. In
reality, this isn't so much of a burden because the branch predictor
is flushed during the cache maintenance required to make the new
instructions visible to the I-side of the processor.
However, there are still some cases where explicit flushing is required,
so add a local_bp_flush_all operation to deal with this.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mm->context.id is updated under asid_lock when a new ASID is allocated
to an mm_struct. However, it is also read without the lock when a task
is being scheduled and checking whether or not the current ASID
generation is up-to-date.
If two threads of the same process are being scheduled in parallel and
the bottom bits of the generation in their mm->context.id match the
current generation (that is, the mm_struct has not been used for ~2^24
rollovers) then the non-atomic, lockless access to mm->context.id may
yield the incorrect ASID.
This patch fixes this issue by making mm->context.id and atomic64_t,
ensuring that the generation is always read consistently. For code that
only requires access to the ASID bits (e.g. TLB flushing by mm), then
the value is accessed directly, which GCC converts to an ldrb.
Cc: <stable@vger.kernel.org> # 3.8
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull late ARM updates from Russell King:
"Here is the late set of ARM updates for this merge window; in here is:
- The ARM parts of the broadcast timer support, core parts merged
through tglx's tree. This was left over from the previous merge to
allow the dependency on tglx's tree to be resolved.
- A fix to the VFP code which shows up on Raspberry Pi's, as well as
fixing the fallout from a previous commit in this area.
- A number of smaller fixes scattered throughout the ARM tree"
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm:
ARM: Fix broken commit 0cc41e4a21 corrupting kernel messages
ARM: fix scheduling while atomic warning in alignment handling code
ARM: VFP: fix emulation of second VFP instruction
ARM: 7656/1: uImage: Error out on build of multiplatform without LOADADDR
ARM: 7640/1: memory: tegra_ahb_enable_smmu() depends on TEGRA_IOMMU_SMMU
ARM: 7654/1: Preserve L_PTE_VALID in pte_modify()
ARM: 7653/2: do not scale loops_per_jiffy when using a constant delay clock
ARM: 7651/1: remove unused smp_timer_broadcast #define
Pull DMA-mapping updates from Marek Szyprowski:
"This time all patches are related only to ARM DMA-mapping subsystem.
The main extension provided by this pull request is highmem support.
Besides that it contains a bunch of small bugfixes and cleanups."
* 'for-v3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: DMA-mapping: fix memory leak in IOMMU dma-mapping implementation
ARM: dma-mapping: Add maximum alignment order for dma iommu buffers
ARM: dma-mapping: use himem for DMA buffers for IOMMU-mapped devices
ARM: dma-mapping: add support for CMA regions placed in highmem zone
arm: dma mapping: export arm iommu functions
ARM: dma-mapping: Add arm_iommu_detach_device()
ARM: dma-mapping: Add macro to_dma_iommu_mapping()
ARM: dma-mapping: Set arm_dma_set_mask() for iommu->set_dma_mask()
ARM: iommu: Include linux/kref.h in asm/dma-iommu.h
Pull slave-dmaengine updates from Vinod Koul:
"This is fairly big pull by my standards as I had missed last merge
window. So we have the support for device tree for slave-dmaengine,
large updates to dw_dmac driver from Andy for reusing on different
architectures. Along with this we have fixes on bunch of the drivers"
Fix up trivial conflicts, usually due to #include line movement next to
each other.
* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (111 commits)
Revert "ARM: SPEAr13xx: Pass DW DMAC platform data from DT"
ARM: dts: pl330: Add #dma-cells for generic dma binding support
DMA: PL330: Register the DMA controller with the generic DMA helpers
DMA: PL330: Add xlate function
DMA: PL330: Add new pl330 filter for DT case.
dma: tegra20-apb-dma: remove unnecessary assignment
edma: do not waste memory for dma_mask
dma: coh901318: set residue only if dma is in progress
dma: coh901318: avoid unbalanced locking
dmaengine.h: remove redundant else keyword
dma: of-dma: protect list write operation by spin_lock
dmaengine: ste_dma40: do not remove descriptors for cyclic transfers
dma: of-dma.c: fix memory leakage
dw_dmac: apply default dma_mask if needed
dmaengine: ioat - fix spare sparse complain
dmaengine: move drivers/of/dma.c -> drivers/dma/of-dma.c
ioatdma: fix race between updating ioat->head and IOAT_COMPLETION_PENDING
dw_dmac: add support for Lynxpoint DMA controllers
dw_dmac: return proper residue value
dw_dmac: fill individual length of descriptor
...
This can be built without CONFIG_ARM_DMA_USE_IOMMU.
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>