Commit graph

59 commits

Author SHA1 Message Date
Santosh Shilimkar
b0f20ff9d7 omap4: l2x0: Set share override bit
Clearing bit 22 in the PL310 Auxiliary Control register (shared
attribute override enable) has the side effect of transforming Normal
Shared Non-cacheable reads into Cacheable no-allocate reads.

Coherent DMA buffers in Linux always have a Cacheable alias via the
kernel linear mapping and the processor can speculatively load cache
lines into the PL310 controller. With bit 22 cleared, Non-cacheable
reads would unexpectedly hit such cache lines leading to buffer
corruption

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:32:55 -08:00
Mans Rullgard
11e0264046 omap4: l2x0: enable instruction and data prefetching
Enabling L2 prefetching improves performance as shown on Panda
ES2.1 board with mem test, and it has measurable impact on
performances. I think we should consider it, even though it damages
"writes" a bit. (rebased to k.org)
Usually the prefetch is used at both levels together L1 + L2, however,
to enable the CP15 prefetch engines, these are under security, and on
GP devices, we cannot enable it(e.g. on PandaBoard). However, just
enabling PL310 prefetch seems to provide performance improvement,
as shown in the data below (from Ubuntu) and would be a great thing
to pull in.

What prefetch does is enable automatic next line prefetching. With this
enabled, whenever the PL310 receives a cachable read request, it
automatically prefetches the following cache line as well.

Measurement Data:
==
STOCK 10.10 WITHOUT PATCH

========================
~# ./memspeed
size    8388608 8192k 8M
offset  8388608, 0
buffers 0x2aaad000 0x2b2ad000
copy  libc          133 MB/s
copy  Android v5    273 MB/s
copy  Android NEON  235 MB/s
copy  INT32         116 MB/s
copy  ASM ARM       187 MB/s
copy  ASM VLDM 64   204 MB/s
copy  ASM VLDM 128  173 MB/s
copy  ASM VLD1      216 MB/s
read  ASM ARM       286 MB/s
read  ASM VLDM      242 MB/s
read  ASM VLD1      286 MB/s
write libc         1947 MB/s
write ASM ARM      1943 MB/s
write ASM VSTM     1942 MB/s
write ASM VST1     1935 MB/s

10.10 + PATCH
=============
~# ./memspeed
size    8388608 8192k 8M
offset  8388608, 0
buffers 0x2ab17000 0x2b317000
copy  libc          129 MB/s
copy  Android v5    256 MB/s
copy  Android NEON  356 MB/s
copy  INT32         127 MB/s
copy  ASM ARM       321 MB/s
copy  ASM VLDM 64   337 MB/s
copy  ASM VLDM 128  321 MB/s
copy  ASM VLD1      350 MB/s
read  ASM ARM       496 MB/s
read  ASM VLDM      470 MB/s
read  ASM VLD1      488 MB/s
write libc         1701 MB/s
write ASM ARM      1682 MB/s
write ASM VSTM     1693 MB/s
write ASM VST1     1681 MB/s

Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:32:41 -08:00
Santosh Shilimkar
1773e60a81 omap4: l2x0: Construct the AUXCTRL value using defines
This patch removes the hardcoded value of auxctrl value and
construct it using bitfields

Bit 25 is reserved and is always set to 1. Same value
of this bit is retained in this patch

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:31:59 -08:00
Russell King
ff2e27ae0b ARM: GIC: consolidate gic_cpu_base_addr to common GIC code
Every architecture using the GIC has a gic_cpu_base_addr pointer for
GIC 0 for their entry assembly code to use to decode the cause of the
current interrupt.  Move this into the common GIC code.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-14 19:21:42 +00:00
Russell King
b580b899dd ARM: GIC: provide a single initialization function for boot CPU
Provide gic_init() which initializes the GIC distributor and current
CPU's GIC interface for the boot (or single) CPU.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-14 19:21:30 +00:00
Russell King
be6786ac73 Merge branch 'l2x0-pull-rmk' of git://dev.omapzoom.org/pub/scm/santosh/kernel-omap4-base into devel-stable 2010-10-28 14:42:06 +01:00
Santosh Shilimkar
4e803c40b3 omap4: l2x0: Override the default l2x0_disable
The machine_kexec() calls outer_disable which can crash on OMAP4
becasue of trustzone restrictions.

This patch overrides the default l2x0_disable with a OMAP4
specific implementation taking care of trustzone

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
2010-10-26 11:40:00 +05:30
Santosh Shilimkar
a777b7277e omap4: l2x0: Fix init parameter for es2.0
On ES2.0 the L2 cache init parameter ineeds to be changed to take
care of cache size. The cache size is 1MB on ES2.0 vs 512KB on ES1.0

This patch fixes the init parameter to update the same using
dynamic cpu version check

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-09-24 11:30:18 +05:30
Santosh Shilimkar
fbc9be106e omap4: Move SOC specific code from board file
This patch moves OMAP4 soc specific code from 4430sdp board file.
The change is necessary so that newer board support can be added
with minimal changes. This will be also problematic for
multi-board, multi-omap builds.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-05-20 11:17:51 -07:00