Commit graph

36,522 commits

Author SHA1 Message Date
Yinghai Lu
2759c3287d x86: don't call read_apic_id if !cpu_has_apic
read_apic_id should not be called if the APIC is disabled.

[ Impact: fix crash on certain UP configs ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <4A09CCBB.2000306@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 08:43:25 +02:00
Yinghai Lu
e5198075c6 x86, apic: introduce io_apic_irq_attr
according to Ingo, io_apic irq-setup related functions have too many
parameters with a repetitive signature.

So give the related functions fewer parameters by passing a pointer
to a newly defined io_apic_irq_attr structure.

v2: io_apic_irq ==> irq_attr
    triggering ==> trigger

v3: add set_io_apic_irq_attr
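
As a rough illustration of the idea (not the patch itself), the sketch below bundles the repeated parameters into one structure and passes a pointer to it. The field names, the body of set_io_apic_irq_attr() and the describe_irq() consumer are assumptions made for this example.

```c
#include <stdio.h>

/* Assumed layout: the fields mirror the parameters that used to be
 * passed one by one; the real structure may differ. */
struct io_apic_irq_attr {
	int ioapic;
	int ioapic_pin;
	int trigger;
	int polarity;
};

/* Helper named in the v3 note; the body here is a guess at its intent. */
static inline void set_io_apic_irq_attr(struct io_apic_irq_attr *irq_attr,
					int ioapic, int ioapic_pin,
					int trigger, int polarity)
{
	irq_attr->ioapic     = ioapic;
	irq_attr->ioapic_pin = ioapic_pin;
	irq_attr->trigger    = trigger;
	irq_attr->polarity   = polarity;
}

/* Hypothetical consumer: one pointer instead of four integers. */
static int describe_irq(int irq, const struct io_apic_irq_attr *irq_attr,
			char *buf, size_t len)
{
	return snprintf(buf, len,
			"irq %d -> apic %d pin %d trigger %d polarity %d",
			irq, irq_attr->ioapic, irq_attr->ioapic_pin,
			irq_attr->trigger, irq_attr->polarity);
}
```

A caller fills the structure once with set_io_apic_irq_attr() and hands the same pointer to every setup function, so adding a field later does not change any signature.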

[ Impact: cleanup ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4A08ACD3.2070401@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 08:38:55 +02:00
Magnus Lilja
135cad366b i.MX31: Add support for the CPLD on PDK Debug board.
The i.MX31 PDK consists of several boards, one of them is a debug
board containing a CPLD which controls some debug leds, switch
buttons, an interrupt chip and an Ethernet controller.
This patch adds support for detecting if the PDK board is present
(during boot) and adds the interrupt chip to the kernel.

Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-05-18 08:36:27 +02:00
Paul Mackerras
c0daaf3f1f perf_counter: powerpc: initialize cpuhw pointer before use
Commit 9e35ad38 ("perf_counter: Rework the perf counter
disable/enable") added code to the powerpc hw_perf_enable (renamed
from hw_perf_restore) to test cpuhw->disabled and return immediately
if it is not set (i.e. if the PMU is already enabled).

Unfortunately the test got added before cpuhw was initialized,
resulting in an oops the first time hw_perf_enable got called.
This fixes it by moving the initialization of cpuhw to before
cpuhw->disabled is tested.
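
The ordering problem can be sketched in plain C. This is only an illustration under assumed names: struct cpu_hw_counters, this_cpu_counters and hw_perf_enable_sketch() are stand-ins, not the powerpc code.

```c
/* Hypothetical stand-in for the per-CPU counter state. */
struct cpu_hw_counters {
	int disabled;
	int n_enables;
};

static struct cpu_hw_counters this_cpu_counters;

/* Returns 1 if the PMU was re-enabled, 0 if it was already enabled. */
static int hw_perf_enable_sketch(void)
{
	struct cpu_hw_counters *cpuhw;

	/* Initialize the pointer first ... */
	cpuhw = &this_cpu_counters;

	/* ... and only then test the flag through it. */
	if (!cpuhw->disabled)
		return 0;

	cpuhw->disabled = 0;
	cpuhw->n_enables++;
	return 1;
}
```

Placing the early-return test above the assignment would dereference an uninitialized pointer, which is the kind of oops described here.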

[ Impact: fix oops-causing bug on powerpc ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18960.56772.869734.304631@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 07:38:42 +02:00
Ingo Molnar
dc3f81b129 Merge commit 'v2.6.30-rc6' into perfcounters/core
Merge reason: this branch was on an -rc4 base, merge it up to -rc6
              to get the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 07:37:49 +02:00
Benjamin Herrenschmidt
0e337b42d6 powerpc: Explicit alignment for .data.cacheline_aligned
I don't think anything guarantees that the objects in data.page_aligned
are a multiple of PAGE_SIZE, thus the section may end on any boundary.

So the following section, .data.cacheline_aligned needs an explicit
alignment.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:05 +10:00
Geoff Levand
dc892288f4 powerpc/ps3: Update ps3_defconfig
Refresh and set these options:

 CONFIG_SYSFS_DEPRECATED_V2: y -> n
 CONFIG_INPUT_JOYSTICK:      y -> n
 CONFIG_HID_SONY:            n -> m
 CONFIG_RTC_DRV_PS3:         - -> m

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:05 +10:00
Steven Rostedt
c3cf8667ed powerpc/ftrace: Fix constraint to be early clobber
After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
graph tracer broke. This was discovered on my x86 boxes.

The issue is that gcc used the same register for an output as it did for
an input in an asm statement. I first thought this was a bug in gcc and
reported it. I was notified that gcc was correct and that the output had
to be flagged as an "early clobber".

I noticed that powerpc had the same issue and this patch fixes it.
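
For readers unfamiliar with the constraint, here is a minimal sketch of an early-clobber output. It uses x86-64 assembly purely for illustration (the patch itself touches powerpc), add_twice() is a made-up function, and a plain C fallback is provided for other targets.

```c
/* Computes a + 2*b. The output is written before the last input is read,
 * so it must not share a register with an input. */
static unsigned long add_twice(unsigned long a, unsigned long b)
{
#if defined(__x86_64__) && defined(__GNUC__)
	unsigned long out;

	__asm__("mov %1, %0\n\t"	/* out = a (output written early) */
		"add %2, %0\n\t"	/* out += b (b is still needed)   */
		"add %2, %0"		/* out += b                       */
		: "=&r" (out)		/* '&' marks the early clobber    */
		: "r" (a), "r" (b)
		: "cc");
	return out;
#else
	/* Portable fallback so the sketch builds on other targets. */
	return a + 2 * b;
#endif
}
```

With a plain "=r" output the compiler is allowed to place out in the same register as b, because it assumes all inputs are consumed before any output is written; the first mov would then destroy b.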

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:05 +10:00
Michael Ellerman
021376a3b6 powerpc/ftrace: Use pr_devel() in ftrace.c
pr_debug() can now result in code being generated even when #DEBUG
is not defined. That's not really desirable in the ftrace code
which we want to be snappy.
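
A user-space sketch of the difference, assuming a macro named pr_devel_sketch() that stands in for pr_devel(): without DEBUG the call sits behind a constant-false branch, so the arguments are still type-checked but the compiler can discard the call entirely.

```c
#include <stdio.h>

#ifdef DEBUG
#define pr_devel_sketch(fmt, ...) \
	do { fprintf(stderr, fmt, ##__VA_ARGS__); } while (0)
#else
/* Constant-false branch: checked by the compiler, then dropped. */
#define pr_devel_sketch(fmt, ...) \
	do { if (0) fprintf(stderr, fmt, ##__VA_ARGS__); } while (0)
#endif

/* Hypothetical hot-path function that carries a debug message. */
static int patch_site_sketch(unsigned long addr)
{
	pr_devel_sketch("patching %lx\n", addr);
	return 0;
}
```

The `##__VA_ARGS__` form is a GNU extension, which matches the compilers the kernel is built with.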

With CONFIG_DYNAMIC_DEBUG=y:

size before:
   text	   data	    bss	    dec	    hex	filename
   3334	    672	      4	   4010	    faa	arch/powerpc/kernel/ftrace.o

size after:
   text	   data	    bss	    dec	    hex	filename
   2616	    360	      4	   2980	    ba4	arch/powerpc/kernel/ftrace.o

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:04 +10:00
Mel Gorman
af3e4aca47 powerpc: Do not assert pte_locked for hugepage PTE entries
With CONFIG_DEBUG_VM, an assertion is made when changing the protection
flags of a PTE that the PTE is locked. Huge pages use a different pagetable
format and the assertion is bogus and will always trigger with a bug looking
something like

 Unable to handle kernel paging request for data at address 0xf1a00235800006f8
 Faulting instruction address: 0xc000000000034a80
 Oops: Kernel access of bad area, sig: 11 [#1]
 SMP NR_CPUS=32 NUMA Maple
 Modules linked in: dm_snapshot dm_mirror dm_region_hash
  dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic
  pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid
  windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor
  windfarm_cpufreq_clamp windfarm_core i2c_powermac
 NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003
 REGS: c000000003037600 TRAP: 0300   Not tainted (2.6.30-rc3-autokern1)
 MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 28002484  XER: 200fffff
 DAR: f1a00235800006f8, DSISR: 0000000040010000
 TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2
 GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500
 GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001
 GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8
 GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000
 GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20
 GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000
 GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8
 GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880
 NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0
 LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
 Call Trace:
 [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable)
 [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
 [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674
 [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8
 [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828
 [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584
 [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c
 Instruction dump:
 7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4
 7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000

This patch fixes the problem by not asserting the PTE is locked for VMAs
backed by huge pages.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:04 +10:00
Ben Dooks
543899f610 [ARM] S3C64XX: Use common watchdog reset for system reset.
Use the newly moved <plat/watchdog-reset.h> to perform the
arch_reset() call which has been unimplemented for a while.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-17 23:40:30 +01:00
Ben Dooks
3cba5ef862 [ARM] S3C: Move watchdog system reset to own file.
Move the watchdog reset code from <mach/system-reset.h> to
a new file <plat/watchdog-reset.h> as this code is needed
by s3c2410, s3c64xx and the soon-to-be-added s3c24a0.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-17 23:30:45 +01:00
Ben Dooks
9f05f6a921 [ARM] S3C24XX: GPIO: Remove pin specific input and output defines
The use of S3C2410_GP[A-Z]x_INP and S3C2410_GP[A-Z]x_OUTP are
very rare and are taking up large amounts of space in the
regs-gpio.h header.

The GPIO layer has had generic input and output defines called
S3C2410_GPIO_INPUT and S3C2410_GPIO_OUTPUT for a while which work
for all S3C24XX GPIOs.

Do the following replacements:

   S3C2410_GP[A-Z][0-9]*_\OUTP => S3C2410_GPIO_OUTPUT
   S3C2410_GP[A-Z][0-9]*_\INP  => S3C2410_GPIO_INPUT
   S3C2410_GPA[0-9]*_OUT       => S3C2410_GPIO_OUTPUT

to remove any usages of these and prepare the header for
the removal of these.

The following command was used to achieve this:

find . -type f -writable ! -name regs-gpio.h ! -name "*~" | xargs sed -i~ -e 's/S3C2410_GP[A-Z][0-9]*_\OUTP/S3C2410_GPIO_OUTPUT/g' -e 's/S3C2410_GP[A-Z][0-9]*_\INP/S3C2410_GPIO_INPUT/g' -e 's/S3C2410_GPA[0-9]*_OUT/S3C2410_GPIO_OUTPUT/g'

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-17 22:21:26 +01:00
Ben Dooks
ec7f4d5d67 [ARM] S3C24XX: GPIO: Remove s3c2410_gpio_irq2pin() call
Remove the s3c2410_gpio_irq2pin() function as it is not being
used in any in-kernel driver and the function is probably not
being used anywhere else.

This is also part of the effort to remove any of the s3c24xx gpio
specific code that cannot be recreated by using the gpiolib
framework now in the kernel.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-17 22:21:11 +01:00
Russell King
4c5158d4c3 [ARM] smp: fix style issues in smp_twd.c
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 19:16:41 +01:00
Russell King
f32f4ce257 [ARM] smp: allow re-use of realview localtimer TWD support
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 19:16:41 +01:00
Russell King
a8cbcd92bd [ARM] smp: separate SCU support code from realview
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 19:00:37 +01:00
Russell King
49613d4d9a [ARM] smp: SCU is used on non-realview platforms
The SCU can be used by non-realview platforms, so make it visible
for other people to use rather than having them copy the header file.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 18:59:57 +01:00
Russell King
bc28248ee2 [ARM] smp: move core localtimer support out of platform specific files
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 18:58:34 +01:00
Russell King
e1342f1da0 Merge branch 'smp-fix'
2009-05-17 17:13:18 +01:00
Russell King
ee348d5a1d [ARM] realview: fix broadcast tick support
Having discussed broadcast tick support with Thomas Gleixner, the
broadcast tick devices should be registered with a higher rating
than the global tick device, and it should have the ONESHOT and
PERIODIC feature flags set.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2009-05-17 17:11:35 +01:00
Russell King
78d236c2b3 [ARM] realview: remove useless smp_cross_call_done()
smp_cross_call_done() is a no-op for MPCore, and since it's only
used by platform code, there's no point in having it unless it's
doing something.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 16:23:45 +01:00
Russell King
826681043d [ARM] smp: fix cpumask usage in ARM SMP code
The ARM SMP code wasn't properly updated for the cpumask changes, which
results in smp_timer_broadcast() broadcasting ticks to non-online CPUs.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17 16:22:46 +01:00
Ingo Molnar
d2517a49d5 perf_counter, x86: fix zero irq_period counters
The quirk to irq_period unearthed an unrobustness we had in the
hw_counter initialization sequence: we left irq_period at 0, which
was then quirked up to 2 ... which then generated a _lot_ of
interrupts during 'perf stat' runs, slowed them down and skewed
the counter results in general.

Initialize irq_period to the maximum instead.
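
The fix can be pictured as follows. This is a hedged sketch: struct hw_counter_sketch, MAX_PERIOD_SKETCH and init_irq_period() are hypothetical names, and the limit value is chosen only for the example.

```c
#include <stdint.h>

/* Hypothetical per-counter state, for illustration only. */
struct hw_counter_sketch {
	uint64_t irq_period;
};

/* Assumed upper bound on the period; the real limit is PMU-specific. */
#define MAX_PERIOD_SKETCH ((1ULL << 31) - 1)

static void init_irq_period(struct hw_counter_sketch *hwc)
{
	/* A counter with no sampling period should interrupt as rarely
	 * as possible, so fall back to the maximum rather than to 0. */
	if (!hwc->irq_period)
		hwc->irq_period = MAX_PERIOD_SKETCH;
}
```

A caller-supplied non-zero period is left untouched; only the zero case, which a later quirk would have raised to a tiny value, is replaced.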

[ Impact: fix perf stat results ]

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-17 12:27:37 +02:00
Nelson Castillo
3f7ea467be [ARM] S3C: ADC: Expose number of remaining conversions to convert callback

This patch allows us to efficiently modify the number of
remaining conversions from the client side. This is useful
when we do not know in advance how many conversions we will
need or when we need to cancel pending conversions.

This change is simple enough to be compatible with existing
code that can just define the new pointer in the callback
and ignore it.
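
A toy model of the mechanism, written for this log and not taken from the driver: the callback type, run_conversions() and both clients are hypothetical, and the only point is that the callback receives a pointer it may write through.

```c
/* Hypothetical callback type: the last argument points at the number of
 * conversions still queued, so the callback can change it. */
typedef void (*convert_cb_t)(unsigned int val1, unsigned int val2,
			     unsigned int *samples_left);

/* Toy loop standing in for the ADC core; returns how many conversions
 * were actually delivered to the client. */
static unsigned int run_conversions(unsigned int nr_samples, convert_cb_t cb)
{
	unsigned int samples_left = nr_samples;
	unsigned int delivered = 0;

	while (samples_left > 0) {
		samples_left--;
		cb(delivered, delivered, &samples_left);
		delivered++;
	}
	return delivered;
}

/* Existing-style client: accepts the new pointer and ignores it. */
static void legacy_cb(unsigned int val1, unsigned int val2,
		      unsigned int *samples_left)
{
	(void)val1;
	(void)val2;
	(void)samples_left;
}

/* New-style client: cancels everything pending after the third sample. */
static void cancelling_cb(unsigned int val1, unsigned int val2,
			  unsigned int *samples_left)
{
	(void)val2;
	if (val1 == 2)
		*samples_left = 0;
}
```

With five conversions requested, the legacy client receives all five, while the cancelling client stops the run after three.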

Sample usage:

http://tinyurl.com/s3c2410-ts-c (function stylus_adc_action).

Signed-off-by: Nelson Castillo <arhuaco@freaks-unidos.net>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-16 22:22:01 +01:00
Nelson Castillo
b57f0fe107 [ARM] S3C: ADC: Fix lines with more than 80 chars in adc.h
Small cleanup.

Signed-off-by: Nelson Castillo <arhuaco@freaks-unidos.net>
[ben-linux@fluff.org: rewrote subject]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-16 22:22:01 +01:00
Ben Dooks
06fa1d37ce [ARM] SMDK6410: Add USB high-speed/OtG gadget device
Add the USB gadget support to the SMDK6410.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-16 22:14:09 +01:00
Ben Dooks
f0e1fa7600 [ARM] S3C: Add USB high-speed/OtG device definitions
Add platform device definitions for the high-speed and OtG
capable device block on the newer Samsung parts.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-05-16 22:13:52 +01:00
Ricardo Martins
776abac817 [ARM] 5513/1: Eurotech VIPER SBC: fix compilation error
Compilation for this board yields the following errors:

arch/arm/mach-pxa/viper.c:511: error: 'FFUART' undeclared here (not in a function)
arch/arm/mach-pxa/viper.c:520: error: 'BTUART' undeclared here (not in a function)
arch/arm/mach-pxa/viper.c:529: error: 'STUART' undeclared here (not in a function)

Fix them by including the necessary header.

Signed-off-by: Ricardo Martins <rasm@fe.up.pt>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-16 19:55:56 +01:00
Hartley Sweeten
ff05c0330b [ARM] 5509/1: ep93xx: clkdev enable UARTS
Fix the clkdev API support for the ep93xx uart clocks.

The uarts available in the ep93xx have individual clock controls.
The current implementation assumes that the bootloader has enabled
the clocks before the kernel has booted. It also assumes that the
bootloader has set the UARTBAUD bit indicating that the uarts are
running off the 14.7456MHz external crystal.

This fixes both issues. It also allows the uart clocks to be stopped
when there are no users.
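
One way to picture "enabled by the first user, stopped when there are no users" is a use-counted clock. The sketch below is an assumption-laden illustration: struct clk_sketch and both helpers are invented here and do not reflect the ep93xx clock code.

```c
/* Hypothetical clock with a use count and a single enable bit. */
struct clk_sketch {
	int users;
	int hw_enabled;
};

static void clk_enable_sketch(struct clk_sketch *clk)
{
	/* First user turns the clock on instead of trusting the bootloader. */
	if (clk->users++ == 0)
		clk->hw_enabled = 1;
}

static void clk_disable_sketch(struct clk_sketch *clk)
{
	/* Last user going away lets the clock be stopped. */
	if (clk->users > 0 && --clk->users == 0)
		clk->hw_enabled = 0;
}
```

Two enables followed by one disable leave the clock running; the second disable stops it.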

Tested-by: Matthias Kaehlcke <matthias@kaehlcke.net>

Cc: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-16 19:55:56 +01:00
Russell King
cddb783552 Merge branch 'omap-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
2009-05-16 19:51:20 +01:00
Russell King
b477dfba38 Merge branch 'fixes-rc5' of git://aeryn.fluff.org.uk/bjdooks/linux
2009-05-16 17:54:19 +01:00
Tony Lindgren
005187eeca ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2
This makes the framebuffer work on omap3.

Also fix the clk_get usage for checkpatch.pl
"ERROR: do not use assignment in if condition".

Cc: Imre Deak <imre.deak@nokia.com>
Cc: linux-fbdev-devel@lists.sourceforge.net
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-05-16 08:28:17 -07:00
Kalle Jokiniemi
8dbe43930a ARM: OMAP3: Fix HW SAVEANDRESTORE shift define
The OMAP3430ES2_SAVEANDRESTORE_SHIFT macro is used
by powerdomain code in
"1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT" manner, but
the definition was also (1 << 4), meaning we actually
modified bit 16. So the definition needs to be 4.
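
The double shift is easy to check in isolation. In this sketch the macro names are shortened stand-ins and saveandrestore_mask() is a hypothetical helper that mimics how callers build the mask.

```c
/* Names shortened for the sketch; values follow the description above. */
#define SAVEANDRESTORE_SHIFT_OLD	(1 << 4)	/* already a mask, not a shift */
#define SAVEANDRESTORE_SHIFT_NEW	4

/* Callers build the mask as "1 << SHIFT". */
static unsigned int saveandrestore_mask(unsigned int shift)
{
	return 1u << shift;
}
```

With the old definition the helper yields 0x10000 (bit 16); with the new one it yields 0x10 (bit 4), which is the bit the powerdomain code meant to touch.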

This also fixes a cold reset HW bug in OMAP3430 ES3.x
where some of the efuse bits are not isolated during
wake-up from off mode. This can cause randomish
cold resets with off mode. Enabling the USBTLL hardware
SAVEANDRESTORE causes the core power up assert to be
delayed in a way that we will not get faulty values
when boot ROM is reading the unisolated registers.

Signed-off-by: Kalle Jokiniemi <kalle.jokiniemi@digia.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-05-16 08:28:17 -07:00
Vikram Pandita
e102657ed1 ARM: OMAP3: Fix number of GPIO lines for 34xx
As per 3430 TRM, there are 6 banks [0 to 191]

Signed-off-by: Tom Rix <Tom.Rix@windriver.com>
Signed-off-by: Vikram Pandita <vikram.pandita@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-05-16 08:28:16 -07:00
Magnus Lilja
153fa1d8c6 i.MX31: Restructure UART setup for PDK board.
Restructure UART pin setup in preparation for adding other pins
in later patches.

Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-05-16 16:25:04 +02:00
Alberto Panizzo
92ab0a5015 ARM MXC: Make-sure-ipg_per_clk-is-generated-by-ipg_clk-and-not-usb_pll
From ff1fd9d7015d9b9ad3e0df2016d0415e2719747c Mon Sep 17 00:00:00 2001
From: Alberto Panizzo <maramaopercheseimorto@gmail.com>
Date: Fri, 15 May 2009 17:21:21 +0200
Subject: [PATCH] Make sure ipg_per_clk is generated by ipg_clk and not usb_pll

Signed-off-by: Alberto Panizzo <maramaopercheseimorto@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-05-16 16:19:02 +02:00
Magnus Lilja
183c7fff50 i.MX31: Add NAND device driver for Litekit board.
Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-05-16 16:17:40 +02:00
Hartley Sweeten
a2bd40d215 [ARM] 5504/1: ep93xx: Merge all edb93xx platforms
The Cirrus Logic EDB93xx development board platform init files
share redundant code. The only differences are in the flash
memory configuration, MACH_TYPE, and additional on-board
I2C devices. This patch merges all of them into one file.

Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Herbert Valerio Riedel <hvr@gnu.org>
Cc: Toufeeq Hussain <toufeeq_hussain@infosys.com>
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-15 20:42:57 +01:00
Jeremy Fitzhardinge
b4ecc12699 x86: Fix performance regression caused by paravirt_ops on native kernels
Xiaohui Xin and some other folks at Intel have been looking into what's
behind the performance hit of paravirt_ops when running native.

It appears that the hit is entirely due to the paravirtualized
spinlocks introduced by:

 | commit 8efcbab674
 | Date:   Mon Jul 7 12:07:51 2008 -0700
 |
 |     paravirt: introduce a "lock-byte" spinlock implementation

The extra call/return in the spinlock path is somehow
causing an increase in the cycles/instruction of somewhere around 2-7%
(seems to vary quite a lot from test to test).  The working theory is
that the CPU's pipeline is getting upset about the
call->call->locked-op->return->return, and seems to be failing to
speculate (though I haven't seen anything definitive about the precise
reasons).  This doesn't entirely make sense, because the performance
hit is also visible on unlock and other operations which don't involve
locked instructions.  But spinlock operations clearly swamp all the
other pvops operations, even though I can't imagine that they're
nearly as common (there's only a .05% increase in instructions
executed).

If I disable just the pv-spinlock calls, my tests show that pvops is
identical to non-pvops performance on native (my measurements show that
it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

Summary of results, averaging 10 runs of the "mmperf" test, using a
no-pvops build as baseline:

		nopv		Pv-nospin	Pv-spin
CPU cycles	100.00%		99.89%		102.18%
instructions	100.00%		100.10%		100.15%
CPI		100.00%		99.79%		102.03%
cache ref	100.00%		100.84%		100.28%
cache miss	100.00%		90.47%		88.56%
cache miss rate	100.00%		89.72%		88.31%
branches	100.00%		99.93%		100.04%
branch miss	100.00%		103.66%		107.72%
branch miss rt	100.00%		103.73%		107.67%
wallclock	100.00%		99.90%		102.20%

The clear effect here is that the 2% increase in CPI is
directly reflected in the final wallclock time.
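
The CPI row follows from the two rows above it, since cycles per instruction is simply cycles divided by instructions. A small sketch (relative_cpi() is a name made up for this example) reproduces the figures from the table:

```c
/* Relative CPI = relative cycles / relative instructions, with all
 * values expressed as a percentage of the no-pvops baseline. */
static double relative_cpi(double cycles_pct, double instructions_pct)
{
	return 100.0 * cycles_pct / instructions_pct;
}
```

For the Pv-spin column, 102.18 / 100.15 gives roughly 102.03%; for Pv-nospin, 99.89 / 100.10 gives roughly 99.79%, both matching the CPI row.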

(The other interesting effect is that the more ops are
out of line calls via pvops, the lower the cache access
and miss rates.  Not too surprising, but it suggests that
the non-pvops kernel is over-inlined.  On the flipside,
the branch misses go up correspondingly...)

So, what's the fix?

Paravirt patching turns all the pvops calls into direct calls, so
_spin_lock etc do end up having direct calls.  For example, the compiler
generated code for paravirtualized _spin_lock is:

<_spin_lock+0>:		mov    %gs:0xb4c8,%rax
<_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
<_spin_lock+15>:	callq  *0xffffffff805a5b30
<_spin_lock+22>:	retq

The indirect call will get patched to:
<_spin_lock+0>:		mov    %gs:0xb4c8,%rax
<_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
<_spin_lock+15>:	callq <__ticket_spin_lock>
<_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
<_spin_lock+22>:	retq

One possibility is to inline _spin_lock, etc, when building an
optimised kernel (ie, when there's no spinlock/preempt
instrumentation/debugging enabled).  That will remove the outer
call/return pair, returning the instruction stream to a single
call/return, which will presumably execute the same as the non-pvops
case.  The downsides are: 1) it will replicate the
preempt_disable/enable code at each lock/unlock callsite; this code is
fairly small, but not nothing; and 2) the spinlock definitions are
already a very heavily tangled mass of #ifdefs and other preprocessor
magic, and making any changes will be non-trivial.

The other obvious answer is to disable pv-spinlocks.  Making them a
separate config option is fairly easy, and it would be trivial to
enable them only when Xen is enabled (as the only non-default user).
But it doesn't really address the common case of a distro build which
is going to have Xen support enabled, and leaves the open question of
whether the native performance cost of pv-spinlocks is worth the
performance improvement on a loaded Xen system (10% saving of overall
system CPU when guests block rather than spin).  Still it is a
reasonable short-term workaround.

[ Impact: fix pvops performance regression when running native ]

Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
Analysed-by: "Li Xin" <xin.li@intel.com>
Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15 20:07:42 +02:00
Linus Torvalds
c244450dac Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: DaVinci EVM board support buildfixes
  ASoC: DaVinci I2S updates
  ASoC: davinci-pcm buildfixes
  ALSA: pcsp: fix printk format warning
  ALSA: riptide: postfix increment and off by one
  pxa2xx-ac97: fix reset gpio mode setting
  ASoC: soc-core: fix crash when removing not instantiated card
2009-05-15 08:06:56 -07:00
Linus Torvalds
ade385e4d1 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb: gdb documentation fix
  kgdb,i386: use address that SP register points to in the exception frame
  sysrq, intel_fb: fix sysrq g collision
2009-05-15 08:06:45 -07:00
Linus Torvalds
c653849981 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Revert "mm: add /proc controls for pdflush threads"
  viocd: needs to depend on BLOCK
  block: fix the bio_vec array index out-of-bounds test
2009-05-15 08:05:37 -07:00
Linus Torvalds
662f11cf2a Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix PCI ROM access
  powerpc/pseries: Really fix the oprofile CPU type on pseries
  serial/nwpserial: Fix wrong register read address and add interrupt acknowledge.
  powerpc/cell: Make ptcal more reliable
  powerpc: Allow mem=x cmdline to work with 4G+
  powerpc/mpic: Fix incorrect allocation of interrupt rev-map
  powerpc: Fix oprofile sampling of marked events on POWER7
  powerpc/iseries: Fix pci breakage due to bad dma_data initialization
  powerpc: Fix mktree build error on Mac OS X host
  powerpc/virtex: Fix duplicate level irq events.
  powerpc/virtex: Add uImage to the default images list
  powerpc/boot: add simpleImage.* to clean-files list
  powerpc/8xx: Update defconfigs
  powerpc/embedded6xx: Update defconfigs
  powerpc/86xx: Update defconfigs
  powerpc/85xx: Update defconfigs
  powerpc/83xx: Update defconfigs
  powerpc/fsl_soc: Remove mpc83xx_wdt_init, again
2009-05-15 08:05:02 -07:00
Jaswinder Singh Rajput
52650257ea x86, mtrr: replace MTRRdefType_MSR with msr-index's MSR_MTRRdefType
Use standard msr-index.h's MSR declaration and no need to declare again.

[ Impact: cleanup, no object code change ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-15 07:49:01 -07:00
Jaswinder Singh Rajput
ba5673ff1f x86, mtrr: replace MTRRfix4K_C0000_MSR with msr-index's MSR_MTRRfix4K_C0000
Use standard msr-index.h's MSR declaration and no need to declare again.

[ Impact: cleanup, no object code change ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-15 07:49:01 -07:00
Jaswinder Singh Rajput
654ac05801 x86, mtrr: remove mtrr MSRs double declaration
Removed MTRR MSR from mtrr/mtrr.h as these are already declared in
msr-index.h and nobody is using them:
 MTRRfix16K_A0000_MSR
 MTRRfix4K_C8000_MSR
 MTRRfix4K_D0000_MSR
 MTRRfix4K_D8000_MSR
 MTRRfix4K_E0000_MSR
 MTRRfix4K_E8000_MSR
 MTRRfix4K_F0000_MSR
 MTRRfix4K_F8000_MSR

Use standard msr-index.h's MSR declaration and no need to declare again

[ Impact: cleanup, no object code change ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-15 07:49:01 -07:00
Jaswinder Singh Rajput
7d9d55e449 x86, mtrr: replace MTRRfix16K_80000_MSR with msr-index's MSR_MTRRfix16K_80000
Use standard msr-index.h's MSR declaration and no need to declare again

[ Impact: cleanup, no object code change ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-15 07:49:00 -07:00
Jaswinder Singh Rajput
a036c7a358 x86, mtrr: replace MTRRfix64K_00000_MSR with msr-index's MSR_MTRRfix64K_00000
Use standard msr-index.h's MSR declaration and no need to declare again.

[ Impact: cleanup, no object code change ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-15 07:49:00 -07:00
Jaswinder Singh Rajput
d9bcc01d58 x86, mtrr: replace MTRRcap_MSR with msr-index's MSR_MTRRcap
Use standard msr-index.h's MSR declaration and no need to declare again.

[ Impact: cleanup, no object code change ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-15 07:49:00 -07:00