Commit graph

3984 commits

Author SHA1 Message Date
Andreas Herrmann
d04ec773d7 x86: pda_init(): fix memory leak when using CPU hotplug
pda->irqstackptr is allocated whenever a CPU is set online.
But it is never freed. This results in a memory leak of 16K
for each CPU offline/online cycle.

Fix is to allocate pda->irqstackptr only once.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:48:02 +02:00
Eduardo Habkost
e4a6be4d28 x86, xen: Use native_pte_flags instead of native_pte_val for .pte_flags
Using native_pte_val triggers the BUG_ON() in the paravirt_ops
version of pte_flags().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:13:58 +02:00
Jan Beulich
17b746278d x86: pgd_{c,d}tor() cleanup
Giving pgd_ctor() a properly typed parameter allows eliminating a local
variable. Adjust pgd_dtor() to match.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "Jeremy Fitzhardinge" <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 19:47:09 +02:00
Christoph Hellwig
0722bba8f1 x86: kill sys32_pause
It's an unused duplicate of the generic sys_pause.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 18:44:47 +02:00
Yinghai Lu
dd786dd12c x86: move mtrr cpu cap setting early in early_init_xxxx
Krzysztof Helt found MTRR is not detected on k6-2

root cause:
	we moved mtrr_bp_init() early for mtrr trimming,
and in early_detect we only read the CPU capability from cpuid,
so some cpu doesn't have that bit in cpuid.

So we need to add early_init_xxxx to preset those bit before mtrr_bp_init
for those earlier cpus.

this patch is for v2.6.27

Reported-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 17:50:55 +02:00
Krzysztof Helt
12cf105cd6 x86: delay early cpu initialization until cpuid is done
Move early cpu initialization after cpu early get cap so the
early cpu initialization can fix up cpu caps.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 17:50:38 +02:00
Thomas Gleixner
72d43d9bc9 x86: HPET: read back compare register before reading counter
After fixing the u32 thinko I sill had occasional hickups on ATI chipsets
with small deltas. There seems to be a delay between writing the compare
register and the transffer to the internal register which triggers the
interrupt. Reading back the value makes sure, that it hit the internal
match register befor we compare against the counter value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-06 07:21:17 +02:00
Thomas Gleixner
f7676254f1 x86: HPET fix moronic 32/64bit thinko
We use the HPET only in 32bit mode because:
1) some HPETs are 32bit only
2) on i386 there is no way to read/write the HPET atomic 64bit wide

The HPET code unification done by the "moron of the year" did
not take into account that unsigned long is different on 32 and
64 bit.

This thinko results in a possible endless loop in the clockevents
code, when the return comparison fails due to the 64bit/332bit
unawareness. 

unsigned long cnt = (u32) hpet_read() + delta can wrap over 32bit.
but the final compare will fail and return -ETIME causing endless
loops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-06 07:21:17 +02:00
H. Peter Anvin
f31d731e44 x86: use X86_FEATURE_NOPL in alternatives
Use X86_FEATURE_NOPL to determine if it is safe to use P6 NOPs in
alternatives.  Also, replace table and loop with simple if statement.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:14:01 -07:00
H. Peter Anvin
b6734c35af x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:13:52 -07:00
H. Peter Anvin
b74b06c5f6 x86: boot: stub out unimplemented CPU feature words
The CPU feature detection code in the boot code is somewhat minimal,
and doesn't include all possible CPUID words.  In particular, it
doesn't contain the code for CPU feature words 2 (Transmeta),
3 (Linux-specific), 5 (VIA), or 7 (scattered).  Zero them out, so we
can still set those bits as known at compile time; in particular, this
allows creating a Linux-specific NOPL flag and have it required (and
therefore resolvable at compile time) in 64-bit mode.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:13:44 -07:00
Linus Torvalds
1c402c8cd1 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: add io delay quirk for Presario F700
2008-09-05 14:36:21 -07:00
David Woodhouse
e51af66308 x86: blacklist DMAR on Intel G31/G33 chipsets
Some BIOSes (the Intel DG33BU, for example) wrongly claim to have DMAR
when they don't. Avoid the resulting crashes when it doesn't work as
expected.

I'd still be grateful if someone could test it on a DG33BU with the old
BIOS though, since I've killed mine. I tested the DMI version, but not
this one.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 20:20:25 +02:00
Joerg Roedel
cf169702ba x86, gart: add detection of AMD family 0x11 northbridges
This patch adds the detection of the northbridges in the AMD family 0x11
processors. It also fixes the magic numbers there while changing this code.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 19:11:44 +02:00
H. Peter Anvin
5b7e41ff37 x86: additional defconfig updates
Additional updates to the x86 defconfigs.  The goals are, as before:

- Make them usable to testers, more so than distributors or end users,
  both of which are likely to have their own config already.
- Keep 32 and 64 bits as similar as is practical.

Changes:

- Use a more generic CPU type (ppro and generic, respectively).
- Bump number of CPUs to 64 (few if any NR_CPUS arrays left).
- Enable PAT.
- Enable OPTIMIZE_INLINE.
- Enable microcode update support.
- Build SMT scheduler support (in addition to MC).

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 18:57:14 +02:00
Ingo Molnar
616ad8c442 Merge branch 'linus' into x86/defconfig 2008-09-05 18:56:57 +02:00
Ingo Molnar
28c3cfd5fb Merge branch 'linus' into x86/tracehook 2008-09-05 17:53:05 +02:00
Alex Nixon
efd327a2d4 x86/paravirt: Remove duplicate paravirt_pagetable_setup_{start, done}()
They were already called once in arch/x86/kernel/setup.c - we don't need to call them again.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:46:44 +02:00
Jeremy Fitzhardinge
110e0358e7 x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious
The CPA test code uses _PAGE_UNUSED1, so make sure its obvious.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:09:57 +02:00
FUJITA Tomonori
551b4545bf x86: gart alloc_coherent doesn't need to check NULL device argument
asm/dma-mapping.h guarantees that gart alloc_coherent doesn't get NULL
device argument.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 12:48:13 +02:00
Thomas Gleixner
7cfb043533 HPET: make minimum reprogramming delta useful
The minimum reprogramming delta was hardcoded in HPET ticks,
which is stupid as it does not work with faster running HPETs.
The C1E idle patches made this prominent on AMD/RS690 chipsets,
where the HPET runs with 25MHz. Set it to 5us which seems to be
a reasonable value and fixes the problems on the bug reporters
machines. We have a further sanity check now in the clock events,
which increases the delta when it is not sufficient.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Tested-by: Dmitry Nezhevenko <dion@inhex.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 11:11:54 +02:00
Ingo Molnar
4156e9a8ef x86: quick TSC calibration, improve
- make sure the final TSC timestamp is reliable too

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 23:21:57 +02:00
Linus Torvalds
6ac40ed041 x86: quick TSC calibration
Introduce a fast TSC-calibration method on sane hardware.

It only uses 17920 PIT timer ticks to calibrate the TSC, plus 256 ticks on
each side to make sure the TSC values were very close to the tick, so the
whole calibration takes 15ms. Yet, despite only takign 15ms,
we can actually give pretty stringent guarantees of accuracy:

 - the code requires that we hit each 256-counter block at least 50 times,
   so the TSC error is basically at *MOST* just a few PIT cycles off in
   any direction. In practice, it's going to be about one microseconds
   off (which is how long it takes to read the counter)

 - so over 17920 PIT cycles, we can pretty much guarantee that the
   calibration error is less than one half of a percent.

My testing bears this out: on my machine, the quick-calibration reports
2934.085kHz, while the slow one reports 2933.415.

Yes, the slower calibration is still more precise. For me, the slow
calibration is stable to within about one hundreth of a percent, so it's
(at a guess) roughly an order-and-a-half of magnitude more precise. The
longer you wait, the more precise you can be.

However, the nice thing about the fast TSC PIT synchronization is that
it's pretty much _guaranteed_ to give that 0.5% precision, and fail
gracefully (and very quickly) if it doesn't get it. And it really is
fairly simple (even if there's a lot of _details_ there, and I didn't get
all of those right ont he first try or even the second ;)

The patch says "110 insertions", but 63 of those new lines are actually
comments.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/tsc.c |  111 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 110 insertions(+), 1 deletions(-)
2008-09-04 22:54:50 +02:00
Andi Kleen
dc44e65943 x86: capitalize function call interrupts consistently
Impact: aestetic

Capitalize function call interrupts consistently.

All other descriptions in /proc/interrupts are capitalized except
for "function call interrupts". Capitalize it too for consistency.

While that's technically a published ABI I think the risk of anyone
relying on that text to stay the same is negligible.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-04 10:51:36 -07:00
Thomas Gleixner
a977c40095 x86: TSC make the calibration loop smarter
The last changes made the calibration loop 250ms long which is far
too much. Try to do that more clever.

Experiments have shown that using a 10ms delay for the PIT based calibration
gives us a good enough value. If we have a reference (HPET/PMTIMER) and the
result of the PIT and the reference is close enough, then we can break out of
the calibration loop on a match right away and use the reference value.

Otherwise we just loop 3 times and decide then, which value to take.

One caveat is that for virtualized environments the PIT calibration often does
not work at all and I found out that 10us is a bit too short as well for the
reference to give a sane result. The solution here is to make the last loop
longer when the first two PIT calibrations failed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 17:35:35 +02:00
Thomas Gleixner
827014be05 x86: TSC: use one set of reference variables
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 17:35:34 +02:00
Thomas Gleixner
d683ef7afe x86: TSC: separate hpet/pmtimer calculation out
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 17:35:33 +02:00
Thomas Gleixner
cce3e05724 x86: TSC: define the PIT latch value separate
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 17:35:33 +02:00
Alok N Kataria
de014d6176 x86: Change warning message in TSC calibration.
When calibration against PIT fails, the warning that we print is misleading.
In a virtualized environment the VM may get descheduled while calibration
or, the check in PIT calibration may fail due to other virtualization
overheads.

The warning message explicitly assumes that calibration failed due to SMI's
which may not be the case. Change that to something proper.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-03 20:10:37 -07:00
Chuck Ebbert
e6a5652fd1 x86: add io delay quirk for Presario F700
Manually adding "io_delay=0xed" fixes system lockups in ioapic
mode on this machine.

System Information
	Manufacturer: Hewlett-Packard
	Product Name: Presario F700 (KA695EA#ABF)

Base Board Information
	Manufacturer: Quanta
	Product Name: 30D3

Reference:
https://bugzilla.redhat.com/show_bug.cgi?id=459546

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-03 16:42:51 -07:00
Linus Torvalds
ec0c15afb4 Split up PIT part of TSC calibration from native_calibrate_tsc
The TSC calibration function is still very complicated, but this makes
it at least a little bit less so by moving the PIT part out into a
helper function of its own.

Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-03 07:30:13 -07:00
Thomas Gleixner
fbb16e2438 [x86] Fix TSC calibration issues
Larry Finger reported at http://lkml.org/lkml/2008/9/1/90:
An ancient laptop of mine started throwing errors from b43legacy when
I started using 2.6.27 on it. This has been bisected to commit bfc0f59
"x86: merge tsc calibration".

The unification of the TSC code adopted mostly the 64bit code, which
prefers PMTIMER/HPET over the PIT calibration.

Larrys system has an AMD K6 CPU. Such systems are known to have
PMTIMER incarnations which run at double speed. This results in a
miscalibration of the TSC by factor 0.5. So the resulting calibrated
CPU/TSC speed is half of the real CPU speed, which means that the TSC
based delay loop will run half the time it should run. That might
explain why the b43legacy driver went berserk.

On the other hand we know about systems, where the PIT based
calibration results in random crap due to heavy SMI/SMM
disturbance. On those systems the PMTIMER/HPET based calibration logic
with SMI detection shows better results.

According to Alok also virtualized systems suffer from the PIT
calibration method.

The solution is to use a more wreckage aware aproach than the current
either/or decision.

1) reimplement the retry loop which was dropped from the 32bit code
during the merge. It repeats the calibration and selects the lowest
frequency value as this is probably the closest estimate to the real
frequency

2) Monitor the delta of the TSC values in the delay loop which waits
for the PIT counter to reach zero. If the maximum value is
significantly different from the minimum, then we have a pretty safe
indicator that the loop was disturbed by an SMI.

3) keep the pmtimer/hpet reference as a backup solution for systems
where the SMI disturbance is a permanent point of failure for PIT
based calibration

4) do the loop iteration for both methods, record the lowest value and
decide after all iterations finished.

5) Set a clear preference to PIT based calibration when the result
makes sense.

The implementation does the reference calibration based on
HPET/PMTIMER around the delay, which is necessary for the PIT anyway,
but keeps separate TSC values to ensure the "independency" of the
resulting calibration values.

Tested on various 32bit/64bit machines including Geode 266Mhz, AMD K6
(affected machine with a double speed pmtimer which I grabbed out of
the dump), Pentium class machines and AMD/Intel 64 bit boxen.

Bisected-by:  Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-02 20:35:56 -07:00
Linus Torvalds
011fec7486 Un-break printk strings in x86 PCI probing code
Breaking lines due to some imaginary problem with a long line length is
often stupid and wrong, but never more so when it splits a string that
is printed out into multiple lines.  This really ended up making it much
harder to find where some error strings were printed out, because a
simple 'grep' didn't work.

I'm sure there is tons more of this particular idiocy hiding in other
places, but this particular case hit me once more last week. So fix it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-02 10:38:28 -07:00
Linus Torvalds
b460947211 Revert "x86: fix HPET regression in 2.6.26 versus 2.6.25, check hpet against BAR, v3"
This reverts commit a2bd7274b4.

It wasn't really right to begin with (there's a better fix for the
problem with e820 reservations clashing with PCI BAR's pending), but it
also actually causes more regressions, so it should be reverted even
before the better fix is finalized.

Rafael reports that this commit broke AHCI detection, and thus causes
the kernel to not boot on his quad core test box.

Reported-and-bisected-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: David Witbrodt <dawitbro@sbcglobal.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-29 14:46:05 -07:00
Austin Zhang
8cb51ba8e0 crypto: crc32c - Use Intel CRC32 instruction
From NHM processor onward, Intel processors can support hardware accelerated
CRC32c algorithm with the new CRC32 instruction in SSE 4.2 instruction set.
The patch detects the availability of the feature, and chooses the most proper
way to calculate CRC32c checksum.
Byte code instructions are used for compiler compatibility.
No MMX / XMM registers is involved in the implementation.

Signed-off-by: Austin Zhang <austin.zhang@intel.com>
Signed-off-by: Kent Liu <kent.liu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-29 15:49:50 +10:00
Linus Torvalds
e52c8857e0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: update defconfigs
  x86: msr: fix bogus return values from rdmsr_safe/wrmsr_safe
  x86: cpuid: correct return value on partial operations
  x86: msr: correct return value on partial operations
  x86: cpuid: propagate error from smp_call_function_single()
  x86: msr: propagate errors from smp_call_function_single()
  smp: have smp_call_function_single() detect invalid CPUs
2008-08-28 12:30:59 -07:00
Joe Korty
2c7e9fd4c6 x86: make poll_idle behave more like the other idle methods
Make poll_idle() behave more like the other idle methods.

Currently, poll_idle() returns immediately.  The other
idle methods all wait indefinately for some condition
to come true before returning.  poll_idle should emulate
these other methods and also wait for a return condition,
in this case, for need_resched() to become 'true'.

Without this delay the idle loop spends all of its time
in the outer loop that calls poll_idle.  This outer loop,
these days, does real work, some of it under rcu locks.
That work should only be done when idle is entered and
when idle exits, not continuously while idle is spinning.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-28 11:29:48 +02:00
Hiroshi Shimamoto
a817260874 x86: acpi: move acpi_mcfg_64bit_base_addr into CONFIG_PCI_MMCONFIG
acpi_mcfg_64bit_base_addr is used when CONFIG_PCI_MMCONFIG is enabled.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-27 08:25:06 +02:00
H. Peter Anvin
c1b362e3b4 x86: update defconfigs
Enable some option commonly used by testers in defconfig, including
some very common device drivers and network boot support.  defconfig
is still not meant to be a kitchen-sink configuration.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-27 08:14:17 +02:00
H. Peter Anvin
bdd314616f x86: msr-on-cpu: remove unnecessary level of abstraction
Remove an unnecessary level of abstraction in the msr-on-cpu library.
Although this duplicates some code, the duplicated code is less than
the additional code, and this way should be faster.

Additionally, change the order of the functions to make the regular
structure of this file more obvious.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 22:45:50 -07:00
H. Peter Anvin
94d4ac2f4a Merge branch 'x86/urgent' into x86/cleanups 2008-08-25 22:45:37 -07:00
H. Peter Anvin
9ea2b82ed6 x86: cpuid: correct return value on partial operations
Return the correct return value when the CPUID driver partially
completes a request (we should return the number of bytes actually
read or written, instead of the error code.)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 17:46:12 -07:00
H. Peter Anvin
85f1cb6015 x86: msr: correct return value on partial operations
Return the correct return value when the MSR driver partially
completes a request (we should return the number of bytes actually
read or written, instead of the error code.)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 17:46:12 -07:00
H. Peter Anvin
4b46ca701b x86: cpuid: propagate error from smp_call_function_single()
Propagate error (-ENXIO) from smp_call_function_single() in the CPUID
driver.  This can happen when a CPU is unplugged while the CPUID
driver is open.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 17:45:48 -07:00
H. Peter Anvin
c6f31932d0 x86: msr: propagate errors from smp_call_function_single()
Propagate error (-ENXIO) from smp_call_function_single().  These
errors can happen when a CPU is unplugged while the MSR driver is
open.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 17:45:48 -07:00
Linus Torvalds
d25e26b61d [x86] Clean up MAXSMP Kconfig, and limit NR_CPUS to 512
This fixes a regression that was indirectly caused by commit
1184dc2ffe ("x86: modify Kconfig to allow
up to 4096 cpus").

Allowing 4k CPU's is not practical at this time, because we still have a
number of places that have several 'cpumask_t's on the stack, and a
4k-bit cpumask is 512 bytes of stack-space for each such variable.  This
literally caused functions like 'smp_call_function_mask' to have a 2.5kB
stack frame, and several functions to have 2kB stackframes.

With an 8kB stack total, smashing the stack was simply much too likely.
At least bugzilla entry

	http://bugzilla.kernel.org/show_bug.cgi?id=11342

was due to this.

The earlier commit to not inline load_module() into sys_init_module()
fixed the particular symptoms of this that Alan Brunelle saw in that
bugzilla entry, but the huge stack waste by cpumask_t's was the more
direct cause.

Some day we'll have allocation helpers that allocate large CPU masks
dynamically, but in the meantime we simply cannot allow cpumasks this
large.

Cc: Alan D. Brunelle <Alan.Brunelle@hp.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-25 14:15:38 -07:00
Linus Torvalds
ec73adba51 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: add X86_FEATURE_XMM4_2 definitions
  x86: fix cpufreq + sched_clock() regression
  x86: fix HPET regression in 2.6.26 versus 2.6.25, check hpet against BAR, v3
  x86: do not enable TSC notifier if we don't need it
  x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
  x86: fix: make PCI ECS for AMD CPUs hotplug capable
  x86: fix: do not run code in amd_bus.c on non-AMD CPUs
2008-08-25 11:26:33 -07:00
Avi Kivity
cd5998ebfb KVM: MMU: Fix torn shadow pte
The shadow code assigns a pte directly in one place, which is nonatomic on
i386 can can cause random memory references.  Fix by using an atomic setter.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-08-25 17:24:27 +03:00
Peter Zijlstra
52a8968ce9 x86: fix cpufreq + sched_clock() regression
I noticed that my sched_clock() was slow on a number of machine, so I
started looking at cpufreq.

The below seems to fix the problem for me.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 14:39:19 +02:00
Ingo Molnar
f58899bb02 Merge branch 'linus' into x86/urgent 2008-08-25 14:39:12 +02:00