Commit graph

26255 commits

Author SHA1 Message Date
Ingo Molnar
5df4551551 x86, tsc calibration: fix
my brown paperbag day ...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 23:55:40 +02:00
Andreas Herrmann
23952a96ae x86: cpu_init(): fix memory leak when using CPU hotplug
Exception stacks are allocated each time a CPU is set online.
But the allocated space is never freed. Thus with one CPU hotplug
offline/online cycle there is a memory leak of 24K (6 pages) for
a CPU.

Fix is to allocate exception stacks only once -- when the CPU is
set online for the first time.
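
The allocate-once idea described above can be sketched in user-space C. This is only an illustration under stated assumptions: `SKETCH_NR_CPUS`, `estack_cache`, and `cpu_get_exception_stack()` are hypothetical names, and `malloc()` stands in for the kernel page allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4    /* hypothetical CPU count for this sketch */

/* One cached stack pointer per CPU; NULL until the CPU first comes online. */
static void *estack_cache[SKETCH_NR_CPUS];

/*
 * Called on every "CPU online" event. The stack is allocated only the
 * first time; later offline/online cycles reuse the cached pointer, so
 * repeated hotplug cycles do not leak memory.
 */
void *cpu_get_exception_stack(int cpu, size_t size)
{
    if (estack_cache[cpu] == NULL)
        estack_cache[cpu] = malloc(size);   /* stand-in for the kernel allocator */
    return estack_cache[cpu];
}
```

Calling `cpu_get_exception_stack(1, 6 * 4096)` twice returns the same pointer, which models one offline/online cycle without a second allocation.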

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:48:16 +02:00
Andreas Herrmann
d04ec773d7 x86: pda_init(): fix memory leak when using CPU hotplug
pda->irqstackptr is allocated whenever a CPU is set online.
But it is never freed. This results in a memory leak of 16K
for each CPU offline/online cycle.

Fix is to allocate pda->irqstackptr only once.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:48:02 +02:00
Eduardo Habkost
e4a6be4d28 x86, xen: Use native_pte_flags instead of native_pte_val for .pte_flags
Using native_pte_val triggers the BUG_ON() in the paravirt_ops
version of pte_flags().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:13:58 +02:00
Jan Beulich
17b746278d x86: pgd_{c,d}tor() cleanup
Giving pgd_ctor() a properly typed parameter allows eliminating a local
variable. Adjust pgd_dtor() to match.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "Jeremy Fitzhardinge" <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 19:47:09 +02:00
Christoph Hellwig
0722bba8f1 x86: kill sys32_pause
It's an unused duplicate of the generic sys_pause.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 18:44:47 +02:00
Yinghai Lu
dd786dd12c x86: move mtrr cpu cap setting early in early_init_xxxx
Krzysztof Helt found that MTRR is not detected on the K6-2.

Root cause: we moved mtrr_bp_init() earlier for MTRR trimming, and in
early_detect we only read the CPU capabilities from cpuid, but some
CPUs do not report that bit in cpuid.

So we need to add early_init_xxxx to preset those bits before
mtrr_bp_init() for those older CPUs.

This patch is for v2.6.27.

Reported-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 17:50:55 +02:00
Krzysztof Helt
12cf105cd6 x86: delay early cpu initialization until cpuid is done
Move early CPU initialization after the early CPU capability detection so
that the early CPU initialization can fix up the CPU caps.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 17:50:38 +02:00
Lennert Buytenhek
1ad77a876d [ARM] 5241/1: provide ioremap_wc()
This patch provides an ARM implementation of ioremap_wc().

We use different page table attributes depending on which CPU we
are running on:

- Non-XScale ARMv5 and earlier systems: The ARMv5 ARM documents four
  possible mapping types (CB=00/01/10/11).  We can't use any of the
  cached memory types (CB=10/11), since that breaks coherency with
  peripheral devices.  Both CB=00 and CB=01 are suitable for _wc, and
  CB=01 (Uncached/Buffered) allows the hardware more freedom than
  CB=00, so we'll use that.

  (The ARMv5 ARM seems to suggest that CB=01 is allowed to delay stores
  but isn't allowed to merge them, but there is no other mapping type
  we can use that allows the hardware to delay and merge stores, so
  we'll go with CB=01.)

- XScale v1/v2 (ARMv5): same as the ARMv5 case above, with the slight
  difference that on these platforms, CB=01 actually _does_ allow
  merging stores.  (If you want noncoalescing bufferable behavior
  on Xscale v1/v2, you need to use XCB=101.)

- Xscale v3 (ARMv5) and ARMv6+: on these systems, we use TEXCB=00100
  mappings (Inner/Outer Uncacheable in xsc3 parlance, Uncached Normal
  in ARMv6 parlance).

  The ARMv6 ARM explicitly says that any accesses to Normal memory can
  be merged, which makes Normal memory more suitable for _wc mappings
  than Device or Strongly Ordered memory, as the latter two mapping
  types are guaranteed to maintain transaction number, size and order.
  We use the Uncached variety of Normal mappings for the same reason
  that we can't use C=1 mappings on ARMv5.

  The xsc3 Architecture Specification documents TEXCB=00100 as being
  Uncacheable and allowing coalescing of writes, which is also just
  what we need.
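
The three cases above amount to a small selection function. The following is a sketch only: the enum and function names are hypothetical, and the real implementation selects page table attribute bits rather than returning an enum.

```c
/* Hypothetical CPU classes, following the cases listed in the message. */
enum cpu_class {
    CPU_ARMV5_OR_EARLIER,   /* non-XScale ARMv5 and earlier */
    CPU_XSCALE_V1_V2,       /* XScale v1/v2 (ARMv5) */
    CPU_XSC3,               /* XScale v3 (ARMv5) */
    CPU_ARMV6_PLUS          /* ARMv6 and later */
};

/* The two mapping types the message settles on for _wc mappings. */
enum wc_mapping {
    WC_CB_01,               /* CB=01, Uncached/Buffered */
    WC_TEXCB_00100          /* TEXCB=00100, uncached and mergeable */
};

enum wc_mapping wc_mapping_for(enum cpu_class cpu)
{
    switch (cpu) {
    case CPU_XSC3:
    case CPU_ARMV6_PLUS:
        return WC_TEXCB_00100;
    case CPU_ARMV5_OR_EARLIER:
    case CPU_XSCALE_V1_V2:
    default:
        return WC_CB_01;
    }
}
```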

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 13:13:44 +01:00
Russell King
fbd3bdb213 [ARM] Convert asm/bitops.h to linux/bitops.h
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 12:13:59 +01:00
Russell King
8029db12ae [ARM] Convert asm/delay.h to linux/delay.h
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 12:11:37 +01:00
Russell King
fced80c735 [ARM] Convert asm/io.h to linux/io.h
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 12:10:45 +01:00
Russell King
33fa9b1328 [ARM] Convert asm/uaccess.h to linux/uaccess.h
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 11:35:55 +01:00
Russell King
5ed5fdf50c [ARM] clean up a load of old declarations
... some of which are now in linux/*.h headers.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 11:23:30 +01:00
Russell King
012d1f4af1 [ARM] move initrd code from kernel/setup.c to mm/init.c
This quietens some sparse warnings about phys_initrd_start and
phys_initrd_size.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 10:57:03 +01:00
Russell King
446616dbb4 [ARM] sparse: quieten arch/arm/kernel/irq.c
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 10:56:27 +01:00
Russell King
1de765c1e9 [ARM] remove pc_pointer()
pc_pointer() was a function to mask the PC for 26-bit ARMs, which
we no longer support.  Remove it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-06 10:35:51 +01:00
Thomas Gleixner
72d43d9bc9 x86: HPET: read back compare register before reading counter
After fixing the u32 thinko I still had occasional hiccups on ATI chipsets
with small deltas. There seems to be a delay between writing the compare
register and the transfer to the internal register which triggers the
interrupt. Reading back the value makes sure that it hit the internal
match register before we compare against the counter value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-06 07:21:17 +02:00
Thomas Gleixner
f7676254f1 x86: HPET fix moronic 32/64bit thinko
We use the HPET only in 32bit mode because:
1) some HPETs are 32bit only
2) on i386 there is no way to read/write the HPET atomically at 64-bit width

The HPET code unification done by the "moron of the year" did
not take into account that unsigned long is different on 32 and
64 bit.

This thinko results in a possible endless loop in the clockevents
code, when the return comparison fails due to the 64bit/32bit
unawareness.

unsigned long cnt = (u32) hpet_read() + delta can wrap over 32bit,
but the final compare will fail and return -ETIME, causing endless
loops.
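
The width mismatch can be reproduced in portable C. This is a minimal sketch, not the kernel code: `uint64_t` stands in for a 64-bit `unsigned long`, `uint32_t` for the 32-bit HPET comparator, and all function names are hypothetical. It only shows that the two sums diverge near the 32-bit wrap, and what a wrap-safe 32-bit comparison looks like. How the mismatch plays out in the clockevents code depends on kernel code not shown here.

```c
#include <stdint.h>

/* Models a 64-bit 'unsigned long': the sum keeps bit 32 and above. */
uint64_t deadline_wide(uint32_t counter, uint32_t delta)
{
    return (uint64_t)counter + delta;
}

/* Models the 32-bit comparator register: the sum wraps modulo 2^32. */
uint32_t deadline_u32(uint32_t counter, uint32_t delta)
{
    return counter + delta;
}

/*
 * Wrap-safe "has the deadline passed?" test on 32-bit values. The
 * difference is computed modulo 2^32 and then read as a signed value
 * (two's complement conversion, as on all common compilers).
 */
int deadline_passed_u32(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) >= 0;
}
```

With a counter of 0xFFFFFFF0 and a delta of 0x20, the wide sum is 0x100000010 while the 32-bit register would hold 0x10, so a comparison done in the wide type no longer matches what the hardware sees.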

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-06 07:21:17 +02:00
H. Peter Anvin
f31d731e44 x86: use X86_FEATURE_NOPL in alternatives
Use X86_FEATURE_NOPL to determine if it is safe to use P6 NOPs in
alternatives.  Also, replace table and loop with simple if statement.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:14:01 -07:00
H. Peter Anvin
b6734c35af x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:13:52 -07:00
H. Peter Anvin
b74b06c5f6 x86: boot: stub out unimplemented CPU feature words
The CPU feature detection code in the boot code is somewhat minimal,
and doesn't include all possible CPUID words.  In particular, it
doesn't contain the code for CPU feature words 2 (Transmeta),
3 (Linux-specific), 5 (VIA), or 7 (scattered).  Zero them out, so we
can still set those bits as known at compile time; in particular, this
allows creating a Linux-specific NOPL flag and have it required (and
therefore resolvable at compile time) in 64-bit mode.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:13:44 -07:00
Linus Torvalds
1c402c8cd1 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: add io delay quirk for Presario F700
2008-09-05 14:36:21 -07:00
Linus Torvalds
6f74b1849b Merge git://git.infradead.org/~dwmw2/dwmw2-2.6.27
* git://git.infradead.org/~dwmw2/dwmw2-2.6.27:
  Revert "[ARM] use the new byteorder headers"
  Fix conditional export of kvh.h and a.out.h to userspace.
  [MTD] [NAND] tmio_nand: fix base address programming
2008-09-05 14:31:54 -07:00
Linus Torvalds
b693ffe673 Merge branch 'sh/for-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  i2c: fix i2c-sh_mobile timing issues
  sh64: resume_kernel fix for kernel oops built with CONFIG_BKL_PREEMPT=y.
  sh: resume_kernel fix for kernel oops built with CONFIG_BKL_PREEMPT=y.
  sh: fix semtimedop syscall
  sh: update AP325RXA defconfig
  sh: update Migo-R defconfig
  sh: fix platform_resource_setup_memory() section mismatch
  sh: fix kexec entry point for crash kernels
  sh: crash kernel resource fix
  sh: fix ptrace_64.c:user_disable_single_step()
  sh64: re-add the __strnlen_user() prototype
2008-09-05 14:30:58 -07:00
Atsushi Nemoto
0011036bee [MIPS] Probe initrd header only if explicitly specified
Currently init_initrd() probes initrd header at the last page of kernel
image, but it is valid only if addinitrd was used.  If addinitrd was not
used, the area contains garbage so probing there might misdetect initrd
header (magic number is not strictly robust).

This patch introduces CONFIG_PROBE_INITRD_HEADER to explicitly enable this
probing.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-09-05 21:24:12 +01:00
Atsushi Nemoto
3885ec8ca2 [MIPS] TX39xx: Add missing local_flush_icache_range initialization
Commit 59e39ecd933ba49eb6efe84cbfa5597a6c9ef18a ("Fix WARNING: at
kernel/smp.c:290") introduced local_flush_icache_range but lacks
initialization for some TX39 cases.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-09-05 21:24:12 +01:00
Atsushi Nemoto
073828d078 [MIPS] TXx9: Fix txx9_pcode initialization
The txx9_pcode variable was introduced in commit
fe1c2bc64f65003b39f331a8e4b0d15b235a4afd ("TXx9: Add 64-bit support")
but was not initialized properly.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-09-05 21:24:12 +01:00
Thomas Bogendoerfer
e0cee3eea7 [MIPS] Fix WARNING: at kernel/smp.c:290
trap_init issues flush_icache_range(), which uses ipi functions to
get icache flushing done on all cpus. But this is done before interrupts
are enabled and caused WARN_ON messages. This changeset introduces
a new local_flush_icache_range() and uses it before interrupts (and
additional CPUs) are enabled to avoid this problem.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-09-05 21:24:11 +01:00
Thomas Bogendoerfer
0510617b85 [MIPS] Fix data bus error recovery
With -ffunction-sections the entries in __dbe_table are no longer
sorted, so the lookup of exception addresses in do_be() failed for
some addresses. To avoid this we now sort __dbe_table.
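
Why an unsorted table breaks lookups can be sketched as follows, assuming the lookup is a binary search (the message does not say so explicitly). `struct dbe_entry`, `dbe_sort()`, and `dbe_lookup()` are hypothetical names for this illustration; `qsort()` and `bsearch()` are the standard C library routines.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table entry: faulting instruction address and its fixup. */
struct dbe_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Order entries by instruction address. */
static int dbe_cmp(const void *a, const void *b)
{
    const struct dbe_entry *x = a;
    const struct dbe_entry *y = b;

    if (x->insn < y->insn)
        return -1;
    if (x->insn > y->insn)
        return 1;
    return 0;
}

/* Sort once at init time so that binary search is valid afterwards. */
void dbe_sort(struct dbe_entry *table, size_t n)
{
    qsort(table, n, sizeof(*table), dbe_cmp);
}

/* Binary search; only correct if the table has been sorted. */
const struct dbe_entry *dbe_lookup(const struct dbe_entry *table, size_t n,
                                   unsigned long addr)
{
    struct dbe_entry key = { addr, 0 };

    return bsearch(&key, table, n, sizeof(*table), dbe_cmp);
}
```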

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-09-05 21:24:11 +01:00
David Woodhouse
e51af66308 x86: blacklist DMAR on Intel G31/G33 chipsets
Some BIOSes (the Intel DG33BU, for example) wrongly claim to have DMAR
when they don't. Avoid the resulting crashes when it doesn't work as
expected.

I'd still be grateful if someone could test it on a DG33BU with the old
BIOS though, since I've killed mine. I tested the DMI version, but not
this one.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 20:20:25 +02:00
Joerg Roedel
cf169702ba x86, gart: add detection of AMD family 0x11 northbridges
This patch adds the detection of the northbridges in the AMD family 0x11
processors. It also fixes the magic numbers there while changing this code.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 19:11:44 +02:00
H. Peter Anvin
5b7e41ff37 x86: additional defconfig updates
Additional updates to the x86 defconfigs.  The goals are, as before:

- Make them usable to testers, more so than distributors or end users,
  both of which are likely to have their own config already.
- Keep 32 and 64 bits as similar as is practical.

Changes:

- Use a more generic CPU type (ppro and generic, respectively).
- Bump number of CPUs to 64 (few if any NR_CPUS arrays left).
- Enable PAT.
- Enable OPTIMIZE_INLINE.
- Enable microcode update support.
- Build SMT scheduler support (in addition to MC).

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 18:57:14 +02:00
Ingo Molnar
616ad8c442 Merge branch 'linus' into x86/defconfig
2008-09-05 18:56:57 +02:00
David Woodhouse
b35de672e7 Revert "[ARM] use the new byteorder headers"
This reverts commit ae82cbfc8b. It
needs the new byteorder headers to be exported to userspace, and
they aren't yet -- and probably shouldn't be, at this point in the
2.6.27 release cycle (or ever, for that matter).

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-05 17:01:50 +01:00
Ingo Molnar
28c3cfd5fb Merge branch 'linus' into x86/tracehook
2008-09-05 17:53:05 +02:00
Alex Nixon
efd327a2d4 x86/paravirt: Remove duplicate paravirt_pagetable_setup_{start, done}()
They were already called once in arch/x86/kernel/setup.c - we don't need to call them again.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:46:44 +02:00
Jeremy Fitzhardinge
110e0358e7 x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious
The CPA test code uses _PAGE_UNUSED1, so make sure it's obvious.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:09:57 +02:00
Russell King
09d9bae064 [ARM] sparse: fix several warnings
arch/arm/kernel/process.c:270:6: warning: symbol 'show_fpregs' was not declared. Should it be static?

This function isn't used, so can be removed.

arch/arm/kernel/setup.c:532:9: warning: symbol 'len' shadows an earlier one
arch/arm/kernel/setup.c:524:6: originally declared here

A function containing two 'len's.

arch/arm/mm/fault-armv.c:188:13: warning: symbol 'check_writebuffer_bugs' was not declared. Should it be static?
arch/arm/mm/mmap.c:122:5: warning: symbol 'valid_phys_addr_range' was not declared. Should it be static?
arch/arm/mm/mmap.c:137:5: warning: symbol 'valid_mmap_phys_addr_range' was not declared. Should it be static?

Missing includes.

arch/arm/kernel/traps.c:71:77: warning: Using plain integer as NULL pointer
arch/arm/mm/ioremap.c:355:46: error: incompatible types in comparison expression (different address spaces)

Sillies.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-05 14:11:24 +01:00
FUJITA Tomonori
551b4545bf x86: gart alloc_coherent doesn't need to check NULL device argument
asm/dma-mapping.h guarantees that gart alloc_coherent doesn't get NULL
device argument.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 12:48:13 +02:00
Thomas Gleixner
7cfb043533 HPET: make minimum reprogramming delta useful
The minimum reprogramming delta was hardcoded in HPET ticks,
which is stupid as it does not work with faster running HPETs.
The C1E idle patches made this prominent on AMD/RS690 chipsets,
where the HPET runs at 25MHz. Set it to 5us which seems to be
a reasonable value and fixes the problems on the bug reporters'
machines. We have a further sanity check now in the clock events,
which increases the delta when it is not sufficient.
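
Expressing the minimum as a time rather than a tick count means the tick value scales with the HPET frequency. A rough sketch of that conversion, with hypothetical names (the kernel derives the value through its clockevents conversion helpers, which are not shown here):

```c
#include <stdint.h>

#define MIN_DELTA_US 5u   /* the 5us minimum named in the message */

/* Convert the fixed time-based minimum into ticks for a given HPET frequency. */
uint64_t hpet_min_delta_ticks(uint64_t hpet_hz)
{
    return hpet_hz * MIN_DELTA_US / 1000000u;
}
```

At the 25MHz mentioned above this yields 125 ticks. A slower HPET, for example one at 14318180 Hz (an assumed frequency, not taken from the message), gets 71 ticks for the same 5us.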

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Tested-by: Dmitry Nezhevenko <dion@inhex.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 11:11:54 +02:00
Paul Mundt
dbce1f649e sh64: resume_kernel fix for kernel oops built with CONFIG_BKL_PREEMPT=y.
Follows the SH change.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-09-05 14:51:28 +09:00
Carmelo Amoroso
323b8c410a sh: resume_kernel fix for kernel oops built with CONFIG_BKL_PREEMPT=y.
This patch fixes a problem in the SH implementation of the resume_kernel code,
which implements the bulk of the preempt_schedule_irq function in assembly
without taking care of the extra code needed to handle the preemptible BKL.

The patch basically consists of removing this asm code and calling the common
C implementation (see kernel/sched.c) as other archs do.

Another change adds the missing 'cli' macro invocation at the beginning of
resume_kernel.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-09-05 14:42:16 +09:00
Lennert Buytenhek
ac840605f3 mv643xx_eth: remove force_phy_addr field
Currently, there are two different fields in the
mv643xx_eth_platform_data struct that together describe the PHY
address -- one field (phy_addr) has the address of the PHY, but if
that address is zero, a second field (force_phy_addr) needs to be
set to distinguish the actual address zero from a zero due to not
having filled in the PHY address explicitly (which should mean
'use the default PHY address').

If we are a bit smarter about the encoding of the phy_addr field,
we can avoid the need for a second field -- this patch does that.
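
One way to fold both fields into one is to reserve the value zero for "use the default" and tag explicit addresses with a marker bit. The message does not spell out the encoding, so the macros and the helper below are assumptions made for illustration only.

```c
/* Hypothetical encoding: 0 means "not filled in, use the default". */
#define SKETCH_PHY_ADDR_DEFAULT  0
/* Explicit addresses carry a marker bit, so address 0 is not confused with "unset". */
#define SKETCH_PHY_ADDR(x)       (0x80 | (x))

/* Decode the single field into the PHY address that should be used. */
int sketch_resolve_phy_addr(int encoded, int default_addr)
{
    if (encoded == SKETCH_PHY_ADDR_DEFAULT)
        return default_addr;
    return encoded & 0x1f;   /* PHY addresses are 5 bits wide */
}
```

With this scheme `SKETCH_PHY_ADDR(0)` resolves to address 0, while a zero-initialized field falls back to the default.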

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05 06:33:59 +02:00
Lennert Buytenhek
fc0eb9f226 mv643xx_eth: smi sharing is a per-unit property, not a per-port one
Which top-level unit's SMI interface to use should be a property of
the top-level unit, not of the individual ports.  This patch moves the
->shared_smi pointer from the per-port platform data to the global
platform data.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05 06:33:58 +02:00
Jeremy Kerr
b65fe0356b powerpc/spufs: Fix race for a free SPU
We currently have a race for a free SPE. With one thread doing a
spu_yield(), and another doing a spu_activate():

thread 1				thread 2
spu_yield(oldctx)			spu_activate(ctx)
  __spu_deactivate(oldctx)
  spu_unschedule(oldctx, spu)
  spu->alloc_state = SPU_FREE
					spu = spu_get_idle(ctx)
					    - searches for a SPE in
					      state SPU_FREE, gets
					      the context just
					      freed by thread 1
					spu_schedule(ctx, spu)
					  spu->alloc_state = SPU_USED
spu_schedule(newctx, spu)
  - assumes spu is still free
  - tries to schedule context on
    already-used spu

This change introduces a 'free_spu' flag to spu_unschedule, to indicate
whether or not the function should free the spu after descheduling the
context. We only set this flag if we're not going to re-schedule
another context on this SPU.

Add a comment to document this behaviour.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
2008-09-05 10:52:03 +10:00
Jeremy Kerr
9f43e3914d powerpc/spufs: Fix multiple get_spu_context()
Commit 8d5636fbca introduced a reference
count on SPU contexts during find_victim, but this may cause a leak in
the reference count if we later find a better contender for a context to
unschedule.

Change the reference to after we've found our victim context, so we
don't do the extra get_spu_context().

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
2008-09-05 10:51:00 +10:00
Ingo Molnar
4156e9a8ef x86: quick TSC calibration, improve
- make sure the final TSC timestamp is reliable too

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 23:21:57 +02:00
Russell King
65846909d6 [ARM] omap: fix virtual vs physical address space confusions
mcbsp is confused as to what takes a physical or virtual address.
Fix the two instances where it gets it wrong.

Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-04 22:21:19 +01:00
Linus Torvalds
6ac40ed041 x86: quick TSC calibration
Introduce a fast TSC-calibration method on sane hardware.

It only uses 17920 PIT timer ticks to calibrate the TSC, plus 256 ticks on
each side to make sure the TSC values were very close to the tick, so the
whole calibration takes 15ms. Yet, despite only taking 15ms,
we can actually give pretty stringent guarantees of accuracy:

 - the code requires that we hit each 256-counter block at least 50 times,
   so the TSC error is basically at *MOST* just a few PIT cycles off in
   any direction. In practice, it's going to be about one microsecond
   off (which is how long it takes to read the counter)

 - so over 17920 PIT cycles, we can pretty much guarantee that the
   calibration error is less than one half of a percent.

My testing bears this out: on my machine, the quick-calibration reports
2934.085kHz, while the slow one reports 2933.415.
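
The numbers above can be checked with a little arithmetic. This sketch assumes the nominal PC PIT input clock of 1193182 Hz, which the message does not state; the function names are hypothetical.

```c
#define PIT_HZ 1193182.0   /* assumption: nominal PC PIT input clock in Hz */

/* Duration of a given number of PIT ticks, in milliseconds. */
double pit_ticks_to_ms(unsigned int ticks)
{
    return ticks * 1000.0 / PIT_HZ;
}

/* Relative difference between two calibration results, in percent of b. */
double relative_diff_percent(double a, double b)
{
    double d = a > b ? a - b : b - a;

    return d / b * 100.0;
}
```

Under that assumption, 17920 ticks come to roughly 15.02 ms, and the two reported results (2934.085 and 2933.415) differ by about 0.023%, well inside the claimed 0.5% bound.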

Yes, the slower calibration is still more precise. For me, the slow
calibration is stable to within about one hundredth of a percent, so it's
(at a guess) roughly an order-and-a-half of magnitude more precise. The
longer you wait, the more precise you can be.

However, the nice thing about the fast TSC PIT synchronization is that
it's pretty much _guaranteed_ to give that 0.5% precision, and fail
gracefully (and very quickly) if it doesn't get it. And it really is
fairly simple (even if there's a lot of _details_ there, and I didn't get
all of those right on the first try or even the second ;)

The patch says "110 insertions", but 63 of those new lines are actually
comments.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/tsc.c |  111 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 110 insertions(+), 1 deletions(-)
2008-09-04 22:54:50 +02:00