Triple-fault and keyboard reset may assert INIT instead of RESET; however
INIT is blocked when Intel VT is enabled. This leads to a partially reset
machine when invoking emergency_restart via sysrq-b: the processor is still
working but other parts of the system are dead.
Default to rebooting via ACPI, which correctly asserts RESET and reboots the
machine.
This is safe since we will fall back to keyboard reset and triple fault if
acpi is not enabled or if the reset is not successful.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
David Witbrodt tracked down (and bisected) a hpet bootup hang on his
system to the following problem: a BIOS bug made the hpet device
visible as a generic PCI device. If e820 reserved entries happen to
be registered first in the resource tree [which v2.6.26 started doing],
then the PCI code will reallocate that device's BAR to some other
address - breaking timer IRQs and hanging the system.
( Normally hpet devices are hidden by the BIOS from the OS's PCI
discovery via chipset magic. Sometimes the hpet is not a PCI device
at all. )
Solve this fundamental fragility by making non-PCI platform drivers
insert resources into the resource tree even if it overlaps the e820
reserved entry, to keep the resource manager from updating the BAR.
Also do these checks for the ioapic and mmconfig addresses, and emit
a warning if this happens.
Bisected-by: David Witbrodt <dawitbro@sbcglobal.net>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Tested-by: David Witbrodt <dawitbro@sbcglobal.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: crash on non-TSC-equipped CPUs
Don't enable the TSC notifier if we *either*:
1. don't have a CPU, or
2. have a CPU with constant TSC.
In either of those cases, the notifier is either damaging (1) or useless(2).
From: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch lets the files using linux/version.h match the files that
#include it.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During CPU hot-remove the sysfs directory created by
threshold_create_bank(), defined in
arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
its parent directory, created by mce_create_device(), defined in
arch/x86/kernel/cpu/mcheck/mce_64.c . Moreover, when the CPU in
question is hotplugged again, obviously the latter has to be created
before the former. At present, the right ordering is not enforced,
because all of these operations are carried out by CPU hotplug
notifiers which are not appropriately ordered with respect to each
other. This leads to serious problems on systems with two or more
multicore AMD CPUs, among other things during suspend and hibernation.
Fix the problem by placing threshold bank CPU hotplug callbacks in
mce_cpu_callback(), so that they are invoked at the right places,
if defined. Additionally, use kobject_del() to remove the sysfs
directory associated with the kobject created by
kobject_create_and_add() in threshold_create_bank(), to prevent the
kernel from crashing during CPU hotplug operations on systems with
two or more multicore AMD CPUs.
This patch fixes bug #11337.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich wrote:
> Even worse - this would even try to access the MSR on non-AMD CPUs
> (currently probably prevented just by the fact that only AMD ones use
> family values of 0x10 or higher).
This patch adds cpu vendor check to the postcore_initcalls.
Reported-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the start addr for free_memtype calls in the error path.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: work around MTRR mask setting, v2
x86: fix section mismatch warning - uv_cpu_init
x86: fix VMI for early params
x86: fix two modpost warnings in mm/init_64.c
x86: fix 1:1 mapping init on 64-bit (memory hotplug case)
x86: work around MTRR mask setting
x86: PAT Update validate_pat_support for intel CPUs
devmem, x86: PAT Change /dev/mem mmap with O_SYNC to use UC_MINUS
x86: PAT proper tracking of set_memory_uc and friends
x86: fix BUG: unable to handle kernel paging request (numaq_tsc_disable)
x86: export pv_lock_ops non-GPL
x86, mmiotrace: silence section mismatch warning - leave_uniprocessor
x86: use WARN() in arch/x86/kernel
x86: use WARN() in arch/x86/mm/ioremap.c
werror: fix pci calgary
x86: fix oprofile + hibernation badness
x86, SGI UV: hardcode the TLB flush interrupt system vector
x86: fix Xorg startup/shutdown slowdown with PAT
x86: fix "kernel won't boot on a Cyrix MediaGXm (Geode)"
x86 iommu: remove unneeded parenthesis
improve the debug printout:
- make it actually display something
- print it only once
would be nice to have a WARN_ONCE() facility, to feed such things to
kerneloops.org.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
WARNING: vmlinux.o(.cpuinit.text+0x3cc4): Section mismatch in reference from the function uv_cpu_init() to the function .init.text:uv_system_init()
The function __cpuinit uv_cpu_init() references
a function __init uv_system_init().
If uv_system_init is only used by uv_cpu_init then
annotate uv_system_init with a matching annotation.
uv_system_init was ment to be called only once, so do it from codepath
(native_smp_prepare_cpus) which is called once, right before activation
of other cpus (smp_init).
Note: old code relied on uv_node_to_blade being initialized to 0,
but it'a not initialized from anywhere.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
alloc_coherent dma_ops callback was added to GART, however, it doesn't
return a size aligned address wrt dma_alloc_coherent, as
DMA-mapping.txt defines. This patch fixes it.
This patch also removes unused gart_map_simple
(dma_mapping_ops->map_simple has gone).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pci-dma.c doesn't use map_simple hook any more so we can remove it
from struct dma_mapping_ops now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All the x86 DMA-API functions are defined in asm/dma-mapping.h. This patch
moves the dma_*_coherent functions also to this header file because they are
now small enough to do so.
This is done as a separate patch because it also includes some renaming and
restructuring of the dma-mapping.h file.
Signed-off-by: Joerg Roedel <joerg.roede@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All dma_ops implementations support the alloc_coherent and free_coherent
callbacks now. This allows a big simplification of the dma_alloc_coherent
function which is done with this patch. The dma_free_coherent functions is also
cleaned up and calls now the free_coherent callback of the dma_ops
implementation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ v2 - x86: make gart_alloc_coherent return zeroed memory
FUJITA Tomonori pointed it out that the dma_alloc_coherent function
should return memory set to zero. This patch adds this to the GART
implementation too. ]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
while fixing a different bug i moved the call to vmi_init before
early params could be parsed.
This broke the vmi specific commandline parameters.
Fix that, by moving vmi initialization after kernel has got a chance to
parse early parameters.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
early_io{re,un}map() are __init and hence can't be called from __meminit
functions.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While I don't have a hotplug capable system at hand, I think two issues need
fixing:
- pud_phys (in kernel_physical_ampping_init()) would remain uninitialized in
the after_bootmem case
- the locking done just around phys_pmd_{init,update}() would leave out pgd
updates, and it was needlessly covering code portions that do allocations
(perhaps using a more friendly gfp value in alloc_low_page() would then be
possible)
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Actually, might as well simply reconstruct the memtype list at free time
I guess. How is this for a coalescing version of the array functions?
Compiles, boots and provides me with:
root@7ixe4:~# wc -l /debug/x86/pat_memtype_list
53 /debug/x86/pat_memtype_list
otherwise (down from 16384+).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new set_memory_array_{uc,wb}() pass virtual addresses to
{reserve,free}_memtype() it seems.
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Joshua Hoblitt reported that only 3 GB of his 16 GB of RAM is
usable. Booting with mtrr_show showed us the BIOS-initialized
MTRR settings - which are all wrong.
So the root cause is that the BIOS has not set the mask correctly:
> [ 0.429971] MSR00000200: 00000000d0000000
> [ 0.433305] MSR00000201: 0000000ff0000800
> should be ==> [ 0.433305] MSR00000201: 0000003ff0000800
>
> [ 0.436638] MSR00000202: 00000000e0000000
> [ 0.439971] MSR00000203: 0000000fe0000800
> should be ==> [ 0.439971] MSR00000203: 0000003fe0000800
>
> [ 0.443304] MSR00000204: 0000000000000006
> [ 0.446637] MSR00000205: 0000000c00000800
> should be ==> [ 0.446637] MSR00000205: 0000003c00000800
>
> [ 0.449970] MSR00000206: 0000000400000006
> [ 0.453303] MSR00000207: 0000000fe0000800
> should be ==> [ 0.453303] MSR00000207: 0000003fe0000800
>
> [ 0.456636] MSR00000208: 0000000420000006
> [ 0.459970] MSR00000209: 0000000ff0000800
> should be ==> [ 0.459970] MSR00000209: 0000003ff0000800
So detect this borkage and add the prefix 111.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch changes the pfn args from 'u32' to 'unsigned long'
on alloc_p*() functions on paravirt_ops, and the corresponding
implementations for Xen and VMI. The prototypes for CONFIG_PARAVIRT=n
are already using unsigned long, so paravirt.h now matches the prototypes
on asm-x86/pgalloc.h.
It shouldn't result in any changes on generated code on 32-bit, with
or without CONFIG_PARAVIRT. On both cases, 'codiff -f' didn't show any
change after applying this patch.
On 64-bit, there are (expected) binary changes only when CONFIG_PARAVIRT
is enabled, as the patch is really supposed to change the size of the
pfn args.
[ v2: KVM_GUEST: use the right parameter type on kvm_release_pt() ]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We changed the interface of some pageattr, this patch makes
pageattr-test compile.
Signed-off-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add array interface APIs of pageattr. page based cache flush is quite
slow for a lot of pages. If pages are more than 1024 (4M), the patch
will use a wbinvd(). We have a simple test here (run a 3d game - open
arena), nearly all agp memory allocation are small (< 1M), so suppose
this will not impact runtime performance.
Signed-off-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pentium III and Core Solo/Duo CPUs have an erratum
" Page with PAT set to WC while associated MTRR is UC may consolidate to UC "
which can result in WC setting in PAT to be ineffective. We will disable
PAT on such CPUs, so that we can continue to use MTRR WC setting.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All kernel mappings like ioremap(), etc uses UC_MINUS as the type. /dev/mem
mappings with /dev/mem being opened with O_SYNC however was using UC,
resulting in a conflict with /dev/mem mmap failing. This seems to be
affecting some apps (one being flashrom) which are using O_SYNC and which were
working before.
Switch /dev/mem with O_SYNC also to UC_MINUS.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Big thinko in pat memtype tracking code. reserve_memtype should be called
with physical address and not virtual address.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This section mismatch:
>> Seems to be a section mismatch; init_intel() is __cpuinit while
>> numaq_tsc_disable() is __init. Seems to be introduced in:
>>
>> commit 64898a8bad
>> Author: Yinghai Lu <yhlu.kernel@gmail.com>
>> Date: Sat Jul 19 18:01:16 2008 -0700
>>
>> x86: extend and use x86_quirks to clean up NUMAQ code
>
> Oops, I am wrong about numaq_tsc_disable() being __init. Still, I
> believe that Yinghai might be able to say what's really wrong :-)
Would lead to this crash:
BUG: unable to handle kernel paging request at c08a45f0
IP: [<c08a45f0>] numaq_tsc_disable+0x0/0x40
Fixed by the patch below.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
None of the spinlock API is exported GPL, so there's no reason for
pv_lock_ops to be.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: drago01 <drago01@gmail.com>
WARNING: vmlinux.o(.text+0x180af): Section mismatch in reference from the function leave_uniprocessor() to the function .cpuinit.text:cpu_up()
The function leave_uniprocessor() references
the function __cpuinit cpu_up().
This is often because leave_uniprocessor lacks a __cpuinit
annotation or the annotation of cpu_up is wrong.
leave_uniprocessor calls cpu_up only when CONFIG_HOTPLUG_CPU is set,
so it can be safely annotated as __ref
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Paalanen <pq@iki.fi>
Booting kernel with vmalloc=[any size<=16m] will oops on my pc (i386/1G memory).
BUG_ON in arch/x86/mm/init_32.c triggered:
BUG_ON((unsigned long)high_memory > VMALLOC_START);
It's due to the vm area hole.
In include/asm-x86/pgtable_32.h:
#define VMALLOC_OFFSET (8 * 1024 * 1024)
#define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
& ~(VMALLOC_OFFSET - 1))
There's several related point:
1. MAXMEM :
(-__PAGE_OFFSET - __VMALLOC_RESERVE).
The space after VMALLOC_END is included as well, I set it to
(VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)
2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE
fixed by adding VMALLOC_OFFSET to it.
3. VMALLOC_START :
(((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
So it's not always 8M, bigger than 8M possible.
I set it to ((unsigned long)high_memory + VMALLOC_OFFSET)
4. the VMALLOC_RESERVE is an unused macro, so remove it here.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: akpm@linux-foundation.org
Cc: hidave.darkstar@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Use WARN() instead of a printk+WARN_ON() pair; this way the message
becomes part of the warning section for better reporting/collection.
This also allowed the folding of some if()'s into the WARN()
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use WARN() instead of a printk+WARN_ON() pair; this way the message
becomes part of the warning section for better reporting/collection.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix an integer comparison always false warning in the PCI Calgary 64 driver.
A u8 is being compared to something that's 512 by default, resulting in the
following warning:
arch/x86/kernel/pci-calgary_64.c:1285: warning: comparison is always false due to limited range of data type
This was introduced by patch b34e90b8f0.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Vegard Nossum reported oprofile + hibernation problems:
> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()
The usual problem: the suspend function when interrupts are
already disabled calls smp_call_function which is not allowed with
interrupt off. But at this point all the other CPUs should be already
down anyways, so it should be enough to just drop that.
This patch should fix that problem at least by fixing cpu hotplug&
suspend support.
[ mingo@elte.hu: fixed 5 coding style errors. ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>