Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
has turned a rare failure to kexec on QS22 into a reproducible
error, which we have now analysed.
The problem is that after a kexec, the MSIC hardware still points
into the middle of the old ring buffer. We set up the ring buffer
during reboot, but not the offset into it. On older kernels, this
would cause a storm of thousands of spurious interrupts after a
kexec, which would most of the time get dropped silently.
With the new code, we time out on each interrupt, waiting for
it to become valid. If more interrupts come in that we time
out on, this goes on indefinitely, which eventually leads to
a hard crash.
The solution in this commit is to read the current offset from
the MSIC when reinitializing it. This now works correctly, as
expected.
Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
careful_allocation() was calling into the bootmem allocator for
nodes which had not been fully initialized and caused a previous
bug: http://patchwork.ozlabs.org/patch/10528/ So, I merged a
few broken out loops in do_init_bootmem() to fix it. That changed
the code ordering.
I think this bug is triggered by having reserved areas for a node
which are spanned by another node's contents. In the
mark_reserved_regions_for_nid() code, we attempt to reserve the
area for a node before we have allocated the NODE_DATA() for that
nid. We do this since I reordered that loop. I suck.
This is causing crashes at bootup on some systems, as reported
by Jon Tollefson.
This may only present on some systems that have 16GB pages
reserved. But, it can probably happen on any system that is
trying to reserve large swaths of memory that happen to span other
nodes' contents.
This commit ensures that we do not touch bootmem for any node which
has not been initialized, and also removes a compile warning about
an unused variable.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It looks like most of the hugetlb code is doing the correct thing if
hugepages are not supported, but the mmap code is not. If we get into
the mmap code when hugepages are not supported, such as in an LPAR
which is running Active Memory Sharing, we can oops the kernel. This
fixes the oops being seen in this path.
oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: nfs(N) lockd(N) nfs_acl(N) sunrpc(N) ipv6(N) fuse(N) loop(N)
dm_mod(N) sg(N) ibmveth(N) sd_mod(N) crc_t10dif(N) ibmvscsic(N)
scsi_transport_srp(N) scsi_tgt(N) scsi_mod(N)
Supported: No
NIP: c000000000038d60 LR: c00000000003945c CTR: c0000000000393f0
REGS: c000000077e7b830 TRAP: 0300 Tainted: G
(2.6.27.5-bz50170-2-ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 44000448 XER: 20000001
DAR: c000002000af90a8, DSISR: 0000000040000000
TASK = c00000007c1b8600[4019] 'hugemmap01' THREAD: c000000077e78000 CPU: 6
GPR00: 0000001fffffffe0 c000000077e7bab0 c0000000009a4e78 0000000000000000
GPR04: 0000000000010000 0000000000000001 00000000ffffffff 0000000000000001
GPR08: 0000000000000000 c000000000af90c8 0000000000000001 0000000000000000
GPR12: 000000000000003f c000000000a73880 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000010000
GPR20: 0000000000000000 0000000000000003 0000000000010000 0000000000000001
GPR24: 0000000000000003 0000000000000000 0000000000000001 ffffffffffffffb5
GPR28: c000000077ca2e80 0000000000000000 c00000000092af78 0000000000010000
NIP [c000000000038d60] .slice_get_unmapped_area+0x6c/0x4e0
LR [c00000000003945c] .hugetlb_get_unmapped_area+0x6c/0x80
Call Trace:
[c000000077e7bbc0] [c00000000003945c] .hugetlb_get_unmapped_area+0x6c/0x80
[c000000077e7bc30] [c000000000107e30] .get_unmapped_area+0x64/0xd8
[c000000077e7bcb0] [c00000000010b140] .do_mmap_pgoff+0x140/0x420
[c000000077e7bd80] [c00000000000bf5c] .sys_mmap+0xc4/0x140
[c000000077e7be30] [c0000000000086b4] syscall_exit+0x0/0x40
Instruction dump:
fac1ffb0 fae1ffb8 fb01ffc0 fb21ffc8 fb41ffd0 fb61ffd8 fb81ffe0 fbc1fff0
fbe1fff8 f821fef1 f8c10158 f8e10160 <7d49002e> f9010168 e92d01b0 eb4902b0
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5348/1: fix documentation wrt location of the alignment trap interface
[ARM] Ensure linux/hardirqs.h is included where required
[ARM] fix kernel-doc syntax
[ARM] arch/arm/common/sa1111.c: Correct error handling code
[ARM] 5341/2: there is no copy_page on nommu ARM
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
Phonet: keep TX queue disabled when the device is off
SCHED: netem: Correct documentation comment in code.
netfilter: update rwlock initialization for nat_table
netlabel: Compiler warning and NULL pointer dereference fix
e1000e: fix double release of mutex
IA64: HP_SIMETH needs to depend upon NET
netpoll: fix race on poll_list resulting in garbage entry
ipv6: silence log messages for locally generated multicast
sungem: improve ethtool output with internal pcs and serdes
tcp: tcp_vegas cong avoid fix
sungem: Make PCS PHY support partially work again.
Otherwise those using it in transition patches (eg. kvm) can't compile
with CONFIG_SMP=n:
arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request':
arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a cgroup is removed, it's unlinked from its parent's children list,
but not actually freed until the last dentry on it is released (at which
point cgrp->root->number_of_cgroups is decremented).
Currently rebind_subsystems checks for the top cgroup's child list being
empty in order to rebind subsystems into or out of a hierarchy - this can
result in the set of subsystems bound to a hierarchy being
removed-but-not-freed cgroup.
The simplest fix for this is to forbid remounts that change the set of
subsystems on a hierarchy that has removed-but-not-freed cgroups. This
bug can be reproduced via:
mkdir /mnt/cg
mount -t cgroup -o ns,freezer cgroup /mnt/cg
mkdir /mnt/cg/foo
sleep 1h < /mnt/cg/foo &
rmdir /mnt/cg/foo
mount -t cgroup -o remount,ns,devices,freezer cgroup /mnt/cg
kill $!
Though the above will cause oops in -mm only but not mainline, but the bug
can cause memory leak in mainline (and even oops)
Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tyler Hicks and Dustin Kirkland are now the primary contact points for
eCryptfs issues that may arise from this point forward.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Acked-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kmem_cache_create() function in the slob allocator passes the SLAB
flags as GFP flags to the slob_alloc() function. The patch changes this
call to pass GFP_KERNEL as the other allocators seem to do.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add device definition and support functions for the
second i2c device (i2c1). If this is selected, the first
i2c bus will become index 0 instead of index -1.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Provide the initial register definitions for the newer
style of framebuffer cores found in the Samsung SoCs
such as S3C2450, S3C64XX.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Fix the compilation of the SDHCI configuration/setup
functions to depend on their respective configuration
variables.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
The function s3c64xx_init_io was missing from <plat/cpu.h>
and was masked by the SMDK6410 having an local definition.
Fix by removing the SMDK6410 variant and adding it to the
relevant <plat/cpu.h> file.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Add the specific register definitions for the Samsung SDHCI
(HSMMC) block for the S3C2443 and S3C64XX series.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
The clk_fout_epll clock wasn't registered as part of the initial clock
work, which can cause problems if it is used by one of the hardware
blocks.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
The gpiolib driver keeps its chip array to itself
and having a separate array for s3c-only gpios stops
any non-s3c gpio being used in one of the s3c specific
configuration calls.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Move the definition for the hsmmc device to plat-s3c
to be shared between the s3c24xx and s3c64xx platforms.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Add definitions for the external interrupt groups which accompany
the original IRQ_EINT from the s3c24xx series.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Discard the 'void *' from the pointers used for the
virtual addresses when setting up the .virtual fields
of the io map to avoid implicit cast warnings
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Index: linux.git/arch/arm/plat-s3c64xx/cpu.c
===================================================================
Some of the startup output can be reduced to
KERN_DEBUG from KERN_INFO as it is only really
useful when trying to debug kernel initialisation
problems.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>