linux-uconsole/include/asm-generic
Andrea Arcangeli 2d363e959e mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition
commit 26c191788f upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-10 00:32:57 +09:00
..
bitops bitops: add #ifndef for each of find bitops 2011-05-26 17:12:38 -07:00
4level-fixup.h mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() 2009-07-27 12:10:38 -07:00
atomic-long.h asm-generic: merge branch 'master' of torvalds/linux-2.6 2009-06-12 11:32:58 +02:00
atomic.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic 2010-10-22 11:17:06 -07:00
atomic64.h lib: Provide generic atomic64_t implementation 2009-06-15 13:27:38 +10:00
audit_change_attr.h audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
audit_dir_write.h audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
audit_read.h audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
audit_signal.h
audit_write.h audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
auxvec.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
bitops.h bitops: remove minix bitops from asm/bitops.h 2011-03-23 19:46:22 -07:00
bitsperlong.h asm-generic: introduce asm/bitsperlong.h 2009-06-11 21:02:14 +02:00
bug.h bug.h: Move ratelimit warn interfaces to ratelimit.h 2011-05-26 15:00:31 -04:00
bugs.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
cache.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
cacheflush.h asm-generic/cacheflush.h: flush icache when copying to user pages 2011-05-25 08:39:37 -07:00
checksum.h add generic lib/checksum.c 2009-06-11 21:02:51 +02:00
cmpxchg-local.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
cmpxchg.h Add cmpxchg_local to asm-generic for per cpu atomic operations 2008-02-07 08:42:30 -08:00
cputime.h time: Add nsecs_to_cputime64 interface for asm-generic 2011-01-26 12:33:20 +01:00
current.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
delay.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
device.h Driver Core: Add platform device arch data V3 2009-07-22 00:28:38 +02:00
div64.h rename div64_64 to div64_u64 2008-05-01 08:03:58 -07:00
dma-coherent.h generic: per-device coherent dma allocator 2008-06-30 12:51:05 +02:00
dma-mapping-broken.h dma-mapping: remove dma_is_consistent API 2010-08-11 08:59:21 -07:00
dma-mapping-common.h dma-mapping: remove unnecessary sync_single_range_* in dma_map_ops 2010-05-27 09:12:52 -07:00
dma.h asm-generic: add legacy I/O header files 2009-06-11 21:02:42 +02:00
emergency-restart.h
errno-base.h
errno.h mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally 2011-03-17 13:08:27 -03:00
fb.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
fcntl.h New kind of open files - "location only". 2011-03-15 02:21:45 -04:00
ftrace.h asm-generic headers: add ftrace.h 2011-03-17 09:19:04 +08:00
futex.h futex: Sanitize futex ops argument types 2011-03-11 12:23:31 +01:00
getorder.h asm-generic: rename page.h and uaccess.h 2009-06-11 21:02:17 +02:00
gpio.h gpio: add GPIOF_ values regardless on kconfig settings 2011-06-16 08:40:52 -06:00
hardirq.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
hw_irq.h asm-generic: add legacy I/O header files 2009-06-11 21:02:42 +02:00
ide_iops.h
int-l64.h asm-generic: introduce asm/bitsperlong.h 2009-06-11 21:02:14 +02:00
int-ll64.h asm-generic: introduce asm/bitsperlong.h 2009-06-11 21:02:14 +02:00
io.h asm-generic: fix inX/outX functions for architectures that have PCI 2011-03-17 09:19:03 +08:00
ioctl.h Make ioctl.h compatible with userland 2008-08-12 16:07:31 -07:00
ioctls.h tty: add TIOCVHANGUP to allow clean tty shutdown of all ttys 2011-02-17 14:16:30 -08:00
iomap.h generic: add ioremap_wc() interface wrapper 2008-04-24 23:40:47 +02:00
ipcbuf.h asm-generic: add generic sysv ipc headers 2009-06-11 21:02:15 +02:00
irq.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
irq_regs.h core: Replace __get_cpu_var with __this_cpu_read if not used for an address. 2010-12-17 15:07:19 +01:00
irqflags.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
Kbuild include: replace unifdef-y with header-y 2010-08-14 22:26:51 +02:00
Kbuild.asm include: replace unifdef-y with header-y 2010-08-14 22:26:51 +02:00
kdebug.h asm-generic: kdebug.h: Checkpatch cleanup 2010-10-09 21:51:44 +02:00
kmap_types.h include/asm-generic/kmap_types.h: add helpful reminder 2010-05-25 08:07:03 -07:00
libata-portmap.h
linkage.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
local.h local_t: Remove cpu_local_xx macros 2010-01-05 15:34:49 +09:00
local64.h arch: Implement local64_t 2010-06-09 11:12:36 +02:00
memory_model.h tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
mm_hooks.h
mman-common.h thp: mm: define MADV_NOHUGEPAGE 2011-01-13 17:32:47 -08:00
mman.h mm: add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions 2009-09-22 07:17:41 -07:00
mmu.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
mmu_context.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
module.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
msgbuf.h asm-generic: add generic sysv ipc headers 2009-06-11 21:02:15 +02:00
mutex-dec.h mutex: speed up generic mutex implementations 2008-10-23 09:18:20 -07:00
mutex-null.h
mutex-xchg.h mutex: speed up generic mutex implementations 2008-10-23 09:18:20 -07:00
mutex.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
page.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
param.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
parport.h asm-generic: add legacy I/O header files 2009-06-11 21:02:42 +02:00
pci-dma-compat.h dma-mapping: pci: move pci_set_dma_mask and pci_set_consistent_dma_mask to pci-dma-compat.h 2010-03-12 15:52:42 -08:00
pci.h PCI: remove pcibios_scan_all_fns() 2009-09-09 13:29:18 -07:00
percpu.h percpu: Optimize __get_cpu_var() 2010-09-10 10:56:51 +02:00
pgalloc.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
pgtable-nopmd.h mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() 2009-07-27 12:10:38 -07:00
pgtable-nopud.h mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() 2009-07-27 12:10:38 -07:00
pgtable.h mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition 2012-06-10 00:32:57 +09:00
poll.h epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() 2012-02-29 16:34:34 -08:00
posix_types.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
ptrace.h asm-generic/ptrace.h: start a common low level ptrace helper 2011-05-26 17:12:36 -07:00
resource.h ulimit: raise default hard ulimit on number of files to 4096 2011-05-25 08:39:43 -07:00
rtc.h asm-generic: make get_rtc_time overridable 2009-06-11 21:02:18 +02:00
scatterlist.h asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h 2010-05-27 09:12:54 -07:00
sections.h x86: Separate out entry text section 2011-03-08 17:22:11 +01:00
segment.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
sembuf.h asm-generic: add generic sysv ipc headers 2009-06-11 21:02:15 +02:00
serial.h asm-generic: add legacy I/O header files 2009-06-11 21:02:42 +02:00
setup.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
shmbuf.h asm-generic: add generic sysv ipc headers 2009-06-11 21:02:15 +02:00
shmparam.h asm-generic: add generic sysv ipc headers 2009-06-11 21:02:15 +02:00
siginfo.h Fix common misspellings 2011-03-31 11:26:23 -03:00
signal-defs.h asm-generic: rename termios.h, signal.h and mman.h 2009-06-11 21:01:52 +02:00
signal.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
sizes.h asm-generic headers: add sizes.h 2011-03-17 09:19:04 +08:00
socket.h net: Generalize socket rx gap / receive queue overflow cmsg 2009-10-12 13:26:31 -07:00
sockios.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
spinlock.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
stat.h asm-generic/stat.h: support 64-bit file time_t for stat() 2010-11-01 15:31:29 -04:00
statfs.h asm-generic: Use __BITS_PER_LONG in statfs.h 2012-05-21 09:39:58 -07:00
string.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
swab.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
syscall.h asm-generic: syscall_get_nr returns int 2009-09-22 19:56:50 -07:00
syscalls.h Fix the declaration of sys_execve() in asm-generic/syscalls.h 2010-08-18 12:12:38 -07:00
system.h asm-generic: cmpxchg does not handle non-long arguments 2010-10-09 21:51:27 +02:00
termbits.h tty: Add EXTPROC support for LINEMODE 2010-08-10 13:47:39 -07:00
termios-base.h asm-generic: rename termios.h, signal.h and mman.h 2009-06-11 21:01:52 +02:00
termios.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
timex.h asm-generic: add legacy I/O header files 2009-06-11 21:02:42 +02:00
tlb.h mm: uninline large generic tlb.h functions 2011-05-25 08:39:20 -07:00
tlbflush.h asm-generic: add generic NOMMU versions of some headers 2009-06-11 21:02:50 +02:00
topology.h topology: alternate fix for ia64 tiger_defconfig build breakage 2010-08-09 20:44:57 -07:00
types.h add the common dma_addr_t typedef to include/linux/types.h 2011-03-22 17:44:09 -07:00
uaccess-unaligned.h asm-generic: rename page.h and uaccess.h 2009-06-11 21:02:17 +02:00
uaccess.h asm-generic headers: add arch-specific __strnlen_user calling in uaccess.h 2011-03-17 09:19:05 +08:00
ucontext.h asm-generic: add generic ABI headers 2009-06-11 21:02:15 +02:00
unaligned.h asm-generic: add generic versions of common headers 2009-06-11 21:02:37 +02:00
unistd.h compat: use sys_sendfile64() implementation for sendfile syscall 2012-04-02 09:27:21 -07:00
user.h asm-generic/user.h: Fix spelling in comment 2011-03-01 15:49:39 +01:00
vga.h asm-generic: add legacy I/O header files 2009-06-11 21:02:42 +02:00
vmlinux.lds.h Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2011-05-24 11:53:42 -07:00
xor.h sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00