linux-uconsole/include
Andrea Arcangeli 17b6ada056 mm: hugetlbfs: fix hugetlbfs optimization
commit 27c73ae759 upstream.

Commit 7cb2ef56e6 ("mm: fix aio performance regression for database
caused by THP") can cause dereference of a dangling pointer if
split_huge_page runs during PageHuge() if there are updates to the
tail_page->private field.

Also it is repeating compound_head twice for hugetlbfs and it is running
compound_head+compound_trans_head for THP when a single one is needed in
both cases.

The new code within the PageSlab() check doesn't need to verify that the
THP page size is never bigger than the smallest hugetlbfs page size, to
avoid memory corruption.

A longstanding theoretical race condition was found while fixing the
above (see the change right after the skip_unlock label, that is
relevant for the compound_lock path too).

By re-establishing the _mapcount tail refcounting for all compound
pages, this also fixes the below problem:

  echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

  BUG: Bad page state in process bash  pfn:59a01
  page:ffffea000139b038 count:0 mapcount:10 mapping:          (null) index:0x0
  page flags: 0x1c00000000008000(tail)
  Modules linked in:
  CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  Call Trace:
    dump_stack+0x55/0x76
    bad_page+0xd5/0x130
    free_pages_prepare+0x213/0x280
    __free_pages+0x36/0x80
    update_and_free_page+0xc1/0xd0
    free_pool_huge_page+0xc2/0xe0
    set_max_huge_pages.part.58+0x14c/0x220
    nr_hugepages_store_common.isra.60+0xd0/0xf0
    nr_hugepages_store+0x13/0x20
    kobj_attr_store+0xf/0x20
    sysfs_write_file+0x189/0x1e0
    vfs_write+0xc5/0x1f0
    SyS_write+0x55/0xb0
    system_call_fastpath+0x16/0x1b

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guillaume Morin <guillaume@morinfr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-06 11:08:12 -08:00
..
acpi ACPI / hotplug: Fix conflicted PCI bridge notify handlers 2013-12-04 10:57:04 -08:00
asm-generic mm: fix TLB flush race between migration, and change_protection_range 2014-01-09 12:24:23 -08:00
clocksource clocksource: arch_timer: use virtual counters 2014-01-09 12:24:26 -08:00
crypto crypto: scatterwalk - Use sg_chain_ptr on chain entries 2013-12-11 22:36:29 -08:00
drm drm/radeon: 0x9649 is SUMO2 not SUMO 2014-01-09 12:24:22 -08:00
dt-bindings
keys
linux mm: hugetlbfs: fix hugetlbfs optimization 2014-02-06 11:08:12 -08:00
math-emu
media media: v4l2: added missing mutex.h include to v4l2-ctrls.h 2013-09-26 17:18:26 -07:00
memory
misc
net netfilter: push reasm skb through instead of original frag skbs 2013-12-08 07:29:25 -08:00
pcmcia
ras
rdma
rxrpc
scsi SCSI: Disable WRITE SAME for RAID and virtual host adapter drivers 2013-12-11 22:36:28 -08:00
sound ALSA: memalloc.h - fix wrong truncation of dma_addr_t 2013-12-20 07:45:06 -08:00
target target/file: Update hw_max_sectors based on current block_size 2014-01-09 12:24:20 -08:00
trace tracing: Allow events to have NULL strings 2013-12-04 10:57:17 -08:00
uapi ALSA: compress: Fix 64bit ABI incompatibility 2013-12-20 07:45:06 -08:00
video Merge branch 'fbdev-3.10-fixes' of git://gitorious.org/linux-omap-dss2/linux into linux-fbdev/for-3.10-fixes 2013-05-29 17:00:34 +08:00
xen xenbus: delay xenbus frontend resume if xenstored is not running 2013-05-29 09:04:19 -04:00
Kbuild UAPI: remove empty Kbuild files 2013-04-30 17:04:09 -07:00