Linux kernel for uConsole
  • C 97.1%
  • Assembly 1.8%
  • Shell 0.4%
  • Makefile 0.3%
  • Python 0.2%
Find a file
Yang Shi 2686f71fdc mm: thp: handle page cache THP correctly in PageTransCompoundMap
commit 169226f7e0 upstream.

We have a usecase to use tmpfs as QEMU memory backend and we would like
to take the advantage of THP as well.  But, our test shows the EPT is
not PMD mapped even though the underlying THP are PMD mapped on host.
The number showed by /sys/kernel/debug/kvm/largepage is much less than
the number of PMD mapped shmem pages as the below:

  7f2778200000-7f2878200000 rw-s 00000000 00:14 262232 /dev/shm/qemu_back_mem.mem.Hz2hSf (deleted)
  Size:            4194304 kB
  [snip]
  AnonHugePages:         0 kB
  ShmemPmdMapped:   579584 kB
  [snip]
  Locked:                0 kB

  cat /sys/kernel/debug/kvm/largepages
  12

And some benchmarks do worse than with anonymous THPs.

By digging into the code we figured out that commit 127393fbe5 ("mm:
thp: kvm: fix memory corruption in KVM with THP enabled") checks if
there is a single PTE mapping on the page for anonymous THP when setting
up EPT map.  But the _mapcount < 0 check doesn't work for page cache THP
since every subpage of page cache THP would get _mapcount inc'ed once it
is PMD mapped, so PageTransCompoundMap() always returns false for page
cache THP.  This would prevent KVM from setting up PMD mapped EPT entry.

So we need handle page cache THP correctly.  However, when page cache
THP's PMD gets split, kernel just remove the map instead of setting up
PTE map like what anonymous THP does.  Before KVM calls get_user_pages()
the subpages may get PTE mapped even though it is still a THP since the
page cache THP may be mapped by other processes at the mean time.

Checking its _mapcount and whether the THP has PTE mapped or not.
Although this may report some false negative cases (PTE mapped by other
processes), it looks not trivial to make this accurate.

With this fix /sys/kernel/debug/kvm/largepage would show reasonable
pages are PMD mapped by EPT as the below:

  7fbeaee00000-7fbfaee00000 rw-s 00000000 00:14 275464 /dev/shm/qemu_back_mem.mem.SKUvat (deleted)
  Size:            4194304 kB
  [snip]
  AnonHugePages:         0 kB
  ShmemPmdMapped:   557056 kB
  [snip]
  Locked:                0 kB

  cat /sys/kernel/debug/kvm/largepages
  271

And the benchmarks are as same as anonymous THPs.

[yang.shi@linux.alibaba.com: v4]
  Link: http://lkml.kernel.org/r/1571865575-42913-1-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1571769577-89735-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: dd78fedde4 ("rmap: support file thp")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Gang Deng <gavin.dg@linux.alibaba.com>
Tested-by: Gang Deng <gavin.dg@linux.alibaba.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12 19:20:36 +01:00
arch arm64: dts: ti: k3-am65-main: Fix gic-its node unit-address 2019-11-10 11:27:56 +01:00
block blk-rq-qos: fix first node deletion of rq_qos_del() 2019-10-29 09:20:09 +01:00
certs export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() 2018-08-22 23:21:44 +09:00
crypto crypto: skcipher - Unmap pages after an external error 2019-10-11 18:20:52 +02:00
Documentation x86/xen: Return from panic notifier 2019-11-06 13:05:55 +01:00
drivers net: hns: Fix the stray netpoll locks causing deadlock in NAPI path 2019-11-12 19:20:33 +01:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs 2019-11-10 11:27:34 +01:00
include mm: thp: handle page cache THP correctly in PageTransCompoundMap 2019-11-12 19:20:36 +01:00
init initramfs: don't free a non-existent initrd 2019-10-01 08:26:09 +02:00
ipc ipc/mqueue.c: only perform resource calculation if user valid 2019-08-06 19:06:52 +02:00
kernel tracing: Fix "gfp_t" format for synthetic events 2019-11-10 11:27:28 +01:00
lib lib: textsearch: fix escapes in example code 2019-10-29 09:19:35 +01:00
LICENSES LICENSES: Remove CC-BY-SA-4.0 license text 2018-10-18 11:28:50 +02:00
mm mm, meminit: recalculate pcpu batch and high limits after init completes 2019-11-12 19:20:35 +01:00
net ipv6: fixes rt6_probe() and fib6_nh->last_probe init 2019-11-12 19:20:33 +01:00
samples samples: bpf: fix: seg fault with NULL pointer arg 2019-11-06 13:05:30 +01:00
scripts scripts/setlocalversion: Improve -dirty check with git-status --no-optional-locks 2019-11-06 13:05:27 +01:00
security ima: fix freeing ongoing ahash_request 2019-10-11 18:21:11 +02:00
sound ALSA: hda/ca0132 - Fix possible workqueue stall 2019-11-12 19:20:35 +01:00
tools selftests/powerpc: Fix compile error on tlbie_test due to newer gcc 2019-11-10 11:27:55 +01:00
usr kbuild: clean compressed initramfs image 2019-10-07 18:57:16 +02:00
virt KVM: coalesced_mmio: add bounds checking 2019-09-21 07:16:44 +02:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS USB: rio500: Remove Rio 500 kernel driver 2019-10-17 13:44:47 -07:00
Makefile Linux 4.19.83 2019-11-10 11:27:57 +01:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.