Linux kernel for uConsole
Michal Hocko 0d73e773ed mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
[ Upstream commit 7550c60798 ]

Patch series "THP eligibility reporting via proc".

This series of three patches aims at making THP eligibility reporting much
more robust and sustainable in the long term.  The trigger for the change
is a regression report [2] and the long follow-up discussion.  In short,
the specific application didn't have a good API to query whether a
particular mapping can be backed by THP, so it used VMA flags to work
around that.  These flags represent a deep internal state of VMAs and as
such they should be used by userspace with a great deal of caution.

A similar thing happened in [3], when users complained that VM_MIXEDMAP is
no longer set on DAX mappings.  Again, the lack of a proper API led to an
abuse.

The first patch in the series tries to emphasise that the semantics of the
flags might change and that any application consuming them should be
really careful.

The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well.

[1] http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz

This patch (of 3):

Even though the VMA flags exported via /proc/<pid>/smaps are explicitly
documented as not guaranteed for future compatibility, the warning doesn't
go far enough because it doesn't mention semantic changes to those flags.
These matter as well, because the flags are a deep implementation detail
internal to the MM code and their semantics might change at any time.

Let's consider two recent examples:
http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a0864 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.

http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps.  After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.

In both cases userspace was relying on the semantics of a specific VMA
flag.  The primary reason this happened is the lack of a proper interface.
While that is being worked on and will be fixed properly, our wording
could be refined to be more vocal about the semantic aspect of these flags
as well.
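
To make the fragile pattern concrete, here is a minimal sketch of the kind
of userspace check described in the second example. It is not taken from
any of the applications involved: the helper name vmflags_by_mapping and
the SAMPLE text are hypothetical, and the sample only imitates the layout
of /proc/<pid>/smaps. It collects the "VmFlags:" line of each mapping and
looks for "nh".

```python
# Hypothetical excerpt in the layout of /proc/<pid>/smaps (not real output).
SAMPLE = """\
00400000-0048a000 r-xp 00000000 fd:03 960637 /bin/bash
Size:                552 kB
VmFlags: rd ex mr mw me dw
7f0000000000-7f0000200000 rw-p 00000000 00:00 0
Size:               2048 kB
VmFlags: rd wr mr mw me ac nh
"""


def vmflags_by_mapping(smaps_text):
    """Map each address range to the set of flags on its VmFlags line."""
    flags = {}
    current = None
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "VmFlags:":
            if current is not None:
                flags[current] = set(fields[1:])
        elif not fields[0].endswith(":"):
            # A mapping header starts with "start-end", not with "Key:".
            current = fields[0]
    return flags


for addr, vmflags in vmflags_by_mapping(SAMPLE).items():
    # Fragile: a missing "nh" does not prove that the mapping is THP
    # eligible, because THP can also be disabled by an mm-wide flag.
    print(addr, "nh set" if "nh" in vmflags else "nh not set")
```

A check like this only reports whether the flag is present. As the
regression report shows, treating the absence of "nh" as proof of THP
eligibility breaks as soon as the kernel tracks that state somewhere else.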

Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-26 09:32:44 +01:00
arch x86/topology: Use total_cpus for max logical packages calculation 2019-01-26 09:32:41 +01:00
block block: use rcu_work instead of call_rcu to avoid sleep in softirq 2019-01-22 21:40:35 +01:00
certs export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() 2018-08-22 23:21:44 +09:00
crypto crypto: ecc - regularize scalar for scalar multiplication 2019-01-26 09:32:35 +01:00
Documentation mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps 2019-01-26 09:32:44 +01:00
drivers scsi: megaraid: fix out-of-bound array accesses 2019-01-26 09:32:43 +01:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs userfaultfd: clear flag if remap event not enabled 2019-01-26 09:32:43 +01:00
include mm/swap: use nr_node_ids for avail_lists in swap_info_struct 2019-01-26 09:32:43 +01:00
init kbuild: Disable LD_DEAD_CODE_DATA_ELIMINATION with ftrace & GCC <= 4.7 2019-01-22 21:40:34 +01:00
ipc ipc/shm.c: use ERR_CAST() for shm_lock() error return 2018-10-05 16:32:04 -07:00
kernel bpf: relax verifier restriction on BPF_MOV | BPF_ALU 2019-01-26 09:32:39 +01:00
lib fix int_sqrt64() for very large numbers 2019-01-22 21:40:34 +01:00
LICENSES LICENSES: Remove CC-BY-SA-4.0 license text 2018-10-18 11:28:50 +02:00
mm mm/swap: use nr_node_ids for avail_lists in swap_info_struct 2019-01-26 09:32:43 +01:00
net netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine 2019-01-26 09:32:40 +01:00
samples samples: bpf: fix: error handling regarding kprobe_events 2019-01-26 09:32:35 +01:00
scripts kconfig: fix memory leak when EOF is encountered in quotation 2019-01-26 09:32:39 +01:00
security selinux: always allow mounting submounts 2019-01-26 09:32:36 +01:00
sound ALSA: bebob: fix model-id of unit for Apogee Ensemble 2019-01-26 09:32:42 +01:00
tools perf tools: Add missing open_memstream() prototype for systems lacking it 2019-01-26 09:32:41 +01:00
usr initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
virt KVM: arm/arm64: Fix VMID alloc race by reverting to lock-less 2019-01-16 22:04:37 +01:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS MAINTAINERS: Add Sasha as a stable branch maintainer 2018-12-01 09:37:25 +01:00
Makefile Linux 4.19.17 2019-01-22 21:40:38 +01:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.