linux-uconsole/kernel
Laurent Dufour 6d8fc36d9e FROMLIST: mm: protect mm_rb tree with a rwlock
This change is inspired by the Peter's proposal patch [1] which was
protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
that particular case, and it is introducing major performance degradation
due to excessive scheduling operations.

To allow access to the mm_rb tree without grabbing the mmap_sem, this patch
is protecting it access using a rwlock.  As the mm_rb tree is a O(log n)
search it is safe to protect it using such a lock.  The VMA cache is not
protected by the new rwlock and it should not be used without holding the
mmap_sem.

To allow the picked VMA structure to be used once the rwlock is released, a
use count is added to the VMA structure. When the VMA is allocated it is
set to 1.  Each time the VMA is picked with the rwlock held its use count
is incremented. Each time the VMA is released it is decremented. When the
use count hits zero, this means that the VMA is no more used and should be
freed.

This patch is preparing for 2 kind of VMA access :
 - as usual, under the control of the mmap_sem,
 - without holding the mmap_sem for the speculative page fault handler.

Access done under the control the mmap_sem doesn't require to grab the
rwlock to protect read access to the mm_rb tree, but access in write must
be done under the protection of the rwlock too. This affects inserting and
removing of elements in the RB tree.

The patch is introducing 2 new functions:
 - vma_get() to find a VMA based on an address by holding the new rwlock.
 - vma_put() to release the VMA when its no more used.
These services are designed to be used when access are made to the RB tree
without holding the mmap_sem.

When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and
we rely on the WMB done when releasing the rwlock to serialize the write
with the RMB done in a later patch to check for the VMA's validity.

When free_vma is called, the file associated with the VMA is closed
immediately, but the policy and the file structure remained in used until
the VMA's use count reach 0, which may happens later when exiting an
in progress speculative page fault.

[1] https://patchwork.kernel.org/patch/5108281/

Change-Id: I9ecc922b8efa4b28975cc6a8e9531284c24ac14e
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Link: https://lore.kernel.org/lkml/1523975611-15978-18-git-send-email-ldufour@linux.vnet.ibm.com/
Bug: 161210518
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
2021-01-22 18:00:57 +00:00
..
bpf bpf: Save correct stopping point in file seq iteration 2021-01-19 18:27:28 +01:00
cgroup Merge 5.10.5 into android12-5.10 2021-01-10 12:19:03 +01:00
configs
debug kdb: Fix pager search for multi-line strings 2020-10-01 14:44:08 +01:00
dma ANDROID: iommu/dma: Add support for DMA_ATTR_SYS_CACHE_ONLY_NWA 2021-01-13 18:27:04 +00:00
entry entry: Fix the incorrect ordering of lockdep and RCU check 2020-11-04 18:06:14 +01:00
events Merge 5.10.6 into android12-5.10 2021-01-13 10:28:55 +01:00
gcov gcov: add support for GCC 10.1 2020-09-11 09:33:54 -07:00
irq UPSTREAM: driver core: Add fwnode_init() 2021-01-21 18:02:19 -08:00
kcsan kernel/: fix repeated words in comments 2020-10-16 11:11:19 -07:00
livepatch kernel/: fix repeated words in comments 2020-10-16 11:11:19 -07:00
locking Merge 5.10.6 into android12-5.10 2021-01-13 10:28:55 +01:00
power Merge 65b55d4c85 ("Merge tag 'arm-soc-fixes-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc") into android-mainline 2020-11-02 11:04:03 +01:00
printk ANDROID: printk: Export symbols for loadable modules 2020-12-23 23:53:42 +00:00
rcu rcu-tasks: Move RCU-tasks initialization to before early_initcall() 2021-01-19 18:27:28 +01:00
sched ANDROID: sched: add em_cpu_energy vendor hook 2021-01-21 08:38:56 -08:00
time Merge 5.10.5 into android12-5.10 2021-01-10 12:19:03 +01:00
trace This is the 5.10.9 stable release 2021-01-19 18:49:54 +01:00
.gitignore
acct.c kernel: acct.c: fix some kernel-doc nits 2020-10-16 11:11:19 -07:00
async.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
audit.c audit: Remove redundant null check 2020-08-26 09:10:39 -04:00
audit.h audit: change unnecessary globals into statics 2020-08-17 20:26:58 -04:00
audit_fsnotify.c fsnotify: generalize handle_inode_event() 2020-12-30 11:54:18 +01:00
audit_tree.c fsnotify: generalize handle_inode_event() 2020-12-30 11:54:18 +01:00
audit_watch.c fsnotify: generalize handle_inode_event() 2020-12-30 11:54:18 +01:00
auditfilter.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
auditsc.c audit/stable-5.9 PR 20200803 2020-08-04 14:20:26 -07:00
backtracetest.c treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() 2020-07-30 11:15:58 -07:00
bounds.c
capability.c LSM: Signal to SafeSetID when setting group IDs 2020-10-13 09:17:34 -07:00
cfi.c ANDROID: add support for Clang's Control Flow Integrity (CFI) 2021-01-14 16:28:57 +00:00
compat.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
configs.c
context_tracking.c context_tracking: Ensure that the critical path cannot be instrumented 2020-06-11 15:14:36 +02:00
cpu.c ANDROID: cpuhp/pause: add trace points for pause and resume 2020-12-23 22:53:26 +00:00
cpu_pm.c notifier: Fix broken error handling pattern 2020-09-01 09:58:03 +02:00
crash_core.c kdump: append kernel build-id string to VMCOREINFO 2020-08-12 10:58:01 -07:00
crash_dump.c
cred.c
delayacct.c
dma.c
exec_domain.c
exit.c don't dump the threads that had been already exiting when zapped. 2020-10-28 16:39:49 -04:00
extable.c
fail_function.c fail_function: Remove a redundant mutex unlock 2020-11-19 11:58:16 -08:00
fork.c FROMLIST: mm: protect mm_rb tree with a rwlock 2021-01-22 18:00:57 +00:00
freezer.c
futex.c Merge 9cfd9c4599 ("Merge tag 'char-misc-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc") into android-mainline 2020-11-16 08:41:35 +01:00
gen_kheaders.sh kbuild: add variables for compression tools 2020-06-06 23:42:01 +09:00
groups.c LSM: Signal to SafeSetID when setting group IDs 2020-10-13 09:17:34 -07:00
hung_task.c kernel/hung_task.c: make type annotations consistent 2020-11-02 12:14:19 -08:00
iomem.c
irq_work.c ANDROID: Sched: Export scheduler symbols needed by vendor modules 2020-12-03 16:50:04 +00:00
jump_label.c kernel/: fix repeated words in comments 2020-10-16 11:11:19 -07:00
kallsyms.c ANDROID: kallsyms: cfi: strip hashes from static functions 2021-01-14 16:31:46 +00:00
kcmp.c exec: Transform exec_update_mutex into a rw_semaphore 2021-01-09 13:46:24 +01:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: make some symbols static 2020-08-12 10:58:02 -07:00
kexec.c LSM: Introduce kernel_post_load_data() hook 2020-10-05 13:37:03 +02:00
kexec_core.c kernel/: fix repeated words in comments 2020-10-16 11:11:19 -07:00
kexec_elf.c
kexec_file.c kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED 2020-10-16 11:11:18 -07:00
kexec_internal.h
kheaders.c
kmod.c kmod: remove redundant "be an" in the comment 2020-08-12 10:58:01 -07:00
kprobes.c kprobes: Tell lockdep about kprobe nesting 2020-11-04 09:46:06 -05:00
ksysfs.c
kthread.c ANDROID: kthread: cfi: disable callback pointer check with modules 2021-01-14 16:31:11 +00:00
latencytop.c
Makefile ANDROID: add support for Clang's Control Flow Integrity (CFI) 2021-01-14 16:28:57 +00:00
module-internal.h
module.c ANDROID: add support for Clang's Control Flow Integrity (CFI) 2021-01-14 16:28:57 +00:00
module_signature.c
module_signing.c
notifier.c notifier: Fix broken error handling pattern 2020-09-01 09:58:03 +02:00
nsproxy.c nsproxy: support CLONE_NEWTIME with setns() 2020-07-08 11:14:22 +02:00
padata.c padata: fix possible padata_works_lock deadlock 2020-09-04 17:51:55 +10:00
panic.c panic: don't dump stack twice on warn 2020-11-14 11:26:04 -08:00
params.c params: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
pid.c Merge 5.10.6 into android12-5.10 2021-01-13 10:28:55 +01:00
pid_namespace.c kernel/: fix repeated words in comments 2020-10-16 11:11:19 -07:00
profile.c
ptrace.c ptrace: Set PF_SUPERPRIV when checking capability 2020-11-17 12:53:22 -08:00
range.c kernel.h: split out min()/max() et al. helpers 2020-10-16 11:11:19 -07:00
reboot.c Merge e28c0d7c92 ("Merge branch 'akpm' (patches from Andrew)") into android-mainline 2020-11-15 14:37:09 +01:00
regset.c regset: kill ->get() 2020-07-27 14:31:12 -04:00
relay.c kernel/relay.c: drop unneeded initialization 2020-10-16 11:11:22 -07:00
resource.c kernel/resource: make iomem_resource implicit in release_mem_region_adjustable() 2020-10-16 11:11:18 -07:00
rseq.c
scftorture.c scftorture: Add cond_resched() to test loop 2020-08-24 18:38:38 -07:00
scs.c UPSTREAM: scs: switch to vmapped shadow stacks 2021-01-07 17:56:54 -08:00
seccomp.c UPSTREAM: seccomp: Remove bogus __user annotations 2020-12-21 18:48:41 +00:00
signal.c ANDROID: mm: oom_kill: reap memory of a task that receives SIGKILL 2021-01-06 21:11:02 +00:00
smp.c ANDROID: fix 0-day build-break for non-GKI 2021-01-14 11:39:55 -08:00
smpboot.c
smpboot.h
softirq.c ANDROID: softirq: Export irq_handler_exit tracepoint 2020-12-21 17:48:06 +00:00
stackleak.c stackleak: let stack_erasing_sysctl take a kernel pointer buffer 2020-09-19 13:13:39 -07:00
stacktrace.c stacktrace: Remove reliable argument from arch_stack_walk() callback 2020-09-18 14:24:16 +01:00
static_call.c static_call: Fix return type of static_call_init 2020-10-02 21:18:25 +02:00
stop_machine.c ANDROID: stop_machine: stop_one_cpu_async 2020-12-08 19:07:21 +00:00
sys.c Linux 5.10-rc1 2020-10-29 06:32:38 +01:00
sys_ni.c mm/madvise: introduce process_madvise() syscall: an external memory hinting API 2020-10-18 09:27:10 -07:00
sysctl-test.c
sysctl.c Linux 5.9-rc6 2020-09-21 12:13:45 +02:00
task_work.c task_work: cleanup notification modes 2020-10-17 15:05:30 -06:00
taskstats.c taskstats: move specifying netlink policy back to ops 2020-10-02 19:11:12 -07:00
test_kprobes.c
torture.c torture: Dump ftrace at shutdown only if requested 2020-06-29 12:01:45 -07:00
tracepoint.c tracepoint: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c usermodehelper: reset umask to default before executing user process 2020-10-06 10:31:52 -07:00
up.c
user-return-notifier.c
user.c user.c: make uidhash_table static 2020-06-04 19:06:24 -07:00
user_namespace.c kernel/: fix repeated words in comments 2020-10-16 11:11:19 -07:00
usermode_driver.c umd: Stop using split_argv 2020-07-07 11:58:59 -05:00
utsname.c
utsname_sysctl.c
watch_queue.c watch_queue: Limit the number of watches a user can hold 2020-08-17 09:39:18 -07:00
watchdog.c kernel/watchdog: fix watchdog_allowed_mask not used warning 2020-11-14 11:26:03 -08:00
watchdog_hld.c
workqueue.c UPSTREAM: workqueue: kasan: record workqueue stack 2021-01-19 21:47:26 -08:00
workqueue_internal.h