linux-uconsole/mm
Yisheng Xie 03489bfc78 mlock: fix mlock count can not decrease in race condition
commit 70feee0e1e upstream.

Kefeng reported that when running the follow test, the mlock count in
meminfo will increase permanently:

 [1] testcase
 linux:~ # cat test_mlockal
 grep Mlocked /proc/meminfo
  for j in `seq 0 10`
  do
 	for i in `seq 4 15`
 	do
 		./p_mlockall >> log &
 	done
 	sleep 0.2
 done
 # wait some time to let mlock counter decrease and 5s may not enough
 sleep 5
 grep Mlocked /proc/meminfo

 linux:~ # cat p_mlockall.c
 #include <sys/mman.h>
 #include <stdlib.h>
 #include <stdio.h>

 #define SPACE_LEN	4096

 int main(int argc, char ** argv)
 {
	 	int ret;
	 	void *adr = malloc(SPACE_LEN);
	 	if (!adr)
	 		return -1;

	 	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
	 	printf("mlcokall ret = %d\n", ret);

	 	ret = munlockall();
	 	printf("munlcokall ret = %d\n", ret);

	 	free(adr);
	 	return 0;
	 }

In __munlock_pagevec() we should decrement NR_MLOCK for each page where
we clear the PageMlocked flag.  Commit 1ebb7cc6a5 ("mm: munlock: batch
NR_MLOCK zone state updates") has introduced a bug where we don't
decrement NR_MLOCK for pages where we clear the flag, but fail to
isolate them from the lru list (e.g.  when the pages are on some other
cpu's percpu pagevec).  Since PageMlocked stays cleared, the NR_MLOCK
accounting gets permanently disrupted by this.

Fix it by counting the number of page whose PageMlock flag is cleared.

Fixes: 1ebb7cc6a5 (" mm: munlock: batch NR_MLOCK zone state updates")
Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joern Engel <joern@logfs.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: zhongjiang <zhongjiang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07 12:06:01 +02:00
..
kasan kasan: update kasan_global for gcc 7 2016-12-08 07:15:24 +01:00
backing-dev.c block: fix double-free in the failure path of cgwb_bdi_init() 2017-02-26 11:07:51 +01:00
balloon_compaction.c virtio_balloon: fix race between migration and ballooning 2016-03-03 15:07:18 -08:00
bootmem.c bootmem: avoid freeing to bootmem after bootmem is done 2015-09-08 15:35:28 -07:00
cleancache.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
cma.c mm/cma: silence warnings due to max() usage 2016-11-10 16:36:36 +01:00
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
compaction.c mm, compaction: prevent VM_BUG_ON when terminating freeing scanner 2016-08-10 11:49:25 +02:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
dmapool.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
early_ioremap.c mm/early_ioremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
fadvise.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
failslab.c mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM 2015-11-06 17:50:42 -08:00
filemap.c mm: do not access page->mapping directly on page_endio 2017-03-12 06:37:25 +01:00
frame_vector.c mm: fix docbook comment for get_vaddr_frames() 2015-11-05 19:34:48 -08:00
frontswap.c frontswap: allow multiple backends 2015-06-24 17:49:45 -07:00
gup.c mm: remove gup_flags FOLL_WRITE games from __get_user_pages() 2016-10-20 10:00:47 +02:00
highmem.c
huge_memory.c mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp 2017-05-25 14:30:16 +02:00
hugetlb.c mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd() 2017-04-08 09:53:32 +02:00
hugetlb_cgroup.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
hwpoison-inject.c hwpoison: use page_cgroup_ino for filtering by memcg 2015-09-10 13:29:01 -07:00
init-mm.c mm: Add a user_ns owner to mm_struct and fix ptrace permission checks 2017-01-06 11:16:11 +01:00
internal.h mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask 2016-08-10 11:49:25 +02:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
Kconfig mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
Kconfig.debug mm/debug_pagealloc: remove obsolete Kconfig options 2015-01-08 15:10:52 -08:00
kmemcheck.c
kmemleak-test.c
kmemleak.c mm/kmemleak.c: remove unneeded initialization of object to NULL 2015-11-05 19:34:48 -08:00
ksm.c mm,ksm: fix endless looping in allocating memory when ksm enable 2016-10-07 15:23:40 +02:00
list_lru.c mm/list_lru.c: avoid error-path NULL pointer deref 2016-11-10 16:36:32 +01:00
maccess.c mm/maccess.c: actually return -EFAULT from strncpy_from_unsafe 2015-11-05 19:34:48 -08:00
madvise.c mm: madvise allow remove operation for hugetlbfs 2015-09-08 15:35:28 -07:00
Makefile media updates for v4.3-rc1 2015-09-11 16:42:39 -07:00
memblock.c mm/memblock: make memblock_remove_range() static 2015-11-05 19:34:48 -08:00
memcontrol.c mm: memcontrol: avoid unused function warning 2017-03-18 19:09:57 +08:00
memory-failure.c mm/migrate: fix refcount handling when !hugepage_migration_supported() 2017-06-07 12:06:01 +02:00
memory.c numa: fix /proc/<pid>/numa_maps for THP 2016-05-04 14:48:49 -07:00
memory_hotplug.c base/memory, hotplug: fix a kernel oops in show_valid_zones() 2017-02-09 08:02:47 +01:00
mempolicy.c mm/mempolicy.c: fix error handling in set_mempolicy and mbind. 2017-04-12 12:38:35 +02:00
mempool.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
memtest.c memtest: remove unused header files 2015-09-08 15:35:28 -07:00
migrate.c mm: Export migrate_page_move_mapping and migrate_page_copy 2016-07-27 09:47:31 -07:00
mincore.c mm/mincore: use offset_in_page macro 2015-11-05 19:34:48 -08:00
mlock.c mlock: fix mlock count can not decrease in race condition 2017-06-07 12:06:01 +02:00
mm_init.c mm: meminit: remove mminit_verify_page_links 2015-06-30 19:44:56 -07:00
mmap.c mm: fix regression in remap_file_pages() emulation 2016-02-25 12:01:21 -08:00
mmu_context.c
mmu_notifier.c mmu-notifier: add clear_young callback 2015-09-10 13:29:01 -07:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx 2015-09-04 16:54:41 -07:00
mremap.c mm/mremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
msync.c mm/msync: use offset_in_page macro 2015-11-05 19:34:48 -08:00
nobootmem.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
nommu.c mm/nommu.c: drop unlikely inside BUG_ON() 2015-11-05 19:34:48 -08:00
oom_kill.c mm/oom_kill.c: avoid attempting to kill init sharing same memory 2015-12-12 10:15:34 -08:00
page-writeback.c writeback: use higher precision calculation in domain_dirty_limits() 2016-07-27 09:47:29 -07:00
page_alloc.c mm/page_alloc: fix nodes for reclaim in fast path 2017-03-12 06:37:25 +01:00
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
page_idle.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
page_io.c fs: use helper bio_add_page() instead of open coding on bi_io_vec 2015-08-13 12:32:00 -06:00
page_isolation.c mm: fix invalid node in alloc_migrate_target() 2016-04-20 15:41:53 +09:00
page_owner.c mm/page_owner: set correct gfp_mask on page_owner 2015-07-17 16:39:54 -07:00
pagewalk.c mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers 2015-03-25 16:20:30 -07:00
percpu-km.c
percpu-vm.c
percpu.c percpu: acquire pcpu_lock when updating pcpu_nr_empty_pop_pages 2017-03-26 12:13:20 +02:00
pgtable-generic.c mm,thp: khugepaged: call pte flush at the time of collapse 2016-02-25 12:01:23 -08:00
process_vm_access.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 12:01:16 -08:00
quicklist.c
readahead.c mm, fs: introduce mapping_gfp_constraint() 2015-11-06 17:50:42 -08:00
rmap.c mm: page migration use migration entry for swapcache too 2015-11-05 19:34:48 -08:00
shmem.c tmpfs: fix regression hang in fallocate undo 2016-07-27 09:47:40 -07:00
slab.c slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slab.h slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slab_common.c mm: memcontrol: fix cgroup creation failure after many small jobs 2016-08-16 09:30:51 +02:00
slob.c slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slub.c slub/memcg: cure the brainless abuse of sysfs attributes 2017-06-07 12:06:01 +02:00
sparse-vmemmap.c
sparse.c
swap.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c mm: swap: zswap: maybe_preload & refactoring 2015-09-08 15:35:28 -07:00
swapfile.c swapfile: fix memory corruption via malformed swapfile 2016-11-18 10:48:34 +01:00
truncate.c memcg: add per cgroup dirty page accounting 2015-06-02 08:33:33 -06:00
userfaultfd.c userfaultfd: avoid mmap_sem read recursion in mcopy_atomic 2015-09-04 16:54:41 -07:00
util.c proc: revert /proc/<pid>/maps [stack:TID] annotation 2016-09-15 08:27:46 +02:00
vmacache.c mm/vmacache: inline vmacache_valid_mm() 2015-11-05 19:34:48 -08:00
vmalloc.c mm: vmalloc: don't remove inexistent guard hole in remove_vm_area() 2015-11-20 16:17:32 -08:00
vmpressure.c mm: vmpressure: fix sending wrong events on underflow 2017-03-12 06:37:25 +01:00
vmscan.c mm/vmscan.c: set correct defer count for shrinker 2017-01-06 11:16:14 +01:00
vmstat.c vmstat: allocate vmstat_wq before it is used 2016-01-08 23:47:54 -08:00
workingset.c mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() 2016-10-28 03:01:34 -04:00
zbud.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c zsmalloc: fix zs_can_compact() integer overflow 2016-05-18 17:06:44 -07:00
zswap.c zswap: disable changing params if init fails 2017-02-09 08:02:45 +01:00