linux-uconsole/mm
Wu Fengguang 7c083ba91b readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM
commit 0141450f66 upstream.

This fixes inefficient page-by-page reads on POSIX_FADV_RANDOM.

POSIX_FADV_RANDOM used to set ra_pages=0, which leads to poor performance:
a 16K read will be carried out in 4 _sync_ 1-page reads.
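
The arithmetic behind that figure can be sketched in plain C. This is an illustrative userspace model, not kernel code: the helper names `pages_spanned` and `sync_reads_without_readahead` are hypothetical, and the 4 KiB page size is an assumption implied by "16K in 4 reads".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): number of pages touched by
 * a read of `len` bytes starting at byte `offset`.
 */
size_t pages_spanned(size_t offset, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;

    size_t first = offset / page_size;
    size_t last = (offset + len - 1) / page_size;
    return last - first + 1;
}

/*
 * Model of the old behaviour described above: with ra_pages == 0 every
 * page is fetched by its own synchronous read, so the number of read
 * requests equals the number of pages spanned.
 */
size_t sync_reads_without_readahead(size_t offset, size_t len,
                                    size_t page_size)
{
    return pages_spanned(offset, len, page_size);
}
```

Under these assumptions, `sync_reads_without_readahead(0, 16 * 1024, 4096)` evaluates to 4, matching the example in the commit message. An unaligned read can span one page more than `len / page_size`.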

In other places, ra_pages==0 means
- it's ramfs/tmpfs/hugetlbfs/sysfs/configfs
- some IO error happened
where multi-page read IO won't help or should be avoided.

POSIX_FADV_RANDOM actually wants different semantics: to disable the
*heuristic* readahead algorithm, and to use a dumb one which faithfully
submits read IO for whatever the application requests.

So introduce a flag FMODE_RANDOM for POSIX_FADV_RANDOM.
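
From the application side, the hint is requested through `posix_fadvise()`. The following is a minimal sketch of a caller, not part of this commit: the helper names `open_random` and `read_record` are hypothetical, while `open`, `posix_fadvise`, `pread` and `close` are the standard POSIX calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and mark the whole file as
 * randomly accessed. Returns the file descriptor, or -1 on failure.
 */
int open_random(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* offset 0 with len 0 covers the whole file. Note that
     * posix_fadvise() returns an error number directly instead of
     * setting errno. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
    if (err != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: read `len` bytes at `offset` without moving the
 * file position. Returns the byte count from pread(), or -1 on error.
 */
ssize_t read_record(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    return pread(fd, buf, len, offset);
}
```

A caller would use `open_random()` once and then issue `read_record()` calls at arbitrary offsets. With the semantics described above, each such read is expected to be submitted as requested rather than split into single-page reads.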

Note that the random hint is not likely to improve random read performance
noticeably.  And it may be too permissive for huge request sizes (its IO
size is not limited by read_ahead_kb).

In Quentin's report (http://lkml.org/lkml/2009/12/24/145), the overall
(NFS read) performance of the application increased by 313%!

Tested-by: Quentin Barnes <qbarnes+nfs@yahoo-inc.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: <qbarnes+nfs@yahoo-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-15 08:49:37 -07:00
allocpercpu.c percpu: use dynamic percpu allocator as the default percpu allocator 2009-06-24 15:13:35 +09:00
backing-dev.c Thaw refrigerated bdi flusher threads before invoking kthread_stop on them 2009-11-12 13:08:11 +01:00
bootmem.c kmemleak: Do not report alloc_bootmem blocks as leaks 2009-08-27 14:29:17 +01:00
bounce.c block: remove some includings of blktrace_api.h 2009-06-16 11:19:36 +02:00
debug-pagealloc.c generic debug pagealloc 2009-04-01 08:59:13 -07:00
dmapool.c dmapools: protect page_list walk in show_pools() 2009-06-30 18:56:00 -07:00
fadvise.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM 2010-03-15 08:49:37 -07:00
failslab.c kmemtrace, mm: fix slab.h dependency problem in mm/failslab.c 2009-04-03 12:23:01 +02:00
filemap.c mm: flush dcache before writing into page to avoid alias 2010-02-09 04:50:59 -08:00
filemap_xip.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
fremap.c Do not account for the address space used by hugetlbfs using VM_ACCOUNT 2009-02-10 10:48:42 -08:00
highmem.c highmem: Fix debug_kmap_atomic() to also handle KM_IRQ_PTE, KM_NMI, and KM_NMI_PTE 2009-11-10 04:15:47 +01:00
hugetlb.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
hwpoison-inject.c HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs 2009-09-16 11:50:17 +02:00
init-mm.c mm: consolidate init_mm definition 2009-06-16 19:47:28 -07:00
internal.h ksm: fix mlockfreed to munlocked 2010-01-06 15:05:22 -08:00
Kconfig NOMMU: Optimise away the {dac_,}mmap_min_addr tests 2010-01-06 15:04:30 -08:00
Kconfig.debug trivial: improve help text for mm debug config options 2009-09-21 15:14:57 +02:00
kmemcheck.c kmemcheck: add hooks for the page allocator 2009-06-15 15:48:33 +02:00
kmemleak-test.c percpu: clean up percpu variable definitions 2009-06-24 15:13:48 +09:00
kmemleak.c kmemleak: Check for NULL pointer returned by create_object() 2009-10-09 13:28:47 -07:00
ksm.c ksm: fix mlockfreed to munlocked 2010-01-06 15:05:22 -08:00
maccess.c [S390] maccess: add weak attribute to probe_kernel_write 2009-06-12 10:27:37 +02:00
madvise.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2009-09-24 07:53:22 -07:00
Makefile procfs: disable per-task stack usage on NOMMU 2009-09-24 17:11:24 -07:00
memcontrol.c memcg: ensure list is empty at rmdir 2010-01-22 15:18:01 -08:00
memory-failure.c hwpoison: fix oops on ksm pages 2009-10-29 07:39:24 -07:00
memory.c mm: sigbus instead of abusing oom 2009-12-18 14:05:51 -08:00
memory_hotplug.c mm: allow memory hotplug and hibernation in the same kernel 2009-11-17 17:40:33 -08:00
mempolicy.c do_mbind(): fix memory leak 2009-10-29 07:39:29 -07:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c Fix potential crash with sys_move_pages 2010-02-23 07:37:42 -08:00
mincore.c mm: hugetlb: fix hugepage memory leak in mincore() 2009-12-18 14:04:29 -08:00
mlock.c ksm: fix mlockfreed to munlocked 2010-01-06 15:05:22 -08:00
mm_init.c mm: mminit_loglevel cannot be __meminitdata anymore 2008-08-20 15:40:30 -07:00
mmap.c untangle the do_mremap() mess 2010-01-18 10:19:11 -08:00
mmu_context.c mm: reduce atomic use on use_mm fast path 2009-09-22 07:17:42 -07:00
mmu_notifier.c ksm: add mmu_notifier set_pte_at_notify() 2009-09-22 07:17:31 -07:00
mmzone.c [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 2009-05-18 11:22:24 +01:00
mprotect.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
mremap.c untangle the do_mremap() mess 2010-01-18 10:19:11 -08:00
msync.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
nommu.c NOMMU: Don't pass NULL pointers to fput() in do_mmap_pgoff() 2009-10-31 12:11:37 -07:00
oom_kill.c memcg: fix oom killing a child process in an other cgroup 2010-03-15 08:49:33 -07:00
page-writeback.c writeback: account IO throttling wait as iowait 2009-10-09 12:40:42 +02:00
page_alloc.c mm: fix migratetype bug which slowed swapping 2010-02-09 04:50:49 -08:00
page_cgroup.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
page_io.c mm: remove file argument from swap_readpage() 2009-06-16 19:47:44 -07:00
page_isolation.c memory hotplug: fix page_zone() calculation in test_pages_isolated() 2008-11-06 15:41:19 -08:00
pagewalk.c mm: hugetlb: fix hugepage memory leak in walk_page_range() 2009-12-18 14:04:30 -08:00
percpu.c percpu: restructure pcpu_extend_area_map() to fix bugs and improve readability 2009-11-13 00:55:35 +09:00
prio_tree.c
quicklist.c cpumask: use new-style cpumask ops in mm/quicklist. 2009-09-24 09:34:52 +09:30
readahead.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM 2010-03-15 08:49:37 -07:00
rmap.c mm/rmap.c: fix comment 2009-10-01 16:11:12 -07:00
shmem.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
shmem_acl.c shmfs: use 'check_acl' instead of 'permission' 2009-09-08 11:08:46 -07:00
slab.c slab: initialize unused alien cache entry as NULL at alloc_alien_cache(). 2010-03-15 08:49:36 -07:00
slob.c slab: remove duplicate kmem_cache_init_late() declarations 2009-08-06 11:36:25 +03:00
slub.c mm: kmem_cache_create(): make it easier to catch NULL cache names 2009-09-22 07:17:33 -07:00
sparse-vmemmap.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
sparse.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
swap.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
swap_state.c mm: add_to_swap_cache() does not return -EEXIST 2009-09-22 07:17:35 -07:00
swapfile.c mm: remove incorrect swap_count() from try_to_unuse() 2009-11-02 09:44:41 -08:00
thrash.c mm: pass mm to grab_swap_token 2009-06-23 12:50:05 -07:00
truncate.c vfs: Fix vmtruncate() regression 2010-01-22 15:18:41 -08:00
util.c untangle the do_mremap() mess 2010-01-18 10:19:11 -08:00
vmalloc.c mm: purge fragmented percpu vmap blocks 2010-02-09 04:50:58 -08:00
vmscan.c vmscan: do not evict inactive pages when skipping an active list scan 2010-01-06 15:05:21 -08:00
vmstat.c mm: vmstat: add isolate pages 2009-09-22 07:17:29 -07:00