shrink_inactive_list() scans in sc->swap_cluster_max chunks until it hits
the scan limit it was passed.
shrink_inactive_list()
{
do {
isolate_pages(swap_cluster_max)
shrink_page_list()
} while (nr_scanned < max_scan);
}
This assumes that swap_cluster_max is not bigger than the scan limit
because the latter is checked only after at least one iteration.
In shrink_all_memory() sc->swap_cluster_max is initialized to the overall
reclaim goal in the beginning but not decreased while reclaim is making
progress which leads to subsequent calls to shrink_inactive_list()
reclaiming way too much in the one iteration that is done unconditionally.
Set sc->swap_cluster_max always to the proper goal before doing
shrink_all_zones()
shrink_list()
shrink_inactive_list().
While the current shrink_all_memory() happily reclaims more than actually
requested, this patch fixes it to never exceed the goal:
unpatched
wanted=10000 reclaimed=13356
wanted=10000 reclaimed=19711
wanted=10000 reclaimed=10289
wanted=10000 reclaimed=17306
wanted=10000 reclaimed=10700
wanted=10000 reclaimed=10004
wanted=10000 reclaimed=13301
wanted=10000 reclaimed=10976
wanted=10000 reclaimed=10605
wanted=10000 reclaimed=10088
wanted=10000 reclaimed=15000
patched
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=9599
wanted=10000 reclaimed=8476
wanted=10000 reclaimed=8326
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=9919
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=9624
wanted=10000 reclaimed=10000
wanted=10000 reclaimed=10000
wanted=8500 reclaimed=8092
wanted=316 reclaimed=316
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Nigel Cunningham <ncunningham@crca.org.au>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit a79311c14e "vmscan: bail out of
direct reclaim after swap_cluster_max pages" moved the nr_reclaimed
counter into the scan control to accumulate the number of all reclaimed
pages in a reclaim invocation.
shrink_all_memory() can use the same mechanism. it increase code
consistency and redability.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit bf3f3bc5e7 (mm: don't
mark_page_accessed in fault path) only remove the mark_page_accessed() in
filemap_fault().
Therefore, swap-backed pages and file-backed pages have inconsistent
behavior. mark_page_accessed() should be removed from do_swap_page().
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanup
In almost cases, for_each_zone() is used with populated_zone(). It's
because almost function doesn't need memoryless node information.
Therefore, for_each_populated_zone() can help to make code simplify.
This patch has no functional change.
[akpm@linux-foundation.org: small cleanup]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sc.may_swap does not only influence reclaiming of anon pages but pages
mapped into pagetables in general, which also includes mapped file pages.
In shrink_page_list():
if (!sc->may_swap && page_mapped(page))
goto keep_locked;
For anon pages, this makes sense as they are always mapped and reclaiming
them always requires swapping.
But mapped file pages are skipped here as well and it has nothing to do
with swapping.
The real effect of the knob is whether mapped pages are unmapped and
reclaimed or not. Rename it to `may_unmap' to have its name match its
actual meaning more precisely.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: MinChan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no need to call for int_sqrt if argument is 0.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vmap's dirty_list is unused. It's for optimizing flushing. but Nick
didn't write the code yet. so, we don't need it until time as it is
needed.
This patch removes vmap_block's dirty_list and codes related to it.
Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In case if start_pfn overlap the upper bound no need to test end_pfn again
since we have it already trimmed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct tty_operations::proc_fops took it's place and there is one less
create_proc_read_entry() user now!
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Used for gradual switch of TTY drivers from using ->read_proc which helps
with gradual switch from ->read_proc for the whole tree.
As side effect, fix possible race condition when ->data initialized after
PDE is hooked into proc tree.
->proc_fops takes precedence over ->read_proc.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 29a814d2ee (vfs: add hooks for
ext4's delayed allocation support) exported the following functions
mpage_bio_submit()
__mpage_writepage()
for the benefit of ext4's delayed allocation support. Since commit
a1d6cc563b (ext4: Rework the
ext4_da_writepages() function), these functions are not used by the
ext4 driver anymore. However, the now unnecessary exports still
remain, and this patch removes those. Moreover, these two functions
can become static again.
The issue was spotted by namespacecheck.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... since we don't tell anyone which descriptor does the file get.
We used to, but only in case of ELF binary with a.out loader and
that stuff has been gone for a while.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Don't pull it in sched.h; very few files actually need it and those
can include directly. sched.h itself only needs forward declaration
of struct fs_struct;
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* all changes of current->fs are done under task_lock and write_lock of
old fs->lock
* refcount is not atomic anymore (same protection)
* its decrements are done when removing reference from current; at the
same time we decide whether to free it.
* put_fs_struct() is gone
* new field - ->in_exec. Set by check_unsafe_exec() if we are trying to do
execve() and only subthreads share fs_struct. Cleared when finishing exec
(success and failure alike). Makes CLONE_FS fail with -EAGAIN if set.
* check_unsafe_exec() may fail with -EAGAIN if another execve() from subthread
is in progress.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pure code move; two new helper functions for nfsd and daemonize
(unshare_fs_struct() and daemonize_fs_struct() resp.; for now -
the same code as used to be in callers). unshare_fs_struct()
exported (for nfsd, as copy_fs_struct()/exit_fs() used to be),
copy_fs_struct() and exit_fs() don't need exports anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Not because execve races with _that_ are serious - we really
need a situation when final drop of fs_struct refcount is
done by something that used to have it as current->fs.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If we use a shared legacy IRQ then our interrupt handler may be called
as soon as it is registered even though IRQs are disabled on the NIC.
Now that the legacy interrupt handler also checks for event delivery,
it may decide to schedule polling in this case. Ensure that the NAPI
context is valid but disabled at this point.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let driver depend on HAS_IOMEM to avoid build breakage on s390:
CC drivers/net/ethoc.o
drivers/net/ethoc.c: In function 'ethoc_read':
drivers/net/ethoc.c:221: error: implicit declaration of function 'ioread32'
drivers/net/ethoc.c: In function 'ethoc_write':
drivers/net/ethoc.c:226: error: implicit declaration of function 'iowrite32'
drivers/net/ethoc.c: In function 'ethoc_rx':
drivers/net/ethoc.c:405: error: implicit declaration of function 'memcpy_fromio'
drivers/net/ethoc.c: In function 'ethoc_start_xmit':
drivers/net/ethoc.c:828: error: implicit declaration of function 'memcpy_toio'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
COW means we cycle though blocks fairly quickly, and once we
free an extent on disk, it doesn't make much sense to keep the pages around.
This commit tries to immediately free the page when we free the extent,
which lowers our memory footprint significantly.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Renames and truncates are both common ways to replace old data with new
data. The filesystem can make an effort to make sure the new data is
on disk before actually replacing the old data.
This is especially important for rename, which many application use as
though it were atomic for both the data and the metadata involved. The
current btrfs code will happily replace a file that is fully on disk
with one that was just created and still has pending IO.
If we crash after transaction commit but before the IO is done, we'll end
up replacing a good file with a zero length file. The solution used
here is to create a list of inodes that need special ordering and force
them to disk before the commit is done. This is similar to the
ext3 style data=ordering, except it is only done on selected files.
Btrfs is able to get away with this because it does not wait on commits
very often, even for fsync (which use a sub-commit).
For renames, we order the file when it wasn't already
on disk and when it is replacing an existing file. Larger files
are sent to filemap_flush right away (before the transaction handle is
opened).
For truncates, we order if the file goes from non-zero size down to
zero size. This is a little different, because at the time of the
truncate the file has no dirty bytes to order. But, we flag the inode
so that it is added to the ordered list on close (via release method). We
also immediately add it to the ordered list of the current transaction
so that we can try to flush down any writes the application sneaks in
before commit.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
... and access them afterwards. Simplify rq completing code while at it.
Spotted-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
No reason to need IDE built-in to be able to compile pmac driver.
Tested to work on 2.6.29-rc8 and 2.6.28.8 with ide and pmac as modules
inside an initramfs.
Signed-off-by: Gilles Espinasse <g.esp@free.fr>
Cc: sam@ravnborg.org
Cc: benh@kernel.crashing.org
[bart: remove now superfluous IDE check]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>