Commit graph

32332 commits

Author SHA1 Message Date
Christoph Lameter
347ce434d5 [PATCH] zoned vm counters: conversion of nr_pagecache to per zone counter
Currently a single atomic variable is used to establish the size of the page
cache in the whole machine.  The zoned VM counters have the same method of
implementation as the nr_pagecache code but also allow the determination of
the pagecache size per zone.

Remove the special implementation for nr_pagecache and make it a zoned counter
named NR_FILE_PAGES.

Updates of the page cache counters are always performed with interrupts off.
We can therefore use the __ variant here.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Christoph Lameter
65ba55f500 [PATCH] zoned vm counters: convert nr_mapped to per zone counter
nr_mapped is important because it allows a determination of how many pages of
a zone are not mapped, which would allow a more efficient means of determining
when we need to reclaim memory in a zone.

We take the nr_mapped field out of the page state structure and define a new
per zone counter named NR_FILE_MAPPED (the anonymous pages will be split off
from NR_MAPPED in the next patch).

We replace the use of nr_mapped in various kernel locations.  This avoids the
looping over all processors in try_to_free_pages(), writeback, reclaim (swap +
zone reclaim).

[akpm@osdl.org: bugfix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Christoph Lameter
2244b95a7b [PATCH] zoned vm counters: basic ZVC (zoned vm counter) implementation
Per zone counter infrastructure

The counters that we currently have for the VM are split per processor.  The
processor however has not much to do with the zone these pages belong to.  We
cannot tell f.e.  how many ZONE_DMA pages are dirty.

So we are blind to potentially inbalances in the usage of memory in various
zones.  F.e.  in a NUMA system we cannot tell how many pages are dirty on a
particular node.  If we knew then we could put measures into the VM to balance
the use of memory between different zones and different nodes in a NUMA
system.  For example it would be possible to limit the dirty pages per node so
that fast local memory is kept available even if a process is dirtying huge
amounts of pages.

Another example is zone reclaim.  We do not know how many unmapped pages exist
per zone.  So we just have to try to reclaim.  If it is not working then we
pause and try again later.  It would be better if we knew when it makes sense
to reclaim unmapped pages from a zone.  This patchset allows the determination
of the number of unmapped pages per zone.  We can remove the zone reclaim
interval with the counters introduced here.

Futhermore the ability to have various usage statistics available will allow
the development of new NUMA balancing algorithms that may be able to improve
the decision making in the scheduler of when to move a process to another node
and hopefully will also enable automatic page migration through a user space
program that can analyse the memory load distribution and then rebalance
memory use in order to increase performance.

The counter framework here implements differential counters for each processor
in struct zone.  The differential counters are consolidated when a threshold
is exceeded (like done in the current implementation for nr_pageache), when
slab reaping occurs or when a consolidation function is called.

Consolidation uses atomic operations and accumulates counters per zone in the
zone structure and also globally in the vm_stat array.  VM functions can
access the counts by simply indexing a global or zone specific array.

The arrangement of counters in an array also simplifies processing when output
has to be generated for /proc/*.

Counters can be updated by calling inc/dec_zone_page_state or
_inc/dec_zone_page_state analogous to *_page_state.  The second group of
functions can be called if it is known that interrupts are disabled.

Special optimized increment and decrement functions are provided.  These can
avoid certain checks and use increment or decrement instructions that an
architecture may provide.

We also add a new CONFIG_DMA_IS_NORMAL that signifies that an architecture can
do DMA to all memory and therefore ZONE_NORMAL will not be populated.  This is
only currently set for IA64 SGI SN2 and currently only affects
node_page_state().  In the best case node_page_state can be reduced to
retrieving a single counter for the one zone on the node.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: export vm_stat[] for filesystems]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Christoph Lameter
f6ac2354d7 [PATCH] zoned vm counters: create vmstat.c/.h from page_alloc.c/.h
NOTE: ZVC are *not* the lightweight event counters.  ZVCs are reliable whereas
event counters do not need to be.

Zone based VM statistics are necessary to be able to determine what the state
of memory in one zone is.  In a NUMA system this can be helpful for local
reclaim and other memory optimizations that may be able to shift VM load in
order to get more balanced memory use.

It is also useful to know how the computing load affects the memory
allocations on various zones.  This patchset allows the retrieval of that data
from userspace.

The patchset introduces a framework for counters that is a cross between the
existing page_stats --which are simply global counters split per cpu-- and the
approach of deferred incremental updates implemented for nr_pagecache.

Small per cpu 8 bit counters are added to struct zone.  If the counter exceeds
certain thresholds then the counters are accumulated in an array of
atomic_long in the zone and in a global array that sums up all zone values.
The small 8 bit counters are next to the per cpu page pointers and so they
will be in high in the cpu cache when pages are allocated and freed.

Access to VM counter information for a zone and for the whole machine is then
possible by simply indexing an array (Thanks to Nick Piggin for pointing out
that approach).  The access to the total number of pages of various types does
no longer require the summing up of all per cpu counters.

Benefits of this patchset right now:

- Ability for UP and SMP configuration to determine how memory
  is balanced between the DMA, NORMAL and HIGHMEM zones.

- loops over all processors are avoided in writeback and
  reclaim paths. We can avoid caching the writeback information
  because the needed information is directly accessible.

- Special handling for nr_pagecache removed.

- zone_reclaim_interval vanishes since VM stats can now determine
  when it is worth to do local reclaim.

- Fast inline per node page state determination.

- Accurate counters in /sys/devices/system/node/node*/meminfo. Current
  counters are counting simply which processor allocated a page somewhere
  and guestimate based on that. So the counters were not useful to show
  the actual distribution of page use on a specific zone.

- The swap_prefetch patch requires per node statistics in order to
  figure out when processors of a node can prefetch. This patch provides
  some of the needed numbers.

- Detailed VM counters available in more /proc and /sys status files.

References to earlier discussions:
V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2
V2 http://marc.theaimsgroup.com/?l=linux-kernel&m=114980851924230&w=2
V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115014697910351&w=2
V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767318740&w=2

Performance tests with AIM7 did not show any regressions.  Seems to be a tad
faster even.  Tested on ia64/NUMA.  Builds fine on i386, SMP / UP.  Includes
fixes for s390/arm/uml arch code.

This patch:

Move counter code from page_alloc.c/page-flags.h to vmstat.c/h.

Create vmstat.c/vmstat.h by separating the counter code and the proc
functions.

Move the vm_stat_text array before zoneinfo_show.

[akpm@osdl.org: s390 build fix]
[akpm@osdl.org: HOTPLUG_CPU build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Adrian Bunk
672b2714ae [PATCH] fix ISTALLION=y
drivers/char/istallion.c: In function ‘stli_initbrds’:
drivers/char/istallion.c:4150: error: implicit declaration of function ‘stli_parsebrd’
drivers/char/istallion.c:4150: error: ‘stli_brdsp’ undeclared (first use in this function)
drivers/char/istallion.c:4150: error: (Each undeclared identifier is reported only once
drivers/char/istallion.c:4150: error: for each function it appears in.)
drivers/char/istallion.c:4164: error: implicit declaration of function ‘stli_argbrds’

While I was at it, I also removed the #ifdef MODULE around the initialation
code to allow it to perhaps work when built into the kernel and made a
needlessly global function static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Andrew Morton
e09793bb91 [PATCH] msr.c: use register_hotcpu_notifier()
register_cpu_notifier() cannot do anything in a module, in a
!CONFIG_HOTPLUG_CPU kernel.

Cc: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Ingo Molnar
1017f6afd5 [PATCH] fix platform_device_put/del mishaps
This fixes drivers/char/pc8736x_gpio.c and drivers/char/scx200_gpio.c to
use the platform_device_del/put ops correctly.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Ingo Molnar
491d525ff1 [PATCH] fix drivers/video/imacfb.c compilation
Fix build error on x86_64.  There's nothing even remotely close to
imacmp_seg in the kernel, so I removed the whole line.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Cc: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30 11:25:34 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Pavel Machek
e02169b682 remove obsolete swsusp_encrypt
Remove SWSUSP_ENCRYPT config option; it is no longer implemented.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:59:59 +02:00
Matt LaPlante
fcb4ee8852 arch/arm26/Kconfig typos
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:57:59 +02:00
Matt LaPlante
590abf09d6 Documentation/IPMI typos
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:56:29 +02:00
Matt LaPlante
3539c272f1 Kconfig: Typos in net/sched/Kconfig
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:53:46 +02:00
Paul Collins
779cbf0bbc v9fs: do not include linux/version.h
I noticed that part of v9fs was being rebuilt when version.h changed.

Signed-off-by: Paul Collins <paul@ondioline.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:50:03 +02:00
Patrick Pletscher
741c80c24b Documentation/DocBook/mtdnand.tmpl: typo fixes
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:35:56 +02:00
Adrian Bunk
d254c8f70a typo fixes: specfic -> specific
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:29:51 +02:00
Adrian Bunk
d0f19d8217 typo fixes in Documentation/networking/pktgen.txt
Three typos in only one line...

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:28:43 +02:00
Adrian Bunk
80f7228b59 typo fixes: occuring -> occurring
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:27:16 +02:00
Adrian Bunk
47bdd718c6 typo fixes: infomation -> information
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:25:18 +02:00
Adrian Bunk
fd245f0069 typo fixes: disadvantadge -> disadvantage
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:23:39 +02:00
Adrian Bunk
0418726bb5 typo fixes: aquire -> acquire
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-30 18:23:04 +02:00
Adrian Bunk
b3c2ffd534 typo fixes: mecanism -> mechanism
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:20:44 +02:00
Adrian Bunk
9aaeded72f typo fixes: bandwith -> bandwidth
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:19:55 +02:00
Adrian Bunk
27ae4104b6 fix a typo in the RTC_CLASS help text
This patch fixes a typo spotted by
Matt LaPlante <webmaster@cyberdogtech.com>.

This patch fixes kernel Bugzilla #6704.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:18:41 +02:00
Adrian Bunk
9c4d3ef7b5 smb is no longer maintained
The smb filesystem in the Linux kernel is unmaintained for years.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:17:39 +02:00
akpm@osdl.org
0a1f1ab8de ACPI: fixup memhotplug debug message
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 03:40:01 -04:00
Len Brown
02438d8771 ACPI: delete acpi_os_free(), use kfree() directly
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 03:19:10 -04:00
Patrick Mochel
d07a8577f6 ACPI: video: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:51:39 -04:00
Patrick Mochel
8a4444bf5a ACPI: thermal: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:51:38 -04:00
Patrick Mochel
1474720405 ACPI: power: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:51:37 -04:00
Patrick Mochel
432bfaba7d ACPI: pci_root: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:51:34 -04:00
Patrick Mochel
e0e4e117d4 ACPI: pci_link: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:51:31 -04:00
Patrick Mochel
579c896cc9 ACPI: fan: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:50:48 -04:00
Patrick Mochel
6c68953772 ACPI: button: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:50:47 -04:00
Patrick Mochel
39cb61e267 ACPI: battery: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:50:46 -04:00
Patrick Mochel
9453ece926 ACPI: acpi_memhotplug: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:50:43 -04:00
Patrick Mochel
1b5b8b81bd ACPI: ac: Remove unneeded acpi_handle from driver.
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:48:37 -04:00
Patrick Mochel
901302688c ACPI: video: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:46:18 -04:00
Patrick Mochel
38ba7c9ed2 ACPI: thermal: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:44 -04:00
Patrick Mochel
5fbc19efdb ACPI: power: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:43 -04:00
Patrick Mochel
2d1e0a02f1 ACPI: pci_root: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:41 -04:00
Patrick Mochel
67a7136573 ACPI: pci_link: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:36 -04:00
Patrick Mochel
dc8c2b2744 ACPI: fan: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:09 -04:00
Patrick Mochel
27b1d3e85b ACPI: button: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:05 -04:00
Patrick Mochel
3b073ec366 ACPI: battery: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:04 -04:00
Patrick Mochel
b863278523 ACPI: acpi_memhotplug: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:42:02 -04:00
Patrick Mochel
a6ba5ebef9 ACPI: ac: Use acpi_device's handle instead of driver's
Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:37:05 -04:00
Patrick Mochel
e6afa0de14 ACPI: video: add struct acpi_device to struct acpi_video_bus.
- Use it instead of acpi_bus_get_device() in acpi_video_bus_notify()
  and use the one from struct acpi_video_device in
  acpi_video_device_notify().

Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:32:25 -04:00
Patrick Mochel
4159857288 ACPI: power: add struct acpi_device to struct acpi_power_resource
- Use it instead of acpi_bus_get_device() where we can..

Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:32:21 -04:00
Patrick Mochel
8348e1b19a ACPI: thermal: add struct acpi_device to struct acpi_thermal.
- Use it instead of acpi_bus_get_device() where we can..

Signed-off-by: Patrick Mochel <mochel@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-30 02:32:17 -04:00