Make ZONE_DMA optional in core code.
- ifdef all code for ZONE_DMA and related definitions following the example
for ZONE_DMA32 and ZONE_HIGHMEM.
- Without ZONE_DMA, ZONE_HIGHMEM and ZONE_DMA32 we get to a ZONES_SHIFT of
0.
- Modify the VM statistics to work correctly without a DMA zone.
- Modify slab to not create DMA slabs if there is no ZONE_DMA.
[akpm@osdl.org: cleanup]
[jdike@addtoit.com: build fix]
[apw@shadowen.org: Simplify calculation of the number of bits we need for ZONES_SHIFT]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Values are readily available via ZVC per node and global sums.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Function is unnecessary now. We can use the summing features of the ZVCs to
get the values we need.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nr_free_pages is now a simple access to a global variable. Make it a macro
instead of a function.
The nr_free_pages now requires vmstat.h to be included. There is one
occurrence in power management where we need to add the include. Directly
refrer to global_page_state() there to clarify why the #include was added.
[akpm@osdl.org: arm build fix]
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The global and per zone counter sums are in arrays of longs. Reorder the ZVCs
so that the most frequently used ZVCs are put into the same cacheline. That
way calculations of the global, node and per zone vm state touches only a
single cacheline. This is mostly important for 64 bit systems were one 128
byte cacheline takes only 8 longs.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is again simplifies some of the VM counter calculations through the use
of the ZVC consolidated counters.
[michal.k.k.piotrowski@gmail.com: build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The determination of the dirty ratio to determine writeback behavior is
currently based on the number of total pages on the system.
However, not all pages in the system may be dirtied. Thus the ratio is always
too low and can never reach 100%. The ratio may be particularly skewed if
large hugepage allocations, slab allocations or device driver buffers make
large sections of memory not available anymore. In that case we may get into
a situation in which f.e. the background writeback ratio of 40% cannot be
reached anymore which leads to undesired writeback behavior.
This patchset fixes that issue by determining the ratio based on the actual
pages that may potentially be dirty. These are the pages on the active and
the inactive list plus free pages.
The problem with those counts has so far been that it is expensive to
calculate these because counts from multiple nodes and multiple zones will
have to be summed up. This patchset makes these counters ZVC counters. This
means that a current sum per zone, per node and for the whole system is always
available via global variables and not expensive anymore to calculate.
The patchset results in some other good side effects:
- Removal of the various functions that sum up free, active and inactive
page counts
- Cleanup of the functions that display information via the proc filesystem.
This patch:
The use of a ZVC for nr_inactive and nr_active allows a simplification of some
counter operations. More ZVC functionality is used for sums etc in the
following patches.
[akpm@osdl.org: UP build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With CONFIG_SPARSEMEM=y:
mm/rmap.c:579: warning: format '%lx' expects type 'long unsigned int', but argument 2 has type 'int'
Make __page_to_pfn() return unsigned long.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the last vestiges of the long-deprecated "MAP_ANON" page protection
flag: use "MAP_ANONYMOUS" instead.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following patch and script moves the arch/arm/mach-s3c2410
directory into arch/arm/plat-s3c24xx for the generic core code
and inti arch/arm/mach-s3c{cpu} for the cpu/machine support files
Include directory include/asm-arm/plat-s3c24xx is added for the
core include files.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The PAGE_* user page protection macros don't take into account the
configured memory policy and other architecture specific bits like
the global/ASID and shared mapping bits. Instead of constants let
these depend on a variable fixed up at init just like PAGE_KERNEL.
Signed-off-by: Imre Deak <imre.deak@solidboot.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds the support for the L210/L220 (outer) cache
controller. The cache range operations are done by index/way since L2
cache controller only accepts physical addresses.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is kind of hokey, we could use the hardware provided facilities
much better.
MSIs are assosciated with MSI Queues. MSI Queues generate interrupts
when any MSI assosciated with it is signalled. This suggests a
two-tiered IRQ dispatch scheme:
MSI Queue interrupt --> queue interrupt handler
MSI dispatch --> driver interrupt handler
But we just get one-level under Linux currently. What I'd like to do
is possibly stick the IRQ actions into a per-MSI-Queue data structure,
and dispatch them form there, but the generic IRQ layer doesn't
provide a way to do that right now.
So, the current kludge is to "ACK" the interrupt by processing the
MSI Queue data structures and ACK'ing them, then we run the actual
handler like normal.
We are wasting a lot of useful information, for example the MSI data
and address are provided with ever MSI, as well as a system tick if
available. If we could pass this into the IRQ handler it could help
with certain things, in particular for PCI-Express error messages.
The MSI entries on sparc64 also tell you exactly which bus/device/fn
sent the MSI, which would be great for error handling when no
registered IRQ handler can service the interrupt.
We override the disable/enable IRQ chip methods in sun4v_msi, so we
have to call {mask,unmask}_msi_irq() directly from there. This is
another ugly wart.
Signed-off-by: David S. Miller <davem@davemloft.net>
Some partitioning systems create special partitions that
span the entire disk. One example are Sun partitions, and
this whole-disk partition exists to tell the firmware the
extent of the entire device so it can load the boot block
and do other things.
Such partitions should not be treated as normal partitions,
because all the other partitions overlap this whole-disk one.
So we'd see multiple instances of the same UUID etc. which
we do not want. udev and friends can thus search for this
'whole_disk' attribute and use it to decide to ignore the
partition.
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This last patch (but not least :) ) finally moves the next pointer at
the end of struct dst_entry. This permits to perform route cache
lookups with a minimal cost of one cache line per entry, instead of
two.
Both 32bits and 64bits platforms benefit from this new layout.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the next pointer from 'struct dn_route.u' union,
and renames u.rt_next to u.dst.dn_next.
It also moves 'struct flowi' right after 'struct dst_entry' to prepare
speedup lookups.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the next pointer from 'struct rt6_info.u' union,
and renames u.next to u.dst.rt6_next.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the rt_next pointer from 'struct rtable.u' union,
and renames u.rt_next to u.dst_rt_next.
It also moves 'struct flowi' right after 'struct dst_entry' to prepare
the gain on lookups.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces an anonymous union to nicely express the fact that all
objects inherited from struct dst_entry should access to the generic 'next'
pointer but with appropriate type verification.
This patch is a prereq before following patches.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yet another attempt to resolve cpufreq and hotplug locking issues.
Patchset has 3 patches:
* Rewrite the lock infrastructure of cpufreq using a per cpu rwsem.
* Minor restructuring of work callback in ondemand driver.
* Use the new cpufreq rwsem infrastructure in ondemand work.
This patch:
Convert policy->lock to rwsem and move it to per_cpu area.
This rwsem will protect against both changing/accessing policy
related parameters and CPU hot plug/unplug.
[malattia@linux.it: fix oops in kref_put()]
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
All the information in the MIPS c0_status register is priviledged.
Nothing that would constitute part of the thread context.
The one flag one could possibly argument about might be c0_status.fr
but none of the ABIs or tools or application software can make use
of it.
So for consistency with restore_sigcontext32(), which does not
restore c0_status register, this patch remove the saving part.
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
late_initcall() is too late for acpi_sleep_init().
Call it directly from acpi_init code.
http://bugzilla.kernel.org/show_bug.cgi?id=7887
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Signed-off-by: Vladimir Lebedev <vladimir.p.lebedev@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The ATA_ENABLE_PATA define was never meant to be permanent, and in
recent kernels, it's already been unconditionally enabled. Remove.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch is against the libata core and headers.
Two IRQ calls are added in ata_port_operations.
- irq_on() is used to enable interrupts.
- irq_ack() is used to acknowledge a device interrupt.
In most drivers, ata_irq_on() and ata_irq_ack() are used for
irq_on and irq_ack respectively.
In some drivers (ex: ahci, sata_sil24) which cannot use them
as is, ata_dummy_irq_on() and ata_dummy_irq_ack() are used.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Akira Iguchi <akira2.iguchi@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
In file included from drivers/infiniband/hw/ipath/ipath_diag.c:44:
include/linux/io.h:35: warning: 'struct device' declared inside parameter list
include/linux/io.h:35: warning: its scope is only this definition or declaration
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Convert libata core layer and LLDs to use iomap.
* managed iomap is used. Pointer to pcim_iomap_table() is cached at
host->iomap and used through out LLDs. This basically replaces
host->mmio_base.
* if possible, pcim_iomap_regions() is used
Most iomap operation conversions are taken from Jeff Garzik
<jgarzik@pobox.com>'s iomap branch.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement pcim_iomap_regions(). This function takes mask of BARs to
request and iomap. No BAR should have length of zero. BARs are
iomapped using pcim_iomap_table().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Now that all LLDs are converted to use devres, default stop callbacks
are unused. Remove them.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update libata core layer to use devres.
* ata_device_add() acquires all resources in managed mode.
* ata_host is allocated as devres associated with ata_host_release.
* Port attached status is handled as devres associated with
ata_host_attach_release().
* Initialization failure and host removal is handedl by releasing
devres group.
* Except for ata_scsi_release() removal, LLD interface remains the
same. Some functions use hacky is_managed test to support both
managed and unmanaged devices. These will go away once all LLDs are
updated to use devres.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement ata_host_detach() which calls ata_port_detach() for each
port in the host and export it. ata_port_detach() is now internal and
thus un-exported. ata_host_detach() will be used as the 'deregister
from libata layer' function after devres conversion.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement device resource management, in short, devres. A device
driver can allocate arbirary size of devres data which is associated
with a release function. On driver detach, release function is
invoked on the devres data, then, devres data is freed.
devreses are typed by associated release functions. Some devreses are
better represented by single instance of the type while others need
multiple instances sharing the same release function. Both usages are
supported.
devreses can be grouped using devres group such that a device driver
can easily release acquired resources halfway through initialization
or selectively release resources (e.g. resources for port 1 out of 4
ports).
This patch adds devres core including documentation and the following
managed interfaces.
* alloc/free : devm_kzalloc(), devm_kzfree()
* IO region : devm_request_region(), devm_release_region()
* IRQ : devm_request_irq(), devm_free_irq()
* DMA : dmam_alloc_coherent(), dmam_free_coherent(),
dmam_declare_coherent_memory(), dmam_pool_create(),
dmam_pool_destroy()
* PCI : pcim_enable_device(), pcim_pin_device(), pci_is_managed()
* iomap : devm_ioport_map(), devm_ioport_unmap(), devm_ioremap(),
devm_ioremap_nocache(), devm_iounmap(), pcim_iomap_table(),
pcim_iomap(), pcim_iounmap()
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata used two separate sets of variables to record request size and
current offset for ATA and ATAPI. This is confusing and fragile.
This patch replaces qc->nsect/cursect with qc->nbytes/curbytes and
kills them. Also, ata_pio_sector() is updated to use bytes for
qc->cursg_ofs instead of sectors. The field used to be used in bytes
for ATAPI and in sectors for ATA.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Handle pci_enable_device() failure while resuming. This patch kills
the "ignoring return value of 'pci_enable_device'" warning message and
propagates __must_check through ata_pci_device_do_resume().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Kill _OFS suffixes in ATA_ID_{SERNO|FW_REV|PROD}_OFS for consistency
with other ATA_ID_* constants.
* Kill ATA_SERNO_LEN
* Add and use ATA_ID_SERNO_LEN, ATA_ID_FW_REV_LEN and ATA_ID_PROD_LEN.
This change also makes ata_device_blacklisted() use proper length
for fwrev.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Updated diff which doesn't move the comment as per Jeff's request and
corrects the docs as per report on l/k
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add a driver for the IT8213 which is a single channel ICH-ish PATA
controller. As it is very different to the IT8211/2 it gets its own
driver. There is a legacy drivers/ide driver also available and I'll post
that once I get time to test it all out (probably early January). If
anyone else needs the drivers/ide driver and wants to do the merge for
drivers/ide (Bart ??) then I'll forward it.
[akpm@osdl.org: add PCI ID, constify needed_pio[]]
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove include of asm/system.h, not needed.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: fix host device registering when nodemgr disabled
ieee1394: video1394: DMA fix
ieee1394: raw1394: prevent unloading of low-level driver
ieee1394: dv1394: tidy up card removal
ieee1394: dv1394: fix CardBus card ejection
ieee1394: sbp2: lower block queue alignment requirement
ieee1394: sbp2: remove bogus "emulated" host flag
ieee1394: save one word in struct hpsb_host
ieee1394: restore config ROM when resuming
ieee1394: ohci1394: drop pcmcia-cs compatibility code
ieee1394: nodemgr: check info_length in ROM header earlier
the scheduled IEEE1394_OUI_DB removal
the scheduled IEEE1394_EXPORT_FULL_API removal
ieee1394: sbp2: use a better wildcard for blacklist
Add PCI class ID for firewire OHCI controllers.
ieee1394: modified csr1212_key_id_type_map to support lisight
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-apm:
[APM] SH: Convert to use shared APM emulation.
[APM] MIPS: Convert to use shared APM emulation.
[APM] ARM: Convert to use shared APM emulation.
[APM] Add shared version of APM emulation
This patch adds a utility function install_special_mapping, for creating a
special vma using a fixed set of preallocated pages as backing, such as for a
vDSO. This consolidates some nearly identical code used for vDSO mapping
reimplemented for different architectures.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>