Commit graph

14984 commits

Author SHA1 Message Date
Glauber de Oliveira Costa
c49c5330c9 [PATCH] x86-64: Remove fastcall references in x86_64 code
Unlike x86, x86_64 already passes arguments in registers.  The use of
regparm attribute makes no difference in produced code, and the use of
fastcall just bloats the code.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:22 +01:00
Rohit Seth
53fee04f31 [PATCH] x86-64: Fix fake numa for x86_64 machines with big IO hole
This patch resolves the issue of running with numa=fake=X on kernel command
line on x86_64 machines that have big IO hole.  While calculating the size
of each node now we look at the total hole size in that range.

Previously there were nodes that only had IO holes in them causing kernel
boot problems.  We now use the NODE_MIN_SIZE (64MB) as the minimum size of
memory that any node must have.  We reduce the number of allocated nodes if
the number of nodes specified on kernel command line results in any node
getting memory smaller than NODE_MIN_SIZE.

This change allows the extra memory to be incremented in NODE_MIN_SIZE
granule and uniformly distribute among as many nodes (called big nodes) as
possible.

[akpm@osdl.org: build fix]
Signed-off-by: David Rientjes <reintjes@google.com>
Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Rohit Seth <rohitseth@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:22 +01:00
Ingo Molnar
f9690982b8 [PATCH] i386: improve sched_clock() on i686
Clean up sched_clock() on i686: it will use the TSC if available and falls
back to jiffies only if the user asked for it to be disabled via notsc or
the CPU calibration code didnt figure out the right cpu_khz.

This generally makes the scheduler timestamps more finegrained, on all
hardware.  (the current scheduler is pretty resistant against asynchronous
sched_clock() values on different CPUs, it will allow at most up to a jiffy
of jitter.)

Also simplify sched_clock()'s check for TSC availability: propagate the
desire and ability to use the TSC into the tsc_disable flag, previously
this flag only indicated whether the notsc option was passed.  This makes
the rare low-res sched_clock() codepath a single branch off a read-mostly
flag.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Stephane Eranian
2ff2d3d747 [PATCH] i386: add idle notifier
Add a notifier mechanism to the low level idle loop.  You can register a
callback function which gets invoked on entry and exit from the low level idle
loop.  The low level idle loop is defined as the polling loop, low-power call,
or the mwait instruction.  Interrupts processed by the idle thread are not
considered part of the low level loop.

The notifier can be used to measure precisely how much is spent in useless
execution (or low power mode).  The perfmon subsystem uses it to turn on/off
monitoring.

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Zachary Amsden
7b35520243 [PATCH] i386: Profile pc badness
Profile_pc was broken when using paravirtualization because the
assumption the kernel was running at CPL 0 was violated, causing
bad logic to read a random value off the stack.

The only way to be in kernel lock functions is to be in kernel
code, so validate that assumption explicitly by checking the CS
value.  We don't want to be fooled by BIOS / APM segments and
try to read those stacks, so only match KERNEL_CS.

I moved some stuff in segment.h to make it prettier.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:21 +01:00
Zachary Amsden
bbab4f3bb7 [PATCH] i386: vMI timer patches
VMI timer code.  It works by taking over the local APIC clock when APIC is
configured, which requires a couple hooks into the APIC code.  The backend
timer code could be commonized into the timer infrastructure, but there are
some pieces missing (stolen time, in particular), and the exact semantics of
when to do accounting for NO_IDLE need to be shared between different
hypervisors as well.  So for now, VMI timer is a separate module.

[Adrian Bunk: cleanups]

Subject: VMI timer patches
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden
7ce0bcfd16 [PATCH] i386: vMI backend for paravirt-ops
Fairly straightforward implementation of VMI backend for paravirt-ops.

[Adrian Bunk: some cleanups]

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden
ae5da273fe [PATCH] i386: SMP boot hook for paravirt
Add VMI SMP boot hook.  We emulate a regular boot sequence and use the same
APIC IPI initiation, we just poke magic values to load into the CPU state when
the startup IPI is received, rather than having to jump through a real mode
trampoline.

This is all that was needed to get SMP to work.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden
9226d125d9 [PATCH] i386: paravirt CPU hypercall batching mode
The VMI ROM has a mode where hypercalls can be queued and batched.  This turns
out to be a significant win during context switch, but must be done at a
specific point before side effects to CPU state are visible to subsequent
instructions.  This is similar to the MMU batching hooks already provided.
The same hooks could be used by the Xen backend to implement a context switch
multicall.

To explain a bit more about lazy modes in the paravirt patches, basically, the
idea is that only one of lazy CPU or MMU mode can be active at any given time.
 Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of
multiple PTE updates (say, inside a remap loop), but to avoid keeping some
kind of state machine about when to flush cpu or mmu updates, we just allow
one or the other to be active.  Although there is no real reason a more
comprehensive scheme could not be implemented, there is also no demonstrated
need for this extra complexity.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden
c119ecce89 [PATCH] MM: page allocation hooks for VMI backend
The VMI backend uses explicit page type notification to track shadow page
tables.  The allocation of page table roots is especially tricky.  We need to
clone the root for non-PAE mode while it is protected under the pgd lock to
correctly copy the shadow.

We don't need to allocate pgds in PAE mode, (PDPs in Intel terminology) as
they only have 4 entries, and are cached entirely by the processor, which
makes shadowing them rather simple.

For base page table level allocation, pmd_populate provides the exact hook
point we need.  Also, we need to allocate pages when splitting a large page,
and we must release pages before returning the page to any free pool.

Despite being required with these slightly odd semantics for VMI, Xen also
uses these hooks to determine the exact moment when page tables are created or
released.

AK: All nops for other architectures

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Eric Dumazet
5809f9d442 [PATCH] x86-64: get rid of ARCH_HAVE_XTIME_LOCK
ARCH_HAVE_XTIME_LOCK is used by x86_64 arch .  This arch needs to place a
read only copy of xtime_lock into vsyscall page.  This read only copy is
named __xtime_lock, and xtime_lock is defined in
arch/x86_64/kernel/vmlinux.lds.S as an alias.  So the declaration of
xtime_lock in kernel/timer.c was guarded by ARCH_HAVE_XTIME_LOCK define,
defined to true on x86_64.

We can get same result with _attribute__((weak)) in the declaration. linker
should do the job.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Jeremy Fitzhardinge
464d1a78fb [PATCH] i386: Convert i386 PDA code to use %fs
Convert the PDA code to use %fs rather than %gs as the segment for
per-processor data.  This is because some processors show a small but
measurable performance gain for reloading a NULL segment selector (as %fs
generally is in user-space) versus a non-NULL one (as %gs generally is).

On modern processors the difference is very small, perhaps undetectable.
Some old AMD "K6 3D+" processors are noticably slower when %fs is used
rather than %gs; I have no idea why this might be, but I think they're
sufficiently rare that it doesn't matter much.

This patch also fixes the math emulator, which had not been adjusted to
match the changed struct pt_regs.

[frederik.deweerdt@gmail.com: fixit with gdb]
[mingo@elte.hu: Fix KVM too]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Ian Campbell <Ian.Campbell@XenSource.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Zachary Amsden <zach@vmware.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:20 +01:00
Amul Shah
076422d2af [PATCH] x86-64: Allocate the NUMA hash function nodemap dynamically
Remove the statically allocated memory to NUMA node hash map in favor of a
dynamically allocated memory to node hash map (it is cache aligned).

This patch has the nice side effect in that it allows the hash map to grow
for systems with large amounts of memory (256GB - 1TB), but suffer from
having small PCI space tacked onto the boot node (which is somewhere
between 192MB to 512MB on the ES7000).

Signed-off-by: Amul Shah <amul.shah@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Rohit Seth <rohitseth@google.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:19 +01:00
Andi Kleen
0812a579c9 [PATCH] x86-64: Add __copy_from_user_nocache
This does user copies in fs write() into the page cache with write combining.
This pushes the destination out of the CPU's cache, but allows higher bandwidth
in some case.

The theory is that the page cache data is usually not touched by the
CPU again and it's better to not pollute the cache with it. Also it is a little
faster.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:19 +01:00
Len Brown
6eb87fed52 ACPI: acpi_table_parse_madt_family() is not MADT specific
acpi_table_parse_madt_family() is also used to parse SRAT entries.
So re-name it to acpi_table_parse_entries(), and re-name the
madt-specific variables within it accordingly.

cosmetic only.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-13 02:58:52 -05:00
Len Brown
5a8765a84c ACPI: acpi_madt_entry_handler() is not MADT specific
acpi_madt_entry_handler() is also used for the SRAT,
so re-name it acpi_table_entry_handler().

cosmetic only.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-13 02:58:52 -05:00
Trond Myklebust
d9bc125caf Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	net/sunrpc/auth_gss/gss_krb5_crypto.c
	net/sunrpc/auth_gss/gss_spkm3_token.c
	net/sunrpc/clnt.c

Merge with mainline and fix conflicts.
2007-02-12 22:43:25 -08:00
Chuck Lever
43d78ef2ba NFS: disconnect before retrying NFSv4 requests over TCP
RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure.  Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.

Implement this by adding an rpc_clnt flag that an ULP can use to
specify that the underlying transport should be disconnected on a
major timeout.  The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.

Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.

See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-12 22:40:45 -08:00
Rusty Russell
a795ca5852 ACPI: cleanup: make disable_acpi() valid w/o CONFIG_ACPI
Len Brown <lenb@kernel.org> said:
> Okay, but better to use disable_acpi()
> indeed, since this would be the first code not already inside CONFIG_ACPI
> to invoke disable_acpi(), we could define the inline as empty and you could
> then scratch the #ifdef too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-13 00:09:13 -05:00
Stephen Rothwell
9b96ea662b [POWERPC] Wire up sys_getcpu
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-13 15:35:52 +11:00
Stefan Roese
ab9367e38f [POWERPC] ppc: Add support for AMCC Taishan 440GX eval board
This patch adds support for the AMCC Taishan PPC440GX evaluation
board.

This is still an arch/ppc port. I'm aware that the move of
4xx to arch/powerpc is making good progress right now. So this
patch is mainly intended to make the Taishan support available
for the community right now.

Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-13 15:35:52 +11:00
Benjamin Herrenschmidt
7ac9a13717 [POWERPC] Fix vDSO page count calculation
The recent vDSO consolidation patches broke powerpc due to a mistake
in the definition of MAXPAGES constants. This fixes it by moving to
a dynamically allocated array of pages instead as I don't like much
hard coded size limits. Also move the vdso initialisation to an initcall
since it doesn't really need to be done -that- early.

Applogies for not catching the breakage earlier, Roland _did_ CC me on
his patches a while ago, I got busy with other things and forgot to test
them.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-13 15:35:52 +11:00
Pavel Fedin
9ea8b7c96f [POWERPC] Virtual DMA support for floppy driver for new powerpc architecture
During ppc64+ppc merge virtual DMA code for floppy driver was not
ported.  This patch restores virtual DMA support for floppy in new
powerpc target.

It is necessary at least on Pegasos and AmigaOne machines for the
floppy drive to function.  ISA DMA controller works incorrectly there
due to its addressing limitations.

Virtual DMA mode is activated by floppy=nodma option passed to the
kernel (or module).  There's no automatic switch like on i386.

Signed-off-by: Pavel Fedin <sonic_amiga@rambler.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-13 15:35:52 +11:00
Paul Mackerras
fe6af6faec Merge branch 'for_paulus' of master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc 2007-02-13 13:28:00 +11:00
Paul Mundt
c7666e72cf sh: define dma noncoherent API functions.
sh was missing these, too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 11:11:22 +09:00
Paul Mundt
fe82891750 sh: Missing flush_dcache_all() proto in cacheflush.h.
Some boards need this, so provide a definition.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 11:09:15 +09:00
Paul Mundt
f5df54dc2e sh: Add cpu-features header to asm/Kbuild.
This is used by the libc for parsing CPU capability flags passed
via the ELF auxvt, needed for run-time selection of atomic opcodes
amongst other things.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:46 +09:00
Paul Mundt
a5ba7d5453 sh: Move __KERNEL__ up in asm/page.h.
This was breaking the uClibc build, which triggered the bogus page
size error.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:46 +09:00
Paul Mundt
b37814352d sh: Fix syscall numbering breakage.
We accidentally broke the inotify syscalls, fix those up again.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:46 +09:00
Paul Mundt
ea9af69481 sh: Local TLB flushing variants for SMP prep.
Rename the existing flush routines to local_ variants for use by
the IPI-backed global flush routines on SMP.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:45 +09:00
Paul Mundt
11c1965687 sh: Fixup cpu_data references for the non-boot CPUs.
There are a lot of bogus cpu_data-> references that only end up working
for the boot CPU, convert these to current_cpu_data to fixup SMP.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:45 +09:00
Paul Mundt
aec5e0e1c1 sh: Use a per-cpu ASID cache.
Previously this was implemented using a global cache, cache
this per-CPU instead and bump up the number of context IDs to
match NR_CPUS.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:45 +09:00
Manuel Lauss
9f8a5e3a44 sh: SH-DMAC compile fixes
This patch does the following:
- remove the make_ipr_irq stuff from dma-sh.c and replace it
  with a simple channel<->irq mapping table.
- add DMTEx_IRQ constants for sh4 cpus
- fix sh7751 DMAE irq number

The SH7780 uses the same IRQs for DMA as other SH4 types, so
I put the constants on top of the dma.h file.

Other CPU types need to #define their own DMTEx_IRQ contants
in their appropriate header.

Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:45 +09:00
Paul Mundt
26b7a78c55 sh: Lazy dcache writeback optimizations.
This converts the lazy dcache handling to the model described in
Documentation/cachetlb.txt and drops the ptep_get_and_clear() hacks
used for the aliasing dcaches on SH-4 and SH7705 in 32kB mode. As a
bonus, this slightly cuts down on the cache flushing frequency.

With that and the PTEA handling out of the way, the update_mmu_cache()
implementations can be consolidated, and we no longer have to worry
about which configuration the cache is in for the SH7705 case.

And finally, explicitly disable the lazy writeback on SMP (SH-4A).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:44 +09:00
Paul Mundt
7a847f8190 sh: More tidying for large base pages.
There were a few more things that needed fixing up, namely THREAD_SIZE
and the TLB miss handler where certain PTRS_PER_PGD == PTRS_PER_PTE
assumptions were being made.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:44 +09:00
SUGIOKA Toshinobu
106dac130d sh: syscall 300 should be __NR_fstatat64.
syscall number 300 fails while testing with latest LTP
(ltp-full-20061121.tgz) on sh.

sys_fstatat64 is called on syscall 300 (see arch/sh/kernel/syscalls.S),
and __ARCH_WANT_STAT64 is defined in include/asm-sh/unistd.h, so
following patch seems correct.

Signed-off-by: SUGIOKA Toshinobu <sugioka@itonet.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:44 +09:00
Paul Mundt
f413d0d9fa sh: Use a jump call table for debug trap handlers.
This rips out most of the needlessly complicated sh_bios and kgdb
trap handling, and forces it all through a common fast dispatch path.
As more debug traps are inserted, it's important to keep them in sync
for all of the parts, not just SH-3/4.

As the SH-2 parts are unable to do traps in the >= 0x40 range, we
restrict the debug traps to the 0x30-0x3f range on all parts, and
also bump the kgdb breakpoint trap down in to this range (from 0xff
to 0x3c) so it's possible to use for nommu.

Optionally, this table can be padded out to catch spurious traps for
SH-3/4, but we don't do that yet..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-13 10:54:43 +09:00
Linus Torvalds
ec2f9d1331 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC]: Re-export saved_command_line to modules.
  [SPARC64]: Increase command line size to 2048 like other arches.
  [SPARC64]: We do not need ZONE_DMA.
2007-02-12 15:35:30 -08:00
David S. Miller
b5ba1b31c7 [SPARC64]: Increase command line size to 2048 like other arches.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 15:15:47 -08:00
Matt Reimer
b05f87172f [ARM] 4168/1: S3C24XX: use defines instead of numbers
Use defines instead of numbers.

Signed-off-by: Matt Reimer <mreimer@vpop.net>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-02-12 22:22:09 +00:00
Russell King
f454aa6b90 [ARM] Provide dummy noncoherent DMA API
We don't currently support the noncoherent DMA API, but it needs to
be provided for kernels with devres to link.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-02-12 19:26:05 +00:00
Patrick McHardy
fe3eb20c1a [NETFILTER]: nf_conntrack: change nf_conntrack_l[34]proto_unregister to void
No caller checks the return value, and since its usually called within the
module unload path there's nothing a module could do about errors anyway,
so BUG on invalid conditions and return void.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 11:14:28 -08:00
Patrick McHardy
c0e912d7ed [NETFILTER]: nf_conntrack: fix invalid conntrack statistics RCU assumption
NF_CT_STAT_INC assumes rcu_read_lock in nf_hook_slow disables
preemption as well, making it legal to use __get_cpu_var without
disabling preemption manually. The assumption is not correct anymore
with preemptable RCU, additionally we need to protect against softirqs
when not holding nf_conntrack_lock.

Add NF_CT_STAT_INC_ATOMIC macro, which disables local softirqs,
and use where necessary.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 11:13:43 -08:00
Patrick McHardy
abbaccda4c [NETFILTER]: ip_conntrack: fix invalid conntrack statistics RCU assumption
CONNTRACK_STAT_INC assumes rcu_read_lock in nf_hook_slow disables
preemption as well, making it legal to use __get_cpu_var without
disabling preemption manually. The assumption is not correct anymore
with preemptable RCU, additionally we need to protect against softirqs
when not holding ip_conntrack_lock.

Add CONNTRACK_STAT_INC_ATOMIC macro, which disables local softirqs,
and use where necessary.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 11:13:14 -08:00
Patrick McHardy
923f4902fe [NETFILTER]: nf_conntrack: properly use RCU API for nf_ct_protos/nf_ct_l3protos arrays
Replace preempt_{enable,disable} based RCU by proper use of the
RCU API and add missing rcu_read_lock/rcu_read_unlock calls in
all paths not obviously only used within packet process context
(nfnetlink_conntrack).
  
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 11:12:57 -08:00
Patrick McHardy
e92ad99c78 [NETFILTER]: nf_log: minor cleanups
- rename nf_logging to nf_loggers since its an array of registered loggers

- rename nf_log_unregister_logger() to nf_log_unregister() to make it
  symetrical to nf_log_register() and convert all users

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 11:11:55 -08:00
Patrick McHardy
9dc6aa5fcf [NETFILTER]: nf_log: make nf_log_unregister_pf return void
Since the only user of nf_log_unregister_pf (nfnetlink_log) doesn't
check the return value, change it to void and bail out silently when
a non-existant address family is supplied.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 11:11:24 -08:00
Linus Torvalds
ebaf0c6032 Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] remove __io_virt and mmiowb.
  [S390] cio: use ARRAY_SIZE in device_id.c
  [S390] cio: Fixup interface for setting options on ccw devices.
  [S390] smp_call_function/smp_call_function_on locking.
2007-02-12 09:57:44 -08:00
Josef 'Jeff' Sipek
ee9b6d61a2 [PATCH] Mark struct super_operations const
This patch is inspired by Arjan's "Patch series to mark struct
file_operations and struct inode_operations const".

Compile tested with gcc & sparse.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:47 -08:00
Arjan van de Ven
c5ef1c42c5 [PATCH] mark struct inode_operations const 3
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00