Linus noted (based on Eric Dumazet's numbers) that we would
probably be better off not trying an atomic_read() in
atomic64_add_return() but intead intentionally let the first
cmpxchg8b fail - to get a cache-friendly 'give me ownership
of this cacheline' transaction. That can then be followed
by the real cmpxchg8b which sets the value local to the CPU.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rewrite cmpxchg8b() to not use %edi register but a generic "+m"
constraint, to increase compiler freedom in code generation and
possibly better code.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that the 32-bit version of atomic64_read() was
being overly complex with re-reading the value and doing a
retry loop over that.
Instead we can just rely on cmpxchg8b returning either the new
value or returning the current value.
We can use any 'old' value, which will be faster as it can be
loaded via immediates. Using some value that is not equal to
the real value in memory the instruction gets faster.
This also has the advantage that the CPU could avoid dirtying
the cacheline.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noted that the atomic64_t primitives are all inlines
currently which is crazy because these functions have a large
register footprint anyway.
Move them to a separate file: arch/x86/lib/atomic64_32.c
Also, while at it, rename all uses of 'unsigned long long' to
the much shorter u64.
This makes the appearance of the prototypes a lot nicer - and
it also uncovered a few bugs where (yet unused) API variants
had 'long' as their return type instead of u64.
[ More intrusive changes are not yet done in this patch. ]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Locked instructions on two cache lines at once are painful. If
atomic64_t uses two cache lines, my test program is 10x slower.
The chance for that is significant: 4/32 or 12.5%.
Make sure an atomic64_t is 8 bytes aligned.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
[ changed it to __aligned(8) as per Andrew's suggestion ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Certain versions of GCC dont see the initialization that is done here:
builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:1038: warning: ‘syms’ may be used uninitialized in this function
So annotate it with a NULL initialization.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
AC97 bus register read/write hooks need to provide locking, but the
mpc5200-psc-ac97 driver does not. This patch adds a mutex around
the register access routines.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
When doing register reads, it is possible for there to be a stale
data ready bit set which will cause subsequent reads to return
prematurely with incorrect data. This patch fixes the issues by
ensuring stale data is cleared before starting another transaction.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The wrong register cache variable was being used to provide the size for
the memcpy(), resulting in a copy of only a void * of data.
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
This patch updates the receive error code for muxed
interrupts in the sh-sci driver.
Receive error interrupts may be generated by the hardware
if RE or REIE bits in SCSCR are set. Update the muxed
interrupt handling code to acknowledge error interrupts
if RE or REIE is set, instead of only acknowledging if
REIE is set.
Without this patch error interrupts may be generated but
never acked resulting in a "nobody cared" crash.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Ingo Molnar wrote:
> i just bisected a 'perf report' bug that would cause us to not
> resolve all user-space symbols in a 'git gc' run to:
>
> f5812a7a33 is first bad commit
> commit f5812a7a33
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Tue Jun 30 11:43:17 2009 -0300
>
> perf_counter tools: Adjust only prelinked symbol's addresses
Rename ->prelinked to ->adjust_symbols and making what was done
only for prelinked libraries also to ET_EXEC binaries, such as
/usr/bin/git:
[acme@doppio pahole]$ readelf -h /usr/bin/git | grep Type
Type: EXEC (Executable file)
[acme@doppio pahole]$
And after installing the 'git-debuginfo' package, I get correct results:
[acme@doppio linux-2.6-tip]$ perf report --sort comm,dso,symbol -d /usr/bin/git | head -20
#
# (1139614 samples)
#
# Overhead Command Shared Object Symbol
# ........ ................ ......................... ......
#
34.98% git /usr/bin/git [.] send_sideband
33.39% git /usr/bin/git [.] enter_repo
6.81% git /usr/bin/git [.] diff_opt_parse
4.95% git /usr/bin/git [.] is_repository_shallow
3.24% git /usr/bin/git [.] odb_mkstemp
1.39% git /usr/bin/git [.] output
1.34% git /usr/bin/git [.] xmmap
1.25% git /usr/bin/git [.] receive_pack_config
1.16% git /usr/bin/git [.] git_pathdup
0.90% git /usr/bin/git [.] read_object_with_reference
0.86% git /usr/bin/git [.] show_patch_diff
0.85% git /usr/bin/git 0x00000000095e2e
0.69% git /usr/bin/git [.] display
[acme@doppio linux-2.6-tip]$
I'll check what are the last cases where we can't resolve symbols, like
this 0x00000000095e2e later.
And I guess this will fix the problems Mike were seeing too:
[acme@doppio linux-2.6-tip]$ readelf -h ../build/perf/vmlinux | grep Type
Type: EXEC (Executable file)
[acme@doppio linux-2.6-tip]$
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Somewhat redundant since our atomic_t uses hashed-locks on 32-bit
anyway... Maybe we can clean those up to be generic too someday.
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Usage of parport_pc_probe_port was changed in 28783eb52
(parport: Fix various uses of parport_pc).
It introduced this build error:
drivers/parisc/superio.c: In function 'superio_parport_init':
drivers/parisc/superio.c:437: error: too few arguments to function
'parport_pc_probe_port'
Fix it.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
We weren't marking the resources as memory resources, so they weren't
being found by pci_claim_resource().
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
arch/parisc/mm/init.c: In function 'free_initmem':
381: warning: passing argument 1 of 'memset' makes pointer from integer without a cast
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
The TLB flushing functions on hppa, which causes PxTLB broadcasts on the system
bus, needs to be protected by irq-safe spinlocks to avoid irq handlers to deadlock
the kernel. The deadlocks only happened during I/O intensive loads and triggered
pretty seldom, which is why this bug went so long unnoticed.
Signed-off-by: Helge Deller <deller@gmx.de>
[edited to use spin_lock_irqsave on UP as well since we'd been locking there
all this time anyway, --kyle]
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Rewrote timer_interrupt() to properly handle the "delayed!" case.
If we used floating point math to compute the number of ticks that had
elapsed since the last timer interrupt, it could take up to 12K cycles
(emperical!) to handle the interrupt. Existing code assumed it would
never take more than 8k cycles. We end up programming Interval Timer
to a value less than "current" cycle counter. Thus have to wait until
Interval Timer "wrapped" and would then get the "delayed!" printk that
I moved below.
Since we don't really know what the upper limit is, I prefer to read
CR16 again after we've programmed it to make sure we won't have to
wait for CR16 to wrap.
Further, the printk was between reading CR16 (cycle couner) and writing CR16
(the interval timer). This would cause us to continue to set the interval
timer to a value that was "behind" the cycle counter. Rinse and repeat.
So no printk's between reading CR16 and setting next interval timer.
Tested on A500 (550 Mhz PA8600).
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Tested-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
----
Kyle, Helge, and other parisc's,
Please test on 32-bit before committing.
I think I have it right but recognize I might not.
TODO: I wanted to use "do_div()" in order to get both remainder
and value back with one division op. That should help with the
latency alot but can be applied seperately from this patch.
thanks,
grant
>>>> I think this is what was intended? Note that this patch may affect
>>>> profiling.
>>> it really should be
>>>
>>> - if (likely(t1 & (sizeof(unsigned int)-1)) == 0) {
>>> + if (likely((t1 & (sizeof(unsigned int)-1)) == 0)) {
>>>
>>> randolph
Reported-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Randolph Chung <tausq@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
gcc 4.4 warns about:
drivers/parisc/lba_pci.c: In function 'lba_pat_resources':
drivers/parisc/lba_pci.c:1099: warning: the frame size of 8280 bytes is larger than 4096 bytes
The problem is we declare two large structures on the stack. They don't need
to be on the stack since they are only used during LBA initialization (which
is serialized). Moving to be "static".
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
The defines and typedefs (hw_interrupt_type, no_irq_type, irq_desc_t) have
been kept around for migration reasons. After more than two years it's
time to remove them finally.
This patch cleans up one of the remaining users. When all such patches
hit mainline we can remove the defines and typedefs finally.
Impact: cleanup
Convert the last remaining users to struct irq_chip and remove the
define.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Fix miscompilation in arch/parisc/kernel/irq.c:
123: warning: passing arg 1 of `cpumask_setall' from incompatible pointer type
141: warning: passing arg 1 of `cpumask_copy' from incompatible pointer type
300: warning: passing arg 1 of `cpumask_copy' from incompatible pointer type
357: warning: passing arg 2 of `cpumask_copy' from incompatible pointer type
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Alex Chiang asked me why PARISC was calling pci_bus_add_devices()
and pci_bus_assign_resources() in the opposite order from everyone else.
No reason and I couldn't see any data dependency.
Patch below applies cleanly to 2.6.30-rc2.
Later, I suspected the code worked only because no drivers would be
loaded/ready until much later in the system initialization sequence.
Tested "LBA" code on J6000 (32-bit) and A500 (64-bit SMP) with 2.6.30-rc2.
Not tested with any Dino controllers.
Not tested with PCI-PCI Bridge (TBD).
Reported-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
There are two reasons to expose the memory *a in the asm:
1) To prevent the compiler from discarding a preceeding write to *a, and
2) to prevent it from caching *a in a register over the asm.
The change has had a few days testing with a SMP build of 2.6.22.19
running on a rp3440.
This patch is about the correctness of the __ldcw() macro itself.
The use of the macro should be confined to small inline functions
to try to limit the effect of clobbering memory on GCC's optimization
of loads and stores.
Signed-off-by: Dave Anglin <dave.anglin@nrc-cnrc.gc.ca>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Doing an IPI with local interrupts off triggers a warning. We
don't need to be quite so ridiculously paranoid. Also, clean up
a bit of the code a little.
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
The atomic operations on parisc are defined as macros. The macros
includes casts which disallows the use of some syntax elements and
produces error like this:
net/phonet/pep.c: In function 'pipe_rcv_status':
net/phonet/pep.c:262: error: lvalue required as left operand of assignment
The patch removes this superfluous casts.
Signed-off-by: Bastian Blank <waldi@debian.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Fix this build error when CONFIG_PROC_FS is not set:
drivers/parisc/ccio-dma.c:1574: error: 'ccio_proc_info_fops' undeclared
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Fix this build error when CONFIG_STI_CONSOLE is not set
drivers/video/stifb.c:1337: undefined reference to `sti_get_rom'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix error message formatting
Btrfs: fix use after free in btrfs_start_workers fail path
Btrfs: honor nodatacow/sum mount options for new files
Btrfs: update backrefs while dropping snapshot
Btrfs: account for space we may use in fallocate
Btrfs: fix the file clone ioctl for preallocated extents
Btrfs: don't log the inode in file_write while growing the file
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] cxgb3i: fix connection error when vlan is enabled
[SCSI] FC transport: Locking fix for common-code FC pass-through patch
[SCSI] zalon: fix oops on attach failure
[SCSI] fnic: use DMA_BIT_MASK(nn) instead of deprecated DMA_nnBIT_MASK
[SCSI] fnic: remove redundant BUG_ONs and fix checks on unsigned
[SCSI] ibmvscsi: Fix module load hang
* git://git.infradead.org/iommu-2.6: (38 commits)
intel-iommu: Don't keep freeing page zero in dma_pte_free_pagetable()
intel-iommu: Introduce first_pte_in_page() to simplify PTE-setting loops
intel-iommu: Use cmpxchg64_local() for setting PTEs
intel-iommu: Warn about unmatched unmap requests
intel-iommu: Kill superfluous mapping_lock
intel-iommu: Ensure that PTE writes are 64-bit atomic, even on i386
intel-iommu: Make iommu=pt work on i386 too
intel-iommu: Performance improvement for dma_pte_free_pagetable()
intel-iommu: Don't free too much in dma_pte_free_pagetable()
intel-iommu: dump mappings but don't die on pte already set
intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()
intel-iommu: Introduce domain_sg_mapping() to speed up intel_map_sg()
intel-iommu: Simplify __intel_alloc_iova()
intel-iommu: Performance improvement for domain_pfn_mapping()
intel-iommu: Performance improvement for dma_pte_clear_range()
intel-iommu: Clean up iommu_domain_identity_map()
intel-iommu: Remove last use of PHYSICAL_PAGE_MASK, for reserving PCI BARs
intel-iommu: Make iommu_flush_iotlb_psi() take pfn as argument
intel-iommu: Change aligned_size() to aligned_nrpages()
intel-iommu: Clean up intel_map_sg(), remove domain_page_mapping()
...
For some reason, the DP clocks were based off a 100MHz reference instead of
the standard 96MHz reference. This caused some DP monitors to fail to lock
to the signal.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The default configuration file for the KwikByte kb9202 board was based on a 2.6.13-rc2 kernel and doesn't produce a bootable kernel. Update the
configuration in order to produce a bootable image.
Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It is not safe to use match_int without checking the token type returned
by match_token (especially when the token type returned is Opt_err and
args is empty). Fix it.
Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds ETH_P_1588 protocol ID define.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>