Patch was originally submitted upstream on 4/21/2008:
http://marc.info/?l=linux-scsi&m=120880973719266&w=2
Somewhere, it never get merged. The patch restructures the task mgmt
routines, commonizing like behavior. Then the patch changes device
reset to LUN resets, and adds a target reset handler.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Contains the following changes:
- Fixed error paths retaking a spin lock which they already hold
- Added code to free memory in a couple of error paths
- Added code to free RPI bit map while unloading driver
- Added code to write zero to memory object allocated through dma_alloc_coherent
- Fixed crash/hang with target or LUN resets
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Contains the following changes:
- Force vport to send LOGO to fabric controller when deleting vport
- Fixed driver failing to register login when a PLOGI is received
- Fixes for FIP discovery
- Added stricter checks for FCF addressing mode
- Added code to send only FLOGI, FDISC and LOGO to Fabric controller as FIP
- Fixed handling of LOGO from Fabric port
- Fixed consecutive link up events skipped link_down processing
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Contains the following changes
- Set the CT field of FDISC to 3
- Fixed over allocation of SCSI buffers on SLI4
- Removed unused jump table entries
- Increase LPFC_WQE_DEF_COUNT to 256
- Updated FDISC context to VPI
- Fixed immediate SCSI command for LUN reset translation to WQE
- Extended mailbox handling to allow MBX_POLL commands in between async
MBQ commands
- Fixed SID used for FDISC
- Fix crash when accessing ctlregs from sysfs for SLI4 HBAs
- Fix SLI4 firmware version not being saved or displayed correctly
- Expand CQID field in WQE structure to 16 bits
- Fix post header template mailbox command timing out
- Removed FCoE PCI device ID 0x0705
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reverted back a change in qla*_intr_handler code that caused an increase in
cpu cycles by allowing interrupts to occur while the instance hardware lock
was being held. Fix by taking the lock in irqsave mode.
Reported-and-tested-by: Douglas W. Styner <douglas.w.styner@intel.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
CNIC and BNX2I must depend on PCI. Dependencies do not get
propagated through select.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
With a postfix decrement timeouts will reach -1 rather than 0, so the
errors do not appear.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fix qla2xxx printk format warnings:
drivers/scsi/qla2xxx/qla_sup.c:915: warning: long long unsigned int format, u64 arg (arg 5)
drivers/scsi/qla2xxx/qla_sup.c:915: warning: long long unsigned int format, u64 arg (arg 6)
drivers/scsi/qla2xxx/qla_sup.c:923: warning: long long unsigned int format, u64 arg (arg 5)
drivers/scsi/qla2xxx/qla_sup.c:923: warning: long long unsigned int format, u64 arg (arg 6)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The members from 'status' in struct sg_io_hdr to the last are used to
transfer information from kernel to user space. The values that user
space sets are just ignored.
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Yong Wang reported the following compiler warning:
builtin-report.c: In function 'process_overflow_event':
builtin-report.c:984: error: cast to pointer from integer of different size
Which happens because we try to print ->ips[] out with a limited
format, losing the high 32 bits. Print it out using %016Lx instead.
Reported-by: Yong Wang <yong.y.wang@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At present, every architecture that supports perf_counters has to
declare set_perf_counter_pending() in its arch-specific headers.
This consolidates the declarations into a single declaration in one
common place, include/linux/perf_counter.h. On powerpc, we continue
to provide a static inline definition of set_perf_counter_pending()
in the powerpc hw_irq.h.
Also, this removes from the x86 perf_counter.h the unused null
definitions of {test,clear}_perf_counter_pending.
Reported-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
LKML-Reference: <18998.13388.920691.523227@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This fixes a couple of compile warnings that crept into the powerpc
perf_counter code recently:
CC arch/powerpc/kernel/perf_counter.o
arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
arch/powerpc/kernel/perf_counter.c:1016: warning: unused variable 'addr'
arch/powerpc/kernel/perf_counter.c: In function 'hw_perf_counter_init':
arch/powerpc/kernel/perf_counter.c:891: warning: 'ev' may be used uninitialized in this function
Stephen Rothwell reported this against linux-next as well.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18998.12884.787039.22202@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
These registers may contain values from previous kernels. So reset them
to known values before enable the event buffer again.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Take advantage of call-graph percounter sampling/recording to
display a non-trivial histogram: the true, collapsed/summarized
cost measurement, on a per system call total overhead basis:
aldebaran:~/linux/linux/tools/perf> ./perf record -g -a -f ~/hackbench 10
aldebaran:~/linux/linux/tools/perf> ./perf report -s symbol --syscalls | head -10
#
# (3536 samples)
#
# Overhead Symbol
# ........ ......
#
40.75% [k] sys_write
40.21% [k] sys_read
4.44% [k] do_nmi
...
This is done by accounting each (reliable) call-chain that chains back
to a given system call to that system call function.
[ So in the above example we can see that hackbench spends about 40% of
its total time somewhere in sys_write() and 40% somewhere in
sys_read(), the rest of the time is spent in user-space. The time
is not spent in sys_write() _itself_ but in one of its many child
functions. ]
Or, a recording of a (source files are already in the page-cache) kernel build:
$ perf record -g -m 512 -f -- make -j32 kernel
$ perf report -s s --syscalls | grep '\[k\]' | grep -v nmi
4.14% [k] do_page_fault
1.20% [k] sys_write
1.10% [k] sys_open
0.63% [k] sys_exit_group
0.48% [k] smp_apic_timer_interrupt
0.37% [k] sys_read
0.37% [k] sys_execve
0.20% [k] sys_mmap
0.18% [k] sys_close
0.14% [k] sys_munmap
0.13% [k] sys_poll
0.09% [k] sys_newstat
0.07% [k] sys_clone
0.06% [k] sys_newfstat
0.05% [k] sys_access
0.05% [k] schedule
Shows the true total cost of each syscall variant that gets used
during a kernel build. This profile reveals it that pagefaults are
the costliest, followed by read()/write().
An interesting detail: timer interrupts cost 0.5% - or 0.5 seconds
per 100 seconds of kernel build-time. (this was done with HZ=1000)
The summary is done in 'perf report', i.e. in the post-processing
stage - so once we have a good call-graph recording, this type of
non-trivial high-level analysis becomes possible.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
__copy_from_user_inatomic() isn't NMI safe in that it can trigger
the page fault handler which is another trap and its return path
invokes IRET which will also close the NMI context.
Therefore use a GUP based approach to copy the stack frames over.
We tried an alternative solution as well: we used a forward ported
version of Mathieu Desnoyers's "NMI safe INT3 and Page Fault" patch
that modifies the exception return path to use an open-coded IRET with
explicit stack unrolling and TF checking.
This didnt work as it interacted with faulting user-space instructions,
causing them not to restart properly, which corrupts user-space
registers.
Solving that would probably involve disassembling those instructions
and backtracing the RIP. But even without that, the code was deemed
rather complex to the already non-trivial x86 entry assembly code,
so instead we went for this GUP based method that does a
software-walk of the pagetables.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Two new kmap_atomic slots for NMI context. And teach pte_offset_map()
about NMI context.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Nick Piggin <npiggin@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce a gup_fast() variant which is usable from IRQ/NMI context.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Nick Piggin <npiggin@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simon triggered a lockdep inversion report about us taking ctx->mutex
vs counter->mutex in inverse orders. Fix that up.
Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Tested-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The IOMMU spec states that IOMMU behavior may be undefined when the
IOMMU registers are rewritten while command or event buffer is enabled.
Disable them in IOMMU disable path.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This false positive is due to field padding in struct sigqueue. When
this dynamically allocated structure is copied to the stack (in arch-
specific delivery code), kmemcheck sees a read from the padding, which
is, naturally, uninitialized.
Hide the false positive using the __GFP_NOTRACK_FALSE_POSITIVE flag.
Also made the rlimit override code a bit clearer by introducing a new
variable.
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
This false positive is due to the fact that do_mount_root() fakes a
mount option (which is normally read from userspace), and the kernel
unconditionally reads a whole page for the mount option.
Hide the false positive by using the new __getname_gfp() with the
__GFP_NOTRACK_FALSE_POSITIVE flag.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
The purpose of this change is to allow __getname() users to pass a
custom GFP mask to kmem_cache_alloc(). This is needed for annotating
a certain kmemcheck false positive.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
This gets rid of a heap of false-positive warnings from the tracer
code due to the use of bitfields.
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
The use of bitfields here would lead to false positive warnings with
kmemcheck. Silence them.
(Additionally, one erroneous comment related to the bitfield was also
fixed.)
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
kmemcheck reports a use of uninitialized memory here, but it's not
a real error. The structure in question has just been allocated, and
the whole field is initialized, but it happens in two steps.
We fix the false positive by inserting a kmemcheck annotation.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Add the bitfield API which can be used to annotate bitfields in structs
and get rid of false positive reports.
According to Al Viro, the syntax we were using (putting #ifdef inside
macro arguments) was not valid C. He also suggested using begin/end
markers instead, which is what we do now.
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
We've had some troubles in the past with weird instructions. This
patch adds a self-test framework which can be used to verify that
a certain set of opcodes are decoded correctly. Of course, the
opcodes which are not tested can still give the wrong results.
In short, this is just a safeguard to catch unintentional changes
in the opcode decoder. It does not mean that errors can't still
occur!
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Only _PAGE_HIDDEN when CONFIG_KMEMCHECK is defined, otherwise set it
to 0. Allows later cleanups.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
The Kconfig options of kmemcheck are hidden under arch/x86 which makes porting
to other architectures harder. To fix that, move the Kconfig bits to
lib/Kconfig.kmemcheck and introduce a CONFIG_HAVE_ARCH_KMEMCHECK config option
that architectures can define.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
let it rip!
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
This adds support for tracking the initializedness of memory that
was allocated with the page allocator. Highmem requests are not
tracked.
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
[build fix for !CONFIG_KMEMCHECK]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
When kexec'ing to a new kernel (for example, when crashing and launching
a kdump session), the AMD IOMMU may have cached translations. The kexec'd
kernel, during initialization, will invalidate the IOMMU device table
entries, but not the domain translations. These stale entries can cause
a device's DMA to fail, makes it rough to write a dump to disk when the
disk controller can't DMA ;-)
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
If the IOMMUs are still enabled when the kexec kernel boots access to
the disk is not possible. This is bad for tools like kdump or anything
else which wants to use PCI devices.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
When the IOMMU stays enabled the BIOS may not be able to finish the
machine shutdown properly. So disable the hardware on shutdown.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The PCM x-fi native update routine can cause deadlocks when the
trigger(START) is called while the stream is running.
This patch fixes the deadlock by just postponing the pcm period update
to the next possible wake-up. Also it adds the flip of ti->running
flag (just to be sure as now).
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Recent change to use slab allocations earlier exposed a bug where
SLUB can call schedule_work and try to call sysfs before it is
safe to do so.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
This is needed for page allocator support to prevent false positives
when accessing pages which are dma-mapped.
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
As these are allocated using the page allocator, we need to pass
__GFP_NOTRACK before we add page allocator support to kmemcheck.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
The xor tests are run on uninitialized data, because it is doesn't
really matter what the underlying data is. Annotate this false-
positive warning.
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
We now have SLAB support for kmemcheck! This means that it doesn't matter
whether one chooses SLAB or SLUB, or indeed whether Linus chooses to chuck
SLAB or SLUB.. ;-)
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Parts of this patch were contributed by Pekka Enberg but merged for
atomicity.
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>