Commit graph

114600 commits

Author SHA1 Message Date
Marcin Slusarz
2e4a75cdcb rtc: fix kernel panic on second use of SIGIO nofitication
When userspace uses SIGIO notification and forgets to disable it before
closing file descriptor, rtc->async_queue contains stale pointer to struct
file.  When user space enables again SIGIO notification in different
process, kernel dereferences this (poisoned) pointer and crashes.

So disable SIGIO notification on close.

Kernel panic:
(second run of qemu (requires echo 1024 > /sys/class/rtc/rtc0/max_user_freq))

general protection fault: 0000 [1] PREEMPT
CPU 0
Modules linked in: af_packet snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq usbhid tuner tea5767 tda8290 tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761 tda9875 uhci_hcd ehci_hcd usbcore bttv snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer ir_common compat_ioctl32 snd_page_alloc videodev v4l1_compat snd_mpu401_uart snd_rawmidi v4l2_common videobuf_dma_sg videobuf_core snd_seq_device snd btcx_risc soundcore tveeprom i2c_viapro
Pid: 5781, comm: qemu-system-x86 Not tainted 2.6.27-rc6 #363
RIP: 0010:[<ffffffff8024f891>]  [<ffffffff8024f891>] __lock_acquire+0x3db/0x73f
RSP: 0000:ffffffff80674cb8  EFLAGS: 00010002
RAX: ffff8800224c62f0 RBX: 0000000000000046 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800224c62f0
RBP: ffffffff80674d08 R08: 0000000000000002 R09: 0000000000000001
R10: ffffffff80238941 R11: 0000000000000001 R12: 0000000000000000
R13: 6b6b6b6b6b6b6b6b R14: ffff88003a450080 R15: 0000000000000000
FS:  00007f98b69516f0(0000) GS:ffffffff80623200(0000) knlGS:00000000f7cc86d0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000a87000 CR3: 0000000022598000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-system-x86 (pid: 5781, threadinfo ffff880028812000, task ffff88003a450080)
Stack:  ffffffff80674cf8 0000000180238440 0000000200000002 0000000000000000
 ffff8800224c62f0 0000000000000046 0000000000000000 0000000000000002
 0000000000000002 0000000000000000 ffffffff80674d68 ffffffff8024fc7a
Call Trace:
 <IRQ>  [<ffffffff8024fc7a>] lock_acquire+0x85/0xa9
 [<ffffffff8029cb62>] ? send_sigio+0x2a/0x184
 [<ffffffff80491d1f>] _read_lock+0x3e/0x4a
 [<ffffffff8029cb62>] ? send_sigio+0x2a/0x184
 [<ffffffff8029cb62>] send_sigio+0x2a/0x184
 [<ffffffff8024fb97>] ? __lock_acquire+0x6e1/0x73f
 [<ffffffff8029cd4d>] ? kill_fasync+0x2c/0x4e
 [<ffffffff8029cd10>] __kill_fasync+0x54/0x65
 [<ffffffff8029cd5b>] kill_fasync+0x3a/0x4e
 [<ffffffff80402896>] rtc_update_irq+0x9c/0xa5
 [<ffffffff80404640>] cmos_interrupt+0xae/0xc0
 [<ffffffff8025d1c1>] handle_IRQ_event+0x25/0x5a
 [<ffffffff8025e5e4>] handle_edge_irq+0xdd/0x123
 [<ffffffff8020da34>] do_IRQ+0xe4/0x144
 [<ffffffff8020bad6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff8026fdc2>] ? __alloc_pages_internal+0xe7/0x3ad
 [<ffffffff8033fe67>] ? clear_page_c+0x7/0x10
 [<ffffffff8026fc10>] ? get_page_from_freelist+0x385/0x450
 [<ffffffff8026fdc2>] ? __alloc_pages_internal+0xe7/0x3ad
 [<ffffffff80280aac>] ? anon_vma_prepare+0x2e/0xf6
 [<ffffffff80279400>] ? handle_mm_fault+0x227/0x6a5
 [<ffffffff80494716>] ? do_page_fault+0x494/0x83f
 [<ffffffff8049251d>] ? error_exit+0x0/0xa9

Code: cc 41 39 45 28 74 24 e8 5e 1d 0f 00 85 c0 0f 84 6a 03 00 00 83 3d 8f a9 aa 00 00 be 47 03 00 00 0f 84 6a 02 00 00 e9 53 03 00 00 <41> ff 85 38 01 00 00 45 8b be 90 06 00 00 41 83 ff 2f 76 24 e8
RIP  [<ffffffff8024f891>] __lock_acquire+0x3db/0x73f
 RSP <ffffffff80674cb8>
---[ end trace 431877d860448760 ]---
Kernel panic - not syncing: Aiee, killing interrupt handler!

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Alessandro Zummo <alessandro.zummo@towertech.it>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-03 18:22:17 -07:00
Paul Moore
3040a6d5a2 selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field.  The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used.  This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-04 08:25:18 +10:00
Paul Moore
81990fbdd1 selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field.  The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used.  This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-04 08:18:18 +10:00
Josef Bacik
68c9d702bb generic block based fiemap implementation
Any block based fs (this patch includes ext3) just has to declare its own
fiemap() function and then call this generic function with its own
get_block_t. This works well for block based filesystems that will map
multiple contiguous blocks at one time, but will work for filesystems that
only map one block at a time, you will just end up with an "extent" for each
block. One gotcha is this will not play nicely where there is hole+data
after the EOF. This function will assume its hit the end of the data as soon
as it hits a hole after the EOF, so if there is any data past that it will
not pick that up. AFAIK no block based fs does this anyway, but its in the
comments of the function anyway just in case.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
2008-10-03 17:32:43 -04:00
Mark Fasheh
00dc417fa3 ocfs2: fiemap support
Plug ocfs2 into ->fiemap. Some portions of ocfs2_get_clusters() had to be
refactored so that the extent cache can be skipped in favor of going
directly to the on-disk records. This makes it easier for us to determine
which extent is the last one in the btree. Also, I'm not sure we want to be
caching fiemap lookups anyway as they're not directly related to data
read/write.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: ocfs2-devel@oss.oracle.com
Cc: linux-fsdevel@vger.kernel.org
2008-10-03 17:32:11 -04:00
Mark Fasheh
c4b929b85b vfs: vfs-level fiemap interface
Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
inode operation.

Userspace can get extent information on a file via fiemap ioctl. As input,
the fiemap ioctl takes a struct fiemap which includes an array of struct
fiemap_extent (fm_extents). Size of the extent array is passed as
fm_extent_count and number of extents returned will be written into
fm_mapped_extents. Offset and length fields on the fiemap structure
(fm_start, fm_length) describe a logical range which will be searched for
extents. All extents returned will at least partially contain this range.
The actual extent offsets and ranges returned will be unmodified from their
offset and range on-disk.

The fiemap ioctl returns '0' on success. On error, -1 is returned and errno
is set. If errno is equal to EBADR, then fm_flags will contain those flags
which were passed in which the kernel did not understand. On all other
errors, the contents of fm_extents is undefined.

As fiemap evolved, there have been many authors of the vfs patch. As far as
I can tell, the list includes:
Kalpak Shah <kalpak.shah@sun.com>
Andreas Dilger <adilger@sun.com>
Eric Sandeen <sandeen@redhat.com>
Mark Fasheh <mfasheh@suse.com>

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: linux-api@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
2008-10-08 19:44:18 -04:00
Kalpak Shah
4d20c685fa ext4: fix xattr deadlock
ext4_xattr_set_handle() eventually ends up calling
ext4_mark_inode_dirty() which tries to expand the inode by shifting
the EAs.  This leads to the xattr_sem being downed again and leading
to a deadlock.

This patch makes sure that if ext4_xattr_set_handle() is in the
call-chain, ext4_mark_inode_dirty() will not expand the inode.

Signed-off-by: Kalpak Shah <kalpak.shah@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08 23:21:54 -04:00
Theodore Ts'o
45a90bfd90 jbd2: Fix buffer head leak when writing the commit block
Also make sure the buffer heads are marked clean before submitting bh
for writing.  The previous code was marking the buffer head dirty,
which would have forced an unneeded write (and seek) to the journal
for no good reason.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-06 12:04:02 -04:00
Theodore Ts'o
ede86cc473 ext4: Add debugging markers that can be used by systemtap
This debugging markers are designed to debug problems such as the
random filesystem latency problems reported by Arjan.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-05 20:50:06 -04:00
Duane Griffin
23f8b79eae jbd2: abort instead of waiting for nonexistent transaction
The __jbd2_log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal. 
However, if there are no transactions to be processed (e.g.  because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.

Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.

This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Cc: Sami Liedes <sliedes@cc.hut.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08 23:28:31 -04:00
Frederic Bohe
c806e68f56 ext4: fix initialization of UNINIT bitmap blocks
This fixes a bug which caused on-line resizing of filesystems with a
1k blocksize to fail.  The root cause of this bug was the fact that if
an uninitalized bitmap block gets read in by userspace (which
e2fsprogs does try to avoid, but can happen when the blocksize is less
than the pagesize and an adjacent blocks is read into memory)
ext4_read_block_bitmap() was erroneously depending on the buffer
uptodate flag to decide whether it needed to initialize the bitmap
block in memory --- i.e., to set the standard set of blocks in use by
a block group (superblock, bitmaps, inode table, etc.).  Essentially,
ext4_read_block_bitmap() assumed it was the only routine that might
try to read a block containing a block bitmap, which is simply not
true.  

To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
must always initialize uninitialized bitmap blocks.  Once a block or
inode is allocated out of that bitmap, it will be marked as
initialized in the block group descriptor, so in general this won't
result any extra unnecessary work.

Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10 08:09:18 -04:00
Theodore Ts'o
c2ea3fde61 ext4: Remove old legacy block allocator
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10 09:40:52 -04:00
Theodore Ts'o
240799cdf2 ext4: Use readahead when reading an inode from the inode table
With modern hard drives, reading 64k takes roughly the same time as
reading a 4k block.  So request readahead for adjacent inode table
blocks to reduce the time it takes when iterating over directories
(especially when doing this in htree sort order) in a cold cache case.
With this patch, the time it takes to run "git status" on a kernel
tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches"
is reduced by 21%.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09 23:53:47 -04:00
Theodore Ts'o
37515facd0 ext4: Improve the documentation for ext4's /proc tunables
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Alex Tomas <bzzz@sun.com>
Cc: Andreas Dilger <adilger@sun.com>
2008-10-09 23:21:54 -04:00
Linus Torvalds
e105eabb5b Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] SMTC: Fix SMTC dyntick support.
  [MIPS] SMTC: Close tiny holes in the SMTC IPI replay system.
  [MIPS] SMTC: Fix holes in SMTC and FPU affinity support.
  [MIPS] SMTC: Build fix: Fix filename in Makefile
  [MIPS] Build fix: Fix irq flags type
2008-10-03 14:11:43 -07:00
Chuck Lever
9a38a83880 lockd: Remove unused fields in the nlm_reboot structure
The nlm_reboot structure is used to store information provided by the
NSM_NOTIFY procedure.  This procedure is not specified by the NLM or NSM
protocols, other than to say that the procedure can be used to transmit
information private to a particular NLM/NSM implementation.

For Linux, the callback arguments include the name of the monitored host,
the new NSM state of the host, and a 16-byte private opaque.

As a clean up, remove the unused fields and the server-side XDR logic that
decodes them.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
b85e467634 lockd: Add helper to sanity check incoming NOTIFY requests
lockd accepts SM_NOTIFY calls only from a privileged process on the
local system.  If lockd uses an AF_INET6 listener, the sender's address
(ie the local rpc.statd) will be the IPv6 loopback address, not the
IPv4 loopback address.

Make sure the privilege test in nlmsvc_proc_sm_notify() and
nlm4svc_proc_sm_notify() works for both AF_INET and AF_INET6 family
addresses by refactoring the test into a helper and adding support for
IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
dcff09f124 lockd: change nlmclnt_grant() to take a "struct sockaddr *"
Adjust the signature and callers of nlmclnt_grant() to pass a "struct
sockaddr *" instead of a "struct sockaddr_in *" in order to support IPv6
addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
6bfbe8af46 lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses
Fix up nlmsvc_lookup_host() to pass AF_INET6 source addresses to
nlm_lookup_host().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
d7d204403b lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET
Pass a struct sockaddr * and a length to nlmclnt_lookup_host() to
accomodate non-AF_INET family addresses.

As a side benefit, eliminate the hostname_len argument, as the hostname
is always NUL-terminated.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:34 -04:00
Chuck Lever
88541c8487 lockd: Support non-AF_INET addresses in nlm_lookup_host()
Use struct sockaddr * and length in nlm_lookup_host_info to all callers
to pass in either AF_INET or AF_INET6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:01:57 -04:00
Chuck Lever
7f1ed18bd3 NLM: Convert nlm_lookup_host() to use a single argument
The nlm_lookup_host() function already has a large number of arguments,
and I'm about to add a few more.  As a clean up, convert the function
to use a single data structure argument.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:58:23 -04:00
Linus Torvalds
1db9b83738 Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] qdio: prevent stack clobber
  [S390] nohz: Fix __udelay.
2008-10-03 13:43:05 -07:00
H. Peter Anvin
cc65f1ec19 x86 setup: correct segfault in generation of 32-bit reloc kernel
Impact: segfault on build of a 32-bit relocatable kernel

When converting arch/x86/boot/compressed/relocs.c to support unlimited
sections, the computation of sym_strtab in walk_relocs() was done
incorrectly.  This causes a segfault for some people when building the
relocatable 32-bit kernel.

Pointed out by Anonymous <pageexec@freemail.hu>.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-03 13:42:04 -07:00
Tom Tucker
0d3ebb9ae9 svcrdma: Add Fast Reg MR Data Types
Add data types to track Fast Reg Memory Regions. The core data type is
svc_rdma_fastreg_mr that associates a device MR with a host kva and page
list. A field is added to the WR context to keep track of the FRMR
used to map the local memory for an RPC.

An FRMR list and spin lock are added to the transport instance to keep
track of all FRMR allocated for the transport. Also added are device
capability flags to indicate what the memory registration
capabilities are for the underlying device and whether or not fast
memory registration is supported.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-10-03 15:39:58 -05:00
Linus Torvalds
96d746c68f Fix init/main.c to use regular printk with '%pF' for initcall fn
.. small detail, but the silly e1000e initcall warning debugging caused
me to look at this code.  Rather than gouge my eyes out with a spoon, I
just fixed it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-03 13:38:07 -07:00
J. Bruce Fields
d22b1cff09 lockd: reject reclaims outside the grace period
The current lockd does not reject reclaims that arrive outside of the
grace period.

Accepting a reclaim means promising to the client that no conflicting
locks were granted since last it held the lock.  We can meet that
promise if we assume the only lockers are nfs clients, and that they are
sufficiently well-behaved to reclaim only locks that they held before,
and that only reclaim locks have been permitted so far.  Once we leave
the grace period (and start permitting non-reclaims), we can no longer
keep that promise.  So we must start rejecting reclaims at that point.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:19:20 -04:00
J. Bruce Fields
b2b5028905 lockd: move grace period checks to common code
Do all the grace period checks in svclock.c.  This simplifies the code a
bit, and will ease some later changes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:19:19 -04:00
J. Bruce Fields
af558e33be nfsd: common grace period control
Rewrite grace period code to unify management of grace period across
lockd and nfsd.  The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it.  This creates a slight race condition, since
the enforcement is not coordinated.  It's also more complicated than
necessary.

Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.

We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:19:02 -04:00
Jan Glauber
75f6276187 [S390] qdio: prevent stack clobber
Don't print more information than fits into the string on the
stack. Combine the informational output of qdio to fit into
one line.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-03 21:55:55 +02:00
Heiko Carstens
d3d238c774 [S390] nohz: Fix __udelay.
This fixes a regression that came with 934b2857cc
("[S390] nohz/sclp: disable timer on synchronous waits.").
If udelay() gets called from a disabled context it sets the clock comparator
to a value where it expects the next interrupt. When the interrupt happens
the clock comparator gets not reset and therefore the interrupt condition
doesn't get cleared. The result is an endless timer interrupt loop.

In addition this patch fixes also the following:

rcutorture reveals that our __udelay implementation is still buggy,
since it might schedule tasklets, but prevents their execution:

NOHZ: local_softirq_pending 42
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 142
NOHZ: local_softirq_pending 02

To fix this we make sure that only the clock comparator interrupt
is enabled when the enabled wait psw is loaded.
Also no code gets called anymore which might schedule tasklets.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-03 21:55:54 +02:00
Bob Sharp
7191a0a182 RDMA/nes: Fix routed RDMA connections
Fix routed RDMA connections to destinations where the next hop is not
the final destination.  Use neigh_*() to properly locate neighbor.

Signed-off-by: Bob Sharp <bsharp@neteffect.com>
Signed-off-by: Sweta Bhatt <sweta.bhatt@einfochips.com>
Signed-off-by: Chien Tung <ctung@neteffect.com>
2008-10-03 12:21:19 -07:00
Vadim Makhervaks
7e36d3d732 RDMA/nes: Enhanced PFT management scheme
Change management of perfect filter table to allow enhanced
performance applications.

Signed-off-by: Vadim Makhervaks <vmakhervaks@neteffect.com>
Signed-off-by: Sweta Bhatt <sweta.bhatt@einfochips.com>
Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-10-03 12:21:18 -07:00
Ingo Molnar
f68ec0c247 Merge commit 'v2.6.27-rc8' into x86/setup 2008-10-03 19:28:46 +02:00
H. Peter Anvin
98920dc3d1 Revert "x86: fix ghost EDD devices in /sys again"
This reverts commit 464f04c9e9.
Obsoleted by commit 6cdcdb99cf.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-03 10:22:33 -07:00
Andrey Borzenkov
6cdcdb99cf x86 setup: fix ghost entries under /sys/firmware/edd take 3
Some BIOSes do not indicate error when trying to read from non-
existing device. Zero buffer before reading and check that we
possibly have valid MBR by looking for MBR magic.

This was fixed in different way for edd.S in
http://marc.info/?l=linux-kernel&m=114087765422490&w=2, but lost
again when edd.S was rewritten in C.

Signed-off-by: Andrey Borzenkov < arvidjaar@mail.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-03 10:15:29 -07:00
Swen Schillig
41bfcf9010 [SCSI] zfcp: fix double dbf id usage
Trace ids 107 and 3 are used twice, fix this to have unique ids for
the erp triggers.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig
091694a556 [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
Due to the character of a scheduled work we cannot guarantee the
LUN register to be finished before an initial device tries to use it.
Therefor we have to wait for PENDING_SCSI_WORK flag to be cleared
before proceeding.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig
9fb3cd86e4 [SCSI] zfcp: fix erp list usage without using locks
The zfcp_erp_thread was using the nolock version of the dbf function.
This resulted in a list access while other tasks could modifying the
list. The symptom was an erp thread running at 100% CPU and never
returning from the dbf function.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig
e4e9ba5d93 [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
In case of an adapter reopen all rports have to be deleted from the
environment. This should only happen for already registered rports
otherwise fc_remote_port_delete is called with a NULL pointer.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Swen Schillig
b7f15f3c94 [SCSI] zfcp: fix deadlock caused by shared work queue tasks
Each adapter reopen trigger automatically a scan_port task which
is waiting for the ERP to be finished before further processing.
Since the initial device setup enqueues adapter, port and LUN which
are individual ERP actions, this process would start after
everything is done. Unfortunately the port_reopen requires another
scheduled work to be finished which is queued after the automatic
scan_port -> deadlock !

This fix creates an own work queue for ERP based nameserver requests.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Swen Schillig
5706938669 [SCSI] zfcp: put threshold data in hba trace
Now that we removed the long messages for the bit error threshold
data, put the data in the hba trace. This way, we get a short warning
for the threshold event from the hardware and have the data in the
trace for further analysis.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Christof Schmitt
0406289ed5 [SCSI] zfcp: Simplify zfcp data structures
Reduce the size of zfcp data structures by removing unused and
redundant members. scsi_lun is only the mangled version of the
fcp_lun. So, remove the redundant field and use the fcp_lun instead.

Since the queue lock and the pci_batch indicator are only used in the
request queue, move them from the common queue struct to the adapter
struct.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:54 -05:00
Swen Schillig
a1b449de5d [SCSI] zfcp: Simplify get_adapter_by_busid
Call the helper function from cio instead looping through all zfcp
adapters.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:53 -05:00
Swen Schillig
7ba58c9cc1 [SCSI] zfcp: remove all typedefs and replace them with standards
Remove typedefs from zfcp, use already existing types instead.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:53 -05:00
Swen Schillig
5ab944f97e [SCSI] zfcp: attach and release SAN nameserver port on demand
Changing the zfcp behaviour from always having the nameserver port
open to an on-demand strategy.  This strategy reduces the use of
limited resources like port connections. The patch provides a common
infrastructure which could be used for all WKA ports in future.

Also reduce the number of nameserver lookups by changing the zfcp
behaviour of always querying the nameserver for the corresponding
destination ID of the remote port.  If the destination ID has changed
during the reopen process we will be informed and then trigger a
nameserver query on demand.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:53 -05:00
Swen Schillig
44cc76f2d1 [SCSI] zfcp: remove unused references, declarations and flags
- Remove unused references and declarations, including one instance
   of the FC ls_adisc struct that has been defined twice.
 - Also remove the flags COMMON_OPENING, COMMON_CLOSING,
   ADAPTER_REGISTERED and XPORT_OK that are only set and cleared, but
   not checked anywhere.
 - Remove the zfcp specific atomic_test_mask makro. Simply use
   atomic_read directly instead.
 - Remove the zfcp internal sg helper functions and switch the places
   where it is still used to call sg_virt directly.
 - With the update of the QDIO code, the QDIO data structures no
   longer use the volatile type qualifier. Now we can also remove the
   volatile qualifiers from the zfcp code.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:52 -05:00
Christof Schmitt
ff3b24fa53 [SCSI] zfcp: Update message with input from review
Update the kernel messages in zfcp with input from the message review
and remove some messages that have been identified as redundant.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:52 -05:00
Stefan Raspl
2450d3e7b8 [SCSI] zfcp: add queue_full sysfs attribute
Adds a new sysfs attribute queue_full for adapters that records the number
of incidents where a requests could not be submitted due to insufficient
free space on the request queue.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:52 -05:00
James Bottomley
7ae628d9d2 [SCSI] scsi_dh: suppress comparison warning
On Mon, 2008-09-22 at 14:56 -0700, akpm@linux-foundation.org wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
>
> s390:
>
> drivers/scsi/device_handler/scsi_dh_emc.c: In function 'parse_sp_info_reply':
> drivers/scsi/device_handler/scsi_dh_emc.c:179: warning: comparison is always false due to limited range of data type
>
> because chars are unsigned, I assume.

Fix by making csdev->buffer explicitly an unsigned char and dropping
the < 0 test.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:51 -05:00