Commit graph

120014 commits

Author SHA1 Message Date
Hannes Hering
c5916cf8db ehea: Fix some whitespace issues
This patch removes some trailing whitespaces and spaces before tabs.

Signed-off-by: Hannes Hering <hering2@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:50:56 -05:00
Ben Hutchings
739bb23d72 sfc: Do not reset when hardware monitor detects a fault
The TX watchdog should trigger a reset, but a temperature/power alarm
should not as this is unlikely to solve the problem.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:50:15 -05:00
Ben Hutchings
3e133c44d2 sfc: Use lm87 and lm90 drivers for board temperature/power monitoring
Add board monitoring to periodic work whenever link is down.
For SFE4001, report when a fault has caused the PHY to turn off.
For SFE4002, switch XFP PHY into low-power state in case of a fault.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:50:09 -05:00
Ben Hutchings
f41507245e sfc: Expose flash region storing boot code as MTD
The boot code that appears as a PCI expansion ROM on the SFC4000 is
stored in flash.  Expose this as a standard MTD device to allow for
in-place upgrades.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:57 -05:00
Ben Hutchings
0a95f56323 sfc: Clean up non-volatile memory partitioning
Move flash and EEPROM partition boundary constants into spi.h and rename
them to be consistent.

Add a comment on the partitioning.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:56 -05:00
Ben Hutchings
2883f552f2 sfc: Correct address of gPXE boot configuration in EEPROM
Due to a hardware bug, the originally assigned range cannot reliably
be used for boot configuration and must not be modifiable through
ethtool.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:50 -05:00
Jay Vosburgh
fd989c8332 bonding: alternate agg selection policies for 802.3ad
This patch implements alternative aggregator selection policies
for 802.3ad.  The existing policy, now termed "stable," selects the active
aggregator by greatest bandwidth, and only reselects a new aggregator
if the active aggregator is entirely disabled (no more ports or all ports
down).

	This patch adds two new policies: bandwidth and count, selecting
the active aggregator by total bandwidth (like the stable policy) or by
the number of ports in the aggregator, respectively.  These two policies
also differ from the stable policy in that they will reselect the active
aggregator when availability-related changes occur in the bond (e.g.,
link state change).

	This permits "gang failover" within 802.3ad, allowing redundant
aggregators along parallel paths to always maintain the "best" aggregator
as the active aggregator (rather than having to wait for the active to
entirely fail).

	This patch also updates the driver version to 3.5.0.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:47 -05:00
Jay Vosburgh
6146b1a4da bonding: Fix ALB mode to balance traffic on VLANs
The current ALB function that processes incoming ARPs
does not handle traffic for VLANs configured above bonding.  This causes
traffic on those VLANs to all be assigned the same slave.  This patch
corrects that misbehavior by locating the bonding interface nested below
the VLAN interface.

	Bug reported by Sven Anders <anders@anduras.de>, who also
tested an earlier version of this patch and confirmed that it resolved
the problem.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:40 -05:00
Brian Haley
305d552acc bonding: send IPv6 neighbor advertisement on failover
This patch adds better IPv6 failover support for bonding devices,
especially when in active-backup mode and there are only IPv6 addresses
configured, as reported by Alex Sidorenko.

- Creates a new file, net/drivers/bonding/bond_ipv6.c, for the
   IPv6-specific routines.  Both regular bonds and VLANs over bonds
   are supported.

- Adds a new tunable, num_unsol_na, to limit the number of unsolicited
   IPv6 Neighbor Advertisements that are sent on a failover event.
   Default is 1.

- Creates two new IPv6 neighbor discovery functions:

   ndisc_build_skb()
   ndisc_send_skb()

   These were required to support VLANs since we have to be able to
   add the VLAN id to the skb since ndisc_send_na() and friends
   shouldn't be asked to do this.  These two routines are basically
   __ndisc_send() split into two pieces, in a slightly different order.

- Updates Documentation/networking/bonding.txt and bumps the rev of bond
   support to 3.4.0.

On failover, this new code will generate one packet:

- An unsolicited IPv6 Neighbor Advertisement, which helps the switch
   learn that the address has moved to the new slave.

Testing has shown that sending just the NA results in pretty good
behavior when in active-back mode, I saw no lost ping packets for example.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:37 -05:00
Jarek Poplawski
61c9eaf900 pkt_sched: Fix qdisc len in qdisc_peek_dequeued()
A packet dequeued and stored as gso_skb in qdisc_peek_dequeued() should
be seen as part of the queue for sch->q.qlen queries until it's really
dequeued with qdisc_dequeue_peeked(), so qlen needs additional updating
in these functions. (Updating qstats.backlog shouldn't matter here.)

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 16:02:34 -08:00
Eric W. Biederman
0a36b345ab net: Don't leak packets when a netns is going down
I have been tracking for a while a case where when the
network namespace exits the cleanup gets stck in an
endless precessess of:

unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3

It turns out that if you listen on a multicast address an unsubscribe
packet is sent when the network device goes down.   If you shutdown
the network namespace without carefully cleaning up this can trigger
the unsubscribe packet to be sent over the loopback interface while
the network namespace is going down.

All of which is fine except when we drop the packet and forget to
free it leaking the skb and the dst entry attached to.  As it
turns out the dst entry hold a reference to the idev which holds
the dev and keeps everything from being cleaned up.  Yuck!

By fixing my earlier thinko and add the needed kfree_skb and everything
cleans up beautifully. 

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 16:00:24 -08:00
Eric W. Biederman
ae33bc40c0 net: Guaranetee the proper ordering of the loopback device.
I was recently hunting a bug that occurred in network namespace
cleanup.  In looking at the code it became apparrent that we have
and will continue to have cases where if we have anything going
on in a network namespace there will be assumptions that the
loopback device is present.   Things like sending igmp unsubscribe
messages when we bring down network devices invokes the routing
code which assumes that at least the loopback driver is present.

Therefore to avoid magic initcall ordering hackery that is hard
to follow and hard to get right insert a call to register the
loopback device directly from net_dev_init().    This guarantes
that the loopback device is the first device registered and
the last network device to go away.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 16:00:02 -08:00
Eric W. Biederman
d0c082cea6 netns: Delete virtual interfaces during namespace cleanup
When physical devices are inside of network namespace and that
network namespace terminates we can not make them go away.  We
have to keep them and moving them to the initial network namespace
is the best we can do.

For virtual devices left in a network namespace that is exiting
we have no need to preserve them and we now have the infrastructure
that allows us to delete them.  So delete virtual devices when we
exit a network namespace.  Keeping the necessary user space clean up
after a network namespace exits much more tractable.

Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 15:59:38 -08:00
Geert Uytterhoeven
dc8a0843a4 [JFFS2] fix race condition in jffs2_lzo_compress()
deflate_mutex protects the globals lzo_mem and lzo_compress_buf.  However,
jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the
compressed data from lzo_compress_buf.  Correct this by moving the mutex
unlock after the copy.

In addition, document what deflate_mutex actually protects.

Cc: stable@kernel.org
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-11-05 23:22:02 +01:00
Randy Dunlap
b0d5fdef52 net/9p: fix printk format warnings
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Roel Kluin
9f3e9bbe62 unsigned fid->fid cannot be negative
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Huang Weiyi
1558c62149 9p: rdma: remove duplicated #include
Removed duplicated #include <rdma/ib_verbs.h> in
net/9p/trans_rdma.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Tom Tucker
45abdf1c7b p9: Fix leak of waitqueue in request allocation path
If a T or R fcall cannot be allocated, the function returns an error
but neglects to free the wait queue that was successfully allocated.

If it comes through again a second time this wq will be overwritten
with a new allocation and the old allocation will be leaked.

Also, if the client is subsequently closed, the close path will
attempt to clean up these allocations, so set the req fields to
NULL to avoid duplicate free.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker
82b189eaaf 9p: Remove unneeded free of fcall for Flush
T and R fcall are reused until the client is destroyed. There does
not need to be a special case for Flush

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker
cac23d6505 9p: Make all client spin locks IRQ safe
The client lock must be IRQ safe. Some of the lock acquisition paths
took regular spin locks.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker
517ac45af4 9p: rdma: Set trans prior to requesting async connection ops
The RDMA connection manager is fundamentally asynchronous.
Since the async callback context is the client pointer, the
transport in the client struct needs to be set prior to calling
the first async op.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Vladimir Sokolovsky
7f3abf5c7c IB/mlx4: Set umem field to NULL in mlx4_ib_alloc_fast_reg_mr()
Set mr->umem to NULL in mlx4_ib_alloc_fast_reg_mr(). Otherwise
ib_dereg_mr() may invoke ib_umem_release() on a random pointer value
and get an oops.

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-11-05 10:56:52 -08:00
Mike Christie
939c2288c3 [SCSI] scsi_error regression: Fix idempotent command handling
Drivers want to be able to return DID_TRANSPORT_DISRUPTED and
have it do the right thing for commands like tape and passthrouh
as far as retries go. The LLDs previously used DID_BUS_BUSY or DID_ERROR
which followed the cmd->retries limit, but DID_TRANSPORT_DISRUPTED
was skipping that check so it could have caused a problem with tape
commands.

This patch has DID_TRANSPORT_DISRUPTED check the cmd->retries/cmd->allowed.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:48:23 -05:00
Christof Schmitt
d94ce6c6e9 [SCSI] zfcp: Fix hexdump data in s390dbf traces
Fix multiple problems found in the hexdump data:
 - length calculation was wrong, traces were incomplete
 - FC payloads were dumped in different record than the output
   function tried to read
 - minor fixes in output
 - allow complete RSCN traces (up to 1024 bytes according to spec)

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:55 -05:00
Martin Petermann
7ea633ffad [SCSI] zfcp: fix erp timeout cleanup for port open requests
If an open port fsf request times out (in erp) the
corresponding erp_action member of the fsf
request need to set to NULL. If the port structure
will be removed later-on there will be still a
reference in the fsf request to the non existing
erp_action otherwise.

Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:40 -05:00
Christof Schmitt
77fd9494bc [SCSI] zfcp: Wait for port scan to complete when setting adapter online
Attaching a unit immediately after setting the adapter online should
be possible. The problem right now is that the port_scan runs from a
workqueue and has not finished when the set_online call returns and
the sysfs structures for the ports are not available yet. Fix that by
waiting for the port scan to complete.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:19 -05:00
Christof Schmitt
adc90daffb [SCSI] zfcp: Fix cast warning
Fix leftover from last typecast patch:

drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_port_enqueue’:
drivers/s390/scsi/zfcp_aux.c:629: warning: format ‘%016llx’ expects
type ‘long long unsigned int’, but argument 3 has type ‘u64’

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:03 -05:00
Christof Schmitt
3765138ae9 [SCSI] zfcp: Fix request list handling in error path
Fix the handling of the request list in the error path:
 - Use irqsave for the lock as in the good path.
 - Before removing the request, check if it is still in the list, a
   call to dismiss_all might have changed the list in between.
 - zfcp_qdio_send does not change the queue counters on failure,
   trying revert something is wrong, so remove this.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:46:39 -05:00
Christof Schmitt
88f2a97787 [SCSI] zfcp: fix mempool usage for status_read requests
When allocating fsf requests without qtcb, store the pointer to the
mempool in the fsf requests for later call to mempool_free. This
codepath is only used by the status_read requests.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:45:07 -05:00
Heiko Carstens
45316a86a6 [SCSI] zfcp: fix req_list_locking.
The per adapter req_list_lock must be held with interrupts disabled, otherwise
we might end up with nice deadlocks as lockdep tells us (see below).

zfcp 0.0.1804: QDIO problem occurred.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.27-rc8-00035-g4a77035-dirty #86
---------------------------------------------------------
swapper/0 just changed the state of lock:
 (&adapter->erp_lock){++..}, at: [<00000000002c82ae>] zfcp_erp_adapter_reopen+0x4e/0x8c
but this lock took another, hard-irq-unsafe lock in the past:
 (&adapter->req_list_lock){-+..}

and interrupts could create inverse lock ordering between them.

[tons of backtraces, but only the interesting part follows]

the second lock's dependencies:
-> (&adapter->req_list_lock){-+..} ops: 2280627634176 {
   initial-use  at:
                        [<0000000000071f10>] __lock_acquire+0x504/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002cf684>] zfcp_fsf_req_dismiss_all+0x50/0x140
                        [<00000000002c87ee>] zfcp_erp_adapter_strategy_generic+0x66/0x3d0
                        [<00000000002c9498>] zfcp_erp_thread+0x88c/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
   in-softirq-W at:
                        [<0000000000072172>] __lock_acquire+0x766/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002ca73e>] zfcp_qdio_int_resp+0xbe/0x2ac
                        [<000000000027a1d6>] qdio_kick_inbound_handler+0x82/0xa0
                        [<000000000027daba>] tiqdio_inbound_processing+0x62/0xf8
                        [<0000000000047ba4>] tasklet_action+0x100/0x1f4
                        [<0000000000048b5a>] __do_softirq+0xae/0x154
                        [<0000000000021e4a>] do_softirq+0xea/0xf0
                        [<00000000000485de>] irq_exit+0xde/0xe8
                        [<0000000000268c64>] do_IRQ+0x160/0x1fc
                        [<00000000000261a2>] io_return+0x0/0x8
                        [<000000000001b8f8>] cpu_idle+0x17c/0x224
   hardirq-on-W at:
                        [<0000000000072190>] __lock_acquire+0x784/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d702c>] _spin_lock+0x5c/0x9c
                        [<00000000002caff6>] zfcp_fsf_req_send+0x3e/0x158
                        [<00000000002ce7fe>] zfcp_fsf_exchange_config_data+0x106/0x124
                        [<00000000002c8948>] zfcp_erp_adapter_strategy_generic+0x1c0/0x3d0
                        [<00000000002c98ea>] zfcp_erp_thread+0xcde/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
 }
 ... key      at: [<0000000000e356c8>] __key.26629+0x0/0x8

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmit@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:37 -05:00
Christof Schmitt
26816f1c2b [SCSI] zfcp: Dont clear reference from SCSI device to unit
It is possible that a remote port has a problem, the SCSI device gets
deleted after the rport timeout and then the timeout for pending SCSI
commands trigger an abort. For this case, don't delete the reference
from the SCSI device to the zfcp unit, so that we can still have the
reference to issue an abort request.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:15 -05:00
Andrew Vasquez
3869a17288 [SCSI] qla2xxx: Update version number to 8.02.01-k9.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:42:29 -05:00
Michael Reed
5bff55db3d [SCSI] qla2xxx: Return a FAILED status when abort mailbox-command fails.
Mike Reed noted
(https://bugzilla.novell.com/show_bug.cgi?id=421330) that the
driver was incorrectly returning a SUCCESS status if the driver's
request to the firmware to abort a command failed.  By doing so,
the mid-layer believed, incorrectly, that the command has
completed and has been returned (ultimately clearing
scsi_cmnd.request_buffer) yet the driver still has the command.
What should correctly happen is a mid-layer escalation
(device-reset, etc.) of recovery during which the driver will
eventually return the outstanding commands to the mid-layer.

Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:42:12 -05:00
Shyam Sundar
680d7db88a [SCSI] qla2xxx: Do not honour max_vports from firmware for 2G ISPs and below.
For 23XX ISPs, max_vports may return an invalid value.
Do not honour it.

Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:41:49 -05:00
Andrew Vasquez
737faece27 [SCSI] qla2xxx: Use pci_disable_rom() to manipulate PCI config space.
http://bugzilla.kernel.org/show_bug.cgi?id=9422

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:41:28 -05:00
Lalit Chandivade
821b399600 [SCSI] qla2xxx: Correct Atmel flash-part handling.
Use correct block size (4K) for erase command 0x20 for Atmel
Flash. Use dword addresses for determining sector boundary.

Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:41:06 -05:00
FUJITA Tomonori
6b0eea21ef [SCSI] megaraid: fix mega_internal_command oops
scsi_cmnd->cmnd was changed from a static array to a pointer post
2.6.25. It breaks mega_internal_command():

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

mega_internal_command() uses scsi_cmnd allocated internally so
scmd->cmnd is NULL here. This patch adds a static array for cdb to
adapter_t and uses it here. This also uses
scsi_allocate_command/scsi_free_command, the recommended way to
allocate struct scsi_cmnd since the driver might use sense_buffer in
struct scsi_cmnd.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
Acked-by: "Yang, Bo" <Bo.Yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:40:23 -05:00
Ingo Molnar
9fcd18c9e6 sched: re-tune balancing
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems

Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.

Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)

lat_ctx on a NUMA box is particularly happy about this change:

before:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.60
 |   2 5.70

after:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.65
 |   2 2.07

a 2.75x speedup.

pipe-test is similarly happy about it too:

 |  phoenix:~/sched-tests> ./pipe-test
 |   18.26 usecs/loop.
 |   14.70 usecs/loop.
 |   14.38 usecs/loop.
 |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
 |   8.63 usecs/loop.
 |   8.59 usecs/loop.
 |   9.03 usecs/loop.
 |   8.94 usecs/loop.
 |   8.96 usecs/loop.
 |   8.63 usecs/loop.

Also:

 - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
 - enable SD_WAKE_BALANCE on SMP domains

Sysbench+postgresql improves all around the board, quite significantly:

           .28-rc3-11474e2c  .28-rc3-11474e2c-tune
-------------------------------------------------
    1:             571              688    +17.08%
    2:            1236             1206    -2.55%
    4:            2381             2642    +9.89%
    8:            4958             5164    +3.99%
   16:            9580             9574    -0.07%
   32:            7128             8118    +12.20%
   64:            7342             8266    +11.18%
  128:            7342             8064    +8.95%
  256:            7519             7884    +4.62%
  512:            7350             7731    +4.93%
-------------------------------------------------
  SUM:           55412            59341    +6.62%

So it's a win both for the runup portion, the peak area and the tail.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 18:04:38 +01:00
Eric W. Biederman
467622ef2a [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)
For "unlock" cycles to 16bit devices in 8bit compatibility mode we need
to use the byte addresses 0xaaa and 0x555. These effectively match
the word address 0x555 and 0x2aa, except the latter has its low bit set.

Most chips don't care about the value of the 'A-1' pin in x8 mode,
but some -- like the ST M29W320D -- do. So we need to be careful to
set it where appropriate.

cfi_send_gen_cmd is only ever passed addresses where the low byte
is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are
affected by this patch, by masking in the extra low bit when the device
is known to be in compatibility mode.

[dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa]
v4: Fix  stupid typo in cfi_build_cmd_addr that failed to compile
    I'm writing this patch way to late at night.
v3: Bring all of the work back into cfi_build_cmd_addr
    including calling of map_bankwidth(map) and cfi_interleave(cfi)
    So every caller doesn't need to.
v2: Only modified the address if we our device_type is larger than our
    bus width.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-11-05 14:40:25 +01:00
David S. Miller
518a09ef11 tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
Vito Caputo noticed that tcp_recvmsg() returns immediately from
partial reads when MSG_PEEK is used.  In particular, this means that
SO_RCVLOWAT is not respected.

Simply remove the test.  And this matches the behavior of several
other systems, including BSD.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 03:36:01 -08:00
Alexey Dobriyan
efb9a8c28c netfilter: netns ct: walk netns list under RTNL
netns list (just list) is under RTNL. But helper and proto unregistration
happen during rmmod when RTNL is not held, and that's how it was tested:
modprobe/rmmod vs clone(CLONE_NEWNET)/exit.

BUG: unable to handle kernel paging request at 0000000000100100	<===
IP: [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
PGD 15e300067 PUD 15e1d8067 PMD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/kernel/uevent_seqnum
CPU 0
Modules linked in: nf_conntrack_proto_sctp(-) nf_conntrack_proto_dccp(-) af_packet iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 sr_mod cdrom [last unloaded: nf_conntrack_proto_sctp]
Pid: 16758, comm: rmmod Not tainted 2.6.28-rc2-netns-xfrm #3
RIP: 0010:[<ffffffffa009890f>]  [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
RSP: 0018:ffff88015dc1fec8  EFLAGS: 00010212
RAX: 0000000000000000 RBX: 00000000001000f8 RCX: 0000000000000000
RDX: ffffffffa009575c RSI: 0000000000000003 RDI: ffffffffa00956b5
RBP: ffff88015dc1fed8 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88015dc1fe48 R12: ffffffffa0458f60
R13: 0000000000000880 R14: 00007fff4c361d30 R15: 0000000000000880
FS:  00007f624435a6f0(0000) GS:ffffffff80521580(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000100100 CR3: 0000000168969000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 16758, threadinfo ffff88015dc1e000, task ffff880179864218)
Stack:
 ffffffffa0459100 0000000000000000 ffff88015dc1fee8 ffffffffa0457934
 ffff88015dc1ff78 ffffffff80253fef 746e6e6f635f666e 6f72705f6b636172
 00707463735f6f74 ffffffff8024cb30 00000000023b8010 0000000000000000
Call Trace:
 [<ffffffffa0457934>] nf_conntrack_proto_sctp_fini+0x10/0x1e [nf_conntrack_proto_sctp]
 [<ffffffff80253fef>] sys_delete_module+0x19f/0x1fe
 [<ffffffff8024cb30>] ? trace_hardirqs_on_caller+0xf0/0x114
 [<ffffffff803ea9b2>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8020b52b>] system_call_fastpath+0x16/0x1b
Code: 13 35 e0 e8 c4 6c 1a e0 48 8b 1d 6d c6 46 e0 eb 16 48 89 df 4c 89 e2 48 c7 c6 fc 85 09 a0 e8 61 cd ff ff 48 8b 5b 08 48 83 eb 08 <48> 8b 43 08 0f 18 08 48 8d 43 08 48 3d 60 4f 50 80 75 d3 5b 41
RIP  [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
 RSP <ffff88015dc1fec8>
CR2: 0000000000100100
---[ end trace bde8ac82debf7192 ]---

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 03:03:18 -08:00
Takashi Iwai
959973b92d ALSA: hda - Add a quirk for MEDION MD96630
Use model=lenovo-ms7195-dig for MEDION MD96630 laptop (17c0:4085)
with ALC888 codec.
Reference: Novell bnc#412548
	https://bugzilla.novell.com/show_bug.cgi?id=412528

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-05 11:30:56 +01:00
Benjamin Thery
e3ec6cfc26 ipv6: fix run pending DAD when interface becomes ready
With some net devices types, an IPv6 address configured while the
interface was down can stay 'tentative' forever, even after the interface
is set up. In some case, pending IPv6 DADs are not executed when the
device becomes ready.

I observed this while doing some tests with kvm. If I assign an IPv6 
address to my interface eth0 (kvm driver rtl8139) when it is still down
then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set
eth0 up, and to my surprise, the address stays 'tentative', no DAD is
executed and the address can't be pinged.

I also observed the same behaviour, without kvm, with virtual interfaces
types macvlan and veth.

Some easy steps to reproduce the issue with macvlan:

1. ip link add link eth0 type macvlan
2. ip -6 addr add 2003::ab32/64 dev macvlan0
3. ip addr show dev macvlan0
   ... 
   inet6 2003::ab32/64 scope global tentative
   ...
4. ip link set macvlan0 up
5. ip addr show dev macvlan0
   ...
   inet6 2003::ab32/64 scope global tentative
   ...
   Address is still tentative

I think there's a bug in net/ipv6/addrconf.c, addrconf_notify():
addrconf_dad_run() is not always run when the interface is flagged IF_READY.
Currently it is only run when receiving NETDEV_CHANGE event. Looks like
some (virtual) devices doesn't send this event when becoming up.

For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes
ready, run_pending should be set to 1. Patch below.

'run_pending = 1' could be moved below the if/else block but it makes 
the code less readable.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:43:57 -08:00
Eric Dumazet
270acefafe net: sk_free_datagram() should use sk_mem_reclaim_partial()
I noticed a contention on udp_memory_allocated on regular UDP applications.

While tcp_memory_allocated is seldom used, it appears each incoming UDP frame
is currently touching udp_memory_allocated when queued, and when received by
application.

One possible solution is to use sk_mem_reclaim_partial() instead of
sk_mem_reclaim(), so that we keep a small reserve (less than one page)
of memory for each UDP socket.

We did something very similar on TCP side in commit
9993e7d313
([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer())

A more complex solution would need to convert prot->memory_allocated to
use a percpu_counter with batches of 64 or 128 pages.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:38:06 -08:00
Randy Dunlap
b22cecdd8f net/9p: fix printk format warnings
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:35:55 -08:00
Peter Zijlstra
02479099c2 sched: fix buddies for group scheduling
Impact: scheduling order fix for group scheduling

For each level in the hierarchy, set the buddy to point to the right entity.
Therefore, when we do the hierarchical schedule, we have a fair chance of
ending up where we meant to.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 10:30:15 +01:00
Peter Zijlstra
4793241be4 sched: backward looking buddy
Impact: improve/change/fix wakeup-buddy scheduling

Currently we only have a forward looking buddy, that is, we prefer to
schedule to the task we last woke up, under the presumption that its
going to consume the data we just produced, and therefore will have
cache hot benefits.

This allows co-waking producer/consumer task pairs to run ahead of the
pack for a little while, keeping their cache warm. Without this, we
would interleave all pairs, utterly trashing the cache.

This patch introduces a backward looking buddy, that is, suppose that
in the above scenario, the consumer preempts the producer before it
can go to sleep, we will therefore miss the wakeup from consumer to
producer (its already running, after all), breaking the cycle and
reverting to the cache-trashing interleaved schedule pattern.

The backward buddy will try to schedule back to the task that woke us
up in case the forward buddy is not available, under the assumption
that the last task will be the one with the most cache hot task around
barring current.

This will basically allow a task to continue after it got preempted.

In order to avoid starvation, we allow either buddy to get wakeup_gran
ahead of the pack.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 10:30:14 +01:00
Peter Zijlstra
d95f98d069 sched: fix fair preempt check
Impact: fix cross-class preemption

Inter-class wakeup preemptions should go on class order.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 10:30:13 +01:00
Peter Zijlstra
f4b6755fb3 sched: cleanup fair task selection
Impact: cleanup

Clean up task selection

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 10:30:13 +01:00
Eric Anholt
072ba49838 ftrace: fix breakage in bin_fmt results
In 777e208d40 we changed from outputting
field->cpu (a char) to iter->cpu (unsigned int), increasing the resulting
structure size by 3 bytes.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 10:22:42 +01:00