The Yukon FE (100mbit only) chips do not support large packets.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The Yukon EC Ultra chips have transmit settings for store and
forward and PCI buffering. By setting these appropriately, normal
performance goes from 750Mbytes/sec to 940Mbytes/sec (non-jumbo).
It is also possible to do Jumbo mode, but it means turning off
TSO and checksum offload so the performance gets worse. There isn't
enough buffering for checksum offload to work.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Need to make sure and disable ASF on all chip types. Otherwise, there may be
random reboots.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There should never be descriptor error unless hardware or driver is buggy.
But if an error occurs, print useful information, clear irq, and recover.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This device is having all sorts of problems that lead to data corruption
and system instability. It gets receive status and data out of order,
it generates descriptor and TSO errors, etc.
Until the problems are resolved, it should not be used by anyone
who cares about there system.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The basic structure of "normal" UDP/IP/Ethernet
frames (that actually work):
- It starts with the Ethernet header (dest MAC, src MAC, etc.)
- The next part is occupied by the IP header (version info, length of
packet, id=0, fragment offset=0, checksum, from / to address, etc.)
- Then comes the UDP header (src / dest port, length, checksum)
- Actual payload
- Ethernet checksum
Now what's different for IP fragment:
- The IP header has id set to some value (same for all fragments),
offset is set appropriately (i.e. 0 for first fragment, following
according to size of other fragments), size is the length of the frame.
- UDP header is unchanged. I.e. length is according to full UDP
datagram, not just the part within the actual frame! But this is only
true within the first frame: all following frames don't have a valid
UDP-header at all.
The spidernet silicon seems to be quite intelligent: It's able to
compute (IP / UDP / Ethernet) checksums on the fly and tests if frames
are conforming to RFC -- at least conforming to RFC on complete frames.
But IP fragments are different as explained above:
I.e. for IP fragments containing part of a UDP datagram it sees
incompatible length in the headers for IP and UDP in the first frame
and, thus, skips this frame. But the content *is* correct for IP
fragments. For all following frames it finds (most probably) no valid
UDP header at all. But this *is* also correct for IP fragments.
The Linux IP-stack seems to be clever in this point. It expects the
spidernet to calculate the checksum (since the module claims to be able
to do so) and marks the skb's for "normal" frames accordingly
(ip_summed set to CHECKSUM_HW).
But for the IP fragments it does not expect the driver to be capable to
handle the frames appropriately. Thus all checksums are allready
computed. This is also flaged within the skb (ip_summed set to
CHECKSUM_NONE).
Unfortunately the spidernet driver ignores that hints. It tries to send
the IP fragments of UDP datagrams as normal UDP/IP frames. Since they
have different structure the silicon detects them the be not
"well-formed" and skips them.
The following one-liner against 2.6.21-rc2 changes this behavior. If the
IP-stack claims to have done the checksumming, the driver should not
try to checksum (and analyze) the frame but send it as is.
Signed-off-by: Norbert Eicker <n.eicker@fz-juelich.de>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove assumption that PHY interrupts use GPIOs 3 and 5.
Deal with PHY interrupts connected to any GPIO pins.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reuse the incoming skb when a clientless abort req is recieved.
The release of RDMA connections HW resources might be deferred in
low memory situations.
Ensure that no further activity is passed up to the RDMA driver
for these connections.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Nonpae guest pdes are shadowed by two pae ptes, so we double the offset
twice: once to account for the pte size difference, and once because we
need to shadow pdes for a single guest pde.
But when writing to the upper guest pde we also need to truncate the
lower bits, otherwise the multiply shifts these bits into the pde index
and causes an access to the wrong shadow pde. If we're at the end of the
page (accessing the very last guest pde) we can even overflow into the
next host page and oops.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The kernel ib_wc structure now uses a QP pointer, but the user space
equivalent uses a QP number instead. This means we can no longer use
a simple structure copy to copy stuff into user space.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't let userspace use the direct-physical-map L_key or R_key.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
At some point things changed so that all the affinity bits can be set,
but cpus_full() macro is not true. This caused problems with the unit
selection logic on multi-unit (board) configurations.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move the code that shuts down the IB link earlier in the unload
process, to be sure no new packets can arrive while we are unloading.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
To prevent random utility reads and writes of the diag interface to the
chip, we first require a handshake of reading from offset 0 and writing
to offset 0 before any other reads or writes can be done through the
diags device. Otherwise chip errors can be triggered.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the chip is no longer usable, LEDs should be turned off so system
can be found easily in the cluster.
Also some minor reorganizing so both chips print hardware error
message at same point and only if there were unrecovered errors
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Re-init of the kernel structures after a chip reset was leaving the
portdata structure for port zero in an inconsistent state, and a
pointer to it either stale (in re-init code) or NULL (in devdata)
Fixing the order of operations on this struct, and the condition for
interrupt access, prevents the crashes.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Due to a chip bug, the PIOAvail register is not always updated to
memory. This patch allows userspace to force an update.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In initialization, if we bailed at chip specific initialization, we
forgot to clean up the irq we had requested.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a bug where multicast packets without a GRH were not
being dropped as per the IB spec.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the module parameter "kpiobufs" is set too high, the calculation to
reset it to a sane value was incorrect.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix RDMA read response length checking for RDMA_READ_RESPONSE_ONLY to
allow a zero length response. RDMA read responses which don't match
the expected length or occur in response to some other operation
should generate a completion queue error (see table 56, ch. 9.9.2.3 in
the IB spec).
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Improve port-sharing performance by allowing any process to receive
packets from the shared hardware port under a spin lock for mutual
exclusion. Previously, one process was nominated as the master and
that process was responsible for receiving all packets from the shared
hardware port and either consuming them or forwarding them to their
destination. This led to starvation problems for other processes when
the master process was busy in computation phases.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The port sharing feature mixed kernel virtual addresses as well as
physical addresses for the offset used to describe the mmap address to
map the InfiniPath hardware into user space. This had a conflict on
powerpc. The new scheme converts it to a physical address so it
doesn't conflict with chip addresses and yet still fits in 40/44 bits
so it isn't truncated by 32-bit applications calling mmap64().
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a receive work request has been removed from the queue but has not
had a CQ entry generated for it and the QP is modified to the error
state, the completion entry generated is incorrect. This patch fixes
the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Code was converted from a &= ~mask to clear_bit, but the bit was left
shifted instead of being used directly, so we were either trashing
memory several pages away, or sometimes taking a kernel page fault on
an invalid page.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some types of packet errors are moderately common with longer IB
cables and large clusters, and are not reported with prints by other
IB HCA drivers. This suppresses those messages unless the new
__IPATH_ERRPKTDBG bit is set in ipath_debug. Reporting of temporarily
disabled frequent error interrupts was also made clearer
We also distinguish between chip errors, and bad packets sent or
received in the wording of the messages.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a number of bugs with updating the PSN for retries of
RC requests.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When switching to the QP error state, the completion queue entries
(error or flush) were not being generated correctly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_dbg doesn't need the same prefixes that printk does.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds support for multiple RDMA reads and atomics to be sent
before an ACK is required to be seen by the requester.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a post send is done in loopback and there is no receive queue
entry, the sending QP is put on a timeout list for a while so the
receiver has a chance to post a receive buffer. If the another post
send is done, the code incorrectly tried to put the QP on the timeout
list again an corrupted the timeout list. This eventually leads to a
spin lock deadlock NMI due to the timer function looping forever with
the lock held.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A silly programming error causes a CQ entry to not be generated if a
SRQ limit event is generated.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A recent change was made to allocate memory for a port after CPU
affinity is set. That change didn't account for subports and was
trying to allocate memory for the port twice.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The chip documentation on the expected TID vs eager TID parity error
bits was reversed from what was implemented in the RTL, for both
chips. This corrects the definitions.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The loop which initializes the user memory region from an array of
pages was using the wrong limit for the array. This worked OK when
dma_map_sg() returned the same number as the number of pages. This
patch fixes the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is a sticky state. It is useful for diagnosing problems with
boards versus cable/switch problems.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There's no point in printing the opcode field in the completion
handling debugging output, since the type of completion is already
printed at the beginning of the line. In fact the opcode field is not
even defined for completions with a status other than success.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The current ib_umad code never accesses bits past IB_UMAD_MAX_PORTS in
dev_map[]. We shouldn't declare it to be twice as big.
Pointed-out-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Since commit b1c1b6a3 ("IB/ipath: merge ipath_core and ib_ipath
drivers"), CONFIG_IPATH_CORE no longer exists, so there's no reason to
have a line for it in drivers/Makefile.
Pointed out by Robert P. J. Day <rpjday@mindspring.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There is a race between netlink_dump_start() and netlink_release()
that can lead to the situation when a netlink socket with non-zero
callback is freed.
Here it is:
CPU1: CPU2
netlink_release(): netlink_dump_start():
sk = netlink_lookup(); /* OK */
netlink_remove();
spin_lock(&nlk->cb_lock);
if (nlk->cb) { /* false */
...
}
spin_unlock(&nlk->cb_lock);
spin_lock(&nlk->cb_lock);
if (nlk->cb) { /* false */
...
}
nlk->cb = cb;
spin_unlock(&nlk->cb_lock);
...
sock_orphan(sk);
/*
* proceed with releasing
* the socket
*/
The proposal it to make sock_orphan before detaching the callback
in netlink_release() and to check for the sock to be SOCK_DEAD in
netlink_dump_start() before setting a new callback.
Signed-off-by: Denis Lunev <den@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>