The iseries_veth driver can attach to multiple vlans, which correspond to
multiple net devices. However there is only 1 connection between each LPAR,
so the connection structure may be shared by multiple net devices.
This makes module removal messy, because we can't deallocate the connections
until we know there are no net devices still using them. The solution is to
use ref counts on the connections, so we can delete them (actually stop) as
soon as the ref count hits zero.
This patch fixes (part of) a bug we were seeing with IPv6 sending probes to
a dead LPAR, which would then hang us forever due to leftover skbs.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This patch makes veth_init_connection() and veth_destroy_connection()
symmetrical in that they allocate/deallocate the same data.
Currently if there's an error while initialising connections (ie. ENOMEM)
we call veth_module_cleanup(), however this will oops because we call
driver_unregister() before we've called driver_register(). I've never seen
this actually happen though.
So instead we explicitly call veth_destroy_connection() for each connection,
any that have been set up will be deallocated.
We also fix a potential leak if vio_register_driver() fails.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The iseries_veth driver unconditionally calls dma_unmap_single() even
when the corresponding dma_map_single() may have failed.
Rework the code a bit to keep the return value from dma_unmap_single()
around, and then check if it's a dma_mapping_error() before we do
the dma_unmap_single().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The iseries_veth driver uses atomic ops to manipulate the in_use field of
one of its per-connection structures. However all references to the
flag occur while the connection's lock is held, so the atomic ops aren't
necessary.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The iseries_veth driver keeps a stack of messages for each connection
and a lock to protect the stack. However there is also a per-connection lock
which makes the message stack lock redundant.
Remove the message stack lock and document the fact that callers of the
stack-manipulation functions must hold the connection's lock.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Due to a logic bug, once promiscuous mode is enabled in the iseries_veth
driver it is never disabled.
The driver keeps two flags, promiscuous and all_mcast which have exactly the
same effect. This is because we only ever receive packets destined for us,
or multicast packets. So consolidate them into one promiscuous flag for
simplicity.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The iseries_veth driver contains a state machine which is used to manage
how connections are setup and neogotiated between LPARs.
If one side of a connection resets for some reason, the two LPARs can get
stuck in a race to re-setup the connection. This can lead to the connection
being declared dead by one or both ends. In practice the connection is
declared dead by one or both ends approximately 8/10 times a connection is
reset, although it is rare for connections to be reset.
(an example here: http://michael.ellerman.id.au/files/misc/veth-trace.html)
The core of the problem is that the end that resets the connection doesn't
wait for the other end to become aware of the reset. So the resetting end
starts setting the connection back up, and then receives a reset from the
other end (which is the response to the initial reset). And so on.
We're severely limited in what we can do to fix this. The protocol between
LPARs is essentially fixed, as we have to interoperate with both OS/400
and old Linux drivers. Which also means we need a fix that only changes the
code on one end.
The only fix I've found given that, is to just blindly sleep for a bit when
resetting the connection, in the hope that the other end will get itself
sorted. Needless to say I'd love it if someone has a better idea.
This does work, I've so far been unable to get it to break, whereas without
the fix a reset of one end will lead to a dead connection ~8/10 times.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The iseries_veth driver has a timer which we use to send acks. When the
connection is reset or stopped we need to delete the timer.
Currently we only call del_timer() when resetting a connection, which means
the timer might run again while the connection is being re-setup. As it turns
out that's ok, because the flags the timer consults have been reset.
It's cleaner though to call del_timer_sync() once we've dropped the lock,
although the timer may still run between us dropping the lock and calling
del_timer_sync(), but as above that's ok.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Currently the iseries_veth driver prints the file name and line number in its
error messages. This isn't very useful for most users, so just print
"iseries_veth: message" instead.
- convert uses of veth_printk() to veth_debug()/veth_error()/veth_info()
- make terminology consistent, ie. always refer to LPAR not lpar
- be consistent about printing return codes as %d not %x
- make format strings fit in 80 columns
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This kills warnings when building drivers/ide/ide-iops.c
and puts us in-line with what other platforms do here.
Signed-off-by: David S. Miller <davem@davemloft.net>
Change sn2-specific calls into generic functions. Without this change
the uncached allocator will not work on non-sn2 platforms.
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch from Sascha Hauer
This patch adds support for setting and getting RTS / CTS via
set_mtctrl / get_mctrl functions.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from David Vrabel
Correct the ioread* and iowrite* functions. In particular, add an offset to the cookie in ioport_map so we can map I/O port ranges starting from 0 (0 is for reporting errors).
Signed-off-by: David Vrabel <dvrabel@arcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Catalin Marinas
Minor compilation error fix.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Steve Longerbeam
Adds an implementation of unaligned LDRD and STRD fixups.
Also fixes a bug where do_alignment() would misinterpret and
fixup an unaligned LDRD/STRD as LDRH/STRH, causing memory
corruption.
This is the same as Patch #2867/1, but with minor whitespace
and comments changes, plus a check for arch-level >= v5TE
before printing ai_dword count in proc_alignment_read().
Signed-off-by: Steve Longerbeam <stevel@mwwireless.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- notifying the PROM of specific features that are supported by the OS.
This is used to enable PROM feature if and only if the corresponding
feature is implemented in the OS
- fetch feature sets that are supported by the current PROM. This allows
the OS to selectively enable features when the PROM support is available.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
I've solved the problem I was having with the simulator and not
booting Debian.
The problem is that the number of bits for the virtual linear array
short-format VHPT (Virtually mapped linear page table, VMLPT for
short) is being tested incorrectly.
There are two problems:
1. The PAL call that should tell the kernel the size of the
virtual address space isn't implemented for the simulator, so
the kernel uses the default 50. This is addressed separately
in dc90e95f31
2. In arch/ia64/mm/init.c there's code to calcualte the size
of the VMLPT based on the number of implemented virtual address
bits and the page size. It checks to see if the VMLPT base
address overlaps the top of the mapped region, but this check
doesn't allow for the address space hole, and in fact will
never trigger.
Here's an alternative test and panic, that I think is more accurate.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Not all of the PAL VM calls are implemented for the SKI simulator.
Don't just give up if one fails, print information from the calls
that succeed.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch implements PAL_VM_SUMMARY (and PAL_MEM_ATTRIB for good
measure) and pretends that the simulated machine is a McKinley.
Some extra comments and clean-up by Tony Luck.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The start_tx and stop_tx methods were passed a flag to indicate
whether the start/stop was from the tty start/stop callbacks, and
some drivers used this flag to decide whether to ask the UART to
immediately stop transmission (where the UART supports such a
feature.)
There are other cases when we wish this to occur - when CTS is
lowered, or if we change from soft to hard flow control and CTS
is inactive. In these cases, this flag was false, and we would
allow the transmitter to drain before stopping.
There is really only one case where we want to let the transmitter
drain before disabling, and that's when we run out of characters
to send.
Hence, re-jig the start_tx and stop_tx methods to eliminate this
flag, and introduce new functions for the special "disable and
allow transmitter to drain" case.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
timer_dyn_reprogram() fails with an OOPS if the
configuration for CONFIG_NO_IDLE_HZ is enabled, and
the system has no support for it.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The idea behind a RAID class is to provide a uniform interface to all
RAID subsystems (both hardware and software) in the kernel.
To do that, I've made this class a transport class that's entirely
subsystem independent (although the matching routines have to match per
subsystem, as you'll see looking at the code). I put it in the scsi
subdirectory purely because I needed somewhere to play with it, but it's
not a scsi specific module.
I used a fusion raid card as the test bed for this; with that kind of
card, this is the type of class output you get:
jejb@titanic> ls -l /sys/class/raid_devices/20\:0\:0\:0/
total 0
lrwxrwxrwx 1 root root 0 Aug 16 17:21 component-0 -> ../../../devices/pci0000:80/0000:80:04.0/host20/target20:1:0/20:1:0:0/
lrwxrwxrwx 1 root root 0 Aug 16 17:21 component-1 -> ../../../devices/pci0000:80/0000:80:04.0/host20/target20:1:1/20:1:1:0/
lrwxrwxrwx 1 root root 0 Aug 16 17:21 device -> ../../../devices/pci0000:80/0000:80:04.0/host20/target20:0:0/20:0:0:0/
-r--r--r-- 1 root root 16384 Aug 16 17:21 level
-r--r--r-- 1 root root 16384 Aug 16 17:21 resync
-r--r--r-- 1 root root 16384 Aug 16 17:21 state
So it's really simple: for a SCSI device representing a hardware raid,
it shows the raid level, the array state, the resync % complete (if the
state is resyncing) and the underlying components of the RAID (these are
exposed in fusion on the virtual channel 1).
As you can see, this type of information can be exported by almost
anything, including software raid.
The more difficult trick, of course, is going to be getting it to
perform configuration type actions with writable attributes.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Since the attribute container deletes from a klist while it's walking
it, it is vulnerable to the problem (and fix) here:
http://marc.theaimsgroup.com/?l=linux-scsi&m=112485448830217
The attached fixes this (but won't compile without the above).
It also fixes the logical reversal in the traversal loop which meant
that we were never actually traversing the loop to hit this bug in the
first place.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
One of the changes in the attribute_container code in the scsi-misc tree
was to add a lock to protect the list of devices per container. This,
unfortunately, leads to potential scheduling while atomic problems if
there's a sleep in the function called by a trigger.
The correct solution is to use the kernel klist infrastructure instead
which allows lockless traversal of a list.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Peter Staubach pointed out that it is not correct to check
current->personality & PER_LINUX32 (this will have false
hits on several other personality values).
Signed-off-by: Tony Luck <tony.luck@intel.com>
The following patch fixes two warnings in arch/ppc/syslib/m8xx_setup.c
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I had some time to think about PCI assign issues in 2.6.13-rc series.
The major problem here is that we call pci_assign_unassigned_resources()
way too early - at subsys_initcall level. Therefore we give no chances
to ACPI and PnP routines (called at fs_initcall level) to reserve their
respective resources properly, as the comments in drivers/pnp/system.c
and drivers/acpi/motherboard.c suggest:
/**
* Reserve motherboard resources after PCI claim BARs,
* but before PCI assign resources for uninitialized PCI devices
*/
So I moved the pci_assign_unassigned_resources() call to
pcibios_assign_resources() (fs_initcall), which should hopefully fix a
lot of problems and make PCIBIOS_MIN_IO tweaks unnecessary.
Other changes:
- remove resource assignment code from pcibios_assign_resources(), since
it duplicates pci_assign_unassigned_resources() functionality and
actually does nothing in 2.6.13;
- modify ROM assignment code as per Ben's suggestion: try to use firmware
settings by default (if PCI_ASSIGN_ROMS is not set);
- set CARDBUS_IO_SIZE back to 4K as it's a wonderful stress test for
various setups.
Confirmed by Tero Roponen <teanropo@cc.jyu.fi> (who had problems with
the 4kB CardBus IO size previously).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
USB generic driver
When a USB error occurs that might indicate that the device has been
unplugged, don't resubmit the URB immediately to prevent flooding the
log with error messages before khubd has us disconnect()ed.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Don't bother calling a hook, to call our own module, to call a helper
than simply calls ionumap().
If you unroll all that convolution, you get a simple kfree()+iounmap()
pair of calls.
ATAPI is getting close to being ready. To increase exposure, we enable
the code in the upstream kernel, but default it to off (present
behavior). Users must pass atapi_enabled=1 as a module option (if
module) or on the kernel command line (if built in) to turn on
discovery of their ATAPI devices.