Commit graph

119702 commits

Author SHA1 Message Date
Ben Hutchings
c78d0cf292 x86: don't allow nr_irqs > NR_IRQS
Impact: fix boot hang on 32-bit systems with more than 224 IO-APIC pins

On some 32-bit systems with a lot of IO-APICs probe_nr_irqs() can
return a value larger than NR_IRQS. This will lead to probe_irq_on()
overrunning the irq_desc array.

I hit this when running net-next-2.6 (close to 2.6.28-rc3) on a
Supermicro dual Xeon system.  NR_IRQS is 224 but probe_nr_irqs() detects
5 IOAPICs and returns 240.  Here are the log messages:

Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
Tue Nov  4 16:53:47 2008 IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x02] address[0xfec81000] gsi_base[24])
Tue Nov  4 16:53:47 2008 IOAPIC[1]: apic_id 2, version 32, address 0xfec81000, GSI 24-47
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x03] address[0xfec81400] gsi_base[48])
Tue Nov  4 16:53:47 2008 IOAPIC[2]: apic_id 3, version 32, address 0xfec81400, GSI 48-71
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x04] address[0xfec82000] gsi_base[72])
Tue Nov  4 16:53:47 2008 IOAPIC[3]: apic_id 4, version 32, address 0xfec82000, GSI 72-95
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x05] address[0xfec82400] gsi_base[96])
Tue Nov  4 16:53:47 2008 IOAPIC[4]: apic_id 5, version 32, address 0xfec82400, GSI 96-119
Tue Nov  4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
Tue Nov  4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Tue Nov  4 16:53:47 2008 Enabling APIC mode:  Flat.  Using 5 I/O APICs

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 07:23:21 +01:00
Harvey Harrison
1c1b777a56 powerpc: Use the new byteorder headers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 10:23:55 +11:00
Sebastian Siewior
5b4d218944 powerpc/boot: Allocate more memory for dtb
David Gibson suggested that since we are now unconditionally copying
the dtb into a malloc()ed buffer, it would be sensible to add a little
padding to the buffer at that point, so that further device tree
manipulations won't need to reallocate it.

This implements that suggestion.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:49:43 +11:00
Jon Tollefson
7d4320f3d5 powerpc: Hugetlb pgtable cache access cleanup
Andrew Morton suggested that using a macro that makes an array
reference look like a function call makes it harder to understand the
code.

This therefore removes the huge_pgtable_cache(psize) macro and
replaces its uses with pgtable_cache[HUGE_PGTABLE_INDEX(psize)].

Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:49:39 +11:00
Masakazu Mokuno
d4ad304841 powerpc/ps3: Fix memory leak in device init
Free dynamically allocated device data structures when device registration
fails.  This fixes memory leakage when the registration fails.

Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:49:35 +11:00
Paul Mackerras
3cc698789a powerpc: Eliminate unused do_gtod variable
Since we started using the generic timekeeping code, we haven't had a
powerpc-specific version of do_gettimeofday, and hence there is now
nothing that reads the do_gtod variable in arch/powerpc/kernel/time.c.
This therefore removes it and the code that sets it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:49:28 +11:00
Paul Mackerras
597bc5c00b powerpc: Improve resolution of VDSO clock_gettime
Currently the clock_gettime implementation in the VDSO produces a
result with microsecond resolution for the cases that are handled
without a system call, i.e. CLOCK_REALTIME and CLOCK_MONOTONIC.  The
nanoseconds field of the result is obtained by computing a
microseconds value and multiplying by 1000.

This changes the code in the VDSO to do the computation for
clock_gettime with nanosecond resolution.  That means that the
resolution of the result will ultimately depend on the timebase
frequency.

Because the timestamp in the VDSO datapage (stamp_xsec, the real time
corresponding to the timebase count in tb_orig_stamp) is in units of
2^-20 seconds, it doesn't have sufficient resolution for computing a
result with nanosecond resolution.  Therefore this adds a copy of
xtime to the VDSO datapage and updates it in update_gtod() along with
the other time-related fields.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:49:22 +11:00
Mark Nelson
c73049f6aa powerpc: Remove map_/unmap_single() from dma_mapping_ops
Now that all of the remaining dma_mapping_ops have had their
map_/unmap_single functions updated to become map/unmap_page
functions, there is no need to have the map_/unmap_single function
pointers in the dma_mapping_ops.

So, this removes them and also removes the code that does the checking
for which set of functions to use.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Acked-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:43:46 +11:00
Benjamin Herrenschmidt
7eef440a54 powerpc/pci: Cosmetic cleanups of pci-common.c
This does a few cosmetic cleanups, moving a couple of things around
but without actually changing what the code does.

(There is a minor change in ordering of operations in
pcibios_setup_bus_devices but it should have no impact).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:41:52 +11:00
Benjamin Herrenschmidt
fd6852c8fa powerpc/pci: Fix various pseries PCI hotplug issues
The pseries PCI hotplug code has a number of issues, ranging from
incorrect resource setup to crashes, depending on what is added,
when, whether it contains a bridge, etc etc....

This fixes a whole bunch of these, while actually simplifying the code
a bit, using more generic code in the process and factoring out common
code between adding of a PHB, a slot or a device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:31:52 +11:00
Benjamin Herrenschmidt
b5ae5f911d powerpc/pci: Make pcibios_allocate_bus_resources more robust
To properly fix PCI hotplug, it's useful to be able to make the fixup
passes on all devices whether they were just hot plugged or already
there.

However, pcibios_allocate_bus_resources() wouldn't cope well with
being called twice for a given bus.  This makes it ignore resources
that have already been allocated, along with adding a bit of debug
output.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:26:05 +11:00
Benjamin Herrenschmidt
57b066ff4e powerpc/eeh: Make EEH device add/remove more robust
To properly fix PCI hotplug, it's useful to be able to make the fixup
passes on all devices whether they were just hot plugged or already
there.

The EEH code however used to not be very friendly with calling
eeh_add_device_late() multiple time, and not very rebust in the way it
generally tests whether a device is in the expected state vs. the EEH
code.

This improves it, along with cleaning up a couple of debug printk's.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:25:15 +11:00
Benjamin Herrenschmidt
8b8da35804 powerpc/pci: Split pcibios_fixup_bus() into bus setup and device setup
Currently, our PCI code uses the pcibios_fixup_bus() callback, which
is called by the generic code when probing PCI buses, for two
different things.

One is to set up things related to the bus itself, such as reading
bridge resources for P2P bridges, fixing them up, or setting up the
iommu's associated with bridges on some platforms.

The other is some setup for each individual device under that bridge,
mostly setting up DMA mappings and interrupts.

The problem is that this approach doesn't work well with PCI hotplug
when an existing bus is re-probed for new children.  We fix this
problem by splitting pcibios_fixup_bus into two routines:

	pcibios_setup_bus_self() is now called to setup the bus itself

	pcibios_setup_bus_devices() is now called to setup devices

pcibios_fixup_bus() is then modified to call these two after reading the
bridge bases, and the OF based PCI probe is modified to avoid calling
into the first one when rescanning an existing bridge.

[paulus@samba.org - fixed eeh.h for 32-bit compile now that pci-common.c
is including it unconditionally.]

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-06 09:22:37 +11:00
Geert Uytterhoeven
dc8a0843a4 [JFFS2] fix race condition in jffs2_lzo_compress()
deflate_mutex protects the globals lzo_mem and lzo_compress_buf.  However,
jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the
compressed data from lzo_compress_buf.  Correct this by moving the mutex
unlock after the copy.

In addition, document what deflate_mutex actually protects.

Cc: stable@kernel.org
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-11-05 23:22:02 +01:00
Randy Dunlap
b0d5fdef52 net/9p: fix printk format warnings
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Roel Kluin
9f3e9bbe62 unsigned fid->fid cannot be negative
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Huang Weiyi
1558c62149 9p: rdma: remove duplicated #include
Removed duplicated #include <rdma/ib_verbs.h> in
net/9p/trans_rdma.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Tom Tucker
45abdf1c7b p9: Fix leak of waitqueue in request allocation path
If a T or R fcall cannot be allocated, the function returns an error
but neglects to free the wait queue that was successfully allocated.

If it comes through again a second time this wq will be overwritten
with a new allocation and the old allocation will be leaked.

Also, if the client is subsequently closed, the close path will
attempt to clean up these allocations, so set the req fields to
NULL to avoid duplicate free.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker
82b189eaaf 9p: Remove unneeded free of fcall for Flush
T and R fcall are reused until the client is destroyed. There does
not need to be a special case for Flush

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker
cac23d6505 9p: Make all client spin locks IRQ safe
The client lock must be IRQ safe. Some of the lock acquisition paths
took regular spin locks.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker
517ac45af4 9p: rdma: Set trans prior to requesting async connection ops
The RDMA connection manager is fundamentally asynchronous.
Since the async callback context is the client pointer, the
transport in the client struct needs to be set prior to calling
the first async op.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Vladimir Sokolovsky
7f3abf5c7c IB/mlx4: Set umem field to NULL in mlx4_ib_alloc_fast_reg_mr()
Set mr->umem to NULL in mlx4_ib_alloc_fast_reg_mr(). Otherwise
ib_dereg_mr() may invoke ib_umem_release() on a random pointer value
and get an oops.

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-11-05 10:56:52 -08:00
Mike Christie
939c2288c3 [SCSI] scsi_error regression: Fix idempotent command handling
Drivers want to be able to return DID_TRANSPORT_DISRUPTED and
have it do the right thing for commands like tape and passthrouh
as far as retries go. The LLDs previously used DID_BUS_BUSY or DID_ERROR
which followed the cmd->retries limit, but DID_TRANSPORT_DISRUPTED
was skipping that check so it could have caused a problem with tape
commands.

This patch has DID_TRANSPORT_DISRUPTED check the cmd->retries/cmd->allowed.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:48:23 -05:00
Christof Schmitt
d94ce6c6e9 [SCSI] zfcp: Fix hexdump data in s390dbf traces
Fix multiple problems found in the hexdump data:
 - length calculation was wrong, traces were incomplete
 - FC payloads were dumped in different record than the output
   function tried to read
 - minor fixes in output
 - allow complete RSCN traces (up to 1024 bytes according to spec)

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:55 -05:00
Martin Petermann
7ea633ffad [SCSI] zfcp: fix erp timeout cleanup for port open requests
If an open port fsf request times out (in erp) the
corresponding erp_action member of the fsf
request need to set to NULL. If the port structure
will be removed later-on there will be still a
reference in the fsf request to the non existing
erp_action otherwise.

Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:40 -05:00
Christof Schmitt
77fd9494bc [SCSI] zfcp: Wait for port scan to complete when setting adapter online
Attaching a unit immediately after setting the adapter online should
be possible. The problem right now is that the port_scan runs from a
workqueue and has not finished when the set_online call returns and
the sysfs structures for the ports are not available yet. Fix that by
waiting for the port scan to complete.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:19 -05:00
Christof Schmitt
adc90daffb [SCSI] zfcp: Fix cast warning
Fix leftover from last typecast patch:

drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_port_enqueue’:
drivers/s390/scsi/zfcp_aux.c:629: warning: format ‘%016llx’ expects
type ‘long long unsigned int’, but argument 3 has type ‘u64’

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:03 -05:00
Christof Schmitt
3765138ae9 [SCSI] zfcp: Fix request list handling in error path
Fix the handling of the request list in the error path:
 - Use irqsave for the lock as in the good path.
 - Before removing the request, check if it is still in the list, a
   call to dismiss_all might have changed the list in between.
 - zfcp_qdio_send does not change the queue counters on failure,
   trying revert something is wrong, so remove this.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:46:39 -05:00
Christof Schmitt
88f2a97787 [SCSI] zfcp: fix mempool usage for status_read requests
When allocating fsf requests without qtcb, store the pointer to the
mempool in the fsf requests for later call to mempool_free. This
codepath is only used by the status_read requests.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:45:07 -05:00
Heiko Carstens
45316a86a6 [SCSI] zfcp: fix req_list_locking.
The per adapter req_list_lock must be held with interrupts disabled, otherwise
we might end up with nice deadlocks as lockdep tells us (see below).

zfcp 0.0.1804: QDIO problem occurred.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.27-rc8-00035-g4a77035-dirty #86
---------------------------------------------------------
swapper/0 just changed the state of lock:
 (&adapter->erp_lock){++..}, at: [<00000000002c82ae>] zfcp_erp_adapter_reopen+0x4e/0x8c
but this lock took another, hard-irq-unsafe lock in the past:
 (&adapter->req_list_lock){-+..}

and interrupts could create inverse lock ordering between them.

[tons of backtraces, but only the interesting part follows]

the second lock's dependencies:
-> (&adapter->req_list_lock){-+..} ops: 2280627634176 {
   initial-use  at:
                        [<0000000000071f10>] __lock_acquire+0x504/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002cf684>] zfcp_fsf_req_dismiss_all+0x50/0x140
                        [<00000000002c87ee>] zfcp_erp_adapter_strategy_generic+0x66/0x3d0
                        [<00000000002c9498>] zfcp_erp_thread+0x88c/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
   in-softirq-W at:
                        [<0000000000072172>] __lock_acquire+0x766/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002ca73e>] zfcp_qdio_int_resp+0xbe/0x2ac
                        [<000000000027a1d6>] qdio_kick_inbound_handler+0x82/0xa0
                        [<000000000027daba>] tiqdio_inbound_processing+0x62/0xf8
                        [<0000000000047ba4>] tasklet_action+0x100/0x1f4
                        [<0000000000048b5a>] __do_softirq+0xae/0x154
                        [<0000000000021e4a>] do_softirq+0xea/0xf0
                        [<00000000000485de>] irq_exit+0xde/0xe8
                        [<0000000000268c64>] do_IRQ+0x160/0x1fc
                        [<00000000000261a2>] io_return+0x0/0x8
                        [<000000000001b8f8>] cpu_idle+0x17c/0x224
   hardirq-on-W at:
                        [<0000000000072190>] __lock_acquire+0x784/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d702c>] _spin_lock+0x5c/0x9c
                        [<00000000002caff6>] zfcp_fsf_req_send+0x3e/0x158
                        [<00000000002ce7fe>] zfcp_fsf_exchange_config_data+0x106/0x124
                        [<00000000002c8948>] zfcp_erp_adapter_strategy_generic+0x1c0/0x3d0
                        [<00000000002c98ea>] zfcp_erp_thread+0xcde/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
 }
 ... key      at: [<0000000000e356c8>] __key.26629+0x0/0x8

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmit@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:37 -05:00
Christof Schmitt
26816f1c2b [SCSI] zfcp: Dont clear reference from SCSI device to unit
It is possible that a remote port has a problem, the SCSI device gets
deleted after the rport timeout and then the timeout for pending SCSI
commands trigger an abort. For this case, don't delete the reference
from the SCSI device to the zfcp unit, so that we can still have the
reference to issue an abort request.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:15 -05:00
Andrew Vasquez
3869a17288 [SCSI] qla2xxx: Update version number to 8.02.01-k9.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:42:29 -05:00
Michael Reed
5bff55db3d [SCSI] qla2xxx: Return a FAILED status when abort mailbox-command fails.
Mike Reed noted
(https://bugzilla.novell.com/show_bug.cgi?id=421330) that the
driver was incorrectly returning a SUCCESS status if the driver's
request to the firmware to abort a command failed.  By doing so,
the mid-layer believed, incorrectly, that the command has
completed and has been returned (ultimately clearing
scsi_cmnd.request_buffer) yet the driver still has the command.
What should correctly happen is a mid-layer escalation
(device-reset, etc.) of recovery during which the driver will
eventually return the outstanding commands to the mid-layer.

Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:42:12 -05:00
Shyam Sundar
680d7db88a [SCSI] qla2xxx: Do not honour max_vports from firmware for 2G ISPs and below.
For 23XX ISPs, max_vports may return an invalid value.
Do not honour it.

Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:41:49 -05:00
Andrew Vasquez
737faece27 [SCSI] qla2xxx: Use pci_disable_rom() to manipulate PCI config space.
http://bugzilla.kernel.org/show_bug.cgi?id=9422

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:41:28 -05:00
Lalit Chandivade
821b399600 [SCSI] qla2xxx: Correct Atmel flash-part handling.
Use correct block size (4K) for erase command 0x20 for Atmel
Flash. Use dword addresses for determining sector boundary.

Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:41:06 -05:00
FUJITA Tomonori
6b0eea21ef [SCSI] megaraid: fix mega_internal_command oops
scsi_cmnd->cmnd was changed from a static array to a pointer post
2.6.25. It breaks mega_internal_command():

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

mega_internal_command() uses scsi_cmnd allocated internally so
scmd->cmnd is NULL here. This patch adds a static array for cdb to
adapter_t and uses it here. This also uses
scsi_allocate_command/scsi_free_command, the recommended way to
allocate struct scsi_cmnd since the driver might use sense_buffer in
struct scsi_cmnd.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
Acked-by: "Yang, Bo" <Bo.Yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:40:23 -05:00
Ingo Molnar
9fcd18c9e6 sched: re-tune balancing
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems

Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.

Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)

lat_ctx on a NUMA box is particularly happy about this change:

before:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.60
 |   2 5.70

after:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.65
 |   2 2.07

a 2.75x speedup.

pipe-test is similarly happy about it too:

 |  phoenix:~/sched-tests> ./pipe-test
 |   18.26 usecs/loop.
 |   14.70 usecs/loop.
 |   14.38 usecs/loop.
 |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
 |   8.63 usecs/loop.
 |   8.59 usecs/loop.
 |   9.03 usecs/loop.
 |   8.94 usecs/loop.
 |   8.96 usecs/loop.
 |   8.63 usecs/loop.

Also:

 - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
 - enable SD_WAKE_BALANCE on SMP domains

Sysbench+postgresql improves all around the board, quite significantly:

           .28-rc3-11474e2c  .28-rc3-11474e2c-tune
-------------------------------------------------
    1:             571              688    +17.08%
    2:            1236             1206    -2.55%
    4:            2381             2642    +9.89%
    8:            4958             5164    +3.99%
   16:            9580             9574    -0.07%
   32:            7128             8118    +12.20%
   64:            7342             8266    +11.18%
  128:            7342             8064    +8.95%
  256:            7519             7884    +4.62%
  512:            7350             7731    +4.93%
-------------------------------------------------
  SUM:           55412            59341    +6.62%

So it's a win both for the runup portion, the peak area and the tail.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 18:04:38 +01:00
Eric W. Biederman
467622ef2a [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)
For "unlock" cycles to 16bit devices in 8bit compatibility mode we need
to use the byte addresses 0xaaa and 0x555. These effectively match
the word address 0x555 and 0x2aa, except the latter has its low bit set.

Most chips don't care about the value of the 'A-1' pin in x8 mode,
but some -- like the ST M29W320D -- do. So we need to be careful to
set it where appropriate.

cfi_send_gen_cmd is only ever passed addresses where the low byte
is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are
affected by this patch, by masking in the extra low bit when the device
is known to be in compatibility mode.

[dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa]
v4: Fix  stupid typo in cfi_build_cmd_addr that failed to compile
    I'm writing this patch way to late at night.
v3: Bring all of the work back into cfi_build_cmd_addr
    including calling of map_bankwidth(map) and cfi_interleave(cfi)
    So every caller doesn't need to.
v2: Only modified the address if we our device_type is larger than our
    bus width.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-11-05 14:40:25 +01:00
David S. Miller
518a09ef11 tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
Vito Caputo noticed that tcp_recvmsg() returns immediately from
partial reads when MSG_PEEK is used.  In particular, this means that
SO_RCVLOWAT is not respected.

Simply remove the test.  And this matches the behavior of several
other systems, including BSD.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 03:36:01 -08:00
Benjamin Herrenschmidt
ab56ced9c5 powerpc/pci: Remove pcibios_do_bus_setup()
The function pcibios_do_bus_setup() was used by pcibios_fixup_bus()
to perform setup that is different between the 32-bit and 64-bit
code.  This difference no longer exists, thus the function is removed
and the setup now done directly from pci-common.c.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:11:53 +11:00
Benjamin Herrenschmidt
5328032335 powerpc/pci: Use common PHB resource hookup
The 32-bit and 64-bit powerpc PCI code used to set up the resource
pointers of the root bus of a given PHB in completely different
places.

This unifies this in large part, by making 32-bit use a routine very
similar to what 64-bit does when initially scanning the PCI busses.

The actual setup of the PHB resources itself is then moved to a
common function in pci-common.c.

This should cause no functional change on 64-bit.  On 32-bit, the
effect is that the PHB resources are going to be setup a bit earlier,
instead of being setup from pcibios_fixup_bus().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:11:53 +11:00
Benjamin Herrenschmidt
b0494bc8ee powerpc/pci: Cleanup debug printk's
This removes the various DBG() macro from the powerpc PCI code and
makes it use the standard pr_debug instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:11:53 +11:00
Mark Nelson
25d6e2d7c5 powerpc: Update 64bit memcpy() using CPU_FTR_UNALIGNED_LD_STD
Update memcpy() to add two new feature sections: one for aligning the
destination before copying and one for copying using aligned load
and store doubles.

These new feature sections will only affect Power6 and Cell because
the CPU feature bit was only added to these two processors.

Power6 gets its best performance in memcpy() when aligning neither the
source nor the destination, while Cell gets its best performance when
just the destination is aligned. But in order to save on CPU feature
bits we can use the previously added CPU_FTR_CP_USE_DCBTZ feature bit
to differentiate between Power6 and Cell (because CPU_FTR_CP_USE_DCBTZ
was added to Cell but not Power6).

The first feature section acts to nop out the branch that takes us to
the code that aligns us to an eight byte boundary for the destination.
We only want to nop out this branch on Power6.

So the ALT_FTR_SECTION_END() for this feature section creates a test
mask of the two feature bits ORed together and provides an expected
result of just CPU_FTR_UNALIGNED_LD_STD, thus we nop out the branch
if we're on a CPU that has CPU_FTR_UNALIGNED_LD_STD set and
CPU_FTR_CP_USE_DCBTZ unset.

For the second feature section added, if we're on a CPU that has the
CPU_FTR_UNALIGNED_LD_STD bit set then we don't want to do the copy
with aligned loads and stores (and the appropriate shifting left and
right instructions), so we want to nop out the branch to
.Lsrc_unaligned.

The andi. used for this branch is moved to just above the branch
because this allows us to nop out both instructions with just one
feature section which gives us better performance and doesn't hurt
readability which two separate feature sections did.

Moving the andi. to just above the branch doesn't have any noticeable
negative effect on the remaining 64bit processors (the ones that
didn't have this feature bit added).

On Cell this simple modification results in an improvement to measured
memcpy() bandwidth of up to 50% in the hot cache case and up to 15% in
the cold cache case.

On Power6 we get memory bandwidth results that are up to three times
faster in the hot cache case and up to 50% faster in the cold cache
case.

Commit 2a9294369b ("powerpc: Add new CPU
feature: CPU_FTR_CP_USE_DCBTZ") was where CPU_FTR_CP_USE_DCBTZ was
added.

To say that Cell gets its best performance in memcpy() with just the
destination aligned is true but only for the reason that the indirect
shift and rotate instructions, sld and srd, are microcoded on Cell.
This means that either the destination or the source can be aligned,
but not both, and seeing as we get better performance with the
destination aligned we choose this option.

While we're at it make a one line change from cmpldi r1,... to
cmpldi cr1,... for consistency.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:29 +11:00
Mark Nelson
4ec577a289 powerpc: Add new CPU feature: CPU_FTR_UNALIGNED_LD_STD
Add a new CPU feature bit, CPU_FTR_UNALIGNED_LD_STD, to be added
to the 64bit powerpc chips that can do unaligned load double and
store double without any performance hit.

This is added to Power6 and Cell and will be used in the next commit
to disable the code that gets the destination address aligned on
those CPUs where doing that doesn't improve performance.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:28 +11:00
Brian King
409001948d powerpc: Update page-in counter for CMM
A new field has been added to the VPA as a method for the client OS to
communicate to firmware the number of page-ins it is performing when
running collaborative memory overcommit.  The hypervisor will use this
information to better determine if a partition is experiencing memory
pressure and needs more memory allocated to it.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:28 +11:00
Sebastien Dugue
1ef8014deb powerpc/pseries: Fix getting the server number size
The 'ibm,interrupt-server#-size' properties are not in the cpu nodes,
which is where we currently look for them, but rather live under the
interrupt source controller nodes (which have "ibm,ppc-xics" in their
compatible property).

This moves the code that looks for the ibm,interrupt-server#-size
properties from xics_update_irq_servers() into xics_init_IRQ().

Also this adds a check for mismatched sizes across the interrupt
source controller nodes.  Not sure this is necessary as in this case
the firmware might be seriously busted.

This property only appears on POWER6 boxes and is only used in the
set-indicator(gqirm) call, and apparently firmware currently ignores
the value we pass.  Nevertheless we need to fix it in case future
firmware versions use it.

Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:28 +11:00
Anton Vorontsov
691de57679 powerpc: Remove device_type = "rtc" properties in .dts files
We don't want to encourage the device_type usage.  It isn't used in
the code, so we can simply remove it from the dts files.

Suggested-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:28 +11:00
Benjamin Herrenschmidt
a6a8e009b1 powerpc: Silence software timebase sync
When no hardware method is provided to sync the timebase registers
across the machine, and the platform doesn't sync them for us, then we
use a generic software implementation.  Currently, the code for that
has many printks, and they don't have log levels.  Most of the printks
are only useful for debugging the code, and since we haven't had any
problems with it for years, this turns them into pr_debug.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:28 +11:00
Benjamin Herrenschmidt
1fd0f52583 powerpc: Fix domain numbers in /proc on 64-bit
The code to properly expose domain numbers in /proc is somewhat
bogus on ppc64 as it depends on the "buid" field being non-0,
but that field is really pseries specific.

This removes that code and makes ppc64 use the same code as 32-bit
which effectively decides whether to expose domains based on
ppc_pci_flags set by the platform, and sets the default for 64-bit
to enable domains and enable compatibility for domain 0 (which
strips the domain number for domain 0 to help with X servers).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:27 +11:00