Commit graph

240511 commits

Author SHA1 Message Date
Yanqing_Liu@Dell.com
bc898c97f7 [SCSI] scsi_dh_rdac: Add MD36xxf into device list
This patch is to add Dell MD36xxf array into the RDAC handler device list.

Singed-off-by: Yanqing Liu <Yanqing_Liu@Dell.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 19:05:07 -05:00
Mark Rustad
0c0217b016 net: dcbnl: Add IEEE app selector value definitions
This adds defines for the app selector values currently
defined in the IEEE 802.1Qaz specification.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 17:02:43 -07:00
Mark Rustad
171f20e93b net: dcbnl: Fix misspellings
Fix a few spelling errors in dcbnl.h.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 17:02:42 -07:00
Mark Rustad
698e1d23cf net: dcbnl: Update copyright dates
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 17:02:42 -07:00
Douglas Gilbert
32f7ef7358 [SCSI] scsi_debug: add consecutive medium errors
A useful test case for error recovery is multiple,
consecutive medium errors. When scsi_debug is started
with "opts=2" a MEDIUM ERROR is generated when block
0x1234 (4660) is read. The patch extends that to
10 consecutive blocks from 0x1234 (i.e. blocks 4660 to
4669 inclusive).

[0:0:0:0]  disk  ATA    INTEL SSD  2CV1 /dev/sda /dev/sg0 80.0GB
[10:0:0:0] disk  Linux  scsi_debug 0004 /dev/sdb /dev/sg1 1.09TB

Output file not specified so no copy, just reading input
 >> unrecovered read error at blk=4660, substitute zeros
...
 >> unrecovered read error at blk=4669, substitute zeros
4670+10 records in
0+0 records out
10 unrecovered read errors
lowest unrecovered read lba=4660, highest unrecovered lba=4669
time to read data: 0.047943 secs at 49.87 MB/sec

BTW Change /dev/sg1 (bsg device works just as well) to
/dev/sdb to see why, with faulty media, you do not want
to use the block layer interface. Reason: time block
layer takes to do useless retries and collateral damage
to data in its 4 KB blocks (O_DIRECT mitigates the
latter).

ChangeLog:
    - extend opts=2 medium error generation at block
      0x1234 to 10 consecutive blocks (i.e. blocks
      0x1234 to 0x123d).

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:59:57 -05:00
Dave Airlie
235b87afbc Merge remote branch 'nouveau/drm-nouveau-next' of ../drm-nouveau-next into drm-core-next
* 'nouveau/drm-nouveau-next' of ../drm-nouveau-next:
  drm/nouveau: fix __nouveau_fence_wait performance
  drm/nv40: attempt to reserve just enough vram for all 32 channels
  drm/nv50: check for vm traps on every gr irq
  drm/nv50: decode vm faults some more
  drm/nouveau: add nouveau_enum_find() util function
  drm/nouveau: properly handle pushbuffer check failures
  drm/nvc0: remove vm hack forcing large/small pages to not share a PDE
2011-03-15 09:59:31 +10:00
Daniel Lezcano
12a2856b60 macvlan : fix checksums error when we are in bridge mode
When the lower device has offloading capabilities, the packets checksums
are not computed. That leads to have any macvlan port in bridge mode to
not work because the packets are dropped due to a bad checksum.

If the macvlan is in bridge mode, the packet is forwarded to another
macvlan port and reach the network stack where it looks for a checksum
but this one was not computed due to the offloading of the lower device.
In this case, we have to set the packet with CHECKSUM_UNNECESSARY
when it is forwarded to a bridged port and restore the previous value of
ip_summed when the packet goes to the lowerdev.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Andrian Nord <nightnord@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 16:54:44 -07:00
Jiri Pirko
cc8bdf0623 fcoe: correct checking for bonding
Check for bonding master and refuse to use that.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 16:51:46 -07:00
Domenico Andreoli
2ce8c07d63 CS89x0: Add networking support for QQ2440
QQ2440 is only another non-ISA board using CS89x0. This patch adds the
minimum bits required to make QQ2440 work with CS89x0.

Signed-off-by: Domenico Andreoli <cavokz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 16:49:28 -07:00
Domenico Andreoli
d181a6171e CS89x0: Finish transition to CS89x0_NONISA_IRQ
CS89x0_NONISA_IRQ is selected by all those non-ISA boards which use
CS89x0. This patch only cleans the last bits left after its introduction.

Signed-off-by: Domenico Andreoli <cavokz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 16:49:28 -07:00
James Bottomley
a82058a730 [SCSI] libsas: fix ata list corruption issue
I think this stems from a misunderstanding of how the ata error handler
works.  ata_scsi_cmd_error_handler() gets called with a passed in list
of commands to handle.  However, that list may still not be empty when
it exits.  The command ata_scsi_port_error_handler() must be called
(which takes no list) before the list will be completely emptied.  This
bites the sas error handler because the two are called from different
functions and the original list has gone out of scope before
ata_scsi_port_error_handler() is called. leading to some commands
dangling on bare stack, which is a potential memory corruption issue.
Fix this by manually deleting all outstanding commands from the on-stack
list before it goes out of scope.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:46:25 -05:00
Rafael J. Wysocki
bea3864fb6 PM / Hibernate: Reduce autotuned default image size
The hibernate image size autotuning mechanism sets the default
image size to 5/2 of the total system RAM, but it is reported
that on some systems device drivers allocate substantial
amounts of memory during suspend and the creation of the image
fails as a result (too little memory is preallocated).

Modify the autotuning mechanism to use 1/3 instead of 2/5 of RAM
as the default image size, which is reported to be sufficient for
the affected systems.

References: https://bugzilla.kernel.org/show_bug.cgi?id=30482
Reported-and-tested-by: Martin Steigerwald <Martin@Lichtvoll.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:45:46 +01:00
Stephen M. Cameron
941b1cdae8 [SCSI] hpsa: export resettable host attribute
This attribute, requested by Redhat, allows kexec-tools to know
whether the controller can honor the reset_devices kernel parameter
and actually reset the controller.  For kdump to work properly it
is necessary that the reset_devices parameter be honored.  This
attribute enables kexec-tools to warn the user if they attempt to
designate a non-resettable controller as the dump device.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:44:41 -05:00
Rafael J. Wysocki
40dc166cb5 PM / Core: Introduce struct syscore_ops for core subsystems PM
Some subsystems need to carry out suspend/resume and shutdown
operations with one CPU on-line and interrupts disabled.  The only
way to register such operations is to define a sysdev class and
a sysdev specifically for this purpose which is cumbersome and
inefficient.  Moreover, the arguments taken by sysdev suspend,
resume and shutdown callbacks are practically never necessary.

For this reason, introduce a simpler interface allowing subsystems
to register operations to be executed very late during system suspend
and shutdown and very early during resume in the form of
strcut syscore_ops objects.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-15 00:43:46 +01:00
Thomas Renninger
f9b9e806ae PM QoS: Make pm_qos settings readable
I have a machine where entering deep C-states broke.
pm_qos was a hot candidate, but I couldn't find any way to double
check without the need of recompiling.

While in this case it was a driver bug (ath9k):
https://bugzilla.kernel.org/show_bug.cgi?id=27532

powertop or others may want to read out cpu_dma_latency
restrictions which could be the cause of preventing a machine
entering deeper C-states.

Output with this patch:

# default value of 2000 * USEC_PER_SEC (0x77359400)
cat /dev/network_latency |hexdump
0000000 9400 7735
0000004

# value of 55 us which is the reason for not entering C2
cat /dev/cpu_dma_latency |hexdump
0000000 0037 0000
0000004

There is no reason to hide this info -> make pm_qos files readable.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:18 +01:00
Nishanth Menon
7ae4961878 PM / OPP: opp_find_freq_exact() documentation fix
opp_find_freq_exact() documentation has is_available instead
of available. This also fixes warning with the kernel-doc:
scripts/kernel-doc drivers/base/power/opp.c >/dev/null
Warning(drivers/base/power/opp.c:246): No description found for parameter 'available'
Warning(drivers/base/power/opp.c:246): Excess function parameter 'is_available' description in 'opp_find_freq_exact'

Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:17 +01:00
Alexandre Courbot
a8b7228cdc PM: Documentation/power/states.txt: fix repetition
Remove repetition of "called swsusp".

Signed-off-by: Alexandre Courbot <gnurou@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:17 +01:00
Rafael J. Wysocki
9659cc0678 PM: Make system-wide PM and runtime PM treat subsystems consistently
The code handling system-wide power transitions (eg. suspend-to-RAM)
can in theory execute callbacks provided by the device's bus type,
device type and class in each phase of the power transition.  In
turn, the runtime PM core code only calls one of those callbacks at
a time, preferring bus type callbacks to device type or class
callbacks and device type callbacks to class callbacks.

It seems reasonable to make them both behave in the same way in that
respect.  Moreover, even though a device may belong to two subsystems
(eg. bus type and device class) simultaneously, in practice power
management callbacks for system-wide power transitions are always
provided by only one of them (ie. if the bus type callbacks are
defined, the device class ones are not and vice versa).  Thus it is
possible to modify the code handling system-wide power transitions
so that it follows the core runtime PM code (ie. treats the
subsystem callbacks as mutually exclusive).

On the other hand, the core runtime PM code will choose to execute,
for example, a runtime suspend callback provided by the device type
even if the bus type's struct dev_pm_ops object exists, but the
runtime_suspend pointer in it happens to be NULL.  This is confusing,
because it may lead to the execution of callbacks from different
subsystems during different operations (eg. the bus type suspend
callback may be executed during runtime suspend of the device, while
the device type callback will be executed during system suspend).

Make all of the power management code treat subsystem callbacks in
a consistent way, such that:
(1) If the device's type is defined (eg. dev->type is not NULL)
    and its pm pointer is not NULL, the callbacks from dev->type->pm
    will be used.
(2) If dev->type is NULL or dev->type->pm is NULL, but the device's
    class is defined (eg. dev->class is not NULL) and its pm pointer
    is not NULL, the callbacks from dev->class->pm will be used.
(3) If dev->type is NULL or dev->type->pm is NULL and dev->class is
    NULL or dev->class->pm is NULL, the callbacks from dev->bus->pm
    will be used provided that both dev->bus and dev->bus->pm are
    not NULL.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
Reasoning-sounds-sane-to: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-15 00:43:17 +01:00
Jan Beulich
cf4fb80ca3 PM: Simplify kernel/power/Kconfig
'n' defaults are pretty pointless and actually bogus when used with
prompt-less config options.

The "bool"/"default y" pair with no prompt can be expressed more
compactly using def_bool.

[rjw: Rebased on top of earlier patches modifying this file.]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:17 +01:00
Rafael J. Wysocki
7538e3db6e PM: Add support for device power domains
The platform bus type is often used to handle Systems-on-a-Chip (SoC)
where all devices are represented by objects of type struct
platform_device.  In those cases the same "platform" device driver
may be used with multiple different system configurations, but the
actions needed to put the devices it handles into a low-power state
and back into the full-power state may depend on the design of the
given SoC.  The driver, however, cannot possibly include all the
information necessary for the power management of its device on all
the systems it is used with.  Moreover, the device hierarchy in its
current form also is not suitable for representing this kind of
information.

The patch below attempts to address this problem by introducing
objects of type struct dev_power_domain that can be used for
representing power domains within a SoC.  Every struct
dev_power_domain object provides a sets of device power
management callbacks that can be used to perform what's needed for
device power management in addition to the operations carried out by
the device's driver and subsystem.

Namely, if a struct dev_power_domain object is pointed to by the
pwr_domain field in a struct device, the callbacks provided by its
ops member will be executed in addition to the corresponding
callbacks provided by the device's subsystem and driver during all
power transitions.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-and-acked-by: Kevin Hilman <khilman@ti.com>
2011-03-15 00:43:16 +01:00
Rafael J. Wysocki
6831c6edc7 PM: Drop pm_flags that is not necessary
The variable pm_flags is used to prevent APM from being enabled
along with ACPI, which would lead to problems.  However, acpi_init()
is always called before apm_init() and after acpi_init() has
returned, it is known whether or not ACPI will be used.  Namely, if
acpi_disabled is not set after acpi_init() has returned, this means
that ACPI is enabled.  Thus, it is sufficient to check acpi_disabled
in apm_init() to prevent APM from being enabled in parallel with
ACPI.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Len Brown <len.brown@intel.com>
2011-03-15 00:43:16 +01:00
Rafael J. Wysocki
e866500247 PM: Allow pm_runtime_suspend() to succeed during system suspend
The dpm_prepare() function increments the runtime PM reference
counters of all devices to prevent pm_runtime_suspend() from
executing subsystem-level callbacks.  However, this was supposed to
guard against a specific race condition that cannot happen, because
the power management workqueue is freezable, so pm_runtime_suspend()
can only be called synchronously during system suspend and we can
rely on subsystems and device drivers to avoid doing that
unnecessarily.

Make dpm_prepare() drop the runtime PM reference to each device
after making sure that runtime resume is not pending for it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
2011-03-15 00:43:16 +01:00
Rafael J. Wysocki
88a6f33e4d PM: Clean up PM_TRACE dependencies and drop unnecessary Kconfig option
CONFIG_PM_SLEEP_ADVANCED_DEBUG is not used any more, so drop it
and CONFIG_CAN_PM_TRACE need not depend on EXPERIMENTAL, so remove
that dependency.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:15 +01:00
Rafael J. Wysocki
aa33860158 PM: Remove CONFIG_PM_OPS
After redefining CONFIG_PM to depend on (CONFIG_PM_SLEEP ||
CONFIG_PM_RUNTIME) the CONFIG_PM_OPS option is redundant and can be
replaced with CONFIG_PM.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:15 +01:00
Rafael J. Wysocki
196ec24322 PM: Reorder power management Kconfig options
Reorder configuration options in kernel/power/Kconfig so that
the options depended on are at the top of the list.

This patch doesn't introduce any functional changes.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:15 +01:00
Rafael J. Wysocki
1eb208aea3 PM: Make CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME)
From the users' point of view CONFIG_PM is really only used for
making it possible to set CONFIG_SUSPEND, CONFIG_HIBERNATION,
CONFIG_PM_RUNTIME and (surprisingly enough) CONFIG_XEN_SAVE_RESTORE
(CONFIG_PM_OPP also depends on CONFIG_PM, but quite artificially).
However, both CONFIG_SUSPEND and CONFIG_HIBERNATION require platform
support (independent of CONFIG_PM) and it is not quite obvious that
CONFIG_PM has to be set for CONFIG_XEN_SAVE_RESTORE to be available.
Thus, from the users' point of view, it would be more logical to
automatically select CONFIG_PM if any of the above options depending
on it are set.

Make CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME),
which will cause it to be selected when any of CONFIG_SUSPEND,
CONFIG_HIBERNATION, CONFIG_PM_RUNTIME, CONFIG_XEN_SAVE_RESTORE is
set and will clarify its meaning.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:15 +01:00
Rafael J. Wysocki
cd51e61cf4 PM / ACPI: Remove references to pm_flags from bus.c
If direct references to pm_flags are removed from drivers/acpi/bus.c,
CONFIG_ACPI will not need to depend on CONFIG_PM any more.  Make that
happen.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Len Brown <len.brown@intel.com>
2011-03-15 00:43:15 +01:00
Rafael J. Wysocki
cb8f51bdad PM: Do not create wakeup sysfs files for devices that cannot wake up
Currently, wakeup sysfs attributes are created for all devices,
regardless of whether or not they are wakeup-capable.  This is
excessive and complicates wakeup device identification from user
space (i.e. to identify wakeup-capable devices user space has to read
/sys/devices/.../power/wakeup for all devices and see if they are not
empty).

Fix this issue by avoiding to create wakeup sysfs files for devices
that cannot wake up the system from sleep states (i.e. whose
power.can_wakeup flags are unset during registration) and modify
device_set_wakeup_capable() so that it adds (or removes) the relevant
sysfs attributes if a device's wakeup capability status is changed.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:14 +01:00
Rafael J. Wysocki
4681b17154 USB / Hub: Do not call device_set_wakeup_capable() under spinlock
A subsequent patch will modify device_set_wakeup_capable() in such
a way that it will call functions which may sleep and therefore it
shouldn't be called under spinlocks.  In preparation to that, modify
usb_set_device_state() to avoid calling device_set_wakeup_capable()
under device_state_lock.

Tested-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-15 00:43:14 +01:00
Mandeep Singh Baines
0295a34d61 PM: Use appropriate printk() priority level in trace.c
printk()s without a priority level default to KERN_WARNING. To reduce
noise at KERN_WARNING, this patch sets the priority level appriopriately
for unleveled printks()s. This should be useful to folks that look at
dmesg warnings closely.

Changed these messages to pr_info().

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:14 +01:00
Rafael J. Wysocki
790c7885a4 PM / Wakeup: Don't update events_check_enabled in pm_get_wakeup_count()
Since pm_save_wakeup_count() has just been changed to clear
events_check_enabled unconditionally before checking if there are
any new wakeup events registered since the last read from
/sys/power/wakeup_count, the detection of wakeup events during
suspend may be disabled, after it's been enabled, by writing a
"wrong" value back to /sys/power/wakeup_count.  For this reason,
it is not necessary to update events_check_enabled in
pm_get_wakeup_count() any more.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:14 +01:00
Rafael J. Wysocki
378eef99ad PM / Wakeup: Make pm_save_wakeup_count() work as documented
According to Documentation/ABI/testing/sysfs-power, the
/sys/power/wakeup_count interface should only make the kernel react
to wakeup events during suspend if the last write to it has been
successful.  However, if /sys/power/wakeup_count is written to two
times in a row, where the first write is successful and the second
is not, the kernel will still react to wakeup events during suspend
due to a bug in pm_save_wakeup_count().

Fix the bug by making pm_save_wakeup_count() clear
events_check_enabled unconditionally before checking if there are
any new wakeup events registered since the previous read from
/sys/power/wakeup_count.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:13 +01:00
Rafael J. Wysocki
023d377914 PM / Wakeup: Combine atomic counters to avoid reordering issues
The memory barrier in wakeup_source_deactivate() is supposed to
prevent the callers of pm_wakeup_pending() and pm_get_wakeup_count()
from seeing the new value of events_in_progress (0, in particular)
and the old value of event_count at the same time.  However, if
wakeup_source_deactivate() is executed by CPU0 and, for instance,
pm_wakeup_pending() is executed by CPU1, where both processors can
reorder operations, the memory barrier in wakeup_source_deactivate()
doesn't affect CPU1 which can reorder reads.  In that case CPU1 may
very well decide to fetch event_count before it's modified and
events_in_progress after it's been updated, so pm_wakeup_pending()
may fail to detect a wakeup event.  This issue can be addressed by
using a single atomic variable to store both events_in_progress
and event_count, so that they can be updated together in a single
atomic operation.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:13 +01:00
Stephen M. Cameron
3f5eac3a04 [SCSI] hpsa: move device attributes to avoid forward declarations
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:40:20 -05:00
Martin K. Petersen
5b94e23292 [SCSI] scsi_debug: Logical Block Provisioning (SBC3r26)
Update scsi_debug to support the Logical Block Provisioning commands and
bits as defined in SBC3r26. The old tp* parameters have been
transitioned to the new lbp* scheme found in the draft standard.

The old tpu option to enable UNMAP is now called lbpu. tpws to signal
support for WRITE SAME(16) with the UNMAP bit set is now lbpws. Support
for WRITE SAME(10) with the UNMAP bit set is also available using the
lpuws10 parameter.

Limiting the maximum number of blocks per WRITE SAME command has been
implemented and is available via the write_same_length module parameter.

As part of the renaming process the parameter lists have been sorted
alphabetically (request from Doug).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:38:44 -05:00
Martin K. Petersen
c98a0eb0e9 [SCSI] sd: Logical Block Provisioning update
SBC3r26 contains many changes to the Logical Block Provisioning
interfaces (formerly known as Thin Provisioning ditto). This patch
implements support for both the old and new schemes using the same
heuristic as before (whether the LBP VPD page is present).

The new code also allows the provisioning mode (i.e. choice of command)
to be overridden on a per-device basis via sysfs. Two additional modes
are supported in this version:

 - WRITE SAME(10) with the UNMAP bit set

 - WRITE SAME(10) without the UNMAP bit set. This allows us to support
   devices that predate the TP/LBP enhancements in SBC3 and which work
   by way zero-detection

Switching between modes has been consolidated in a helper function that
also updates the block layer topology according to the limitations of
the chosen command.

I experimented with trying WRITE SAME(16) if UNMAP fails, WRITE SAME(10)
if WRITE SAME(16) fails, etc. but found several devices that got
cranky. So for now we'll disable discard if one of the commands
fail. The user still has the option of selecting a different mode in
sysfs.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:37:34 -05:00
Martin K. Petersen
72f7d322fd [SCSI] Include protection operation in SCSI command trace
When debugging DIF/DIX it is very helpful to be able to see which DIX
operation is associated with the scsi_cmnd. Include the protection op in
the SCSI command trace.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:36:02 -05:00
scameron@beardog.cce.hp.com
9143a96122 [SCSI] hpsa: fix incorrect PCI IDs and add two new ones (2nd try)
My first attempt was botched, got the wrong PCI Device ID
(used PCI_DEVICE_ID_HP_CISSE, should have been PCI_DEVICE_ID_HP_CISSF)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:34:58 -05:00
Nicholas Bellinger
904f0bc482 [SCSI] target: Fix volume size misreporting for volumes > 2TB
the target infrastructure fails to send the correct conventional size
to READ_CAPACITY that force a retry with READ_CAPACITY_16, which reads
the capacity for devices > 2TB.  Fix by adding the correct return to
trigger RC(16).

Reported-by: Ben Jarvis <bjarvismn@gmail.com>
Signed-off-by: Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-14 18:31:08 -05:00
Denis Turischev
6ae705b23b pch_uart: reference clock on CM-iTC
Default clock source for UARTs on Topcliff is external UART_CLK.
On CM-iTC USB_48MHz is used instead. After VCO2PLL and DIV
manipulations UARTs will receive 192 MHz.
Clock manipulations on Topcliff are controlled in pch_phub.c

v2: redone against the linux-next tree
v3: redone against linux/kernel/git/next/linux-next.git snapshot

Signed-off-by: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-14 16:24:23 -07:00
Tomoya MORINAGA
1a738dcf6d pch_phub: add new device ML7213
Add ML7213 device information.
ML7213 is companion chip of Intel Atom E6xx series for IVI(In-Vehicle Infotainment).
ML7213 is completely compatible for Intel EG20T PCH.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-14 16:24:10 -07:00
Al Viro
f52e0c1130 New AT_... flag: AT_EMPTY_PATH
For name_to_handle_at(2) we'll want both ...at()-style syscall that
would be usable for non-directory descriptors (with empty relative
pathname).  Introduce new flag (AT_EMPTY_PATH) to deal with that and
corresponding LOOKUP_EMPTY; teach user_path_at() and path_init() to
deal with the latter.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 19:12:20 -04:00
Sangtae Ha
b5ccd07337 tcp_cubic: fix low utilization of CUBIC with HyStart
HyStart sets the initial exit point of slow start.
Suppose that HyStart exits at 0.5BDP in a BDP network and no history exists.
If the BDP of a network is large, CUBIC's initial cwnd growth may be
too conservative to utilize the link.
CUBIC increases the cwnd 20% per RTT in this case.

Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:42 -07:00
Sangtae Ha
2b4636a5f8 tcp_cubic: make the delay threshold of HyStart less sensitive
Make HyStart less sensitive to abrupt delay variations due to buffer bloat.

Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:42 -07:00
stephen hemminger
3b585b3449 tcp_cubic: enable high resolution ack time if needed
This is a refined version of an earlier patch by Lucas Nussbaum.
Cubic needs RTT values in milliseconds. If HZ < 1000 then
the values will be too coarse.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:40 -07:00
stephen hemminger
17a6e9f1aa tcp_cubic: fix clock dependency
The hystart code was written with assumption that HZ=1000.
Replace the use of jiffies with bictcp_clock as a millisecond
real time clock.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:39 -07:00
stephen hemminger
aac46324e1 tcp_cubic: make ack train delta value a parameter
Make the spacing between ACK's that indicates a train a tuneable
value like other hystart values.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:39 -07:00
stephen hemminger
c54b4b7655 tcp_cubic: fix comparison of jiffies
Jiffies wraps around therefore the correct way to compare is
to use cast to signed value.

Note: cubic is not using full jiffies value on 64 bit arch
because using full unsigned long makes struct bictcp grow too
large for the available ca_priv area.

Includes correction from Sangtae Ha to improve ack train detection.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:38 -07:00
stephen hemminger
febf081987 tcp: fix RTT for quick packets in congestion control
In the congestion control interface, the callback for each ACK
includes an estimated round trip time in microseconds.
Some algorithms need high resolution (Vegas style) but most only
need jiffie resolution.  If RTT is not accurate (like a retransmission)
-1 is used as a flag value.

When doing coarse resolution if RTT is less than a a jiffie
then 0 should be returned rather than no estimate. Otherwise algorithms
that expect good ack's to trigger slow start (like CUBIC Hystart)
will be confused.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:38 -07:00
Eric Dumazet
c05e7ac99c ftmac100: use GFP_ATOMIC allocations where needed
When running in softirq context, we should use GFP_ATOMIC allocations
instead of GFP_KERNEL ones.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Po-Yu Chuang <ratbert@faraday-tech.com>
Acked-by: Po-Yu Chuang <ratbert@faraday-tech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:40:39 -07:00