Currently, the PM core always attempts to manage devices with drivers
that use the new PM framework. In particular, it attempts to disable
the devices (which is unnecessary), to save their state (which may be
undesirable if the driver has done that already) and to put them into
low power states (again, this may be undesirable if the driver has
already put the device into a low power state). That need not be
the right thing to do, so make the core be more careful in this
respect.
Generally, there are the following categories of devices to consider:
* bridge devices without drivers
* non-bridge devices without drivers
* bridge devices with drivers
* non-bridge devices with drivers
and each of them should be handled differently.
For bridge devices without drivers the PCI PM core will save their
state on suspend and restore it (early) during resume, after putting
them into D0 if necessary. It will not attempt to do anything else
to these devices.
For non-bridge devices without drivers the PCI PM core will disable
them and save their state on suspend. During resume, it will put
them into D0, if necessary, restore their state (early) and reenable
them.
For bridge devices with drivers the PCI PM core will only save
their state on suspend if the driver hasn't done that already.
Still, the core will restore their state (early) during resume,
after putting them into D0, if necessary.
For non-bridge devices with drivers the PCI PM core will only save
their state on suspend if the driver hasn't done that already. Also,
if the state of the device hasn't been saved by the driver, the core
will attempt to put the device into a low power state. During
resume the core will restore the state of the device (early), after
putting it into D0, if necessary.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_restore_standard_config() unconditionally changes current_state
to PCI_D0 after attempting to change the device's power state, but
it should rather read the actual current power state from the
device.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It is a mistake to disable and enable PCI bridges and PCI Express
ports during suspend-resume, at least at the time when it is
currently done. Disabling them may lead to problems with accessing
devices behind them and they should be automatically enabled when
their standard config spaces are restored. Fix this by not attempting
to disable bridges during suspend and enable them during resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Simplify suspend and resume of the PCI Express port driver. It no
longer needs to save and restore the standard configuration space of the
device; this is now done by the PCI PM core layer.
This patch is reported to fix the regression tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=12598
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make pci_legacy_suspend() save the state of the device if it is
in PCI_UNKNOWN after its suspend callback has run and warn only if
the power state of the device has been changed by its suspend
callback.
Also, use WARN_ONCE(), which is more useful, in pci_legacy_suspend(),
so that the name of the offending function is printed.
Additionally, remove the unnecessary line of code setting
pci_dev->state_saved.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Check if the standard configuration registers of a PCI device have
been saved during suspend before trying to restore them during
resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-By: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Suspend to RAM is reported to break on some machines as a result of
attempting to put one of driverless PCI devices into a low power
state. Avoid that by not attepmting to power manage driverless
devices during suspend.
Fix up pci_pm_poweroff() after a previous incomplete fix for the same
thing during hibernation.
This patch is reported to fix the regression from 2.6.28 tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=12605
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch makes the ROM reading code return an error to user space if
the size of the ROM read is equal to 0.
The patch also emits a warnings if the contents of the ROM are invalid,
and documents the effects of the "enable" file on ROM reading.
Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au>
Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We only want to disable ASPM when the last function is removed from
the parent's device list. We determine this by checking to see if
the parent's device list is completely empty.
Unfortunately, we never hit that code because the parent is considered
an upstream port, and never had an ASPM link_state associated with it.
The early check for !link_state causes us to return early, we never
discover that our device list is empty, and thus we never remove the
downstream ports' link_state nodes.
Instead of checking to see if the parent's device list is empty, we can
check to see if we are the last device on the list, and if so, then we
know that we can clean up properly.
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The function sock_alloc_send_pskb is completely useless if not
exported since most of the code in it won't be used as is. In
fact, this code has already been duplicated in the tun driver.
Now that we need accounting in the tun driver, we can in fact
use this function as is. So this patch marks it for export again.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it currently stands, skb destructors are forbidden on the
receive path because the protocol end-points will overwrite
any existing destructor with their own.
This is the reason why we have to call skb_orphan in the loopback
driver before we reinject the packet back into the stack, thus
creating a period during which loopback traffic isn't charged
to any socket.
With virtualisation, we have a similar problem in that traffic
is reinjected into the stack without being associated with any
socket entity, thus providing no natural congestion push-back
for those poor folks still stuck with UDP.
Now had we been consistent in telling them that UDP simply has
no congestion feedback, I could just fob them off. Unfortunately,
we appear to have gone to some length in catering for this on
the standard UDP path, with skb/socket accounting so that has
created a very unhealthy dependency.
Alas habits are difficult to break out of, so we may just have
to allow skb destructors on the receive path.
It turns out that making skb destructors useable on the receive path
isn't as easy as it seems. For instance, simply adding skb_orphan
to skb_set_owner_r isn't enough. This is because we assume all
over the IP stack that skb->sk is an IP socket if present.
The new transparent proxy code goes one step further and assumes
that skb->sk is the receiving socket if present.
Now all of this can be dealt with by adding simple checks such
as only treating skb->sk as an IP socket if skb->sk->sk_family
matches. However, it turns out that for bridging at least we
don't need to do all of this work.
This is of interest because most virtualisation setups use bridging
so we don't actually go through the IP stack on the host (with
the exception of our old nemesis the bridge netfilter, but that's
easily taken care of).
So this patch simply adds skb_orphan to the point just before we
enter the IP stack, but after we've gone through the bridge on the
receive path. It also adds an skb_orphan to the one place in
netfilter that touches skb->sk/skb->destructor, that is, tproxy.
One word of caution, because of the internal code structure, anyone
wishing to deploy this must use skb_set_owner_w as opposed to
skb_set_owner_r since many functions that create a new skb from
an existing one will invoke skb_set_owner_w on the new skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stashing is only supported on the 85xx (e500-based) SoCs. The 83xx and 86xx
chips don't have a proper cache for this. U-Boot has been updated to add
stashing properties to the device tree nodes of gianfar devices on 85xx. So
now we modify Linux to keep stashing off unless those properties are there.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MDIO bus drivers for the UCC and gianfar ethernet controllers are
essentially the same. There's no reason to duplicate that much code.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SOFT_RESET must be asserted for at least 3 TX clocks in order for it to work
properly. The syncs in the gfar_write() commands have been hiding this, but
we need to guarantee it.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BD_LENGTH_MASK is supposed to catch the low 16-bits of the status field, not
the low byte. The old way, we would never be able to clean up tx packets with
sizes divisible by 256.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many physical NICs let the OS re-program the "hardware" MAC
address. Virtual NICs should allow this too.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN filtering allows the hypervisor to drop packets from VLANs
that we're not a part of, further reducing the number of extraneous
packets recieved. This makes use of the VLAN virtqueue command class.
The CTRL_VLAN feature bit tells us whether the backend supports VLAN
filtering.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the MAC control virtqueue class to support a MAC
filter table. The filter table is managed by the hypervisor.
We consider the table to be available if the CTRL_RX feature
bit is set. We leave it to the hypervisor to manage the table
and enable promiscuous or all-multi mode as necessary depending
on the resources available to it.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the RX_MODE control virtqueue class to enable the
set_rx_mode netdev interface. This allows us to selectively
enable/disable promiscuous and allmulti mode so we don't see
packets we don't want. For now, we automatically enable these
as needed if additional unicast or multicast addresses are
requested.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used for RX mode, MAC filter table, VLAN filtering, etc...
The control transaction consists of one or more "out" sg entries and
one or more "in" sg entries. The first out entry contains a header
defining the class and command. Additional out entries may provide
data for the command. The last in entry provides a status response
back from the command.
Virtqueues typically run asynchronous, running a callback function
when there's data in the channel. We can't readily make use of this
in the command paths where we need to use this. Instead, we kick
the virtqueue and spin. The kick causes an I/O write, triggering an
immediate trap into the hypervisor.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LRO switch is always set to 1 in the rx processing loop.
It breaks the accelerated iSCSI receive traffic.
Fix its computation.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a small problem with hpet_rtc_reinit function - it checks
for the:
hpet_readl(HPET_COUNTER) - hpet_t1_cmp > 0
to continue increasing both the HPET_T1_CMP (register) and the
hpet_t1_cmp (variable).
But since the HPET_COUNTER is always 32-bit, if the hpet_t1_cmp
is 64-bit this condition will always be FALSE once the latter hits
the 32-bit boundary, and we can have a situation, when we don't
increase the HPET_T1_CMP register high enough.
The result - timer stops ticking, since HPET_T1_CMP becomes less,
than the COUNTER and never increased again.
The solution is (based on Linus's suggestion) to not compare 64-bits
(on 64-bit x86), but to do the comparison on 32-bit signed
integers.
Reported-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: APIC: enable workaround on AMD Fam10h CPUs
xen: disable interrupts before saving in percpu
x86: add x86@kernel.org to MAINTAINERS
x86: push old stack address on irqstack for unwinder
irq, x86: fix lock status with numa_migrate_irq_desc
x86: add cache descriptors for Intel Core i7
x86/Voyager: make it build and boot
Christian Borntraeger reports:
> After a logical cpu offline, even on a complete idle system, there
> is one cpu with full ticks. It turns out that nohz.cpu_mask has the
> the offlined cpu still set.
>
> In select_nohz_load_balancer() we check if the system is completely
> idle to turn of load balancing. We compare cpu_online_map with
> nohz.cpu_mask. Since cpu_online_map is updated on cpu unplug,
> but nohz.cpu_mask is not, the check fails and the scheduler believes
> that we need an "idle load balancer" even on a fully idle system.
> Since the ilb cpu does not deactivate the timer tick this breaks NOHZ.
Fix the select_nohz_load_balancer() to not set the nohz.cpu_mask
while a cpu is going offline.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Some lines exceed the 80 char width making them unreadable.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cleans uCode key table bit map iwl_clear_stations_table
since all stations are cleared also the key table must be.
Since the keys are not removed properly on suspend by mac80211
this may result in exhausting key table on resume leading
to memory corruption during removal
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch echoes what we already do on 32-bit since
90f7d25c6b, and prints the DMI
product name in show_regs, so that system specific problems can be
easily identified.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 0e0333429a.
I committed this by accident - Joel and Louis are working with the lockdep
maintainer to provide a better solution than just turning lockdep off.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: <Joel Becker <joel.becker@oracle.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: pcm_oss: AFMT_S24_LE is set twice in return value
ALSA: ASoC: email - update email addresses.
OMAP: ASoC: Fix spinlock misuse in omap-pcm.c
ALSA: hda - No widget selection for volume knob widgets in proc output
ALSA: hda - Add support of iMac 24 Aluminium
ALSA: alsa: time reaches -1, tested 0
ALSA: hda - Add quirk for another HP dv5 model
AFMT_S24_LE is set twice in return value
vi sound/core/oss/pcm_oss.c +640
#define AFMT_S24_LE 0x00008000
#define AFMT_S24_BE 0x00010000
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (40 commits)
Blackfin arch: Remove outdated code
Blackfin arch: Fix udelay implementation
Blackfin arch: Update Copyright information
Blackfin arch: Add BF561 PPI POLS, POLC Masks
Blackfin arch: Update CM-BF527 kernel config
Blackfin arch: define bfin_memmap as static since it is only used here
Blackfin arch: cplb mananger: use a do...while loop rather than a for loop
Blackfin arch: fix bug - traps test case 19 for exception 0x2d fails
Blackfin arch: add platform device bfin_mii-bus and KSZ8893M switch driver platform resources to board files
Blackfin arch: build jtag tty driver as a module by default
Blackfin arch: fix 2 bugs related to debug
Blackfin arch: Add ANOMALY_05000380 to BF54x to kill the compile warning
Blackfin arch: Fix bug - 561 SMP kernel can't boot from jffs2
Blackfin arch: base SIC_IWR# programming on whether the MMR exists
Blackfin arch: read SYSCR on newer parts that mirror the bits of SWRST in it
Blackfin arch: fixup board init function name
Blackfin arch: drop CONFIG_I2C_BOARDINFO ifdefs
Blackfin arch: bfin_reset->_bfin_reset redirection no longer needed
Blackfin arch: sync reboot handler with version in u-boot
Blackfin arch: Faster Implementation of csum_tcpudp_nofold()
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Kill bogus TPC/address truncation during 32-bit faults.
sparc: fixup for sparseirq changes
sparc64: Validate kernel generated fault addresses on sparc64.
sparc64: On non-Niagara, need to touch NMI watchdog in NOHZ mode.
sparc64: Implement NMI watchdog on capable cpus.
sparc: Probe PMU type and record in sparc_pmu_type.
sparc64: Move generic PCR support code to seperate file.
On fast devices that go from congested to uncongested very quickly, pdflush
is waiting too often in congestion_wait, and the FS is backing off to
easily in write_cache_pages.
For now, fix this on the btrfs side by only checking congestion after
some bios have already gone down. Longer term a real fix is needed
for pdflush, but that is a larger project.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Whenever an item deletion is done, we need to balance all the nodes
in the tree to make sure we don't end up with an empty node if a pointer
is deleted. This balance prep happens from the root of the tree down
so we can drop our locks as we go.
reada_for_balance was triggering read-ahead on neighboring nodes even
when no balancing was required. This adds an extra check to avoid
calling balance_level() and avoid reada_for_balance() when a balance
won't be required.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
btrfs_unlock_up_safe would break out at the first NULL node entry or
unlocked node it found in the path.
Some of the callers have missing nodes at the lower levels of the path, so this
commit fixes things to check all the nodes in the path before returning.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
btrfs_del_leaf does two things. First it removes the pointer in the
parent, and then it frees the block that has the leaf. It has the
parent node locked for both operations.
But, it only needs the parent locked while it is deleting the pointer.
After that it can safely free the block without the parent locked.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
btrfs_truncate_inode_items is setup to stop doing btree searches when
it has finished removing the items for the inode. It used to detect the
end of the inode by looking for an objectid that didn't match the
one we were searching for.
But, this would result in an extra search through the btree, which
adds extra balancing and cow costs to the operation.
This commit adds a check to see if we found the inode item, which means
we can stop searching early.
Signed-off-by: Chris Mason <chris.mason@oracle.com>