Thomas Hellstrom bisected a regression where erratic 3D performance is
experienced on virtual machines as measured by glxgears. It identified
commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA
node") as the problem which had modified the behaviour of effective_load.
Effective load calculates the difference to the system-wide load if a
scheduling entity was moved to another CPU. The task group is not heavier
as a result of the move but overall system load can increase/decrease as a
result of the change. Commit 58d081b5 ("sched/numa: Avoid overloading CPUs
on a preferred NUMA node") changed effective_load to make it suitable for
calculating if a particular NUMA node was compute overloaded. To reduce
the cost of the function, it assumed that a current sched entity weight
of 0 was uninteresting but that is not the case.
wake_affine() uses a weight of 0 for sync wakeups on the grounds that it
is assuming the waking task will sleep and not contribute to load in the
near future. In this case, we still want to calculate the effective load
of the sched entity hierarchy. As effective_load is no longer used by
task_numa_compare since commit fb13c7ee (sched/numa: Use a system-wide
search to find swap/migration candidates), this patch simply restores the
historical behaviour.
Reported-and-tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
[ Wrote changelog]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140106113912.GC6178@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to
call destroy_work_on_stack() which frees the debug object to pair
with INIT_WORK_ONSTACK().
Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.
Reported-by: halfdog <me@halfdog.net>
Tested-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
895a068a52 ("kernfs: make kernfs_get_active() block if the node is
deactivated but not removed") added "struct kernfs_root *root =
kernfs_root(kn);" at the head of the function; however, the parameter
@kn is checked for later implying that the function may be called with
NULL. This means that we may end up invoking kernfs_root() with NULL
which will oops. None of the existing users invokes removal with NULL
@kn, so this bug doesn't actually trigger.
We can relocate kernfs_root() invocation after NULL check; however,
allowing NULL param tends to cause more confusion than actually
helping anything. As there's no existing user, let's remove the
spurious NULL check.
This bug was detected by smatch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are leaks of resources allocated by wlan_setup() and usb_dev refcnt
on failure paths in prism2sta_probe_usb().
The patch adds appropriate deallocations and removes invalid code
from hfa384x_corereset() failure handling.
unregister_wlandev() is wrong because it is not registered yet.
hfa384x_destroy() is just noop in init state.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch for 8255.c fixes a spacing warning found by checkpatch.pl.
Signed-off-by: Chase Southwood <chase.southwood@yahoo.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes two spaces at the start of the line aswell as all space after
opening parenthesis and space before closeing parenthesis checkpatch.pl warnings
in rtw_mlme.h
Signed-off-by: Tim Jester-Pfadt <t.jp@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix sparse warnings for undeclared symbols not marked static like:
148:6: warning: symbol 'enqueue_mgmt' was not declared. Should it be static?
166:16: warning: symbol 'dequeue_mgmt' was not declared. Should it be static?
Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes the following sparse warning:
27:5: warning: symbol 'rtl8180_rates' was not declared. Should it be static?
Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix sparse warnings for undeclared symbols not marked static like:
390:6: warning: symbol 'buffer_free' was not declared. Should it be static?
1031:5: warning: symbol 'ComputeTxTime' was not declared. Should it be static?
Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently code has an inverted logic: opcode from user memory
is swapped to a proper endianness only in case of read error.
While normally opcode should be swapped only if it was read
correctly from user memory.
Reviewed-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arch/arm/kernel/perf_event_cpu.c:274:25: warning: incorrect type in assignment (different modifiers)
arch/arm/kernel/perf_event_cpu.c:274:25: expected int ( *init_fn )( ... )
arch/arm/kernel/perf_event_cpu.c:274:25: got void const *const data
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The MPIDR contains specific bitfields(MPIDR.Aff{2..0}) which uniquely
identify a CPU, in addition to some non-identifying information and
reserved bits. The ARM cpu binding defines the 'reg' property to only
contain the affinity bits, and any cpu nodes with other bits set in
their 'reg' entry are skipped.
As such it is not necessary to mask the phys_id with MPIDR_HWID_BITMASK,
and doing so could lead to matching erroneous CPU nodes in the device
tree. This patch removes the masking of the physical identifier.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The PCI devices with DMA masks smaller than 32bit should enable
CONFIG_ZONE_DMA. Since the recent change of page allocator, page
allocations via dma_alloc_coherent() with the limited DMA mask bits
may fail more frequently, ended up with no available buffers, when
CONFIG_ZONE_DMA isn't enabled. With CONFIG_ZONE_DMA, the system has
much more chance to obtain such pages.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=68221
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The failures of buffer preallocations at driver initializations aren't
critical but it's still helpful to inform, so that user can know that
something doesn't work as expected.
For example, the recent page allocator change triggered regressions,
but developers didn't notice until recently because the driver didn't
complain.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Often, usb drivers need some driver_info to get a device to work. To
have access to driver_info when using new_id, allow to pass a reference
vendor:product tuple from which new_id will inherit driver_info.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Check if that field is actually used and if so, bail out if it exeeds a
u8. Make it also future-proof by not requiring "exactly three"
parameters in new_id, but simply "more than two".
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subsystems such as ALSA, DRM and others require a single card-level
device structure to represent a subsystem. However, firmware tends to
describe the individual devices and the connections between them.
Therefore, we need a way to gather up the individual component devices
together, and indicate when we have all the component devices.
We do this in DT by providing a "superdevice" node which specifies
the components, eg:
imx-drm {
compatible = "fsl,drm";
crtcs = <&ipu1>;
connectors = <&hdmi>;
};
The superdevice is declared into the component support, along with the
subcomponents. The superdevice receives callbacks to locate the
subcomponents, and identify when all components are present. At this
point, we bind the superdevice, which causes the appropriate subsystem
to be initialised in the conventional way.
When any of the components or superdevice are removed from the system,
we unbind the superdevice, thereby taking the subsystem down.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All device_schedule_callback_owner() users are converted to use
device_remove_file_self(). Remove now unused
{sysfs|device}_schedule_callback_owner().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts b48d4425b6 ("PCI: add ID-based ordering enable/disable
support"), removing these interfaces:
pci_enable_ido()
pci_disable_ido()
[bhelgaas: split to separate patch, also remove prototypes from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts 48a92a8179 ("PCI: add OBFF enable/disable support"),
removing these interfaces:
pci_enable_obff()
pci_disable_obff()
[bhelgaas: split to separate patch, also remove prototypes from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts 51c2e0a7e5 ("PCI: add latency tolerance reporting
enable/disable support"), removing these interfaces:
pci_enable_ltr()
pci_disable_ltr()
pci_set_ltr()
[bhelgaas: split to separate patch, also remove prototypes from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Pull networking fixes from David Miller:
"Famouse last words: "final pull request" :-)
I'm sending this because Jason Wang's fixes are pretty important
1) Add missing per-cpu stats initialization to ip6_vti. Otherwise
lockdep spits out a call trace. From Li RongQing.
2) Fix NULL oops in wireless hwsim, from Javier Lopez
3) TIPC deferred packet queue unlink must NULL out skb->next to avoid
crashes. From Erik Hugne
4) Fix access to uninitialized buffer in nf_nat netfilter code, from
Daniel Borkmann
5) Fix lifetime of ipv6 loopback and SIT tunnel addresses, otherwise
they basically timeout immediately. From Hannes Frederic Sowa
6) Fix DMA unmapping of TSO packets in bnx2x driver, from Michal
Schmidt
7) Do not allow L2 forwarding offload via macvtap device, the way
things are now it will not end up being forwaded at all. From
Jason Wang
8) Fix transmit queue selection via ndo_dfwd_start_xmit(), fixing
things like applying NETIF_F_LLTX to the wrong device (!!) and
eliding the proper transmit watchdog handling
9) qlcnic driver was not updating tx statistics at all, from Manish
Chopra"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
qlcnic: Fix ethtool statistics length calculation
qlcnic: Fix bug in TX statistics
net: core: explicitly select a txq before doing l2 forwarding
macvlan: forbid L2 fowarding offload for macvtap
bnx2x: fix DMA unmapping of TSO split BDs
ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME
bnx2x: prevent WARN during driver unload
tipc: correctly unlink packets from deferred packet queue
ipv6: pcpu_tstats.syncp should be initialised in ip6_vti.c
netfilter: only warn once on wrong seqadj usage
netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper
NFC: Fix target mode p2p link establishment
iwlwifi: add new devices for 7265 series
mac80211: move "bufferable MMPDU" check to fix AP mode scan
mac80211_hwsim: Fix NULL pointer dereference
- fix off-by-one in xfs_attr3_rmt_verify
- fix missing destroy_work_on_stack() in xfs_bmapi_allocate
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJS0ECWAAoJENaLyazVq6ZOgn0QAKSC/pkP4km+QbmL0R7SqSJH
ZSSj16gIjR5lHlwI3PQzv5BgyEC9BcRDKWXN6dy+GHHuMtP4qYK8cLWFcyl7EysH
HAyDBnaJVphXt23C5iIzk+iseNfRYXA2LOpYSH6qfhZ5bxEeYzQS42zL4YhxZrXq
kzLHojcTLUx0IzJ+4oHn5AXSgPt+PXxNz3s+TU9virFnfSMlw2qYukxQtG49nbQr
kQjNHgeTIBKzeHdlnxmv5Rd2bD//397w5aWXxmaUh8fk6Z7VJi40ALAG4Pks81HF
+TEgMtF9/xTXdlwrYJDoHp++vUs6HANCX+wSAb4MdrBQvjh/USytK2WFwOeMyyR6
L/iogfPXHHizTkoYSzPwPdEmCCFhzidvBEqNX68+ojlJnDtoart7IgkOcm9LvaQI
j//u76CPRcd8tFh+1fDNaXn1ykJ6/CepSY13/yOnbpc7JoDbtqK2R8HFxdSlkDDg
UooLF2AfQ6lX280cUWwV0flqGO6iTIM3Fw1mIq3z8X4usNn+bMnlOu/DUnCbF5bB
YJCV4uT7f04w7oJqin9a7LHaHKRD56tWQun/OCEd7ZV/hJ1YRYlhhLfSdWdX7+SX
oIawXJy7NvCPQLaTwycD3h2gDlaxw17GAc9rA3AcCknxBsgNosv1ETQnEPC4iIAq
QsVal7p6oMLZ/qx6mvX7
=Xpq3
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs
Pull xfs bugfixes from Ben Myers:
"Here we have a bugfix for an off-by-one in the remote attribute
verifier that results in a forced shutdown which you can hit with v5
superblock by creating a 64k xattr, and a fix for a missing
destroy_work_on_stack() in the allocation worker.
It's a bit late, but they are both fairly straightforward"
* tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs:
xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
xfs: fix off-by-one error in xfs_attr3_rmt_verify
Pull LED fix from Bryan Wu:
"Pali Rohár and Pavel Machek reported the LED of Nokia N900 doesn't
work with our latest 3.13-rc6 kernel. Milo fixed the regression here"
* 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: lp5521/5523: Remove duplicate mutex
- Recent commits modifying the lists of C-states in the intel_idle
driver introduced bugs leading to crashes on some systems. Two
fixes from Jiang Liu.
- The ACPI AC driver should receive all types of notifications, but
recent change made it ignore some of them. Fix from Alexander Mezin.
- intel_pstate's validity checks for MSRs it depends on are not
sufficient to catch the lack of support in nested KVM setups, so
they are extended to cover that case. From Dirk Brandewie.
- NEC LZ750/LS has a botched up _BIX method in its ACPI tables, so our
ACPI battery driver needs a quirk for it. From Lan Tianyu.
- The tpm_ppi driver sometimes leaks memory allocated by acpi_get_name().
Fix from Jiang Liu.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJSz/qJAAoJEILEb/54YlRxraQP/Arse18aykhROgu4tFbJ5Cy0
w4T8PjmSgb7IYkld/zsY18dgOxEXRSzpXeaYREfcehgjipwv30BS4YPxXlulCMYG
79CCOFbWqjzhxRdNYpgqoKsW7H5PI1bMn6tD1Ih2DAlkqnCevqYqs9PsUnNJCXE9
Go2ldksIw/pRUlTEnhIBU0mGGcZLSau0UELVYV0ObJMbjY2DLCaqODXKHBOKoBvQ
+4cBkIF0WP73ISLwPAXv7o3C0+10b4q23zpQ+oDw5qzckuhHcCIJOmSs62mhoJjt
ZswQWZrrocmntASaMsvzDNCezu/GkV6ZF6+jFyedrpQiFJSCVjTVD+Qix098UWIn
3Fs0l9Mf51sTuQR8/RD93zgQEwPBjphrFOzBePDZkZXYAEzKA8IPXxQv8PKU/rIy
LkhmocwTVN4shNy+JF0xOxlvndba1FUm2E9TqVJzNFrSrQFt6M2AyBbyMxyTKlTm
SYcfR1jkoPTMPZ4z5tXB91HtXZVd5R9Bn3b80AwXriU6mEUb0Sf7UAdPDulvBv/W
WAeItvg6K1oiED6WHhardL/MQc7gXtrRzXSL+M+xAoxlGI/vo7MJsez3SzDafLhQ
nSggpcpo2F6h39Fi1UFUkciWsp4s2/BzbHObu960DMO0n+23/riguOxPgHz4AfiC
7ejxAG5eMpF7QiKjEUYO
=Eqc0
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Recent commits modifying the lists of C-states in the intel_idle
driver introduced bugs leading to crashes on some systems. Two fixes
from Jiang Liu.
- The ACPI AC driver should receive all types of notifications, but
recent change made it ignore some of them. Fix from Alexander Mezin.
- intel_pstate's validity checks for MSRs it depends on are not
sufficient to catch the lack of support in nested KVM setups, so they
are extended to cover that case. From Dirk Brandewie.
- NEC LZ750/LS has a botched up _BIX method in its ACPI tables, so our
ACPI battery driver needs a quirk for it. From Lan Tianyu.
- The tpm_ppi driver sometimes leaks memory allocated by
acpi_get_name(). Fix from Jiang Liu.
* tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
intel_idle: close avn_cstates array with correct marker
Revert "intel_idle: mark states tables with __initdata tag"
ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS
intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
ACPI / TPM: fix memory leak when walking ACPI namespace
ACPI / AC: change notification handler type to ACPI_ALL_NOTIFY
It only contains one fix for the rtsx_pcr driver. Without it we see a
kernel panic on some machines, when resuming from suspend to RAM.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSz7AIAAoJEIqAPN1PVmxKRSAQAJJpDZQB7zdQmcAEWK/NQPDE
h3QRfO0Fc+X8OpIPjAtmQgTXi4Naf5B5KwpDl5Mdcqe1PyWC959y94KohiHiN5WL
lcbGJF7LD+NfpdiXomIa6HmkBM8gUVAheVfU4mnYy5rbIRTVCLxysPYFt6w9cQRU
7dkHyL5qVt6+9p+/jr+XuVW//k7lbZX7nwBcs+HFInQNHu4qT97gxs99Mbvob+LZ
EdREj4hlrgteyTLAYTmHFtWkL464IeZXN9iI7ncShf+6icxwYyHIsr7QPjxO33+B
A0Wueofb3VKAbFi41g38QbstsywdWi2X5YDxbMi1VpB5LZe/TyVl6fAbAB8NAfNN
s9mOtrEdL3bBWcbWmXuZSsq4Jjvxl9IZ3aK0nwibD4cP64BRbbQG+ICcZ1e4urAR
uxhC7sT8tR+XJYr3cv+r6Br3awqku+0/hOLAXK+YKonftUBd2WvpWVya/zJW64JW
UyxnS87Zdz98Z1BK9925vC+TwhBxcp+jlEgrT+Ersgvtr+0o22ditrusxbY3vIea
F3LVeIIlD1rcOt8n2fL/0L6M/QXHNKycdA24nb2kViYQCfb5edmEEQ+OXTcoXzTy
9Gxm0aZaABY2Y+4cHnAQIi7IcVd34KcJRSn77+rStXwlyI9fIgG+5xfKUj4oFHZZ
eiBUJ6vC6Uq36jqh4SLt
=Dmc7
-----END PGP SIGNATURE-----
Merge tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes
Pull MFD fix from Samuel Ortiz:
"This is the 2nd MFD pull request for 3.13
It only contains one fix for the rtsx_pcr driver. Without it we see a
kernel panic on some machines, when resuming from suspend to RAM"
* tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
mfd: rtsx_pcr: Disable interrupts before cancelling delayed works
It can be a problem when a pattern is loaded via the firmware interface.
LP55xx common driver has already locked the mutex in 'lp55xx_firmware_loaded()'.
So it should be deleted.
On the other hand, locks are required in store_engine_load()
on updating program memory.
Reported-by: Pali Rohár <pali.rohar@gmail.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Milo Kim <milo.kim@ti.com>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
Cc: <stable@vger.kernel.org>
driver-core now supports synchrnous self-deletion of attributes and
the asynchrnous removal mechanism is scheduled for removal. Use it
instead of device_schedule_callback().
* Conversions in arch/s390/pci/pci_sysfs.c and
drivers/s390/block/dcssblk.c are straightforward.
* drivers/s390/cio/ccwgroup.c is a bit more tricky because
ccwgroup_notifier() was (ab)using device_schedule_callback() to
purely obtain a process context to kick off ungroup operation which
may block from a notifier callback.
Rename ccwgroup_ungroup_callback() to ccwgroup_ungroup() and make it
take ccwgroup_device * instead. The new function is now called
directly from ccwgroup_ungroup_store().
ccwgroup_notifier() chain is updated to explicitly bounce through
ccwgroup_device->ungroup_work. This also removes possible failure
from memory pressure.
Only compile-tested.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
driver-core now supports synchrnous self-deletion of attributes and
the asynchrnous removal mechanism is scheduled for removal. Use it
instead of device_schedule_callback(). This makes "delete" behave
synchronously.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
driver-core now supports synchrnous self-deletion of attributes and
the asynchrnous removal mechanism is scheduled for removal. Use it
instead of device_schedule_callback(). This makes "remove" behave
synchronously.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sometimes it's necessary to implement a node which wants to delete
nodes including itself. This isn't straightforward because of kernfs
active reference. While a file operation is in progress, an active
reference is held and kernfs_remove() waits for all such references to
drain before completing. For a self-deleting node, this is a deadlock
as kernfs_remove() ends up waiting for an active reference that itself
is sitting on top of.
This currently is worked around in the sysfs layer using
sysfs_schedule_callback() which makes such removals asynchronous.
While it works, it's rather cumbersome and inherently breaks
synchronicity of the operation - the file operation which triggered
the operation may complete before the removal is finished (or even
started) and the removal may fail asynchronously. If a removal
operation is immmediately followed by another operation which expects
the specific name to be available (e.g. removal followed by rename
onto the same name), there's no way to make the latter operation
reliable.
The thing is there's no inherent reason for this to be asynchrnous.
All that's necessary to do this synchronous is a dedicated operation
which drops its own active ref and deactivates self. This patch
implements kernfs_remove_self() and its wrappers in sysfs and driver
core. kernfs_remove_self() is to be called from one of the file
operations, drops the active ref and deactivates using
__kernfs_deactivate_self(), removes the self node, and restores active
ref to the dead node using __kernfs_reactivate_self() so that the ref
is balanced afterwards. __kernfs_remove() is updated so that it takes
an early exit if the target node is already fully removed so that the
active ref restored by kernfs_remove_self() after removal doesn't
confuse the deactivation path.
This makes implementing self-deleting nodes very easy. The normal
removal path doesn't even need to be changed to use
kernfs_remove_self() for the self-deleting node. The method can
invoke kernfs_remove_self() on itself before proceeding the normal
removal path. kernfs_remove() invoked on the node by the normal
deletion path will simply be ignored.
This will replace sysfs_schedule_callback(). A subtle feature of
sysfs_schedule_callback() is that it collapses multiple invocations -
even if multiple removals are triggered, the removal callback is run
only once. An equivalent effect can be achieved by testing the return
value of kernfs_remove_self() - only the one which gets %true return
value should proceed with actual deletion. All other instances of
kernfs_remove_self() will wait till the enclosing kernfs operation
which invoked the winning instance of kernfs_remove_self() finishes
and then return %false. This trivially makes all users of
kernfs_remove_self() automatically show correct synchronous behavior
even when there are multiple concurrent operations - all "echo 1 >
delete" instances will finish only after the whole operation is
completed by one of the instances.
v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing
and sysfs_remove_file_self() had incorrect return type. Fix it.
Reported by kbuild test bot.
v3: Updated to use __kernfs_{de|re}activate_self().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch implements four functions to manipulate deactivation state
- deactivate, reactivate and the _self suffixed pair. A new fields
kernfs_node->deact_depth is added so that concurrent and nested
deactivations are handled properly. kernfs_node->hash is moved so
that it's paired with the new field so that it doesn't increase the
size of kernfs_node.
A kernfs user's lock would normally nest inside active ref but during
removal the user may want to perform kernfs_remove() while holding the
said lock, which would introduce a reverse locking dependency. This
function can be used to break such reverse dependency by allowing
deactivation step to performed separately outside user's critical
section.
This will also be used implement kernfs_remove_self().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, kernfs_get_active() fails if the target node is
deactivated. This is fine as a node always gets removed after
deactivation; however, we're gonna add reactivation so the assumption
won't hold. It'd be incorrect for kernfs_get_active() to fail for a
node which was deactivated only temporarily.
This patch makes kernfs_get_active() block if the node is deactivated
but not removed. If the node gets reactivated (not yet implemented),
it will be retried and succeed. If the node gets removed, it will be
woken up and fail.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs_addrm_cxt and the accompanying kernfs_addrm_start/finish() were
added because there were operations which should be performed outside
kernfs_mutex after adding and removing kernfs_nodes. The necessary
operations were recorded in kernfs_addrm_cxt and performed by
kernfs_addrm_finish(); however, after the recent changes which
relocated deactivation and unmapping so that they're performed
directly during removal, the only operation kernfs_addrm_finish()
performs is kernfs_put(), which can be moved inside the removal path
too.
This patch moves the kernfs_put() of the base ref to __kernfs_remove()
and remove kernfs_addrm_cxt and kernfs_addrm_start/finish().
* kernfs_add_one() is updated to grab and release the parent's active
ref and kernfs_mutex itself. kernfs_get/put_active() and
kernfs_addrm_start/finish() invocations around it are removed from
all users.
* __kernfs_remove() puts an unlinked node directly instead of chaining
it to kernfs_addrm_cxt. Its callers are updated to grab and release
kernfs_mutex instead of calling kernfs_addrm_start/finish() around
it.
v2: Updated to fit the v2 restructuring of removal path.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs_unmap_bin_file() is supposed to unmap all memory mappings of
the target file before kernfs_remove() finishes; however, it currently
is being called from kernfs_addrm_finish() and has the same race
problem as the original implementation of deactivation when there are
multiple removers - only the remover which snatches the node to its
addrm_cxt->removed list is guaranteed to wait for its completion
before returning.
It can be fixed by moving kernfs_unmap_bin_file() invocation from
kernfs_addrm_finish() to __kernfs_remove(). The function may be
called multiple times but that shouldn't do any harm.
We end up dropping kernfs_mutex in the removal loop and the node may
be removed inbetween by someone else. kernfs_unlink_sibling() is
updated to test whether the node has already been removed and return
accordingly. __kernfs_remove() in turn performs post-unlinking
cleanup only if it actually unlinked the node.
KERNFS_HAS_MMAP test is moved out of the unmap function into
__kernfs_remove() so that we don't unlock kernfs_mutex unnecessarily.
While at it, drop the now meaningless "bin" qualifier from the
function name.
v2: Rewritten to fit the v2 restructuring of removal path. HAS_MMAP
test relocated.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The recursive nature of kernfs_remove() means that, even if
kernfs_remove() is not allowed to be called multiple times on the same
node, there may be race conditions between removal of parent and its
descendants. While we can claim that kernfs_remove() shouldn't be
called on one of the descendants while the removal of an ancestor is
in progress, such rule is unnecessarily restrictive and very difficult
to enforce. It's better to simply allow invoking kernfs_remove() as
the caller sees fit as long as the caller ensures that the node is
accessible.
The current behavior in such situations is broken. Whoever enters
removal path first takes the node off the hierarchy and then
deactivates. Following removers either return as soon as it notices
that it's not the first one or can't even find the target node as it
has already been removed from the hierarchy. In both cases, the
following removers may finish prematurely while the nodes which should
be removed and drained are still being processed by the first one.
This patch restructures so that multiple removers, whether through
recursion or direction invocation, always follow the following rules.
* When there are multiple concurrent removers, only one puts the base
ref.
* Regardless of which one puts the base ref, all removers are blocked
until the target node is fully deactivated and removed.
To achieve the above, removal path now first deactivates the subtree,
drains it and then unlinks one-by-one. __kernfs_deactivate() is
called directly from __kernfs_removal() and drops and regrabs
kernfs_mutex for each descendant to drain active refs. As this means
that multiple removers can enter __kernfs_deactivate() for the same
node, the function is updated so that it can handle multiple
deactivators of the same node - only one actually deactivates but all
wait till drain completion.
The restructured removal path guarantees that a removed node gets
unlinked only after the node is deactivated and drained. Combined
with proper multiple deactivator handling, this guarantees that any
invocation of kernfs_remove() returns only after the node itself and
all its descendants are deactivated, drained and removed.
v2: Draining separated into a separate loop (used to be in the same
loop as unlink) and done from __kernfs_deactivate(). This is to
allow exposing deactivation as a separate interface later.
Root node removal was broken in v1 patch. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KERNFS_REMOVED is used to mark half-initialized and dying nodes so
that they don't show up in lookups and deny adding new nodes under or
renaming it; however, its role overlaps those of deactivation and
removal from rbtree.
It's necessary to deny addition of new children while removal is in
progress; however, this role considerably intersects with deactivation
- KERNFS_REMOVED prevents new children while deactivation prevents new
file operations. There's no reason to have them separate making
things more complex than necessary.
KERNFS_REMOVED is also used to decide whether a node is still visible
to vfs layer, which is rather redundant as equivalent determination
can be made by testing whether the node is on its parent's children
rbtree or not.
This patch removes KERNFS_REMOVED.
* Instead of KERNFS_REMOVED, each node now starts its life
deactivated. This means that we now use both atomic_add() and
atomic_sub() on KN_DEACTIVATED_BIAS, which is INT_MIN. The compiler
generates an overflow warnings when negating INT_MIN as the negation
can't be represented as a positive number. Nothing is actually
broken but let's bump BIAS by one to avoid the warnings for archs
which negates the subtrahend..
* KERNFS_REMOVED tests in add and rename paths are replaced with
kernfs_get/put_active() of the target nodes. Due to the way the add
path is structured now, active ref handling is done in the callers
of kernfs_add_one(). This will be consolidated up later.
* kernfs_remove_one() is updated to deactivate instead of setting
KERNFS_REMOVED. This removes deactivation from kernfs_deactivate(),
which is now renamed to kernfs_drain().
* kernfs_dop_revalidate() now tests RB_EMPTY_NODE(&kn->rb) instead of
KERNFS_REMOVED and KERNFS_REMOVED test in kernfs_dir_pos() is
dropped. A node which is removed from the children rbtree is not
included in the iteration in the first place. This means that a
node may be visible through vfs a bit longer - it's now also visible
after deactivation until the actual removal. This slightly enlarged
window difference doesn't make any difference to the userland.
* Sanity check on KERNFS_REMOVED in kernfs_put() is replaced with
checks on the active ref.
* Some comment style updates in the affected area.
v2: Reordered before removal path restructuring. kernfs_active()
dropped and kernfs_get/put_active() used instead. RB_EMPTY_NODE()
used in the lookup paths.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There currently are two mechanisms gating active ref lockdep
annotations - KERNFS_LOCKDEP flag and KERNFS_ACTIVE_REF type mask.
The former disables lockdep annotations in kernfs_get/put_active()
while the latter disables all of kernfs_deactivate().
While KERNFS_ACTIVE_REF also behaves as an optimization to skip the
deactivation step for non-file nodes, the benefit is marginal and it
needlessly diverges code paths. Let's drop KERNFS_ACTIVE_REF and use
KERNFS_LOCKDEP in kernfs_deactivate() too.
While at it, add a test helper kernfs_lockdep() to test KERNFS_LOCKDEP
flag so that it's more convenient and the related code can be compiled
out when not enabled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs_node->u.completion is used to notify deactivation completion
from kernfs_put_active() to kernfs_deactivate(). We now allow
multiple racing removals of the same node and the current removal
scheme is no longer correct - kernfs_remove() invocation may return
before the node is properly deactivated if it races against another
removal. The removal path will be restructured to address the issue.
To help such restructure which requires supporting multiple waiters,
this patch replaces kernfs_node->u.completion with
kernfs_root->deactivate_waitq. This makes deactivation event
notifications share a per-root waitqueue_head; however, the wait path
is quite cold and this will also allow shaving one pointer off
kernfs_node.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When kernfs_seq_start() fails to obtain an active reference, it
returns ERR_PTR(-ENODEV). kernfs_seq_stop() is then invoked with the
error pointer value; however, it still proceeds to invoke
kernfs_put_active() on the node leading to unbalanced put.
If kernfs_seq_stop() is called even after active ref failure, it
should skip invocation of @ops->seq_stop() and put_active.
Unfortunately, this is a bit complicated because active ref failure
isn't the only thing which may fail with ERR_PTR(-ENODEV).
@ops->seq_start/next() may also fail with the error value and
kernfs_seq_stop() doesn't have a way to tell apart those failures.
Work it around by factoring out the active part of kernfs_seq_stop()
into kernfs_seq_stop_active() and invoking it directly if
@ops->seq_start/next() fail with ERR_PTR(-ENODEV) and updating
kernfs_seq_stop() to skip kernfs_seq_stop_active() on
ERR_PTR(-ENODEV). This is a bit nasty but ensures that the active put
is skipped iff get_active failed in kernfs_seq_start().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We were apparently relying on the defaults on BDW, which resulted in no
hotplug or AUX interrupts. So be sure to call the ibx_irq_preinstall to
enable all interrupts.
v2: use preinstall instead of redundant SDIER write
References: https://bugs.freedesktop.org/show_bug.cgi?id=72834
References: https://bugs.freedesktop.org/show_bug.cgi?id=72833
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* pci/resource:
PCI: Allocate 64-bit BARs above 4G when possible
PCI: Enforce bus address limits in resource allocation
PCI: Split out bridge window override of minimum allocation address
agp/ati: Use PCI_COMMAND instead of hard-coded 4
agp/intel: Use CPU physical address, not bus address, for ioremap()
agp/intel: Use pci_bus_address() to get GTTADR bus address
agp/intel: Use pci_bus_address() to get MMADR bus address
agp/intel: Support 64-bit GMADR
agp/intel: Rename gtt_bus_addr to gtt_phys_addr
drm/i915: Rename gtt_bus_addr to gtt_phys_addr
agp: Use pci_resource_start() to get CPU physical address for BAR
agp: Support 64-bit APBASE
PCI: Add pci_bus_address() to get bus address of a BAR
PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev
PCI: Change pci_bus_region addresses to dma_addr_t
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts parts of c320b976d7 ("PCI: Add implementation for PRI
capability"), removing these interfaces:
pci_pri_enabled()
pci_pri_stopped()
pci_pri_status()
[bhelgaas: split to separate patch]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Joerg Roedel <joro@8bytes.org>
In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to
call destroy_work_on_stack() which frees the debug object to pair
with INIT_WORK_ONSTACK().
Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 6f96b3063c)
With CRC check is enabled, if trying to set an attributes value just
equal to the maximum size of XATTR_SIZE_MAX would cause the v3 remote
attr write verification procedure failure, which would yield the back
trace like below:
<snip>
XFS (sda7): Internal error xfs_attr3_rmt_write_verify at line 191 of file fs/xfs/xfs_attr_remote.c
<snip>
Call Trace:
[<ffffffff816f0042>] dump_stack+0x45/0x56
[<ffffffffa0d99c8b>] xfs_error_report+0x3b/0x40 [xfs]
[<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffffa0d99ce5>] xfs_corruption_error+0x55/0x80 [xfs]
[<ffffffffa0dbef6b>] xfs_attr3_rmt_write_verify+0x14b/0x1a0 [xfs]
[<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d96edd>] _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffff81184cda>] ? vm_map_ram+0x31a/0x460
[<ffffffff81097230>] ? wake_up_state+0x20/0x20
[<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d9726b>] xfs_buf_iorequest+0x6b/0xc0 [xfs]
[<ffffffffa0d97315>] xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d97906>] xfs_bwrite+0x46/0x80 [xfs]
[<ffffffffa0dbfa94>] xfs_attr_rmtval_set+0x334/0x490 [xfs]
[<ffffffffa0db84aa>] xfs_attr_leaf_addname+0x24a/0x410 [xfs]
[<ffffffffa0db8893>] xfs_attr_set_int+0x223/0x470 [xfs]
[<ffffffffa0db8b76>] xfs_attr_set+0x96/0xb0 [xfs]
[<ffffffffa0db13b2>] xfs_xattr_set+0x42/0x70 [xfs]
[<ffffffff811df9b2>] generic_setxattr+0x62/0x80
[<ffffffff811e0213>] __vfs_setxattr_noperm+0x63/0x1b0
[<ffffffff81307afe>] ? evm_inode_setxattr+0xe/0x10
[<ffffffff811e0415>] vfs_setxattr+0xb5/0xc0
[<ffffffff811e054e>] setxattr+0x12e/0x1c0
[<ffffffff811c6e82>] ? final_putname+0x22/0x50
[<ffffffff811c708b>] ? putname+0x2b/0x40
[<ffffffff811cc4bf>] ? user_path_at_empty+0x5f/0x90
[<ffffffff811bdfd9>] ? __sb_start_write+0x49/0xe0
[<ffffffff81168589>] ? vm_mmap_pgoff+0x99/0xc0
[<ffffffff811e07df>] SyS_setxattr+0x8f/0xe0
[<ffffffff81700c2d>] system_call_fastpath+0x1a/0x1f
Tests:
setfattr -n user.longxattr -v `perl -e 'print "A"x65536'` testfile
This patch fix it to check the remote EA size is greater than the
XATTR_SIZE_MAX rather than more than or equal to it, because it's
valid if the specified EA value size is equal to the limitation as
per VFS setxattr interface.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 85dd0707f0)
o Consider number of Tx queues while calculating the length of
Tx statistics as part of ethtool stats.
o Calculate statistics lenght properly for 82xx and 83xx adapter
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Driver was not updating TX stats so it was not populating
statistics in `ifconfig` command output.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>