When testing if the firmware's initial value is valid, we should use
the corrected level value instead of the raw value returned from
firmware.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Do not load the Intel pstate driver if the platform firmware
(ACPI BIOS) supports the power management alternatives.
The ACPI BIOS indicates that the OS control mode can be used
if the _PSS (Performance Supported States) is defined in ACPI
table. For the OS control mode, the Intel pstate driver will be
loaded.
HP BIOS has several power management modes (firmware, OS-control and
so on). For the OS control mode in HP BIOS, the Intel p-state driver
will be loaded. When the customer chooses the firmware power
management in HP BIOS, the Intel p-state driver will be ignored.
I put hw_vendor_info vendor_info in case other vendors (Dell, Lenovo...)
have their firmware power management. Vendors should make sure their
firmware power management works properly, and they can go for adding
their vendor info to the variable.
I have verified the patch on HP ProLiant servers. The patch worked
correctly.
Signed-off-by: Adrian Huang <adrianhuang0701@gmail.com>
[rjw: Fixed up !CONFIG_ACPI build]
[Linda Knippers: As Adrian has recently left HP, I retested the
updated patch on an HP ProLiant server and verified that it is
behaving correctly. When the BIOS is configured for OS control for
power management, the intel_pstate driver loads as expected. When
the BIOS is configured to provide the power management, the
intel_pstate driver does not load and we get the pcc_cpufreq driver
instead.]
Signed-off-by: Linda Knippers <linda.knippers@hp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When system has a lot of highmem (e.g. 16GiB using a 32 bits kernel),
the code to calculate how much memory we need to preallocate in
normal zone may cause overflow. As Leon has analysed:
It looks that during computing 'alloc' variable there is overflow:
alloc = (3943404 - 1970542) - 1978280 = -5418 (signed)
And this function goes to err_out.
Fix this by avoiding that overflow.
References: https://bugzilla.kernel.org/show_bug.cgi?id=60817
Reported-and-tested-by: Leon Drugi <eyak@wp.pl>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Increasingly, Linux is running on thermally constrained devices. The simple
thermal relationship between processor and fan has become past for modern
computers.
As hardware vendors cope with the thermal constraints on their products,
more sensors are added, new cooling capabilities are introduced. The
complexity of the thermal relationship can grow exponentially among cooling
devices, zones, sensors, and trip points. They can also change dynamically.
To expose such relationship to the userspace, Linux generic thermal layer
introduced sysfs entry at /sys/class/thermal with a matrix of symbolic
links, trip point bindings, and device instances. To traverse such
matrix by hand is not a trivial task. Testing is also difficult in that
thermal conditions are often exception cases that hard to reach in
normal operations.
TMON is conceived as a tool to help visualize, tune, and test the
complex thermal subsystem.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
According to the ACPI spec (5.0, Section 6.3.5), the "Device
insertion in progress (pending)" (0x80) _OST status code is
reserved for the "Insertion Processing" (0x200) source event
which is "a result of an OSPM action". Specifically, it is not
a notification, so that status code should not be used during
notification processing, which unfortunately is done by
acpi_scan_bus_device_check().
For this reason, drop the ACPI_OST_SC_INSERT_IN_PROGRESS _OST
status evaluation from there (it was a mistake to put it in there
in the first place).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Since _handle_hotplug_event_root() is run from the ACPI hotplug
workqueue, it doesn't need to queue up a work item to eject a PCI
host bridge on the same workqueue. Instead, it can just carry out
the eject by calling acpi_bus_device_eject() directly, so make that
happen.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no real reasn why acpi_bus_device_eject() and
acpi_bus_hot_remove_device() should work differently, so rework
acpi_bus_device_eject() so that it can be called internally by
both acpi_bus_hot_remove_device() and acpi_eject_store_work().
Accordingly, rework acpi_hotplug_notify_cb() to queue up the
execution of acpi_bus_hot_remove_device() through
acpi_os_hotplug_execute() on eject request notifications.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Notice that handle_root_bridge_removal() is the only user of
acpi_bus_hot_remove_device(), so it doesn't have to be exported
any more and can be made internal to the ACPI core.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Simplify handle_root_bridge_removal() and acpi_eject_store() by
getting rid of struct acpi_eject_event and passing device objects
directly to async routines executed via acpi_os_hotplug_execute().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
It is required to do get_device() on the struct acpi_device in
question before passing it to acpi_bus_hot_remove_device() through
acpi_os_hotplug_execute(), because acpi_bus_hot_remove_device()
calls acpi_scan_hot_remove() that does put_device() on that
object.
The ACPI PCI root removal routine, handle_root_bridge_removal(),
doesn't do that, which may lead to premature freeing of the
device object or to executing put_device() on an object that
has been freed already.
Fix this problem by making handle_root_bridge_removal() use
get_device() as appropriate.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: All applicable <stable@vger.kernel.org>
In theory, an ACPI device object may be the parent of another
device object whose hotplug is disabled by user space through its
scan handler. In that case, the eject operation targeting the
parent should fail as though the parent's own hotplug was disabled,
but currently this is not the case, because acpi_scan_hot_remove()
doesn't check the disable/enable hotplug status of the children
of the top-most object passed to it.
To fix this, modify acpi_bus_offline_companions() to return an
error code if hotplug is disabled for the given device object.
[Also change the name of the function to acpi_bus_offline(),
because it is not only about companions any more, and change
the name of acpi_bus_online_companions() accordingly.] Make
acpi_scan_hot_remove() propagate that error to its callers.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Here's the big char/misc driver patchset for 3.13-rc1.
Lots of stuff in here, including some new drivers for Intel's "MIC"
co-processor devices, and a new eeprom driver. Other things include the
driver attribute cleanups, extcon driver updates, hyperv updates, and a
raft of other miscellaneous driver fixes.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEUEABECAAYFAlJ6v9kACgkQMUfUDdst+ykPzACXdwm/1DryfqnyhVPyITNAKcma
WACg1Yu5mtIvJg3NsN/7Ff0Qfj6GzYY=
=MIEe
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc patches from Greg KH:
"Here's the big char/misc driver patchset for 3.13-rc1.
Lots of stuff in here, including some new drivers for Intel's "MIC"
co-processor devices, and a new eeprom driver. Other things include
the driver attribute cleanups, extcon driver updates, hyperv updates,
and a raft of other miscellaneous driver fixes.
All of these have been in linux-next for a while"
* tag 'char-misc-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (121 commits)
misc: mic: Fixes for randconfig build errors and warnings.
tifm: fix error return code in tifm_7xx1_probe()
w1-gpio: Use devm_* functions
w1-gpio: Detect of_gpio_error for first gpio
uio: Pass pointers to virt_to_page(), not integers
uio: fix memory leak
misc/at24: avoid infinite loop on write()
misc/93xx46: avoid infinite loop on write()
misc: atmel_pwm: add deferred-probing support
mei: wd: host_init propagate error codes from called functions
mei: replace stray pr_debug with dev_dbg
mei: bus: propagate error code returned by mei_me_cl_by_id
mei: mei_cl_link remove duplicated check for open_handle_count
mei: print correct device state during unexpected reset
mei: nfc: fix memory leak in error path
lkdtm: add tests for additional page permissions
lkdtm: adjust recursion size to avoid warnings
lkdtm: isolate stack corruption test
mei: move host_clients_map cleanup to device init
mei: me: downgrade two errors to debug level
...
ACPI scan handlers should always be attached to struct acpi_device
objects before any ACPI drivers, but there is a window during which
a driver may be attached to a struct acpi_device before checking if
there is a matching scan handler. Namely, that will happen if an
ACPI driver module is loaded during acpi_bus_scan() right after
the first namespace walk is complete and before the given device
is processed by the second namespace walk.
To prevent that from happening, set the match_driver flags of
struct acpi_device objects right before running device_attach()
for them in acpi_bus_device_attach().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Since acpi_pci_slot_init() is now called from acpi_pci_init()
and pci-acpi.h contains its header, remove that header (and the
empty definition of that function for CONFIG_ACPI_PCI_SLOT unset)
from internal.h as it doesn't have to be there any more. That also
avoids a build warning about duplicate function definitions for
CONFIG_ACPI_PCI_SLOT unset.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
By default, IRQ work is run from the tick interrupt (see
irq_work_run() in update_process_times()). When we're in full
NOHZ mode, restarting the tick requires the use of IRQ work and
if the only place we run IRQ work is in the tick interrupt we
have an unbreakable cycle. Implement arch_irq_work_raise() via
self IPIs to break this cycle and get the tick started again.
Note that we implement this via IPIs which are only available on
SMP builds. This shouldn't be a problem because full NOHZ is only
supported on SMP builds anyway.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Here's the big USB driver update for 3.13-rc1.
It includes the usual xhci changes, EHCI updates to get the scheduling
of USB transactions working better, and a raft of gadget and musb
updates as well.
All of this has been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlJ6xycACgkQMUfUDdst+ymO+gCgxXdQXSU23i9ykc2CKBemdEBH
w6IAoKcokITcdN1IxxkfiMxOEld2hgZm
=3kbb
-----END PGP SIGNATURE-----
Merge tag 'usb-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver update from Greg KH:
"Here's the big USB driver update for 3.13-rc1.
It includes the usual xhci changes, EHCI updates to get the scheduling
of USB transactions working better, and a raft of gadget and musb
updates as well.
All of this has been in linux-next for a while with no reported
issues"
* tag 'usb-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (305 commits)
USB: Maintainers change for usb serial drivers
usb: usbtest: support container id descriptor test
usb: usbtest: support superspeed device capbility descriptor test
usb: usbtest: support usb2 extension descriptor test
usb: chipidea: only get vbus regulator for non-peripheral mode
USB: ehci-atmel: add usb_clk for transition to CCF
usb: cdc-wdm: ignore speed change notifications
USB: cdc-wdm: support back-to-back USB_CDC_NOTIFY_RESPONSE_AVAILABLE notifications
usbatm: Fix dynamic_debug / ratelimited atm_dbg and atm_rldbg macros
printk: pr_debug_ratelimited: check state first to reduce "callbacks suppressed" messages
usb: usbtest: support bos descriptor test for usb 3.0
USB: phy: samsung: Support multiple PHYs of same type
usb: wusbcore: change WA_SEGS_MAX to a legal value
usb: wusbcore: add a quirk for Alereon HWA device isoc behavior
usb: wusbcore: combine multiple isoc frames in a single transfer request.
usb: wusbcore: set the RPIPE wMaxPacketSize value correctly
usb: chipidea: host: more enhancement when ci->hcd is NULL
usb: ohci: remove ep93xx bus glue platform driver
usb: usbtest: fix checkpatch warning as sizeof code style
UWB: clean up attribute use by using ATTRIBUTE_GROUPS()
...
The ARM architecture reference specifies that the IT state bits in the
PSR must be all zeros in ARM mode or behavior is unspecified. On the
Qualcomm Snapdragon S4/Krait architecture CPUs the processor continues
to consider the IT state bits while in ARM mode. This makes it so
that some instructions are skipped by the CPU.
Signed-off-by: T.J. Purtell <tj@mobisocial.us>
[rmk+kernel@arm.linux.org.uk: fixed whitespace formatting in patch]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
No-MMU configurations currenty fail to build because they are missing
the early_paging_init() symbol.
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The exception handling code fails to clear the IT state, potentially
leading to incorrect execution of the fixup if the size of the IT
block is more than one.
Let fixup_exception do the IT sanitizing if a fixup has been found,
and restore CPSR from the stack when returning from a data abort.
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 6dedcca610 ("hotplug, powerpc, x86: Remove
cpu_hotplug_driver_lock())" removes the the definition of
cpu_hotplug_driver_{lock,unlock} APIs, thereby causing a build error.
Replace these calls with {lock,unlock}_device_hotplug().
Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Certain platforms do not allow writes in the MSI-X BARs to setup or tear
down vector values. To combat against the generic code trying to write to
that and either silently being ignored or crashing due to the pagetables
being marked R/O this patch introduces a platform override.
Note that we keep two separate, non-weak, functions default_mask_msi_irqs()
and default_mask_msix_irqs() for the behavior of the arch_mask_msi_irqs()
and arch_mask_msix_irqs(), as the default behavior is needed by x86 PCI
code.
For Xen, which does not allow the guest to write to MSI-X tables - as the
hypervisor is solely responsible for setting the vector values - we
implement two nops.
This fixes a Xen guest crash when passing a PCI device with MSI-X to the
guest. See the bugzilla for more details.
[bhelgaas: add bugzilla info]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=64581
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>
* pci/misc:
PCI: Warn on driver probe return value greater than zero
PCI: Drop warning about drivers that don't use pci_set_master()
PCI: Workaround missing pci_set_master in pci drivers
PCI: Update pcie_ports 'auto' behavior for non-ACPI platforms
Ages ago, drivers could return values greater than zero from their probe
function and this would be regarded as success.
But after f3ec4f87d6 ("PCI: change device runtime PM settings for probe
and remove") and 967577b062 ("PCI/PM: Keep runtime PM enabled for unbound
PCI devices"), we set dev->driver to NULL if the driver's probe function
returns a value greater than zero.
__pci_device_probe() treats this as success, and drivers can still mostly
work even with dev->driver == NULL, but PCI power management doesn't work,
and we don't call the driver's remove function on rmmod.
To help catch these driver problems, issue a warning in this case.
[bhelgaas: changelog]
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Introduce flag KM_ZERO which is used to alloc zeroed entry, and convert
kmem_{zone_}zalloc to call kmem_{zone_}alloc() with KM_ZERO directly,
in order to avoid the setting to zero step.
And following Dave's suggestion, make kmem_{zone_}zalloc static inline
into kmem.h as they're now just a simple wrapper.
V2:
Make kmem_{zone_}zalloc static inline into kmem.h as Dave suggested.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
The early_init_devtree() API was removed in linux-next for 3.13 with
commit "mips: use early_init_dt_scan". This causes Netlogic XLP compile
to fail:
arch/mips/netlogic/xlp/setup.c:101: undefined reference to `early_init_devtree'
Add xlp_early_init_devtree() which uses the __dt_setup_arch() to
handle early device tree related initialization to fix this.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
The current default perf paranoid level is "1" which has
"perf_paranoid_kernel()" return false, and giving any operations that
use it, access to normal users. Unfortunately, this includes function
tracing and normal users should not be allowed to enable function
tracing by default.
The proper level is defined at "-1" (full perf access), which
"perf_paranoid_tracepoint_raw()" will only give access to. Use that
check instead for enabling function tracing.
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: stable@vger.kernel.org # 3.4+
CVE: CVE-2013-2930
Fixes: ced39002f5 ("ftrace, perf: Add support to use function tracepoint in perf")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
set_swbp() and set_orig_insn() are __weak, but this is pointless
because write_opcode() is static.
Export write_opcode() as uprobe_write_opcode() for the upcoming
arm port, this way it can actually override set_swbp() and use
__opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Currently xol_get_insn_slot() assumes that we should simply copy
arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
just the copy of the original insn.
This is not true for arm which needs to create another insn to
execute it out-of-line.
So this patch simply adds the new member, ->ixol into the union.
This doesn't make any difference for x86 and powerpc, but arm
can divorce insn/ixol and initialize the correct xol insn in
arch_uprobe_analyze_insn().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Turn module_init() into __initcall() and kill module_exit().
This code can't be compiled as a module so these module_*()
calls only add the confusion, especially if arch-dependant
code needs its own initialization hooks.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Move the function declarations from the arch headers to the common
header, since only the function bodies are architecture-specific.
These changes are from Vincent Rabin's uprobes patch.
[ oleg: update arch/powerpc/include/asm/uprobes.h ]
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
To help track down AGI/AGF lock ordering issues, I added these
tracepoints to tell us when an AGI or AGF is read and locked. With
these we can now determine if the lock ordering goes wrong from
tracing captures.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
I debugging a log tail issue on a RHEL6 kernel, I added these trace
points to trace log items being added, moved and removed in the AIL
and how that affected the log tail LSN that was written to the log.
They were very helpful in that they immediately identified the cause
of the problem being seen. Hence I'd like to always have them
available for use.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
The variable hv_lapic_frequency causes an unused variable warning if
CONFIG_X86_LOCAL_APIC is disabled. Since the variable is only used
inside a small if statement, move the declaration of that variable
into the if statement itself.
Cc: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1381444224-3303-1-git-send-email-kys@microsoft.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The warnings are really harmless but annoying. Since they are only
about debug prints, and it's at most 32bit DMA, let's just cast to
unsigned long.
sound/pci/lx6464es/lx6464es.c:457:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
sound/pci/lx6464es/lx_core.c:1195:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Just add an ifdef CONFIG_PM to shut up the warnings:
sound/isa/cmi8328.c:129:13: warning: ‘snd_cmi8328_cfg_save’ defined but not used [-Wunused-function]
sound/isa/cmi8328.c:136:13: warning: ‘snd_cmi8328_cfg_restore’ defined but not used [-Wunused-function]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The while loop only peeks at the top request in the queue but does
not yet consume it. Since we only handle fs requests, we need to
dequeue and complete all other request command types with error
just in case we would ever receive such an unforeseen request.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently seqlocks and seqcounts don't support lockdep.
After running across a seqcount related deadlock in the timekeeping
code, I used a less-refined and more focused variant of this patch
to narrow down the cause of the issue.
This is a first-pass attempt to properly enable lockdep functionality
on seqlocks and seqcounts.
Since seqcounts are used in the vdso gettimeofday code, I've provided
non-lockdep accessors for those needs.
I've also handled one case where there were nested seqlock writers
and there may be more edge cases.
Comments and feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.
The u64_stats_sync structure, uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.
This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers. But the benefit of ensuring correctness makes
this worth while.
Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers thoughts on how to best merge this change along with
the seqcount lockdep enablement.
Feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Simon Horman <horms@verge.net.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
nr_busy_cpus parameter is used by nohz_kick_needed() to find out the
number of busy cpus in a sched domain which has SD_SHARE_PKG_RESOURCES
flag set. Therefore instead of updating nr_busy_cpus at every level
of sched domain, since it is irrelevant, we can update this parameter
only at the parent domain of the sd which has this flag set. Introduce
a per-cpu parameter sd_busy which represents this parent domain.
In nohz_kick_needed() we directly query the nr_busy_cpus parameter
associated with the groups of sd_busy.
By associating sd_busy with the highest domain which has
SD_SHARE_PKG_RESOURCES flag set, we cover all lower level domains
which could have this flag set and trigger nohz_idle_balancing if any
of the levels have more than one busy cpu.
sd_busy is irrelevant for asymmetric load balancing. However sd_asym
has been introduced to represent the highest sched domain which has
SD_ASYM_PACKING flag set so that it can be queried directly when
required.
While we are at it, we might as well change the nohz_idle parameter to
be updated at the sd_busy domain level alone and not the base domain
level of a CPU. This will unify the concept of busy cpus at just one
level of sched domain where it is currently used.
Signed-off-by: Preeti U Murthy<preeti@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: svaidy@linux.vnet.ibm.com
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Cc: peterz@infradead.org
Cc: mikey@neuling.org
Link: http://lkml.kernel.org/r/20131030031252.23426.4417.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Asymmetric scheduling within a core is a scheduler loadbalancing
feature that is triggered when SD_ASYM_PACKING flag is set. The goal
for the load balancer is to move tasks to lower order idle SMT threads
within a core on a POWER7 system.
In nohz_kick_needed(), we intend to check if our sched domain (core)
is completely busy or we have idle cpu.
The following check for SD_ASYM_PACKING:
(cpumask_first_and(nohz.idle_cpus_mask, sched_domain_span(sd)) < cpu)
already covers the case of checking if the domain has an idle cpu,
because cpumask_first_and() will not yield any set bits if this domain
has no idle cpu.
Hence, nr_busy check against group weight can be removed.
Reported-by: Michael Neuling <michael.neuling@au1.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131030031242.23426.13019.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX
device. For PCI bus without UBOX device, we find the next bus that
has UBOX device and use its 'bus to socket' mapping.
Besides the counter/control registers in IRP boxes are not properly
aligned.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is
completely the same as SandyBridge-EP.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While this is really minor, but strncpy() does the unnecessary
zero-padding till the end of tmp[16] and it is called every time
we are going to use the string literal.
Turn these strncpy()'s into the single strlcpy() under the new
label, saves 72 bytes.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>