We need to deal with time ordered events to build a correct
state machine of lock events. This is why we multiplex the lock
events buffers. But the ordering is done from the kernel, on
the tracing fast path, leading to high contention between cpus.
Without multiplexing, the events appears in a weak order.
If we have four events, each split per cpu, perf record will
read the events buffers in the following order:
[ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....]
To handle a post processing reordering, we could just read and sort
the whole in memory, but it just doesn't scale with high amounts
of events: lock events can fill huge amounts in few times.
Basically we need to sort in memory and find a "grace period"
point when we know that a given slice of previously sorted events
can be committed for post-processing, so that we can unload the
memory usage step by step and keep a scalable sorting list.
There is no strong rules about how to define such "grace period".
What does this patch is:
We define a FLUSH_PERIOD value that defines a grace period in
seconds.
We want to have a slice of events covering 2 * FLUSH_PERIOD in our
sorted list.
If FLUSH_PERIOD is big enough, it ensures every events that occured
in the first half of the timeslice have all been buffered and there
are none remaining and there won't be further to put inside this
first timeslice. Then once we reach the 2 * FLUSH_PERIOD
timeslice, we flush the first half to be gentle with the memory
(the second half can still get new events in the middle, so wait
another period to flush it)
FLUSH_PERIOD is defined to 5 seconds. Say the first event started on
time t0. We can safely assume that at the time we are processing
events of t0 + 10 seconds, ther won't be anymore events to read
from perf.data that occured between t0 and t0 + 5 seconds. Hence
we can safely flush the first half.
To point out funky bugs, we have a guardian that checks a new event
timestamp is not below the last event's timestamp flushed and that
displays a warning in this case.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
I've forgot to add 'perf lock' line to command-list.txt,
so users of perf could not find perf lock when they type 'perf'.
Fixing command-list.txt requires document
(tools/perf/Documentation/perf-lock.txt).
But perf lock is too much "under construction" to write a
stable document, so this is something like pseudo document for now.
And I wrote description of perf lock at help section of
CONFIG_LOCK_STAT, this will navigate users of lock trace events.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <1265267295-8388-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Add __percpu sparse annotations to hw_breakpoint.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
In kernel/hw_breakpoint.c, per_cpu(nr_task_bp_pinned, cpu)'s will
trigger spurious noderef related warnings from sparse. Changing it to
&per_cpu(nr_task_bp_pinned[0], cpu) will work around the problem but
deemed to ugly by the maintainer. Leave it alone until better
solution can be found.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <4B7B4B7A.9050902@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
The chip returns voltage and current in mV and mA, but
power supply class uses uV and uA, so add missing conversion.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
The BQ27x00 series of chips can report time-to-empty and
time-to-full, so let's add corresponding properties.
Also report charge status based on status flag register.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
With the Bluetooth 3.0 specification and the introduction of alternate
MAC/PHY (AMP) support, it is required to differentiate between primary
BR/EDR controllers and 802.11 AMP controllers. So introduce a special
type inside HCI device for differentiation.
For now all AMP controllers will be treated as raw devices until an
AMP manager has been implemented.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The debugfs support of the Marvell driver is buggy. It is limited to one
controller per system. Fix this by using the controller specific debugfs
directory as parent.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The output of the inquiry cache is only useful for debugging purposes
and so move it into debugfs.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The id_table field of the struct usb_device_id is constant in <linux/usb.h>
so it is worth to make bcm203x_table also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Trivial patch which adds the __init/__exit macros to the module_init/
module_exit functions of btmrvl_sdio.c driver.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Updated, leaner defconfig for the alchemy development boards.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/1005/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Don't define platform info for second mac on au1100 (which only has a
single mac).
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/1004/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A few hunks somehow ended up outside their #ifdef/endif blocks,
leading to -Werror-induces build failures.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/1003/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The GT-64111 PCI host bridge has no address translation mechanism, so
it can't generate legacy port accesses. This quirk fixes legacy device
port resources to contain the bus addresses actually generated by the
GT-64111.
I think this is the approach Ben Herrenschmidt suggested long ago:
http://marc.info/?l=linux-kernel&m=119733290624544&w=2
This allows us to remove the IORESOURCE_PCI_FIXED hack from
pcibios_fixup_device_resources(), which converts bus addresses to CPU
addresses. IORESOURCE_PCI_FIXED denotes resources that can't be moved;
it has nothing to do with converting bus to CPU addresses.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-mips@linux-mips.org
Tested-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/998/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On Alchemy the PCMCIA area lies at the end of the chips 36bit system bus
area. Currently, addresses at the far end of the 32bit area are assumed
to belong to the PCMCIA area and fixed up to the real 36bit address before
being passed to ioremap().
A previous commit enabled 64 bit physical size for the resource datatype on
Alchemy and this allows to use the correct 36bit addresses when registering
the PCMCIA sockets.
This patch removes the 32-to-36bit address fixup and registers the Alchemy
demo board pcmcia socket with the correct 36bit physical addresses.
Tested on DB1200, with a CF card (ide-cs driver) and a 3c589 PCMCIA ethernet
card.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Cc: Manuel Lauss <manuel.lauss@gmail.com>
Patchwork: http://patchwork.linux-mips.org/patch/994/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Because the VIA SuperIO chip only decodes 24 bits of address space but port
address space currently being configured as 32MB there is the theoretical
possibility of aliases within the I/O port address range.
The complicated solution is to reserve all address range that potencially
could cause such aliases. But with the PCI spec limiting port allocations
for devices to a maximum of 256 bytes 16MB of port address space already is
way more than one would ever expect to be used so we just reduce the port
space to 16MB.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
To: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: linux-mips@linux-mips.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Patchwork: http://patchwork.linux-mips.org/patch/995/
ALIGN(x, bytes) expands to __ALIGN_MASK(x, bytes - 1), so use the one
that is most clear.
Signed-off-by: Matt Turner <mattst88@gmail.com>
To: linux-mips@linux-mips.org
Cc: David Daney <ddaney@caviumnetworks.com>
Patchwork: http://patchwork.linux-mips.org/patch/999/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is just a test program for raw_spinlocks. The main reason I
wrote it is to validate my spinlock changes that I sent in a previous
patch.
To use it enable CONFIG_DEBUG_FS and CONFIG_SPINLOCK_TEST then at run
time do:
# mount -t debugfs none /sys/kernel/debug/
# cat /sys/kernel/debug/mips/spin_single
# cat /sys/kernel/debug/mips/spin_multi
On my 600MHz octeon cn5860 (16 CPUs) I get
spin_single spin_multi
base 106885 247941
spinlock_patch 75194 219465
This shows that for uncontended locks the spinlock patch gives 41%
improvement and for contended locks 12% improvement (1/time).
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/969/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The current locking mechanism uses a ll/sc sequence to release a
spinlock. This is slower than a wmb() followed by a store to unlock.
The branching forward to .subsection 2 on sc failure slows down the
contended case. So we get rid of that part too.
Since we are now working on naturally aligned u16 values, we can get
rid of a masking operation as the LHU already does the right thing.
The ANDI are reversed for better scheduling on multi-issue CPUs
On a 12 CPU 750MHz Octeon cn5750 this patch improves ipv4 UDP packet
forwarding rates from 3.58*10^6 PPS to 3.99*10^6 PPS, or about 11%.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/937/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Save/restore CPLD registers when doing suspend-to-ram; this fixes issues
with harddisk and ethernet not working correctly when resuming on DB1200.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/986/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Change to different macros for assembler macros since the old names in
powertv_setup.c were co-opted for use in asm/asm.h. This broken the
build for the powertv platform. This patch introduces new macros based on
the new macros in asm.h to take the place of the old macro values.
Signed-off-by: David VomLehn <dvomlehn@cisco.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/985/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The sole user is au1xxx_calc_clock() which is only used in early bootup
where the is no paralellism thus no race condition to protect against.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Manuel Lauss <manuel.lauss@googlemail.com>