sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
variant of __cycles_2_ns().
Most of the time sched_clock() is called with irqs disabled already.
The few places that call it with irqs enabled need to be updated.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
in scheduler-intense workloads native_read_tsc() overhead accounts for
20% of the system overhead:
659567 system_call 41222.9375
686796 schedule 435.7843
718382 __switch_to 665.1685
823875 switch_mm 4526.7857
1883122 native_read_tsc 55385.9412
9761990 total 2.8468
this is large part due to the rdtsc_barrier() that is done before
and after reading the TSC.
But sched_clock() is not a precise clock in the GTOD sense, using such
barriers is completely pointless. So remove the barriers and only use
them in vget_cycles().
This improves lat_ctx performance by about 5%.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: change in trace output
Because the trace buffers are per cpu ring buffers, the start of
the trace can be confusing. If one CPU is very active at the
end of the trace, its history will not go as far back as the
other CPU traces. This means that output for a particular CPU
may not appear for the first part of a trace.
To help annotate what is happening, and to prevent any more
confusion, this patch adds a line that annotates the start of
a CPU buffer output.
For example:
automount-3495 [001] 184.596443: dnotify_parent <-vfs_write
[...]
automount-3495 [001] 184.596449: dput <-path_put
automount-3496 [002] 184.596450: down_read_trylock <-do_page_fault
[...]
sshd-3497 [001] 184.597069: up_read <-do_page_fault
<idle>-0 [000] 184.597074: __exit_idle <-exit_idle
[...]
automount-3496 [002] 184.597257: filemap_fault <-__do_fault
<idle>-0 [003] 184.597261: exit_idle <-smp_apic_timer_interrupt
Note, parsers of a trace output should always ignore any lines that
start with a '#'.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: preemptoff not tested in selftest
Due to the BKL not being preemptable anymore, the selftest of the
preemptoff code can not be tested. It requires that it is called
with preemption enabled, but since the BKL is held, that is no
longer the case.
This patch simply skips those tests if it detects that the context
is not preemptable. The following will now show up in the tests:
Testing tracer preemptoff: can not test ... force PASSED
Testing tracer preemptirqsoff: can not test ... force PASSED
When the BKL is removed, or it becomes preemptable once again, then
the tests will be performed.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add alignment option for recordmcount.pl script
Align the __mcount_loc sections so that architectures with strict
alignment requirements need not worry about performing unaligned
accesses.
This fixes an issue where I was seeing unaligned accesses, which are not
supported on our architecture (the results of an unaligned access are
undefined).
Signed-off-by: Matt Fleming <matthew.fleming@imgtec.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: remove obsolete variable in trace_array structure
With the new start / stop method of ftrace, the ctrl variable
in the trace_array structure is now obsolete. Remove it.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Remove the ctrl_update tracer method
With the new quick start/stop method of tracing, the ctrl_update
method is out of date.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: have the ftrace_printk enabled on startup
It is confusing to have to "echo trace_printk > /debug/tracing/iter_ctrl"
after adding ftrace_printk in the kernel.
Currently the trace_printk is set to off by default. ftrace_printk
should only be in open kernel code when used for debugging, and thus
it should be enabled by default.
It may also be used to record data within a tracer, but those ftrace_printks
should be within wrappers that are either enabled by trace_points or
have a variable protecting the code path from being entered when the
tracer is disabled.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix to irqsoff tracer output
In converting to the new start / stop ftrace handling, the irqsoff
tracer start called the irqsoff reset function. irqsoff tracer is
not the same as the other traces, and it resets the buffers while
searching for the longest latency.
The reset that the irqsoff stop method calls disables the function
tracing. That means that, by starting the tracer, the function
tracer is disabled incorrectly.
This patch simply removes the call to reset which keeps the function
tracing enabled. Reset is not needed for the irqsoff stop method.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix for sched_switch that broke dynamic ftrace startup
The commit: tracing/fastboot: use sched switch tracer from boot tracer
broke the API of the sched_switch trace. The use of the
tracing_start/stop_cmdline record is for only recording the cmdline,
NOT recording the schedule switches themselves.
Seeing that the boot tracer broke the API to do something that it
wanted, this patch adds a new interface for the API while
puting back the original interface of the old API.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: boot tracer startup modified
The boot tracer calls into some of the schedule tracing private functions
that should not be exported. This patch cleans it up, and makes
way for further changes in the ftrace infrastructure.
This patch adds a api to assign a tracer array to the schedule
context switch tracer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix of output of set_ftrace_filter
Commit ftrace: do not show freed records in available_filter_functions
Removed a bit too much from the set_ftrace_filter code, where we now see
all functions in the set_ftrace_filter file even when we set a filter.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thanks to Randy Dunlap for finding this problem.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This Kconfig change allows the common 'make allmodconfig' and
'make allyesconfig' build options to skip the staging tree, which is
probably what you want to have happen anyway.
This makes the linux-next developer's life a lot easier so he doesn't
have to worry about changes that break the staging tree, that's for me
to worry about...
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
If an ACPI graphics device supports backlight brightness functions (cmp. with
latest ACPI spec Appendix B), let the ACPI video driver control backlight and
switch backlight control off in vendor specific ACPI drivers (asus_acpi,
thinkpad_acpi, eeepc, fujitsu_laptop, msi_laptop, sony_laptop, acer-wmi).
Currently it is possible to load above drivers and let both poke on the
brightness HW registers, the video and vendor specific ACPI drivers -> bad.
This patch provides the basic support to check for BIOS capabilities before
driver loading time. Driver specific modifications are in separate follow up
patches.
"acpi_backlight=vendor"
Prever vendor driver over ACPI driver for backlight.
"acpi_backlight=video" (default)
Prever ACPI driver over vendor driver for backlight.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This is a reimplemention of commit
0119509c4f
from Matthew Garrett <mjg59@srcf.ucam.org>
This patch got removed because of a regression: ThinkPads with a
Intel graphics card and an Integrated Graphics Device BIOS implementation
stopped working.
In fact, they only worked because the ACPI device of the discrete, the
wrong one, got used (via int10). So ACPI functions were poking on the wrong
hardware used which is a sever bug.
The next patch provides support for above ThinkPads to be able to
switch brightness via the legacy thinkpad_acpi driver and automatically
detect when to use it.
Original commit message from Matthew Garrett:
Vendors often ship machines with a choice of integrated or discrete
graphics, and use the same DSDT for both. As a result, the ACPI video
module will locate devices that may not exist on this specific platform.
Attempt to determine whether the device exists or not, and abort the
device creation if it doesn't.
http://bugzilla.kernel.org/show_bug.cgi?id=9614
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len's tree branch release-2.6.27, found an unwanted return statement at
evgpe.c.
(git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
release-2.6.27)
Signed-of-by Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Reformat acpi.debug_layer and acpi.debug_level documentation so it's
more readable, add some clues about how to figure out the mask bits that
enable a specific ACPI_DEBUG_PRINT statement, and include some useful
examples.
Move the list of masks to Documentation/acpi/debug.txt (these are
copies of the authoritative values in acoutput.h and acpi_drivers.h).
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When CONFIG_ACPI_DEBUG=y, the default acpi_dbg_layer and acpi_dbg_level
values built into the ACPI CA have some debug output enabled. We'd
rather be quiet unless the user actually specified the acpi.debug_level
argument.
This enables distros to ship with CONFIG_ACPI_DEBUG=y without
inundating users with debug output.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
/sys/module/acpi/parameters/debug_layers used to contain only the
debug layers defined by the ACPI CA. This patch adds the additional
layer definitions for ACPI drivers.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some of the component definitions that were previous scattered around
the drivers conflict with each other. That doesn't hurt anything
except that setting one bit in the debug_layer mask would turn on
debugging in two different modules. This patch fixes the conflicts.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Move all the component definitions for drivers to a single shared place,
include/acpi/acpi_drivers.h.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
breakage introduced by following patch
commit 27663c5855
Author: Matthew Wilcox <willy@linux.intel.com>
Date: Fri Oct 10 02:22:59 2008 -0400
acpi_evaluate_integer() does not clear passed variable if
there is an error at evaluation.
So if we ignore error, we must supply initialized variable.
http://bugzilla.kernel.org/show_bug.cgi?id=11917
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP
[IA64] fix boot panic caused by offline CPUs
[IA64] reorder Kconfig options to match x86
[IA64] Build VT-D iommu support into generic kernel
[IA64] remove dead BIO_VMERGE_BOUNDARY definition
[IA64] remove duplicated #include from pci-dma.c
[IA64] use common header for software IO/TLB
[IA64] fix the difference between node_mem_map and node_start_pfn
[IA64] Add error_recovery_info field to SAL section header
[IA64] Add UV watchlist support.
[IA64] Simplify SGI uv vs. sn2 driver issues
IA64 kdump kernel failed to initialize /proc/vmcore in 2.6.28-rc2.
A bug was introduced in this patch commit:
d9a9855d0b
always reserve elfcore header memory in crash kernel
The problem was that the call to reserve_elfcorehdr() should be placed
in CONFIG_CRASH_DUMP rather than in CONFIG_CRASH_KERNEL, which does
not exist.
Signed-off-by: Jay Lan <jlan@sgi.com>
Acked-by: Simon Hormon <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fix range check on mmapped sysfs resource files
PCI: remove excess kernel-doc notation
PCI: annotate return value of pci_ioremap_bar with __iomem
PCI: fix VPD limit quirk for Broadcom 5708S
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, xen: fix use of pgd_page now that it really does return a page
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
xen: make sure stray alias mappings are gone before pinning
vmap: cope with vm_unmap_aliases before vmalloc_init()
Fix the counter overflow check for CPUs with counter width > 32
I had a similar change in a different patch that I didn't submit
and I didn't notice the problem earlier because it was always
tested together.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This triggers false bug reports as it does a bogus kmalloc with locks held
but is never really compiled into the kernel.
Closes#8329
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As we've lost our trivial maintainer for the moment I'll send this
directly. Only touches a comment
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
ext4: calculate journal credits correctly
ext4: wait on all pending commits in ext4_sync_fs()
ext4: Convert to host order before using the values.
ext4: fix missing ext4_unlock_group in error path
jbd2: deregister proc on failure in jbd2_journal_init_inode
jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space
jbd: don't give up looking for space so easily in __log_wait_for_space
Tune SD_MC_INIT the same way as SD_CPU_INIT:
unset SD_BALANCE_NEWIDLE, and set SD_WAKE_BALANCE.
This improves vmark by 5%:
vmark 132102 125968 125497 messages/sec avg 127855.66 .984
vmark 139404 131719 131272 messages/sec avg 134131.66 1.033
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
# *DOCUMENTATION*
When initializing an uninitialized block group in ext4_new_inode(),
its block group checksum must be re-calculated. This fixes a race
when several threads try to allocate a new inode in an UNINIT'd group.
There is some question whether we need to be initializing the block
bitmap in ext4_new_inode() at all, but for now, if we are going to
init the block group, let's eliminate the race.
Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We need to make sure we mark the buffer_heads as dirty and uptodate
so that block_write_full_page write them correctly.
This fixes mmap corruptions that can occur in low memory situations.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Xen requires that all mappings of pagetable pages are read-only, so
that they can't be updated illegally. As a result, if a page is being
turned into a pagetable page, we need to make sure all its mappings
are RO.
If the page had been used for ioremap or vmalloc, it may still have
left over mappings as a result of not having been lazily unmapped.
This change makes sure we explicitly mop them all up before pinning
the page.
Unlike aliases created by kmap, the there can be vmalloc aliases even
for non-high pages, so we must do the flush unconditionally.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>