Pull scheduler changes from Ingo Molnar:
"Main changes:
- scheduler side full-dynticks (user-space execution is undisturbed
and receives no timer IRQs) preparation changes that convert the
cputime accounting code to be full-dynticks ready, from Frederic
Weisbecker.
- Initial sched.h split-up changes, by Clark Williams
- select_idle_sibling() performance improvement by Mike Galbraith:
" 1 tbench pair (worst case) in a 10 core + SMT package:
pre 15.22 MB/sec 1 procs
post 252.01 MB/sec 1 procs "
- sched_rr_get_interval() ABI fix/change. We think this detail is not
used by apps (so it's not an ABI in practice), but lets keep it
under observation.
- misc RT scheduling cleanups, optimizations"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
cputime: Remove irqsave from seqlock readers
sched, powerpc: Fix sched.h split-up build failure
cputime: Restore CPU_ACCOUNTING config defaults for PPC64
sched/rt: Move rt specific bits into new header file
sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
sched: Move sched.h sysctl bits into separate header
sched: Fix signedness bug in yield_to()
sched: Fix select_idle_sibling() bouncing cow syndrome
sched/rt: Further simplify pick_rt_task()
sched/rt: Do not account zero delta_exec in update_curr_rt()
cputime: Safely read cputime of full dynticks CPUs
kvm: Prepare to add generic guest entry/exit callbacks
cputime: Use accessors to read task cputime stats
cputime: Allow dynamic switch between tick/virtual based cputime accounting
cputime: Generic on-demand virtual cputime accounting
cputime: Move default nsecs_to_cputime() to jiffies based cputime file
cputime: Librarize per nsecs resolution cputime definitions
cputime: Avoid multiplication overflow on utime scaling
context_tracking: Export context state for generic vtime
...
Fix up conflict in kernel/context_tracking.c due to comment additions.
Pull perf changes from Ingo Molnar:
"There are lots of improvements, the biggest changes are:
Main kernel side changes:
- Improve uprobes performance by adding 'pre-filtering' support, by
Oleg Nesterov.
- Make some POWER7 events available in sysfs, equivalent to what was
done on x86, from Sukadev Bhattiprolu.
- tracing updates by Steve Rostedt - mostly misc fixes and smaller
improvements.
- Use perf/event tracing to report PCI Express advanced errors, by
Tony Luck.
- Enable northbridge performance counters on AMD family 15h, by Jacob
Shin.
- This tracing commit:
tracing: Remove the extra 4 bytes of padding in events
changes the ABI. All involved parties (PowerTop in particular)
seem to agree that it's safe to do now with the introduction of
libtraceevent, but the devil is in the details ...
Main tooling side changes:
- Add 'event group view', from Namyung Kim:
To use it, 'perf record' should group events when recording. And
then perf report parses the saved group relation from file header
and prints them together if --group option is provided. You can
use the 'perf evlist' command to see event group information:
$ perf record -e '{ref-cycles,cycles}' noploop 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]
$ perf evlist --group
{ref-cycles,cycles}
With this example, default perf report will show you each event
separately.
You can use --group option to enable event group view:
$ perf report --group
...
# group: {ref-cycles,cycles}
# ========
# Samples: 7K of event 'anon group { ref-cycles, cycles }'
# Event count (approx.): 6876107743
#
# Overhead Command Shared Object Symbol
# ................ ....... ................. ..........................
99.84% 99.76% noploop noploop [.] main
0.07% 0.00% noploop ld-2.15.so [.] strcmp
0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del
0.03% 0.03% noploop [kernel.kallsyms] [k] sched_clock_cpu
0.02% 0.00% noploop [kernel.kallsyms] [k] account_user_time
0.01% 0.00% noploop [kernel.kallsyms] [k] __alloc_pages_nodemask
0.00% 0.00% noploop [kernel.kallsyms] [k] native_write_msr_safe
0.00% 0.11% noploop [kernel.kallsyms] [k] _raw_spin_lock
0.00% 0.06% noploop [kernel.kallsyms] [k] find_get_page
0.00% 0.02% noploop [kernel.kallsyms] [k] rcu_check_callbacks
0.00% 0.02% noploop [kernel.kallsyms] [k] __current_kernel_time
As you can see the Overhead column now contains both of ref-cycles
and cycles and header line shows group information also - 'anon
group { ref-cycles, cycles }'. The output is sorted by period of
group leader first.
- Initial GTK+ annotate browser, from Namhyung Kim.
- Add option for runtime switching perf data file in perf report,
just press 's' and a menu with the valid files found in the current
directory will be presented, from Feng Tang.
- Add support to display whole group data for raw columns, from Jiri
Olsa.
- Add per processor socket count aggregation in perf stat, from
Stephane Eranian.
- Add interval printing in 'perf stat', from Stephane Eranian.
- 'perf test' improvements
- Add support for wildcards in tracepoint system name, from Jiri
Olsa.
- Add anonymous huge page recognition, from Joshua Zhu.
- perf build-id cache now can show DSOs present in a perf.data file
that are not in the cache, to integrate with build-id servers being
put in place by organizations such as Fedora.
- perf top now shares more of the evsel config/creation routines with
'record', paving the way for further integration like 'top'
snapshots, etc.
- perf top now supports DWARF callchains.
- Fix mmap limitations on 32-bit, fix from David Miller.
- 'perf bench numa mem' NUMA performance measurement suite
- ... and lots of fixes, performance improvements, cleanups and other
improvements I failed to list - see the shortlog and git log for
details."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits)
perf/x86/amd: Enable northbridge performance counters on AMD family 15h
perf/hwbp: Fix cleanup in case of kzalloc failure
perf tools: Fix build with bison 2.3 and older.
perf tools: Limit unwind support to x86 archs
perf annotate: Make it to be able to skip unannotatable symbols
perf gtk/annotate: Fail early if it can't annotate
perf gtk/annotate: Show source lines with gray color
perf gtk/annotate: Support multiple event annotation
perf ui/gtk: Implement basic GTK2 annotation browser
perf annotate: Fix warning message on a missing vmlinux
perf buildid-cache: Add --update option
uprobes/perf: Avoid uprobe_apply() whenever possible
uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE
uprobes/perf: Teach trace_uprobe/perf code to pre-filter
uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's
uprobes: Introduce uprobe_apply()
perf: Introduce hw_perf_event->tp_target and ->tp_list
uprobes/perf: Always increment trace_uprobe->nhit
uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe
uprobes/tracing: Introduce is_trace_uprobe_enabled()
...
Pull irq core changes from Ingo Molnar:
"The biggest changes are the IRQ-work and printk changes from Frederic
Weisbecker, which prepare the code for 'full dynticks' (the ability to
stop or slow down the periodic tick arbitrarily, not just in idle time
as today):
- Don't stop tick with irq works pending. This fix is generally
useful and concerns archs that can't raise self IPIs.
- Flush irq works before CPU offlining.
- Introduce "lazy" irq works that can wait for the next tick to be
executed, unless it's stopped.
- Implement klogd wake up using irq work. This removes the ad-hoc
printk_tick()/printk_needs_cpu() hooks and make it working even in
dynticks mode.
- Cleanups and fixes."
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Export enable/disable_percpu_irq()
arch Kconfig: Remove references to IRQ_PER_CPU
irq_work: Remove return value from the irq_work_queue() function
genirq: Avoid deadlock in spurious handling
printk: Wake up klogd using irq_work
irq_work: Make self-IPIs optable
irq_work: Warn if there's still work on cpu_down
irq_work: Flush work on CPU_DYING
irq_work: Don't stop the tick with pending works
nohz: Add API to check tick state
irq_work: Remove CONFIG_HAVE_IRQ_WORK
irq_work: Fix racy check on work pending flag
irq_work: Fix racy IRQ_WORK_BUSY flag setting
Pull RCU changes from Ingo Molnar:
"SRCU changes:
- These include debugging aids, updates that move towards the goal of
permitting srcu_read_lock() and srcu_read_unlock() to be used from
idle and offline CPUs, and a few small fixes.
Changes to rcutorture and to RCU documentation:
- Posted to LKML at https://lkml.org/lkml/2013/1/26/188
Enhancements to uniprocessor handling in tiny RCU:
- Posted to LKML at https://lkml.org/lkml/2013/1/27/2
Tag RCU callbacks with grace-period number to simplify callback
advancement:
- Posted to LKML at https://lkml.org/lkml/2013/1/26/203
Miscellaneous fixes:
- Posted to LKML at https://lkml.org/lkml/2013/1/26/204"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()
srcu: Update synchronize_srcu_expedited()'s comments
srcu: Update synchronize_srcu()'s comments
srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
srcu: Simple cleanup for cleanup_srcu_struct()
srcu: Add might_sleep() annotation to synchronize_srcu()
srcu: Simplify __srcu_read_unlock() via this_cpu_dec()
rcu: Allow rcutorture to be built at low optimization levels
rcu: Make rcutorture's shuffler task shuffle recently added tasks
rcu: Allow TREE_PREEMPT_RCU on UP systems
rcu: Provide RCU CPU stall warnings for tiny RCU
context_tracking: Add comments on interface and internals
rcu: Remove obsolete Kconfig option from comment
rcu: Remove unused code originally used for context tracking
rcu: Consolidate debugging Kconfig options
rcu: Correct 'optimized' to 'optimize' in header comment
rcu: Trace callback acceleration
rcu: Tag callback lists with corresponding grace-period number
rcutorture: Don't compare ptr with 0
...
IA64 relied on it through sched.h inclusion:
arch/ia64/kernel/init_task.c:38:11: error: 'MAX_PRIO' undeclared here (not in a function)
arch/ia64/kernel/init_task.c:38:11: error: 'RR_TIMESLICE' undeclared here (not in a function)
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-xaan1twswggedMR0airtpjui@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
1) Revert iwlwifi reclaimed packet tracking, it causes problems for a
bunch of folks. From Emmanuel Grumbach.
2) Work limiting code in brcmsmac wifi driver can clear tx status
without processing the event. From Arend van Spriel.
3) rtlwifi USB driver processes wrong SKB, fix from Larry Finger.
4) l2tp tunnel delete can race with close, fix from Tom Parkin.
5) pktgen_add_device() failures are not checked at all, fix from Cong
Wang.
6) Fix unintentional removal of carrier off from tun_detach(),
otherwise we confuse userspace, from Michael S. Tsirkin.
7) Don't leak socket reference counts and ubufs in vhost-net driver,
from Jason Wang.
8) vmxnet3 driver gets it's initial carrier state wrong, fix from Neil
Horman.
9) Protect against USB networking devices which spam the host with 0
length frames, from Bjørn Mork.
10) Prevent neighbour overflows in ipv6 for locally destined routes,
from Marcelo Ricardo. This is the best short-term fix for this, a
longer term fix has been implemented in net-next.
11) L2TP uses ipv4 datagram routines in it's ipv6 code, whoops. This
mistake is largely because the ipv6 functions don't even have some
kind of prefix in their names to suggest they are ipv6 specific.
From Tom Parkin.
12) Check SYN packet drops properly in tcp_rcv_fastopen_synack(), from
Yuchung Cheng.
13) Fix races and TX skb freeing bugs in via-rhine's NAPI support, from
Francois Romieu and your's truly.
14) Fix infinite loops and divides by zero in TCP congestion window
handling, from Eric Dumazet, Neal Cardwell, and Ilpo Järvinen.
15) AF_PACKET tx ring handling can leak kernel memory to userspace, fix
from Phil Sutter.
16) Fix error handling in ipv6 GRE tunnel transmit, from Tommi Rantala.
17) Protect XEN netback driver against hostile frontend putting garbage
into the rings, don't leak pages in TX GOP checking, and add proper
resource releasing in error path of xen_netbk_get_requests(). From
Ian Campbell.
18) SCTP authentication keys should be cleared out and released with
kzfree(), from Daniel Borkmann.
19) L2TP is a bit too clever trying to maintain skb->truesize, and ends
up corrupting socket memory accounting to the point where packet
sending is halted indefinitely. Just remove the adjustments
entirely, they aren't really needed. From Eric Dumazet.
20) ATM Iphase driver uses a data type with the same name as the S390
headers, rename to fix the build. From Heiko Carstens.
21) Fix a typo in copying the inner network header offset from one SKB
to another, from Pravin B Shelar.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
net: sctp: sctp_endpoint_free: zero out secret key data
net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
atm/iphase: rename fregt_t -> ffreg_t
net: usb: fix regression from FLAG_NOARP code
l2tp: dont play with skb->truesize
net: sctp: sctp_auth_key_put: use kzfree instead of kfree
netback: correct netbk_tx_err to handle wrap around.
xen/netback: free already allocated memory on failure in xen_netbk_get_requests
xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
xen/netback: shutdown the ring if it contains garbage.
net: qmi_wwan: add more Huawei devices, including E320
net: cdc_ncm: add another Huawei vendor specific device
ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
tcp: fix for zero packets_in_flight was too broad
brcmsmac: rework of mac80211 .flush() callback operation
ssb: unregister gpios before unloading ssb
bcma: unregister gpios before unloading bcma
rtlwifi: Fix scheduling while atomic bug
net: usbnet: fix tx_dropped statistics
tcp: ipv6: Update MIB counters for drops
...
Currently it is not possible to change the filtering constraints after
uprobe_register(), so a consumer can not, say, start to trace a task/mm
which was previously filtered out, or remove the no longer needed bp's.
Introduce uprobe_apply() which simply does register_for_each_vma() again
to consult uprobe_consumer->filter() and install/remove the breakpoints.
The only complication is that register_for_each_vma() can no longer
assume that uprobe->consumers should be consulter if is_register == T,
so we change it to accept "struct uprobe_consumer *new" instead.
Unlike uprobe_register(), uprobe_apply(true) doesn't do "unregister" if
register_for_each_vma() fails, it is up to caller to handle the error.
Note: we probably need to cleanup the current interface, it is strange
that uprobe_apply/unregister need inode/offset. We should either change
uprobe_register() to return "struct uprobe *", or add a private ->uprobe
member in uprobe_consumer. And in the long term uprobe_apply() should
take a single argument, uprobe or consumer, even "bool add" should go
away.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
sys_perf_event_open()->perf_init_event(event) is called before
find_get_context(event), this means that event->ctx == NULL when
class->reg(TRACE_REG_PERF_REGISTER/OPEN) is called and thus it
can't know if this event is per-task or system-wide.
This patch adds hw_perf_event->tp_target for PERF_TYPE_TRACEPOINT,
this is analogous to PERF_TYPE_BREAKPOINT/bp_target we already have.
The patch also moves ->bp_target up so that it can overlap with the
new member, this can help the compiler to generate the better code.
trace_uprobe_register() will use it for prefiltering to avoid the
unnecessary breakpoints in mm's we do not want to trace.
->tp_target doesn't have its own reference, but we can rely on the
fact that either sys_perf_event_open() holds a reference, or it is
equal to event->ctx->task. So this pointer is always valid until
free_event().
Also add the "struct list_head tp_list" into this union. It is not
strictly necessary, but it can simplify the next changes and we can
add it for free.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Currrently the are 2 problems with pre-filtering:
1. It is not possible to add/remove a task (mm) after uprobe_register()
2. A forked child inherits all breakpoints and uprobe_consumer can not
control this.
This patch does the first step to improve the filtering. handler_chain()
removes the breakpoints installed by this uprobe from current->mm if all
handlers return UPROBE_HANDLER_REMOVE.
Note that handler_chain() relies on ->register_rwsem to avoid the race
with uprobe_register/unregister which can add/del a consumer, or even
remove and then insert the new uprobe at the same address.
Perhaps we will add uprobe_apply_mm(uprobe, mm, is_register) and teach
copy_mm() to do filter(UPROBE_FILTER_FORK), but I think this change makes
sense anyway.
Note: instead of checking the retcode from uc->handler, we could add
uc->filter(UPROBE_FILTER_BPHIT). But I think this is not optimal to
call 2 hooks in a row. This buys nothing, and if handler/filter do
something nontrivial they will probably do the same work twice.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Finally add uprobe_consumer->filter() and change consumer_filter()
to actually call this method.
Note that ->filter() accepts mm_struct, not task_struct. Because:
1. We do not have for_each_mm_user(mm, task).
2. Even if we implement for_each_mm_user(), ->filter() can
use it itself.
3. It is not clear who will actually need this interface to
do the "nontrivial" filtering.
Another argument is "enum uprobe_filter_ctx", consumer->filter() can
use it to figure out why/where it was called. For example, perhaps
we can add UPROBE_FILTER_PRE_REGISTER used by build_map_info() to
quickly "nack" the unwanted mm's. In this case consumer should know
that it is called under ->i_mmap_mutex.
See the previous discussion at http://marc.info/?t=135214229700002
Perhaps we should pass more arguments, vma/vaddr?
Note: this patch obviously can't help to filter out the child created
by fork(), this will be addressed later.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
uprobe_consumer->filter() is pointless in its current form, kill it.
We will add it back, but with the different signature/semantics. Perhaps
we will even re-introduce the callsite in handler_chain(), but not to
just skip uc->handler().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
In commit 6509141f9c ("usbnet: add new
flag FLAG_NOARP for usb net devices"), the newly added flag NOARP was
using an already defined value, which broke drivers using flag
MULTI_PACKET.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of ff195cb6 (rcu: Warn when srcu_read_lock() is used in an extended
quiescent state).
It also makes the codes match the comments in whatisRCU.txt:
g. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU? If so, SRCU is the
only choice that will work for you.
[ paulmck: There is at least one remaining issue, namely use of lockdep
with tracing enabled. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of c0d6d01b (rcu: Check for illegal use of RCU from offlined CPUs).
It also makes the code match the comments in whatisRCU.txt:
g. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU? If so, SRCU is the
only choice that will work for you.
[ paulmck: There is at least one remaining issue, namely use of lockdep
with tracing enabled. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Move rt scheduler definitions out of include/linux/sched.h into
new file include/linux/sched/rt.h
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a /proc/sys/kernel scheduler knob named
sched_rr_timeslice_ms that allows global changing of the
SCHED_RR timeslice value. User visable value is in milliseconds
but is stored as jiffies. Setting to 0 (zero) resets to the
default (currently 100ms).
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094704.13751796@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move the sysctl-related bits from include/linux/sched.h into
a new file: include/linux/sched/sysctl.h. Then update source
files requiring access to those bits by including the new
header file.
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094659.06dced96@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull block layer updates from Jens Axboe:
"I've got a few bits pending for 3.8 final, that I better get sent out.
It's all been sitting for a while, I consider it safe.
It contains:
- Two bug fixes for mtip32xx, fixing a driver hang and a crash.
- A few-liner protocol error fix for drbd.
- A few fixes for the xen block front/back driver, fixing a potential
data corruption issue.
- A race fix for disk_clear_events(), causing spurious warnings. Out
of the Chrome OS base.
- A deadlock fix for disk_clear_events(), moving it to the a
unfreezable workqueue. Also from the Chrome OS base."
* 'for-linus' of git://git.kernel.dk/linux-block:
drbd: fix potential protocol error and resulting disconnect/reconnect
mtip32xx: fix for crash when the device surprise removed during rebuild
mtip32xx: fix for driver hang after a command timeout
block: prevent race/cleanup
block: remove deadlock in disk_clear_events
xen-blkfront: handle bvecs with partial data
llist/xen-blkfront: implement safe version of llist_for_each_entry
xen-blkback: implement safe iterator for the list of persistent grants
Here are a few tiny USB fixes for 3.8-rc6.
Nothing major here, some host controller bug fixes to resolve a number
of bugs that people have reported, and a bunch of additional device ids
are added to a number of drivers (which caused code to be deleted from
the usb-storage driver, always nice.)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlERRT4ACgkQMUfUDdst+ylxAQCghY23x/0vT9oCsrWZzKeOFOMB
J64AoNjVngpCpIcmh/yk1tgjSRSQu3dn
=p4nA
-----END PGP SIGNATURE-----
Merge tag 'usb-3.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a few tiny USB fixes for 3.8-rc6.
Nothing major here, some host controller bug fixes to resolve a number
of bugs that people have reported, and a bunch of additional device
ids are added to a number of drivers (which caused code to be deleted
from the usb-storage driver, always nice)"
* tag 'usb-3.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
USB: storage: optimize to match the Huawei USB storage devices and support new switch command
USB: storage: Define a new macro for USB storage match rules
USB: ftdi_sio: add Zolix FTDI PID
USB: option: add Changhong CH690
USB: ftdi_sio: add PID/VID entries for ELV WS 300 PC II
USB: add OWL CM-160 support to cp210x driver
USB: EHCI: fix bug in scheduling periodic split transfers
USB: EHCI: fix for leaking isochronous data
USB: option: add support for Telit LE920
USB: qcserial: add Telit Gobi QDL device
USB: EHCI: fix timer bug affecting port resume
USB: UHCI: notify usbcore about port resumes
USB: EHCI: notify usbcore about port resumes
USB: add usb_hcd_{start,end}_port_resume
USB: EHCI: unlink one async QH at a time
USB: EHCI: remove ASS/PSS polling timeout
usb: Using correct way to clear usb3.0 device's remote wakeup feature.
usb: Prevent dead ports when xhci is not enabled
USB: XHCI: fix memory leak of URB-private data
drivers: xhci: fix incorrect bit test
...
Typical cputime stats infrastructure relies on the timer tick and
its periodic polling on the CPU to account the amount of time
spent by the CPUs and the tasks per high level domains such as
userspace, kernelspace, guest, ...
Now we are preparing to implement full dynticks capability on
Linux for Real Time and HPC users who want full CPU isolation.
This feature requires a cputime accounting that doesn't depend
on the timer tick.
To implement it, this new cputime infrastructure plugs into
kernel/user/guest boundaries to take snapshots of cputime and
flush these to the stats when needed. This performs pretty
much like CONFIG_VIRT_CPU_ACCOUNTING except that context location
and cputime snaphots are synchronized between write and read
side such that the latter can safely retrieve the pending tickless
cputime of a task and add it to its latest cputime snapshot to
return the correct result to the user.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRBsKnAAoJEIUkVEdQjox3lMgP/2R6DU2f8PyGIao3hne4M3Pu
L3q+mAG53b24Dy014KeW7gd8yv45fE7wp/rs8CGLte9VzbLkRCDSFQPgBuXVagRj
tV5nfAuqD0wHTnA+HhBE3l3C2RKAPGIu79rBpnIR/QIPPl8Z3Dby8YgmxEQKDf8G
j7MEBu2LthSuqEi2ZXemnO5r0oEnQAzAp4TTi/M38k0Fmt59nOGyjLnI+xHYCBMa
1pnz7j3jjR9NJExGu8iVvbo+jupuQngP8qmkLXHvYnj/TEJNwzO1hHVoSwOpjYpS
9ycl+T8IKQLbAkBywLtq3Mzde43xt/t8wYyGZ0oAV+Z7MIpz/9YIfDJwqQeqoNbD
dAdbNjKMbsxCgmrnyqSagfMQg/r3CPZ4vf40TMCaN4gNUJC4Ie+E4kPRKRh59+PB
Ukthmqujn0f40LAa+HXTUuzafd3b0s/ewH+8FuQ6LAG9b5+WnoN8JTJ5u6+ydokO
ZleeOowuRZZEg+abQ8Sm2GRm/BzN29gi/npb//I+ZDXWv/+3yccgsiPjCRzCAAaO
g1RmYryFSRUwHQbGNNypVWVuOLWvrBQ4jqbGO7BBuBByZMSHryKxR6mb+inH3qLE
xIDM9SdSJisc292OzoFKwVZki4MaXaadJXJduVvqYlZQvXXs7eAa4wo3euhtVITD
NLQO5OZXE4oIQmDFb0FV
=1Tzp
-----END PGP SIGNATURE-----
Merge tag 'full-dynticks-cputime-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core
Pull full-dynticks (user-space execution is undisturbed and
receives no timer IRQs) preparation changes that convert the
cputime accounting code to be full-dynticks ready,
from Frederic Weisbecker:
"This implements the cputime accounting on full dynticks CPUs.
Typical cputime stats infrastructure relies on the timer tick and
its periodic polling on the CPU to account the amount of time
spent by the CPUs and the tasks per high level domains such as
userspace, kernelspace, guest, ...
Now we are preparing to implement full dynticks capability on
Linux for Real Time and HPC users who want full CPU isolation.
This feature requires a cputime accounting that doesn't depend
on the timer tick.
To implement it, this new cputime infrastructure plugs into
kernel/user/guest boundaries to take snapshots of cputime and
flush these to the stats when needed. This performs pretty
much like CONFIG_VIRT_CPU_ACCOUNTING except that context location
and cputime snaphots are synchronized between write and read
side such that the latter can safely retrieve the pending tickless
cputime of a task and add it to its latest cputime snapshot to
return the correct result to the user."
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The macro for_each_memcg_cache_index contains a silly yet potentially
deadly mistake. Although the macro parameter is _idx, the loop tests
are done over i, not _idx.
This hasn't generated any problems so far, because all users use i as a
loop index. However, while playing with an extension of the code I
ended using another loop index and the compiler was quick to complain.
Unfortunately, this is not the kind of thing that testing reveals =(
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We use rwsem since commit 5a505085f0 ("mm/rmap: Convert the struct
anon_vma::mutex to an rwsem"). And most of comments are converted to
the new rwsem lock; while just 2 more missed from:
$ git grep 'anon_vma->mutex'
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
kernel/irq_work.c
Add support for printk in full dynticks CPU.
* Don't stop tick with irq works pending. This
fix is generally useful and concerns archs that
can't raise self IPIs.
* Flush irq works before CPU offlining.
* Introduce "lazy" irq works that can wait for the
next tick to be executed, unless it's stopped.
* Implement klogd wake up using irq work. This
removes the ad-hoc printk_tick()/printk_needs_cpu()
hooks and make it working even in dynticks mode.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
As no one is using the return value of irq_work_queue(),
so it is better to just make it void.
Signed-off-by: anish kumar <anish198519851985@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[ Fix stale comments, remove now unnecessary __irq_work_queue() intermediate function ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1359925703-24304-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rename EVENT_ATTR() to PMU_EVENT_ATTR() and make it global so it is
available to all architectures.
Further to allow architectures flexibility, have PMU_EVENT_ATTR() pass
in the variable name as a parameter.
Changelog[v2]
- [Jiri Olsa] No need to define PMU_EVENT_PTR()
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20130123062422.GC13720@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull x86 fixes from Peter Anvin:
"This is a collection of miscellaneous fixes, the most important one is
the fix for the Samsung laptop bricking issue (auto-blacklisting the
samsung-laptop driver); the efi_enabled() changes you see below are
prerequisites for that fix.
The other issues fixed are booting on OLPC XO-1.5, an UV fix, NMI
debugging, and requiring CAP_SYS_RAWIO for MSR references, just as
with I/O port references."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
samsung-laptop: Disable on EFI hardware
efi: Make 'efi_enabled' a function to query EFI facilities
smp: Fix SMP function call empty cpu mask race
x86/msr: Add capabilities check
x86/dma-debug: Bump PREALLOC_DMA_DEBUG_ENTRIES
x86/olpc: Fix olpc-xo1-sci.c build errors
arch/x86/platform/uv: Fix incorrect tlb flush all issue
x86-64: Fix unwind annotations in recent NMI changes
x86-32: Start out cr0 clean, disable paging before modifying cr3/4
A device sending 0 length frames as fast as it can has been
observed killing the host system due to the resulting memory
pressure.
Temporarily disable RX skb allocation and URB submission when
the current error ratio is high, preventing us from trying to
allocate an infinite number of skbs. Reenable as soon as we
are finished processing the done queue, allowing the device
to continue working after short error bursts.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally 'efi_enabled' indicated whether a kernel was booted from
EFI firmware. Over time its semantics have changed, and it now
indicates whether or not we are booted on an EFI machine with
bit-native firmware, e.g. 64-bit kernel with 64-bit firmware.
The immediate motivation for this patch is the bug report at,
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
which details how running a platform driver on an EFI machine that is
designed to run under BIOS can cause the machine to become
bricked. Also, the following report,
https://bugzilla.kernel.org/show_bug.cgi?id=47121
details how running said driver can also cause Machine Check
Exceptions. Drivers need a new means of detecting whether they're
running on an EFI machine, as sadly the expression,
if (!efi_enabled)
hasn't been a sufficient condition for quite some time.
Users actually want to query 'efi_enabled' for different reasons -
what they really want access to is the list of available EFI
facilities.
For instance, the x86 reboot code needs to know whether it can invoke
the ResetSystem() function provided by the EFI runtime services, while
the ACPI OSL code wants to know whether the EFI config tables were
mapped successfully. There are also checks in some of the platform
driver code to simply see if they're running on an EFI machine (which
would make it a bad idea to do BIOS-y things).
This patch is a prereq for the samsung-laptop fix patch.
Cc: David Airlie <airlied@linux.ie>
Cc: Corentin Chary <corentincj@iksaif.net>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Peter Jones <pjones@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Steve Langasek <steve.langasek@canonical.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Ftrace has a snapshot feature available from kernel space and
latency tracers (e.g. irqsoff) are using it. This patch enables
user applictions to take a snapshot via debugfs.
Add "snapshot" debugfs file in "tracing" directory.
snapshot:
This is used to take a snapshot and to read the output of the
snapshot.
# echo 1 > snapshot
This will allocate the spare buffer for snapshot (if it is
not allocated), and take a snapshot.
# cat snapshot
This will show contents of the snapshot.
# echo 0 > snapshot
This will free the snapshot if it is allocated.
Any other positive values will clear the snapshot contents if
the snapshot is allocated, or return EINVAL if it is not allocated.
Link: http://lkml.kernel.org/r/20121226025300.3252.86850.stgit@liselsia
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: David Sharp <dhsharp@google.com>
Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
[
Fixed irqsoff selftest and also a conflict with a change
that fixes the update_max_tr.
]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add a stat about the number of events read from the ring buffer:
# cat /debug/tracing/per_cpu/cpu0/stats
entries: 39869
overrun: 870512
commit overrun: 0
bytes: 1449912
oldest event ts: 6561.368690
now ts: 6565.246426
dropped events: 0
read events: 112 <-- Added
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
doctorture.2013.01.11a: Changes to rcutorture and to RCU documentation.
fixes.2013.01.26a: Miscellaneous fixes.
tagcb.2013.01.24a: Tag RCU callbacks with grace-period number to
simplify callback advancement.
tiny.2013.01.29b: Enhancements to uniprocessor handling in tiny RCU.
We have some build failure fixes (twl4030, vexpress, abx500 and tps65910),
some actual runtime oops and lockup fixes (rtsx, da9052), and some more
hypothetical NULL pointers dereferences fixes for pcf50633 and max776xx.
Then we also have additional rtsx fixes for a correct switch output voltage
and clock divider correctness for rtl8411 (rtsx driver), and irqdomain fix for
db8550-prcmu, and some more cosmetic fixes for arizona and wm5102.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRBIBuAAoJEIqAPN1PVmxKCZYP/3EP7I6nTnHfmMJHr6KhL9F8
h/PFzSJiYC5DHoYpvcD6ESkDtqgZTOgt/R8VzbzcfCoSAlARCyo3WenCjUhREspW
2vCb+rVqBXc3+pn/Hed5WlTx3a231iSYiQd4OMbDkG22TuTKdf4GOWcl4KnAVMjp
NMqD3wCkDeMkutxRO7eWc+B/eXmYDp38abiYU+xJCMfmpvRwiPp7/RQTw/9kHgF/
VHGqzH91YPJmcF9OcDDzsvJ2zGwPsXPhtsOnwxL7KkjI4WM4EZv8Nr0NwTsuIgNJ
liqs4QO1XpTF+bAPKW/aT4VVLxYmLrzVao+bg6A9Vn5Q6Wt+N4McectvN7yndfOQ
GuSPI+LqcZvDEHaKGybRFdsbN+sh95f7Qz6dbFedJ3nWBhlFd7YiXgkQF3Yg38sX
rbK66F0PuH7F010a3cbhZ4jsHUb1MxzU6YSCLwUvukF1ijitPP89md0K9YaN9cbT
YbBdZpphaiFePz9CjRyyYJvo4DC9i9BTgC8Ac3qiG1TELhb/Dl064d4o0oDDEfzH
qVo21yUWeJ9jsHMnFvJuaDe9IbfxyDWJSLXFPlwaW/1qdbDPKzCr1Sro4v+lmOh5
1RIiHfu52RSPDewo0ACZPPOd8h8/Jfra37CDiGPGnjbEkUJTxC7XfHie6M9034ov
m/ORqHJOi6Wh9Iy7YHM3
=rxug
-----END PGP SIGNATURE-----
Merge tag 'mfd-for-linus-3.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Pull MFD fixes from Samuel Ortiz:
"This is the first pull request for MFD fixes for 3.8
We have some build failure fixes (twl4030, vexpress, abx500 and
tps65910), some actual runtime oops and lockup fixes (rtsx, da9052),
and some more hypothetical NULL pointers dereferences fixes for
pcf50633 and max776xx.
Then we also have additional rtsx fixes for a correct switch output
voltage and clock divider correctness for rtl8411 (rtsx driver), and
irqdomain fix for db8550-prcmu, and some more cosmetic fixes for
arizona and wm5102."
* tag 'mfd-for-linus-3.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: rtsx: Fix oops when rtsx_pci_sdmmc is not probed
mfd: wm5102: Fix definition of WM5102_MAX_REGISTER
mfd: twl4030: Don't warn about uninitialized return code
mfd: da9052/53 lockup fix
mfd: rtsx: Add clock divider hook
mmc: rtsx: Call MFD hook to switch output voltage
mfd: rtsx: Add output voltage switch hook
mfd: Fix compile errors and warnings when !CONFIG_AB8500_BM
mfd: vexpress: Export global functions to fix build error
mfd: arizona: Check errors from regcache_sync()
mfd: tc3589x: Use simple irqdomain
mfd: pcf50633: Init pcf->dev before using it
mfd: max77693: Init max77693->dev before using it
mfd: max77686: Init max77686->dev before using it
mfd: db8500-prcmu: Fix irqdomain usage
mfd: tps65910: Select REGMAP_IRQ in Kconfig to fix build error
mfd: arizona: Disable control interface reporting for WM5102 and WM5110
Pull networking updates from David Miller:
"Much more accumulated than I would have liked due to an unexpected
bout with a nasty flu:
1) AH and ESP input don't set ECN field correctly because the
transport head of the SKB isn't set correctly, fix from Li
RongQing.
2) If netfilter conntrack zones are disabled, we can return an
uninitialized variable instead of the proper error code. Fix from
Borislav Petkov.
3) Fix double SKB free in ath9k driver beacon handling, from Felix
Feitkau.
4) Remove bogus assumption about netns cleanup ordering in
nf_conntrack, from Pablo Neira Ayuso.
5) Remove a bogus BUG_ON in the new TCP fastopen code, from Eric
Dumazet. It uses spin_is_locked() in it's test and is therefore
unsuitable for UP.
6) Fix SELINUX labelling regressions added by the tuntap multiqueue
changes, from Paul Moore.
7) Fix CRC errors with jumbo frame receive in tg3 driver, from Nithin
Nayak Sujir.
8) CXGB4 driver sets interrupt coalescing parameters only on first
queue, rather than all of them. Fix from Thadeu Lima de Souza
Cascardo.
9) Fix regression in the dispatch of read/write registers in dm9601
driver, from Tushar Behera.
10) ipv6_append_data miscalculates header length, from Romain KUNTZ.
11) Fix PMTU handling regressions on ipv4 routes, from Steffen
Klassert, Timo Teräs, and Julian Anastasov.
12) In 3c574_cs driver, add necessary parenthesis to "x << y & z"
expression. From Nickolai Zeldovich.
13) macvlan_get_size() causes underallocation netlink message space,
fix from Eric Dumazet.
14) Avoid division by zero in xfrm_replay_advance_bmp(), from Nickolai
Zeldovich. Amusingly the zero check was already there, we were
just performing it after the modulus :-)
15) Some more splice bug fixes from Eric Dumazet, which fix things
mostly eminating from how we now more aggressively use high-order
pages in SKBs.
16) Fix size calculation bug when freeing hash tables in the IPSEC
xfrm code, from Michal Kubecek.
17) Fix PMTU event propagation into socket cached routes, from Steffen
Klassert.
18) Fix off by one in TX buffer release in netxen driver, from Eric
Dumazet.
19) Fix rediculous memory allocation requirements introduced by the
tuntap multiqueue changes, from Jason Wang.
20) Remove bogus AMD platform workaround in r8169 driver that causes
major problems in normal operation, from Timo Teräs.
21) virtio-net set affinity and select queue don't handle
discontiguous cpu numbers properly, fix from Wanlong Gao.
22) Fix a route refcounting issue in loopback driver, from Eric
Dumazet. There's a similar fix coming that we might add to the
macvlan driver as well.
23) Fix SKB leaks in batman-adv's distributed arp table code, from
Matthias Schiffer.
24) r8169 driver gives descriptor ownership back the hardware before
we're done reading the VLAN tag out of it, fix from Francois
Romieu.
25) Checksums not calculated properly in GRE tunnel driver fix from
Pravin B Shelar.
26) Fix SCTP memory leak on namespace exit."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits)
dm9601: support dm9620 variant
SCTP: Free the per-net sysctl table on net exit. v2
net: phy: icplus: fix broken INTR pin settings
net: phy: icplus: Use the RGMII interface mode to configure clock delays
IP_GRE: Fix kernel panic in IP_GRE with GRE csum.
sctp: set association state to established in dupcook_a handler
ip6mr: limit IPv6 MRT_TABLE identifiers
r8169: fix vlan tag read ordering.
net: cdc_ncm: use IAD provided by the USB core
batman-adv: filter ARP packets with invalid MAC addresses in DAT
batman-adv: check for more types of invalid IP addresses in DAT
batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply()
net: loopback: fix a dst refcounting issue
virtio-net: reset virtqueue affinity when doing cpu hotplug
virtio-net: split out clean affinity function
virtio-net: fix the set affinity bug when CPU IDs are not consecutive
can: pch_can: fix invalid error codes
can: ti_hecc: fix invalid error codes
can: c_can: fix invalid error codes
r8169: remove the obsolete and incorrect AMD workaround
...
While remotely reading the cputime of a task running in a
full dynticks CPU, the values stored in utime/stime fields
of struct task_struct may be stale. Its values may be those
of the last kernel <-> user transition time snapshot and
we need to add the tickless time spent since this snapshot.
To fix this, flush the cputime of the dynticks CPUs on
kernel <-> user transition and record the time / context
where we did this. Then on top of this snapshot and the current
time, perform the fixup on the reader side from task_times()
accessors.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
[fixed kvm module related build errors]
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Do some ground preparatory work before adding guest_enter()
and guest_exit() context tracking callbacks. Those will
be later used to read the guest cputime safely when we
run in full dynticks mode.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
This is in preparation for the full dynticks feature. While
remotely reading the cputime of a task running in a full
dynticks CPU, we'll need to do some extra-computation. This
way we can account the time it spent tickless in userspace
since its last cputime snapshot.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Allow to dynamically switch between tick and virtual based
cputime accounting. This way we can provide a kind of "on-demand"
virtual based cputime accounting. In this mode, the kernel relies
on the context tracking subsystem to dynamically probe on kernel
boundaries.
This is in preparation for being able to stop the timer tick in
more places than just the idle state. Doing so will depend on
CONFIG_VIRT_CPU_ACCOUNTING_GEN which makes it possible to account
the cputime without the tick by hooking on kernel/user boundaries.
Depending whether the tick is stopped or not, we can switch between
tick and vtime based accounting anytime in order to minimize the
overhead associated to user hooks.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
If we want to stop the tick further idle, we need to be
able to account the cputime without using the tick.
Virtual based cputime accounting solves that problem by
hooking into kernel/user boundaries.
However implementing CONFIG_VIRT_CPU_ACCOUNTING require
low level hooks and involves more overhead. But we already
have a generic context tracking subsystem that is required
for RCU needs by archs which plan to shut down the tick
outside idle.
This patch implements a generic virtual based cputime
accounting that relies on these generic kernel/user hooks.
There are some upsides of doing this:
- This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
if context tracking is already built (already necessary for RCU in full
tickless mode).
- We can rely on the generic context tracking subsystem to dynamically
(de)activate the hooks, so that we can switch anytime between virtual
and tick based accounting. This way we don't have the overhead
of the virtual accounting when the tick is running periodically.
And one downside:
- There is probably more overhead than a native virtual based cputime
accounting. But this relies on hooks that are already set anyway.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
An issue has been reported where the PMIC either locks up or fails to
respond following a system Reset. This could result in a second write
in which the bus writes the current content of the write buffer to address
of the last I2C access.
The failure case is where this unwanted write transfers incorrect data to
a critical register.
This patch fixes this issue to by following any read or write with a dummy read
to a safe register address. A safe register address is one where the contents
will not affect the operation of the system.
Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Add callback function conv_clk_and_div_n to convert between SSC clock
and its divider N.
For rtl8411, the formula to calculate SSC clock divider N is different
with the other card reader models.
Signed-off-by: Wei WANG <wei_wang@realsil.com.cn>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Different card reader has different method to switch output voltage,
add this callback to let the card reader implement its individual switch
function.
This is needed as rtl8411 has a specific switch output voltage procedure.
Signed-off-by: Wei WANG <wei_wang@realsil.com.cn>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Export the context state: whether we run in user / kernel
from the context tracking subsystem point of view.
This is going to be used by the generic virtual cputime
accounting subsystem that is needed to implement the full
dynticks.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
This patch (as1649) adds a mechanism for host controller drivers to
inform usbcore when they have begun or ended resume signalling on a
particular root-hub port. The core will then make sure that the root
hub does not get runtime-suspended while the port resume is going on.
Since commit 596d789a21 (USB: set hub's
default autosuspend delay as 0), the system tries to suspend hubs
whenever they aren't in use. While a root-hub port is being resumed,
the root hub does not appear to be in use. Attempted runtime suspends
fail because of the ongoing port resume, but the PM core just keeps on
trying over and over again. We want to prevent this wasteful effort.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The issue below was found in 2.6.34-rt rather than mainline rt
kernel, but the issue still exists upstream as well.
So please let me describe how it was noticed on 2.6.34-rt:
On this version, each softirq has its own thread, it means there
is at least one RT FIFO task per cpu. The priority of these
tasks is set to 49 by default. If user launches an RT FIFO task
with priority lower than 49 of softirq RT tasks, it's possible
there are two RT FIFO tasks enqueued one cpu runqueue at one
moment. By current strategy of balancing RT tasks, when it comes
to RT tasks, we really need to put them off to a CPU that they
can run on as soon as possible. Even if it means a bit of cache
line flushing, we want RT tasks to be run with the least latency.
When the user RT FIFO task which just launched before is
running, the sched timer tick of the current cpu happens. In this
tick period, the timeout value of the user RT task will be
updated once. Subsequently, we try to wake up one softirq RT
task on its local cpu. As the priority of current user RT task
is lower than the softirq RT task, the current task will be
preempted by the higher priority softirq RT task. Before
preemption, we check to see if current can readily move to a
different cpu. If so, we will reschedule to allow the RT push logic
to try to move current somewhere else. Whenever the woken
softirq RT task runs, it first tries to migrate the user FIFO RT
task over to a cpu that is running a task of lesser priority. If
migration is done, it will send a reschedule request to the found
cpu by IPI interrupt. Once the target cpu responds the IPI
interrupt, it will pick the migrated user RT task to preempt its
current task. When the user RT task is running on the new cpu,
the sched timer tick of the cpu fires. So it will tick the user
RT task again. This also means the RT task timeout value will be
updated again. As the migration may be done in one tick period,
it means the user RT task timeout value will be updated twice
within one tick.
If we set a limit on the amount of cpu time for the user RT task
by setrlimit(RLIMIT_RTTIME), the SIGXCPU signal should be posted
upon reaching the soft limit.
But exactly when the SIGXCPU signal should be sent depends on the
RT task timeout value. In fact the timeout mechanism of sending
the SIGXCPU signal assumes the RT task timeout is increased once
every tick.
However, currently the timeout value may be added twice per
tick. So it results in the SIGXCPU signal being sent earlier
than expected.
To solve this issue, we prevent the timeout value from increasing
twice within one tick time by remembering the jiffies value of
last updating the timeout. As long as the RT task's jiffies is
different with the global jiffies value, we allow its timeout to
be updated.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Fan Du <fan.du@windriver.com>
Reviewed-by: Yong Zhang <yong.zhang0@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1342508623-2887-1-git-send-email-ying.xue@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Here's a long-pending fixes pull request for arm-soc (I didn't send one
in the -rc4 cycle).
The larger deltas are from:
- A fixup of error paths in the mvsdio driver
- Header file move for a driver that hadn't been properly converted to
multiplatform on i.MX, which was causing build failures when included
- Device tree updates for at91 dealing mostly with their new
pinctrl setup merged in 3.8 and mistakes in those initial configs
The rest are the normal mix of small fixes all over the place; sunxi,
omap, imx, mvebu, etc, etc.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRAV2RAAoJEIwa5zzehBx32N0P/AsFOLaVWjuvf3kBZTaZgp3J
jZjhmAfJ2dQCITA792U2bI0d+gPiXm49EY3KWaNdb7S2UmQ1MwXHhKOwOQSVXli0
j+qAgVUa4nSsi3FQesKS0zThG/Xr+RsyiJZ2dHu71hendJu5NB1O1hzO4hDEHkMc
K8NGglKjtGirEiLIoub9ag8E9k5sd8X4nulrEJclon1BoolPcef18Bs96tdPmq/o
Ss634vBqhzSE8OInFc6RDNzTSM52zXbornr/5xGAvFqQv6L0rSXHPvjeeWVdNjj1
aNqkOrQOAHWRwTcyHOR0GdJfuAPSUwF+JkBWcUbgmsda7XunFiSb5tKV3FSVbJfN
pMFvbg/iK+ByhWq8iAOkT7OP64wi++FlOFa39IAiQ1QPRD0j93OlKMp0LjqEEiKd
Gw8o3X03GWhqoJUlSz40TF0Pvkje1UTk2Y8k2y24I3AnnEAcO5x+5pZYUTOe6x5N
THqqSMsdKWIibrQJRuOXll/DkS8zcepTHU7o8hyHBKYh7LxdAs4ITQoYZjcU5lse
HGwldByKfuNlzF3+96Jh9wZsr/9zjD8yovEcQYk37s56T/b7kT0sQm6XGS1dFE8Q
xQgcXLEUXZLt/79B0bn/5ogh26xswx/3GHgNjL1tJQc/MhbQ6C0bb2bBVoU21qzq
I5yMMwNSkH8+7+PGPiaQ
=YDHs
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Here's a long-pending fixes pull request for arm-soc (I didn't send
one in the -rc4 cycle).
The larger deltas are from:
- A fixup of error paths in the mvsdio driver
- Header file move for a driver that hadn't been properly converted
to multiplatform on i.MX, which was causing build failures when
included
- Device tree updates for at91 dealing mostly with their new pinctrl
setup merged in 3.8 and mistakes in those initial configs
The rest are the normal mix of small fixes all over the place; sunxi,
omap, imx, mvebu, etc, etc."
* tag 'fixes-for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (40 commits)
mfd: vexpress-sysreg: Don't skip initialization on probe
ARM: vexpress: Enable A7 cores in V2P-CA15_A7's Device Tree
ARM: vexpress: extend the MPIDR range used for pen release check
ARM: at91/dts: correct comment in at91sam9x5.dtsi for mii
ARM: at91/at91_dt_defconfig: add at91sam9n12 SoC to DT defconfig
ARM: at91/at91_dt_defconfig: remove memory specification to cmdline
ARM: at91/dts: add macb mii pinctrl config for kizbox
ARM: at91: rm9200: remake the BGA as default version
ARM: at91: fix gpios on i2c-gpio for RM9200 DT
ARM: at91/at91sam9x5 DTS: add SCK USART pins
ARM: at91/at91sam9x5 DTS: correct wrong PIO BANK values on u(s)arts
ARM: at91/at91-pinctrl documentation: fix typo and add some details
ARM: kirkwood: fix missing #interrupt-cells property
mmc: mvsdio: use devm_ API to simplify/correct error paths.
clk: mvebu/clk-cpu.c: fix memory leakage
ARM: OMAP2+: omap4-panda: add UART2 muxing for WiLink shared transport
ARM: OMAP2+: DT node Timer iteration fix
ARM: OMAP2+: Fix section warning for omap_init_ocp2scp()
ARM: OMAP2+: fix build break for omapdrm
ARM: OMAP2: Fix missing omap2xxx_clkt_vps_late_init function calls
...
The last remaining user was oprofile and its use has been
removed a while ago in commit bc078e4eab
("oprofile: convert oprofile from timer_hook to hrtimer").
There doesn't seem to be any upstream user of this hook
for about two years now. And I'm not even aware of any out of
tree user.
Let's remove it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1356191991-2251-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>