Commit graph

603553 commits

Author SHA1 Message Date
Anna-Maria Gleixner
236968383c timers: Optimize collect_expired_timers() for NOHZ
After a NOHZ idle sleep the timer wheel must be forwarded to current jiffies.
There might be expired timers so the current code loops and checks the expired
buckets for timers. This can take quite some time for long NOHZ idle periods.

The pending bitmask in the timer base allows us to do a quick search for the
next expiring timer and therefore a fast forward of the base time which
prevents pointless long lasting loops.

For a 3 seconds idle sleep this reduces the catchup time from ~1ms to 5us.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.351296290@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:10 +02:00
Anna-Maria Gleixner
73420fea80 timers: Move __run_timers() function
Move __run_timers() below __next_timer_interrupt() and next_pending_bucket()
in preparation for __run_timers() NOHZ optimization.

No functional change.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.271872665@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:09 +02:00
Thomas Gleixner
53bf837b78 timers: Remove set_timer_slack() leftovers
We now have implicit batching in the timer wheel. The slack API is no longer
used, so remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrew F. Davis <afd@ti.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jaehoon Chung <jh80.chung@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathias Nyman <mathias.nyman@intel.com>
Cc: Pali Rohár <pali.rohar@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.189813118@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:09 +02:00
Thomas Gleixner
500462a9de timers: Switch to a non-cascading wheel
The current timer wheel has some drawbacks:

1) Cascading:

   Cascading can be an unbound operation and is completely pointless in most
   cases because the vast majority of the timer wheel timers are canceled or
   rearmed before expiration. (They are used as timeout safeguards, not as
   real timers to measure time.)

2) No fast lookup of the next expiring timer:

   In NOHZ scenarios the first timer soft interrupt after a long NOHZ period
   must fast forward the base time to the current value of jiffies. As we
   have no way to find the next expiring timer fast, the code loops linearly
   and increments the base time one by one and checks for expired timers
   in each step. This causes unbound overhead spikes exactly in the moment
   when we should wake up as fast as possible.

After a thorough analysis of real world data gathered on laptops,
workstations, webservers and other machines (thanks Chris!) I came to the
conclusion that the current 'classic' timer wheel implementation can be
modified to address the above issues.

The vast majority of timer wheel timers is canceled or rearmed before
expiry. Most of them are timeouts for networking and other I/O tasks. The
nature of timeouts is to catch the exception from normal operation (TCP ack
timed out, disk does not respond, etc.). For these kinds of timeouts the
accuracy of the timeout is not really a concern. Timeouts are very often
approximate worst-case values and in case the timeout fires, we already
waited for a long time and performance is down the drain already.

The few timers which actually expire can be split into two categories:

 1) Short expiry times which expect halfways accurate expiry

 2) Long term expiry times are inaccurate today already due to the
    batching which is done for NOHZ automatically and also via the
    set_timer_slack() API.

So for long term expiry timers we can avoid the cascading property and just
leave them in the less granular outer wheels until expiry or
cancelation. Timers which are armed with a timeout larger than the wheel
capacity are no longer cascaded. We expire them with the longest possible
timeout (6+ days). We have not observed such timeouts in our data collection,
but at least we handle them, applying the rule of the least surprise.

To avoid extending the wheel levels for HZ=1000 so we can accomodate the
longest observed timeouts (5 days in the network conntrack code) we reduce the
first level granularity on HZ=1000 to 4ms, which effectively is the same as
the HZ=250 behaviour. From our data analysis there is nothing which relies on
that 1ms granularity and as a side effect we get better batching and timer
locality for the networking code as well.

Contrary to the classic wheel the granularity of the next wheel is not the
capacity of the first wheel. The granularities of the wheels are in the
currently chosen setting 8 times the granularity of the previous wheel.

So for HZ=250 we end up with the following granularity levels:

 Level Offset   Granularity                  Range
     0      0          4 ms                 0 ms -        252 ms
     1     64         32 ms               256 ms -       2044 ms (256ms - ~2s)
     2    128        256 ms              2048 ms -      16380 ms (~2s   - ~16s)
     3    192       2048 ms (~2s)       16384 ms -     131068 ms (~16s  - ~2m)
     4    256      16384 ms (~16s)     131072 ms -    1048572 ms (~2m   - ~17m)
     5    320     131072 ms (~2m)     1048576 ms -    8388604 ms (~17m  - ~2h)
     6    384    1048576 ms (~17m)    8388608 ms -   67108863 ms (~2h   - ~18h)
     7    448    8388608 ms (~2h)    67108864 ms -  536870911 ms (~18h  - ~6d)

That's a worst case inaccuracy of 12.5% for the timers which are queued at the
beginning of a level.

So the new wheel concept addresses the old issues:

1) Cascading is avoided completely

2) By keeping the timers in the bucket until expiry/cancelation we can track
   the buckets which have timers enqueued in a bucket bitmap and therefore can
   look up the next expiring timer very fast and O(1).

A further benefit of the concept is that the slack calculation which is done
on every timer start is no longer necessary because the granularity levels
provide natural batching already.

Our extensive testing with various loads did not show any performance
degradation vs. the current wheel implementation.

This patch does not address the 'fast lookup' issue as we wanted to make sure
that there is no regression introduced by the wheel redesign. The
optimizations are in follow up patches.

This patch contains fixes from Anna-Maria Gleixner and Richard Cochran.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.108621834@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:09 +02:00
Thomas Gleixner
b0d6e2dcb2 timers: Reduce the CPU index space to 256k
We want to store the array index in the flags space. 256k CPUs should be
enough for a while.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.030144293@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:08 +02:00
Thomas Gleixner
494af3ed78 timers: Give a few structs and members proper names
Some of the names in the internal implementation of the timer code
are not longer correct and others are simply too long to type.

Clean it up before we switch the wheel implementation over to
the new scheme.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.948752516@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:08 +02:00
Thomas Gleixner
15dba1e37b hlist: Add hlist_is_singular_node() helper
Required to figure out whether the entry is the only one in the hlist.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.867631372@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:07 +02:00
Thomas Gleixner
2b1ecc3d1a signals: Use hrtimer for sigtimedwait()
We've converted most timeout related syscalls to hrtimers, but
sigtimedwait() did not get this treatment.

Convert it so we get a reasonable accuracy and remove the
user space exposure to the timer wheel properties.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.787164909@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:07 +02:00
Thomas Gleixner
177ec0a0a5 timers: Remove the deprecated mod_timer_pinned() API
We switched all users to initialize the timers as pinned and call
mod_timer(). Remove the now unused timer API function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.706205231@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:06 +02:00
Thomas Gleixner
f3438bc781 timers, net/ipv4/inet: Initialize connection request timers as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.617891430@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:06 +02:00
Thomas Gleixner
853f90d49b timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.537448301@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:34:59 +02:00
Thomas Gleixner
01bab536b1 timers, drivers/tty/metag_da: Initialize the poll timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.456452642@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:14 +02:00
Thomas Gleixner
25df57605e timers, driver/net/ethernet/tile: Initialize the egress timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.376394205@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:14 +02:00
Thomas Gleixner
7bc54b652f timers, cpufreq/powernv: Initialize the gpstate timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.297014487@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:14 +02:00
Thomas Gleixner
f9c287ba38 timers, x86/mce: Initialize MCE restart timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.215783439@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:14 +02:00
Thomas Gleixner
920a4a70c5 timers, x86/apic/uv: Initialize the UV heartbeat timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.133837204@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:14 +02:00
Thomas Gleixner
e675447bda timers: Make 'pinned' a timer property
We want to move the timer migration logic from a 'push' to a 'pull' model.

Under the current 'push' model pinned timers are handled via
a runtime API variant: mod_timer_pinned().

The 'pull' model requires us to store the pinned attribute of a timer
in the timer_list structure itself, as a new TIMER_PINNED bit in
timer->flags.

This flag must be set at initialization time and the timer APIs
recognize the flag.

This patch:

 - Implements the new flag and associated new-style initialization
   methods

 - makes mod_timer() recognize new-style pinned timers,

 - and adds some migration helper facility to allow
   step by step conversion of old-style to new-style
   pinned timers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.049338558@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:13 +02:00
Christophe Jaillet
34c720a915 clocksource/drivers/cadence_ttc: fix a return value in case of error
IS_ERR and PTR_ERR should use the same variable, clk_ce in this case.

Fixes: 4de1eb07c47f (Convert init function to return error)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
2016-07-07 09:44:38 +02:00
Mark Rutland
2c81a64770 perf/core: Fix pmu::filter_match for SW-led groups
The following commit:

  66eb579e66 ("perf: allow for PMU-specific event filtering")

added the pmu::filter_match() callback. This was intended to
avoid HW constraints on events from resulting in extremely
pessimistic scheduling.

However, pmu::filter_match() is only called for the leader of each event
group. When the leader is a SW event, we do not filter the groups, and
may fail at pmu::add() time, and when this happens we'll give up on
scheduling any event groups later in the list until they are rotated
ahead of the failing group.

This can result in extremely sub-optimal event scheduling behaviour,
e.g. if running the following on a big.LITTLE platform:

$ taskset -c 0 ./perf stat \
 -e 'a57{context-switches,armv8_cortex_a57/config=0x11/}' \
 -e 'a53{context-switches,armv8_cortex_a53/config=0x11/}' \
 ls

     <not counted>      context-switches                                              (0.00%)
     <not counted>      armv8_cortex_a57/config=0x11/                                 (0.00%)
                24      context-switches                                              (37.36%)
          57589154      armv8_cortex_a53/config=0x11/                                 (37.36%)

Here the 'a53' event group was always eligible to be scheduled, but
the 'a57' group never eligible to be scheduled, as the task was always
affine to a Cortex-A53 CPU. The SW (group leader) event in the 'a57'
group was eligible, but the HW event failed at pmu::add() time,
resulting in ctx_flexible_sched_in giving up on scheduling further
groups with HW events.

One way of avoiding this is to check pmu::filter_match() on siblings
as well as the group leader. If any of these fail their
pmu::filter_match() call, we must skip the entire group before
attempting to add any events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 66eb579e66 ("perf: allow for PMU-specific event filtering")
Link: http://lkml.kernel.org/r/1465917041-15339-1-git-send-email-mark.rutland@arm.com
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 08:57:57 +02:00
Dave Airlie
27c0b7419c Merge branch 'linux-4.7' of git://github.com/skeggsb/linux into drm-fixes
Just one fix for a stupid thinko in a DP training pattern commit.

* 'linux-4.7' of git://github.com/skeggsb/linux:
  drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern
2016-07-07 12:40:12 +10:00
Dave Airlie
fd50870296 Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Just a couple of fixes for amdgpu for 4.7:
- 2 small tonga powerplay fixes
- Additional Polaris fixes

* 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation.
  drm/amd/powerplay: fix bug that get wrong polaris evv voltage.
  drm/amd/powerplay: incorrectly use of the function return value
  drm/amd/powerplay: fix incorrect voltage table value for tonga
  drm/amd/powerplay: fix incorrect voltage table value for polaris10
2016-07-07 12:37:42 +10:00
Randy Dunlap
076501ff6b init/Kconfig: keep Expert users menu together
The "expert" menu was broken (split) such that all entries in it after
KALLSYMS were displayed in the "General setup" area instead of in the
"Expert users" area.  Fix this by adding one kconfig dependency.

Yes, the Expert users menu is fragile.  Problems like this have happened
several times in the past.  I will attempt to isolate the Expert users
menu if there is interest in that.

Fixes: 4d5d5664c9 ("x86: kallsyms: disable absolute percpu symbols on !SMP")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable@vger.kernel.org  # 4.6
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-06 16:27:20 -07:00
Rex Zhu
ab6bad05c8 drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation.
As get the right evv voltage, update them to latest coefficients to
align with BB.

agd: squash in Slava's 32 bit build fix

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-06 17:56:31 -04:00
Rex Zhu
e5eb37170b drm/amd/powerplay: fix bug that get wrong polaris evv voltage.
value is 32 bits for polaris, not 16.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-06 17:44:14 -04:00
Rex Zhu
4b2427605e drm/amd/powerplay: incorrectly use of the function return value
'0' means true.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2016-07-06 17:43:59 -04:00
Huang Rui
1dfefee893 drm/amd/powerplay: fix incorrect voltage table value for tonga
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2016-07-06 16:16:43 -04:00
Huang Rui
095d28c62f drm/amd/powerplay: fix incorrect voltage table value for polaris10
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-06 16:16:09 -04:00
Linus Torvalds
bc86765181 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) All users of AF_PACKET's fanout feature want a symmetric packet
    header hash for load balancing purposes, so give it to them.

 2) Fix vlan state synchronization in e1000e, from Jarod Wilson.

 3) Use correct socket pointer in ip_skb_dst_mtu(), from Shmulik
    Ladkani.

 4) mlx5 bug fixes from Mohamad Haj Yahia, Daniel Jurgens, Matthew
    Finlay, Rana Shahout, and Shaker Daibes.  Mostly to do with
    operation timeouts and PCI error handling.

 5) Fix checksum handling in mirred packet action, from WANG Cong.

 6) Set skb->dev correctly when transmitting in !protect_frames case of
    macsec driver, from Daniel Borkmann.

 7) Fix MTU calculation in geneve driver, from Haishuang Yan.

 8) Missing netif_napi_del() in unregister path of qeth driver, from
    Ursula Braun.

 9) Handle malformed route netlink messages in decnet properly, from
    Vergard Nossum.

10) Memory leak of percpu data in ipv6 routing code, from Martin KaFai
    Lau.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  ipv6: Fix mem leak in rt6i_pcpu
  net: fix decnet rtnexthop parsing
  cxgb4: update latest firmware version supported
  net/mlx5: Avoid setting unused var when modifying vport node GUID
  bonding: fix enslavement slave link notifications
  r8152: fix runtime function for RTL8152
  qeth: delete napi struct when removing a qeth device
  Revert "fsl/fman: fix error handling"
  fsl/fman: fix error handling
  cdc_ncm: workaround for EM7455 "silent" data interface
  RDS: fix rds_tcp_init() error path
  geneve: fix max_mtu setting
  net: phy: dp83867: Fix initialization of PHYCR register
  enc28j60: Fix race condition in enc28j60 driver
  net: stmmac: Fix null-function call in ISR on stmmac1000
  tipc: fix nl compat regression for link statistics
  net: bcmsysport: Device stats are unsigned long
  macsec: set actual real device for xmit when !protect_frames
  net_sched: fix mirrored packets checksum
  packet: Use symmetric hash for PACKET_FANOUT_HASH.
  ...
2016-07-06 09:42:43 -07:00
Linus Torvalds
4cdbbbd11f sound fixes for 4.7-rc7
Here are a collection of small fixes: at this time, we've got a
 slightly high amount, but all small and trivial fixes, and nothing
 scary can be seen there.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXe+McAAoJEGwxgFQ9KSmkKx8QALm5AuHci2vtfGo4omz1irxw
 CEGvAmSeP8PbfvyBo0duGByTQqUgZaVp50yb9WZaeHskzJukv2VBHeJwE/TgsNI4
 tsJUFF0sp9uYrn6Y/DLBFFeS4qWk4EWr5tusWlCZIzlRxweqIh1WukVU79WgoT9O
 DYdz3O9Sf/Wqkp9el+fjvdl3CuiZMNdp+8AOffx1sF3bUBOxOAMbnG1holYp9aqR
 MbGMpsaKBzKrblhjlKkEHKPK2+ZDM5Fn6x17fQ4Zqy3eZA060m75nE+subVoKrVW
 MvG+WZk01ploDYS1/qflv/RxQFUc7GSrBVnGFkr+KGyTSxuGLqllQl34pcBTJd8M
 xW8sVCTmtVN9eikwqjZbf7anwlfhcqiyBaCqp9muJTJpgROGxPOmxlDvKiHpjRT4
 4PJXBnveWFSG+sgj51d2oPrPf3oXO/8znpamx5caqwUzy3pynlpD1TjX+L6kO/YW
 +bUnOM4TsNmpZ2RExArFs09rTmqxFe/npW/rcEdKryTlDNiJrxfPMagAViIl1Ejd
 l1FONwGbFpWAVyvpaTskDp/nOsURpVbtVsEieOJThUPZVbYq0SQOjhc7V7YkBCLn
 Oa+/oItJ4vPP+C3upi4rYyX6NY3t0si9XnokSzUMF69XpcdHmUkXBnxXI/z3XAr5
 pC86QGz7UlQrqCsvgpZW
 =8ER4
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Here are a collection of small fixes: at this time, we've got a
  slightly high amount, but all small and trivial fixes, and nothing
  scary can be seen there"

* tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup
  ALSA: timer: Fix negative queue usage by racy accesses
  ASoC: rt5645: fix reg-2f default value.
  ASoC: fsl_ssi: Fix number of words per frame for I2S-slave mode
  ALSA: au88x0: Fix calculation in vortex_wtdma_bufshift()
  ALSA: hda - Add PCI ID for Kabylake-H
  ALSA: echoaudio: Fix memory allocation
  ASoC: Intel: atom: fix missing breaks that would cause the wrong operation to execute
  ALSA: hda - fix read before array start
  ASoC: cx20442: set tty->receiver_room in v253_open
  ASoC: ak4613: Enable cache usage to fix crashes on resume
  ASoC: wm8940: Enable cache usage to fix crashes on resume
  ASoC: Intel: Skylake: Initialize module list for Broxton
  ASoC: wm5102: Correct supported channels on trace compressed DAI
  ASoC: wm5110: Add missing route from OUT3R to SYSCLK
  ASoC: rt5670: fix HP Playback Volume control
  ASoC: hdmi-codec: select CONFIG_HDMI
  ASoC: davinci-mcasp: Fix dra7 DMA offset when using CFG port
  ASoC: hdac_hdmi: Fix potential NULL dereference
  ASoC: ak4613: Remove owner assignment from platform_driver
  ...
2016-07-06 09:12:43 -07:00
Linus Torvalds
4d0a279c7c platform/chrome: Fix for double-fetched ioctl args
A single fix this time, closing a window where ioctl args are fetched twice.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXfCCOAAoJEIwa5zzehBx3QiMP/AsDZuoswx76p7aN8lg/hOvc
 /KkYup5EhJ4rohUNbZza8RouyUE8TW6i8INuCJjrzwO32oTL+yaNbUeElNyU8E6W
 QLWeQi5j3zKMPQgdHcC229Gn4B52UEvSsLpTtjvdAfCIq8MfEsO2vdkFYj39HFRh
 uuW2eOZvOpHzl2FWZrIa1hDley5sEUX0tfR5xmzkLKgsuUx//ADYKCYWlyDLxUHn
 4+jkip/F29lStOTMo8FLw1OS/Sj+7mxziLGxpMgkpRVpXVx4FrwGLIDIQtf8AtAq
 3091Y/JIJhEB3qY7b0TMPjGNxUGaZIT+aAPFj4mIeYUIpNkGhaOaUyLyI35osJaJ
 3ev+PYgIi7ol1TT3wgdKqICfKRGgC1O81LGPYrrJBt0zEzeFSomQOGREKx4Af7uf
 mqYjoJVgDEKy5JUHrvL9G9/Hlw9B9nM6hqC2fxOWYWhGRmKa61YDV9Csy5xRm3DU
 4faWE4oVMgp4CnAkMoCge8Dzj1NHYSex+R39dg6jKItkIV9h5jZKCNPJGhFw+Hlz
 FgSdbjY8qTy1NnWct7+dV96oWfpNNbCKGApZPKOoVmHAqvbXh0iqjxU9we5bzA/j
 3Ou1FfwRzWU3xSMwbSNLZnh1wEyQvMjlR2ScrAhSzLokID0vLe8v+4+PRRiFbGyI
 VALMkYaroIBFbrRV7UXE
 =xPkx
 -----END PGP SIGNATURE-----

Merge tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform

Pull chrome platform fix from Olof Johansson:
 "A single fix this time, closing a window where ioctl args are fetched
  twice"

* tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
  platform/chrome: cros_ec_dev - double fetch bug in ioctl
2016-07-06 09:07:23 -07:00
Joerg Roedel
522e5cb76d iommu/amd: Fix unity mapping initialization race
There is a race condition in the AMD IOMMU init code that
causes requested unity mappings to be blocked by the IOMMU
for a short period of time. This results on boot failures
and IO_PAGE_FAULTs on some machines.

Fix this by making sure the unity mappings are installed
before all other DMA is blocked.

Fixes: aafd8ba0ca ('iommu/amd: Implement add_device and remove_device')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-07-06 18:04:55 +02:00
James Bottomley
ea1a25c334 Merge branch 'jejb-fixes' into fixes 2016-07-06 07:25:55 -07:00
David Daney
88d02a2ba6 MIPS: Fix page table corruption on THP permission changes.
When the core THP code is modifying the permissions of a huge page it
calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit
of the page table entry.  The result can be kernel messages like:

mm/memory.c:397: bad pmd 000000040080004d.
mm/memory.c:397: bad pmd 00000003ff00004d.
mm/memory.c:397: bad pmd 000000040100004d.

or:

------------[ cut here ]------------
WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158()
Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80
CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4
Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0
          0000000000000000 0000000000000000 ffffffff85110000 0000000000000119
          0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f
          0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000
          0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80
          ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001
          0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924
          80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780
          80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f
          0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000
          ...
Call Trace:
[<ffffffff80865c4c>] show_stack+0x6c/0xf8
[<ffffffff80885780>] warn_slowpath_common+0x78/0xa8
[<ffffffff809207a0>] exit_mmap+0x150/0x158
[<ffffffff80882d44>] mmput+0x5c/0x110
[<ffffffff8088b450>] do_exit+0x230/0xa68
[<ffffffff8088be34>] do_group_exit+0x54/0x1d0
[<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18

---[ end trace c7b38293191c57dc ]---
BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536

Fix by not clearing _PAGE_HUGE bit.

Signed-off-by: David Daney <david.daney@cavium.com>
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13687/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-06 15:09:03 +02:00
Ville Syrjälä
175a20c16f x86/perf/intel/rapl: Fix module name collision with powercap intel-rapl
Since commit 4b6e2571bf the rapl perf module calls itself intel-rapl. That
name was already in use by the rapl powercap driver, which now fails to load
if the perf module is loaded. Fix the problem by renaming the perf module to
intel-rapl-perf, so that both modules can coexist.

Fixes: 4b6e2571bf ("x86/perf/intel/rapl: Make the Intel RAPL PMU driver modular")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466694409-3620-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-06 12:51:59 +02:00
Martin KaFai Lau
903ce4abdf ipv6: Fix mem leak in rt6i_pcpu
It was first reported and reproduced by Petr (thanks!) in
https://bugzilla.kernel.org/show_bug.cgi?id=119581

free_percpu(rt->rt6i_pcpu) used to always happen in ip6_dst_destroy().

However, after fixing a deadlock bug in
commit 9c7370a166 ("ipv6: Fix a potential deadlock when creating pcpu rt"),
free_percpu() is not called before setting non_pcpu_rt->rt6i_pcpu to NULL.

It is worth to note that rt6i_pcpu is protected by table->tb6_lock.

kmemleak somehow did not report it.  We nailed it down by
observing the pcpu entries in /proc/vmallocinfo (first suggested
by Hannes, thanks!).

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Fixes: 9c7370a166 ("ipv6: Fix a potential deadlock when creating pcpu rt")
Reported-by: Petr Novopashenniy <pety@rusnet.ru>
Tested-by: Petr Novopashenniy <pety@rusnet.ru>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Petr Novopashenniy <pety@rusnet.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 14:09:23 -07:00
Vegard Nossum
ab58298cf4 net: fix decnet rtnexthop parsing
dn_fib_count_nhs() could enter an infinite loop if nhp->rtnh_len == 0
(i.e. if userspace passes a malformed netlink message).

Let's use the helpers from net/nexthop.h which take care of all this
stuff. We can do exactly the same as e.g. fib_count_nexthops() and
fib_get_nhs() from net/ipv4/fib_semantics.c.

This fixes the softlockup for me.

Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 14:08:47 -07:00
Lv Zheng
7e3fd81371 ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal
The FIFO unlocking mechanism in acpi_dbg has been broken by the
following commit:

  Commit: 287980e49f
  Subject: remove lots of IS_ERR_VALUE abuses

It converted !IS_ERR_VALUE(ret) into !ret which was not entirely
correct. Fix the regression by taking ret > 0 into account too as
appropriate.

Fixes: 287980e49f (remove lots of IS_ERR_VALUE abuses)
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Simplifications, changelog & subject massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-05 23:02:34 +02:00
Dan Carpenter
096cdc6f52 platform/chrome: cros_ec_dev - double fetch bug in ioctl
We verify "u_cmd.outsize" and "u_cmd.insize" but we need to make sure
that those values have not changed between the two copy_from_user()
calls.  Otherwise it could lead to a buffer overflow.

Additionally, cros_ec_cmd_xfer() can set s_cmd->insize to a lower value.
We should use the new smaller value so we don't copy too much data to
the user.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Fixes: a841178445 ('mfd: cros_ec: Use a zero-length array for command data')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Gwendal Grignou <gwendal@chromium.org>
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-07-05 14:01:52 -07:00
Ben Skeggs
217215041b drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern
Fixes a regression caused by a stupid thinko from "disp/sor/gf119: both
links use the same training register".

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-07-06 06:56:37 +10:00
Lv Zheng
45209046c4 ACPICA: Namespace: Fix namespace/interpreter lock ordering
There is a lock order issue in acpi_load_tables(). The namespace lock
is held before holding the interpreter lock.

With ACPI_MUTEX_DEBUG enabled in the kernel, this is printed to the
log during boot:

  [    0.885699] ACPI Error: Invalid acquire order: Thread 405884224 owns [ACPI_MTX_Namespace], wants [ACPI_MTX_Interpreter] (20160422/utmutex-263)
  [    0.885881] ACPI Error: Could not acquire AML Interpreter mutex (20160422/exutils-95)
  [    0.893846] ACPI Error: Mutex [0x0] is not acquired, cannot release (20160422/utmutex-326)
  [    0.894019] ACPI Error: Could not release AML Interpreter mutex (20160422/exutils-133)

The issue has been introduced by the following commit:

  Commit: 2f38b1b16d
  ACPICA Commit: bfe03ffcde8ed56a7eae38ea0b188aeb12f9c52e
  Subject: ACPICA: Namespace: Fix a regression that MLC support triggers
           dead lock in dynamic table loading

Which fixed a deadlock issue for acpi_ns_load_table() in
acpi_ex_add_table() but didn't take care of the lock order in
acpi_ns_load_table() correctly.

Originally (before the above commit), ACPICA used the
namespace/interpreter locks in the following 2 key code
paths:

 1. Table loading:
 acpi_ns_load_table
	L(Namespace)
		acpi_ns_parse_table
			acpi_ns_one_complete_parse
	U(Namespace)
 2. Object evaluation:
 acpi_ns_evaluate
	L(Interpreter)
	acpi_ps_execute_method
		U(Interpreter)
		acpi_ns_load_table
			L(Namespace)
			U(Namespace)
		acpi_ev_initialize_region
			L(Namespace)
			U(Namespace)
		address_space.setup
			L(Namespace)
			U(Namespace)
		address_space.handler
			L(Namespace)
			U(Namespace)
		acpi_os_wait_semaphore
		acpi_os_acquire_mutex
		acpi_os_sleep
		L(Interpreter)
	U(Interpreter)

During runtime, while acpi_ns_evaluate is called, the lock order is
always Interpreter -> Namespace.

In turn, the problematic commit acquires the locks in the following
order:

 3. Table loading:
 acpi_ns_load_table
	L(Namespace)
		acpi_ns_parse_table
		L(Interpreter)
			acpi_ns_one_complete_parse
		U(Interpreter)
	U(Namespace)

To fix the lock order issue, move the interpreter lock to
acpi_ns_load_table() to ensure the lock order correctness:

 4. Table loading:
 acpi_ns_load_table
	L(Interpreter)
	L(Namespace)
		acpi_ns_parse_table
			acpi_ns_one_complete_parse
	U(Namespace)
	U(Interpreter)

However, this doesn't fix the current design issues related to the
namespace lock. For example, we can notice that in acpi_ns_evaluate(),
outside of acpi_ns_load_table(), the namespace objects may be created
by the named object creation control methods. And the creation of
the method-owned namespace objects are not locked by the namespace
lock. This patch doesn't try to fix such kind of existing issues.

Fixes: 2f38b1b16d (ACPICA: Namespace: Fix a regression that MLC support triggers dead lock in dynamic table loading)
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-05 22:48:44 +02:00
Bruno Prémont
262e2bfd7d qla2xxx: Fix NULL pointer deref in QLA interrupt
In qla24xx_process_response_queue() rsp->msix->cpuid may trigger NULL
pointer dereference when rsp->msix is NULL:

[    5.622457] NULL pointer dereference at 0000000000000050
[    5.622457] IP: [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
[    5.622457] PGD 0
[    5.622457] Oops: 0000 [#1] SMP
[    5.622457] Modules linked in:
[    5.622457] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.6.3-x86_64 #1
[    5.622457] Hardware name: HP ProLiant DL360 G5, BIOS P58 05/02/2011
[    5.622457] task: ffff8801a88f3740 ti: ffff8801a8954000 task.ti: ffff8801a8954000
[    5.622457] RIP: 0010:[<ffffffff8155e614>]  [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
[    5.622457] RSP: 0000:ffff8801afb03de8  EFLAGS: 00010002
[    5.622457] RAX: 0000000000000000 RBX: 0000000000000032 RCX: 00000000ffffffff
[    5.622457] RDX: 0000000000000002 RSI: ffff8801a79bf8c8 RDI: ffff8800c8f7e7c0
[    5.622457] RBP: ffff8801afb03e68 R08: 0000000000000000 R09: 0000000000000000
[    5.622457] R10: 00000000ffff8c47 R11: 0000000000000002 R12: ffff8801a79bf8c8
[    5.622457] R13: ffff8800c8f7e7c0 R14: ffff8800c8f60000 R15: 0000000000018013
[    5.622457] FS:  0000000000000000(0000) GS:ffff8801afb00000(0000) knlGS:0000000000000000
[    5.622457] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.622457] CR2: 0000000000000050 CR3: 0000000001e07000 CR4: 00000000000006e0
[    5.622457] Stack:
[    5.622457]  ffff8801afb03e30 ffffffff810c0f2d 0000000000000086 0000000000000002
[    5.622457]  ffff8801afb03e28 ffffffff816570e1 ffff8800c8994628 0000000000000002
[    5.622457]  ffff8801afb03e60 ffffffff816772d4 b47c472ad6955e68 0000000000000032
[    5.622457] Call Trace:
[    5.622457]  <IRQ>
[    5.622457]  [<ffffffff810c0f2d>] ? __wake_up_common+0x4d/0x80
[    5.622457]  [<ffffffff816570e1>] ? usb_hcd_resume_root_hub+0x51/0x60
[    5.622457]  [<ffffffff816772d4>] ? uhci_hub_status_data+0x64/0x240
[    5.622457]  [<ffffffff81560d00>] qla24xx_intr_handler+0xf0/0x2e0
[    5.622457]  [<ffffffff810d569e>] ? get_next_timer_interrupt+0xce/0x200
[    5.622457]  [<ffffffff810c89b4>] handle_irq_event_percpu+0x64/0x100
[    5.622457]  [<ffffffff810c8a77>] handle_irq_event+0x27/0x50
[    5.622457]  [<ffffffff810cb965>] handle_edge_irq+0x65/0x140
[    5.622457]  [<ffffffff8101a498>] handle_irq+0x18/0x30
[    5.622457]  [<ffffffff8101a276>] do_IRQ+0x46/0xd0
[    5.622457]  [<ffffffff817f8fff>] common_interrupt+0x7f/0x7f
[    5.622457]  <EOI>
[    5.622457]  [<ffffffff81020d38>] ? mwait_idle+0x68/0x80
[    5.622457]  [<ffffffff8102114a>] arch_cpu_idle+0xa/0x10
[    5.622457]  [<ffffffff810c1b97>] default_idle_call+0x27/0x30
[    5.622457]  [<ffffffff810c1d3b>] cpu_startup_entry+0x19b/0x230
[    5.622457]  [<ffffffff810324c6>] start_secondary+0x136/0x140
[    5.622457] Code: 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 8b 47 58 a8 02 0f 84 c5 00 00 00 48 8b 46 50 49 89 f4 65 8b 15 34 bb aa 7e <39> 50 50 74 11 89 50 50 48 8b 46 50 8b 40 50 41 89 86 60 8b 00
[    5.622457] RIP  [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
[    5.622457]  RSP <ffff8801afb03de8>
[    5.622457] CR2: 0000000000000050
[    5.622457] ---[ end trace fa2b19c25106d42b ]---
[    5.622457] Kernel panic - not syncing: Fatal exception in interrupt

The affected code was introduced by commit cdb898c52d
(qla2xxx: Add irq affinity notification).

Only dereference rsp->msix when it has been set so the machine can boot
fine. Possibly rsp->msix is unset because:
[    3.479679] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.33-k.
[    3.481839] qla2xxx [0000:13:00.0]-001d: : Found an ISP2432 irq 17 iobase 0xffffc90000038000.
[    3.484081] qla2xxx [0000:13:00.0]-0035:0: MSI-X; Unsupported ISP2432 (0x2, 0x3).
[    3.485804] qla2xxx [0000:13:00.0]-0037:0: Falling back-to MSI mode -258.
[    3.890145] scsi host0: qla2xxx
[    3.891956] qla2xxx [0000:13:00.0]-00fb:0: QLogic QLE2460 - PCI-Express Single Channel 4Gb Fibre Channel HBA.
[    3.894207] qla2xxx [0000:13:00.0]-00fc:0: ISP2432: PCIe (2.5GT/s x4) @ 0000:13:00.0 hdma+ host#=0 fw=7.03.00 (9496).
[    5.714774] qla2xxx [0000:13:00.0]-500a:0: LOOP UP detected (4 Gbps).

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
CC: <stable@vger.kernel.org>  # 4.5+
Fixes: cdb898c52d
Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>
2016-07-05 12:42:54 -07:00
Ganesh Goudar
f5d6516120 cxgb4: update latest firmware version supported
Change t4fw_version.h to update latest firmware version number

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 11:53:25 -07:00
Or Gerlitz
eae033c1b8 net/mlx5: Avoid setting unused var when modifying vport node GUID
GCC complains on unused-but-set-variable, clean this up.

Fixes: 23898c763f ('net/mlx5: E-Switch, Modify node guid on vf set MAC')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 11:52:42 -07:00
Aviv Heller
a30b016808 bonding: fix enslavement slave link notifications
Currently, link notifications are not sent by
bond_set_slave_link_state() upon enslavement if
the slave is enslaved when up.

This happens because slave->link default init value
is 0, which is the same as BOND_LINK_UP, resulting
in bond_set_slave_link_state() ignoring this transition.

This patch sets the default value of slave->link to
BOND_LINK_NOCHANGE, assuring it will count as a state
transition and thus trigger notification logic.

Signed-off-by: Aviv Heller <avivh@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 11:51:55 -07:00
hayeswang
2609af1936 r8152: fix runtime function for RTL8152
The RTL8152 doesn't have U1U2 and U2P3 features, so use different
runtime functions for RTL812 and RTL8153 by adding autosuspend_en()
to rtl_ops.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 10:22:29 -07:00
Linus Walleij
92c74bceb0 Revert "gpio: gpiolib-of: Allow compile testing"
This reverts commit 1e4a806403.

This creates more problems than it solves right now. Compile
testing needs to go in with patches fixing the problems it
uncovers.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-07-05 19:03:04 +02:00
Jisheng Zhang
5130213721 tick/broadcast-hrtimer: Set name of the ce_broadcast_hrtimer
This is to avoid the "null" name when we either

~ # cat /sys/devices/system/clockevents/broadcast/current_device
(null)

or

~ # cat /proc/timer_list
...
Tick Device: mode:     1
Broadcast device
Clock Event Device: (null)
...

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1467709071-3667-1-git-send-email-jszhang@marvell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-05 17:02:19 +02:00
Paul Burton
547aefc4db irqchip/mips-gic: Match IPI IRQ domain by bus token only
Commit fbde2d7d82 ("MIPS: Add generic SMP IPI support") introduced
code which calls irq_find_matching_host with a NULL node parameter in
order to discover IPI IRQ domains which are not associated with the DT
root node's interrupt parent. This suggests that implementations of IPI
IRQ domains should effectively ignore the node parameter if it is NULL
and search purely based upon the bus token. Commit 2af70a9620
("irqchip/mips-gic: Add a IPI hierarchy domain") did not do this when
implementing the GIC IPI IRQ domain, and on MIPS Boston boards this
leads to no IPI domain being discovered and a NULL pointer dereference
when attempting to send an IPI:

  CPU 0 Unable to handle kernel paging request at virtual address 0000000000000040, epc == ffffffff8016e70c, ra == ffffffff8010ff5c
  Oops[#1]:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc6-00223-gad0d1b6 #945
  task: a8000000ff066fc0 ti: a8000000ff068000 task.ti: a8000000ff068000
  $ 0   : 0000000000000000 0000000000000001 ffffffff80730000 0000000000000003
  $ 4   : 0000000000000000 ffffffff8057e5b0 a800000001e3ee00 0000000000000000
  $ 8   : 0000000000000000 0000000000000023 0000000000000001 0000000000000001
  $12   : 0000000000000000 ffffffff803323d0 0000000000000000 0000000000000000
  $16   : 0000000000000000 0000000000000000 0000000000000001 ffffffff801108fc
  $20   : 0000000000000000 ffffffff8057e5b0 0000000000000001 0000000000000000
  $24   : 0000000000000000 ffffffff8012de28
  $28   : a8000000ff068000 a8000000ff06fbc0 0000000000000000 ffffffff8010ff5c
  Hi    : ffffffff8014c174
  Lo    : a800000001e1e140
  epc   : ffffffff8016e70c __ipi_send_mask+0x24/0x11c
  ra    : ffffffff8010ff5c mips_smp_send_ipi_mask+0x68/0x178
  Status: 140084e2        KX SX UX KERNEL EXL
  Cause : 00800008 (ExcCode 02)
  BadVA : 0000000000000040
  PrId  : 0001a920 (MIPS I6400)
  Process swapper/0 (pid: 1, threadinfo=a8000000ff068000, task=a8000000ff066fc0, tls=0000000000000000)
  Stack : 0000000000000000 0000000000000000 0000000000000001 ffffffff801108fc
            0000000000000000 ffffffff8057e5b0 0000000000000001 ffffffff8010ff5c
            0000000000000001 0000000000000020 0000000000000000 0000000000000000
            0000000000000000 ffffffff801108fc 0000000000000000 0000000000000001
            0000000000000001 0000000000000000 0000000000000000 ffffffff801865e8
            a8000000ff0c7500 a8000000ff06fc90 0000000000000001 0000000000000002
            ffffffff801108fc ffffffff801868b8 0000000000000000 ffffffff801108fc
            0000000000000000 0000000000000003 ffffffff8068c700 0000000000000001
            ffffffff80730000 0000000000000001 a8000000ff00a290 ffffffff80110c50
            0000000000000003 a800000001e48308 0000000000000003 0000000000000008
            ...
  Call Trace:
  [<ffffffff8016e70c>] __ipi_send_mask+0x24/0x11c
  [<ffffffff8010ff5c>] mips_smp_send_ipi_mask+0x68/0x178
  [<ffffffff801865e8>] generic_exec_single+0x150/0x170
  [<ffffffff801868b8>] smp_call_function_single+0x108/0x160
  [<ffffffff80110c50>] cps_boot_secondary+0x328/0x394
  [<ffffffff80110534>] __cpu_up+0x38/0x90
  [<ffffffff8012de4c>] bringup_cpu+0x24/0xac
  [<ffffffff8012df40>] cpuhp_up_callbacks+0x58/0xdc
  [<ffffffff8012e648>] cpu_up+0x118/0x18c
  [<ffffffff806dc158>] smp_init+0xbc/0xe8
  [<ffffffff806d4c18>] kernel_init_freeable+0xa0/0x228
  [<ffffffff8056c908>] kernel_init+0x10/0xf0
  [<ffffffff80105098>] ret_from_kernel_thread+0x14/0x1c

Fix this by allowing the GIC IPI IRQ domain to match purely based upon
the bus token if the node provided is NULL.

Fixes: 2af70a9620 ("irqchip/mips-gic: Add a IPI hierarchy domain")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160705132600.27730-2-paul.burton@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-05 16:54:21 +02:00
Paul Burton
99ec8a3608 irqchip/mips-gic: Map to VPs using HW VPNum
When mapping an interrupt to a VP(E) we must use the identifier for the
VP that the hardware expects, and this does not always match up with the
Linux CPU number. Commit d46812bb0b ("irqchip: mips-gic: Use HW IDs
for VPE_OTHER_ADDR") corrected this for the cases that existed at the
time it was written, but commit 2af70a9620 ("irqchip/mips-gic: Add a
IPI hierarchy domain") added another case before the former patch was
merged. This leads to incorrectly using Linux CPU numbers when mapping
interrupts to VPs, which breaks on certain systems such as those with
multi-core I6400 CPUs. Fix by adding the appropriate call to
mips_cm_vp_id() to retrieve the expected VP identifier.

Fixes: d46812bb0b ("irqchip: mips-gic: Use HW IDs for VPE_OTHER_ADDR")
Fixes: 2af70a9620 ("irqchip/mips-gic: Add a IPI hierarchy domain")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160705132600.27730-1-paul.burton@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-05 16:54:21 +02:00
Torsten Hilbrich
9cd2574376 ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup
This solves the issue that a headphone is not working on the docking
unit.

Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-07-05 12:09:52 +02:00