I.e. in C_WF_REPORT_PARAMS or in C_WF_CONNECTION.
Sending may already work in these cstates, but the peer still expects
the HandShake / ConnectionFeatures packet.
Actually triggered by the Testuite on kugel.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
If the asender thread, or request_timer_fn(), or some other part of
the code, decided to drop the connection (because of timeout or other),
but the receiver just now was processing a P_STATE packet, there was a
chance that receive_state() would do a hard state change
"re-establishing" an already failed connection without additional handshake.
Log excerpt:
Remote failed to finish a request within ko-count * timeout
peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
asender terminated
...
peer( Unknown -> Secondary ) conn( Timeout -> Connected ) pdsk( DUnknown -> UpToDate ) peer_isp( 0 -> 1 )
...
Connection closed
peer( Secondary -> Unknown ) conn( Connected -> Unconnected ) pdsk( UpToDate -> DUnknown ) peer_isp( 1 -> 0 )
receiver terminated
Impact:
while the connection state is erroneously "Connected",
requests may be queued and even sent,
which would never be acknowledged,
and may have been missed by the cleanup.
These requests would never be completed.
The next drbd_suspend_io() will then lock up,
waiting forever for these requests to complete.
Fixed in several code paths:
Make sure the connection state is NetworkFailure or worse
before starting the cleanup in drbd_disconnect().
This should make sure the cleanup won't miss any requests.
Disallow receive_state() to "upgrade" the connection state
from an error state. This will make sure the "illegal" state
transition won't happen.
For all connection failure states,
relax the safe-guard in sanitize_state() again
to silently mask out those state changes
(e.g. Timeout -> Connected becomes Timeout -> Timeout).
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drbd_try_clear_on_disk_bm() has a sanity check for the number of blocks
left to be resynced (rs_left) in the current resync extent.
If it detects a mismatch, it complains, and forces a disconnect using
drbd_force_state(mdev, NS(conn, C_DISCONNECTING));
Unfortunately, this may be called while holding the req_lock,
and drbd_force_state() want's to aquire that lock itself. Deadlock.
Don't force a disconnect, but fix up rs_left by recounting and
reassigning the number of dirty blocks in that extent.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This bug might have caused troubles if disk-barriers and the ahead-behind
more are enabled at the same time.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
DRBD state changes schedule after_state_ch() actions to a worker thread,
which decides on the old and new states of that change, whether to send
an informational state update packet (P_STATE) to the peer.
If it decides to drbd_send_state(), it would however always send the
_curent_ state, which, if a second state change happens before the
after_state_ch() of the first ran, may "fast-forward" the peer's view
about this node. In most cases that is harmless, but sometimes this can
confuse DRBD, for example into not actually starting a necessary resync
if you do a very tight detach/attach loop on a Connected Secondary.
Fix this by always sending the "new" state of the respective state
transition which scheduled this after_state_ch() work.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
When detaching, even cleanly detaching due to administrator request,
we always go through D_FAILED before we become D_DISKLESS.
Don't let that state change race with an in-flight meta data IO,
or that one might think it actually experienced an IO error.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drbd_state_lock() is only there to serialize cluster wide state
changes. Testing the local disk state needs to happen while
holding the global_state_lock.
Otherwise you might see something like this (Oct 6 on kugel)
14:20:24 drbd0: conn( WFSyncUUID -> Connected ) disk( Inconsistent -> Failed )
14:20:24 drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
14:20:24 drbd0: conn( Connected -> SyncTarget ) disk( Failed -> Inconsistent )
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
We have one pre-allocated page to do certain synchronous meta data IO with,
using it is serialized like so:
drbd_md_get_buffer();
drbd_md_sync_page_io();
drbd_md_sync_page_io();
...
drbd_md_put_buffer();
In drbd_md_sync_page_io() there is an
ASSERT(atomic_read(&mdev->md_io_in_use) == 1);
We want to be able to timeout on unresponsive lower level devices, so we
can "detach" in that case. Inside drbd_md_sync_page_io() we grab an extra
reference, to not have a dangling pointer in case a delayed IO eventually
does still complete, even after we "detached" already.
We need to put the extra reference before we signal completion from the
completion handler, or the second drbd_md_sync_page_io() above may
trigger the assert (reference count still 2).
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
With sync-after dependencies, given "lucky" timing of pause/unpause
events, and the end of an empty (0 bits set) resync was sometimes not
detected on the SyncTarget, leading to a "stalled" SyncSource state.
Fixed this by expecting not only "Inconsistent -> UpToDate" but also
"Consistent -> UpToDate" transitions for the peer disk state
to end a resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
If we get into the C_BROKEN_PIPE cstate once, the state engine set the
thi->t_state of the receiver thread to restarting. But with the while loop
in drbdd_init() a new connection gets established. After the call into
drbdd() returns immediately since the thi->t_state is not RUNNING. The
restart of drbd_init() then resets thi->t_state to RUNNING.
I.e. after entering C_BROKEN_PIPE once, the next successful established
connection gets wasted.
The two parts of the fix:
* Do not cause the thread to restart if we detect the issue
with the sockets while we are in C_WF_CONNECTION.
* Make sure that all actions that would have set us to C_BROKEN_PIPE
happen before the state change to C_WF_REPORT_PARAMS.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
...when the peer has inconsistent data. In that case we failed to
clear the susp_nod flag. When the local disk was attached again
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Now, the new edition of the clause only fires if a diskless
peer gets promoted.
This is a fixup for "drbd: Delayed creation of current-UUID".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Bitmap IO may happend in the context of an application write,
in the generic block IO path. We need to use GFP_NOIO.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
When the disk-timeout is active, and it expires for a single request,
we consider the local disk as D_FAILED. Note: With this change,
I made both timeout based state transitions HARD state transitions.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The last bunch of commits prepared the 'detach from tar pit' feature.
With that we can be for long time in disk state FAILED. We need
to accept new IO requests during that time.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The current code groups up to 16 nodes in a level and then puts an
ALLNODES domain spanning the entire tree on top of that. This doesn't
reflect the numa topology and esp for the smaller not-fully-connected
machines out there today this might make a difference.
Therefore, build a proper numa topology based on node_distance().
Since there's no fixed numa layers anymore, the static SD_NODE_INIT
and SD_ALLNODES_INIT aren't usable anymore, the new code tries to
construct something similar and scales some values either on the
number of cpus in the domain and/or the node_distance() ratio.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: sparclinux@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Greg Pearson <greg.pearson@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: bob.picco@oracle.com
Cc: chris.mason@oracle.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-r74n3n8hhuc2ynbrnp3vt954@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the sched_domain walk is completely unserialized (!SD_SERIALIZE)
it is possible that multiple cpus in the group get elected to do the
next level. Avoid this by adding some serialization.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-vqh9ai6s0ewmeakjz80w4qz6@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently we let the leftmost (or first idle) cpu ascend the
sched_domain tree and perform load-balancing. The result is that the
busiest cpu in the group might be performing this function and pull
more load to itself. The next load balance pass will then try to
equalize this again.
Change this to pick the least loaded cpu to perform higher domain
balancing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-v8zlrmgmkne3bkcy9dej1fvm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since there's a PID space limit of 30bits (see
futex.h:FUTEX_TID_MASK) and allocating that many tasks (assuming a
lower bound of 2 pages per task) would still take 8T of memory it
seems reasonable to say that unsigned int is sufficient for
rq->nr_running.
When we do get anywhere near that amount of tasks I suspect other
things would go funny, load-balancer load computations would really
need to be hoisted to 128bit etc.
So save a few bytes and convert rq->nr_running and friends to
unsigned int.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-y3tvyszjdmbibade5bw8zl81@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
An API update which wasn't sufficiently thorough in updating the tree...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIbBAABAgAGBQJPqlf6AAoJEBus8iNuMP3duOcP+MUkL8T/PMaehNqqcLW6tJCE
vxmS5RzpRcsGBzc5MYGpuzItpNiqEZGjHsp+2RAGTY3P5ACIJ04jzrwFxCrkoFB4
w4tro6dz8uFA+gWAFqd+/D/UQXE7bDfxy2mZ0PYjgZtA8C2aVaQuIt0Btkt4jcqo
42LioL2JM0OERHPf2cacuownf3YU8O1srVfhTXpbgKjUU4llkIzA6UMOjAZuTf6W
z0dmXet5DHs19f7PM8PnRxTUkCuMM5KtzCUPhENxf/mTb8RWc/rS6iJrhWTkzVYg
Kn41YDXElcjakoOiM/T8kaUe25Y3d1YSKgy/4OiFvn8fdb/icXai+3Z2eEk7Tnye
ZEzSEheSuNw/6IHEcdw8wxy0Dh/Cal6oDUSANk3d3+vVGjxMGK7IpuKdJ3vk353f
+MRwTxlfRQUlwRlv2Wp4J6TQ13nJKksalvGTcsDfmtcAQTP0s+8xtXG0gzWNuqJo
tGTt4sLTMjLsJawH7fLqsUVRzrFqJaFSQ0fkxU3sVnomREDTQkaIPFDXJPS1id2C
E4VIHkYi3mA9ki0TH67HcCvKnNm765oUAiqovZiu4LdH9vJpM+yAnG0Hc8SrM7AQ
9jK6z9YOdHV7KyGpBUq8LM2+lLTLZRsFLPQ/d2xiJw2JxCYZD9YykQv90ZGMtTaR
Ks8T4KioUZinkCSB4eg=
=HF38
-----END PGP SIGNATURE-----
Merge tag 'asoc-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Build fix for SH in 3.4
An API update which wasn't sufficiently thorough in updating the tree...
We're trying to remove all usage of the ASoc level cache and I/O code and
for a device like this with a pretty sparse register map the rbtree cache
is a better idea anyway.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Add driver for running I2S with the MSP-block.
Signed-off-by: Ola Lilja <ola.o.lilja@stericsson.com>
[Fixed trailing whitespace -- broonie]
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Use for_each_set_bit_from to iterate over all the set bit in a memory
region.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Christian Riesch <christian.riesch@omicron.at>
Cc: Kevin Hilman <khilman@ti.com>
Cc: davinci-linux-open-source@linux.davincidsp.com
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Fix a recent compilation breakage, caused by a change in SH clock API.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The ISR does quiete a lot of hw access which could be avoided. First it
checks for a pending interrupt by reading alteast one register. Then it
checks for the "activated" slots by reading another register. This is
more or a less a must.
Now, once it found an active slot it does the same two reads again.
After that it "knows" that there must be a pending transfer however it
cross checks with the other register. There are 32 bit in an interger
which are polled instead of considering only the set bits and ignoring
those which are zero. This performs atleast 32 reads which could be
avoided. In case of a first match it does another read.
This patch reorganizes the access by re-using the register which have
been read and then uses ffs() to find the matching slot instead looping
over it. By doing this we get rid of the last (32 + 2 + hits) reads.
It is possible however that by really busy bank0 we never get to handle
bank1. If this is a problem, we could try to handle bank1 after we are
done with bank0 to check if there are any outstanding transfers.
To put some numbers on this, this is from spi transfer via spidev. The
first column is the number of total transfers, the time stamp is taken
before and after the ioctl():
|10000, min: 542us avg: 591us
|20000, min: 542us avg: 592us
|30000, min: 542us avg: 592us
|40000, min: 542us avg: 585us
|50000, min: 542us avg: 593us
The same test case with the patch applied
|10000, min: 444us avg: 493us
|20000, min: 444us avg: 491us
|30000, min: 444us avg: 489us
|40000, min: 444us avg: 491us
|50000, min: 444us avg: 492us
that is almost 100us that just went away.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Apart from the necessity to do this change for multi-platform kernels
the previous logic depended on the zImage decompressor to write the
physical and virtual address to a magic memory location.
If the decompressor is unused or not correctly configured for the
current machid, the addruart macro was an infinite loop. Moreover
debugging the early zImage code was not possible either.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
[nsekhar@ti.com: add braces in _DEBUG_LL_ENTRY() macro to fix checkpatch
error. Fix debug port choice config dependency for traditional DaVincis.
Modify debug port config names and add help text.]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Instead of only checking nonsensical topologies on numa-emu, do it
on real hardware as well, and print a warning.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-re15l0jqjtpz709oxozt2zoh@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When using numa=fake= you can get weird topologies where LLCs can span
nodes and other such nonsense. Cure this by hard partitioning these
masks on node boundaries.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-di5vwjm96q5vrb76opwuflwx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Allows emulating more interesting NUMA configurations like a quad
socket AMD Magny-Cour:
"numa=fake=8:10,16,16,22,16,22,16,22,
16,10,22,16,22,16,22,16,
16,22,10,16,16,22,16,22,
22,16,16,10,22,16,22,16,
16,22,16,22,10,16,16,22,
22,16,22,16,16,10,22,16,
16,22,16,22,16,22,10,16,
22,16,22,16,22,16,16,10"
Which has a non-fully-connected topology.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-e1136ef7kdffj7yf9tjhydln@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
can be used e.g. for ingress traffic policing or
to detect when a host/port consumes more bandwidth than expected.
This is done by optionally making cost to mean
"cost per 16-byte-chunk-of-data" instead of "cost per packet".
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
followup patch would bloat main match function too much.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The target allows you to create rules in the "raw" and "mangle" tables
which set the skbuff mark by means of hash calculation within a given
range. The nfmark can influence the routing method (see "Use netfilter
MARK value as routing key") and can also be used by other subsystems to
change their behaviour.
[ Part of this patch has been refactorized and modified by Pablo Neira Ayuso ]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
An overlay manager's timings (the manager size, and blanking parameters
if an LCD manager) are DISPC shadow registers, and they should hence
follow the correct programming model.
This series makes the video timings an extra_info parameter in manager's
private data. The interface drivers now apply the timings instead of
directly writing to registers.
This change also prevents the need to use display resolution for overlay
checks, hence making some of the APPLY functions less dependent on the
display. Some DISPC functions that needed display width can also use
these privately stored timings.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
This patch adds the flags parameter to ipv6_find_hdr. This flags
allows us to:
* know if this is a fragment.
* stop at the AH header, so the information contained in that header
can be used for some specific packet handling.
This patch also adds the offset parameter for inspection of one
inner IPv6 header that is contained in error messages.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The functions calc_fclk_five_taps() and check_horiz_timing_omap3() use the
function dispc_mgr_get_device() to get the omap_dss_device pointer to which
the manager is connected, the width of the panel is derived from that.
The manager's timing is stored in it's private data in APPLY. This contains
the latest timings applied to the manager. Pass these timings to
dispc_ovl_setup() and use them in the above functions. Remove the function
dispc_mgr_get_device() as it isn't used any more.
Signed-off-by: Archit Taneja <archit@ti.com>
The pixel clock rate for the TV manager is calculated by checking the device
type connected to the manager, and then requesting the VENC/HDMI interface for
the pixel clock rate.
Remove the use of omap_dss_device pointer from here by checking which interface
generates the pixel clock by reading the DSS_CTRL.VENC_HDMI_SWITCH bit.
Signed-off-by: Archit Taneja <archit@ti.com>
The omap_dss_device pointer declared in dss_ovl_setup_fifo() isn't used. Remove
the pointer variable declaration and it's assignment.
Signed-off-by: Archit Taneja <archit@ti.com>
The DPI and HDMI interfaces use their 'set_timing' functions to take in a new
set of timings. If the panel is disabled, they do not disable and re-enable
the interface. Currently, the manager timings are applied in hdmi_power_on()
and dpi_set_mode() respectively, these are not called by set_timings if the
panel is disabled.
When checking overlay and manager data, the DSS driver uses the last applied
manager timings, and not the timings held by omap_dss_device struct. Hence,
there is a need to apply the new manager timings even if the panel is disabled.
Apply the manager timings if the panel is disabled. Eventually, there should be
one common place where the timings are applied independent of the state of the
panel.
Signed-off-by: Archit Taneja <archit@ti.com>
In order to check the validity of overlay and manager info, there was a need to
use the omap_dss_device struct to get the panel resolution. The manager's
private data in APPLY now contains the manager timings. Hence, we don't need to
rely on the display resolution any more.
Pass the manager's timings in private data to dss_mgr_check(). Remove the need
to pass omap_dss_device structs in the functions which check for the validity
of overlay and manager parameters.
Signed-off-by: Archit Taneja <archit@ti.com>
If a manager is disabled, there is no guarantee at any point in time that all
it's parameters are configured. There is always a chance that some more
parameters are yet to be configured by a user of DSS, or by DSS itself.
However, when the manager is enabled, we can be certain that all the parameters
have been configured, as we can't enable a manager with an incomplete
configuration. Therefore, if a manager is disabled, don't check for the validity
of it's parameters or the parameters of the overlays connected to it. Only check
once it is enabled. Add a check in dss_check_settings_low() to achieve the same.
Signed-off-by: Archit Taneja <archit@ti.com>