Fix the IRQ handling on the MN10300 arch.
This patch makes a number of significant changes:
(1) It separates the irq_chip definition for edge-triggered interrupts from
the one for level-triggered interrupts.
This is necessary because the MN10300 PIC latches the IRQ channel's
interrupt request bit (GxICR_REQUEST), even after the device has ceased to
assert its interrupt line and the interrupt channel has been disabled in
the PIC. So for level-triggered interrupts we need to clear this bit when
we re-enable - which is achieved by setting GxICR_DETECT but not
GxICR_REQUEST when writing to the register.
Not doing this results in spurious interrupts occurring because calling
mask_ack() at the start of handle_level_irq() is insufficient - it fails
to clear the REQUEST latch because the device that caused the interrupt is
still asserting its interrupt line at this point.
(2) IRQ disablement [irq_chip::disable_irq()] shouldn't clear the interrupt
request flag for edge-triggered interrupts lest it lose an interrupt.
(3) IRQ unmasking [irq_chip::unmask_irq()] also shouldn't clear the interrupt
request flag for edge-triggered interrupts lest it lose an interrupt.
(4) The end() operation is now left to the default (no-operation) as
__do_IRQ() is compiled out. This may affect misrouted_irq(), but
according to Thomas Gleixner it's the correct thing to do.
(5) handle_level_irq() is used for edge-triggered interrupts rather than
handle_edge_irq() as the MN10300 PIC latches interrupt events even on
masked IRQ channels, thus rendering IRQ_PENDING unnecessary. It is
sufficient to call mask_ack() at the start and unmask() at the end.
(6) For level-triggered interrupts, ack() is now NULL as it's not used, and
there is no effective ACK function on the PIC. mask_ack() is now the
same as mask() as the latch continues to latch, even when the channel is
masked.
Further, the patch discards the disable() op implementation as its now the same
as the mask() op implementation, which is used instead.
It also discards the enable() op implementations as they're now the same as
the unmask() op implementations, which are used instead.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm mpath: add missing path switching locking
dm: cope with access beyond end of device in dm_merge_bvec
dm: always allow one page in dm_merge_bvec
... including some comments about the ordering required to bring
sparsemem up. You have to repeatedly guess, test, reguess, try
again and again to work out what the right ordering is. Many
hours later...
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide helpers for getting physical addresses or pfns from the
meminfo array, and use them. Move for_each_nodebank() to
asm/setup.h alongside the meminfo structure definition.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add support for detecting non-executable stack binaries, and adjust
permissions to prevent execution from data and stack areas. Also,
ensure that READ_IMPLIES_EXEC is enabled for older CPUs where that
is true, and for any executable-stack binary.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As of the previous commit, MT_DEVICE_IXP2000 encodes to the same
PTE bit encoding as MT_DEVICE, so it's now redundant. Convert
MT_DEVICE_IXP2000 to use MT_DEVICE instead, and remove its aliases.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide L_PTE_MT_xxx definitions to describe the memory types that we
use in Linux/ARM. These definitions are carefully picked such that:
1. their LSBs match what is required for pre-ARMv6 CPUs.
2. they all have a unique encoding, including after modification
by build_mem_type_table() (the result being that some have more
than one combination.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There are actually only four separate implementations of set_pte_ext.
Use assembler macros to insert code for these into the proc-*.S files.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The INIT perameter carries the adapatation value in network-byte
order. We need to store it in host byte order as expected
by data types and the user API.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
This patch enables cookie-echo retransmission transport switch
feature. If COOKIE-ECHO retransmission happens, it will be sent
to the address other than the one last sent to.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC3873 defined SCTP_MIB_OUTOFBLUES:
sctpOutOfBlues OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of out of the blue packets received by the host.
An out of the blue packet is an SCTP packet correctly formed,
including the proper checksum, but for which the receiver was
unable to identify an appropriate association."
REFERENCE
"Section 8.4 in RFC2960 deals with the Out-Of-The-Blue
(OOTB) packet definition and procedures."
But OOTB packet INIT, INIT-ACK and SHUTDOWN-ACK(COOKIE-WAIT or
COOKIE-ECHOED state) are not counted by SCTP_MIB_OUTOFBLUES.
Case 1(INIT):
Endpoint A Endpoint B
(CLOSED) (CLOSED)
INIT ---------->
<---------- ABORT
Case 2(INIT-ACK):
Endpoint A Endpoint B
(CLOSED) (CLOSED)
INIT-ACK ---------->
<---------- ABORT
Case 3(SHUTDOWN-ACK):
Endpoint A Endpoint B
(CLOSED) (CLOSED)
<---------- INIT
SHUTDOWN-ACK ---------->
<---------- SHUTDOWN-COMPLETE
Case 4(SHUTDOWN-ACK):
Endpoint A Endpoint B
(CLOSED) (COOKIE-ECHOED)
SHUTDOWN-ACK ---------->
<---------- SHUTDOWN-COMPLETE
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC 4960: Section 9.2
The sender of the SHUTDOWN MAY also start an overall guard timer
'T5-shutdown-guard' to bound the overall time for the shutdown
sequence. At the expiration of this timer, the sender SHOULD abort
the association by sending an ABORT chunk. If the 'T5-shutdown-
guard' timer is used, it SHOULD be set to the recommended value of 5
times 'RTO.Max'.
The timer 'T5-shutdown-guard' is used to counter the overall time
for shutdown sequence, and it's start by the sender of the SHUTDOWN.
So timer 'T5-shutdown-guard' should be start when we send the first
SHUTDOWN chunk and enter the SHUTDOWN-SENT state, not start when we
receipt of the SHUTDOWN primitive and enter SHUTDOWN-PENDING state.
If 'T5-shutdown-guard' timer is start at SHUTDOWN-PENDING state, the
association may be ABORT while data is still transmitting.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
sctp_is_any() function that is used to check for wildcard addresses
only looks at the address itself to determine the address family.
This function is used in the API to check the address passed in from
the user. If the user simply zerroes out the sockaddr_storage and
pass that in, we'll end up failing. So, let's try harder to determine
the address family by also checking the socket if it's possible.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
sctp_chunks should be put on a diet. This is some of the low hanging
fruit that we can strip out. Changes all the __s8/__u8 flags to
bitfields. Saves 12 bytes per chunk.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Chunks placed on the retransmit list are marked as inelegible
for fast retrasnmission. Since missing indications determine
when fast reransmission is done, there is not point in calling
sctp_mark_missing() on the retransmit list since those chunks
will not be marked.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
There is a possibility of walking the transport list twice during
SACK processing when doing SFR-CACC algorithm. We can restructure
the code to only do this once.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Frist small step in optimizing SACK processing. Do not call
sctp_mark_missing() when there are no gaps reported and thus
not missing chunks.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
The iptables tproxy code has to be able to do UDP socket hash lookups,
so we have to provide an exported lookup function for this purpose.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current TCP code relies on the local port of the listening socket
being the same as the destination address of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port address.
This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netfilter's ip_route_me_harder() tries to re-route packets either
generated or re-routed by Netfilter. This patch changes
ip_route_me_harder() to handle packets from non-locally-bound sockets
with IP_TRANSPARENT set as local and to set the appropriate flowi
flags when re-doing the routing lookup.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
incoming packets. The non-local source address check on output bites
us again, as replies for transparently redirected traffic won't have a
chance to leave the node.
This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing the
route lookup for those replies. Transparent replies are enabled if the
listening socket has the transparent socket flag set.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set FLOWI_FLAG_ANYSRC in flowi->flags if the socket has the
transparent socket option set. This way we selectively enable certain
connections with non-local source addresses to be routed.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet_iif() in inet_sock.h requires route.h. Since users of inet_iif()
usually require other route.h functionality anyway this patch moves
inet_iif() to route.h.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting IP_TRANSPARENT is not really useful without allowing non-local
binds for the socket. To make user-space code simpler we allow these
binds even if IP_TRANSPARENT is set but IP_FREEBIND is not.
Signed-off-by: Tóth László Attila <panther@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces the IP_TRANSPARENT socket option: enabling that
will make the IPv4 routing omit the non-local source address check on
output. Setting IP_TRANSPARENT requires NET_ADMIN capability.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.
This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the caller of pskb_expand_head specifies a negative nhead
we'll silently overwrite other people's memory. This patch
makes it BUG instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu came up with the idea and the original patch to make
xfrm_state dump list contain also dumpers:
As it is we go to extraordinary lengths to ensure that states
don't go away while dumpers go to sleep. It's much easier if
we just put the dumpers themselves on the list since they can't
go away while they're going.
I've also changed the order of addition on new states to prevent
a never-ending dump.
Timo Teräs improved the patch to apply cleanly to latest tree,
modified iteration code to be more readable by using a common
struct for entries in the list, implemented the same idea for
xfrm_policy dumping and moved the af_key specific "last" entry
caching to af_key.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moving the path activation to workqueue along with scsi_dh patches introduced
a race. It is due to the fact that the current_pgpath (in the multipath data
structure) can be modified if changes happen in any of the paths leading to
the lun. If the changes lead to current_pgpath being set to NULL, then it
leads to the invalid access which results in the panic below.
This patch fixes that by storing the pgpath to activate in the multipath data
structure and properly protecting it.
Note that if activate_path is called twice in succession with different pgpath,
with the second one being called before the first one is done, then activate
path will be called twice for the second pgpath, which is fine.
Unable to handle kernel paging request for data at address 0x00000020
Faulting instruction address: 0xd000000000aa1844
cpu 0x1: Vector: 300 (Data Access) at [c00000006b987a80]
pc: d000000000aa1844: .activate_path+0x30/0x218 [dm_multipath]
lr: c000000000087a2c: .run_workqueue+0x114/0x204
sp: c00000006b987d00
msr: 8000000000009032
dar: 20
dsisr: 40000000
current = 0xc0000000676bb3f0
paca = 0xc0000000006f3680
pid = 2528, comm = kmpath_handlerd
enter ? for help
[c00000006b987da0] c000000000087a2c .run_workqueue+0x114/0x204
[c00000006b987e40] c000000000088b58 .worker_thread+0x120/0x144
[c00000006b987f00] c00000000008ca70 .kthread+0x78/0xc4
[c00000006b987f90] c000000000027cc8 .kernel_thread+0x4c/0x68
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
If for any reason dm_merge_bvec() is given an offset beyond the end of the
device, avoid an oops and always allow one page to be added to an empty bio.
We'll reject the I/O later after the bio is submitted.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Some callers assume they can always add at least one page to an empty bio,
so dm_merge_bvec should not return 0 in this case: we'll reject the I/O
later after the bio is submitted.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Fix a xfrm_{state,policy}_walk leak if pfkey socket is closed while
dumping is on-going.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
This gives a nice increase in the maximum loss-free packet forwarding
rate in routing workloads.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds skb_recycle_check(), which can be used by a network
driver after transmitting an skb to check whether this skb can be
recycled as a receive buffer.
skb_recycle_check() checks that the skb is not shared or cloned, and
that it is linear and its head portion large enough (as determined by
the driver) to be recycled as a receive buffer. If these conditions
are met, it does any necessary reference count dropping and cleans
up the skbuff as if it just came from __alloc_skb().
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following actions are possible:
tcp_v6_rcv
skb->dev = NULL;
tcp_v6_do_rcv
tcp_v6_hnd_req
tcp_check_req
req->rsk_ops->send_ack == tcp_v6_send_ack
So, skb->dev can be NULL in tcp_v6_send_ack. We must obtain namespace
from dst entry.
Thanks to Vitaliy Gusev <vgusev@openvz.org> for initial problem finding
in IPv4 code.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>