linux-pinenote

Author	SHA1	Message	Date
David Howells	817913d8cd	af_rxrpc: Expose more RxRPC parameters via sysctls Expose RxRPC parameters via sysctls to control the Rx window size, the Rx MTU maximum size and the number of packets that can be glued into a jumbo packet. More info added to Documentation/networking/rxrpc.txt. Signed-off-by: David Howells <dhowells@redhat.com>	2014-02-26 17:25:07 +00:00
David Howells	9823f39a17	af_rxrpc: Improve ACK production Improve ACK production by the following means: (1) Don't send an ACK_REQUESTED ack immediately even if the RXRPC_MORE_PACKETS flag isn't set on a data packet that has also has RXRPC_REQUEST_ACK set. MORE_PACKETS just means that the sender just emptied its Tx data buffer. More data will be forthcoming unless RXRPC_LAST_PACKET is also flagged. It is possible to see runs of DATA packets with MORE_PACKETS unset that aren't waiting for an ACK. It is therefore better to wait a small instant to see if we can combine an ACK for several packets. (2) Don't send an ACK_IDLE ack immediately unless we're responding to the terminal data packet of a call. Whilst sending an ACK_IDLE mid-call serves to let the other side know that we won't be asking it to resend certain Tx buffers and that it can discard them, spamming it with loads of acks just because we've temporarily run out of data just distracts it. (3) Put the ACK_IDLE ack generation timeout up to half a second rather than a single jiffy. Just because we haven't been given more data immediately doesn't mean that more isn't forthcoming. The other side may be busily finding the data to send to us. Signed-off-by: David Howells <dhowells@redhat.com>	2014-02-26 17:25:07 +00:00
David Howells	5873c0834f	af_rxrpc: Add sysctls for configuring RxRPC parameters Add sysctls for configuring RxRPC protocol handling, specifically controls on delays before ack generation, the delay before resending a packet, the maximum lifetime of a call and the expiration times of calls, connections and transports that haven't been recently used. More info added in Documentation/networking/rxrpc.txt. Signed-off-by: David Howells <dhowells@redhat.com>	2014-02-26 17:25:06 +00:00
David Howells	6c9a2d3202	af_rxrpc: Fix UDP MTU calculation from ICMP_FRAG_NEEDED AF_RXRPC sends UDP packets with the "Don't Fragment" bit set in an attempt to determine the maximum packet size between the local socket and the peer by invoking the generation of ICMP_FRAG_NEEDED packets. Once a packet is sent with the "Don't Fragment" bit set, it is then inconvenient to break it up as that requires recalculating all the rxrpc serial and sequence numbers and reencrypting all the fragments, so we switch off the "Don't Fragment" service temporarily and send the bounced packet again. Future packets then use the new MTU. That's all fine. The problem lies in rxrpc_UDP_error_report() where the code that deals with ICMP_FRAG_NEEDED packets lives. Packets of this type have a field (ee_info) to indicate the maximum packet size at the reporting node - but sometimes ee_info isn't filled in and is just left as 0 and the code must allow for this. When ee_info is 0, the code should take the MTU size we're currently using and reduce it for the next packet we want to send. However, it takes ee_info (which is known to be 0) and tries to reduce that instead. This was discovered by Coverity. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>	2014-02-26 17:25:01 +00:00
Steffen Klassert	3a9016f97f	xfrm: Fix unlink race when policies are deleted. When a policy is unlinked from the lists in thread context, the xfrm timer can fire before we can mark this policy as dead. So reinitialize the bydst hlist, then hlist_unhashed() will notice that this policy is not linked and will avoid a doulble unlink of that policy. Reported-by: Xianpeng Zhao <673321875@qq.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-26 09:52:02 +01:00
Mike Pecovnik	46833a86f7	net: Fix permission check in netlink_connect() netlink_sendmsg() was changed to prevent non-root processes from sending messages with dst_pid != 0. netlink_connect() however still only checks if nladdr->nl_groups is set. This patch modifies netlink_connect() to check for the same condition. Signed-off-by: Mike Pecovnik <mike.pecovnik@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-25 18:35:14 -05:00
Hannes Frederic Sowa	91a48a2e85	ipv4: ipv6: better estimate tunnel header cut for correct ufo handling Currently the UFO fragmentation process does not correctly handle inner UDP frames. (The following tcpdumps are captured on the parent interface with ufo disabled while tunnel has ufo enabled, 2000 bytes payload, mtu 1280, both sit device): IPv6: 16:39:10.031613 IP (tos 0x0, ttl 64, id 3208, offset 0, flags [DF], proto IPv6 (41), length 1300) 192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 1240) 2001::1 > 2001::8: frag (0x00000001:0\|1232) 44883 > distinct: UDP, length 2000 16:39:10.031709 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPv6 (41), length 844) 192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 784) 2001::1 > 2001::8: frag (0x00000001:0\|776) 58979 > 46366: UDP, length 5471 We can see that fragmentation header offset is not correctly updated. (fragmentation id handling is corrected by `916e4cf46d` ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")). IPv4: 16:39:57.737761 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPIP (4), length 1296) 192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57034, offset 0, flags [none], proto UDP (17), length 1276) 192.168.99.1.35961 > 192.168.99.2.distinct: UDP, length 2000 16:39:57.738028 IP (tos 0x0, ttl 64, id 3210, offset 0, flags [DF], proto IPIP (4), length 792) 192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57035, offset 0, flags [none], proto UDP (17), length 772) 192.168.99.1.13531 > 192.168.99.2.20653: UDP, length 51109 In this case fragmentation id is incremented and offset is not updated. First, I aligned inet_gso_segment and ipv6_gso_segment: * align naming of flags * ipv6_gso_segment: setting skb->encapsulation is unnecessary, as we always ensure that the state of this flag is left untouched when returning from upper gso segmenation function * ipv6_gso_segment: move skb_reset_inner_headers below updating the fragmentation header data, we don't care for updating fragmentation header data * remove currently unneeded comment indicating skb->encapsulation might get changed by upper gso_segment callback (gre and udp-tunnel reset encapsulation after segmentation on each fragment) If we encounter an IPIP or SIT gso skb we now check for the protocol == IPPROTO_UDP and that we at least have already traversed another ip(6) protocol header. The reason why we have to special case GSO_IPIP and GSO_SIT is that we reset skb->encapsulation to 0 while skb_mac_gso_segment the inner protocol of GSO_UDP_TUNNEL or GSO_GRE packets. Reported-by: Wolfgang Walter <linux@stwm.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-25 18:27:06 -05:00
Johan Hedberg	a9a58f8612	Bluetooth: Ignore IRKs with no Identity Address The Core Specification (4.1) leaves room for sending an SMP Identity Address Information PDU with an all-zeros BD_ADDR value. This essentially means that we would not have an Identity Address for the device and the only means of identifying it would be the IRK value itself. Due to lack of any known implementations behaving like this it's best to keep our implementation as simple as possible as far as handling such situations is concerned. This patch updates the Identity Address Information handler function to simply ignore the IRK received from such a device. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-25 12:30:41 -08:00
Johan Hedberg	a4858cb942	Bluetooth: Fix advertising address type when toggling connectable When the connectable setting is toggled using mgmt_set_connectable the HCI_CONNECTABLE flag will only be set once the related HCI commands succeed. When determining what kind of advertising to do we need to therefore also check whether there is a pending Set Connectable command in addition to the current flag value. The enable_advertising function was already taking care of this for the advertising type with the help of the get_adv_type function, but was failing to do the same for the address type selection. This patch converts the get_adv_type function to be more generic in that it returns the expected connectable state and updates the enable_advertising function to use the return value both for the advertising type as well as the advertising address type. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-25 10:02:53 -08:00
Andrzej Kaczmarek	ede81a2a12	Bluetooth: Fix NULL pointer dereference when sending data When trying to allocate skb for new PDU, l2cap_chan is unlocked so we can sleep waiting for memory as otherwise there's possible deadlock as fixed in `e454c84464`. However, in `a6a5568c03` lock was moved from socket to channel level and it's no longer safe to just unlock and lock again without checking l2cap_chan state since channel can be disconnected when lock is not held. This patch adds missing checks for l2cap_chan state when returning from call which allocates skb. Scenario is easily reproducible by running rfcomm-tester in a loop. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth] PGD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 7 PID: 4038 Comm: krfcommd Not tainted 3.14.0-rc2+ #15 Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A10 11/24/2011 task: ffff8802bdd731c0 ti: ffff8801ec986000 task.ti: ffff8801ec986000 RIP: 0010:[<ffffffffa0442169>] [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 RSP: 0018:ffff8801ec987ad8 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff8800c5796800 RCX: 0000000000000000 RDX: ffff880410e7a800 RSI: ffff8802b6c1da00 RDI: ffff8800c5796800 RBP: ffff8801ec987af8 R08: 00000000000000c0 R09: 0000000000000300 R10: 000000000000573b R11: 000000000000573a R12: ffff8802b6c1da00 R13: 0000000000000000 R14: ffff8802b6c1da00 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000041257c000 CR4: 00000000000407e0 Stack: ffff8801ec987d78 ffff8800c5796800 ffff8801ec987d78 0000000000000000 ffff8801ec987ba8 ffffffffa0449e37 0000000000000004 ffff8801ec987af0 ffff8801ec987d40 0000000000000282 0000000000000000 ffffffff00000004 Call Trace: [<ffffffffa0449e37>] l2cap_chan_send+0xaa7/0x1120 [bluetooth] [<ffffffff81770100>] ? _raw_spin_unlock_bh+0x20/0x40 [<ffffffffa045188b>] l2cap_sock_sendmsg+0xcb/0x110 [bluetooth] [<ffffffff81652b0f>] sock_sendmsg+0xaf/0xc0 [<ffffffff810a8381>] ? update_curr+0x141/0x200 [<ffffffff810a8961>] ? dequeue_entity+0x181/0x520 [<ffffffff81652b60>] kernel_sendmsg+0x40/0x60 [<ffffffffa04a8505>] rfcomm_send_frame+0x45/0x70 [rfcomm] [<ffffffff810766f0>] ? internal_add_timer+0x20/0x50 [<ffffffffa04a8564>] rfcomm_send_cmd+0x34/0x60 [rfcomm] [<ffffffffa04a8605>] rfcomm_send_disc+0x75/0xa0 [rfcomm] [<ffffffffa04aacec>] rfcomm_run+0x8cc/0x1a30 [rfcomm] [<ffffffffa04aa420>] ? rfcomm_check_accept+0xc0/0xc0 [rfcomm] [<ffffffff8108e3a9>] kthread+0xc9/0xe0 [<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0 [<ffffffff817795fc>] ret_from_fork+0x7c/0xb0 [<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0 Code: 00 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 f6 05 d6 a3 02 00 04 RIP [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth] RSP <ffff8801ec987ad8> CR2: 0000000000000000 Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-25 10:02:53 -08:00
Ilan Peer	7c8d5e03ac	cfg80211: send stop AP event only due to internal reason Commit "nl80211: send event when AP operation is stopped" added an event to notify user space that an AP interface has been stopped, to handle cases such as suspend etc. The event is sent regardless if the stop AP flow was triggered by user space or due to internal state change. This might cause issues with wpa_supplicant/hostapd flows that consider stop AP flow as a synchronous one, e.g., AP/GO channel change in the absence of CSA support. In such cases, the flow will restart the AP immediately after the stop AP flow is done, and only handle the stop AP event after the current flow is done, and as a result stop the AP again. Change the current implementation to only send the event in case the stop AP was triggered due to an internal reason. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-25 17:34:18 +01:00
Janusz Dziedzic	31559f35c5	cfg80211: DFS get CAC time from regulatory database Send Channel Availability Check time as a parameter of start_radar_detection() callback. Get CAC time from regulatory database. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-25 17:32:54 +01:00
Janusz Dziedzic	089027e57c	cfg80211: regulatory: allow getting DFS CAC time from userspace Introduce DFS CAC time as a regd param, configured per REG_RULE and set per channel in cfg80211. DFS CAC time is close connected with regulatory database configuration. Instead of using hardcoded values, get DFS CAC time form regulatory database. Pass DFS CAC time to user mode (mainly for iw reg get, iw list, iw info). Allow setting DFS CAC time via CRDA. Add support for internal regulatory database. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> [rewrap commit log] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-25 17:29:25 +01:00
Janusz Dziedzic	fb5c96368f	cfg80211: regulatory: allow user to set world regdomain Allow to set world regulatory domain in case of user request (iw reg set 00). Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-25 16:27:45 +01:00
Janusz Dziedzic	092008abee	cfg80211: regulatory: reset regdomain in case of error Reset regdomain to world regdomain in case of errors in set_regdom() function. This will fix a problem with such scenario: - iw reg set US - iw reg set 00 - iw reg set US The last step always fail and we get deadlock in kernel regulatory code. Next setting new regulatory wasn't possible due to: Pending regulatory request, waiting for it to be processed... Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-25 16:27:04 +01:00
Johannes Berg	1226d25870	cfg80211: regulatory: simplify uevent sending There's no need for the struct device_type with the uevent function etc., just fill the country alpha2 when sending the event. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-25 15:44:44 +01:00
Florian Westphal	39111fd261	netfilter: nfnetlink_log: remove unused code Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-25 11:30:01 +01:00
Patrick McHardy	e0abdadcc6	netfilter: nf_tables: accept QUEUE/DROP verdict parameters Allow userspace to specify the queue number or the errno code for QUEUE and DROP verdicts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-25 11:29:26 +01:00
Patrick McHardy	67a8fc27cc	netfilter: nf_tables: add nft_dereference() macro Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-25 11:29:23 +01:00
Patrick McHardy	0eb5db7ad3	netfilter: nfnetlink: add rcu_dereference_protected() helpers Add a lockdep_nfnl_is_held() function and a nfnl_dereference() macro for RCU dereferences protected by a NFNL subsystem mutex. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-25 11:29:21 +01:00
Patrick McHardy	3e90ebd3c9	netfilter: ip_set: rename nfnl_dereference()/nfnl_set() The next patch will introduce a nfnl_dereference() macro that actually checks that the appropriate mutex is held and therefore needs a subsystem argument. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-25 11:29:18 +01:00
Steffen Klassert	895de9a348	vti4: Enable namespace changing vti4 is now fully namespace aware, so allow namespace changing for vti devices Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:19 +01:00
Steffen Klassert	6e2de802af	vti4: Check the tunnel endpoints of the xfrm state and the vti interface The tunnel endpoints of the xfrm_state we got from the xfrm_lookup must match the tunnel endpoints of the vti interface. This patch ensures this matching. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:19 +01:00
Steffen Klassert	78a010cca0	vti4: Support inter address family tunneling. With this patch we can tunnel ipv6 traffic via a vti4 interface. A vti4 interface can now have an ipv6 address and ipv6 traffic can be routed via a vti4 interface. The resulting traffic is xfrm transformed and tunneled throuhg ipv4 if matching IPsec policies and states are present. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:19 +01:00
Steffen Klassert	a34cd4f319	vti4: Use the on xfrm_lookup returned dst_entry directly We need to be protocol family indepenent to support inter addresss family tunneling with vti. So use a dst_entry instead of the ipv4 rtable in vti_tunnel_xmit. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	9994bb8e1e	xfrm4: Remove xfrm_tunnel_notifier This was used from vti and is replaced by the IPsec protocol multiplexer hooks. It is now unused, so remove it. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	df3893c176	vti: Update the ipv4 side to use it's own receive hook. With this patch, vti uses the IPsec protocol multiplexer to register it's own receive side hooks for ESP, AH and IPCOMP. Vti now does the following on receive side: 1. Do an input policy check for the IPsec packet we received. This is required because this packet could be already prosecces by IPsec, so an inbuond policy check is needed. 2. Mark the packet with the i_key. The policy and the state must match this key now. Policy and state belong to the outer namespace and policy enforcement is done at the further layers. 3. Call the generic xfrm layer to do decryption and decapsulation. 4. Wait for a callback from the xfrm layer to properly clean the skb to not leak informations on namespace and to update the device statistics. On transmit side: 1. Mark the packet with the o_key. The policy and the state must match this key now. 2. Do a xfrm_lookup on the original packet with the mark applied. 3. Check if we got an IPsec route. 4. Clean the skb to not leak informations on namespace transitions. 5. Attach the dst_enty we got from the xfrm_lookup to the skb. 6. Call dst_output to do the IPsec processing. 7. Do the device statistics. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	6d608f06e3	ip_tunnel: Make vti work with i_key set Vti uses the o_key to mark packets that were transmitted or received by a vti interface. Unfortunately we can't apply different marks to in and outbound packets with only one key availabe. Vti interfaces typically use wildcard selectors for vti IPsec policies. On forwarding, the same output policy will match for both directions. This generates a loop between the IPsec gateways until the ttl of the packet is exceeded. The gre i_key/o_key are usually there to find the right gre tunnel during a lookup. When vti uses the i_key to mark packets, the tunnel lookup does not work any more because vti does not use the gre keys as a hash key for the lookup. This patch workarounds this my not including the i_key when comupting the hash for the tunnel lookup in case of vti tunnels. With this we have separate keys available for the transmitting and receiving side of the vti interface. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	70be6c91c8	xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer IPsec vti_rcv needs to remind the tunnel pointer to check it later at the vti_rcv_cb callback. So add this pointer to the IPsec common buffer, initialize it and check it to avoid transport state matching of a tunneled packet. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	d099160e02	ipcomp4: Use the IPsec protocol multiplexer API Switch ipcomp4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	e5b56454e0	ah4: Use the IPsec protocol multiplexer API Switch ah4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	827789cbd7	esp4: Use the IPsec protocol multiplexer API Switch esp4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	3328715e6c	xfrm4: Add IPsec protocol multiplexer This patch add an IPsec protocol multiplexer. With this it is possible to add alternative protocol handlers as needed for IPsec virtual tunnel interfaces. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:16 +01:00
Joe Perches	0409114282	bridge: netfilter: Use ether_addr_copy Convert the uses of memcpy to ether_addr_copy because for some architectures it is smaller and faster. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 19:16:44 -05:00
Joe Perches	e5a727f663	bridge: Use ether_addr_copy and ETH_ALEN Convert the more obvious uses of memcpy to ether_addr_copy. There are still uses of memcpy that could be converted but these addresses are __aligned(2). Convert a couple uses of 6 in gr_private.h to ETH_ALEN. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 19:16:43 -05:00
Eric Dumazet	d10473d4e3	tcp: reduce the bloat caused by tcp_is_cwnd_limited() tcp_is_cwnd_limited() allows GSO/TSO enabled flows to increase their cwnd to allow a full size (64KB) TSO packet to be sent. Non GSO flows only allow an extra room of 3 MSS. For most flows with a BDP below 10 MSS, this results in a bloat of cwnd reaching 90, and an inflate of RTT. Thanks to TSO auto sizing, we can restrict the bloat to the number of MSS contained in a TSO packet (tp->xmit_size_goal_segs), to keep original intent without performance impact. Because we keep cwnd small, it helps to keep TSO packet size to their optimal value. Example for a 10Mbit flow, with low TCP Small queue limits (no more than 2 skb in qdisc/device tx ring) Before patch : lpk51:~# ./ss -i dst lpk52:44862 \| grep cwnd cubic wscale:6,6 rto:215 rtt:15.875/2.5 mss:1448 cwnd:96 ssthresh:96 send 70.1Mbps unacked:14 rcv_space:29200 After patch : lpk51:~# ./ss -i dst lpk52:52916 \| grep cwnd cubic wscale:6,6 rto:206 rtt:5.206/0.036 mss:1448 cwnd:15 ssthresh:14 send 33.4Mbps unacked:4 rcv_space:29200 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Van Jacobson <vanj@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 19:13:38 -05:00
Mathias Krause	72f8e06f3e	pktgen: document all supported flags The documentation misses a few of the supported flags. Fix this. Also respect the dependency to CONFIG_XFRM for the IPSEC flag. Cc: Fan Du <fan.du@windriver.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:54:26 -05:00
Mathias Krause	0945574750	pktgen: simplify error handling in pgctrl_write() The 'out' label is just a relict from previous times as pgctrl_write() had multiple error paths. Get rid of it and simply return right away on errors. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:54:26 -05:00
Mathias Krause	20b0c718c3	pktgen: fix out-of-bounds access in pgctrl_write() If a privileged user writes an empty string to /proc/net/pktgen/pgctrl the code for stripping the (then non-existent) '\n' actually writes the zero byte at index -1 of data[]. The then still uninitialized array will very likely fail the command matching tests and the pr_warning() at the end will therefore leak stack bytes to the kernel log. Fix those issues by simply ensuring we're passed a non-empty string as the user API apparently expects a trailing '\n' for all commands. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:54:25 -05:00
David S. Miller	1f5a7407e4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== 1) Introduce skb_to_sgvec_nomark function to add further data to the sg list without calling sg_unmark_end first. Needed to add extended sequence number informations. From Fan Du. 2) Add IPsec extended sequence numbers support to the Authentication Header protocol for ipv4 and ipv6. From Fan Du. 3) Make the IPsec flowcache namespace aware, from Fan Du. 4) Avoid creating temporary SA for every packet when no key manager is registered. From Horia Geanta. 5) Support filtering of SA dumps to show only the SAs that match a given filter. From Nicolas Dichtel. 6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles are never used, instead we create a new cache entry whenever xfrm_lookup() is called on a socket policy. Most protocols cache the used routes to the socket, so this caching is not needed. 7) Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas Dichtel. 8) Cleanup error handling of xfrm_state_clone. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:13:33 -05:00
Frederic Weisbecker	c46fff2a3b	smp: Rename __smp_call_function_single() to smp_call_function_single_async() The name __smp_call_function_single() doesn't tell much about the properties of this function, especially when compared to smp_call_function_single(). The comments above the implementation are also misleading. The main point of this function is actually not to be able to embed the csd in an object. This is actually a requirement that result from the purpose of this function which is to raise an IPI asynchronously. As such it can be called with interrupts disabled. And this feature comes at the cost of the caller who then needs to serialize the IPIs on this csd. Lets rename the function and enhance the comments so that they reflect these properties. Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-02-24 14:47:15 -08:00
Frederic Weisbecker	fce8ad1568	smp: Remove wait argument from __smp_call_function_single() The main point of calling __smp_call_function_single() is to send an IPI in a pure asynchronous way. By embedding a csd in an object, a caller can send the IPI without waiting for a previous one to complete as is required by smp_call_function_single() for example. As such, sending this kind of IPI can be safe even when irqs are disabled. This flexibility comes at the expense of the caller who then needs to synchronize the csd lifecycle by himself and make sure that IPIs on a single csd are serialized. This is how __smp_call_function_single() works when wait = 0 and this usecase is relevant. Now there don't seem to be any usecase with wait = 1 that can't be covered by smp_call_function_single() instead, which is safer. Lets look at the two possible scenario: 1) The user calls __smp_call_function_single(wait = 1) on a csd embedded in an object. It looks like a nice and convenient pattern at the first sight because we can then retrieve the object from the IPI handler easily. But actually it is a waste of memory space in the object since the csd can be allocated from the stack by smp_call_function_single(wait = 1) and the object can be passed an the IPI argument. Besides that, embedding the csd in an object is more error prone because the caller must take care of the serialization of the IPIs for this csd. 2) The user calls __smp_call_function_single(wait = 1) on a csd that is allocated on the stack. It's ok but smp_call_function_single() can do it as well and it already takes care of the allocation on the stack. Again it's more simple and less error prone. Therefore, using the underscore prepend API version with wait = 1 is a bad pattern and a sign that the caller can do safer and more simple. There was a single user of that which has just been converted. So lets remove this option to discourage further users. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-02-24 14:47:09 -08:00
John W. Linville	db18014f65	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-02-24 15:04:35 -05:00
Johan Hedberg	8b064a3ad3	Bluetooth: Clean up HCI state when doing power off To be friendly to user space and to behave well with controllers that lack a proper internal power off procedure we should try to clean up as much state as possible before requesting the HCI driver to power off. This patch updates the power off procedure that's triggered by mgmt_set_powered to clean any scan modes, stop LE scanning and advertising and to disconnect any open connections. The asynchronous cleanup procedure uses the HCI request framework, however since HCI_Disconnect is only covered until its Command Status event we need some extra tracking/waiting of disconnections. This is done by monitoring when hci_conn_count() indicates that there are no more connections. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:10:36 -08:00
Johan Hedberg	7c4cfab808	Bluetooth: Don't clear HCI_ADVERTISING when powering off Once mgmt_set_powered(off) is updated to clear the scan mode we should not just blindly clear the HCI_ADVERTISING flag in mgmt_advertising() but first check if there is a pending set_powered operation. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:10:36 -08:00
Johan Hedberg	ce3f24cfb2	Bluetooth: Don't clear HCI_CONNECTABLE when powering off Once mgmt_set_powered(off) is updated to clear the scan mode we should not just blindly clear the HCI_CONNECTABLE flag in mgmt_connectable() but first check if there is a pending set_powered operation. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:10:36 -08:00
Johan Hedberg	bd10799933	Bluetooth: Don't clear HCI_DISCOVERABLE when powering off Once mgmt_set_powered(off) is updated to clear the scan mode we should not just blindly clear the HCI_DISCOVERABLE flag in mgmt_discoverable() but first check if there is a pending set_powered operation. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:10:36 -08:00
Johan Hedberg	12d4a3b2cc	Bluetooth: Move check for MGMT_CONNECTED flag into mgmt.c Once mgmt_set_powered(off) starts doing disconnections we'll need to care about any disconnections in mgmt.c and not just those with the MGMT_CONNECTED flag set. Therefore, move the check into mgmt.c from hci_event.c. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:10:36 -08:00
Johan Hedberg	778b235a3b	Bluetooth: Move HCI_ADVERTISING handling into mgmt.c We'll soon need to make decisions on toggling the HCI_ADVERTISING flag based on pending mgmt_set_powered commands. Therefore, move the handling from hci_event.c into mgmt.c. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:10:36 -08:00
Johan Hedberg	4518bb0fb5	Bluetooth: Fix canceling RPA expiry timer The RPA expiry timer is only initialized inside mgmt.c when we receive the first command from user space. This action also involves setting the HCI_MGMT flag for the first time so that flag acts as a good indicator of whether the delayed work variable can be touched or not. This patch fixes hci_dev_do_close to first check the flag. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-02-24 11:05:26 -08:00

... 41 42 43 44 45 ...

33,976 commits