linux-pinenote

Author	SHA1	Message	Date
Johannes Berg	dde7dc759b	Merge remote-tracking branch 'mac80211/master' into mac80211-next	2013-05-25 00:01:30 +02:00
Ashok Nagarajan	b422c6cd7e	{cfg,mac}80211: move mandatory rates calculation to cfg80211 Move mandatory rates calculation to cfg80211, shared with non mac80211 drivers. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> [extend documentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 23:54:43 +02:00
Ashok Nagarajan	d4a5a48976	mac80211: Move mesh estab_plinks outside mesh_stats debug group As estab_plinks is not a statistics member, don't show its debug information along with other mesh stat members Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 23:18:42 +02:00
Jouni Malinen	5e4b6f5698	cfg80211: Allow TDLS peer AID to be configured for VHT VHT uses peer AID in the PARTIAL_AID field in TDLS frames. The current design for TDLS is to first add a dummy STA entry before completing TDLS Setup and then update information on this STA entry based on what was received from the peer during the setup exchange. In theory, this could use NL80211_ATTR_STA_AID to set the peer AID just like this is used in AP mode to set the AID of an association station. However, existing cfg80211 validation rules prevent this attribute from being used with set_station operation. To avoid interoperability issues between different kernel and user space version combinations, introduce a new nl80211 attribute for the purpose of setting TDLS peer AID. This attribute can be used in both the new_station and set_station operations. It is not supposed to be allowed to change the AID value during the lifetime of the STA entry, but that validation is left for drivers to do in the change_station callback. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 22:36:28 +02:00
Linus Torvalds	eb3d33900a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "It's been a while since my last pull request so quite a few fixes have piled up." Indeed. 1) Fix nf_{log,queue} compilation with PROC_FS disabled, from Pablo Neira Ayuso. 2) Fix data corruption on some tg3 chips with TSO enabled, from Michael Chan. 3) Fix double insertion of VLAN tags in be2net driver, from Sarveshwar Bandi. 4) Don't have TCP's MD5 support pass > PAGE_SIZE page offsets in scatter-gather entries into the crypto layer, the crypto layer can't handle that. From Eric Dumazet. 5) Fix lockdep splat in 802.1Q MRP code, also from Eric Dumazet. 6) Fix OOPS in netfilter log module when called from conntrack, from Hans Schillstrom. 7) FEC driver needs to use netif_tx_{lock,unlock}_bh() rather than the non-BH disabling variants. From Fabio Estevam. 8) TCP GSO can generate out-of-order packets, fix from Eric Dumazet. 9) vxlan driver doesn't update 'used' field of fdb entries when it should, from Sridhar Samudrala. 10) ipv6 should use kzalloc() to allocate inet6 socket cork options, otherwise we can OOPS in ip6_cork_release(). From Eric Dumazet. 11) Fix races in bonding set mode, from Nikolay Aleksandrov. 12) Fix checksum generation regression added by "r8169: fix 8168evl frame padding.", from Francois Romieu. 13) ip_gre can look at stale SKB data pointer, fix from Eric Dumazet. 14) Fix checksum handling when GSO is enabled in bnx2x driver with certain chips, from Yuval Mintz. 15) Fix double free in batman-adv, from Martin Hundebøll. 16) Fix device startup synchronization with firmware in tg3 driver, from Nithin Sujit. 17) perf networking dropmonitor doesn't work at all due to mixed up trace parameter ordering, from Ben Hutchings. 18) Fix proportional rate reduction handling in tcp_ack(), from Nandita Dukkipati. 19) IPSEC layer doesn't return an error when a valid state is detected, causing an OOPS. Fix from Timo Teräs. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits) be2net: bug fix on returning an invalid nic descriptor tcp: xps: fix reordering issues net: Revert unused variable changes. xfrm: properly handle invalid states as an error virtio_net: enable napi for all possible queues during open tcp: bug fix in proportional rate reduction. net: ethernet: sun: drop unused variable net: ethernet: korina: drop unused variable net: ethernet: apple: drop unused variable qmi_wwan: Added support for Cinterion's PLxx WWAN Interface perf: net_dropmonitor: Remove progress indicator perf: net_dropmonitor: Use bisection in symbol lookup perf: net_dropmonitor: Do not assume ordering of dictionaries perf: net_dropmonitor: Fix symbol-relative addresses perf: net_dropmonitor: Fix trace parameter order net: fec: use a more proper compatible string for MVF type device qlcnic: Fix updating netdev->features qlcnic: remove netdev->trans_start updates within the driver qlcnic: Return proper error codes from probe failure paths tg3: Update version to 3.132 ...	2013-05-24 08:27:32 -07:00
Oleksij Rempel	786677d100	mac80211: add STBC flag for radiotap Some chips can tell us if received frame was encoded with STBC or not. To make this information available in user space we can use updated radiotap specification: http://www.radiotap.org/defined-fields/MCS This patch will set number of STBC encoded spatial streams (Nss). The HAVE_STBC flag should be provided by driver. Signed-off-by: Oleksij Rempel <linux@rempel-privat.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 12:07:25 +02:00
Vlad Yasevich	161f65ba35	bridge: Set vlan_features to allow offloads on vlans. When vlan device is configured on top of the brige, it does not support any offload capabilities because the bridge device does not initiliaze vlan_fatures. Set vlan_fatures to be equivalent to hw_fatures. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 18:54:30 -07:00
Eric Dumazet	547669d483	tcp: xps: fix reordering issues commit `3853b5841c` ("xps: Improvements in TX queue selection") introduced ooo_okay flag, but the condition to set it is slightly wrong. In our traces, we have seen ACK packets being received out of order, and RST packets sent in response. We should test if we have any packets still in host queue. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 18:29:20 -07:00
Johannes Berg	4c8a9d4bfa	mac80211: close AP_VLAN interfaces before unregistering all Since Eric's commit `efe117ab8` ("Speedup ieee80211_remove_interfaces") there's a bug in mac80211 when it unregisters with AP_VLAN interfaces up. If the AP_VLAN interface was registered after the AP it belongs to (which is the typical case) and then we get into this code path, unregister_netdevice_many() will crash because it isn't prepared to deal with interfaces being closed in the middle of it. Exactly this happens though, because we iterate the list, find the AP master this AP_VLAN belongs to and dev_close() the dependent VLANs. After this, unregister_netdevice_many() won't pick up the fact that the AP_VLAN is already down and will do it again, causing a crash. Cc: stable@vger.kernel.org [2.6.33+] Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 01:06:09 +02:00
Johannes Berg	5f38a11274	mac80211: assign AP_VLAN hw queues correctly A lot of code in mac80211 assumes that the hw queues are set up correctly for all interfaces (except for monitor) but this isn't true for AP_VLAN interfaces. Fix this by copying the AP master configuration when an AP VLAN is brought up, after this the AP interface can't change its configuration any more and needs to be brought down to change it, which also forces AP_VLAN interfaces down, so just copying in open() is sufficient. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 23:11:33 +02:00
Felix Fietkau	4325d724cd	cfg80211: fix reporting 64-bit station info tx bytes Copy & paste mistake - STATION_INFO_TX_BYTES64 is the name of the flag, not NL80211_STA_INFO_TX_BYTES64. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 22:08:18 +02:00
Johannes Berg	2b436312f0	mac80211: fix queue handling crash The code I added in "mac80211: don't start new netdev queues if driver stopped" crashes for monitor and AP VLAN interfaces because while they have a netdev, they don't have queues set up by the driver. To fix the crash, exclude these from queue accounting here and just start their netdev queues unconditionally. For monitor, this is the best we can do, as we can redirect frames there to any other interface and don't know which one that will since it can be different for each frame. For AP VLAN interfaces, we can do better later and actually properly track the queue status. Not doing this is really a separate bug though. Reported-by: Ilan Peer <ilan.peer@intel.com> Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 21:04:38 +02:00
Johannes Berg	c815797663	cfg80211: check wdev->netdev in connection work If a P2P-Device is present and another virtual interface triggers the connection work, the system crash because it tries to check if the P2P-Device's netdev (which doesn't exist) is up. Skip any wdevs that have no netdev to fix this. Cc: stable@vger.kernel.org Reported-by: YanBo <dreamfly281@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 18:12:38 +02:00
Chen Gang	4f36ea6eed	netfilter: ipt_ULOG: fix non-null terminated string in the nf_log path If nf_log uses ipt_ULOG as logging output, we can deliver non-null terminated strings to user-space since the maximum length of the prefix that is passed by nf_log is NF_LOG_PREFIXLEN but pm->prefix is 32 bytes long (ULOG_PREFIX_LEN). This is actually happening already from nf_conntrack_tcp if ipt_ULOG is used, since it is passing strings longer than 32 bytes. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 14:25:40 +02:00
Simon Horman	a38e5e230e	ipvs: use cond_resched_rcu() helper when walking connections This avoids the situation where walking of a large number of connections may prevent scheduling for a long time while also avoiding excessive calls to rcu_read_unlock() and rcu_read_lock(). Note that in the case of !CONFIG_PREEMPT_RCU this will add a call to cond_resched(). Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 14:23:18 +02:00
Pablo Neira Ayuso	de94c4591b	netfilter: {ipt,ebt}_ULOG: rise warning on deprecation This target has been superseded by NFLOG. Spot a warning so we prepare removal in a couple of years. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-23 14:23:16 +02:00
Pablo Neira Ayuso	6d11cfdba5	netfilter: don't panic on error while walking through the init path Don't panic if we hit an error while adding the nf_log or pernet netfilter support, just bail out. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-23 14:22:30 +02:00
Florian Westphal	2a7851bffb	netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6 Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812: [ ip6tables -m addrtype ] When I tried to use in the nat/PREROUTING it messes up the routing cache even if the rule didn't matched at all. [..] If I remove the --limit-iface-in from the non-working scenario, so just use the -m addrtype --dst-type LOCAL it works! This happens when LOCAL type matching is requested with --limit-iface-in, and the default ipv6 route is via the interface the packet we test arrived on. Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation creates an unwanted cached entry, and the packet won't make it to the real/expected destination. Silently ignoring --limit-iface-in makes the routing work but it breaks rule matching (--dst-type LOCAL with limit-iface-in is supposed to only match if the dst address is configured on the incoming interface; without --limit-iface-in it will match if the address is reachable via lo). The test should call ipv6_chk_addr() instead. However, this would add a link-time dependency on ipv6. There are two possible solutions: 1) Revert the commit that moved ipt_addrtype to xt_addrtype, and put ipv6 specific code into ip6t_addrtype. 2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions. While the former might seem preferable, Pablo pointed out that there are more xt modules with link-time dependeny issues regarding ipv6, so lets go for 2). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:58:55 +02:00
Chen Gang	8bc14d25ff	bridge: netfilter: using strlcpy() instead of strncpy() 'name' has already set all zero when it is defined, so not need let strncpy() to pad it again. 'name' is a string, better always let is NUL terminated, so use strlcpy() instead of strncpy(). Signed-off-by: Chen Gang <gang.chen@asianux.com> Acked-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:24:57 +02:00
Eric Dumazet	00028aa370	netfilter: xt_socket: use IP early demux With IP early demux added in linux-3.6, we perform TCP lookup in IP layer before iptables hooks. We can avoid doing a second lookup in xt_socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:09:53 +02:00
Eric Dumazet	27e7190efd	netfilter: xt_CT: optimize XT_CT_NOTRACK The percpu untracked ct are not currently used for XT_CT_NOTRACK. xt_ct_tg_check()/xt_ct_target() provides a single ct. Thats not optimal as the ct->ct_general.use cache line will bounce among cpus. Use the intended [1] thing : xt_ct_target() should select the percpu object. [1] Refs : commit `5bfddbd46a` ("netfilter: nf_conntrack: IPS_UNTRACKED bit") commit `b3c5163fe0` ("netfilter: nf_conntrack: per_cpu untracking") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:09:29 +02:00
Timo Teräs	497574c72c	xfrm: properly handle invalid states as an error The error exit path needs err explicitly set. Otherwise it returns success and the only caller, xfrm_output_resume(), would oops in skb_dst(skb)->ops derefence as skb_dst(skb) is NULL. Bug introduced in commit `bb65a9cb` (xfrm: removes a superfluous check and add a statistic). Signed-off-by: Timo Teräs <timo.teras@iki.fi> Cc: Li RongQing <roy.qing.li@gmail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 01:20:07 -07:00
Cong Wang	8892475386	ipv6: use ipv6_addr_scope() helper ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK could be replaced by ipv6_addr_scope(), which is slightly faster. Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 01:17:47 -07:00
Cong Wang	7996c799ae	ipv6: use ipv6_addr_any() helper ipv6_addr_any() is a faster way to determine if an addr is ipv6 any addr, no need to compute the addr type. Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 01:17:47 -07:00
Nandita Dukkipati	35f079ebbc	tcp: bug fix in proportional rate reduction. This patch is a fix for a bug triggering newly_acked_sacked < 0 in tcp_ack(.). The bug is triggered by sacked_out decreasing relative to prior_sacked, but packets_out remaining the same as pior_packets. This is because the snapshot of prior_packets is taken after tcp_sacktag_write_queue() while prior_sacked is captured before tcp_sacktag_write_queue(). The problem is: tcp_sacktag_write_queue (tcp_match_skb_to_sack() -> tcp_fragment) adjusts the pcount for packets_out and sacked_out (MSS change or other reason). As a result, this delta in pcount is reflected in (prior_sacked - sacked_out) but not in (prior_packets - packets_out). This patch does the following: 1) initializes prior_packets at the start of tcp_ack() so as to capture the delta in packets_out created by tcp_fragment. 2) introduces a new "previous_packets_out" variable that snapshots packets_out right before tcp_clean_rtx_queue, so pkts_acked can be correctly computed as before. 3) Computes pkts_acked using previous_packets_out, and computes newly_acked_sacked using prior_packets. Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 00:10:09 -07:00
Eric Dumazet	e43ac79a4b	sch_tbf: segment too big GSO packets If a GSO packet has a length above tbf burst limit, the packet is currently silently dropped. Current way to handle this is to set the device in non GSO/TSO mode, or setting high bursts, and its sub optimal. We can actually segment too big GSO packets, and send individual segments as tbf parameters allow, allowing for better interoperability. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 00:06:40 -07:00
Simon Horman	1cdbcb7957	net: Loosen constraints for recalculating checksum in skb_segment() This is a generic solution to resolve a specific problem that I have observed. If the encapsulation of an skb changes then ability to offload checksums may also change. In particular it may be necessary to perform checksumming in software. An example of such a case is where a non-GRE packet is received but is to be encapsulated and transmitted as GRE. Another example relates to my proposed support for for packets that are non-MPLS when received but MPLS when transmitted. The cost of this change is that the value of the csum variable may be checked when it previously was not. In the case where the csum variable is true this is pure overhead. In the case where the csum variable is false it leads to software checksumming, which I believe also leads to correct checksums in transmitted packets for the cases described above. Further analysis: This patch relies on the return value of can_checksum_protocol() being correct and in turn the return value of skb_network_protocol(), used to provide the protocol parameter of can_checksum_protocol(), being correct. It also relies on the features passed to skb_segment() and in turn to can_checksum_protocol() being correct. I believe that this problem has not been observed for VLANs because it appears that almost all drivers, the exception being xgbe, set vlan_features such that that the checksum offload support for VLAN packets is greater than or equal to that of non-VLAN packets. I wonder if the code in xgbe may be an oversight and the hardware does support checksumming of VLAN packets. If so it may be worth updating the vlan_features of the driver as this patch will force such checksums to be performed in software rather than hardware. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 15:06:13 -07:00
Cong Wang	6b7df111ec	bridge: send query as soon as leave is received Continue sending queries when leave is received if the user marks it as a querier. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 14:54:37 -07:00
Cong Wang	9f00b2e7cf	bridge: only expire the mdb entry when query is received Currently we arm the expire timer when the mdb entry is added, however, this causes problem when there is no querier sent out after that. So we should only arm the timer when a corresponding query is received, as suggested by Herbert. And he also mentioned "if there is no querier then group subscriptions shouldn't expire. There has to be at least one querier in the network for this thing to work. Otherwise it just degenerates into a non-snooping switch, which is OK." Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 14:54:37 -07:00
Cong Wang	1c8ad5bfa2	bridge: use the bridge IP addr as source addr for querier Quote from Adam: "If it is believed that the use of 0.0.0.0 as the IP address is what is causing strange behaviour on other devices then is there a good reason that a bridge rather than a router shouldn't be the active querier? If not then using the bridge IP address and having the querier enabled by default may be a reasonable solution (provided that our querier obeys the election rules and shuts up if it sees a query from a lower IP address that isn't 0.0.0.0). Just because a device is the elected querier for IGMP doesn't appear to mean it is required to perform any other routing functions." And introduce a new troggle for it, as suggested by Herbert. Suggested-by: Adam Baker <linux@baker-net.org.uk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 14:54:37 -07:00
Trond Myklebust	a3c3cac5d3	SUNRPC: Prevent an rpc_task wakeup race The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to be careful about ordering the calls to rpc_test_and_set_running(task) and rpc_clear_queued(task). If we get the order wrong, then we may end up testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped and changed the state of the rpc_task. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2013-05-22 14:55:32 -04:00
Martin Hundebøll	f69ae770e7	batman-adv: Avoid double freeing of bat_counters On errors in batadv_mesh_init(), bat_counters will be freed in both batadv_mesh_free() and batadv_softif_init_late(). This patch fixes this by returning earlier from batadv_softif_init_late() in case of errors in batadv_mesh_init() and by setting bat_counters to NULL after freeing. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-21 21:34:36 +02:00
chaoting fan	b6040f9706	sunrpc: the cache_detail in cache_is_valid is unused any more The cache_detail(*detail) in function cache_is_valid is not used any more. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-21 11:02:02 -04:00
Paul Bolle	7c055881de	NFC: Remove commented out LLCP related Makefile line The Kconfig symbol NFC_LLCP was removed in commit `30cc458765` ("NFC: Move LLCP code to the NFC top level diirectory"). But the reference to its macro in this Makefile was only commented out. Remove it now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-05-21 10:47:41 +02:00
Eric Dumazet	71cea17ed3	tcp: md5: remove spinlock usage in fast path TCP md5 code uses per cpu variables but protects access to them with a shared spinlock, which is a contention point. [ tcp_md5sig_pool_lock is locked twice per incoming packet ] Makes things much simpler, by allocating crypto structures once, first time a socket needs md5 keys, and not deallocating them as they are really small. Next step would be to allow crypto allocations being done in a NUMA aware way. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-20 14:00:42 -07:00
Willem de Bruijn	99bbc70741	rps: selective flow shedding during softnet overflow A cpu executing the network receive path sheds packets when its input queue grows to netdev_max_backlog. A single high rate flow (such as a spoofed source DoS) can exceed a single cpu processing rate and will degrade throughput of other flows hashed onto the same cpu. This patch adds a more fine grained hashtable. If the netdev backlog is above a threshold, IRQ cpus track the ratio of total traffic of each flow (using 4096 buckets, configurable). The ratio is measured by counting the number of packets per flow over the last 256 packets from the source cpu. Any flow that occupies a large fraction of this (set at 50%) will see packet drop while above the threshold. Tested: Setup is a muli-threaded UDP echo server with network rx IRQ on cpu0, kernel receive (RPS) on cpu0 and application threads on cpus 2--7 each handling 20k req/s. Throughput halves when hit with a 400 kpps antagonist storm. With this patch applied, antagonist overload is dropped and the server processes its complete load. The patch is effective when kernel receive processing is the bottleneck. The above RPS scenario is a extreme, but the same is reached with RFS and sufficient kernel processing (iptables, packet socket tap, ..). Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-20 13:48:04 -07:00
John W. Linville	ba7c96bec5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-05-20 15:19:01 -04:00
Eric Dumazet	96f5a846bd	ip_gre: fix a possible crash in ipgre_err() Another fix needed in ipgre_err(), as parse_gre_header() might change skb->head. Bug added in commit `c544193214` (GRE: Refactor GRE tunneling code.) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-20 00:18:52 -07:00
Yuchung Cheng	3e59cb0ddf	tcp: remove bad timeout logic in fast recovery tcp_timeout_skb() was intended to trigger fast recovery on timeout, unfortunately in reality it often causes spurious retransmission storms during fast recovery. The particular sign is a fast retransmit over the highest sacked sequence (SND.FACK). Currently the RTO timer re-arming (as in RFC6298) offers a nice cushion to avoid spurious timeout: when SND.UNA advances the sender re-arms RTO and extends the timeout by icsk_rto. The sender does not offset the time elapsed since the packet at SND.UNA was sent. But if the next (DUP)ACK arrives later than ~RTTVAR and triggers tcp_fastretrans_alert(), then tcp_timeout_skb() will mark any packet sent before the icsk_rto interval lost, including one that's above the highest sacked sequence. Most likely a large part of scorebard will be marked. If most packets are not lost then the subsequent DUPACKs with new SACK blocks will cause the sender to continue to retransmit packets beyond SND.FACK spuriously. Even if only one packet is lost the sender may falsely retransmit almost the entire window. The situation becomes common in the world of bufferbloat: the RTT continues to grow as the queue builds up but RTTVAR remains small and close to the minimum 200ms. If a data packet is lost and the DUPACK triggered by the next data packet is slightly delayed, then a spurious retransmission storm forms. As the original comment on tcp_timeout_skb() suggests: the usefulness of this feature is questionable. It also wastes cycles walking the sack scoreboard and is actually harmful because of false recovery. It's time to remove this. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 23:51:17 -07:00
Rusty Russell	d2f83e9078	Hoist memcpy_fromiovec/memcpy_toiovec into lib/ ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined! That function is only present with CONFIG_NET. Turns out that crypto/algif_skcipher.c also uses that outside net, but it actually needs sockets anyway. In addition, commit `6d4f0139d6` added CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist that function and revert that commit too. socket.h already includes uio.h, so no callers need updating; trying only broke things fo x86_64 randconfig (thanks Fengguang!). Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-05-20 10:24:22 +09:30
Chen Gang	ff0102ee10	net: irda: using kzalloc() instead of kmalloc() to avoid strncpy() issue. 'discovery->data.info' length is 22, NICKNAME_MAX_LEN is 21, so the strncpy() will always left the last byte of 'discovery->data.info' uninitialized. When 'text' length is longer than 21 (NICKNAME_MAX_LEN), if still left the last byte of 'discovery->data.info' uninitialized, the next strlen() will cause issue. Also 'discovery->data' is 'struct irda_device_info' which defined in "include/uapi/...", it may copy to user mode, so need whole initialized. All together, need use kzalloc() instead of kmalloc() to initialize all members firstly. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 15:10:47 -07:00
Nicolas Dichtel	caeaba7900	ipv6: add support of peer address This patch adds the support of peer address for IPv6. For example, it is possible to specify the remote end of a 6inY tunnel. This was already possible in IPv4: ip addr add ip1 peer ip2 dev dev1 The peer address is specified with IFA_ADDRESS and the local address with IFA_LOCAL (like explained in include/uapi/linux/if_addr.h). Note that the API is not changed, because before this patch, it was not possible to specify two different addresses in IFA_LOCAL and IFA_REMOTE. There is a small change for the dump: if the peer is different from ::, IFA_ADDRESS will contain the peer address instead of the local address. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 15:09:26 -07:00
Paul Moore	6b21e1b77d	netlabel: improve domain mapping validation The net/netlabel/netlabel_domainhash.c:netlbl_domhsh_add() function does not properly validate new domain hash entries resulting in potential problems when an administrator attempts to add an invalid entry. One such problem, as reported by Vlad Halilov, is a kernel BUG (found in netlabel_domainhash.c:netlbl_domhsh_audit_add()) when adding an IPv6 outbound mapping with a CIPSO configuration. This patch corrects this problem by adding the necessary validation code to netlbl_domhsh_add() via the newly created netlbl_domhsh_validate() function. Ideally this patch should also be pushed to the currently active -stable trees. Reported-by: Vlad Halilov <vlad.halilov@gmail.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 14:49:55 -07:00
Eric Dumazet	284041ef21	ipv6: fix possible crashes in ip6_cork_release() commit `0178b695fd` ("ipv6: Copy cork options in ip6_append_data") added some code duplication and bad error recovery, leading to potential crash in ip6_cork_release() as kfree() could be called with garbage. use kzalloc() to make sure this wont happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Neal Cardwell <ncardwell@google.com>	2013-05-18 12:55:45 -07:00
Nicolas Dichtel	57b354e66b	dev: remove duplicate 'skb->dev = dev' in dev_forward_skb() This was added by commit `59b9997bab` (Revert "net: maintain namespace isolation between vlan and real device"). In fact, before the initial commit - the one that is reverted -, this statement was not present. 'skb->dev = dev' is already done in eth_type_trans(), which is call just after. Spotted-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-17 18:23:08 -07:00
Alex Elder	14d2f38df6	libceph: must hold mutex for reset_changed_osds() An osd client has a red-black tree describing its osds, and occasionally we would get crashes due to one of these trees tree becoming corrupt somehow. The problem turned out to be that reset_changed_osds() was being called without protection of the osd client request mutex. That function would call __reset_osd() for any osd that had changed, and __reset_osd() would call __remove_osd() for any osd with no outstanding requests, and finally __remove_osd() would remove the corresponding entry from the red-black tree. Thus, the tree was getting modified without having any lock protection, and was vulnerable to problems due to concurrent updates. This appears to be the only osd tree updating path that has this problem. It can be fairly easily fixed by moving the call up a few lines, to just before the request mutex gets dropped in kick_requests(). This resolves: http://tracker.ceph.com/issues/5043 Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-17 12:45:40 -05:00
Stanislaw Gruszka	6211dd12da	mac80211: fix direct probe auth We send direct probe to broadcast address, as some APs do not respond to unicast PROBE frames when unassociated. Broadcast frames are not acked, so we can not use that for trigger MLME state machine, but we need to use old timeout mechanism. This fixes authentication timed out like below: [ 1024.671974] wlan6: authenticate with 54:e6:fc:98:63:fe [ 1024.694125] wlan6: direct probe to 54:e6:fc:98:63:fe (try 1/3) [ 1024.695450] wlan6: direct probe to 54:e6:fc:98:63:fe (try 2/3) [ 1024.700586] wlan6: send auth to 54:e6:fc:98:63:fe (try 3/3) [ 1024.701441] wlan6: authentication with 54:e6:fc:98:63:fe timed out With fix, we have: [ 4524.198978] wlan6: authenticate with 54:e6:fc:98:63:fe [ 4524.220692] wlan6: direct probe to 54:e6:fc:98:63:fe (try 1/3) [ 4524.421784] wlan6: send auth to 54:e6:fc:98:63:fe (try 2/3) [ 4524.423272] wlan6: authenticated [ 4524.423811] wlan6: associate with 54:e6:fc:98:63:fe (try 1/3) [ 4524.427492] wlan6: RX AssocResp from 54:e6:fc:98:63:fe (capab=0x431 status=0 aid=1) Cc: stable@vger.kernel.org # 3.9 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-17 13:59:18 +02:00
Linus Lüssing	72822225bd	batman-adv: Fix rcu_barrier() miss due to double call_rcu() in TT code rcu_barrier() only waits for the currently scheduled rcu functions to finish - it won't wait for any function scheduled via another call_rcu() within an rcu scheduled function. Unfortunately our batadv_tt_orig_list_entry_free_ref() does just that, via a batadv_orig_node_free_ref() call, leading to our rcu_barrier() call potentially missing such a batadv_orig_node_free_ref(). This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Antonio Quartulli <ordex@autistici.org>	2013-05-17 09:54:28 +02:00
Eric Dumazet	d2cf43674e	tcp: speedup tcp_fixup_rcvbuf() tcp_fixup_rcvbuf() contains a loop to estimate initial socket rcv space needed for a given mss. With large MTU (like 64K on lo), we can loop ~500 times and consume a lot of cpu cycles. perf top of 200 concurrent netperf -t TCP_CRR 5.62% netperf [kernel.kallsyms] [k] tcp_init_buffer_space 1.71% netperf [kernel.kallsyms] [k] _raw_spin_lock 1.55% netperf [kernel.kallsyms] [k] kmem_cache_free 1.51% netperf [kernel.kallsyms] [k] tcp_transmit_skb 1.50% netperf [kernel.kallsyms] [k] tcp_ack Lets use a 100% factor, and remove the loop. 100% is needed anyway for tcp_adv_win_scale=1 default value, and is also the maximum factor. Refs: commit `b49960a05e` ("tcp: change tcp_adv_win_scale and tcp_rmem[2]") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-16 15:19:45 -07:00
Eric Dumazet	6ff50cd555	tcp: gso: do not generate out of order packets GSO TCP handler has following issues : 1) ooo_okay from original GSO packet is duplicated to all segments 2) segments (but the last one) are orphaned, so transmit path can not get transmit queue number from the socket. This happens if GSO segmentation is done before stacked device for example. Result is we can send packets from a given TCP flow to different TX queues (if using multiqueue NICS). This generates OOO problems and spurious SACK & retransmits. Fix this by keeping socket pointer set for all segments. This means that every segment must also have a destructor, and the original gso skb truesize must be split on all segments, to keep precise sk->sk_wmem_alloc accounting. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-16 14:43:40 -07:00

... 21 22 23 24 25 ...

29,386 commits