Commit graph

35769 commits

Author SHA1 Message Date
Neal Cardwell
87d943085b tcp: remove obsolete comment about TCP_SKB_CB(skb)->when in tcp_fragment()
The TCP_SKB_CB(skb)->when field no longer exists as of recent change
7faee5c0d5 ("tcp: remove TCP_SKB_CB(skb)->when"). And in any case,
tcp_fragment() is called on already-transmitted packets from the
__tcp_retransmit_skb() call site, so copying timestamps of any kind
in this spot is quite sensible.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reported-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-06 12:29:10 -07:00
Eric Dumazet
7faee5c0d5 tcp: remove TCP_SKB_CB(skb)->when
After commit 740b0f1841 ("tcp: switch rtt estimations to usec resolution"),
we no longer need to maintain timestamps in two different fields.

TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:49:33 -07:00
Eric Dumazet
04317dafd1 tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
TCP_SKB_CB(skb)->when has different meaning in output and input paths.

In output path, it contains a timestamp.
In input path, it contains an ISN, chosen by tcp_timewait_state_process()

Lets add a different name to ease code comprehension.

Note that 'when' field will disappear in following patch,
as skb_mstamp already contains timestamp, the anonymous
union will promptly disappear as well.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:49:33 -07:00
Alexander Duyck
56193d1bce net: Add function for parsing the header length out of linear ethernet frames
This patch updates some of the flow_dissector api so that it can be used to
parse the length of ethernet buffers stored in fragments.  Most of the
changes needed were to __skb_get_poff as it needed to be updated to support
sending a linear buffer instead of a skb.

I have split __skb_get_poff into two functions, the first is skb_get_poff
and it retains the functionality of the original __skb_get_poff.  The other
function is __skb_get_poff which now works much like __skb_flow_dissect in
relation to skb_flow_dissect in that it provides the same functionality but
works with just a data buffer and hlen instead of needing an skb.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:47:02 -07:00
Alexander Duyck
82eabd9eb2 net: merge cases where sock_efree and sock_edemux are the same function
Since sock_efree and sock_demux are essentially the same code for non-TCP
sockets and the case where CONFIG_INET is not defined we can combine the
code or replace the call to sock_edemux in several spots.  As a result we
can avoid a bit of unnecessary code or code duplication.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:43:45 -07:00
Alexander Duyck
62bccb8cdb net-timestamp: Make the clone operation stand-alone from phy timestamping
The phy timestamping takes a different path than the regular timestamping
does in that it will create a clone first so that the packets needing to be
timestamped can be placed in a queue, or the context block could be used.

In order to support these use cases I am pulling the core of the code out
so it can be used in other drivers beyond just phy devices.

In addition I have added a destructor named sock_efree which is meant to
provide a simple way for dropping the reference to skb exceptions that
aren't part of either the receive or send windows for the socket, and I
have removed some duplication in spots where this destructor could be used
in place of sock_edemux.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:43:45 -07:00
Alexander Duyck
37846ef018 net-timestamp: Merge shared code between phy and regular timestamping
This change merges the shared bits that exist between skb_tx_tstamp and
skb_complete_tx_timestamp.  By doing this we can avoid the two diverging as
there were already changes pushed into skb_tx_tstamp that hadn't made it
into the other function.

In addition this resolves issues with the fact that
skb_complete_tx_timestamp was included in linux/skbuff.h even though it was
only compiled in if phy timestamping was enabled.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:43:45 -07:00
Eric Dumazet
d546c62154 ipv4: harden fnhe_hashfun()
Lets make this hash function a bit secure, as ICMP attacks are still
in the wild.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:40:33 -07:00
Masanari Iida
e793c0f70e net: treewide: Fix typo found in DocBook/networking.xml
This patch fix spelling typo found in DocBook/networking.xml.
It is because the neworking.xml is generated from comments
in the source, I have to fix typo in comments within the source.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:35:28 -07:00
Pablo Neira Ayuso
84a59ca55f netfilter: add explicit Kconfig for NETFILTER_XT_NAT
Paul Bolle reports that 'select NETFILTER_XT_NAT' from the IPV4 and IPV6
NAT tables becomes noop since there is no Kconfig switch for it. Add the
Kconfig switch to resolve this problem.

Fixes: 8993cf8 netfilter: move NAT Kconfig switches out of the iptables scope
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:23:31 -07:00
Eric Dumazet
caa415270c ipv4: fix a race in update_or_create_fnhe()
nh_exceptions is effectively used under rcu, but lacks proper
barriers. Between kzalloc() and setting of nh->nh_exceptions(),
we need a proper memory barrier.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 4895c771c7 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:15:50 -07:00
Nicolas Dichtel
e7478dfc46 ipv6: use addrconf_get_prefix_route() to remove peer addr
addrconf_get_prefix_route() ensures to get the right route in the right table.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:13:24 -07:00
Nicolas Dichtel
f24062b07d ipv6: fix a refcnt leak with peer addr
There is no reason to take a refcnt before deleting the peer address route.
It's done some lines below for the local prefix route because
inet6_ifa_finish_destroy() will release it at the end.
For the peer address route, we want to free it right now.

This bug has been introduced by commit
caeaba7900 ("ipv6: add support of peer address").

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:13:24 -07:00
Andy Zhou
29abe2fda5 l2tp: fix missing line continuation
This syntax error was covered by L2TP_REFCNT_DEBUG not being set by
default.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 15:19:53 -07:00
Willem de Bruijn
c199105d15 net-timestamp: only report sw timestamp if reporting bit is set
The timestamping API has separate bits for generating and reporting
timestamps. A software timestamp should only be reported for a packet
when the packet has the relevant generation flag (SKBTX_..) set
and the socket has reporting bit SOF_TIMESTAMPING_SOFTWARE set.

The second check was accidentally removed. Reinstitute the original
behavior.

Tested:
  Without this patch, Documentation/networking/txtimestamp reports
  timestamps regardless of whether SOF_TIMESTAMPING_SOFTWARE is set.
  After the patch, it only reports them when the flag is set.

Fixes: f24b9be595 ("net-timestamp: extend SCM_TIMESTAMPING ancillary data struct")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 15:02:43 -07:00
Guillaume Nault
eed4d839b0 l2tp: fix race while getting PMTU on PPP pseudo-wire
Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU.

The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get()
could return NULL if tunnel->sock->sk_dst_cache was reset just before the
call, thus making dst_mtu() dereference a NULL pointer:

[ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0
[ 1937.664005] Oops: 0000 [#1] SMP
[ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core]
[ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G           O   3.17.0-rc1 #1
[ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008
[ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000
[ 1937.664005] RIP: 0010:[<ffffffffa049db88>]  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005] RSP: 0018:ffff8800c43c7de8  EFLAGS: 00010282
[ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5
[ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000
[ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000
[ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000
[ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009
[ 1937.664005] FS:  00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000
[ 1937.664005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0
[ 1937.664005] Stack:
[ 1937.664005]  ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009
[ 1937.664005]  ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57
[ 1937.664005]  ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000
[ 1937.664005] Call Trace:
[ 1937.664005]  [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp]
[ 1937.664005]  [<ffffffff81109b57>] ? might_fault+0x9e/0xa5
[ 1937.664005]  [<ffffffff81109b0e>] ? might_fault+0x55/0xa5
[ 1937.664005]  [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26
[ 1937.664005]  [<ffffffff81309196>] SYSC_connect+0x87/0xb1
[ 1937.664005]  [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56
[ 1937.664005]  [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1
[ 1937.664005]  [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1937.664005]  [<ffffffff8114c262>] ? spin_lock+0x9/0xb
[ 1937.664005]  [<ffffffff813092b4>] SyS_connect+0x9/0xb
[ 1937.664005]  [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b
[ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89
[ 1937.664005] RIP  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005]  RSP <ffff8800c43c7de8>
[ 1937.664005] CR2: 0000000000000020
[ 1939.559375] ---[ end trace 82d44500f28f8708 ]---

Fixes: f34c4a35d8 ("l2tp: take PMTU from tunnel UDP socket")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 14:40:18 -07:00
Govindarajulu Varadarajan
f0db9b0734 ethtool: Add generic options for tunables
This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting
tunable values from driver.

Add get_tunable and set_tunable to ethtool_ops. Driver implements these
functions for getting/setting tunable value.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 12:12:20 -07:00
Daniel Borkmann
e020836d95 dev_ioctl: remove dev_load() CAP_SYS_MODULE message
Marcel reported to see the following message when autoloading
is being triggered when adding nlmon device:

  Loading kernel module for a network device with
  CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias
  netdev-nlmon instead.

This false-positive happens despite with having correct
capabilities set, e.g. through issuing `ip link del dev nlmon`
more than once on a valid device with name nlmon, but Marcel
has also seen it on creation time when no nlmon module is
previously compiled-in or loaded as module and the device
name equals a link type name (e.g. nlmon, vxlan, team).

Stephen says:

  The netdev module alias is a hold over from the past. For
  normal devices, people used to create a alias eth0 to and
  point it to the type of network device used, that was back
  in the bad old ISA days before real discovery.

  Also, the tunnels create module alias for the control device
  and ip used to use this to autoload the tunnel device.

  The message is bogus and should just be removed, I also see
  it in a couple of other cases where tap devices are renamed
  for other usese.

As mentioned in 8909c9ad8f ("net: don't allow CAP_NET_ADMIN
to load non-netdev kernel modules"), we nevertheless still
might want to leave the old autoloading behaviour in place
as it could break old scripts, so for now, lets just remove
the log message as Stephen suggests.

Reference: http://thread.gmane.org/gmane.linux.kernel/1105168
Reported-by: Marcel Holtmann <marcel@holtmann.org>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 12:04:40 -07:00
Daniel Borkmann
60a3b2253c net: bpf: make eBPF interpreter images read-only
With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate.  This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables.  Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.

Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 12:02:48 -07:00
Sabrina Dubroca
a9ed4a2986 ipv6: fix rtnl locking in setsockopt for anycast and multicast
Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.

This patch moves ASSERT_RTNL() up a level in the call stack.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 11:52:28 -07:00
Eyal Shapira
f3000e1b43 mac80211: fix broken use of VHT/20Mhz with some APs
commit "mac80211: disable 40MHz support in case of 20MHz AP"
broke working VHT in 20Mhz with APs like Netgear R6300v2 which
do not publish support for 40Mhz but allow use of VHT in 20Mhz.
The break is because VHT is disabled once no HT cap doesn't indicate
support for 40Mhz. This causes the assoc request to be sent without
any VHT IE and the association is only HT due to this.

For more details check out commit 4a817aa7
"mac80211: allow VHT with peers not capable of 40MHz"

Fixes: 53b954ee4a ("mac80211: disable 40MHz support in case of 20MHz AP")
Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:54:07 +02:00
Lorenzo Bianconi
a4bcaf5556 mac80211: extend set_coverage_class signature
Extend mac80211 set_coverage_class API in order to enable ACK timeout
estimation algorithm (dynack) passing coverage class equals to -1
to lower drivers. Synchronize set_coverage_class routine signature with
mac80211 function pointer for p54, ath9k, ath9k_htc and ath5k drivers.

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:54:07 +02:00
Lorenzo Bianconi
3057dbfdab cfg80211: enable dynack through nl80211
Enable ACK timeout estimation algorithm (dynack) using mac80211
set_coverage_class API. Dynack is activated passing coverage class equals to -1
to lower drivers and it is automatically disabled setting valid value for
coverage class.
Define NL80211_ATTR_WIPHY_DYN_ACK flag attribute to enable dynack from
userspace. In order to activate dynack NL80211_FEATURE_ACKTO_ESTIMATION feature
flag must be set by lower drivers to indicate dynack capability.

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:54:03 +02:00
Eliad Peller
eaa336b0f5 mac80211: combine roc with the "next roc" if possible
If the remaining time in the current roc is not long
enough, mac80211 adds the new roc right after it
(if they have similar params).

However, in case of multiple rocs, the "next roc"
is not considered, resulting in multiple rocs,
each one with its own duration.

Refactor the code a bit and consider the next roc,
so a single max roc will be used instead of
multiple rocs (which might last much longer).

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:09 +02:00
Eliad Peller
24ecd45e2e mac80211: adjust roc duration when combining ROCs
The new duration (remaining duration after the current
ROC ends) was calculated but not used, making the
optimization worthless.

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:08 +02:00
Eliad Peller
a62a1aed37 cfg80211: avoid duplicate entries on regdomain intersection
The regdom intersection code simply tries intersecting
each rule of the source with each rule of the target.

Since the resulting intersections are not observed
as a whole, this can result in multiple overlapping/duplicate
entries.

Make the rule addition a bit more smarter, by looking
for rules that can be contained within other rules,
and adding only extended ones.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:08 +02:00
Assaf Krauss
cd2f5dd709 mac80211: Add RRM support to assoc request
In case of a RRM-supporting connection, in the association request
frame: set the RRM capability flag, and add the required IEs.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:08 +02:00
Assaf Krauss
bab5ab7d2a nl80211: Add flag attribute for RRM connections
Add a flag attribute to use in associations, for tagging the target
connection as supporting RRM. It is the responsibility of upper
layers to set this flag only if both the underlying device, and the
target network indeed support RRM.
To be used in ASSOCIATE and CONNECT commands.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:08 +02:00
Liad Kaufman
6188c271f0 mac80211: fix description comment of ieee80211_subif_start_xmit
The function description claimed that on error the skb isn't
freed even though it is, and stated return values that are
different than what really happens in the code.

Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:07 +02:00
Johannes Berg
2740f0cf8e cfg80211: add Intel Mobile Communications copyright
Our legal structure changed at some point (see wikipedia), but
we forgot to immediately switch over to the new copyright
notice.

For files that we have modified in the time since the change,
add the proper copyright notice now.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:06 +02:00
Johannes Berg
d98ad83ee8 mac80211: add Intel Mobile Communications copyright
Our legal structure changed at some point (see wikipedia), but
we forgot to immediately switch over to the new copyright
notice.

For files that we have modified in the time since the change,
add the proper copyright notice now.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:52:06 +02:00
Emmanuel Grumbach
785e21a89d mac80211: use bss_conf->dtim_period instead of conf.ps_dtim_period
sta_set_sinfo is obviously takes data for specific station.
This specific station is attached to a specific virtual
interface. Hence we should use the dtim_period from this
virtual interface rather than the system wide dtim_period.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 13:41:08 +02:00
Johannes Berg
c10a19930f mac80211: clean up ieee80211_i.h
Not sure how the declaration of ieee80211_tdls_peer_del_work
landed after the double inclusion protection end.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-05 10:56:50 +02:00
Hannes Frederic Sowa
a9fe8e2994 ipv4: implement igmp_qrv sysctl to tune igmp robustness variable
As in IPv6 people might increase the igmp query robustness variable to
make sure unsolicited state change reports aren't lost on the network. Add
and document this new knob to igmp code.

RFCs allow tuning this parameter back to first IGMP RFC, so we also use
this setting for all counters, including source specific multicast.

Also take over sysctl value when upping the interface and don't reuse
the last one seen on the interface.

Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04 22:26:14 -07:00
Hannes Frederic Sowa
2f711939d2 ipv6: add sysctl_mld_qrv to configure query robustness variable
This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
robustness variable. It specifies how many retransmit of unsolicited mld
retransmit should happen. Admins might want to tune this on lossy links.

Also reset mld state on interface down/up, so we pick up new sysctl
settings during interface up event.

IPv6 certification requests this knob to be available.

I didn't make this knob netns specific, as it is mostly a setting in a
physical environment and should be per host.

Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04 22:26:14 -07:00
John W. Linville
ef4ead3f29 Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
 rather than listing them one might as well look into the
 git log instead.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUAIx4AAoJEDBSmw7B7bqrUYcP/3t4qdFxm0bd4j2AEkl3mPwB
 Qu7obTicOTfBRoJNEgS+8AU2u3PfztU6+ErZs4ETLUuqaZwXisqmwBiMo86+Wtdf
 gx9KonwEW051g7YmB0+6EMwuy04MGzTEk8VavQwqM4g9LIPJ4Buo/kj7MNJ51m11
 XyRmJqZJnKKeiiQ4eC0gPf8e44qiQqaDuYZ0r1UDnNRg2KrbAHlGTBKYI3VRl2u4
 xRpPGVnHwT0qkWb1Zw9fk0VfPr9m1ETthzcZvnhk6uMnJ28D+1B1FjZR1GJU6BW7
 Zx2FbevbZTjDoNT1GQpLGMXBuW0lsZFetXVFiJCr/StaPBtHmtdu28fuNVm8yJYz
 euDlEgrE8F4npdec2F5R2zh7Ue2U7eMEL2uxxjciNSJOipHgx5EXH12Y/5QtrChy
 4OHPbNHgpmqFB7TmkvHDgP/0A7XdyqKVc+NtIV+eECIwE4tHcJ6A+bQ+ZCoRV2Vw
 zmsNuNeNeDW7NEAw9veRXissLZMy/EjUnsOrnW29BpO/yG+2YjqpyQ6JQpcXeCPD
 WQgl2FHpk6ap3jpVjxminxw2HkDnQ0oTKusGLcezalhUlWMo7VYNN59aLzcphxX5
 Fotp/8v1sbDTF46uc/QJ38N5TqflwWeFpxvGkdNGuAT4llP03NaXV0ORBecFmMW2
 esb+PLwlByCDeVFu53q+
 =Qth6
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg <johannes@sipsolutions.net> says:

"Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
rather than listing them one might as well look into the
git log instead."

Signed-off-by: John W. Linville <linville@tuxdriver.com>

Conflicts:
	drivers/net/wireless/ath/wil6210/wmi.c
2014-09-04 13:41:33 -04:00
John W. Linville
190355cc06 Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
 shouldn't be necessary but since many places treat it as a string we
 couldn't move to just sending two bytes.
 
 In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
 fix for the recently introduced RX aggregation offload, a revert for
 a broken patch (that luckily didn't really cause any harm) and a small
 fix for alignment in debugfs.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUAFrmAAoJEDBSmw7B7bqr79AP/1yGi9lkv/wWUs5y0AhUSen9
 850MU26BBlyAAFSz11xqgaEeRmeBeqhR3K7w/M02TX0CHxBzMqMZfyE//tq0UJaI
 ZwZmtyQmdMiOSNKignTIIx7OHTioq0wrGKb6O2UvKoJfTlB9t01jCC4jmCTF5Vos
 6ReF7NaZEbxW6XDOsClNTAtIa1c6n1RQ5VbDIEL5Vfvqv8LbcobduF8WcYl80eIQ
 +EvIHtUm/Luxg6DblibgEVtwYOtNpvRz4pofdw3xoSHAnF+zhXbUr0dUjpkBNA7o
 vWboCBl14Qn1M7pOJZ0+TBzFmquAr6CDbDvArVCH01Swh27EUDQUcHQAggGpT71w
 DFgWHOYP0UCB6Y4U0GjBehy8PeuytqJLBSceKVud7DDqd8fY+Lq3MMyicIk0aw3o
 IIDLWrujkCBXsdfuxQETmYxHU05WHSuYOCTgGSqbq3QPTWm8pBGWTdbk+1t/0FyH
 cGLJOWs/jCrtHdzDj6TH+kL8NmvwB7sC9MT45qG0ilevmPW25yrnTJPEMEvFBqvZ
 lnaqiX6D1kGNZd09CxgSIhxrQi+N0Yg+UlLa4IUtOIqnQussOC3xH2U5qTufdpa1
 Gi9aCkBGVKQiObPWucf2QB4t1sZ18rxBhrAelZhQPLTKrnsuLhpcVBlU+L6ScCAk
 FVni4HZH2IGtDQ577k10
 =G/pB
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg <johannes@sipsolutions.net> says:

"Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.

In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs."

Signed-off-by: John W. Linville <linville@redhat.com>
2014-09-04 13:08:24 -04:00
Li RongQing
c5eba0b6f8 openvswitch: distinguish between the dropped and consumed skb
distinguish between the dropped and consumed skb, not assume the skb
is consumed always

Cc: Thomas Graf <tgraf@noironetworks.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-03 20:50:51 -07:00
Jesper Dangaard Brouer
1f59533f9c qdisc: validate frames going through the direct_xmit path
In commit 50cbe9ab5f ("net: Validate xmit SKBs right when we
pull them out of the qdisc") the validation code was moved out of
dev_hard_start_xmit and into dequeue_skb.

However this overlooked the fact that we do not always enqueue
the skb onto a qdisc. First situation is if qdisc have flag
TCQ_F_CAN_BYPASS and qdisc is empty.  Second situation is if
there is no qdisc on the device, which is a common case for
software devices.

Originally spotted and inital patch by Alexander Duyck.
As a result Alex was seeing issues trying to connect to a
vhost_net interface after commit 50cbe9ab5f was applied.

Added a call to validate_xmit_skb() in __dev_xmit_skb(), in the
code path for qdiscs with TCQ_F_CAN_BYPASS flag, and in
__dev_queue_xmit() when no qdisc.

Also handle the error situation where dev_hard_start_xmit() could
return a skb list, and does not return dev_xmit_complete(rc) and
falls through to the kfree_skb(), in that situation it should
call kfree_skb_list().

Fixes:  50cbe9ab5f ("net: Validate xmit SKBs right when we pull them out of the qdisc")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-03 20:41:42 -07:00
Jesper Dangaard Brouer
3f3c7eec60 qdisc: exit case fixes for skb list handling in qdisc layer
More minor fixes to merge commit 53fda7f7f9 (Merge branch 'xmit_list')
that allows us to work with a list of SKBs.

Fixing exit cases in qdisc_reset() and qdisc_destroy(), where a
leftover requeued SKB (qdisc->gso_skb) can have the potential of
being a skb list, thus use kfree_skb_list().

This is a followup to commit 10770bc2d1 ("qdisc: adjustments for
API allowing skb list xmits").

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-03 20:41:42 -07:00
Pablo Neira Ayuso
cbb8125eb4 netfilter: nfnetlink: deliver netlink errors on batch completion
We have to wait until the full batch has been processed to deliver the
netlink error messages to userspace. Otherwise, we may deliver
duplicated errors to userspace in case that we need to abort and replay
the transaction if any of the required modules needs to be autoloaded.

A simple way to reproduce this (assumming nft_meta is not loaded) with
the following test file:

 add table filter
 add chain filter test
 add chain bad test                 # intentional wrong unexistent table
 add rule filter test meta mark 0

Then, when trying to load the batch:

 # nft -f test
 test:4:1-19: Error: Could not process rule: No such file or directory
 add chain bad test
 ^^^^^^^^^^^^^^^^^^^
 test:4:1-19: Error: Could not process rule: No such file or directory
 add chain bad test
 ^^^^^^^^^^^^^^^^^^^

The error is reported twice, once when the batch is aborted due to
missing nft_meta and another when it is fully processed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-03 16:56:23 +02:00
Michal Kazior
4549cf2b18 mac80211: fix offloaded BA session traffic after hw restart
When starting an offloaded BA session it is
unknown what starting sequence number should be
used. Using last_seq worked in most cases except
after hw restart.

When hw restart is requested last_seq is
(rightfully so) kept unmodified. This ended up
with BA sessions being restarted with an aribtrary
BA window values resulting in dropped frames until
sequence numbers caught up.

Instead of last_seq pick seqno of a first Rxed
frame of a given BA session.

This fixes stalled traffic after hw restart with
offloaded BA sessions (currently only ath10k).

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-03 13:40:38 +02:00
Johannes Berg
bd8c78e78d nl80211: clear skb cb before passing to netlink
In testmode and vendor command reply/event SKBs we use the
skb cb data to store nl80211 parameters between allocation
and sending. This causes the code for CONFIG_NETLINK_MMAP
to get confused, because it takes ownership of the skb cb
data when the SKB is handed off to netlink, and it doesn't
explicitly clear it.

Clear the skb cb explicitly when we're done and before it
gets passed to netlink to avoid this issue.

Cc: stable@vger.kernel.org [this goes way back]
Reported-by: Assaf Azulay <assaf.azulay@intel.com>
Reported-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-09-03 11:13:14 +02:00
Pablo Neira Ayuso
d99407f42f netfilter: nft_rbtree: no need for spinlock from set destroy path
The sets are released from the rcu callback, after the rule is removed
from the chain list, which implies that nfnetlink cannot update the
rbtree and no packets are walking on the set anymore. Thus, we can get
rid of the spinlock in the set destroy path there.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewied-by: Thomas Graf <tgraf@suug.ch>
2014-09-03 10:57:08 +02:00
Pablo Neira Ayuso
39f390167e netfilter: nft_hash: no need for rcu in the hash set destroy path
The sets are released from the rcu callback, after the rule is removed
from the chain list, which implies that nfnetlink cannot update the
hashes (thus, no resizing may occur) and no packets are walking on the
set anymore.

This resolves a lockdep splat in the nft_hash_destroy() path since the
nfnl mutex is not held there.

===============================
[ INFO: suspicious RCU usage. ]
3.16.0-rc2+ #168 Not tainted
-------------------------------
net/netfilter/nft_hash.c:362 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by ksoftirqd/0/3:
 #0:  (rcu_callback){......}, at: [<ffffffff81096393>] rcu_process_callbacks+0x27e/0x4c7

stack backtrace:
CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.16.0-rc2+ #168
Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
 0000000000000001 ffff88011769bb98 ffffffff8142c922 0000000000000006
 ffff880117694090 ffff88011769bbc8 ffffffff8107c3ff ffff8800cba52400
 ffff8800c476bea8 ffff8800c476bea8 ffff8800cba52400 ffff88011769bc08
Call Trace:
 [<ffffffff8142c922>] dump_stack+0x4e/0x68
 [<ffffffff8107c3ff>] lockdep_rcu_suspicious+0xfa/0x103
 [<ffffffffa079931e>] nft_hash_destroy+0x50/0x137 [nft_hash]
 [<ffffffffa078cd57>] nft_set_destroy+0x11/0x2a [nf_tables]

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
2014-09-03 10:57:06 +02:00
Jesper Dangaard Brouer
10770bc2d1 qdisc: adjustments for API allowing skb list xmits
Minor adjustments for merge commit 53fda7f7f9 (Merge branch 'xmit_list')
that allows us to work with a list of SKBs.

Update code doc to function sch_direct_xmit().

In handle_dev_cpu_collision() use kfree_skb_list() in error handling.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02 14:06:17 -07:00
Li RongQing
4ee45ea05c openvswitch: fix a memory leak
The user_skb maybe be leaked if the operation on it failed and codes
skipped into the label "out:" without calling genlmsg_unicast.

Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02 14:01:21 -07:00
Pablo Neira
41ad82f7f8 netfilter: fix missing dependencies in NETFILTER_XT_TARGET_LOG
make defconfig reports:

warning: (NETFILTER_XT_TARGET_LOG) selects NF_LOG_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NETFILTER_ADVANCED)

Fixes: d79a61d netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_*
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02 13:59:54 -07:00
David S. Miller
abccc5878a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
pull request: Netfilter/IPVS fixes for net

The following patchset contains seven Netfilter fixes for your net
tree, they are:

1) Make the NAT infrastructure independent of x_tables, some users are
   already starting to test nf_tables with NAT without enabling x_tables.
   Without this patch for Kconfig, there's a superfluous dependency
   between NAT and x_tables.
2) Allow to use 0 in the cgroup match, the kernel rejects with -EINVAL
   with no good reason. From Daniel Borkmann.

3) Select CONFIG_NF_NAT from the nf_tables NAT expression, this also
   resolves another NAT dependency with x_tables.

4) Use HAVE_JUMP_LABEL instead of CONFIG_JUMP_LABEL in the Netfilter hook
   code as elsewhere in the kernel to resolve toolchain problems, from
   Zhouyi Zhou.

5) Use iptunnel_handle_offloads() to set up tunnel encapsulation
   depending on the offload capabilities, reported by Alex Gartrell
   patch from Julian Anastasov.

6) Fix wrong family when registering the ip_vs_local_reply6() hook,
   also from Julian.

7) Select the NF_LOG_* symbols from NETFILTER_XT_TARGET_LOG. Rafał
   Miłecki reported that when jumping from 3.16 to 3.17-rc, his log
   target is not selected anymore due to changes in the previous
   development cycle to accomodate the full logging support for
   nf_tables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02 13:56:30 -07:00
Nicolas Dichtel
ba9989069f rtnl/do_setlink(): notify when a netdev is modified
Depending on which parameters were updated, the changes were not propagated via
the notifier chain and netlink.

The new flag has been set only when the change did not cause a call to the
notifier chain and/or to the netlink notification functions.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02 12:57:04 -07:00