Commit graph

30,660 commits

Author SHA1 Message Date
Jozsef Kadlecsik
03c8b234e6 netfilter: ipset: Generalize extensions support
Get rid of the structure based extensions and introduce a blob for
the extensions. Thus we can support more extension types easily.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik
ca134ce864 netfilter: ipset: Move extension data to set structure
Default timeout and extension offsets are moved to struct set, because
all set types supports all extensions and it makes possible to generalize
extension support.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik
f925f70569 netfilter: ipset: Rename extension offset ids to extension ids
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik
a04d8b6bd9 netfilter: ipset: Prepare ipset to support multiple networks for hash types
In order to support hash:net,net, hash:net,port,net etc. types,
arrays are introduced for the book-keeping of existing cidr sizes
and network numbers in a set.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:26 +02:00
Jozsef Kadlecsik
5e04c0c38c netfilter: ipset: Introduce new operation to get both setname and family
ip[6]tables set match and SET target need to know the family of the set
in order to reject adding rules which refer to a set with a non-mathcing
family. Currently such rules are silently accepted and then ignored
instead of generating a clear error message to the user, which is not
helpful.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:26 +02:00
Jozsef Kadlecsik
bd3129fc5e netfilter: ipset: order matches and targets separatedly in xt_set.c
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:26 +02:00
Anders K. Pedersen
60b0fe3724 netfilter: ipset: Support package fragments for IPv4 protos without ports
Enable ipset port set types to match IPv4 package fragments for
protocols that doesn't have ports (or the port information isn't
supported by ipset).

For example this allows a hash:ip,port ipset containing the entry
192.168.0.1,gre:0 to match all package fragments for PPTP VPN tunnels
to/from the host. Without this patch only the first package fragment
(with fragment offset 0) was matched, while subsequent fragments wasn't.

This is not possible for IPv6, where the protocol is in the fragmented
part of the package unlike IPv4, where the protocol is in the IP header.

IPPROTO_ICMPV6 is deliberately not included, because it isn't relevant
for IPv4.

Signed-off-by: Anders K. Pedersen <akp@surftown.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:26 +02:00
Jozsef Kadlecsik
20b2fab483 netfilter: ipset: Fix "may be used uninitialized" warnings
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik
35b8dcf8c3 netfilter: ipset: Rename simple macro names to avoid namespace issues.
Reported-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik
a0f28dc754 netfilter: ipset: Fix sparse warnings due to missing rcu annotations
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik
b3aabd149c netfilter: ipset: Sparse warning about shadowed variable fixed
net/netfilter/ipset/ip_set_hash_ipportnet.c:275:20:
warning: symbol 'cidr' shadows an earlier one

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik
122ebbf24c netfilter: ipset: Don't call ip_nest_end needlessly in the error path
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-09-30 21:33:25 +02:00
Eric Dumazet
b86783587b net: flow_dissector: fix thoff for IPPROTO_AH
In commit 8ed781668d ("flow_keys: include thoff into flow_keys for
later usage"), we missed that existing code was using nhoff as a
temporary variable that could not always contain transport header
offset.

This is not a problem for TCP/UDP because port offset (@poff)
is 0 for these protocols.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 15:32:05 -04:00
David S. Miller
7b77d161ce Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Conflicts:
	include/net/xfrm.h

Simple conflict between Joe Perches "extern" removal for function
declarations in header files and the changes in Steffen's tree.

Steffen Klassert says:

====================
Two patches that are left from the last development cycle.
Manual merging of include/net/xfrm.h is needed. The conflict
can be solved as it is currently done in linux-next.

1) We announce the creation of temporary acquire state via an asyc event,
   so the deletion should be annunced too. From Nicolas Dichtel.

2) The VTI tunnels do not real tunning, they just provide a routable
   IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier
   instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 15:24:57 -04:00
Nicolas Dichtel
991fb3f74c dev: always advertise rx_flags changes via netlink
When flags IFF_PROMISC and IFF_ALLMULTI are changed, netlink messages are not
consistent. For example, if a multicast daemon is running (flag IFF_ALLMULTI
set in dev->flags but not dev->gflags, ie not exported to userspace) and then a
user sets it via netlink (flag IFF_ALLMULTI set in dev->flags and dev->gflags, ie
exported to userspace), no netlink message is sent.
Same for IFF_PROMISC and because dev->promiscuity is exported via
IFLA_PROMISCUITY, we may send a netlink message after each change of this
counter.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 15:08:13 -04:00
Nicolas Dichtel
a528c219df dev: update __dev_notify_flags() to send rtnl msg
This patch only prepares the next one, there is no functional change.
Now, __dev_notify_flags() can also be used to notify flags changes via
rtnetlink.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 15:08:12 -04:00
Paul Marks
c9d55d5bff ipv6: Fix preferred_lft not updating in some cases
Consider the scenario where an IPv6 router is advertising a fixed
preferred_lft of 1800 seconds, while the valid_lft begins at 3600
seconds and counts down in realtime.

A client should reset its preferred_lft to 1800 every time the RA is
received, but a bug is causing Linux to ignore the update.

The core problem is here:
  if (prefered_lft != ifp->prefered_lft) {

Note that ifp->prefered_lft is an offset, so it doesn't decrease over
time.  Thus, the comparison is always (1800 != 1800), which fails to
trigger an update.

The most direct solution would be to compute a "stored_prefered_lft",
and use that value in the comparison.  But I think that trying to filter
out unnecessary updates here is a premature optimization.  In order for
the filter to apply, both of these would need to hold:

  - The advertised valid_lft and preferred_lft are both declining in
    real time.
  - No clock skew exists between the router & client.

So in this patch, I've set "update_lft = 1" unconditionally, which
allows the surrounding code to be greatly simplified.

Signed-off-by: Paul Marks <pmarks@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 15:06:19 -04:00
Pravin B Shelar
d4a71b155c ip_tunnel: Do not use stale inner_iph pointer.
While sending packet skb_cow_head() can change skb header which
invalidates inner_iph pointer to skb header. Following patch
avoid using it. Found by code inspection.

This bug was introduced by commit 0e6fbc5b6c (ip_tunnels: extend
iptunnel_xmit()).

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 15:05:07 -04:00
Patrick McHardy
f4a87e7bd2 netfilter: synproxy: fix BUG_ON triggered by corrupt TCP packets
TCP packets hitting the SYN proxy through the SYNPROXY target are not
validated by TCP conntrack. When th->doff is below 5, an underflow happens
when calculating the options length, causing skb_header_pointer() to
return NULL and triggering the BUG_ON().

Handle this case gracefully by checking for NULL instead of using BUG_ON().

Reported-by: Martin Topholm <mph@one.com>
Tested-by: Martin Topholm <mph@one.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-09-30 12:44:38 +02:00
Jouni Malinen
22c4ceed01 mac80211: Run deferred scan if last roc_list item is not started
mac80211 scan processing could get stuck if roc work for pending, but
not started when a scan request was deferred due to such roc item.
Normally the deferred scan would be started from
ieee80211_start_next_roc(), but ieee80211_sw_roc_work() calls that only
if the finished ROC was started. Fix this by calling
ieee80211_run_deferred_scan() in the case the last ROC was not actually
started.

This issue was hit relatively easily in P2P find operations where Listen
state (remain-on-channel) and Search state (scan) are repeated in a
loop.

Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-09-30 12:36:56 +02:00
Felix Fietkau
0c5b93290b mac80211: update sta->last_rx on acked tx frames
When clients are idle for too long, hostapd sends nullfunc frames for
probing. When those are acked by the client, the idle time needs to be
updated.

To make this work (and to avoid unnecessary probing), update sta->last_rx
whenever an ACK was received for a tx packet. Only do this if the flag
IEEE80211_HW_REPORTS_TX_ACK_STATUS is set.

Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-09-30 12:34:09 +02:00
Felix Fietkau
03bb7f4276 mac80211: use sta_info_get_bss() for nl80211 tx and client probing
This allows calls for clients in AP_VLANs (e.g. for 4-addr) to succeed

Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-09-30 11:30:57 +02:00
Eric Dumazet
62748f32d5 net: introduce SO_MAX_PACING_RATE
As mentioned in commit afe4fd0624 ("pkt_sched: fq: Fair Queue packet
scheduler"), this patch adds a new socket option.

SO_MAX_PACING_RATE offers the application the ability to cap the
rate computed by transport layer. Value is in bytes per second.

u32 val = 1000000;
setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));

To be effectively paced, a flow must use FQ packet scheduler.

Note that a packet scheduler takes into account the headers for its
computations. The effective payload rate depends on MSS and retransmits
if any.

I chose to make this pacing rate a SOL_SOCKET option instead of a
TCP one because this can be used by other protocols.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:35:41 -07:00
Francesco Fusco
aa66158145 ipv4: processing ancillary IP_TOS or IP_TTL
If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
packets with the specified TTL or TOS overriding the socket values specified
with the traditional setsockopt().

The struct inet_cork stores the values of TOS, TTL and priority that are
passed through the struct ipcm_cookie. If there are user-specified TOS
(tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
used to override the per-socket values. In case of TOS also the priority
is changed accordingly.

Two helper functions get_rttos and get_rtconn_flags are defined to take
into account the presence of a user specified TOS value when computing
RT_TOS and RT_CONN_FLAGS.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:21:52 -07:00
Francesco Fusco
f02db315b8 ipv4: IP_TOS and IP_TTL can be specified as ancillary data
This patch enables the IP_TTL and IP_TOS values passed from userspace to
be stored in the ipcm_cookie struct. Three fields are added to the struct:

- the TTL, expressed as __u8.
  The allowed values are in the [1-255].
  A value of 0 means that the TTL is not specified.

- the TOS, expressed as __s16.
  The allowed values are in the range [0,255].
  A value of -1 means that the TOS is not specified.

- the priority, expressed as a char and computed when
  handling the ancillary data.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:21:51 -07:00
Eric Dumazet
9a3bab6b05 net: net_secret should not depend on TCP
A host might need net_secret[] and never open a single socket.

Problem added in commit aebda156a5
("net: defer net_secret[] initialization")

Based on prior patch from Hannes Frederic Sowa.

Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@strressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:19:40 -07:00
Eric W. Biederman
50624c934d net: Delay default_device_exit_batch until no devices are unregistering v2
There is currently serialization network namespaces exiting and
network devices exiting as the final part of netdev_run_todo does not
happen under the rtnl_lock.  This is compounded by the fact that the
only list of devices unregistering in netdev_run_todo is local to the
netdev_run_todo.

This lack of serialization in extreme cases results in network devices
unregistering in netdev_run_todo after the loopback device of their
network namespace has been freed (making dst_ifdown unsafe), and after
the their network namespace has exited (making the NETDEV_UNREGISTER,
and NETDEV_UNREGISTER_FINAL callbacks unsafe).

Add the missing serialization by a per network namespace count of how
many network devices are unregistering and having a wait queue that is
woken up whenever the count is decreased.  The count and wait queue
allow default_device_exit_batch to wait until all of the unregistration
activity for a network namespace has finished before proceeding to
unregister the loopback device and then allowing the network namespace
to exit.

Only a single global wait queue is used because there is a single global
lock, and there is a single waiter, per network namespace wait queues
would be a waste of resources.

The per network namespace count of unregistering devices gives a
progress guarantee because the number of network devices unregistering
in an exiting network namespace must ultimately drop to zero (assuming
network device unregistration completes).

The basic logic remains the same as in v1.  This patch is now half
comment and half rtnl_lock_unregistering an expanded version of
wait_event performs no extra work in the common case where no network
devices are unregistering when we get to default_device_exit_batch.

Reported-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:09:15 -07:00
Catalin\(ux\) M. BOIE
7df37ff33d IPv6 NAT: Do not drop DNATed 6to4/6rd packets
When a router is doing DNAT for 6to4/6rd packets the latest
anti-spoofing commit 218774dc ("ipv6: add anti-spoofing checks for
6to4 and 6rd") will drop them because the IPv6 address embedded does
not match the IPv4 destination. This patch will allow them to pass by
testing if we have an address that matches on 6to4/6rd interface.  I
have been hit by this problem using Fedora and IPV6TO4_IPV4ADDR.
Also, log the dropped packets (with rate limit).

Signed-off-by: Catalin(ux) M. BOIE <catab@embedromix.ro>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:56:15 -04:00
Hannes Frederic Sowa
0a67d3efa4 ipv6: compare sernum when walking fib for /proc/net/ipv6_route as safety net
This patch provides an additional safety net against NULL
pointer dereferences while walking the fib trie for the new
/proc/net/ipv6_route walkers. I never needed it myself and am unsure
if it is needed at all, but the same checks where introduced in
2bec5a369e ("ipv6: fib: fix crash when
changing large fib while dumping it") to fix NULL pointer bugs.

This patch is separated from the first patch to make it easier to revert
if we are sure we can drop this logic.

Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-27 17:32:17 -04:00
Hannes Frederic Sowa
8d2ca1d7b5 ipv6: avoid high order memory allocations for /proc/net/ipv6_route
Dumping routes on a system with lots rt6_infos in the fibs causes up to
11-order allocations in seq_file (which fail). While we could switch
there to vmalloc we could just implement the streaming interface for
/proc/net/ipv6_route. This patch switches /proc/net/ipv6_route from
single_open_net to seq_open_net.

loff_t *pos tracks dst entries.

Also kill never used struct rt6_proc_arg and now unused function
fib6_clean_all_ro.

Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-27 17:32:16 -04:00
John W. Linville
0a878747e1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
Also fixed-up a badly indented closing brace...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-09-27 13:11:17 -04:00
Gustavo Padovan
1025c04cec Merge git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Conflicts:
	net/bluetooth/hci_core.c
2013-09-27 11:56:14 -03:00
Gao feng
7722e0d1c0 netfilter: xt_TCPMSS: lookup route from proper net namespace
Otherwise the pmtu will be incorrect.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-09-27 16:18:23 +02:00
Gao feng
de1389b116 netfilter: xt_TCPMSS: Get mtu only if clamp-mss-to-pmtu is specified
This patch refactors the code to skip tcpmss_reverse_mtu if no
clamp-mss-to-pmtu is specified.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-09-27 16:17:59 +02:00
holger@eitzenberger.org
b21613aeb6 netfilter: nf_ct_sip: extend RCU read lock in set_expected_rtp_rtcp()
Currently set_expected_rtp_rtcp() in the SIP helper uses
rcu_dereference() two times to access two different NAT hook
functions. However, only the first one is protected by the RCU
reader lock, but the 2nd isn't. Fix it by extending the RCU
protected area.

This is more a cosmetic thing since we rely on all netfilter hooks
being rcu_read_lock()ed by nf_hook_slow() in many places anyways,
as Patrick McHardy clarified.

Signed-off-by: Holger Eitzenberger <holger.eitzenberger@sophos.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-09-27 16:17:47 +02:00
Tejun Heo
58292cbe66 sysfs: make attr namespace interface less convoluted
sysfs ns (namespace) implementation became more convoluted than
necessary while trying to hide ns information from visible interface.
The relatively recent attr ns support is a good example.

* attr ns tag is determined by sysfs_ops->namespace() callback while
  dir tag is determined by kobj_type->namespace().  The placement is
  arbitrary.

* Instead of performing operations with explicit ns tag, the namespace
  callback is routed through sysfs_attr_ns(), sysfs_ops->namespace(),
  class_attr_namespace(), class_attr->namespace().  It's not simpler
  in any sense.  The only thing this convolution does is traversing
  the whole stack backwards.

The namespace callbacks are unncessary because the operations involved
are inherently synchronous.  The information can be provided in in
straight-forward top-down direction and reversing that direction is
unnecessary and against basic design principles.

This backward interface is unnecessarily convoluted and hinders
properly separating out sysfs from driver model / kobject for proper
layering.  This patch updates attr ns support such that

* sysfs_ops->namespace() and class_attr->namespace() are dropped.

* sysfs_{create|remove}_file_ns(), which take explicit @ns param, are
  added and sysfs_{create|remove}_file() are now simple wrappers
  around the ns aware functions.

* ns handling is dropped from sysfs_chmod_file().  Nobody uses it at
  this point.  sysfs_chmod_file_ns() can be added later if necessary.

* Explicit @ns is propagated through class_{create|remove}_file_ns()
  and netdev_class_{create|remove}_file_ns().

* driver/net/bonding which is currently the only user of attr
  namespace is updated to use netdev_class_{create|remove}_file_ns()
  with @bh->net as the ns tag instead of using the namespace callback.

This patch should be an equivalent conversion without any functional
difference.  It makes the code easier to follow, reduces lines of code
a bit and helps proper separation and layering.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26 14:50:01 -07:00
Veaceslav Falico
5831d66e80 net: create sysfs symlinks for neighbour devices
Also, remove the same functionality from bonding - it will be already done
for any device that links to its lower/upper neighbour.

The links will be created for dev's kobject, and will look like
lower_eth0 for lower device eth0 and upper_bridge0 for upper device
bridge0.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:08 -04:00
Veaceslav Falico
842d67a7b3 net: expose the master link to sysfs, and remove it from bond
Currently, we can have only one master upper neighbour, so it would be
useful to create a symlink to it in the sysfs device directory, the way
that bonding now does it, for every device. Lower devices from
bridge/team/etc will automagically get it, so we could rely on it.

Also, remove the same functionality from bonding.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:08 -04:00
Veaceslav Falico
47701a36a3 vlan: unlink the upper neighbour before unregistering
On netdev unregister we're removing also all of its sysfs-associated stuff,
including the sysfs symlinks that are controlled by netdev neighbour code.
Also, it's a subtle race condition - cause we can still access it after
unregistering.

Move the unlinking right before the unregistering to fix both.

CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:07 -04:00
Veaceslav Falico
5df27e6cb2 vlan: link the upper neighbour only after registering
Otherwise users might access it without being fully registered, as per
sysfs - it only inits in register_netdevice(), so is unusable till it is
called.

CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:07 -04:00
Veaceslav Falico
b6ccba4c68 net: add a possibility to get private from netdev_adjacent->list
It will be useful to get first/last element.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:06 -04:00
Veaceslav Falico
31088a113c net: add for_each iterators through neighbour lower link's private
Add a possibility to iterate through netdev_adjacent's private, currently
only for lower neighbours.

Add both RCU and RTNL/other locking variants of iterators, and make the
non-rcu variant to be safe from removal.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:04 -04:00
Veaceslav Falico
402dae9614 net: add netdev_adjacent->private and allow to use it
Currently, even though we can access any linked device, we can't attach
anything to it, which is vital to properly manage them.

To fix this, add a new void *private to netdev_adjacent and functions
setting/getting it (per link), so that we can save, per example, bonding's
slave structures there, per slave device.

netdev_master_upper_dev_link_private(dev, upper_dev, private) links dev to
upper dev and populates the neighbour link only with private.

netdev_lower_dev_get_private{,_rcu}() returns the private, if found.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:04 -04:00
Veaceslav Falico
5249dec738 net: add RCU variant to search for netdev_adjacent link
Currently we have only the RTNL flavour, however we can traverse it while
holding only RCU, so add the RCU search. Add an RCU variant that uses
list_head * as an argument, so that it can be universally used afterwards.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:04 -04:00
Veaceslav Falico
2f268f129c net: add adj_list to save only neighbours
Currently, we distinguish neighbours (first-level linked devices) from
non-neighbours by the neighbour bool in the netdev_adjacent. This could be
quite time-consuming in case we would like to traverse *only* through
neighbours - cause we'd have to traverse through all devices and check for
this flag, and in a (quite common) scenario where we have lots of vlans on
top of bridge, which is on top of a bond - the bonding would have to go
through all those vlans to get its upper neighbour linked devices.

This situation is really unpleasant, cause there are already a lot of cases
when a device with slaves needs to go through them in hot path.

To fix this, introduce a new upper/lower device lists structure -
adj_list, which contains only the neighbours. It works always in
pair with the all_adj_list structure (renamed from upper/lower_dev_list),
i.e. both of them contain the same links, only that all_adj_list contains
also non-neighbour device links. It's really a small change visible,
currently, only for __netdev_adjacent_dev_insert/remove(), and doesn't
change the main linked logic at all.

Also, add some comments a fix a name collision in
netdev_for_each_upper_dev_rcu() and rework the naming by the following
rules:

netdev_(all_)(upper|lower)_*

If "all_" is present, then we work with the whole list of upper/lower
devices, otherwise - only with direct neighbours. Uninline functions - to
get better stack traces.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:04 -04:00
Veaceslav Falico
7863c054d1 net: use lists as arguments instead of bool upper
Currently we make use of bool upper when we want to specify if we want to
work with upper/lower list. It's, however, harder to read, debug and
occupies a lot more code.

Fix this by just passing the correct upper/lower_dev_list list_head pointer
instead of bool upper, and work internally with it.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:03 -04:00
Johannes Berg
aa5f66d5a1 cfg80211: fix sysfs registration race
My locking rework/race fixes caused a regression in the
registration, causing uevent notifications for wireless
devices before the device is really fully registered and
available in nl80211.

Fix this by moving the device_add() under rtnl and move
the rfkill to afterwards (it can't be under rtnl.)

Reported-and-tested-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-09-26 20:03:45 +02:00
Chun-Yeow Yeoh
cc63ec766b mac80211: fix the setting of extended supported rate IE
The patch "mac80211: select and adjust bitrates according to
channel mode" causes regression and breaks the extended supported rate
IE setting. Since "i" is starting with 8, so this is not necessary
to introduce "skip" here.

Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@cozybit.com>
Signed-off-by: Colleen Twitty <colleen@cozybit.com>
Reviewed-by: Jason Abele <jason@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-09-26 19:56:59 +02:00
Felix Fietkau
6329b8d917 mac80211: drop spoofed packets in ad-hoc mode
If an Ad-Hoc node receives packets with the Cell ID or its own MAC
address as source address, it hits a WARN_ON in sta_info_insert_check()
With many packets, this can massively spam the logs. One way that this
can easily happen is through having Cisco APs in the area with rouge AP
detection and countermeasures enabled.
Such Cisco APs will regularly send fake beacons, disassoc and deauth
packets that trigger these warnings.

To fix this issue, drop such spoofed packets early in the rx path.

Cc: stable@vger.kernel.org
Reported-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-09-26 19:56:06 +02:00
John W. Linville
7c6a4acc64 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2013-09-26 13:47:05 -04:00