Commit graph

604320 commits

Author SHA1 Message Date
Jakub Kicinski
2370def2e4 nfp: implement ethtool .get_link() callback
Point the ethtool .get_link() callback to the standard
ethtool_op_get_link() implementation.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 09:12:14 -04:00
Jakub Kicinski
f642963bff nfp: remove unused parameter from nfp_net_write_mac_addr()
nfp_net_write_mac_addr() always writes to the BAR the current
device address taken from netdev struct.  The address given
as parameter is actually ignored.  Since all callers pass
netdev->dev_addr simply remove the parameter.

While at it improve the function's kdoc a bit.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 09:12:14 -04:00
Jakub Kicinski
796312cd00 nfp: correct name of control BAR define
Spell abbreviation of control as ctrl not crtl.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 09:12:14 -04:00
Soohoon Lee
43daa96b16 usbnet: Stop RX Q on MTU change
When MTU is changed unlink_urbs() flushes RX Q but mean while usbnet_bh()
can fill up the Q at the same time.
Depends on which HCD is down there unlink takes long time then the flush
never ends.

Signed-off-by: Soohoon Lee <soohoon.lee@f5.com>
Reviewed-by: Kimball Murray <kmurray@f5.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 09:05:05 -04:00
Shmulik Ladkani
fedbb6b4ff ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output
ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
calls ip_sk_use_pmtu which casts sk as an inet_sk).

However, in the case of UDP tunneling, the skb->sk is not necessarily an
inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
tun/tap).

OTOH, the sk passed as an argument throughout IP stack's output path is
the one which is of PMTU interest:
 - In case of local sockets, sk is same as skb->sk;
 - In case of a udp tunnel, sk is the tunneling socket.

Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
This augments 7026b1ddb6 'netfilter: Pass socket pointer down through okfn().'

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 09:02:48 -04:00
David S. Miller
5ab0d6a016 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2016-06-29

This series contains fixes to e1000e and ixgbevf.

Jarod Wilson's fix for e1000e was a follow-on patch to his previous fix to
keep the hardware VLAN CTAG's for receive and transmit in sync, which
in turn resolves the original issue, so revert a portion of the original
fix.

Xin Long noticed that the ret_val needed to be initialized to IXGBE_ERR_MBX,
instead of -IXGBE_ERR_MBX.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:57:24 -04:00
Dan Carpenter
6fde0e63ec be2net: signedness bug in be_msix_enable()
"num_vec" needs to be signed for the error handling to work.

Fixes: e261768e9e ('be2net: support asymmetric rx/tx queue counts')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:54:18 -04:00
Masanari Iida
9b9a553c90 net: netcp: Fix a typo in keystone-netcp.txt
This patch fix a spelling typo in keystone-netcp.txt

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:53:44 -04:00
David S. Miller
833ba3d585 Merge branch 'mediatek-next'
John Crispin says:

====================
net-next: mediatek: IRQ cleanups, fixes and grouping

This series contains 2 small code cleanups that are leftovers from the
MIPS support. There is also a small fix that adds proper locking to the
code accessing the IRQ registers. Without this fix we saw deadlocks caused
by the last patch of the series, which adds IRQ grouping. The grouping
feature allows us to use different IRQs for TX and RX. By doing so we can
use affinity to let the SoC handle the IRQs on different cores.

This series depends on a previous series currently sitting in net.git
starting with
	commit 562c5a7040 ("net: mediatek: only wake the queue if it is stopped")
up to
	commit 82c6544ddd ("net: mediatek: remove superfluous queue wake up call")
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:52:12 -04:00
John Crispin
8067302973 net-next: mediatek: add support for IRQ grouping
The ethernet core has 3 IRQs. Using the IRQ grouping registers we are able
to separate TX and RX IRQs, which allows us to service them on separate
cores. This patch splits the IRQ handler into 2 separate functions, one for
TX and another for RX. The TX housekeeping is split out into its own NAPI
handler.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:52:04 -04:00
John Crispin
7bc9ccec34 net-next: mediatek: add IRQ locking
The code that enables and disables IRQs is missing proper locking. After
adding the IRQ grouping patch and routing the RX and TX IRQs to different
cores we experienced IRQ stalls. Fix this by adding proper locking.
We use a dedicated lock to reduce the latency if the IRQ code.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:52:04 -04:00
John Crispin
eece71e8fb net-next: mediatek: don't use intermediate variables to store IRQ masks
The code currently uses variables to store and never modify the bit masks
of interrupts. This is legacy code from an early version of the driver
that supported MIPS based SoCs where the IRQ bits depended on the actual
SoC. As the bits are the same for all ARM based SoCs using this driver we
can remove the intermediate variables.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:52:04 -04:00
John Crispin
6e6edd8b96 net-next: mediatek: remove superfluous register reads
The driver was originally written for MIPS based SoC. These required the
IRQ mask register to be read after writing it to ensure that the content
was actually applied. As this version only works on ARM based SoCs, we can
safely remove the 2 reads as they are no longer required.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:52:04 -04:00
Mateusz Bajorski
153380ec4b fib_rules: Added NLM_F_EXCL support to fib_nl_newrule
When adding rule with NLM_F_EXCL flag then check if the same rule exist.
If yes then exit with -EEXIST.

This is already implemented in iproute2:
        if (cmd == RTM_NEWRULE) {
                req.n.nlmsg_flags |= NLM_F_CREATE|NLM_F_EXCL;
                req.r.rtm_type = RTN_UNICAST;
        }

Tested ipv4 and ipv6 with net-next linux on qemu x86

expected behavior after patch:
localhost ~ # ip rule
0:    from all lookup local
32766:    from all lookup main
32767:    from all lookup default
localhost ~ # ip rule add from 10.46.177.97 lookup 104 pref 1005
localhost ~ # ip rule add from 10.46.177.97 lookup 104 pref 1005
RTNETLINK answers: File exists
localhost ~ # ip rule
0:    from all lookup local
1005:    from 10.46.177.97 lookup 104
32766:    from all lookup main
32767:    from all lookup default

There was already topic regarding this but I don't see any changes
merged and problem still occurs.
https://lkml.kernel.org/r/1135778809.5944.7.camel+%28%29+localhost+%21+localdomain

Signed-off-by: Mateusz Bajorski <mateusz.bajorski@nokia.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:23:19 -04:00
Mark Brown
2a9b27b326 Merge remote-tracking branches 'spi/fix/ep93xx', 'spi/fix/rockchip', 'spi/fix/sunxi' and 'spi/fix/ti-qspi' into spi-linus 2016-06-30 13:17:29 +01:00
Seymour, Shane M
2631b79f6c tcp: increase size at which tcp_bound_to_half_wnd bounds to > TCP_MSS_DEFAULT
In previous commit 01f83d6984
the following comments were added:

"When peer uses tiny windows, there is no use in packetizing to sub-MSS
pieces for the sake of SWS or making sure there are enough packets in
the pipe for fast recovery."

The test should be > TCP_MSS_DEFAULT not >= 512. This allows low end
devices that send an MSS of 536 (TCP_MSS_DEFAULT) to see better network
performance by sending it 536 bytes of data at a time instead of bounding
to half window size (268). Other network stacks work this way, e.g. HP-UX.

Signed-off-by: Shane Seymour <shane.seymour@hpe.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:17:21 -04:00
Andrey Vagin
b1ed4c4fa9 tcp: add an ability to dump and restore window parameters
We found that sometimes a restored tcp socket doesn't work.

A reason of this bug is incorrect window parameters and in this case
tcp_acceptable_seq() returns tcp_wnd_end(tp) instead of tp->snd_nxt. The
other side drops packets with this seq, because seq is less than
tp->rcv_nxt ( tcp_sequence() ).

Data from a send queue is sent only if there is enough space in a
window, so when we restore unacked data, we need to expand a window to
fit this data.

This was in a first version of this patch:
"tcp: extend window to fit all restored unacked data in a send queue"

Then Alexey recommended me to restore window parameters instead of
adjusted them according with data in a sent queue. This sounds resonable.

rcv_wnd has to be restored, because it was reported to another side
and the offered window is never shrunk.
One of reasons why we need to restore snd_wnd was described above.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:15:31 -04:00
Miklos Szeredi
5c672ab3f0 fuse: serialize dirops by default
Negotiate with userspace filesystems whether they support parallel readdir
and lookup.  Disable parallelism by default for fear of breaking fuse
filesystems.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 9902af79c0 ("parallel lookups: actual switch to rwsem")
Fixes: d9b3dbdcfd ("fuse: switch to ->iterate_shared()")
2016-06-30 13:10:49 +02:00
Wei Yongjun
cab1032743 drm/i915: Fix missing unlock on error in i915_ppgtt_info()
Add the missing unlock before return from function i915_ppgtt_info()
in the error handling case.

Fixes: 1d2ac403ae3b(drm: Protect dev->filelist with its own mutex)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1465861320-26221-1-git-send-email-weiyj_lk@163.com
(cherry picked from commit b021248690)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-30 13:30:15 +03:00
David S. Miller
641f7e405e Merge branch 'bridge-igmp-stats'
Nikolay Aleksandrov says:

====================
net: bridge: add support for IGMP/MLD stats

This patchset adds support for the new IFLA_STATS_LINK_XSTATS_SLAVE
attribute which can be used with RTM_GETSTATS in order to export per-slave
statistics. It works by passing the attribute to the linkxstats callback
and if the callback user supports it - it should dump that slave's stats.
This is much more scalable and permits us to request only a single port's
statistics instead of dumping everything every time.
The second patch adds support for per-port IGMP/MLD statistics and uses
the new API to export them for the bridge and its ports. The stats are
made in a very lightweight manner, the normal fast-path is not affected
at all and the flood paths (br_flood/br_multicast_flood) are only affected
if the packet is IGMP and the IGMP stats have been enabled using cache-hot
data for the check.

v2: Patch 01 is new, patch 02 has been reworked to use the new API, also
in addition counters for IGMP/MLD parse errors have been added and members
are added for per-port multicast traffic stats. The multicast counting has
been slightly optimized (moved the br_multicast_count inside the IPv4/6
IGMP functions after the checks for IGMP traffic) to avoid one conditional
that was on all of the multicast traffic path (both IGMP and other).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:18:32 -04:00
Nikolay Aleksandrov
1080ab95e3 net: bridge: add support for IGMP/MLD stats and export them via netlink
This patch adds stats support for the currently used IGMP/MLD types by the
bridge. The stats are per-port (plus one stat per-bridge) and per-direction
(RX/TX). The stats are exported via netlink via the new linkxstats API
(RTM_GETSTATS). In order to minimize the performance impact, a new option
is used to enable/disable the stats - multicast_stats_enabled, similar to
the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
lookups and checks, we make use of the current "igmp" member of the bridge
private skb->cb region to record the type on Rx (both host-generated and
external packets pass by multicast_rcv()). We can do that since the igmp
member was used as a boolean and all the valid IGMP/MLD types are positive
values. The normal bridge fast-path is not affected at all, the only
affected paths are the flooding ones and since we make use of the IGMP/MLD
type, we can quickly determine if the packet should be counted using
cache-hot data (cb's igmp member). We add counters for:
* IGMP Queries
* IGMP Leaves
* IGMP v1/v2/v3 reports

* MLD Queries
* MLD Leaves
* MLD v1/v2 reports

These are invaluable when monitoring or debugging complex multicast setups
with bridges.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:18:24 -04:00
Nikolay Aleksandrov
80e73cc563 net: rtnetlink: add support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
This patch adds support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
which allows to export per-slave statistics if the master device supports
the linkxstats callback. The attribute is passed down to the linkxstats
callback and it is up to the callback user to use it (an example has been
added to the only current user - the bridge). This allows us to query only
specific slaves of master devices like bridge ports and export only what
we're interested in instead of having to dump all ports and searching only
for a single one. This will be used to export per-port IGMP/MLD stats and
also per-port vlan stats in the future, possibly other statistics as well.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:15:04 -04:00
Michal Kazior
59a7c828d7 mac80211: fix fq lockdep warnings
Some lockdep assertions were not fulfilled and
resulted in a kernel warning/call trace if driver
used intermediate software queues (e.g. ath10k).

Existing code sequences should've guaranteed safety
but it's always good to be extra careful.

The call trace could look like this:

 [ 237.335805] ------------[ cut here ]------------
 [ 237.335852] WARNING: CPU: 3 PID: 1921 at include/net/fq_impl.h:22 fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.335855] Modules linked in: ath10k_pci(E-) ath10k_core(E) ath(E) mac80211(E) cfg80211(E)
 [ 237.335913] CPU: 3 PID: 1921 Comm: rmmod Tainted: G        W   E   4.7.0-rc4-wt-ath+ #1377
 [ 237.335916] Hardware name: Hewlett-Packard HP ProBook 6540b/1722, BIOS 68CDD Ver. F.04 01/27/2010
 [ 237.335918]  00200286 00200286 eff85dac c14151e2 f901574e 00000000 eff85de0 c1081075
 [ 237.335928]  c1ab91f0 00000003 00000781 f901574e 00000016 f8fbabad f8fbabad 00000016
 [ 237.335938]  eb24ff60 00000000 ef3886c0 eff85df4 c10810ba 00000009 00000000 00000000
 [ 237.335948] Call Trace:
 [ 237.335953]  [<c14151e2>] dump_stack+0x76/0xb4
 [ 237.335957]  [<c1081075>] __warn+0xe5/0x100
 [ 237.336002]  [<f8fbabad>] ? fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.336046]  [<f8fbabad>] ? fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.336053]  [<c10810ba>] warn_slowpath_null+0x2a/0x30
 [ 237.336095]  [<f8fbabad>] fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.336137]  [<f8fbc67a>] fq_flow_reset.constprop.56+0x2a/0x90 [mac80211]
 [ 237.336180]  [<f8fbc79a>] fq_reset.constprop.59+0x2a/0x50 [mac80211]
 [ 237.336222]  [<f8fc04e8>] ieee80211_txq_teardown_flows+0x38/0x40 [mac80211]
 [ 237.336258]  [<f8f7c1a4>] ieee80211_unregister_hw+0xe4/0x120 [mac80211]
 [ 237.336275]  [<f933f536>] ath10k_mac_unregister+0x16/0x50 [ath10k_core]
 [ 237.336292]  [<f934592d>] ath10k_core_unregister+0x3d/0x90 [ath10k_core]
 [ 237.336301]  [<f85f8836>] ath10k_pci_remove+0x36/0xa0 [ath10k_pci]
 [ 237.336307]  [<c1470388>] pci_device_remove+0x38/0xb0
 ...

Fixes: 5caa328e38 ("mac80211: implement codel on fair queuing flows")
Fixes: fa962b9212 ("mac80211: implement fair queueing per txq")
Tested-by: Kalle Valo <kvalo@qca.qualcomm.com>
Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:07:44 +02:00
Bob Copeland
efc401f49a mac80211: use common cleanup for user/!user_mpm
We've accumulated a couple of different fixes now to mesh_sta_cleanup()
due to the different paths that user_mpm and !user_mpm cases take -- one
fix to flush nexthop paths and one to fix the counting.

The only caller of mesh_plink_deactivate() is mesh_sta_cleanup(), so we
can push the user_mpm checks down into there in order to share more
code.

In doing so, we can remove an extra call to mesh_path_flush_by_nexthop()
and the (unnecessary) call to mesh_accept_plinks_update().  This will
also ensure the powersaving state code gets called in the user_mpm case.

The only cleanup tasks we need to avoid when MPM is in user-space
are sending the peering frames and stopping the plink timer, so wrap
those in the appropriate check.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:41 +02:00
Masashi Honma
46f6b06050 mac80211: Encrypt "Group addressed privacy" action frames
Previously, the action frames to group address was not encrypted. But
[1] "Table 8-38 Category values" indicates "Mesh" and "Multihop" category
action frames should be encrypted (Group addressed privacy == yes). And the
encyption key should be MGTK ([1] 10.13 Group addressed robust management frame
procedures). So this patch modifies the code to make it suitable for spec.

[1] IEEE Std 802.11-2012

Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:20 +02:00
Dan Carpenter
49708e3772 mac80211: silence an uninitialized variable warning
We normally return an uninitialized value, but no one checks it so it
doesn't matter.  Anyway, let's silence the static checker warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:19 +02:00
Masashi Honma
e98e915e11 wireless: Use macro instead of number
Use IEEE80211_MIN_ACTION_SIZE macro for robust management frame check.

Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:18 +02:00
Arnd Bergmann
f151d9db4c nl80211: improve nl80211_parse_mesh_config type checking
When building a kernel with W=1, the nl80211.c file causes a number of
warnings, all about the same problem:

net/wireless/nl80211.c: In function 'nl80211_parse_mesh_config':
net/wireless/nl80211.c:5287:103: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5290:96: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5293:124: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5295:148: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5298:106: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5305:116: error: comparison is always false due to limited range of data type [-Werror=type-limits]

The problem is that gcc does not notice that the check is generate
by a macro, so it complains about comparing an unsigned type against 0.

I've tried to come up with a way to rephrase that code in a way that
avoids the warnings and otherwise improves the code as well.

This uses a set of new helper functions that perform the range checking,
and should provide slightly better type safety than the older patch,
at the expense of adding 44 lines to the code. Binary code size is
basically unchanged though (20 bytes added to 126561 bytes .text).

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:18 +02:00
Martin Willi
f21e4d8ed1 mac80211_hwsim: Allow wmediumd to attach to radios created in its netns
Registering wmediumd is currently limited to the initial network
namespace. This patch enables wmediumd to attach from non-initial
network namespaces using a user namespace having CAP_NET_ADMIN. A
registered wmediumd can forward frames on radios that have been created
in the same network namespace, even if they have been moved to other
network namespaces.

The wmediumd Netlink portid is tracked per net namespace. Additionally,
the portid is stored on all radios created in that net namespace to
simplify the portid lookup in the data path.

Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-06-30 12:06:17 +02:00
David S. Miller
545c321ba3 Merge branch 'bpf-helper-improvements'
Daniel Borkmann says:

====================
BPF helper improvements

This set adds various BPF helper improvements, that is, cleaning
up and adding BPF_F_CURRENT_CPU flag for tracing helper, allowing
for preemption checks on bpf_get_smp_processor_id() helper, and
adding two new helpers bpf_skb_change_{proto, type} for tc related
programs. For further details please see individual patches.

Note, this set requires -net to be merged into -net-next tree first.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:48 -04:00
Daniel Borkmann
d2485c4242 bpf: add bpf_skb_change_type helper
This work adds a helper for changing skb->pkt_type in a controlled way.
We only allow a subset of possible values and can extend that in future
should other use cases come up. Doing this as a helper has the advantage
that errors can be handeled gracefully and thus helper kept extensible.

It's a write counterpart to pkt_type member we can already read from
struct __sk_buff context. Major use case is to change incoming skbs to
PACKET_HOST in a programmatic way instead of having to recirculate via
redirect(..., BPF_F_INGRESS), for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
6578171a7f bpf: add bpf_skb_change_proto helper
This patch adds a minimal helper for doing the groundwork of changing
the skb->protocol in a controlled way. Currently supported is v4 to
v6 and vice versa transitions, which allows f.e. for a minimal, static
nat64 implementation where applications in containers that still
require IPv4 can be transparently operated in an IPv6-only environment.
For example, host facing veth of the container can transparently do
the transitions in a programmatic way with the help of clsact qdisc
and cls_bpf.

Idea is to separate concerns for keeping complexity of the helper
lower, which means that the programs utilize bpf_skb_change_proto(),
bpf_skb_store_bytes() and bpf_lX_csum_replace() to get the job done,
instead of doing everything in a single helper (and thus partially
duplicating helper functionality). Also, bpf_skb_change_proto()
shouldn't need to deal with raw packet data as this is done by other
helpers.

bpf_skb_proto_6_to_4() and bpf_skb_proto_4_to_6() unclone the skb to
operate on a private one, push or pop additionally required header
space and migrate the gso/gro meta data from the shared info. We do
mark the gso type as dodgy so that headers are checked and segs
recalculated by the gso/gro engine. The gso_size target is adapted
as well. The flags argument added is currently reserved and can be
used for future extensions.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
80b48c4457 bpf: don't use raw processor id in generic helper
Use smp_processor_id() for the generic helper bpf_get_smp_processor_id()
instead of the raw variant. This allows for preemption checks when we
have DEBUG_PREEMPT, and otherwise uses the raw variant anyway. We only
need to keep the raw variant for socket filters, but we can reuse the
helper that is already there from cBPF side.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
6816a7ffce bpf, trace: add BPF_F_CURRENT_CPU flag for bpf_perf_event_read
Follow-up commit to 1e33759c78 ("bpf, trace: add BPF_F_CURRENT_CPU
flag for bpf_perf_event_output") to add the same functionality into
bpf_perf_event_read() helper. The split of index into flags and index
component is also safe here, since such large maps are rejected during
map allocation time.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
d793133031 bpf, trace: fetch current cpu only once
We currently have two invocations, which is unnecessary. Fetch it only
once and use the smp_processor_id() variant, so we also get preemption
checks along with it when DEBUG_PREEMPT is set.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
1ca1cc98bf bpf: minor cleanups on fd maps and helpers
Some minor cleanups: i) Remove the unlikely() from fd array map lookups
and let the CPU branch predictor do its job, scenarios where there is not
always a map entry are very well valid. ii) Move the attribute type check
in the bpf_perf_event_read() helper a bit earlier so it's consistent wrt
checks with bpf_perf_event_output() helper as well. iii) remove some
comments that are self-documenting in kprobe_prog_is_valid_access() and
therefore make it consistent to tp_prog_is_valid_access() as well.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
David S. Miller
ee58b57100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, except the packet scheduler
conflicts which deal with the addition of the free list parameter
to qdisc_enqueue().

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:03:36 -04:00
Sven Eckelmann
a2d0816608 batman-adv: Fix bat_(iv|v) function declaration header
The bat_algo.h had some functions declared which were not part of the
bat_algo.c file. These are instead stored in bat_v.c and bat_iv_ogm.c. The
declaration should therefore be also in bat_v.h and bat_iv_ogm,h to make
them easier to find.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
4e3e823b5a batman-adv: Add debugfs table for mcast flags
This patch adds a debugfs table with originators and their according
multicast flags to help users figure out why multicast optimizations
might be enabled or disabled for them.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
ba412080fb batman-adv: Consolidate logging related functions
There are several places in batman-adv which provide logging related
functions. These should be grouped together in the log.* files to make them
easier to find.

Reported-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
72f7b2deaf batman-adv: Adding logging of mcast flag changes
With this patch changes relevant to a node's own multicast flags are
printed to the 'mcast' log level.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
01d350d147 batman-adv: move bat_algo functions into a separate file
The bat_algo functionality in main.c is mostly unrelated to the rest of the
content. It still takes up a large portion of this source file (~15%, 103
lines). Moving it to a separate file makes it better visible as a main
component of the batman-adv implementation and hides it less in the other
helper functions in main.c.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
687937ab34 batman-adv: Add multicast optimization support for bridged setups
With this patch we are finally able to support multicast optimizations
in bridged setups, too. So far, if a bridge was added on top of a
soft-interface (e.g. bat0) the batman-adv multicast optimizations
needed to be disabled to avoid packetloss.

Current Linux bridge implementations and API can now provide us
with the so far missing information about interested but "remote"
multicast receivers behind bridge ports.

The Linux bridge performs the detection of remote participants
interested in multicast packets with its own and mature so
called IGMP and MLD snooping code and stores that in its
database. With the new API provided by the bridge batman-adv can
now simply hook into this database.

We then reliably announce the gathered multicast listeners to
other nodes through the batman-adv translation table.

Additionally, the Linux bridge provides us with the information about
whether an IGMP/MLD querier exists. If there is none then we need to
disable multicast optimizations as we cannot learn about multicast
listeners on external, bridged-in host then.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Markus Pargmann
1f8dce4992 batman-adv: split tvlv into a separate file
The tvlv functionality in main.c is mostly unrelated to the rest of the
content. It still takes up a large portion of this source file (~45%, 588
lines). Moving it to a separate file makes it better visible as a main
component of the batman-adv implementation and hides it less in the other
helper functions in main.c

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
[sven@narfation.org: fix conflicts with current version, fix includes,
rewrote commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
bd2a979e53 batman-adv: Always flood IGMP/MLD reports
With this patch IGMP or MLD reports are always flooded. This is
necessary for the upcoming bridge integration to function without
multicast packet loss.

With the report handling so far bridges might miss interested multicast
listeners, leading to wrongly excluding ports from multicast packet
forwarding.

Currently we are treating IGMP/MLD reports, the messages bridges use to
learn about interested multicast listeners, just as any other multicast
packet: We try to send them to nodes matching its multicast destination.

Unfortunately, the destination address of reports of the older
IGMPv2/MLDv1 protocol families do not strictly adhere to their own
protocol: More precisely, the interested receiver, an IGMPv2 or MLDv1
querier, itself usually does not listen to the multicast destination
address of any reports.

Therefore with this patch we are simply excluding IGMP/MLD reports from
the multicast forwarding code path and keep flooding them. By that
any bridge receives them and can properly learn about listeners.

To avoid compatibility issues with older nodes not yet implementing this
report handling, we need to force them to flood reports: We do this by
bumping the multicast TVLV version to 2, effectively disabling their
multicast optimization.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
fcafa5e74b batman-adv: Keep includes ordered by filename
It is easier to detect if a include is already there for a used
functionality when the includes are ordered. Using an alphabetic order
together with the grouping in commit 1e2c2a4fe4 ("batman-adv: Add
required includes to all files") makes includes better manageable.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Andrew Lunn
c0f25c802b batman-adv: Include frame priority in fragment header
Unfragmented frames which traverse a node have their skb->priority set
by looking at the IP ToS byte, or the 802.1p header. However for
fragments this is not possible, only one of the fragments will contain
the headers. Instead, place the priority into the fragment header and
on receiving a fragment, use this information to set the skb->priority
for when the fragment is forwarded.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
67a5613ed0 batman-adv: Include main.h in bat_v_ogm.h
main.h includes statements which (re)define preprocessor variables which
influence the compiled code. This makes it necessary to include it in all
files. For example, it redefines pr_fmt used to the module as prefix for
each pr_* message.

Reported-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Andrew Lunn
1914848e0d batman-adv: Set skb priority in fragments
BATMAN will set the skb->priority based on the IP precedence or 802.1q
tag. However, if it needs to fragment the frame, it currently leaves
the fragment skb with the default priority and actually overwrites the
priority in the unfragmented frame. Fix this.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Marek Lindner
7db682d1c3 batman-adv: init ELP tweaking options only once
The ELP interval and throughput override interface settings are initialized
with default settings on every time an interface is added to a mesh.
This patch prevents this behavior by moving the configuration init to the
interface detection routine which runs only once per interface.

Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
[a@unstable.cc: move initialization to batadv_v_hardif_init]
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00