Commit graph

52,701 commits

Author SHA1 Message Date
Xin Long
d30fc5126e sctp: only update outstanding_bytes for transmitted queue when doing prsctp_prune
Now outstanding_bytes is only increased when appending chunks into one
packet and sending it at 1st time, while decreased when it is about to
move into retransmit queue. It means outstanding_bytes value is already
decreased for all chunks in retransmit queue.

However sctp_prsctp_prune_sent is a common function to check the chunks
in both transmitted and retransmit queue, it decrease outstanding_bytes
when moving a chunk into abandoned queue from either of them.

It could cause outstanding_bytes underflow, as it also decreases it's
value for the chunks in retransmit queue.

This patch fixes it by only updating outstanding_bytes for transmitted
queue when pruning queues for prsctp prio policy, the same fix is also
needed in sctp_check_transmitted.

Fixes: 8dbdf1f5b0 ("sctp: implement prsctp PRIO policy")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-01 15:06:23 -05:00
Sven Eckelmann
198a62ddff batman-adv: Fix check of retrieved orig_gw in batadv_v_gw_is_eligible
The batadv_v_gw_is_eligible function already assumes that orig_node is not
NULL. But batadv_gw_node_get may have failed to find the originator. It
must therefore be checked whether the batadv_gw_node_get failed and not
whether orig_node is NULL to detect this error.

Fixes: 50164d8f50 ("batman-adv: B.A.T.M.A.N. V - implement GW selection logic")
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-12-01 12:37:58 +01:00
Sven Eckelmann
fe77d8257c batman-adv: Always initialize fragment header priority
The batman-adv unuicast fragment header contains 3 bits for the priority of
the packet. These bits will be initialized when the skb->priority contains
a value between 256 and 263. But otherwise, the uninitialized bits from the
stack will be used.

Fixes: c0f25c802b ("batman-adv: Include frame priority in fragment header")
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-12-01 12:37:58 +01:00
Steffen Klassert
ddc47e4404 xfrm: Fix stack-out-of-bounds read on socket policy lookup.
When we do tunnel or beet mode, we pass saddr and daddr from the
template to xfrm_state_find(), this is ok. On transport mode,
we pass the addresses from the flowi, assuming that the IP
addresses (and address family) don't change during transformation.
This assumption is wrong in the IPv4 mapped IPv6 case, packet
is IPv4 and template is IPv6.

Fix this by catching address family missmatches of the policy
and the flow already before we do the lookup.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-01 08:18:36 +01:00
Aviv Heller
4ce3dbe397 xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)
Code path when (encap_type < 0) does not verify the state is valid
before progressing.

This will result in a crash if, for instance, x->km.state ==
XFRM_STATE_ACQ.

Fixes: 7785bba299 ("esp: Add a software GRO codepath")
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-01 07:58:53 +01:00
Aviv Heller
9b7e14dba0 xfrm: Remove redundant state assignment in xfrm_input()
x is already initialized to the same value, above.

Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-01 07:41:48 +01:00
Yossef Efraim
43024b9ccd xfrm: Fix xfrm_dev_state_add to fail for unsupported HW SA option
xfrm_dev_state_add function returns success for unsupported HW SA options.
Resulting the calling function to create SW SA without corrlating HW SA.
Desipte IPSec device offloading option was chosen.
These not supported HW SA options are hard coded within xfrm_dev_state_add
function.
SW backward compatibility will break if we add any of these option as old
HW will fail with new SW.

This patch changes the behaviour to return -EINVAL in case unsupported
option is chosen.
Notifying user application regarding failure and not breaking backward
compatibility for newly added HW SA options.

Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-01 07:10:01 +01:00
Yossef Efraim
0ba23a2113 xfrm: Fix xfrm_replay_overflow_offload_esn
In case of wrap around, replay_esn->oseq_hi is not updated
before it is tested for it's actual value, leading function
to fail with overflow indication and packets being dropped.

This patch updates replay_esn->oseq_hi in the right place.

Fixes: d7dbefc45c ("xfrm: Add xfrm_replay_overflow functions for offloading")
Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-01 07:10:01 +01:00
Michal Kubecek
e719135881 xfrm: fix XFRMA_OUTPUT_MARK policy entry
This seems to be an obvious typo, NLA_U32 is type of the attribute, not its
(minimal) length.

Fixes: 077fbac405 ("net: xfrm: support setting an output mark.")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-01 06:50:30 +01:00
Trond Myklebust
eb5b46faa6 SUNRPC: Handle ENETDOWN errors
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-11-30 11:52:52 -05:00
Paolo Abeni
e94a62f507 net/reuseport: drop legacy code
Since commit e32ea7e747 ("soreuseport: fast reuseport UDP socket
selection") and commit c125e80b88 ("soreuseport: fast reuseport
TCP socket selection") the relevant reuseport socket matching the current
packet is selected by the reuseport_select_sock() call. The only
exceptions are invalid BPF filters/filters returning out-of-range
indices.
In the latter case the code implicitly falls back to using the hash
demultiplexing, but instead of selecting the socket inside the
reuseport_select_sock() function, it relies on the hash selection
logic introduced with the early soreuseport implementation.

With this patch, in case of a BPF filter returning a bad socket
index value, we fall back to hash-based selection inside the
reuseport_select_sock() body, so that we can drop some duplicate
code in the ipv4 and ipv6 stack.

This also allows faster lookup in the above scenario and will allow
us to avoid computing the hash value for successful, BPF based
demultiplexing - in a later patch.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 10:56:32 -05:00
Hangbin Liu
f859b4af1c sit: update frag_off info
After parsing the sit netlink change info, we forget to update frag_off in
ipip6_tunnel_update(). Fix it by assigning frag_off with new value.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 10:25:41 -05:00
Eric Dumazet
3016dad75b tcp: remove buggy call to tcp_v6_restore_cb()
tcp_v6_send_reset() expects to receive an skb with skb->cb[] layout as
used in TCP stack.
MD5 lookup uses tcp_v6_iif() and tcp_v6_sdif() and thus
TCP_SKB_CB(skb)->header.h6

This patch probably fixes RST packets sent on behalf of a timewait md5
ipv6 socket.

Before Florian patch, tcp_v6_restore_cb() was needed before jumping to
no_tcp_socket label.

Fixes: 271c3b9b7b ("tcp: honour SO_BINDTODEVICE for TW_RST case too")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 10:21:44 -05:00
Cong Wang
90a6ec8535 act_sample: get rid of tcf_sample_cleanup_rcu()
Similar to commit d7fb60b9ca ("net_sched: get rid of tcfa_rcu"),
TC actions don't need to respect RCU grace period, because it
is either just detached from tc filter (standalone case) or
it is removed together with tc filter (bound case) in which case
RCU grace period is already respected at filter layer.

Fixes: 5c5670fae4 ("net/sched: Introduce sample tc action")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 10:19:17 -05:00
David S. Miller
6c9257a708 RxRPC fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWh7KoPSw1s6N8H32AQJO/BAAkMTbBkXmSsUBvWSxdfbFr8RT6ZVr7M0f
 DPYeSC/kJJSuY0JI4xB6QUkg98G+H0kmdtPdJlrqgyC+kH6hpXFT9A7NqFufvjdz
 e5jjLN0WSnpmGJ4c8wGac/ER3/gWm3kaDeXabkNwf6oBICh4xRzpVnx2vAnETNbj
 ExhUoPtxI1QE+xPlNlrFYpA9XmBMoyvlXaUvwBMB8DwBhhsimWSVIWrhLavjbKKt
 dENhF6CanO9vez1QabEQFflWhW5VPARBlgR4sXZ/K4qYpwiKNPNs2TBiKJ6vfq4F
 ck8IbDj4U49TDnxTvNJdXKLh2vxlSIyFocKqMxb9zHFU/HMvL2h+K6N4dq9MCG4o
 5oS9ZQBbxTxxILr27yGQdVxA31MQ3IoGDGa7TAPnFVHduTjpv87nawVghY+dOZQE
 FXvzaUMjmL949ipaeIPstCtVbRSQT6tDxEu3iUsAIQqdy7gEFyTIr0x1GGunXYci
 pJVsmbC7L/F9FD9uITmBoViRP8eZMNKHAn5R8NeQsL8ylCFlc3ITM1TKIy8RgTmy
 V3XKmxCYCJab+gSgQRe2fsomyFtOKNkhzCUNjKjG5+gAt+dd4C1WFRjAlrbE5rQ8
 l5xI8swerULeSBYZmqTwzPM6iaJgUm5nu5qUn6chV2bXKUAGBGiUSFO//Xp1xE+o
 qxdcoS7MIXY=
 =cyW0
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-fixes-20171129' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes

Here are three patches for AF_RXRPC.  One removes some whitespace, one
fixes terminal ACK generation and the third makes a couple of places
actually use the timeout value just determined rather than ignoring it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 10:07:34 -05:00
David Miller
7149f813d1 net: Remove dst->next
There are no more users.

Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:27 -05:00
David Miller
5492093dc4 xfrm: Stop using dst->next in bundle construction.
While building ipsec bundles, blocks of xfrm dsts are linked together
using dst->next from bottom to the top.

The only thing this is used for is initializing the pmtu values of the
xfrm stack, and for updating the mtu values at xfrm_bundle_ok() time.

The bundle pmtu entries must be processed in this order so that pmtu
values lower in the stack of routes can propagate up to the higher
ones.

Avoid using dst->next by simply maintaining an array of dst pointers
as we already do for the xfrm_state objects when building the bundle.

Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:27 -05:00
David Miller
0f6c480f23 xfrm: Move dst->path into struct xfrm_dst
The first member of an IPSEC route bundle chain sets it's dst->path to
the underlying ipv4/ipv6 route that carries the bundle.

Stated another way, if one were to follow the xfrm_dst->child chain of
the bundle, the final non-NULL pointer would be the path and point to
either an ipv4 or an ipv6 route.

This is largely used to make sure that PMTU events propagate down to
the correct ipv4 or ipv6 route.

When we don't have the top of an IPSEC bundle 'dst->path == dst'.

Move it down into xfrm_dst and key off of dst->xfrm.

Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:26 -05:00
David Miller
3a2232e92e ipv6: Move dst->from into struct rt6_info.
The dst->from value is only used by ipv6 routes to track where
a route "came from".

Any time we clone or copy a core ipv6 route in the ipv6 routing
tables, we have the copy/clone's ->from point to the base route.

This is used to handle route expiration properly.

Only ipv6 uses this mechanism, and only ipv6 code references
it.  So it is safe to move it into rt6_info.

Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:26 -05:00
David Miller
b6ca8bd5a9 xfrm: Move child route linkage into xfrm_dst.
XFRM bundle child chains look like this:

	xdst1 --> xdst2 --> xdst3 --> path_dst

All of xdstN are xfrm_dst objects and xdst->u.dst.xfrm is non-NULL.
The final child pointer in the chain, here called 'path_dst', is some
other kind of route such as an ipv4 or ipv6 one.

The xfrm output path pops routes, one at a time, via the child
pointer, until we hit one which has a dst->xfrm pointer which
is NULL.

We can easily preserve the above mechanisms with child sitting
only in the xfrm_dst structure.  All children in the chain
before we break out of the xfrm_output() loop have dst->xfrm
non-NULL and are therefore xfrm_dst objects.

Since we break out of the loop when we find dst->xfrm NULL, we
will not try to dereference 'dst' as if it were an xfrm_dst.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 09:54:26 -05:00
David Miller
45b018bedd ipsec: Create and use new helpers for dst child access.
This will make a future change moving the dst->child pointer less
invasive.

Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:26 -05:00
David Miller
b92cf4aab8 net: Create and use new helper xfrm_dst_child().
Only IPSEC routes have a non-NULL dst->child pointer.  And IPSEC
routes are identified by a non-NULL dst->xfrm pointer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 09:54:25 -05:00
David Miller
071fb37ec4 ipv6: Move rt6_next from dst_entry into ipv6 route structure.
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:25 -05:00
David Miller
fe736e778c decnet: Move dn_next into decnet route structure.
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
2017-11-30 09:54:25 -05:00
Tina Ruchandani
d750dbdc07 atm: mpoa: remove 32-bit timekeeping
net/atm/mpoa_* files use 'struct timeval' to store event
timestamps. struct timeval uses a 32-bit seconds field which will
overflow in the year 2038 and beyond. Morever, the timestamps are being
compared only to get seconds elapsed, so struct timeval which stores
a seconds and microseconds field is an overkill. This patch replaces
the use of struct timeval with time64_t to store a 64-bit seconds field.

Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 09:26:32 -05:00
Arnd Bergmann
311af51dcb openvswitch: use ktime_get_ts64() instead of ktime_get_ts()
timespec is deprecated because of the y2038 overflow, so let's convert
this one to ktime_get_ts64(). The code is already safe even on 32-bit
architectures, since it uses monotonic times. On 64-bit architectures,
nothing changes, while on 32-bit architectures this avoids one
type conversion.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30 09:26:32 -05:00
Lorenzo Colitti
be8f8284cd net: xfrm: allow clearing socket xfrm policies.
Currently it is possible to add or update socket policies, but
not clear them. Therefore, once a socket policy has been applied,
the socket cannot be used for unencrypted traffic.

This patch allows (privileged) users to clear socket policies by
passing in a NULL pointer and zero length argument to the
{IP,IPV6}_{IPSEC,XFRM}_POLICY setsockopts. This results in both
the incoming and outgoing policies being cleared.

The simple approach taken in this patch cannot clear socket
policies in only one direction. If desired this could be added
in the future, for example by continuing to pass in a length of
zero (which currently is guaranteed to return EMSGSIZE) and
making the policy be a pointer to an integer that contains one
of the XFRM_POLICY_{IN,OUT} enum values.

An alternative would have been to interpret the length as a
signed integer and use XFRM_POLICY_IN (i.e., 0) to clear the
input policy and -XFRM_POLICY_OUT (i.e., -1) to clear the output
policy.

Tested: https://android-review.googlesource.com/539816
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-11-30 10:53:06 +01:00
Linus Torvalds
b915176102 Highlights:
- Fixes from Trond for some races in the NFSv4 state code.
 	- Fix from Naofumi Honda for a typo in the blocked lock
 	  notificiation code.
 	- Fixes from Vasily Averin for some problems starting and
 	  stopping lockd especially in network namespaces.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaHxq/AAoJECebzXlCjuG+QOYP/jIa9dZnbau3owP8RJJv1+VI
 RSMYAZkIjy1vixn/BymZo55R7+23BhdLe8CDsknXWo85mIj61kpV1bwF2lVc7FWm
 +Pt93DkUsUBEjf+/3/58TLknYs5o7UhsEw2Qjg+D3BkO+z95biNa0hUBle2+Nnwi
 vBQLGqdlCIFZxuzEo7yUlGdKTyefzab4bocgRnh/5JMs+bHzPDD74W1GrGB1oEKX
 VSGzq0d7LLe23yIJwgP1eaa0tQr1/WsxlL8xD5Im6mXcN9aYa/7VZhg/oCluy8ac
 v95IBjQUkFqvw5OjDgSX5ZgKzokmRxLjnaUX2JT/sLCk1WxdJhUomw8qb1AJLvav
 e6xce1M+dR5VihTrD/cEe0xB7CXKXywPd6pXQBosAMInhS79aU8brIPCDtLvNwCw
 XvtNybbqC0Go89YMt2zuRfBkV7W3FmM1h+h4PWVl+iCl/7+AYIXD1qeX/FuIjnk6
 SMEdtTb/cqECuh55YefEljUzY1vKYgquxCNCvcbSrMtVSOZYXXufheY+fBjf5DBb
 Bnsd1FiPtVkwFwX8bTbGlOOub1Ryl9SD4Ae0Ynu2FNYSFL8BVXTHkTHm9UHl83s5
 pr0T6bKlpg+YzZrHVh2Herr9Ze89C9uM7oCU1M062vk4+Cg65paqNTnWVtflYPhG
 y9p0hsY5csyzm0SZ/1Ui
 =pR9D
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
 "I screwed up my merge window pull request; I only sent half of what I
  meant to.

  There were no new features, just bugfixes of various importance and
  some very minor cleanup, so I think it's all still appropriate for
  -rc2.

  Highlights:

   - Fixes from Trond for some races in the NFSv4 state code.

   - Fix from Naofumi Honda for a typo in the blocked lock notificiation
     code

   - Fixes from Vasily Averin for some problems starting and stopping
     lockd especially in network namespaces"

* tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux: (23 commits)
  lockd: fix "list_add double add" caused by legacy signal interface
  nlm_shutdown_hosts_net() cleanup
  race of nfsd inetaddr notifiers vs nn->nfsd_serv change
  race of lockd inetaddr notifiers vs nlmsvc_rqst change
  SUNRPC: make cache_detail structures const
  NFSD: make cache_detail structures const
  sunrpc: make the function arg as const
  nfsd: check for use of the closed special stateid
  nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat
  lockd: lost rollback of set_grace_period() in lockd_down_net()
  lockd: added cleanup checks in exit_net hook
  grace: replace BUG_ON by WARN_ONCE in exit_net hook
  nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class
  lockd: remove net pointer from messages
  nfsd: remove net pointer from debug messages
  nfsd: Fix races with check_stateid_generation()
  nfsd: Ensure we check stateid validity in the seqid operation checks
  nfsd: Fix race in lock stateid creation
  nfsd4: move find_lock_stateid
  nfsd: Ensure we don't recognise lock stateids after freeing them
  ...
2017-11-29 14:49:26 -08:00
Linus Torvalds
96c22a49ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) The forcedeth conversion from pci_*() DMA interfaces to dma_*() ones
    missed one spot. From Zhu Yanjun.

 2) Missing CRYPTO_SHA256 Kconfig dep in cfg80211, from Johannes Berg.

 3) Fix checksum offloading in thunderx driver, from Sunil Goutham.

 4) Add SPDX to vm_sockets_diag.h, from Stephen Hemminger.

 5) Fix use after free of packet headers in TIPC, from Jon Maloy.

 6) "sizeof(ptr)" vs "sizeof(*ptr)" bug in i40e, from Gustavo A R Silva.

 7) Tunneling fixes in mlxsw driver, from Petr Machata.

 8) Fix crash in fanout_demux_rollover() of AF_PACKET, from Mike
    Maloney.

 9) Fix race in AF_PACKET bind() vs. NETDEV_UP notifier, from Eric
    Dumazet.

10) Fix regression in sch_sfq.c due to one of the timer_setup()
    conversions. From Paolo Abeni.

11) SCTP does list_for_each_entry() using wrong struct member, fix from
    Xin Long.

12) Don't use big endian netlink attribute read for
    IFLA_BOND_AD_ACTOR_SYSTEM, it is in cpu endianness. Also from Xin
    Long.

13) Fix mis-initialization of q->link.clock in CBQ scheduler, preventing
    adding filters there. From Jiri Pirko.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
  ethernet: dwmac-stm32: Fix copyright
  net: via: via-rhine: use %p to format void * address instead of %x
  net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
  myri10ge: Update MAINTAINERS
  net: sched: cbq: create block for q->link.block
  atm: suni: remove extraneous space to fix indentation
  atm: lanai: use %p to format kernel addresses instead of %x
  VSOCK: Don't set sk_state to TCP_CLOSE before testing it
  atm: fore200e: use %pK to format kernel addresses instead of %x
  ambassador: fix incorrect indentation of assignment statement
  vxlan: use __be32 type for the param vni in __vxlan_fdb_delete
  bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM
  sctp: use right member as the param of list_for_each_entry
  sch_sfq: fix null pointer dereference at timer expiration
  cls_bpf: don't decrement net's refcount when offload fails
  net/packet: fix a race in packet_bind() and packet_notifier()
  packet: fix crash in fanout_demux_rollover()
  sctp: remove extern from stream sched
  sctp: force the params with right types for sctp csum apis
  sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail
  ...
2017-11-29 13:10:25 -08:00
Trond Myklebust
4ba161a793 SUNRPC: Allow connect to return EHOSTUNREACH
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-11-29 14:02:01 -05:00
Gustavo A. R. Silva
282ef47291 rxrpc: Fix variable overwrite
Values assigned to both variable resend_at and ack_at are overwritten
before they can be used.

The correct fix here is to add 'now' to the previously computed value in
resend_at and ack_at.

Addresses-Coverity-ID: 1462262
Addresses-Coverity-ID: 1462263
Addresses-Coverity-ID: 1462264
Fixes: beb8e5e4f3 ("rxrpc: Express protocol timeouts in terms of RTT")
Link: https://marc.info/?i=17004.1511808959%40warthog.procyon.org.uk
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-29 14:44:22 +00:00
David Howells
5fc62f6a13 rxrpc: Fix ACK generation from the connection event processor
Repeat terminal ACKs and now terminal ACKs are now generated from the
connection event processor rather from call handling as this allows us to
discard client call structures as soon as possible and free up the channel
for a follow on call.

However, in ACKs so generated, the additional information trailer is
malformed because the padding that's meant to be in the middle isn't
included in what's transmitted.

Fix it so that the 3 bytes of padding are included in the transmission.

Further, the trailer is misaligned because of the padding, so assigment to
the u16 and u32 fields inside it might cause problems on some arches, so
fix this by breaking the padding and the trailer out of the packed struct.

(This also deals with potential compiler weirdies where some of the nested
structs are packed and some aren't).

The symptoms can be seen in wireshark as terminal DUPLICATE or IDLE ACK
packets in which the Max MTU, Interface MTU and rwind fields have weird
values and the Max Packets field is apparently missing.

Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-29 14:40:41 +00:00
David Howells
3d7682af22 rxrpc: Clean up whitespace
Clean up some whitespace from rxrpc.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-29 14:40:41 +00:00
Cong Wang
6a53b75932 xfrm: check id proto in validate_tmpl()
syzbot reported a kernel warning in xfrm_state_fini(), which
indicates that we have entries left in the list
net->xfrm.state_all whose proto is zero. And
xfrm_id_proto_match() doesn't consider them as a match with
IPSEC_PROTO_ANY in this case.

Proto with value 0 is probably not a valid value, at least
verify_newsa_info() doesn't consider it valid either.

This patch fixes it by checking the proto value in
validate_tmpl() and rejecting invalid ones, like what iproute2
does in xfrm_xfrmproto_getbyname().

Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-11-29 08:55:29 +01:00
Paul E. McKenney
ffa53c5863 netfilter: Eliminate cond_resched_rcu_qs() in favor of cond_resched()
Now that cond_resched() also provides RCU quiescent states when
needed, it can be used in place of cond_resched_rcu_qs().  This
commit therefore makes this change.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <netfilter-devel@vger.kernel.org>
2017-11-28 16:00:27 -08:00
Jiri Pirko
d51aae68b1 net: sched: cbq: create block for q->link.block
q->link.block is not initialized, that leads to EINVAL when one tries to
add filter there. So initialize it properly.

This can be reproduced by:
$ tc qdisc add dev eth0 root handle 1: cbq avpkt 1000 rate 1000Mbit bandwidth 1000Mbit
$ tc filter add dev eth0 parent 1: protocol ip prio 100 u32 match ip protocol 0 0x00 flowid 1:1

Reported-by: Jaroslav Aster <jaster@redhat.com>
Reported-by: Ivan Vecera <ivecera@redhat.com>
Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:04:26 -05:00
Jorgen Hansen
4a5def7f6a VSOCK: Don't set sk_state to TCP_CLOSE before testing it
A recent commit (3b4477d2dc) converted the sk_state to use
TCP constants. In that change, vmci_transport_handle_detach
was changed such that sk->sk_state was set to TCP_CLOSE before
we test whether it is TCP_SYN_SENT. This change moves the
sk_state change back to the original locations in that function.

Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:01:50 -05:00
Xin Long
a8dd397903 sctp: use right member as the param of list_for_each_entry
Commit d04adf1b35 ("sctp: reset owner sk for data chunks on out queues
when migrating a sock") made a mistake that using 'list' as the param of
list_for_each_entry to traverse the retransmit, sacked and abandoned
queues, while chunks are using 'transmitted_list' to link into these
queues.

It could cause NULL dereference panic if there are chunks in any of these
queues when peeling off one asoc.

So use the chunk member 'transmitted_list' instead in this patch.

Fixes: d04adf1b35 ("sctp: reset owner sk for data chunks on out queues when migrating a sock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:55:44 -05:00
Paolo Abeni
f85729d07c sch_sfq: fix null pointer dereference at timer expiration
While converting sch_sfq to use timer_setup(), the commit cdeabbb881
("net: sched: Convert timers to use timer_setup()") forgot to
initialize the 'sch' field. As a result, the timer callback tries to
dereference a NULL pointer, and the kernel does oops.

Fix it initializing such field at qdisc creation time.

Fixes: cdeabbb881 ("net: sched: Convert timers to use timer_setup()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:54:05 -05:00
Jakub Kicinski
25415cec50 cls_bpf: don't decrement net's refcount when offload fails
When cls_bpf offload was added it seemed like a good idea to
call cls_bpf_delete_prog() instead of extending the error
handling path, since the software state is fully initialized
at that point.  This handling of errors without jumping to
the end of the function is error prone, as proven by later
commit missing that extra call to __cls_bpf_delete_prog().

__cls_bpf_delete_prog() is now expected to be invoked with
a reference on exts->net or the field zeroed out.  The call
on the offload's error patch does not fullfil this requirement,
leading to each error stealing a reference on net namespace.

Create a function undoing what cls_bpf_set_parms() did and
use it from __cls_bpf_delete_prog() and the error path.

Fixes: aae2c35ec8 ("cls_bpf: use tcf_exts_get_net() before call_rcu()")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:49:44 -05:00
Eric Dumazet
15fe076ede net/packet: fix a race in packet_bind() and packet_notifier()
syzbot reported crashes [1] and provided a C repro easing bug hunting.

When/if packet_do_bind() calls __unregister_prot_hook() and releases
po->bind_lock, another thread can run packet_notifier() and process an
NETDEV_UP event.

This calls register_prot_hook() and hooks again the socket right before
first thread is able to grab again po->bind_lock.

Fixes this issue by temporarily setting po->num to 0, as suggested by
David Miller.

[1]
dev_remove_pack: ffff8801bf16fa80 not found
------------[ cut here ]------------
kernel BUG at net/core/dev.c:7945!  ( BUG_ON(!list_empty(&dev->ptype_all)); )
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
device syz0 entered promiscuous mode
CPU: 0 PID: 3161 Comm: syzkaller404108 Not tainted 4.14.0+ #190
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801cc57a500 task.stack: ffff8801cc588000
RIP: 0010:netdev_run_todo+0x772/0xae0 net/core/dev.c:7945
RSP: 0018:ffff8801cc58f598 EFLAGS: 00010293
RAX: ffff8801cc57a500 RBX: dffffc0000000000 RCX: ffffffff841f75b2
RDX: 0000000000000000 RSI: 1ffff100398b1ede RDI: ffff8801bf1f8810
device syz0 entered promiscuous mode
RBP: ffff8801cc58f898 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801bf1f8cd8
R13: ffff8801cc58f870 R14: ffff8801bf1f8780 R15: ffff8801cc58f7f0
FS:  0000000001716880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020b13000 CR3: 0000000005e25000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:106
 tun_detach drivers/net/tun.c:670 [inline]
 tun_chr_close+0x49/0x60 drivers/net/tun.c:2845
 __fput+0x333/0x7f0 fs/file_table.c:210
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x199/0x270 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x9bb/0x1ae0 kernel/exit.c:865
 do_group_exit+0x149/0x400 kernel/exit.c:968
 SYSC_exit_group kernel/exit.c:979 [inline]
 SyS_exit_group+0x1d/0x20 kernel/exit.c:977
 entry_SYSCALL_64_fastpath+0x1f/0x96
RIP: 0033:0x44ad19

Fixes: 30f7ea1c2b ("packet: race condition in packet_bind")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Francesco Ruggeri <fruggeri@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:13:30 -05:00
Mike Maloney
57f015f5ec packet: fix crash in fanout_demux_rollover()
syzkaller found a race condition fanout_demux_rollover() while removing
a packet socket from a fanout group.

po->rollover is read and operated on during packet_rcv_fanout(), via
fanout_demux_rollover(), but the pointer is currently cleared before the
synchronization in packet_release().   It is safer to delay the cleanup
until after synchronize_net() has been called, ensuring all calls to
packet_rcv_fanout() for this socket have finished.

To further simplify synchronization around the rollover structure, set
po->rollover in fanout_add() only if there are no errors.  This removes
the need for rcu in the struct and in the call to
packet_getsockopt(..., PACKET_ROLLOVER_STATS, ...).

Crashing stack trace:
 fanout_demux_rollover+0xb6/0x4d0 net/packet/af_packet.c:1392
 packet_rcv_fanout+0x649/0x7c8 net/packet/af_packet.c:1487
 dev_queue_xmit_nit+0x835/0xc10 net/core/dev.c:1953
 xmit_one net/core/dev.c:2975 [inline]
 dev_hard_start_xmit+0x16b/0xac0 net/core/dev.c:2995
 __dev_queue_xmit+0x17a4/0x2050 net/core/dev.c:3476
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3509
 neigh_connected_output+0x489/0x720 net/core/neighbour.c:1379
 neigh_output include/net/neighbour.h:482 [inline]
 ip6_finish_output2+0xad1/0x22a0 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x2f9/0x920 net/ipv6/ip6_output.c:146
 NF_HOOK_COND include/linux/netfilter.h:239 [inline]
 ip6_output+0x1f4/0x850 net/ipv6/ip6_output.c:163
 dst_output include/net/dst.h:459 [inline]
 NF_HOOK.constprop.35+0xff/0x630 include/linux/netfilter.h:250
 mld_sendpack+0x6a8/0xcc0 net/ipv6/mcast.c:1660
 mld_send_initial_cr.part.24+0x103/0x150 net/ipv6/mcast.c:2072
 mld_send_initial_cr net/ipv6/mcast.c:2056 [inline]
 ipv6_mc_dad_complete+0x99/0x130 net/ipv6/mcast.c:2079
 addrconf_dad_completed+0x595/0x970 net/ipv6/addrconf.c:4039
 addrconf_dad_work+0xac9/0x1160 net/ipv6/addrconf.c:3971
 process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2113
 worker_thread+0x223/0x1990 kernel/workqueue.c:2247
 kthread+0x35e/0x430 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432

Fixes: 0648ab70af ("packet: rollover prepare: per-socket state")
Fixes: 509c7a1ecc ("packet: avoid panic in packet_getsockopt()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Mike Maloney <maloney@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:13:30 -05:00
Al Viro
7594bf37ae 9p: untangle ->poll() mess
First of all, NULL ->poll() means "always POLLIN, always POLLOUT", not an error.
Furthermore, mixing -EREMOTEIO with POLL... masks and expecting it to do anything
good is insane - both are arch-dependent, to start with.  Pass a pointer to
store the error value separately and make it return POLLERR in such case.

And ->poll() calling conventions do *not* include "return -Esomething".  Never
had.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-28 11:07:13 -05:00
Xin Long
1ba896f6f5 sctp: remove extern from stream sched
Now each stream sched ops is defined in different .c file and
added into the global ops in another .c file, it uses extern
to make this work.

However extern is not good coding style to get them in and
even make C=2 reports errors for this.

This patch adds sctp_sched_ops_xxx_init for each stream sched
ops in their .c file, then get them into the global ops by
calling them when initializing sctp module.

Fixes: 637784ade2 ("sctp: introduce priority based stream scheduler")
Fixes: ac1ed8b82c ("sctp: introduce round robin stream scheduler")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:00:13 -05:00
Xin Long
08f46070dd sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail
This patch is to force SCTP_ERROR_INV_STRM with right type to
fit in sctp_chunk_fail to avoid the sparse error.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:00:13 -05:00
Stephen Hemminger
e02554e9a4 ipx: move Novell IPX protocol support into staging
The Netware IPX protocol is very old and no one should still be using
it. It is time to move it into staging for a while and eventually
decommision it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-28 13:55:00 +01:00
Jay Elliott
8b1836c4b6 netfilter: conntrack: clamp timeouts to INT_MAX
When the conntracking code multiplies a timeout by HZ, it can overflow
from positive to negative; this causes it to instantly expire.  To
protect against this the multiplication is done in 64-bit so we can
prevent it from exceeding INT_MAX.

Signed-off-by: Jay Elliott <jelliott@arista.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-11-28 01:17:04 +01:00
Bhumika Goyal
ee24eac3eb SUNRPC: make cache_detail structures const
Make these const as they are only getting passed to the function
cache_create_net having the argument as const.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27 16:45:11 -05:00
Bhumika Goyal
d34971a65b sunrpc: make the function arg as const
Make the struct cache_detail *tmpl argument of the function
cache_create_net as const as it is only getting passed to kmemup having
the argument as const void *.
Add const to the prototype too.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27 16:45:11 -05:00
Al Viro
ade994f4f6 net: annotate ->poll() instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-27 16:20:04 -05:00