linux-uconsole/net
Florian Westphal 9e0f4d2473 netfilter: ctnetlink: netns exit must wait for callbacks
[ Upstream commit 18a110b022 ]

Curtis Taylor and Jon Maxwell reported and debugged a crash on 3.10
based kernel.

Crash occurs in ctnetlink_conntrack_events because net->nfnl socket is
NULL.  The nfnl socket was set to NULL by netns destruction running on
another cpu.

The exiting network namespace calls the relevant destructors in the
following order:

1. ctnetlink_net_exit_batch

This nulls out the event callback pointer in struct netns.

2. nfnetlink_net_exit_batch

This nulls net->nfnl socket and frees it.

3. nf_conntrack_cleanup_net_list

This removes all remaining conntrack entries.

This is order is correct. The only explanation for the crash so ar is:

cpu1: conntrack is dying, eviction occurs:
 -> nf_ct_delete()
   -> nf_conntrack_event_report \
     -> nf_conntrack_eventmask_report
       -> notify->fcn() (== ctnetlink_conntrack_events).

cpu1: a. fetches rcu protected pointer to obtain ctnetlink event callback.
      b. gets interrupted.
 cpu2: runs netns exit handlers:
     a runs ctnetlink destructor, event cb pointer set to NULL.
     b runs nfnetlink destructor, nfnl socket is closed and set to NULL.
cpu1: c. resumes and trips over NULL net->nfnl.

Problem appears to be that ctnetlink_net_exit_batch only prevents future
callers of nf_conntrack_eventmask_report() from obtaining the callback.
It doesn't wait of other cpus that might have already obtained the
callbacks address.

I don't see anything in upstream kernels that would prevent similar
crash: We need to wait for all cpus to have exited the event callback.

Fixes: 9592a5c01e ("netfilter: ctnetlink: netns support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:17:05 +01:00
..
6lowpan
9p 9p: Transport error uninitialized 2019-10-11 18:21:12 +02:00
802
8021q vlan: disable SIOCSHWTSTAMP in container 2019-05-16 19:41:30 +02:00
appletalk appletalk: Set error code if register_snap_client failed 2019-12-13 08:52:59 +01:00
atm net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
ax25 ax25: enforce CAP_NET_RAW for raw sockets 2019-10-05 13:09:32 +02:00
batman-adv batman-adv: Avoid free/alloc race when handling OGM buffer 2019-11-06 13:06:22 +01:00
bluetooth Bluetooth: Fix memory leak in hci_connect_le_scan 2020-01-09 10:19:04 +01:00
bpf bpf/test_run: support cgroup local storage 2018-08-03 00:47:32 +02:00
bpfilter net: bpfilter: use get_pid_task instead of pid_task 2018-10-17 22:03:40 -07:00
bridge net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:13:37 +01:00
caif net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
can can: gw: Fix error path of cgw_module_init 2019-08-29 08:28:30 +02:00
ceph libceph: fix PG split vs OSD (re)connect race 2019-08-29 08:28:50 +02:00
core net: add annotations on hh->hh_len lockless accesses 2020-01-09 10:19:09 +01:00
dcb net: dcb: Add priority-to-DSCP map getters 2018-07-27 13:17:50 -07:00
dccp inet: stop leaking jiffies on the wire 2019-11-10 11:27:37 +01:00
decnet net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:13:37 +01:00
dns_resolver net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
dsa net: dsa: fix switch tree list 2019-11-10 11:27:53 +01:00
ethernet net: add annotations on hh->hh_len lockless accesses 2020-01-09 10:19:09 +01:00
hsr net/hsr: fix possible crash in add_timer() 2019-03-19 13:12:38 +01:00
ieee802154 ieee802154: enforce CAP_NET_RAW for raw sockets 2019-10-05 13:09:32 +02:00
ife
ipv4 tcp: annotate tp->rcv_nxt lockless reads 2020-01-09 10:19:08 +01:00
ipv6 tcp: annotate tp->rcv_nxt lockless reads 2020-01-09 10:19:08 +01:00
iucv Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
kcm kcm: switch order of device registration to fix a crash 2019-04-17 08:38:40 +02:00
key af_key: fix leaks in key_pol_get_resp and dump_sp. 2019-07-26 09:14:01 +02:00
l2tp compat_ioctl: pppoe: fix PPPOEIOCSFWD handling 2019-08-09 17:52:34 +02:00
l3mdev
lapb lapb: fixed leak of control-blocks. 2019-06-22 08:15:13 +02:00
llc llc: avoid blocking in llc_sap_close() 2019-11-20 18:46:35 +01:00
mac80211 mac80211: consider QoS Null frames for STA_NULLFUNC_ACKED 2019-12-31 16:36:13 +01:00
mac802154 net: mac802154: tx: expand tailroom if necessary 2018-08-06 11:21:37 +02:00
mpls mpls: Return error for RTA_GATEWAY attribute 2019-03-10 07:17:19 +01:00
ncsi net/ncsi: Fixup .dumpit message flags and ID check in Netlink handler 2018-08-22 21:39:08 -07:00
netfilter netfilter: ctnetlink: netns exit must wait for callbacks 2020-01-12 12:17:05 +01:00
netlabel netlabel: fix out-of-bounds memory accesses 2019-03-10 07:17:18 +01:00
netlink genetlink: Fix a memory leak on error path 2019-04-03 06:26:15 +02:00
netrom netrom: hold sock when setting skb->destructor 2019-07-28 08:29:27 +02:00
nfc net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive() 2019-12-31 16:34:38 +01:00
nsh
openvswitch openvswitch: support asymmetric conntrack 2019-12-21 10:57:14 +01:00
packet af_packet: set defaule value for tmo 2019-12-31 16:34:35 +01:00
phonet net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
psample net: psample: fix skb_over_panic 2019-12-05 09:21:30 +01:00
qrtr net: qrtr: fix memort leak in qrtr_tun_write_iter 2019-12-13 08:52:58 +01:00
rds net/rds: Fix error handling in rds_ib_add_one() 2019-10-07 18:57:24 +02:00
rfkill rfkill: allocate static minor 2019-12-31 16:35:38 +01:00
rose net/rose: fix unbound loop in rose_loopback_timer() 2019-05-02 09:59:00 +02:00
rxrpc rxrpc: Fix possible NULL pointer access in ICMP handling 2020-01-09 10:19:08 +01:00
sched net: sched: fix dump qlen for sch_mq/sch_mqprio with NOLOCK subqueues 2019-12-21 10:57:12 +01:00
sctp net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:13:37 +01:00
smc net/smc: do not wait under send_lock 2019-12-17 20:35:33 +01:00
strparser net: strparser: partially revert "strparser: Call skb_unclone conditionally" 2019-05-16 19:41:27 +02:00
sunrpc sunrpc: fix crash when cache_head become valid before update 2019-12-17 20:35:52 +01:00
switchdev
tipc tipc: fix ordering of tipc module init and exit routine 2019-12-21 10:57:16 +01:00
tls net: tls, fix sk_write_space NULL write when tx disabled 2019-09-06 10:22:04 +02:00
unix net: fix warning in af_unix 2019-12-01 09:16:33 +01:00
vmw_vsock VSOCK: bind to random port for VMADDR_PORT_ANY 2019-12-05 09:20:19 +01:00
wimax wimax: remove blank lines at EOF 2018-07-24 14:10:42 -07:00
wireless cfg80211: call disconnect_wk when AP stops 2019-12-01 09:17:34 +01:00
x25 net/x25: fix null_x25_address handling 2019-12-13 08:52:15 +01:00
xdp xsk: proper AF_XDP socket teardown ordering 2019-11-24 08:20:32 +01:00
xfrm xfrm interface: fix management of phydev 2019-12-13 08:52:42 +01:00
compat.c sock: Make sock->sk_stamp thread-safe 2019-01-09 17:38:33 +01:00
Kconfig net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
Makefile
socket.c net: make socket read/write_iter() honor IOCB_NOWAIT 2020-01-09 10:18:57 +01:00
sysctl_net.c