linux-uconsole/net/ipv4
Florian Westphal 86a0a00794 tcp: do not restart timewait timer on rst reception
[ Upstream commit 63cc357f7b ]

RFC 1337 says:
 ''Ignore RST segments in TIME-WAIT state.
   If the 2 minute MSL is enforced, this fix avoids all three hazards.''

So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
expire rather than removing it instantly when a reset is received.

However, Linux will also re-start the TIME-WAIT timer.

This causes connect to fail when tying to re-use ports or very long
delays (until syn retry interval exceeds MSL).

packetdrill test case:
// Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
`sysctl net.ipv4.tcp_rfc1337=1`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4

// Receive first segment
0.310 < P. 1:1001(1000) ack 1 win 46

// Send one ACK
0.310 > . 1:1(0) ack 1001

// read 1000 byte
0.310 read(4, ..., 1000) = 1000

// Application writes 100 bytes
0.350 write(4, ..., 100) = 100
0.350 > P. 1:101(100) ack 1001

// ACK
0.500 < . 1001:1001(0) ack 101 win 257

// close the connection
0.600 close(4) = 0
0.600 > F. 101:101(0) ack 1001 win 244

// Our side is in FIN_WAIT_1 & waits for ack to fin
0.7 < . 1001:1001(0) ack 102 win 244

// Our side is in FIN_WAIT_2 with no outstanding data.
0.8 < F. 1001:1001(0) ack 102 win 244
0.8 > . 102:102(0) ack 1002 win 244

// Our side is now in TIME_WAIT state, send ack for fin.
0.9 < F. 1002:1002(0) ack 102 win 244
0.9 > . 102:102(0) ack 1002 win 244

// Peer reopens with in-window SYN:
1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>

// Therefore, reply with ACK.
1.000 > . 102:102(0) ack 1002 win 244

// Peer sends RST for this ACK.  Normally this RST results
// in tw socket removal, but rfc1337=1 setting prevents this.
1.100 < R 1002:1002(0) win 244

// second syn. Due to rfc1337=1 expect another pure ACK.
31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
31.0 > . 102:102(0) ack 1002 win 244

// .. and another RST from peer.
31.1 < R 1002:1002(0) win 244
31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`

// third syn after one minute.  Time-Wait socket should have expired by now.
63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>

// so we expect a syn-ack & 3whs to proceed from here on.
63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>

Without this patch, 'ss' shows restarts of tw timer and last packet is
thus just another pure ack, more than one minute later.

This restores the original code from commit 283fd6cf0be690a83
("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .

For some reason the else branch was removed/lost in 1f28b683339f7
("Merge in TCP/UDP optimizations and [..]") and timer restart became
unconditional.

Reported-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15 09:40:37 +02:00
..
netfilter netfilter: x_tables: set module owner for icmp(6) matches 2018-08-24 13:26:58 +02:00
af_inet.c net: ping: do not abuse udp_poll() 2017-06-14 13:16:19 +02:00
ah4.c ipsec: check return value of skb_to_sgvec always 2018-04-13 19:50:23 +02:00
arp.c arp: fix arp_filter on l3slave devices 2018-04-13 19:50:24 +02:00
cipso_ipv4.c Cipso: cipso_v4_optptr enter infinite loop 2018-09-05 09:18:33 +02:00
datagram.c net: Set sk_txhash from a random number 2015-07-29 22:44:04 -07:00
devinet.c ipv4: igmp: guard against silly MTU values 2018-01-02 20:33:24 +01:00
esp4.c ipsec: check return value of skb_to_sgvec always 2018-04-13 19:50:23 +02:00
fib_frontend.c ipv4: remove BUG_ON() from fib_compute_spec_dst 2018-08-06 16:24:40 +02:00
fib_lookup.h ipv4: consider TOS in fib_select_default 2015-07-24 22:46:11 -07:00
fib_rules.c net: ipv6: use common fib_default_rule_pref 2015-09-09 14:19:50 -07:00
fib_semantics.c ipv4: Fix error return value in fib_convert_metrics() 2018-07-11 16:03:47 +02:00
fib_trie.c net: Improve handling of failures on link and route dumps 2017-06-07 12:05:58 +02:00
fou.c net: add recursion limit to GRO 2016-11-15 07:46:38 +01:00
gre_demux.c gre: Remove support for sharing GRE protocol hook. 2015-08-10 14:03:54 -07:00
gre_offload.c net: add recursion limit to GRO 2016-11-15 07:46:38 +01:00
icmp.c Revert "ipv4/icmp: redirect messages can use the ingress daddr as source" 2015-10-14 06:01:07 -07:00
igmp.c net: igmp: add a missing rcu locking section 2018-02-16 20:09:37 +01:00
inet_connection_sock.c tcp/dccp: fix other lockdep splats accessing ireq_opt 2017-11-18 11:11:07 +01:00
inet_diag.c tcp/dccp: install syn_recv requests into ehash table 2015-10-03 04:32:41 -07:00
inet_fragment.c inet: frag: enforce memory limits earlier 2018-08-06 16:24:41 +02:00
inet_hashtables.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
inet_lro.c
inet_timewait_sock.c soreuseport: initialise timewait reuseport field 2018-05-16 10:06:50 +02:00
inetpeer.c net: Add helper function to compare inetpeer addresses 2015-08-28 13:32:36 -07:00
ip_forward.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
ip_fragment.c inet: frag: release spinlock before calling icmp_send() 2017-12-25 14:22:11 +01:00
ip_gre.c vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices 2016-06-24 10:18:18 -07:00
ip_input.c ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
ip_options.c
ip_output.c ip: hash fragments consistently 2018-07-28 07:45:02 +02:00
ip_sockglue.c ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull 2018-07-28 07:45:03 +02:00
ip_tunnel.c ip_tunnel: better validate user provided tunnel names 2018-04-13 19:50:26 +02:00
ip_tunnel_core.c tunnels: Remove encapsulation offloads on decap. 2016-10-31 04:13:59 -06:00
ip_vti.c Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" 2018-05-30 22:11:35 +02:00
ipcomp.c
ipconfig.c ipconfig: Correctly initialise ic_nameservers 2018-08-06 16:24:38 +02:00
ipip.c ipip: only increase err_count for some certain type icmp in ipip_err 2017-11-18 11:11:06 +01:00
ipmr.c ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route 2016-11-15 07:46:37 +01:00
Kconfig ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV 2018-08-15 17:42:05 +02:00
Makefile tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
netfilter.c netfilter: use skb_to_full_sk in ip_route_me_harder 2018-03-18 11:17:51 +01:00
ping.c ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg 2018-05-26 08:48:46 +02:00
proc.c net: track success and failure of TCP PMTU probing 2015-07-21 22:36:33 -07:00
protocol.c
raw.c net: ipv4: fix for a race condition in raw_sendmsg 2018-01-02 20:33:25 +01:00
route.c ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu 2018-05-30 07:49:04 +02:00
syncookies.c tcp/dccp: fix ireq->opt races 2017-11-18 11:11:06 +01:00
sysctl_net_ipv4.c ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns 2018-07-25 10:18:16 +02:00
tcp.c tcp: identify cryptic messages as TCP seq # bugs 2018-08-24 13:27:00 +02:00
tcp_bic.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_cdg.c tcp: do not slow start when cwnd equals ssthresh 2015-07-09 14:22:52 -07:00
tcp_cong.c tcp: disallow cwnd undo when switching congestion control 2017-06-14 13:16:19 +02:00
tcp_cubic.c tcp_cubic: do not set epoch_start in the future 2015-09-17 22:35:07 -07:00
tcp_dctcp.c tcp: remove DELAYED ACK events in DCTCP 2018-08-24 13:26:59 +02:00
tcp_diag.c tcp: ensure proper barriers in lockless contexts 2015-11-15 18:36:38 -05:00
tcp_fastopen.c tcp: initialize max window for a new fastopen socket 2017-02-04 09:45:09 +01:00
tcp_highspeed.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_htcp.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_hybla.c tcp: do not slow start when cwnd equals ssthresh 2015-07-09 14:22:52 -07:00
tcp_illinois.c net/tcp/illinois: replace broken algorithm reference link 2018-05-30 07:49:02 +02:00
tcp_input.c tcp: Fix missing range_truesize enlargement in the backport 2018-08-17 20:56:44 +02:00
tcp_ipv4.c tcp: verify the checksum of the first data segment in a new connection 2018-07-03 11:21:25 +02:00
tcp_lp.c tcp: fix wraparound issue in tcp_lp 2017-05-14 13:32:58 +02:00
tcp_memcontrol.c
tcp_metrics.c tcp: convert cached rtt from usec to jiffies when feeding initial rto 2016-04-20 15:41:56 +09:00
tcp_minisocks.c tcp: do not restart timewait timer on rst reception 2018-09-15 09:40:37 +02:00
tcp_offload.c tcp: reserve tcp_skb_mss() to tcp stack 2015-06-11 16:33:10 -07:00
tcp_output.c tcp: remove DELAYED ACK events in DCTCP 2018-08-24 13:26:59 +02:00
tcp_probe.c
tcp_recovery.c tcp: use RACK to detect losses 2015-10-21 07:00:53 -07:00
tcp_scalable.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_timer.c net: tcp: close sock if net namespace is exiting 2018-01-31 12:06:14 +01:00
tcp_vegas.c tcp: fix under-evaluated ssthresh in TCP Vegas 2017-12-25 14:22:15 +01:00
tcp_vegas.h
tcp_veno.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_westwood.c
tcp_yeah.c tcp: cwnd does not increase in TCP YeAH 2016-09-30 10:18:34 +02:00
tunnel4.c
udp.c ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg 2018-05-26 08:48:46 +02:00
udp_diag.c sock_diag: specify info_size per inet protocol 2015-06-15 19:49:22 -07:00
udp_impl.h
udp_offload.c net: avoid skb_warn_bad_offload false positives on UFO 2017-08-12 19:29:08 -07:00
udp_tunnel.c tunnel: Clear IPCB(skb)->opt before dst_link_failure called 2016-04-20 15:41:56 +09:00
udplite.c
xfrm4_input.c netfilter: Pass net into okfn 2015-09-17 17:18:37 -07:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-24 06:54:12 -07:00
xfrm4_policy.c ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu 2018-05-30 07:49:04 +02:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c