linux-uconsole/net
Yuchung Cheng 659a8ad56f tcp: track the packet timings in RACK
This patch is the first half of the RACK loss recovery.

RACK loss recovery uses the notion of time instead
of packet sequence (FACK) or counts (dupthresh). It's inspired by the
previous FACK heuristic in tcp_mark_lost_retrans(): when a limited
transmit (new data packet) is sacked, then the current retransmitted
sequence below the newly sacked sequence must have been lost,
since at least one round trip time has elapsed.

But that heuristic has several limitations:
1) can't detect tail drops since it depends on limited transmit
2) is disabled upon reordering (assumes no reordering)
3) only enabled in fast recovery but not timeout recovery

RACK (Recently ACK) addresses these limitations with the notion
of time instead: a packet P1 is lost if a later packet P2 is s/acked,
as at least one round trip has passed.

Since RACK cares about the time sequence instead of the data sequence
of packets, it can detect tail drops when later retransmission is
s/acked while FACK or dupthresh can't. For reordering, RACK uses a
dynamically adjusted reordering window ("reo_wnd") to reduce false
positives on every (small) degree of reordering.
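
As a rough illustration of the time-based rule with a reordering window, the check can be sketched as below. This is a minimal sketch, not the kernel code: the marking logic itself belongs to the follow-up patch, and the function name, the microsecond units and the strict comparison are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function. All times are in
 * microseconds. rack_mstamp_us is the send time of the most recently
 * sent packet that has been delivered; pkt_xmit_us is the send time of
 * a packet that is still outstanding (not yet s/acked).
 */
bool rack_sketch_is_lost(uint64_t rack_mstamp_us, uint64_t pkt_xmit_us,
			 uint64_t reo_wnd_us)
{
	/* Packets sent at or after the RACK timestamp are never judged here. */
	if (pkt_xmit_us >= rack_mstamp_us)
		return false;

	/* Lost only if sent more than reo_wnd before the delivered packet. */
	return rack_mstamp_us - pkt_xmit_us > reo_wnd_us;
}

/* Usage example with made-up timestamps and a 1000 us reordering window. */
void rack_sketch_is_lost_demo(void)
{
	assert(rack_sketch_is_lost(4000, 1000, 1000));   /* far behind: lost */
	assert(!rack_sketch_is_lost(4000, 3500, 1000));  /* inside the window */
	assert(!rack_sketch_is_lost(4000, 5000, 1000));  /* sent later */
}
```

A larger reo_wnd tolerates more reordering at the cost of slower loss detection, which is why the window is adjusted dynamically.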

This patch implements tcp_rack_advance() which tracks the
most recent transmission time among the packets that have been
delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp
is the key to determining which packet has been lost.

Consider an example in which the sender sends four packets:
T1: P1 (lost)
T2: P2
T3: P3
T4: P4
T100: sack of P2. rack.mstamp = T2
T101: retransmit P1
T102: sack of P2,P3,P4. rack.mstamp = T4
T205: ACK of P4 since the hole is repaired. rack.mstamp = T101
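
The timeline above can be replayed with a small sketch of the timestamp tracking. It assumes abstract time ticks (T2 is 2, T101 is 101), treats 0 as "not set yet", and uses hypothetical names (struct rack_sketch, rack_sketch_advance); it leaves out the spurious-retransmission filtering and is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the RACK state; 0 means "no delivery seen yet". */
struct rack_sketch {
	uint64_t mstamp;	/* latest send time among delivered packets */
};

/* Called for every packet newly ACKed or SACKed, with its send time. */
void rack_sketch_advance(struct rack_sketch *rack, uint64_t xmit_time)
{
	if (xmit_time > rack->mstamp)
		rack->mstamp = xmit_time;
}

/* Replays the example and returns the final RACK timestamp. */
uint64_t rack_sketch_replay(void)
{
	struct rack_sketch rack = { 0 };

	/* T100: SACK of P2, which was sent at T2. */
	rack_sketch_advance(&rack, 2);
	assert(rack.mstamp == 2);

	/* T102: SACK of P2, P3, P4, sent at T2, T3, T4. */
	rack_sketch_advance(&rack, 2);
	rack_sketch_advance(&rack, 3);
	rack_sketch_advance(&rack, 4);
	assert(rack.mstamp == 4);

	/* T205: cumulative ACK covers P1, whose retransmission left at T101. */
	rack_sketch_advance(&rack, 101);
	return rack.mstamp;
}
```

The timestamp only moves forward, so processing the s/acked packets of one ACK in any order gives the same result.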

We need to be careful about spurious retransmission because it may
falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK
to falsely mark all packets lost, just like a spurious timeout.

We identify spurious retransmission by the ACK's TS echo value.
If the TS option is not available but the retransmission is acknowledged
less than min-RTT after it was sent, it is likely to be spurious. We refrain
from using the transmission time of these spurious retransmissions.
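
The two checks described above might look roughly like the sketch below. Every name in it (struct retrans_ack_sketch, its fields, rack_sketch_retrans_is_spurious) is hypothetical, and the timestamp comparison follows the usual Eifel-style reading, in which an ACK that echoes a TS value older than the one sent on the retransmission must have been triggered by the original transmission. It is an illustration under those assumptions, not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an ACK that covers a retransmitted packet. */
struct retrans_ack_sketch {
	bool	 has_ts;		/* TCP timestamp option in use */
	uint32_t ts_ecr;		/* TS echo value carried by the ACK */
	uint32_t retrans_tsval;		/* TS value sent on the retransmission */
	uint64_t retrans_xmit_us;	/* when the retransmission was sent */
	uint64_t ack_recv_us;		/* when the ACK arrived */
};

/* Returns true if the retransmission's send time should not be used. */
bool rack_sketch_retrans_is_spurious(const struct retrans_ack_sketch *a,
				     uint64_t min_rtt_us)
{
	uint64_t elapsed_us;

	if (a->has_ts) {
		/* Wrap-safe "echo is older than the retransmission" test. */
		return (int32_t)(a->ts_ecr - a->retrans_tsval) < 0;
	}

	/* No timestamps: an ACK arriving sooner than min-RTT is suspicious. */
	elapsed_us = a->ack_recv_us >= a->retrans_xmit_us ?
		     a->ack_recv_us - a->retrans_xmit_us : 0;
	return elapsed_us < min_rtt_us;
}

/* Usage example: an ACK 500 us after a retransmit, min-RTT of 10 ms. */
void rack_sketch_spurious_demo(void)
{
	struct retrans_ack_sketch a = { false, 0, 0, 1000, 1500 };

	assert(rack_sketch_retrans_is_spurious(&a, 10000));
}
```

A caller would skip advancing the RACK timestamp whenever this returns true, so a spurious retransmission cannot push the timestamp forward by an RTT or an RTO.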

The second half is implemented in the next patch, which marks packets
lost using the RACK timestamp.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21 07:00:48 -07:00
..
6lowpan 6lowpan: move shared settings to lowpan_netdev_setup 2015-10-08 14:25:34 +02:00
9p net/9p: Remove ib_get_dma_mr calls 2015-08-30 18:12:36 -04:00
802
8021q net: 8021q: convert to using IFF_NO_QUEUE 2015-08-18 11:55:06 -07:00
appletalk
atm atm: deal with setting entry before mkip was called 2015-09-17 22:13:32 -07:00
ax25
batman-adv batman-adv: turn batadv_neigh_node_get() into local function 2015-08-27 20:15:34 +02:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
bridge Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
caif net: caif: convert to using IFF_NO_QUEUE 2015-08-18 11:55:07 -07:00
can can: avoid using timeval for uapi 2015-10-13 17:42:34 +02:00
ceph rbd: use writefull op for object size writes 2015-10-16 16:49:01 +02:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp tcp/dccp: add inet_csk_reqsk_queue_drop_and_put() helper 2015-10-16 00:52:18 -07:00
decnet Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
ethernet net: help compiler generate better code in eth_get_headlen 2015-09-28 22:51:15 -07:00
hsr net: hsr: convert to using IFF_NO_QUEUE 2015-08-18 11:55:07 -07:00
ieee802154 6lowpan: move shared settings to lowpan_netdev_setup 2015-10-08 14:25:34 +02:00
ipv4 tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
ipx
irda
iucv s390/iucv: do not use arrays as argument 2015-09-21 16:03:04 -07:00
key net: Fix RCU splat in af_key 2015-08-24 14:48:10 -07:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-02 07:21:25 -07:00
l3mdev net: Add netif_is_l3_slave 2015-10-07 04:27:43 -07:00
lapb
llc tcp: fix recv with flags MSG_WAITALL | MSG_PEEK 2015-07-27 01:06:53 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
mac802154 ieee802154: change mtu size behaviour 2015-09-30 13:21:32 +02:00
mpls dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
netfilter Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
netlabel
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
netrom
nfc nfc: netlink: Add capability to reply to vendor_cmd with data 2015-08-20 22:00:11 +02:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
packet ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
phonet
rds RDS: fix rds-ping deadlock over TCP transport 2015-10-18 22:45:55 -07:00
rfkill rfkill: Copy "all" global state to other types 2015-09-04 14:26:56 +02:00
rose
rxrpc rxrpc: Replace get_seconds with ktime_get_seconds 2015-09-20 21:53:56 -07:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
sctp net: sctp: avoid incorrect time_t use 2015-10-05 03:16:48 -07:00
sunrpc Changes for 4.3-rc5 2015-10-15 13:44:35 -07:00
switchdev Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
unix net/unix: fix logic about sk_peek_offset 2015-10-05 06:33:09 -07:00
vmw_vsock
wimax net:wimax: Fix doucble word "the the" in networking.xml 2015-08-09 22:43:52 -07:00
wireless For the current cycle, we have the following right now: 2015-10-07 04:29:18 -07:00
x25
xfrm dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
compat.c
Kconfig net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
Makefile net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
socket.c
sysctl_net.c