linux-uconsole/include/net
Eric Dumazet 5e25ba5003 tcp: TSO packets automatic sizing
[ Upstream commits 6d36824e730f247b602c90e8715a792003e3c5a7,
  02cf4ebd82, and parts of
  7eec4174ff ]

After hearing many people over past years complaining against TSO being
bursty or even buggy, we are proud to present automatic sizing of TSO
packets.

One part of the problem is that tcp_tso_should_defer() uses an heuristic
relying on upcoming ACKS instead of a timer, but more generally, having
big TSO packets makes little sense for low rates, as it tends to create
micro bursts on the network, and general consensus is to reduce the
buffering amount.

This patch introduces a per socket sk_pacing_rate, that approximates
the current sending rate, and allows us to size the TSO packets so
that we try to send one packet every ms.

This field could be set by other transports.

Patch has no impact for high speed flows, where having large TSO packets
makes sense to reach line rate.

For other flows, this helps better packet scheduling and ACK clocking.

This patch increases performance of TCP flows in lossy environments.

A new sysctl (tcp_min_tso_segs) is added, to specify the
minimal size of a TSO packet (default being 2).

A follow-up patch will provide a new packet scheduler (FQ), using
sk_pacing_rate as an input to perform optional per flow pacing.

This explains why we chose to set sk_pacing_rate to twice the current
rate, allowing 'slow start' ramp up.

sk_pacing_rate = 2 * cwnd * mss / srtt

v2: Neal Cardwell reported a suspect deferring of last two segments on
initial write of 10 MSS, I had to change tcp_tso_should_defer() to take
into account tp->xmit_size_goal_segs

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Tom Herbert <therbert@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-04 04:30:59 -08:00
..
9p 9p: turn fid->dlist into hlist 2013-02-27 22:51:08 -05:00
bluetooth Bluetooth: Introduce a new HCI_RFKILLED flag 2013-10-13 16:08:32 -07:00
caif caif: Remove my bouncing email address. 2013-04-23 13:25:51 -04:00
irda irda: small read past the end of array in debug code 2013-04-19 17:32:31 -04:00
iucv af_iucv: fix recvmsg by replacing skb_pull() function 2013-04-08 17:16:57 -04:00
netfilter netfilter: log: netns NULL ptr bug when calling from conntrack 2013-05-15 14:11:07 +02:00
netns netfilter: nf_log: prepare net namespace support for loggers 2013-04-05 20:12:54 +02:00
nfc NFC: RFKILL support 2013-04-12 16:54:45 +02:00
phonet net: remove my future former mail address 2012-06-17 16:29:38 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-05-01 14:08:52 -07:00
tc_act
act_api.h act_police: move struct tcf_police to act_police.c 2013-02-12 18:59:45 -05:00
addrconf.h IPv6 NAT: Do not drop DNATed 6to4/6rd packets 2013-10-13 16:08:30 -07:00
af_ieee802154.h
af_rxrpc.h
af_unix.h af_unix: fix a fatal race with bit fields 2013-05-01 15:13:49 -04:00
ah.h
arp.h net: Dont use ifindices in hash fns 2012-08-09 16:18:06 -07:00
atmclip.h atm: clip: Use device neigh support on top of "arp_tbl". 2011-11-30 18:51:03 -05:00
ax25.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ax88796.h
cfg80211-wext.h
cfg80211.h cfg80211: introduce critical protocol indication from user-space 2013-04-22 15:48:00 +02:00
checksum.h net: core: add function for incremental IPv6 pseudo header checksum updates 2012-08-30 03:00:16 +02:00
cipso_ipv4.h cipso: handle CIPSO options correctly when NetLabel is disabled 2012-06-01 14:18:29 -04:00
cls_cgroup.h cls_cgroup: remove task_struct parameter from sock_update_classid() 2013-04-09 13:19:35 -04:00
codel.h codel: refine one condition to avoid a nul rec_inv_sqrt 2012-08-10 16:52:54 -07:00
compat.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
datalink.h
dcbevent.h
dcbnl.h net/dcb: Add an optional max rate attribute 2012-04-05 05:08:04 -04:00
dn.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dn_dev.h
dn_fib.h decnet: Parse netlink attributes on our own 2013-03-22 10:31:16 -04:00
dn_neigh.h
dn_nsp.h
dn_route.h decnet: use correct RCU API to deref sk_dst_cache field 2013-01-28 00:15:27 -05:00
dsa.h dsa: Include linux/if_ether.h to fix build error 2011-12-01 11:41:06 -05:00
dsfield.h ipv6: Optimize ipv6_change_dsfield(). 2013-01-09 23:59:53 -08:00
dst.h Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug 2013-03-15 09:06:58 -04:00
dst_ops.h net: Fix warnings in dst_ops.h 2012-07-19 10:43:03 -07:00
esp.h
ethoc.h
fib_rules.h ipv4: Elide fib_validate_source() completely when possible. 2012-06-29 01:36:36 -07:00
firewire.h firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection. 2013-03-26 12:32:13 -04:00
flow.h ipv4: Add FLOWI_FLAG_KNOWN_NH 2012-10-08 17:42:36 -04:00
flow_keys.h flow_keys: include thoff into flow_keys for later usage 2013-03-20 12:14:36 -04:00
garp.h
gen_stats.h
genetlink.h genl: Hold reference on correct module while netlink-dump. 2013-09-14 06:54:55 -07:00
gre.h GRE: Refactor GRE tunneling code. 2013-03-26 12:27:18 -04:00
gro_cells.h gro: Fix kcalloc argument order 2013-01-27 22:46:33 -05:00
icmp.h ipv4: fix error handling in icmp_protocol. 2013-02-22 15:10:18 -05:00
ieee80211_radiotap.h mac80211: support (partial) VHT radiotap information 2012-11-27 11:56:18 +01:00
ieee802154.h
ieee802154_netdev.h ieee802154/nl-mac.c: make some MLME operations optional 2013-04-08 12:00:16 -04:00
if_inet6.h net: ipv6: only invalidate previously tokenized addresses 2013-04-09 13:12:23 -04:00
inet6_connection_sock.h ipv6: Add helper inet6_csk_update_pmtu(). 2012-07-16 03:44:56 -07:00
inet6_hashtables.h ipv6: use a stronger hash for tcp 2013-02-21 18:15:58 -05:00
inet_common.h net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN) 2012-07-19 11:02:03 -07:00
inet_connection_sock.h tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
inet_ecn.h tunnel: drop packet if ECN present with not-ECT 2012-09-27 18:12:37 -04:00
inet_frag.h net: frag, fix race conditions in LRU list maintenance 2013-05-06 11:06:51 -04:00
inet_hashtables.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inet_sock.h ipv6: use a stronger hash for tcp 2013-02-21 18:15:58 -05:00
inet_timewait_sock.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inetpeer.h ipv4: Maintain redirect and PMTU info in struct rtable again. 2012-07-10 22:40:14 -07:00
ip.h ip: generate unique IP identificator if local fragmentation is allowed 2013-10-13 16:08:30 -07:00
ip6_checksum.h ipv6: move csum_ipv6_magic() and udp6_csum_init() into static library 2013-01-08 17:56:10 -08:00
ip6_fib.h ipv6: fix race condition regarding dst->expires and dst->from. 2013-02-20 15:11:45 -05:00
ip6_route.h ipv6: Remove unused neigh argument for icmp6_dst_alloc() and its callers. 2013-01-18 14:41:13 -05:00
ip6_tunnel.h GRE: Refactor GRE tunneling code. 2013-03-26 12:27:18 -04:00
ip_fib.h ipv4: fix definition of FIB_TABLE_HASHSZ 2013-03-13 10:47:09 -04:00
ip_tunnels.h ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id. 2013-09-14 06:54:55 -07:00
ip_vs.h ipvs: fix sparse warnings for some parameters 2013-04-23 11:43:05 +09:00
ipcomp.h
ipconfig.h
ipv6.h ipv6: implement RFC3168 5.3 (ecn protection) for ipv6 fragmentation handling 2013-03-24 17:16:30 -04:00
ipx.h
iw_handler.h
lapb.h lapb: Neaten debugging 2012-05-17 18:45:20 -04:00
lib80211.h hostap: Don't use create_proc_read_entry() 2013-04-29 15:41:56 -04:00
llc.h llc: Remove stray reference to sysctl_llc_station_ack_timeout. 2012-09-17 13:13:24 -04:00
llc_c_ac.h
llc_c_ev.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h net: delete all instances of special processing for token ring 2012-05-15 20:14:35 -04:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
mac80211.h mac80211: add a flag to indicate CCK support for HT clients 2013-09-07 22:09:59 -07:00
mac802154.h mac802154: add wpan device-class support 2012-06-26 21:06:11 -07:00
mip6.h
mld.h
mrp.h net/802: Implement Multiple Registration Protocol (MRP) 2013-02-10 20:37:22 -05:00
ndisc.h ndisc: Add missing inline to ndisc_addr_option_pad 2013-08-11 18:35:26 -07:00
neighbour.h net neighbour, decnet: Ensure to align device private data on preferred alignment. 2013-02-11 00:21:44 -05:00
net_namespace.h netfilter: make /proc/net/netfilter pernet 2013-04-05 19:35:02 +02:00
net_ratelimit.h
netdma.h
netevent.h ipv6 netevent: Remove old_neigh from netevent_redirect. 2013-01-14 15:04:59 -05:00
netlabel.h userns: Convert the audit loginuid to be a kuid 2012-09-17 18:08:54 -07:00
netlink.h netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
netprio_cgroup.h netprio_cgroup: remove task_struct parameter from sock_update_netprio() 2013-04-09 13:19:37 -04:00
netrom.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
nexthop.h
nl802154.h
p8022.h
ping.h
pkt_cls.h pkt_sched: namespace aware act_mirred 2013-01-14 15:09:36 -05:00
pkt_sched.h sch_api: introduce qdisc_watchdog_schedule_ns() 2013-02-12 18:59:45 -05:00
protocol.h net: Remove code duplication between offload structures 2012-11-15 17:39:51 -05:00
psnap.h
raw.h
rawv6.h ipv6: bool/const conversions phase2 2012-05-19 01:08:16 -04:00
red.h net_sched: red: Make minor corrections to comments 2012-04-16 23:53:11 -04:00
regulatory.h regulatory: use RCU to protect last_request 2013-01-03 13:01:30 +01:00
request_sock.h net: remove a stale comment for dl_next 2013-04-22 15:55:48 -04:00
rose.h
route.h ipv4: avoid a test in ip_rt_put() 2012-11-03 14:59:04 -04:00
rtnetlink.h rtnetlink: Remove passing of attributes into rtnl_doit functions 2013-03-22 10:31:16 -04:00
sch_generic.h net_sched: restore "linklayer atm" handling 2013-09-14 06:54:55 -07:00
scm.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-04-22 20:32:51 -04:00
secure_seq.h net: net_secret should not depend on TCP 2013-10-13 16:08:30 -07:00
slhc_vj.h
snmp.h net: avoid reloads in SNMP_UPD_PO_STATS 2012-08-06 13:40:47 -07:00
sock.h tcp: TSO packets automatic sizing 2013-11-04 04:30:59 -08:00
stp.h
tcp.h tcp: TSO packets automatic sizing 2013-11-04 04:30:59 -08:00
tcp_memcontrol.h cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg 2012-04-10 10:04:07 -07:00
tcp_states.h
timewait_sock.h [PATCH] tcp: Cache inetpeer in timewait socket, and only when necessary. 2012-06-09 14:56:12 -07:00
transp_v6.h ipv6: rename datagram_send_ctl and datagram_recv_ctl 2013-01-31 13:53:08 -05:00
udp.h ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data 2013-07-28 16:29:49 -07:00
udplite.h net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
wext.h
wimax.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
wpan-phy.h mac802154: monitor device support 2012-05-16 15:17:08 -04:00
x25.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
x25device.h
xfrm.h xfrm: force a garbage collection after deleting a policy 2013-05-31 17:30:07 -07:00