Internet Engineering Task Force                              Sally Floyd
INTERNET DRAFT                                                       LBL
File: draft-floyd-incr-init-win-00.txt                       Mark Allman
                                            NASA Lewis/Sterling Software
                                                         Craig Partridge
                                                        BBN Technologies
                                                              July, 1997
                                                  Expires: January, 1998
Increasing TCP's Initial Window
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as ``work in
progress.''
To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Abstract
This note suggests changing the permitted initial window in TCP
from 1 segment to roughly 4K bytes. It considers the advantages and
disadvantages of such a change, outlines some experimental results
that indicate the costs and benefits of making such a change to TCP,
and points out remaining research questions.
1. TCP Modification
This draft suggests allowing the initial window used by a TCP
connection to increase from 1 segment to roughly 4K bytes. The
initial window size would be that given in (1):
min (4*MSS, max (2*MSS, 4380 bytes)) (1)
The initial window would contain between 2 and 4 segments, rather
than the 1 segment initial window currently used. It would contain
at least 2 segments, regardless of the MSS. For an MSS of 2190
bytes or less, it would contain at most 4380 bytes in at most 4
segments; for a larger MSS, it would contain exactly 2 segments.
This increased initial window would be optional:
that is, a TCP MAY start with a larger initial window, but this
draft does not say that it SHOULD.
For example, a host sending 1460 byte segments may use an initial
window of 4380 bytes (3 segments). A host sending 512 byte segments
may use an initial window of 2048 bytes (4 segments). Finally, a
host sending 3000 byte segments may use an initial window of 6000
bytes (2 segments).
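The calculation in (1) and the three examples above can be
reproduced with a short sketch. The function names are ours and are
not taken from any TCP implementation; the segment count is simply
the window divided by the MSS, rounded up.

```python
def initial_window(mss: int) -> int:
    """Initial window in bytes, per (1): min(4*MSS, max(2*MSS, 4380))."""
    return min(4 * mss, max(2 * mss, 4380))


def segments_in_window(mss: int) -> int:
    """Number of segments needed to carry the initial window."""
    window = initial_window(mss)
    return -(-window // mss)  # ceiling division


# The three MSS values used as examples in the text.
for mss in (1460, 512, 3000):
    print(mss, initial_window(mss), segments_in_window(mss))
```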
This change would only apply to the initial window of the
connection, in the first round trip time (RTT) of transmission, or to
connections that are just beginning to send data after a long
quiescent period. This would not change the behavior after a
retransmit timeout, when the sender would continue to slow-start
from an initial window of one segment.
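One way to picture the scope of the change is as a choice of
starting window that depends on why the sender is starting or
restarting. The event labels and the function below are
hypothetical, chosen only for illustration; the larger initial
window is passed in as a parameter rather than recomputed.

```python
from enum import Enum, auto


class StartEvent(Enum):
    # Hypothetical labels for the three cases the text distinguishes.
    CONNECTION_START = auto()    # first RTT of transmission
    IDLE_RESTART = auto()        # sending again after a long quiescent period
    RETRANSMIT_TIMEOUT = auto()  # slow-start after a retransmit timeout


def starting_window(event: StartEvent, initial_window: int, mss: int) -> int:
    """Return the window, in bytes, from which slow-start begins."""
    if event is StartEvent.RETRANSMIT_TIMEOUT:
        # Unchanged behavior: restart from one segment.
        return mss
    # The other two cases may use the larger initial window.
    return initial_window
```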
2. Advantages of Larger Initial Windows
1. For connections transmitting only a small amount of data, a
larger initial window would reduce the transmission time
(assuming moderate segment drop rates). For many email (SMTP
[Pos82]) and web page (HTTP [BLFN96] [FJGFBL97]) transfers that
are less than 4K bytes, the larger initial window would reduce
the data transfer time to a single RTT.
2. For connections that will be able to use large congestion
windows, this modification eliminates up to three RTTs and a
delayed ACK timeout during the initial slow-start phase. This
would be of particular benefit for high-bandwidth
large-propagation-delay TCP connections, such as those over
satellite links.
3. When the initial window is 1 segment, a receiver employing
delayed acknowledgments (ACKs) [Bra89] is forced to wait for a
timeout before generating an ACK. With a larger initial window,
the receiver will be able to generate an ACK after the second
data segment arrives. This eliminates the wait for the
timeout (0.1 seconds or more).
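The savings described in this list can be illustrated with an
idealized slow-start model. This is a sketch under strong
assumptions that are ours, not the draft's: no segment loss, an
unlimited receiver window, a congestion window that grows by one
segment per ACK, and (when ack_every is 2) a delayed-ACK receiver
that acknowledges every second segment, with any odd leftover
segment acknowledged within the same round. The time spent waiting
for a delayed ACK timer and the connection handshake are not
modeled.

```python
def rtts_to_send(total_segments: int, initial_window: int,
                 ack_every: int = 1) -> int:
    """Round trips needed to send total_segments in idealized slow-start.

    initial_window is in segments; ack_every=2 models delayed ACKs.
    """
    cwnd = initial_window
    sent = 0
    rtts = 0
    while sent < total_segments:
        burst = min(cwnd, total_segments - sent)
        sent += burst
        rtts += 1
        acks = -(-burst // ack_every)  # ceiling division
        cwnd += acks                   # one segment of growth per ACK
    return rtts


# A 4-segment transfer, then a 32-segment transfer (16 KB at 512-byte MSS).
for total in (4, 32):
    for ack_every in (1, 2):
        print(total, ack_every,
              rtts_to_send(total, 1, ack_every),
              rtts_to_send(total, 4, ack_every))
```

Under these assumptions a four-segment transfer completes in one
round trip with a four-segment initial window instead of three, and
the 32-segment transfer saves two round trips with per-segment ACKs
and three with delayed ACKs.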
3. Implementation Issues
When larger initial windows are implemented along with Path MTU
Discovery [MD90], only one of the segments in the initial window
should have the "Don't Fragment" bit set. If implemented, the
initial window MUST be configurable. The default setting of the
initial window (to either one segment, or up to 4380 bytes) SHOULD
be per assigned numbers. Thus implementations will use the
preconfigured standard value by default, but the standard value can
be tuned within the allowed range for some specific context.
Even though the initial window is at most four times the initial
segment size, under some limited conditions TCP may send more than
four segments in the initial burst. This would occur, for example,
if the TCP data sender sends an initial large segment with the "Don't
Fragment" bit set, discovers that the MTU should be set to 512
bytes, and then retransmits eight 512-byte segments.
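The size of the burst in this example follows from simple
arithmetic. The sketch below assumes that the initial large segment
carried 4096 bytes of data; the draft does not state this size, but
it is what eight 512-byte retransmissions imply. The function name
is ours.

```python
def segments_after_resegmentation(payload_bytes: int,
                                  new_segment_size: int) -> int:
    """Segments needed to resend payload_bytes at a smaller segment size."""
    return -(-payload_bytes // new_segment_size)  # ceiling division


# Assumed 4096-byte segment, resent as 512-byte segments.
print(segments_after_resegmentation(4096, 512))
```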
This larger initial window SHOULD NOT be viewed as an encouragement
for web browsers to open multiple simultaneous TCP connections all
with larger initial windows. (Web browsers should not open four
simultaneous TCP connections to the same destination in any case,
because this works against TCP's congestion control mechanisms).
4. Disadvantages of Larger Initial Windows for the Individual
Connection
In high-congestion environments, particularly for routers that have
a bias against bursty traffic (as in the typical Drop Tail router
queues), a TCP connection can sometimes be better off starting with
an initial window of one segment. There are scenarios where a TCP
connection slow-starting from an initial window of one segment might
not have segments dropped, while a TCP connection starting with an
initial window of four segments might experience unnecessary
retransmits due to the inability of the router to handle small
bursts. This could result in an unnecessary retransmit timeout.
For a large-window connection that is able to recover without a
retransmit timeout, this could result in an unnecessarily-early
transition from the slow-start to the congestion-avoidance phase of
the window increase algorithm. These premature segment drops should
not happen in uncongested networks, or in moderately-congested
networks where the congested router uses active queue management
(such as Random Early Detection [FJ93]).
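The bias of a Drop Tail queue against bursts can be seen in a toy
model. The sketch assumes that the whole burst arrives before the
router can forward any packet and that the queue holds a fixed
number of packets; both assumptions, and all names, are ours.

```python
def drops_from_burst(burst: int, capacity: int, occupied: int = 0) -> int:
    """Packets dropped when a back-to-back burst meets a Drop Tail queue.

    capacity is the queue size in packets; occupied is the number of
    packets already queued by other traffic.
    """
    free = max(0, capacity - occupied)
    return max(0, burst - free)


# A router with three packet buffers, initially empty.
print(drops_from_burst(burst=1, capacity=3))
print(drops_from_burst(burst=4, capacity=3))
```

In this model a one-segment initial window fits in an empty
three-packet queue, while a four-segment burst loses its last
segment.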
Some TCP connections will receive better performance with the higher
initial window even if the burstiness of the initial window results
in premature segment drops. This will be true if (1) the TCP
connection recovers from the segment drop without a retransmit
timeout, and (2) the TCP connection is ultimately limited to a small
congestion window by either network congestion or by the receiver's
advertised window.
5. Disadvantages of Larger Initial Windows for the Network
We consider two separate potential dangers for the network. The
first danger would be a scenario where a large number of segments on
congested links were duplicate or unnecessarily-retransmitted
segments that had already been received at the receiver. The second
danger would be a scenario where a large number of segments on
congested links were segments that would be dropped later in the
network before reaching their final destination.
Unnecessarily-retransmitted segments:
As described in the previous section, the larger initial window
could occasionally result in a segment dropped from the initial
window, when that segment might not have been dropped if the
sender had slow-started from an initial window of one segment.
However, Appendix A shows that even in this case, the larger
initial window would not result in a large number of
unnecessarily-retransmitted segments.
Segments dropped later in the network:
How much would the larger initial window for TCP increase the
number of segments on congested links that would be dropped
before reaching their final destination? This is a problem that
can only occur for connections with multiple congested links,
where some segments might use scarce bandwidth on the first
congested link along the path, only to be dropped later along
the path.
First, many of the TCP connections will have only one congested
link along the path. Segments dropped from these connections do
not ``waste'' scarce bandwidth, and do not contribute to
congestion collapse.
However, some network paths will have multiple congested links,
and segments dropped from the initial window could use scarce
bandwidth along the earlier congested links before being dropped
on subsequent congested links. To the extent that the drop rate
is independent of the initial window used by TCP segments, the
problem of congested links carrying segments that will be
dropped before reaching their destination will be similar for
TCP connections that start by sending four segments or one
segment.
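The argument in the preceding paragraph can be made concrete with
a simple model of two congested links. The sketch assumes that
drops on the two links are independent, with probabilities p_first
and p_second; these names and the model itself are our
simplification and do not appear in the draft.

```python
def wasted_fraction(p_first: float, p_second: float) -> float:
    """Fraction of sent segments that cross the first congested link
    only to be dropped at the second."""
    return (1.0 - p_first) * p_second


# Illustrative drop rates only, not measurements.
print(wasted_fraction(0.05, 0.05))
```

The initial window does not appear in the expression: as long as
the drop probabilities themselves do not change with the initial
window, the wasted share is the same for connections that start
with four segments or with one.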
For a network with a high segment drop rate, increasing the
initial TCP congestion window could increase the segment drop
rate even further. This is in part because routers with drop
tail queue management have difficulties with bursty traffic in
times of congestion. However, this should be a second order
effect. Given uncorrelated arrivals for TCP connections, the
larger initial TCP congestion window should generally not
significantly increase the segment drop rate.
6. Network Changes
There are other changes in the network that make a larger initial
window less of a problem. These include the increasing deployment
of higher-speed links where 4K bytes is a rather small quantity of
data and the deployment of queue management mechanisms such as RED
that are more tolerant of transient traffic bursts. The current
dangers of congestion collapse most likely now come not from a 4K
initial burst from TCP connections, but from the increased
deployment of UDP connections without end-to-end congestion control.
7. Concerns
All the experiments (see section 8) with larger initial windows have
tested how the larger window affects the TCP connection that uses
the larger window. No one has thoroughly studied the impact of the
larger window on other TCP connections. In particular, no one has a
thorough set of answers about what happens when a TCP bursts a
larger initial window into or across a path already being shared by
a set of established TCP connections.
Part of the reason for this omission is the assumption that the
effect is small. In much of the Internet, large bursts already
occur due to delayed ACKs.
However, there are some common scenarios where a larger initial
window might have an effect. One example is low speed tail circuits
with routers with small buffers. For instance, imagine a dialup
link connecting routers, each of which has a handful of buffers.
Further imagine the link is already being shared by a few TCP
connections. Then a new connection launches a large initial window,
causing losses. How long will it be before the connections resume
sharing the link fairly? Are there any signs of a capture effect,
in which the new TCP gets a large fraction of the bandwidth? (A
capture effect could ensure that, say, an SMTP server got more
bandwidth than a long running FTP).
Another scenario of concern is heavily loaded links. For instance,
a couple of years ago, one of the trans-Atlantic links was so
heavily loaded that the correct congestion window size for a
connection was about one segment. In this environment, new
connections using larger initial windows would be starting with
windows that were four times too big. What would the effects be?
Do connections thrash?
8. Experimental Results
A number of studies have been done using larger initial windows.
The first study considers the effects on the global Internet, as
well as on slow dialup modem links. These test results [AHO97] show an
increase in the drop rate of 0.1 segments/transfer for 16 KB
transfers to 100 Internet hosts. While the drop rate increased
slightly, transfers using a 4 segment (512 byte MSS) initial window
showed an approximately 80% throughput improvement over standard
TCP. Tests over a 28.8 kbps dialup channel showed no increase in
the drop rate and a roughly 10% throughput improvement over standard
TCP.
In another study, larger initial windows have been shown to improve
performance over satellite channels [All97]. In this study, an
initial window of 4 segments (512 byte MSS) resulted in throughput
improvements of up to 30% (depending upon transfer size) without
increasing the loss rate.
Next, a study involving simulations of a large number of HTTP
transactions over hybrid fiber coax (HFC) indicates that the use of
larger initial windows decreases the time required to load WWW pages
[Nic97].
Finally, a study investigated the effects of using a larger initial
window on a host connected by a slow modem link and a router with a
3 packet buffer [SP97]. This study found that in this environment,
larger initial windows slightly improved performance.
9. Conclusion
This draft suggests a small change to TCP that may be beneficial to
short-lived TCP connections and those over links with long RTTs
(saving several RTTs during the initial slow-start phase). However,
before this change is implemented, several concerns need to be
addressed to ensure that this mechanism will not negatively impact
the Internet.
10. Acknowledgments
We would like to acknowledge Tim Shepard and the members of the
End-to-End-Interest Mailing List for continuing discussions of these
issues.
References
[AHO97] Mark Allman, Chris Hayes and Shawn Ostermann. An Evaluation
of TCP Slow Start Modifications, 1997. In preparation. (Draft
available from http://jarok.cs.ohiou.edu/papers/).
[All97] Mark Allman. Improving TCP Performance Over Satellite
Channels. Master's thesis, Ohio University, June 1997.
[BLFN96] Tim Berners-Lee, R. Fielding, and H. Nielsen. Hypertext
Transfer Protocol -- HTTP/1.0, May 1996. RFC 1945.
[Bra89] Robert Braden. Requirements for Internet Hosts --
Communication Layers, October 1989. RFC 1122.
[FF96] Fall, K., and Floyd, S., Simulation-based Comparisons of
Tahoe, Reno, and SACK TCP. To appear in Computer Communications
Review, July 1996.
[FJGFBL97] R. Fielding, Jeffrey C. Mogul, Jim Gettys, H. Frystyk,
and Tim Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1,
January 1997. RFC 2068.
[FJ93] Floyd, S., and Jacobson, V., Random Early Detection gateways
for Congestion Avoidance. IEEE/ACM Transactions on Networking,
V.1 N.4, August 1993, p. 397-413.
[Flo96] Floyd, S., Issues of TCP with SACK. Technical report, January
1996. Available from http://www-nrg.ee.lbl.gov/floyd/.
[MD90] Jeffrey C. Mogul and Steve Deering. Path MTU Discovery,
November 1990. RFC 1191.
[MMFR96] Matt Mathis, Jamshid Mahdavi, Sally Floyd and Allyn
Romanow. TCP Selective Acknowledgment Options, October 1996.
RFC 2018.
[Nic97] Kathleen Nichols. Improving Network Simulation with
Feedback. Submitted to InfoCom 97.
[Pos82] Jon Postel. Simple Mail Transfer Protocol, August 1982.
RFC 821.
[SP97] Tim Shepard and Craig Partridge. When TCP Starts Up With
Four Packets Into Only Three Buffers, July 1997. Internet-Draft
draft-shepard-TCP-4-packets-3-buff-00.txt (work in progress).
Appendix A
In the current environment (without Explicit Congestion
Notification), all TCPs use segment drops as indications from the
network about the limits of available bandwidth. The change to a
larger initial window should not result in a large number of
unnecessarily-retransmitted segments.
If a segment is dropped from the initial window, there are three
different ways for TCP to recover: (1) Slow-starting from a window
of one segment, as is done after a retransmit timeout, or after Fast
Retransmit in Tahoe TCP; (2) Fast Recovery without selective
acknowledgments (SACK), as is done after three duplicate ACKs in
Reno TCP; and (3) Fast Recovery with SACK, for TCP where both the
sender and the receiver support the SACK option [MMFR96]. In all
three cases, if a single segment is dropped from the initial window,
there are no unnecessarily-retransmitted segments. Note that for a
TCP sending four 512-byte segments in the initial window, a single
segment drop will not require a retransmit timeout, but can be
recovered from using the Fast Retransmit algorithm.
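One case of the note above can be checked by counting duplicate
ACKs. The sketch covers only the case where the first segment of
the initial window is the one dropped, and assumes that every later
segment in the window arrives, that each out-of-order arrival
elicits an immediate duplicate ACK, and that no ACKs are lost.
These assumptions and the names used are ours.

```python
DUPACK_THRESHOLD = 3  # duplicate ACKs that trigger Fast Retransmit


def dup_acks_after_first_drop(window_segments: int) -> int:
    """Duplicate ACKs generated when only the first segment is lost."""
    return max(0, window_segments - 1)


def fast_retransmit_possible(window_segments: int) -> bool:
    """True if the window alone yields enough duplicate ACKs."""
    return dup_acks_after_first_drop(window_segments) >= DUPACK_THRESHOLD


for window in (1, 2, 3, 4):
    print(window, fast_retransmit_possible(window))
```

Under these assumptions a four-segment initial window is the
smallest that produces three duplicate ACKs on its own when its
first segment is lost.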
We now consider the case when multiple segments are dropped from the
initial window. Using the first recovery method, slow-starting from
a window of one segment, the number of unnecessarily-retransmitted
segments is limited [FF96]. In the second case of Fast Recovery
without SACK, multiple segment drops from a window of data generally
result in a retransmit timeout. Again, the number of
unnecessarily-retransmitted segments is small. In the third case,
of Fast Recovery with SACK, there can only be
unnecessarily-retransmitted segments if a precise pattern of ACK
segments is also lost [Flo96], or if segments are seriously
reordered in the network. In any case, the number of
unnecessarily-retransmitted segments due to a larger initial window
should be small.
Authors' Addresses
Sally Floyd
Lawrence Berkeley National Laboratory
One Cyclotron Road
Berkeley, CA 94720
floyd@ee.lbl.gov
Mark Allman
NASA Lewis Research Center/Sterling Software
21000 Brookpark Road
MS 54-2
Cleveland, OH 44135
mallman@lerc.nasa.gov
Craig Partridge
BBN Technologies
10 Moulton Street
Cambridge, MA 02138
craig@bbn.com