                                                          Andrei Gurtov
INTERNET-DRAFT                                                   Sonera
Expires: April 2003                                        October 2002

                      On Treating DUPACKs in TCP
             <draft-gurtov-tsvwg-tcp-delay-spikes-01.txt>

Status of this memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or cite them other than as "work in progress".

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/lid-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This document is an individual submission to the IETF. Comments should be directed to the author.

Abstract

We suggest a more conservative management of TCP's retransmit timer and a more careful response of the TCP sender to duplicate ACKs. The former reduces the risk of triggering spurious timeouts, while the latter eliminates potentially unnecessary retransmits. Although we believe that these suggestions are permitted, if not explicitly, by RFC2581 and RFC2988, they do not seem to be widely known or deployed in existing TCP implementations. We therefore want to make implementers aware of these choices.

Terminology

We use the term 'DupThresh' for the number of DUPACKs that must arrive at the TCP sender to trigger the fast retransmit algorithm. The default value is three [RFC2581].
1. Introduction

This document discusses changes to the TCP sender's logic with the purpose of avoiding spurious retransmission timeouts and unnecessary retransmissions. The initial motivation for this work came from a number of traces collected on a wireless wide area network (WWAN) showing spurious timeouts and unnecessary retransmissions during a series of DUPACKs [GU01b], [GU01c]. WWANs are often slow, have a high RTT, possess significant bandwidth and latency variation, and can have occasional delay spikes on the order of several seconds [GU01a], [KY02], [LU99]. At the same time, unnecessary retransmissions are particularly costly on WWANs because they waste scarce radio resources and battery power, and spurious retransmission timeouts waste expensive connection time.

Large RTT variations seem to be more common on bandwidth-dominated paths, especially if the bottleneck link experiences a low degree of statistical multiplexing. On such paths, the RTT is largely determined by the packet transmission delay across the bottleneck link. Thus, packet sizes and rate changes of the bottleneck link can greatly impact the RTT.

Related work [LK00], [GL03] discusses the issue of spurious timeouts during the normal transmission state, while this document focuses on TCP behavior during a series of DUPACKs.

We believe that our suggestions should be explicitly addressed when advancing [RFC2581] and [RFC2988] further along the standards track. This document is therefore aimed at becoming an informational document.

2. Restarting the retransmit timer on DUPACKs

The general motivation for restarting the retransmit timer on DUPACKs is that a TCP sender may not want to time out as long as it still receives feedback that segments are being delivered by the network. In addition, DUPACKs may carry useful SACK information [RFC2018].
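As background for the subsections that follow, the sketch below shows how the value of the retransmit timer (RTO) is derived along the lines of [RFC2988], and what "restarting" the timer amounts to: re-arming it so that it expires one RTO from the current time. The class name, the method names, the 100 ms clock granularity, and the 60 s upper bound are illustrative assumptions, not part of this document.

```python
class RtoEstimator:
    """Illustrative RTO computation following the rules of RFC 2988."""

    ALPHA = 1 / 8      # gain for the smoothed RTT
    BETA = 1 / 4       # gain for the RTT variation
    K = 4              # variation multiplier
    MIN_RTO = 1.0      # seconds, the 1 s minimum
    MAX_RTO = 60.0     # seconds, assumed upper bound (optional in RFC 2988)

    def __init__(self, granularity=0.1):
        self.granularity = granularity  # assumed clock granularity G, seconds
        self.srtt = None                # no RTT measurement yet
        self.rttvar = None
        self.rto = 3.0                  # initial RTO before any measurement

    def on_rtt_sample(self, r):
        """Update SRTT, RTTVAR and RTO with a new RTT measurement r (seconds)."""
        if self.srtt is None:
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = min(self.MAX_RTO, max(self.MIN_RTO, rto))

    def on_timeout(self):
        """Back off the timer after it has expired."""
        self.rto = min(self.MAX_RTO, self.rto * 2)

    def restart(self, now):
        """Restarting the timer: return the new expiry time, one RTO from now."""
        return now + self.rto
```

With two 0.5 s samples followed by a single 3 s sample (a delay spike), this sketch raises the RTO from 1.25 s to 3.875 s, which illustrates how strongly one late measurement can move the timer.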
2.1 Restarting on DUPACKs Before DupThresh

For a number of reasons, the retransmit timer can fire before enough DUPACKs arrive to reach DupThresh. First, segments or ACKs can be lost, in which case a timeout is appropriate. Second, DUPACKs can be delayed, in which case a timeout is not necessary. Third, if the TCP sender uses the Limited Transmit algorithm [RFC3042] and has a small congestion window, it can take several RTTs to reach DupThresh. Fourth, a DUPACK series can result from duplicate or reordered segments that do not indicate any data loss. A timeout in such a case is clearly undesirable.

The penalty of a spurious timeout in cases 2 and 3 is small, provided that the recommendations of Section 3 are implemented. In fact, the go-back-N recovery that follows the timeout can be even faster than relying on fast retransmit and fast recovery. Furthermore, restarting the timer can unnecessarily delay recovery in case 1. Therefore, it does not appear necessary to postpone firing of the retransmit timer by restarting it on DUPACKs in cases 1-3.

However, when case 4 is identified, for example by DSACK [RFC2883], the timer should be restarted on DUPACKs. For instance, when DupThresh is increased above 3 as proposed in [BA01], DUPACKs are likely caused by reordered packets, and the timer should be restarted when they arrive.

2.2 Restarting on Fast Retransmit

The retransmit timer should be restarted when the fast retransmit is sent. Although many TCP implementations already do this (e.g., see [WS95]), it is not explicitly recommended by [RFC2988]. However, [RFC2988] does recommend it for timeout-based retransmits. A fast retransmit should not be treated differently from a timeout-based retransmit in this respect. In both cases the ACK for the retransmit should be given the same amount of time to return as for normal transmits.

2.3 Restarting on DUPACKs After DupThresh

Restarting the retransmit timer on DUPACKs after a fast retransmit should not be done without an advanced loss recovery method capable of recovering lost retransmits. The reason is that when a fast retransmitted segment is lost and the timer is restarted, an "infinite" loop is created: arriving DUPACKs trigger the transmission of new segments, which in turn trigger new DUPACKs.

Recovery of lost fast retransmits appears to be especially useful on a high speed link with a low RTT. On such a link, timeouts with the 1 s minimum [RFC2988] are costly because the large pipe is drained.
One such advanced loss recovery scheme has been proposed in [LK98]. The idea is to count DUPACKs and to retransmit the oldest outstanding segment again when the first DUPACK arrives that must correspond to a new segment sent during the second half of fast recovery. Clearly, on such repeated fast retransmits the congestion window should be halved again. For such advanced loss recovery schemes, we believe that the retransmit timer should also be restarted for every DUPACK that arrives after DupThresh has been reached.
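The restart rules of Sections 2.1 to 2.3 can be summarized as a single decision made on each arriving DUPACK. The following is a minimal sketch under stated assumptions: the function name and its parameters are hypothetical, `reordering_detected` stands for any signal that identifies case 4 (for example DSACK), and `advanced_loss_recovery` stands for a scheme such as [LK98] that can recover a lost fast retransmit.

```python
def restart_timer_on_dupack(dupack_count, dupthresh=3,
                            reordering_detected=False,
                            advanced_loss_recovery=False):
    """Return True if the retransmit timer should be restarted when the
    dupack_count-th consecutive DUPACK arrives (illustrative sketch)."""
    if dupack_count < dupthresh:
        # Section 2.1: before DupThresh, restart only when the DUPACKs are
        # known to stem from reordering or duplication (case 4).
        return reordering_detected
    if dupack_count == dupthresh:
        # Section 2.2: the fast retransmit is sent now, so restart the
        # timer just as for a timeout-based retransmit.
        return True
    # Section 2.3: after DupThresh, restart only if a lost fast retransmit
    # can be recovered; otherwise the timeout must remain the fallback.
    return advanced_loss_recovery
```

For example, with the default DupThresh of three and no additional information, only the third DUPACK restarts the timer; the first two and all later ones leave it running.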
Previous work [AP99] concludes that TCP timestamps [RFC1323] are not helpful in general. On the contrary, we found that enabling the TCP Timestamps option decreases the likelihood of a spurious timeout during fast recovery. Timing every segment allows the TCP sender to gain a more accurate estimate of the RTT. This leads to a more conservative retransmit timer if the RTT varies considerably [GU01c], [IMLGK02]. Note that the Limited Transmit algorithm can increase the length of fast recovery by two DUPACKs.

3. Treating DUPACKs after a retransmission timeout

A TCP sender can experience a spurious timeout during a DUPACK series before or after reaching DupThresh. This is possible despite restarting the retransmit timer on DUPACKs, for example when a delay spike occurs or when the bottleneck bandwidth changes. Alternatively, a genuinely necessary timeout can occur when a fast retransmit is lost.

The simplest way to avoid unnecessary retransmissions in such a case is to ignore arriving DUPACKs and follow the timeout recovery procedure described in [RFC2581]. That is, to retransmit the oldest outstanding segment, wait for a new ACK, and back off the RTO timer if it expires again.

Instead, a fairly common behavior among TCPs is to use DUPACKs arriving after a timeout for clocking out retransmissions of segments. At least the following TCPs do this: FreeBSD 4.1, NS2, and MS Windows. Doing so allows the sender to perform a large number of unnecessary retransmissions. This can result in further misbehavior, as the TCP receiver generates further DUPACKs for the arriving duplicate segments.

The 'Fast Timeout' algorithm [LK00] proposes a different response to a DUPACK that arrives after a timeout but would otherwise trigger a fast retransmit. In particular, the TCP sender adjusts the congestion control state and starts transmitting new segments as though it had entered fast recovery.
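The first of these approaches, ignoring DUPACKs after a timeout, can be sketched as a small sender-side state machine. This is a minimal sketch, not a complete TCP sender: the class, its fields, and the returned labels are hypothetical, the timer is doubled on every expiry in the manner of [RFC2988], and the sender is assumed to stop ignoring DUPACKs on the first ACK that advances the left window edge, which this document does not specify.

```python
class TimeoutRecoverySender:
    """Illustrative handling of ACKs after a retransmission timeout."""

    def __init__(self, snd_una, rto):
        self.snd_una = snd_una         # oldest unacknowledged sequence number
        self.rto = rto                 # current retransmit timer value, seconds
        self.after_timeout = False     # True between a timeout and the next new ACK
        self.retransmitted = []        # log of retransmitted sequence numbers

    def on_rto_expired(self):
        # Retransmit only the oldest outstanding segment and back off the timer.
        self.retransmitted.append(self.snd_una)
        self.rto *= 2
        self.after_timeout = True

    def on_ack(self, ack):
        if ack > self.snd_una:
            # A new ACK advances the window; normal processing resumes.
            self.snd_una = ack
            self.after_timeout = False
            return "new_ack"
        if self.after_timeout:
            # DUPACK after a timeout: do not clock out any retransmission.
            return "ignored"
        # DUPACK outside timeout recovery: counts toward DupThresh as usual.
        return "dupack"
```

In this sketch a burst of delayed DUPACKs arriving after the timeout produces no retransmissions at all; only a further timer expiry or a new ACK changes the sender's state.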
Currently, there is no experimental evidence on which approach works better in practice.

4. Security Considerations

Security considerations for computing the retransmit timer are given in [RFC2988]. There are no known additional security concerns for the recommendations of this document.

Acknowledgments

Many thanks to Mark Allman and Sally Floyd for discussions on the contents of this document. Reiner Ludwig, Alexey Kuznetsov, Noritoshi Demizu, and Farid Khafizov provided valuable comments that contributed to this document.
References

[IMLGK02] H. Inamura, G. Montenegro, R. Ludwig, A. Gurtov, F. Khafizov, "TCP over Second (2.5G) and Third (3G) Generation Wireless Networks", draft-ietf-pilc-2.5g3g-10.txt, work in progress.

[AP99] M. Allman, V. Paxson, "On Estimating End-to-End Network Path Properties", ACM SIGCOMM '99, September 1999, Cambridge, MA.

[BA01] E. Blanton, M. Allman, "Using TCP DSACKs and SCTP Duplicate TSNs to Detect Spurious Retransmissions", work in progress.

[GL03] A. Gurtov, R. Ludwig, "TCP Response to Spurious Timeouts", to appear in INFOCOM '03.

[GU01a] A. Gurtov, "Effect of Delays on TCP Performance", In Proceedings of IFIP Personal Wireless Communications, August 2001.

[GU01b] A. Gurtov, "Traces of TCP connections experiencing a delay spike", http://www.cs.helsinki.fi/u/gurtov/tcp/, January 2002.

[GU01c] A. Gurtov, "Making TCP Robust Against Delay Spikes", University of Helsinki, Department of Computer Science, C-2001-53, November 2001, http://www.cs.helsinki.fi/u/gurtov/papers/

[RFC1323] V. Jacobson, R. Braden, D. Borman, "TCP Extensions for High Performance", RFC 1323, May 1992.

[RFC2018] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, "TCP Selective Acknowledgement Options", RFC 2018, October 1996.

[RFC2581] M. Allman, V. Paxson, W. Stevens, "TCP Congestion Control", RFC 2581, April 1999.

[RFC2883] S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, A. Romanow, "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000.

[RFC2988] V. Paxson, M. Allman, "Computing TCP's Retransmission Timer", RFC 2988, November 2000.

[RFC3042] M. Allman, H. Balakrishnan, S. Floyd, "Enhancing TCP's Loss Recovery Using Limited Transmit", RFC 3042, January 2001.

[KY02] F. Khafizov, M. Yavuz, "Running TCP over IS-2000", In Proceedings of IEEE ICC 2002, June 2002.

[LK98] D. Lin, H. T. Kung, "TCP Fast Recovery Strategies: Analysis and Improvements", In Proceedings of IEEE INFOCOM 98, March 1998.
[LK00] R. Ludwig, R. H. Katz, "The Eifel Algorithm: Making TCP Robust Against Spurious Retransmissions", ACM Computer Communication Review, Vol. 30, No. 1, January 2000.

[LU99] R. Ludwig, B. Rathonyi, A. Konrad, K. Oden, A. Joseph, "Multi-layer tracing of TCP over a reliable wireless link", In Proceedings of the ACM SIGMETRICS, May 1999.

[WS95] G. R. Wright, W. R. Stevens, "TCP/IP Illustrated, Volume 2 (The Implementation)", Addison Wesley, January 1995.

Author's Address:

Andrei Gurtov
Sonera Corp.
Cellular Systems Development
P.O. Box 970, FIN-00051 Helsinki, Finland
Fax: +358(0)204064365
Tel: +358(0)20401
Email: andrei.gurtov@sonera.com
URL: http://www.cs.helsinki.fi/~gurtov

This Internet-Draft expires in April 2003.