TCPM Working Group                                          G. Fairhurst
Internet-Draft                                           A. Sathiaseelan
Obsoletes: 2861 (if approved)                                  R. Secchi
Updates: 5681 (if approved)                       University of Aberdeen
Intended status: Experimental                             March 23, 2014
Expires: September 24, 2014

              Updating TCP to support Rate-Limited Traffic


   This document proposes an update to RFC 5681 to address issues that
   arise when TCP is used to support traffic that exhibits periods where
   the sending rate is limited by the application rather than the
   congestion window.  It provides an experimental update to TCP that
   allows a TCP sender to restart quickly following a rate-limited
   interval.  This method is expected to benefit applications
   that send rate-limited traffic using TCP, while also providing an
   appropriate response if congestion is experienced.

   It also evaluates the Experimental specification of TCP Congestion
   Window Validation, CWV, defined in RFC 2861, and concludes that RFC
   2861 sought to address important issues, but failed to deliver a
   widely used solution.  This document therefore recommends that the
   status of RFC 2861 is moved from Experimental to Historic, and that
   it is replaced by the current specification.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 24, 2014.

Fairhurst, et al.      Expires September 24, 2014               [Page 1]

Internet-Draft                   new-CWV                      March 2014

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   ( in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Standards Status of this Document
   2.  Reviewing experience with TCP-CWV
   3.  Terminology
   4.  A New Congestion Window Validation method
     4.1.  Initialisation
     4.2.  Estimating the validated capacity supported by a path
     4.3.  Preserving cwnd during a rate-limited period.
     4.4.  TCP congestion control during the non-validated phase
       4.4.1.  Response to congestion in the non-validated phase
       4.4.2.  Sender burst control during the non-validated phase
       4.4.3.  Adjustment at the end of the non-validated phase
     4.5.  Examples of Implementation
       4.5.1.  Implementing the pipeACK measurement
       4.5.2.  Implementing detection of the cwnd-limited condition
   5.  Determining a safe period to preserve cwnd
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Author Notes
     9.1.  Other related work
     9.2.  Revision notes
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   TCP is used to support a range of application behaviours.  The TCP
   congestion window (cwnd) controls the number of unacknowledged
   packets/bytes that a TCP flow may have in the network at any time, a
   value known as the FlightSize [RFC5681].  A bulk application will
   always have data available to transmit.  The rate at which it sends
   is therefore limited by the maximum permitted by the receiver
   advertised window and the sender congestion window (cwnd).  In
   contrast, a rate-limited application will experience periods when the
   sender is either idle or is unable to send at the maximum rate
   permitted by the cwnd.  The update in this document targets the
   operation of TCP in such rate-limited cases.

   Standard TCP [RFC5681] states that a TCP sender SHOULD set cwnd to no
   more than the Restart Window (RW) before beginning transmission, if
   the TCP sender has not sent data in an interval exceeding the
   retransmission timeout, i.e., when an application becomes idle.
   [RFC2861] noted that this TCP behaviour was not always observed in
   current implementations.  Experiments [Bis08] confirm this to still
   be the case.

   CWV introduced the terminology of "application limited periods".
   This document describes any time that an application limits the
   sending rate, rather than being limited by the transport, as "rate-
   limited".  This update improves support for applications that vary
   their transmission rate, either with (short) idle periods between
   transmissions or by changing the rate at which they send.  These
   applications are characterised by the TCP FlightSize often being less
   than cwnd.  Many Internet applications exhibit this behaviour,
   including web browsing, http-based adaptive streaming, applications
   that support query/response type protocols, network file sharing, and
   live video transmission.  Many such applications currently avoid
   using long-lived (persistent) TCP connections (e.g. [RFC2616] servers
   typically support persistent HTTP connections, but short server
   timeouts often prevent their use).  Such applications often instead
   either use a succession of short TCP transfers or use UDP.

   Standard TCP does not impose additional restrictions on the growth of
   the congestion window when a TCP sender is unable to send at the
   maximum rate allowed by the cwnd.  In this case the rate-limited
   sender may grow a cwnd far beyond that corresponding to the current
   transmit rate, resulting in a value that does not reflect current
   information about the state of the network path the flow is using.
   Use of such an invalid cwnd may result in reduced application
   performance and/or could significantly contribute to network
   congestion.

   [RFC2861] proposed a solution to these issues in an experimental
   method known as Congestion Window Validation (CWV).  CWV was intended
   to help reduce cases where TCP accumulated an invalid cwnd.  The use
   and drawbacks of using the CWV algorithm in RFC 2861 with an
   application are discussed in Section 2.

   Section 3 defines relevant terminology.

   Section 4 specifies an alternative to CWV that seeks to address the
   same issues, but does this in a way that is expected to mitigate the
   impact on an application that varies its sending rate.  The updated
   method applies to rate-limited conditions (including both
   application-limited and idle senders).

   The goals of this update are:

   o  To not change the behaviour of a TCP sender that performs bulk
      transfers that consume the cwnd.

   o  To provide a method that co-exists with Standard TCP and other
      flows that use this updated method.

   o  To reduce transfer latency for applications that change their rate
      over short intervals of time.

   o  To avoid a TCP sender growing a large "non-validated" cwnd, when
      it has not recently sent using this cwnd.

   o  To remove the incentive for ad-hoc application or network stack
      methods (such as "padding") solely to maintain a large cwnd for
      future transmission.

   o  To incentivise the use of long-lived connections, rather than a
      succession of short-lived flows, benefiting both flows and network
      when actual congestion is encountered.

   Section 5 describes the rationale for selecting the safe period to
   preserve the cwnd.

1.1.  Standards Status of this Document

   This document was produced by the TCP Maintenance and Minor
   Extensions (tcpm) working group.

   The document updates and obsoletes the methods described in
   [RFC2861].  It recommends a set of mechanisms, including the use of
   pacing during a non-validated period.  The updated mechanisms are
   intended to have a less aggressive congestion impact than would be
   exhibited by a standard TCP sender.

   The specification in this draft is classified as "Experimental"
   pending experience with deployed implementations of the methods.

2.  Reviewing experience with TCP-CWV

   [RFC2861] described a simple modification to the TCP congestion
   control algorithm that decayed the cwnd after the transition to a
   "sufficiently-long" idle period.  This used the slow-start threshold
   (ssthresh) to save information about the previous value of the
   congestion window.  The approach relaxed the standard TCP behaviour
   [RFC5681] for an idle session, intended to improve application
   performance.  CWV also modified the behaviour where a sender
   transmitted at a rate less than allowed by cwnd.

   [RFC2861] proposed two sets of responses, one after an "application-
   limited" and one after an "idle period".  Although this distinction
   was argued, in practice differentiating the two conditions was found
   problematic in actual networks (e.g., [Bis10]).  While CWV offers
   predictable performance for long on-off periods (>>1 RTT), or slowly
   varying rate-based traffic, the performance could be unpredictable
   for variable-rate traffic and depended both upon whether an accurate
   RTT had been obtained and the pattern of application traffic relative
   to the measured RTT.

   Many applications can and often do vary their transmission over a
   wide range of rates.  Using [RFC2861], such applications often
   experienced varying performance, which made it hard for application
   developers to predict the TCP latency even when using a path with
   stable network characteristics.  We argue that an attempt to classify
   application behaviour as application-limited or idle is problematic
   and also inappropriate.  This document therefore explicitly avoids
   trying to differentiate these two cases, instead treating all rate-
   limited traffic uniformly.

   [RFC2861] has been implemented in some mainstream operating systems
   as the default behaviour [Bis08].  Analysis (e.g. [Bis10] [Fai12])
   has shown that a TCP sender using CWV is able to use available
   capacity on a shared path after an idle period.  This can benefit
   variable-rate applications, especially over long delay paths, when
   compared to the slow-start restart specified by standard TCP.
   However, CWV would only benefit an application if the idle period
   were less than several Retransmission Time Out (RTO) intervals
   [RFC6298], since the behaviour would otherwise be the same as for
   standard TCP, which resets the cwnd to the TCP Restart Window after
   this period.

   To enable better performance for variable-rate applications with TCP,
   some operating systems have chosen to support non-standard methods,
   or applications have resorted to "padding" streams to maintain their
   sending rate when they have no data to transmit.  Although
   transmitting redundant data across a network path provides good
   evidence that the path can sustain data at the offered rate, padding
   also consumes network capacity and reduces the opportunity for
   congestion-free statistical multiplexing.  For variable-rate flows,
   the benefits of statistical multiplexing can be significant and it is
   therefore a goal to find a viable alternative to padding streams.

   Experience with [RFC2861] suggests that although the CWV method
   benefited the network in a rate-limited scenario (reducing the
   probability of network congestion), the behaviour was too
   conservative for many common rate-limited applications.  This
   mechanism did not therefore offer the desirable increase in
   application performance for rate-limited applications and it is
   unclear whether applications actually use this mechanism in the
   general Internet.

   It is therefore concluded that CWV, as defined in [RFC2861], was
   often a poor solution for many rate-limited applications.  It had the
   correct motivation, but had the wrong approach to solving this
   problem.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document assumes familiarity with the terminology of TCP
   congestion control [RFC5681].

   The following terminology is used in this document:

   cwnd-limited: A TCP flow that has sent the maximum number of segments
   permitted by the cwnd, where the application utilises the allowed
   sending rate (see Section 4.5.2).

   pipeACK sample: A measure of the volume of data acknowledged by the
   network within an RTT.

   pipeACK variable: A variable that measures the available capacity
   using the set of pipeACK samples.

   pipeACK Sampling Period: The maximum period that a measured pipeACK
   sample may influence the pipeACK variable.

   Non-validated phase: The phase where the cwnd reflects a previous
   measurement of the available path capacity.

   Non-validated period, NVP: The maximum period for which cwnd is
   preserved in the non-validated phase.

   Rate-limited: A TCP flow that does not consume more than one half of
   cwnd, and hence operates in the non-validated phase.  This includes
   periods when an application is either idle or chooses to send at a
   rate less than the maximum permitted by the cwnd.

   Validated phase: The phase where the cwnd reflects a current estimate
   of the available path capacity.

4.  A New Congestion Window Validation method

   This section proposes an update to the TCP congestion control
   behaviour during a rate-limited interval.  This new method
   intentionally does not differentiate between times when the sender
   has become idle or chooses to send at a rate less than the maximum
   allowed by the cwnd.

   The period where actual usage is less than allowed by cwnd is termed
   the non-validated phase.  The update allows an application in the
   non-validated phase to resume transmission at a previous rate without
   incurring the delay of slow-start.  However, if the TCP sender
   experiences congestion using the preserved cwnd, it is required to
   immediately reset the cwnd to an appropriate value specified by the
   method.  If a sender does not take advantage of the preserved cwnd
   within the NVP, the value of cwnd is reduced, ensuring the value
   better reflects the capacity that was recently actually used.

   It is expected that this update will satisfy the requirements of many
   rate-limited applications and at the same time provide an appropriate
   method for use in the Internet.  Some applications use dummy packets
   (aka "padding") to maintain a sending rate when an application has
   no data to send.  Although this ensures the path continues to
   support the rate permitted by the cwnd, it wastes network capacity
   sending useless data.  New-CWV reduces this incentive for an
   application to send data simply to keep transport congestion state.

   The method is specified in the following subsections and is expected
   to encourage applications and TCP stacks to use standards-based
   congestion control methods.  It may also encourage the use of long-
   lived connections where this offers benefit (such as persistent HTTP
   connections).

4.1.  Initialisation

   A sender starts a TCP connection in the validated phase and
   initialises the pipeACK variable to the "undefined" value.  This
   value inhibits use of the value in cwnd calculations.

4.2.  Estimating the validated capacity supported by a path

   [RFC6675] defines a variable, FlightSize, that indicates the
   instantaneous amount of data that has been sent, but not cumulatively
   acknowledged.  In this method a new variable "pipeACK" is introduced
   to measure the acknowledged size of the network pipe.  This is used
   to determine if the sender has validated the cwnd. pipeACK differs
   from FlightSize in that it is evaluated over a window of acknowledged
   data, rather than reflecting the amount of data outstanding.

   A sender determines a pipeACK sample by measuring the volume of data
   that was acknowledged by the network over the period of a measured
   Round Trip Time (RTT).  Using the variables defined in [RFC6675], a
   value could be measured by caching the value of HighACK and after one
   RTT measuring the difference between the cached HighACK value and the
   current HighACK value.  Other equivalent methods may be used.

   A sender is not required to continuously update the pipeACK variable
   after each received ACK, but SHOULD perform a pipeACK sample at least
   once per RTT when it has sent unacknowledged segments.
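
   As an illustration only, the HighACK caching approach described
   above could be sketched in Python as follows.  The class and method
   names are hypothetical, time is passed in explicitly, and sequence
   number wrap-around is ignored for brevity; this is a sketch under
   those assumptions rather than a reference implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipeAckSampler:
    """Takes at most one pipeACK sample per measured RTT (hypothetical helper)."""

    rtt: float                 # current RTT estimate, in seconds (assumed known)
    start_time: float = 0.0    # time at which the current measurement began
    cached_high_ack: int = 0   # HighACK value cached at start_time

    def begin(self, now: float, high_ack: int) -> None:
        """Cache HighACK at the start of a measurement interval."""
        self.start_time = now
        self.cached_high_ack = high_ack

    def on_ack(self, now: float, high_ack: int) -> Optional[int]:
        """Return a pipeACK sample in bytes once one RTT has elapsed, else None."""
        if now - self.start_time < self.rtt:
            return None
        # Volume acknowledged over the interval (wrap-around not handled).
        sample = high_ack - self.cached_high_ack
        self.begin(now, high_ack)  # start the next measurement interval
        return sample
```

   A caller would invoke begin() when data is first outstanding and
   on_ack() for each arriving ACK, feeding any returned sample into the
   pipeACK variable.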

   The pipeACK variable MAY consider multiple pipeACK samples over the
   pipeACK Sampling Period.  The value of the pipeACK variable MUST NOT
   exceed the maximum (highest value) within the sampling period.  This
   specification defines the pipeACK Sampling Period as Max(3*RTT, 1
   second).  This period enables a sender to compensate for large
   fluctuations in the sending rate, where there may be pauses in
   transmission, and allows the pipeACK variable to reflect the largest
   recently measured pipeACK sample.

   When no measurements are available, the pipeACK variable is set to
   the "undefined value".  This value is used to inhibit entering the
   non-validated phase until the first new measurement of a pipeACK
   sample.

   The pipeACK variable MUST NOT be updated during TCP Fast Recovery.
   That is, the sender stops collecting pipeACK samples during loss
   recovery.  The method RECOMMENDS that the TCP SACK option [RFC2018]
   is enabled and the method defined in [RFC6675] is used to recover
   missing segments.  This allows the sender to more accurately
   determine the number of missing bytes during the loss recovery phase,
   and using this method will result in a more appropriate cwnd
   following loss.

4.3.  Preserving cwnd during a rate-limited period.

   The updated method creates a new TCP sender phase that captures
   whether the cwnd reflects a validated or non-validated value.  The
   phases are defined as:

   o  Validated phase: pipeACK >=(1/2)*cwnd, or pipeACK is undefined.
      This is the normal phase, where cwnd is expected to be an
      approximate indication of the capacity currently available along
      the network path, and the standard methods are used to increase
      cwnd (currently [RFC5681]).

   o  Non-validated phase: pipeACK <(1/2)*cwnd.  This is the phase where
      the cwnd has a value based on a previous measurement of the
      available capacity, and the usage of this capacity has not been
      validated in the pipeACK Sampling Period.  That is, when it is not
      known whether the cwnd reflects the currently available capacity
      along the network path.  The mechanisms to be used in this phase
      seek to determine a safe value for cwnd and an appropriate
      reaction to congestion.

   Note: A threshold is needed to determine whether a sender is in the
   validated or non-validated phase.  We start by noting that a standard
   TCP sender in slow-start is permitted to double its FlightSize from
   one RTT to the next.  This motivated the choice of a threshold value
   of 1/2.  This threshold ensures a sender does not further increase
   the cwnd as long as the FlightSize is less than (1/2*cwnd).
   Furthermore, a sender with a FlightSize less than (1/2*cwnd) may in
   the next RTT be permitted by the cwnd to send at a rate that more
   than doubles the FlightSize, and hence this case needs to be regarded
   as non-validated and a sender therefore needs to employ additional
   mechanisms while in this phase.
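
   The phase test defined above can be expressed compactly.  The
   following Python sketch is illustrative; the function name is
   hypothetical and None is assumed to model the "undefined" pipeACK
   value.

```python
from typing import Optional


def is_validated(pipe_ack: Optional[int], cwnd: int) -> bool:
    """Return True for the validated phase, False for the non-validated phase.

    pipe_ack and cwnd are in bytes; None models the "undefined" value.
    """
    if pipe_ack is None:
        return True                 # undefined pipeACK: stay in the validated phase
    return 2 * pipe_ack >= cwnd     # integer form of pipeACK >= (1/2)*cwnd
```

   Using the integer comparison 2*pipeACK >= cwnd avoids rounding
   effects when cwnd is an odd number of bytes.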

4.4.  TCP congestion control during the non-validated phase

   A TCP sender MUST enter the non-validated phase when the pipeACK is
   less than (1/2)*cwnd.

   A TCP sender that enters the non-validated phase SHOULD preserve the
   cwnd (i.e., this neither grows nor reduces while the sender remains
   in this phase).  If the sender receives an indication of congestion
   (loss or Explicit Congestion Notification, ECN, mark [RFC3168]) it
   uses the method described below.  The phase is concluded after a
   fixed period of time (the NVP, as explained in Section 4.4.3) or when
   the sender transmits sufficient data so that pipeACK > (1/2)*cwnd
   (i.e. the sender is no longer rate-limited).

   The behaviour in the non-validated phase is specified as:

   o  A sender determines whether to increase the cwnd based upon
      whether it is cwnd-limited (see Section 4.5.2):


      *  A sender that is cwnd-limited MAY use the standard TCP method
         to increase cwnd (i.e. a TCP sender that fully utilises the
         cwnd is permitted to increase cwnd for each received ACK using
         standard methods).

      *  A sender that is not cwnd-limited MUST NOT increase the cwnd
         when ACK packets are received in this phase.

   o  If the sender receives an indication of congestion while in the
      non-validated phase (i.e., detects loss, or an ECN mark), the
      sender MUST exit the non-validated phase (reducing the cwnd as
      defined in Section 4.4.1).

   o  If the Retransmission Time Out (RTO) expires while in the non-
      validated phase, the sender MUST exit the non-validated phase.  It
      then resumes using the standard TCP RTO mechanism [RFC5681].

   o  A sender with a pipeACK variable greater than (1/2)*cwnd SHOULD
      enter the validated phase.  (A rate-limited sender will not
      normally be impacted by whether it is in a validated or non-
      validated phase, since it will normally not consume the entire
      cwnd.  However a change to the validated phase will release the
      sender from constraints on the growth of cwnd, and restore the use
      of the standard congestion response.)

   The cwnd-limited behaviour may be triggered during a transient
   condition that occurs when a sender is in the non-validated phase and
   receives an ACK that acknowledges received data, the cwnd was fully
   utilised, and more data is awaiting transmission than may be sent
   with the current cwnd.  The sender is then allowed to use the
   standard method to increase the cwnd.  (Note, if the sender succeeds
   in sending these new segments, the updated cwnd and pipeACK variables
   will eventually result in a transition to the validated phase.)
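
   The rules for growing cwnd on receipt of an ACK can be sketched as
   below.  This is a simplified illustration: the function name and
   parameters are hypothetical, and the "standard method" is assumed to
   be the basic slow-start and congestion avoidance increase of
   [RFC5681], without appropriate byte counting or other refinements.

```python
def cwnd_after_ack(cwnd: int, ssthresh: int, acked: int, smss: int,
                   validated: bool, cwnd_limited: bool) -> int:
    """Return the new cwnd (bytes) after an ACK for `acked` new bytes."""
    # Non-validated phase and not cwnd-limited: cwnd MUST NOT increase.
    if not validated and not cwnd_limited:
        return cwnd
    # Otherwise the standard increase is permitted (simplified RFC 5681).
    if cwnd < ssthresh:
        return cwnd + min(acked, smss)            # slow start
    return cwnd + max(1, smss * smss // cwnd)     # congestion avoidance
```

   For example, a sender in the non-validated phase with cwnd = 10000
   bytes keeps cwnd unchanged while it is not cwnd-limited, but may grow
   it once it fully utilises the cwnd.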

4.4.1.  Response to congestion in the non-validated phase

   Reception of congestion feedback while in the non-validated phase is
   interpreted as an indication that it was inappropriate for the sender
   to use the preserved cwnd.  The sender is therefore required to
   quickly reduce the rate to avoid further congestion.  Since the cwnd
   does not have a validated value, a new cwnd value must be selected
   based on the utilised rate.

   A sender that detects a packet-drop, or receives an indication of an
   ECN marked packet, MUST record the current FlightSize in the variable
   LossFlightSize and MUST calculate a safe cwnd for loss recovery using
   the method below:

           cwnd = (Max(pipeACK,LossFlightSize))/2.

   The pipeACK value is not updated during loss recovery (see
   Section 4.2).  If there is a valid pipeACK value, the new cwnd is
   adjusted to reflect that a non-validated cwnd may be larger than the
   actual FlightSize, or recently used FlightSize (recorded in pipeACK).
   The updated cwnd therefore prevents overshoot by a sender
   significantly increasing its transmission rate during the recovery
   period.

   At the end of the recovery phase, the TCP sender MUST reset the cwnd
   using the method below:

           cwnd = (Max(pipeACK,LossFlightSize) - R)/2.

   Where R is the volume of data that was retransmitted during the
   recovery phase.

   If the sender implements a method that allows it to identify the
   number of ECN-marked segments within a window that were observed by
   the receiver, the sender SHOULD use the method above, further
   reducing R by the number of marked segments.

   After completing the loss recovery phase, the sender MUST re-
   initialise the pipeACK variable to the "undefined" value.  This
   ensures that standard TCP methods are used immediately after
   completing loss recovery until a new pipeACK value can be determined.

   ssthresh is adjusted using the standard TCP method.
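
   The two cwnd calculations in this section can be illustrated with a
   short Python sketch.  The function names are hypothetical, all
   quantities are in bytes, and clamping a negative intermediate result
   to zero is an assumption made here for safety; it is not stated in
   the text above.

```python
def cwnd_on_congestion(pipe_ack: int, loss_flight_size: int) -> int:
    """cwnd used during loss recovery: Max(pipeACK, LossFlightSize)/2."""
    return max(pipe_ack, loss_flight_size) // 2


def cwnd_after_recovery(pipe_ack: int, loss_flight_size: int,
                        retransmitted: int) -> int:
    """cwnd at the end of recovery: (Max(pipeACK, LossFlightSize) - R)/2."""
    base = max(pipe_ack, loss_flight_size) - retransmitted
    return max(base, 0) // 2    # clamp to zero is an assumption of this sketch
```

   With pipeACK = 4000, LossFlightSize = 6000 and R = 1000, the sender
   would use a cwnd of 3000 bytes during recovery and 2500 bytes once
   recovery completes.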

   Note: The adjustment by reducing cwnd by the volume of data not sent
   (R) follows the method proposed for Jump Start [Liu07].  The
   inclusion of the term R makes the adjustment more conservative than
   standard TCP.  This is required, since a sender in the non-validated
   state may increase the rate more than a standard TCP would have done
   relative to what was sent in the last RTT (i.e., more than doubled
   the number of segments in flight relative to what it sent in the last
   RTT).  The additional reduction after congestion is beneficial when
   the LossFlightSize has significantly overshot the available path
   capacity incurring significant loss (e.g. following a change of path
   characteristics or when additional traffic has taken a larger share
   of the network bottleneck during a period when the sender transmits

   Note: The pipeACK value is only valid during a non-validated phase,
   and therefore does not exceed cwnd/2.  If LossFlightSize and R were
   small, then this can result in the final cwnd after loss recovery
   being not more than 1/4 of the cwnd on detection of congestion.  This
   reduction is conservative compared to standard TCP.  pipeACK is reset
   to undefined after completing loss recovery.  Subsequent updates to
   cwnd do not therefore reflect pipeACK history before any congestion
   event.

4.4.2.  Sender burst control during the non-validated phase

   TCP congestion control allows a sender to accumulate a cwnd that
   would allow it to send a burst of segments with a total size up to
   the difference between the FlightSize and cwnd.  Such bursts can
   impact other flows that share a network bottleneck and/or may induce
   congestion when buffering is limited.

   Various methods have been proposed to control the sender burstiness
   [Hug01], [All05].  For example, TCP can limit the number of new
   segments it sends per received ACK.  This is effective when a flow of
   ACKs is received, but cannot be used to control a sender that has
   not sent appreciable data in the previous RTT [All05].

   This document recommends using a method to avoid line-rate bursts
   after an idle or rate-limited interval when there is less reliable
   information about the capacity of the network path: A TCP sender in
   the non-validated phase SHOULD control the maximum burst size, e.g.
   using a rate-based pacing algorithm in which a sender paces out the
   cwnd over its estimate of the RTT, or some other method, to prevent
   many segments being transmitted contiguously at line-rate.  The most
   appropriate method(s) to implement pacing depend on the design of the
   TCP/IP stack, speed of interface and whether hardware support (such
   as TCP Segment Offload, TSO) is used.  The present document does not
   recommend any specific method.
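
   Purely as an illustration of the rate-based pacing example mentioned
   above, the following Python sketch computes an inter-segment send
   interval that spreads the cwnd over one RTT.  The function name and
   the choice to pace in units of SMSS-sized segments are assumptions;
   an actual stack would need to account for timer granularity and
   offload hardware.

```python
def pacing_interval(cwnd: int, rtt: float, smss: int) -> float:
    """Seconds to wait between segments so that cwnd is paced over one RTT.

    cwnd and smss are in bytes; rtt is the sender's RTT estimate in seconds.
    """
    segments = max(1, cwnd // smss)   # whole segments permitted by cwnd
    return rtt / segments
```

   For a cwnd of ten segments and an RTT of 100 ms, this yields one
   segment every 10 ms rather than a ten-segment burst at line rate.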

4.4.3.  Adjustment at the end of the non-validated phase

   An application that remains in the non-validated phase for a period
   greater than the NVP is required to adjust its congestion control
   state.  If the sender exits the non-validated phase after this
   period, it MUST update the ssthresh:

         ssthresh = max(ssthresh, 3*cwnd/4).

   (This adjustment of ssthresh ensures that the sender records that it
   has safely sustained the present rate.  The change is beneficial to
   rate-limited flows that encounter occasional congestion, and could
   otherwise suffer an unwanted additional delay in recovering the
   sending rate.)

   The sender MUST then update cwnd to be not greater than:

            cwnd = max((1/2)*cwnd, IW).

   Where IW is the appropriate TCP initial window, used by the TCP
   sender (e.g. [RFC5681]).

   Note: This adjustment ensures that the sender responds conservatively
   after remaining in the non-validated phase for more than the non-
   validated period.  In this case, it reduces the cwnd by a factor of
   two from the preserved value.  This adjustment is helpful when flows
   accumulate but do not use a large cwnd, and seeks to mitigate the
   impact when these flows later resume transmission.  This could for
   instance mitigate the impact if multiple high-rate application flows
   were to become idle over an extended period of time and then were
   simultaneously awakened by some external event.
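
   The adjustment applied when the NVP expires can be sketched as
   follows.  The function name is hypothetical and all values are in
   bytes; ssthresh is computed from the preserved cwnd before cwnd is
   reduced, following the order given above.

```python
from typing import Tuple


def adjust_after_nvp(cwnd: int, ssthresh: int, iw: int) -> Tuple[int, int]:
    """Return (new_cwnd, new_ssthresh) at the end of the non-validated period."""
    new_ssthresh = max(ssthresh, 3 * cwnd // 4)   # ssthresh = max(ssthresh, 3*cwnd/4)
    new_cwnd = max(cwnd // 2, iw)                 # cwnd = max((1/2)*cwnd, IW)
    return new_cwnd, new_ssthresh
```

   For example, a preserved cwnd of 40000 bytes with ssthresh = 20000
   and IW = 4380 becomes cwnd = 20000 and ssthresh = 30000.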

4.5.  Examples of Implementation

   This section provides informative examples of implementation methods.
   Implementations may choose to use other methods that comply with the
   normative requirements.

4.5.1.  Implementing the pipeACK measurement

   A pipeACK sample may be measured once each RTT.  Compared with
   calculating a sample after each acknowledgement, this reduces the
   processing burden and the storage requirements at the sender.

   Since application behaviour can be bursty using CWV, it may be
   desirable to implement a maximum filter to accumulate the measured
   values so that the pipeACK variable records the largest pipeACK
   sample within the pipeACK Sampling Period.  One simple way to
   implement this is to divide the pipeACK Sampling Period into several
   (e.g. 5) equal length measurement periods.  The sender then records
   the start time for each measurement period and the highest measured
   pipeACK sample.  At the end of the measurement period, any
   measurement(s) that are older than the pipeACK Sampling Period are
   discarded.  The pipeACK variable is then assigned the largest of the
   set of the highest measured values.

     +----------+----------+           +----------+---......
     | Sample A | Sample B | No        | Sample C | Sample D
     |          |          | Sample    |          |
     | |\ 5     |          |           |          |
     | | |      |          |           |  /\ 4    |
     | | |      |  |\ 3    |           |  | \     |
     | | \      | |  \---  |           |  /  \    |   /| 2
     |/   \------|       - |           | /    \------/ \...
     +----------+---------\+----/ /----+/---------+-------------> Time

                         Sampling Period          Current Time

   Figure 1: Example of measuring pipeACK samples

   Figure 1 shows an example of how measurement samples may be
   collected.  At the time represented by the figure, new samples are
   being accumulated into Sample D.  Three previous samples also fall
   within the pipeACK Sampling Period: A, B, and C.  There was also a
   period of inactivity between Samples B and C during which no
   measurements were taken.  The current value of the pipeACK variable
   will be 5, the maximum across all samples.

   After one further measurement period, Sample A will be discarded,
   since it is then older than the pipeACK Sampling Period, and the
   pipeACK variable will be recalculated.  Its value will be the larger
   of Sample C or the final value accumulated in Sample D.
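
   The maximum filter described in this section could be sketched as
   follows.  This is only one possible arrangement, under stated
   assumptions: the class name PipeAckFilter, its method names, the
   use of floating-point seconds for time, and the use of None to
   stand for an "undefined" pipeACK are all choices made for this
   illustration.

```python
from collections import deque


class PipeAckFilter:
    """Maximum filter over the pipeACK Sampling Period (illustrative)."""

    def __init__(self, sampling_period, num_periods=5):
        self.sampling_period = sampling_period
        # Length of one measurement period (e.g. 1/5 of the total).
        self.period_length = sampling_period / num_periods
        # Each entry is [start_time, highest_sample_in_that_period].
        self.periods = deque()

    def add_sample(self, now, sample):
        """Fold a new pipeACK sample into the current measurement period."""
        if not self.periods or now - self.periods[-1][0] >= self.period_length:
            # Start a new measurement period and record its start time.
            self.periods.append([now, sample])
        else:
            # Keep only the highest sample seen in this period.
            self.periods[-1][1] = max(self.periods[-1][1], sample)
        self._expire(now)

    def _expire(self, now):
        """Discard measurements older than the pipeACK Sampling Period."""
        while self.periods and now - self.periods[0][0] > self.sampling_period:
            self.periods.popleft()

    def value(self, now):
        """Return the pipeACK variable, or None if no sample is held."""
        self._expire(now)
        return max((p[1] for p in self.periods), default=None)


if __name__ == "__main__":
    f = PipeAckFilter(sampling_period=5.0)
    for t, s in [(0.0, 5), (1.0, 3), (3.0, 4), (4.0, 2)]:  # Samples A-D
        f.add_sample(t, s)
    print(f.value(4.5), f.value(5.5))
```

   The usage lines mirror Figure 1: while Sample A is still within the
   Sampling Period the filter reports 5, and once Sample A has aged out
   it reports the largest remaining value.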

   Note that the pipeACK Sampling Period and the NVP do not necessarily
   require a new timer to be implemented.  An alternative is to record
   a timestamp when the sender enters the non-validated phase.  Each
   time a sender transmits a new segment, this timestamp may be used to
   determine if the NVP has expired.  If the period has expired, the
   sender may take into account how many NVPs have passed and make one
   reduction (as defined in Section 4.4.3) for each NVP.
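
   A timestamp-based check of this kind might look like the sketch
   below.  It assumes times in seconds and window sizes in bytes; the
   names apply_nvp_reductions and NVP_SECONDS are hypothetical.  The
   reduction inside the loop restates the rules of Section 4.4.3.

```python
NVP_SECONDS = 300.0  # five-minute non-validated period


def apply_nvp_reductions(cwnd, ssthresh, iw, nvp_start, now,
                         nvp=NVP_SECONDS):
    """Apply one Section 4.4.3 reduction per whole NVP that has elapsed.

    Intended to be called when the sender transmits a new segment, so
    that no separate timer is needed.  Returns (cwnd, ssthresh).
    """
    # Number of complete NVPs since the timestamp was recorded.
    elapsed_periods = int((now - nvp_start) // nvp)
    for _ in range(elapsed_periods):
        ssthresh = max(ssthresh, (3 * cwnd) // 4)
        cwnd = max(cwnd // 2, iw)
    return cwnd, ssthresh


if __name__ == "__main__":
    # 650 s after entering the phase: two whole NVPs have passed.
    print(apply_nvp_reductions(80000, 10000, 14600, nvp_start=0.0, now=650.0))
```

   Because cwnd is floored at IW, repeated reductions converge quickly,
   so an implementation could also cap the number of loop iterations.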

4.5.2.  Implementing detection of the cwnd-limited condition

   A method is required to detect the cwnd-limited condition (see
   Section 4.4).  This is used to detect a condition where a sender in
   the non-validated phase receives an ACK, but the size of cwnd
   prevents sending more new data.

   In simple terms, this condition is true only when the TCP sender's
   FlightSize is equal to or larger than the cwnd.  However, an
   implementation must consider other constraints on the way in which
   the cwnd variable is used, for instance the need to support methods
   such as the Nagle Algorithm and TCP Segment Offload (TSO).  This can
   result in a sender becoming cwnd-limited when the cwnd is nearly,
   rather than completely, equal to the FlightSize.
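
   As a rough illustration, the test could be written as below.  The
   slack parameter is an assumption of this sketch (for example, one
   segment size) used to model the "nearly equal" case; this document
   does not define such a parameter or recommend a value for it.

```python
def is_cwnd_limited(flight_size, cwnd, slack=0):
    """Return True if cwnd prevents the sender from sending new data.

    flight_size and cwnd are in bytes.  slack is a hypothetical
    allowance, in bytes, for constraints such as Nagle or TSO that stop
    the sender from filling cwnd exactly.
    """
    return flight_size + slack >= cwnd


if __name__ == "__main__":
    print(is_cwnd_limited(10000, 10000))             # exactly full
    print(is_cwnd_limited(9000, 10000))              # room for more data
    print(is_cwnd_limited(9000, 10000, slack=1460))  # nearly full
```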

5.  Determining a safe period to preserve cwnd

   This section documents the rationale for selecting the maximum period
   that cwnd may be preserved, known as the non-validated period, NVP.

   Limiting the period that cwnd may be preserved avoids undesirable
   side effects that would result if the cwnd were to be kept
   unnecessarily high for an arbitrarily long period, which was a part
   of the problem that CWV originally attempted to address.  The period
   a sender may safely preserve the cwnd is a function of the period
   that a network path is expected to sustain the capacity reflected by
   cwnd.  There is no ideal choice for this time.

   A period of five minutes was chosen for this NVP.  This is a
   compromise: it is larger than the idle intervals of common
   applications, but not significantly larger than the period for which
   the capacity of an Internet path may commonly be regarded as stable.
   The capacity of wired networks is usually relatively stable for
   periods of several minutes, and load stability increases with the
   capacity.  This suggests that cwnd may be preserved for at least a
   few minutes.

   There are cases where the TCP throughput exhibits significant
   variability over a time less than five minutes.  Examples could
   include wireless topologies, where the TCP rate may fluctuate on the
   order of a few seconds as a consequence of medium access protocol
   instabilities.  Mobility changes may also impact TCP performance
   over short time scales.  Senders that observe such rapid changes in
   the path characteristic may also experience increased congestion
   with the new method; however, such variation would likely also
   impact TCP's behaviour when supporting interactive and bulk
   applications.

   Routing algorithms may modify the network path, disrupting the RTT
   measurement and changing the capacity available to a TCP connection;
   however, such changes do not often occur within a time frame of a
   few minutes.

   The value of five minutes is therefore expected to be sufficient for
   most current applications.  Simulation studies (e.g. [Bis11]) also
   suggest that for many practical applications, the performance using
   this value will not be significantly different to that observed using
   a non-standard method that does not reset the cwnd after idle.

   Finally, other TCP sender mechanisms have used a 5 minute timer, and
   there could be simplifications in some implementations by reusing the
   same interval.  TCP defines a default user timeout of 5 minutes
   [RFC0793], i.e., how long transmitted data may remain unacknowledged
   before a connection is forcefully closed.

6.  Security Considerations

   General security considerations concerning TCP congestion control are
   discussed in [RFC5681].  This document describes an algorithm that
   updates one aspect of the congestion control procedures, and so the
   considerations described in RFC 5681 also apply to this algorithm.

7.  IANA Considerations

   There are no IANA considerations.

8.  Acknowledgments

   The authors acknowledge the contributions of Dr I Biswas and Mr Ziaul
   Hossain in supporting the evaluation of CWV and for their help in
   developing the mechanisms proposed in this draft.  We also
   acknowledge comments received from the Internet Congestion Control
   Research Group, in particular Yuchung Cheng, Mirja Kuehlewind, Joe
   Touch, and Mark Allman.  This work was part-funded by the European
   Community under its Seventh Framework Programme through the Reducing
   Internet Transport Latency (RITE) project (ICT-317700).

9.  Author Notes

   RFC-Editor note: please remove this section prior to publication.

9.1.  Other related work

   RFC-Editor note: please remove this section prior to publication.

   There are several issues to be discussed more widely:

      o There are potential interactions with the Experimental update in
      [RFC6928] that raises the TCP initial window to ten segments.  Do
      these cases need to be elaborated?

         This relates to the Experimental specification for increasing
         the TCP IW defined in RFC 6928.

         The two methods have different functions and different response
         to loss/congestion.

         RFC 6928 proposes an experimental update to TCP that would
         increase the IW to ten segments.  This would allow faster
         opening of the cwnd, and also a large (same size) restart
         window.  This approach is based on the assumption that many
         forward paths can sustain bursts of up to ten segments without
         (appreciable) loss.  Such a significant increase in cwnd must
         be matched with an equally large reduction of cwnd if loss/
         congestion is detected, and such a congestion indication is
         likely to require future use of IW=10 to be disabled for this
         path for some time.  This guards against the unwanted behaviour
         of a series of short flows continuously flooding a network path
         without network congestion feedback.

         In contrast, this document proposes an update with a rationale
         that relies on recent previous path history to select an
         appropriate cwnd after restart.

         The behaviour differs in three ways:

         1) For applications that send little initially, new-cwv may
         constrain more than RFC 6928, but would not require the
         connection to reset any path information when a restart
         incurred loss.  In contrast, new-cwv would allow the TCP
         connection to preserve the cached cwnd; any loss would impact
         cwnd, but not impact other flows.

         2) For applications that utilise more capacity than provided by
         a cwnd of 10 segments, this method would permit a larger
         restart window compared to a restart using the method in RFC
         6928.  This is justified by the recent path history.

         3) new-CWV is intended to also be used for rate-limited
         applications, where the application sends, but does not seek to
         fully utilise the cwnd.  In this case, new-cwv constrains the
         cwnd to that justified by the recent path history.  The
         performance trade-offs are hence different, and it would be
         possible to enable new-cwv when also using the method in RFC
         6928, and yield benefits.

      o There is potential overlap with the Laminar proposal (draft-

         The current draft was intended as a standards-track update to
         TCP, rather than a new transport variant.  At least, it would
         be good to understand how the two interact and whether there is
         a possibility of a single method.

      o There is potential performance loss in loss of a short burst
      (off list with M Allman)

         A sender can transmit several segments then become idle.  If
         the first segments are all ACK'ed the ssthresh collapses to a
         small value (no new data is sent by the idle sender).  Loss of
         the later data results in congestion (e.g. maybe a RED drop or
         some other cause, rather than the maximum rate of this flow).
         When the sender performs loss recovery it may have an
         appreciable pipeACK and cwnd, but a very low FlightSize - the
         Standard algorithm results in an unusually low cwnd ((1/2)*

         A constant rate flow would have maintained a FlightSize
         appropriate to pipeACK (cwnd if it is a bulk flow).

         This could be fixed by adding a new state variable?  It could
         also be argued this is a corner case (e.g. loss of only the
         last segments would have resulted in RTO), the impact could be

      o There is potential interaction with TCP Control Block Sharing(M

         An application that is non-validated can accumulate a cwnd that
         is larger than the actual capacity.  Is this a fair value to
         use in TCB sharing?

         We propose that TCB sharing should use the pipeACK in place of
         cwnd when a TCP sender is in the Non-validated phase.  This
         value better reflects the capacity that the flow has utilised
         in the network path.
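
         The proposal above can be expressed as a small selection
         rule.  This is a sketch only: the function name
         window_for_tcb_sharing is hypothetical, and falling back to
         cwnd when pipeACK is undefined (represented here by None) is
         an assumption that this document does not specify.

```python
def window_for_tcb_sharing(cwnd, pipe_ack, non_validated):
    """Pick the window value to export for TCP Control Block sharing.

    cwnd and pipe_ack are in bytes; pipe_ack may be None to stand for
    an undefined pipeACK (an assumption of this sketch).
    """
    if non_validated and pipe_ack is not None:
        # In the non-validated phase, pipeACK reflects the capacity
        # the flow has actually used on the path.
        return pipe_ack
    return cwnd


if __name__ == "__main__":
    print(window_for_tcb_sharing(cwnd=60000, pipe_ack=15000,
                                 non_validated=True))
```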

9.2.  Revision notes

   RFC-Editor note: please remove this section prior to publication.

   Draft 03 was submitted to ICCRG to receive comments and feedback.

   Draft 04 contained the first set of clarifications after feedback:

   o  Changed name to application limited and used the term rate-limited
      in all places.

   o  Added justification and many minor changes suggested on the list.

   o  Added text to tie-in with more accurate ECN marking.

   o  Added ref to Hug01

   Draft 05 contained various updates:

   o  New text to redefine how to measure the acknowledged pipe,
      differentiating this from the FlightSize, and hence avoiding
      previous issues with infrequent large bursts of data not being
      validated.  A key new feature is that pipeACK only triggers
      leaving the NVP after the size of the pipe has been acknowledged.
      This removed the need for hysteresis.

   o  Reduction values were changed to 1/2, following analysis of
      suggestions from ICCRG.  This also sets the "target" cwnd as twice
      the used rate for non-validated case.

   o  Introduced a symbolic name (NVP) to denote the 5 minute period.

   Draft 06 contained various updates:

   o  Required reset of pipeACK after congestion.

   o  Added comment on the effect of congestion after a short burst (M.

   o  Correction of minor Typos.

   WG draft 00 contained various updates:

   o  Updated initialisation of pipeACK to maximum value.

   o  Added note on intended status still to be determined.

   WG draft 01 contained:

   o  Added corrections from Richard Scheffenegger.

   o  Raffaello Secchi added to the mechanism, based on implementation

   o  Removed the requirement for the method to use the TCP SACK option

   o  Although it may be desirable to use SACK, this is not essential to
      the algorithm.

   o  Added the notion of the sampling period to accommodate large rate
      variations and ensure that the method is stable.  This algorithm
      to be validated through implementation.

   WG draft 02 contained:

   o  Clarified language around pipeACK variable and pipeACK sample -
      Feedback from Aris Angelogiannopoulos.

   WG draft 03 contained:

   o  Editorial corrections - Feedback from Anna Brunstrom.

   o  An adjustment to the procedure at the start and end of loss
      recovery to align the two equations.

   o  Further clarification of the "undefined" value of the pipeACK

   WG draft 04 contained:

   o  Editorial corrections.

   o  Introduced the "cwnd-limited" term.

   o  An adjustment to the procedure at the start of a cwnd-limited
      phase - the new text is intended to ensure that new-cwv is not
      unnecessarily more conservative than standard TCP when the flow is
      cwnd-limited.  This resolves two issues: first it prevents
      pathologies in which pipeACK increases slowly and erratically.  It
      also ensures that performance of bulk applications is not
      significantly impacted when using the method.

   o  Clearly identifies that pacing (or equivalent) is required during
      the NVP to control burstiness.  New section added.

   WG draft 05 contained:

   o  Clarification to first two bullets in Section 4.4 describing cwnd-
      limited, to explain these are really alternates to the same case.

   o  Section giving implementation examples was restructured to clarify
      there are two methods described.

   o  Cross References to sections updated - thanks to comments from
      Martin Winbjoerk and Tim Wicinski.

   WG draft 06 contained:

   o  The section giving implementation examples was restructured to
      clarify there are two methods described.

   o  Justification of design decisions.

   o  Re-organised text to improve clarity of argument.

10.  References

10.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018, October 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2861]  Handley, M., Padhye, J., and S. Floyd, "TCP Congestion
              Window Validation", RFC 2861, June 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP", RFC
              3168, September 2001.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

   [RFC6675]  Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
              and Y. Nishida, "A Conservative Loss Recovery Algorithm
              Based on Selective Acknowledgment (SACK) for TCP", RFC
              6675, August 2012.

10.2.  Informative References

   [All05]    Allman, M. and E. Blanton, "Notes on burst mitigation for
              transport protocols", March 2005.

   [Bis08]    Biswas, I. and G. Fairhurst, "A Practical Evaluation of
              Congestion Window Validation Behaviour, 9th Annual
              Postgraduate Symposium in the Convergence of
              Telecommunications, Networking and Broadcasting (PGNet),
              Liverpool, UK", June 2008.

   [Bis10]    Biswas, I., Sathiaseelan, A., Secchi, R., and G.
              Fairhurst, "Analysing TCP for Bursty Traffic, Int'l J. of
              Communications, Network and System Sciences, 7(3)", June

   [Bis11]    Biswas, I., "PhD Thesis, Internet congestion control for
              variable rate TCP traffic, School of Engineering,
              University of Aberdeen", June 2011.

   [Fai12]    Sathiaseelan, A., Secchi, R., Fairhurst, G., and I.
              Biswas, "Enhancing TCP Performance to support Variable-
              Rate Traffic, 2nd Capacity Sharing Workshop, ACM CoNEXT,
              Nice, France, 10th December 2012.", June 2008.

   [Hug01]    Hughes, A., Touch, J., and J. Heidemann, "Issues in TCP
              Slow-Start Restart After Idle (Work-in-Progress)",
              December 2001.

   [Liu07]    Liu, D., Allman, M., Jiny, S., and L. Wang, "Congestion
              Control without a Startup Phase, 5th International
              Workshop on Protocols for Fast Long-Distance Networks
              (PFLDnet), Los Angeles, California, USA", February 2007.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298, June

   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
              "Increasing TCP's Initial Window", RFC 6928, April 2013.

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE


   Arjuna Sathiaseelan
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE


   Raffaello Secchi
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE

