TSVWG Working Group                                         G. Fairhurst
Internet-Draft                                    University of Aberdeen
Updates: 3819 (if approved)                                   B. Briscoe
Intended status: Best Current Practice                                BT
Expires: September 12, 2013                               March 11, 2013

                      Advice on network buffering

Abstract


   This document proposes an update to the advice given in RFC 3819.
   Subsequent research has altered understanding of buffer sizing and
   queue management.  Therefore this document significantly revises the
   previous recommendations on buffering.  The advice applies to all
   packet buffers, whether in network equipment, end hosts or
   middleboxes such as firewalls or NATs.  The advice also applies to
   packet buffers at any layer: whether subnet, IP, transport or
   application.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 12, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect

Fairhurst & Briscoe    Expires September 12, 2013               [Page 1]

Internet-Draft        Advice on network buffering             March 2013

   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Updated Recommendations on Buffering  . . . . . . . . . . . .   4
     3.1.  Recommendations Applicable to Any Buffer  . . . . . . . .   4
     3.2.  Buffering recommendations for end hosts . . . . . . . . .   5
     3.3.  Buffering recommendations for edge routers and switches .   5
     3.4.  Buffering recommendations for core routers and switches .   6
     3.5.  Recommendations on Flow Isolation . . . . . . . . . . . .   6
   4.  Buffer Management Methods . . . . . . . . . . . . . . . . . .   6
     4.1.  Examples of subnetwork buffering  . . . . . . . . . . . .   6
     4.2.  Examples of methods for active buffer management  . . . .   7
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   7.  Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .   7
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   8
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .   8
     8.2.  Informative References  . . . . . . . . . . . . . . . . .   8
   Appendix A.  Previous IETF guidance for configuring network buffers  9
   Appendix B.  Revision notes . . . . . . . . . . . . . . . . . . .  10
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  10

1.  Introduction

   [RFC3819] provides guidance on the design of subnetworks and
   networking equipment.  This document updates this guidance for the
   topic of Internet buffer configuration and control.  The guidance is
   aimed at both equipment designers and network operators.

   All networking devices use buffers to temporarily store packets that
   are waiting for transmission on an out-going link during traffic
   bursts or at times when the capacity of the ingress/egress changes.

   The congestion control algorithms in TCP (and derivatives of TCP) are
   designed to try to fully utilise the link that has the least
   available capacity on the path across the network.  This is called
   the bottleneck link.  Network link capacities are typically arranged
   so that it will be rare for a bottleneck to arise in the network
   core.  However, depending on prevailing patterns of traffic, any link
   might become the bottleneck (within the host, at an edge router, at a
   core router, at a switch in the subnet between routers or at some
   middlebox such as a firewall or a network address translator).
   Modern TCP stacks are capable of filling a link of any capacity.

   A buffer that simply discards incoming packets when it is full is
   called a tail-drop buffer.  A long-running TCP flow will fill a tail-
   drop buffer and keep it full, so that there is no longer any space to
   absorb bursts.  This is called a standing queue.  Packets arriving at
   the tail of a standing queue still work their way through the buffer
   until they emerge onto the link, but this introduces unnecessary
   delay to every packet, including those from other sessions sharing
   the link.  This can intermittently add intolerable delay to a real-
   time interactive media session (e.g., voice or video).  Also, most
   Web pages involve dozens of short back-and-forth exchanges, so adding
   even a small amount of queuing delay to each round can accumulate
   considerable delay in the completion of the whole task.
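
   As a rough illustration of the delay that a standing queue adds, the
   sketch below computes how long a packet arriving at the tail of a
   full buffer waits before it reaches the link.  The function name,
   the buffer size, the link rate and the number of exchanges are all
   hypothetical values chosen for the example; none of them comes from
   a measurement.

```python
def queuing_delay_seconds(queue_bytes: int, link_rate_bps: int) -> float:
    """Time for the bytes already queued to drain onto the link."""
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return queue_bytes * 8 / link_rate_bps


# Assumed example: a 256 KiB tail-drop buffer kept full by one long-running
# flow on a 2 Mb/s link.
standing_delay = queuing_delay_seconds(256 * 1024, 2_000_000)

# Assumed example: a Web page needing 20 sequential exchanges, each of which
# crosses the standing queue once.
exchanges = 20
page_penalty = exchanges * standing_delay

print(f"delay added to every packet: {standing_delay:.2f} s")
print(f"delay added to the whole page: {page_penalty:.1f} s")
```

   Under these assumptions every packet sharing the link waits about
   one second, and the penalty grows in proportion to the number of
   sequential exchanges.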

   The recommended way to avoid these problems is to use an active queue
   management (AQM) algorithm in every potential bottleneck buffer
   (subnet, router, middlebox or host), and to enable explicit
   congestion notification (ECN).  However, if AQM has not been
   implemented in existing equipment, the next best option is to at
   least size the buffer so that it is no larger than needed to absorb
   bursts.

   This document gives advice on using and configuring AQM algorithms
   and ECN, and advice on buffer sizing in the absence of such
   mechanisms.

   The correct buffer size depends on the link rate.  A common problem
   arises where equipment auto-adjusts its rate, often over a wide
   range, leaving the buffer size badly wrong for the current rate.
   Advice is also given on how
   to relate buffer auto-sizing algorithms to rate-adjusting algorithms,
   and the best static buffer size to configure if auto-sizing has not
   been implemented.

   It is difficult to test whether a network might exhibit these
   problems.  They only appear intermittently, because they depend on
   four pathologies coinciding: i) a particular buffer has become the
   bottleneck for a long-running TCP flow, which depends on relative
   traffic levels in other links, ii) the TCP flow has run for long
   enough to fill this buffer, iii) the buffer lacks AQM or the AQM is
   badly configured and iv) the buffer has been badly over-sized.  When
   all four conditions coincide, the delays can be bad enough to lead
   to support desk calls.

   This document updates section 13 of RFC 3819, which gave guidance to
   subnet designers on the use and sizing of buffers.  Appendix A
   reviews that guidance, which now requires considerable revision in
   the light of subsequent research.  Also, whereas RFC 3819 addressed
   subnet designers, the advice in this document is relevant to a wider
   audience, because it concerns buffers wherever they are, including in
   end-systems and middleboxes, not just in subnet technology.

2.  Terminology

   The document assumes familiarity with the terminology of RFC 3819.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The term active queue management (AQM) has been applied to
   technologies that work only at the packet level as well as
   technologies that identify and police flows with above average rates
   or that enforce flow-level or user-level policies such as fair
   queuing.  For this document, we will use the term 'AQM' for
   technologies or parts of technologies that treat packets
   indiscriminately, and the term 'policing' for the additional
   technologies that attempt to enforce some level of behaviour or
   isolation at the flow or user level of granularity.

3.  Updated Recommendations on Buffering

   This section updates the rules for network buffers in section 13 of
   RFC 3819.

3.1.  Recommendations Applicable to Any Buffer

   XX Work in Progress, to be included in next revision XX

   AQM is strongly recommended for any buffer.  Auto-tuned
   configuration is recommended.

   Explicit Congestion Notification (ECN) [RFC3168] is also strongly
   recommended for any buffer, because it avoids the delays due to
   timeouts after loss.  It is safe to enable ECN on routers and
   servers.  If concerns arise over the use of ECN, they can be fully
   addressed by turning off ECN support at the endpoint.  Conversely,
   if routers and servers do not enable ECN where it is deemed safe, it
   will not be possible for endpoints to turn it on.
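
   The difference between signalling congestion by marking and by loss
   can be sketched as follows.  The codepoint values are those of the
   ECN field defined in [RFC3168]; the function name and the returned
   action strings are hypothetical.  The sketch assumes that an AQM
   algorithm has already decided that this packet should carry a
   congestion signal, and it ignores the case of severe congestion, in
   which a buffer may need to drop ECN-capable packets as well.

```python
from typing import Tuple

# ECN field codepoints from RFC 3168 (the two low-order bits of the IPv4
# TOS octet or the IPv6 Traffic Class octet).
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, ECT(1)
ECT_0 = 0b10    # ECN-capable transport, ECT(0)
CE = 0b11       # congestion experienced

ECN_MASK = 0b11


def signal_congestion(traffic_class: int) -> Tuple[str, int]:
    """Apply a congestion signal to one packet.

    Returns the action ("forward" or "drop") and the possibly updated octet.
    """
    if traffic_class & ECN_MASK == NOT_ECT:
        # No ECN support was indicated, so loss is the only available signal.
        return "drop", traffic_class
    # ECT(0), ECT(1) or already CE: set CE and keep the packet.
    return "forward", traffic_class | CE
```

   For an ECN-capable packet the signal is delivered without the loss
   of any data, so the sender does not have to wait for a
   retransmission or a timeout.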

   Buffer size: if AQM is implemented, there is no harm in having a
   large buffer to absorb bursts.  However, if there is no AQM, it is
   important to keep the buffer small.

   o  Too little buffering can result in poor utilisation of the egress
      link, since many traffic flows are not smooth-paced and bursts of
      traffic may fail to be buffered.

   o  Large buffers can help ensure full utilisation of the egress link,
      but excessive buffering results in slow response to congestion and
      in unnecessary delay experienced by any flow that shares the
      egress link.  Such events are not uncommon, since a single long-
      lived connection using a modern TCP stack can fill any size of
      network buffer.

   Auto-sizing is recommended if the line rate is adjustable or auto-
   adjusts (e.g., by setting the buffer size as a time, not as a number
   of bytes).  If auto-sizing has not been implemented, a large buffer
   is not best, but too small a buffer reduces link utilisation.  If it
   is necessary to find a compromise size for adjustable line rates,
   consideration should be given to sacrificing some utilisation at
   lower rates to keep the buffer delay low.
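
   One way to read the advice to set a buffer time rather than a byte
   size is to recompute the byte limit from a target queuing delay
   whenever the line rate changes.  The sketch below assumes a
   hypothetical target of 20 ms and a hypothetical floor of two
   1500-byte packets; neither value is a recommendation of this
   document, and the function name is illustrative only.

```python
def buffer_limit_bytes(line_rate_bps: int, target_delay_ms: int,
                       floor_bytes: int = 2 * 1500) -> int:
    """Byte limit that drains in about target_delay_ms at the given rate."""
    if line_rate_bps <= 0 or target_delay_ms <= 0:
        raise ValueError("rate and target delay must be positive")
    # bits/s * ms / 1000 gives bits; dividing by 8 gives bytes.
    limit = line_rate_bps * target_delay_ms // 8000
    # Assumed floor so that very low rates can still hold a couple of packets.
    return max(limit, floor_bytes)


# Re-run whenever the link renegotiates its rate (the rates are examples).
for rate_bps in (10_000_000, 100_000_000, 1_000_000_000):
    print(rate_bps, buffer_limit_bytes(rate_bps, target_delay_ms=20))
```

   With a fixed target time the byte limit scales with the line rate,
   so the worst-case queuing delay stays roughly constant as the rate
   changes.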

3.2.  Buffering recommendations for end hosts

   XX Work in Progress, to be included in next revision XX

   Large buffers are not best.  AQM and auto-tuning/auto-sizing are as
   applicable in end hosts as in network equipment.

   ECN may even be appropriate (e.g., on a subsystem such as a NIC), but
   within a host it should be possible to use back-pressure messages.

   Buffer sizing recommendations specific to end-systems.

3.3.  Buffering recommendations for edge routers and switches

   XX Work in Progress, to be included in next revision XX

   Large buffers are not best.

   AQM and ECN are strongly recommended.

   Buffer sizing recommendations specific to edge routers & switches.

3.4.  Buffering recommendations for core routers and switches

   XX Work in Progress, to be included in next revision XX

   Large buffers are not best.

   Buffer sizing recommendations specific to core routers & switches.

3.5.  Recommendations on Flow Isolation

   XX Work in Progress, to be included in next revision XX

   This is still a subject of debate and research.  It may be possible
   to recommend something here, but more likely this section will
   comment on the debate.

4.  Buffer Management Methods

   This section provides informative documentation of current practice.

4.1.  Examples of subnetwork buffering

   This section provides informative examples of buffer configuration
   and their impact on network traffic {TBA: to consider whether to
   bless, deprecate or merely state each of these practices}.

   o  An Ethernet subnetwork may operate over a range of speeds from a
      shared 10 Mbps of capacity to over 40 Gbps.  The buffering
      required depends on the link speed, yet many device drivers and
      operating systems do not adjust their buffering to the available
      capacity.  The first hop link from a host often has a higher
      speed than the subsequent links along a network path.

   o  Subnetwork flow-control can be triggered when a subnetwork link
      suffers congestion.  An example is the use of Ethernet Pause
      frames (e.g., by consumer Ethernet switches) to slow a sender
      emitting traffic towards a congested switch port.  These methods
      can increase the buffering experienced by the end-to-end flow.

   o  DOCSIS 3.1 supports transmission up to 300 Mbps.  A current modem
      can be plugged into a current network.  If a customer's service
      then only supports 10 Mbps, the network equipment may be 30 times
      over-buffered (assuming buffers are dimensioned based on the
      maximum bit rate).  The buffer control amendment may be
      implemented in the modem, and in its provisioning system, to
      address this type of issue.  Similar issues apply for other link
      technologies, where the offered service is often less than the
      maximum supported rate.

   o  On wireless, bandwidth (and hence network capacity) is often
      highly variable, unless the link is a fixed point-to-point link.
      Even fixed links may use adaptive methods, and propagation
      conditions can cause the capacity to vary.
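
   The arithmetic behind the "30 times over-buffered" figure in the
   DOCSIS example above can be made explicit.  The sketch assumes, as
   that example does, that the buffer was dimensioned for the maximum
   bit rate; the 50 ms design delay and the function name are
   hypothetical.

```python
def over_buffering_factor(max_rate_bps: int, service_rate_bps: int) -> float:
    """How many times too large a buffer sized for max_rate_bps is when the
    link actually runs at service_rate_bps."""
    if max_rate_bps <= 0 or service_rate_bps <= 0:
        raise ValueError("rates must be positive")
    return max_rate_bps / service_rate_bps


factor = over_buffering_factor(300_000_000, 10_000_000)

# Assumed design point: the buffer drains in 50 ms at the maximum rate.
design_delay_ms = 50
delay_at_service_rate_ms = design_delay_ms * factor

print(f"over-buffered by a factor of {factor:.0f}")
print(f"a full buffer now takes {delay_at_service_rate_ms:.0f} ms to drain")
```

   The drain time of a full buffer grows by the same factor as the
   ratio between the maximum rate and the rate actually offered.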

4.2.  Examples of methods for active buffer management

   This section provides informative examples of active buffer
   management.

   While large buffers can lead to an increase in experienced network
   delay, they do not necessarily impact the flow delay.  The issue is
   not how much buffering is provided, but how the provided buffers
   are used to manage the flow of traffic.

   Several active buffer/queue management methods have been proposed
   that can significantly improve performance of flows using a
   (potentially) congested bottleneck.

   o  RED

   o  CoDel

   o  Pi

   o  etc
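
   To indicate the general shape of such a method, the following is a
   deliberately simplified sketch in the style of RED.  It keeps an
   exponentially weighted moving average of the queue length and
   signals congestion (by drop or ECN mark) with a probability that
   rises linearly between two thresholds.  It omits parts of the
   published algorithm, for example spacing signals by counting the
   packets since the last one and adjusting the average after idle
   periods.  The class name, parameter names and default values are
   assumptions made for illustration, not recommended settings.

```python
import random
from typing import Callable


class SimplifiedRed:
    """Simplified RED-style congestion signalling (illustrative only)."""

    def __init__(self, min_th: float, max_th: float, max_p: float = 0.1,
                 weight: float = 0.002,
                 rng: Callable[[], float] = random.random) -> None:
        if not 0 <= min_th < max_th:
            raise ValueError("need 0 <= min_th < max_th")
        if not 0 < max_p <= 1 or not 0 < weight <= 1:
            raise ValueError("max_p and weight must be in (0, 1]")
        self.min_th = min_th    # below this average, never signal
        self.max_th = max_th    # at or above this average, always signal
        self.max_p = max_p      # probability just below max_th
        self.weight = weight    # smoothing weight for the moving average
        self.rng = rng          # source of uniform numbers in [0, 1)
        self.avg = 0.0          # smoothed queue length, in packets

    def signal_probability(self) -> float:
        """Probability of signalling, given the current average."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span

    def on_arrival(self, queue_len: int) -> bool:
        """Update the average with the instantaneous queue length and
        return True if this packet should be dropped or marked."""
        self.avg += self.weight * (queue_len - self.avg)
        return self.rng() < self.signal_probability()
```

   Because the decision is driven by the smoothed queue length, short
   bursts can still be absorbed while a persistent standing queue
   produces congestion signals before the buffer is full.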

5.  Security Considerations

   Decisions on queue management and buffer sizing are neutral to
   security considerations if they act indiscriminately over all
   packets.  Recommendations on treatment or lack of treatment at the
   flow or user-level can have security considerations, which are TBA.

   The question of whether end-systems respond to congestion signals is
   a valid security concern, but outside the scope of this document.

6.  IANA Considerations

   This document does not require any IANA actions.

   [RFC-ED]: Please remove this section prior to publication.

7.  Acknowledgments

   This work was part-funded by the European Community under its Seventh
   Framework Programme through the Reducing Internet Transport Latency
   (RITE) project (ICT-317700).  The views expressed are solely those of
   the authors.

   The authors acknowledge contributions from: Jim Gettys.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP", RFC
              3168, September 2001.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

8.2.  Informative References

   [Appenzeller]
              Appenzeller, G., Keslassy, I., and N. McKeown, "Sizing
              Router Buffers", ACM SIGCOMM '04, pages 281-292,
              New York, NY, USA, 2004.

   [Ganjali]  Ganjali, Y. and N. McKeown, "Update on Buffer Sizing in
              Internet Routers", ACM SIGCOMM Computer Communication
              Review 36, October 2006.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [Villamizar]
              Villamizar, C. and C. Song, "High Performance TCP in
              ANSNET", ACM Computer Communications Review,
              24(5):45-60, 1994.

   [Wischik]  , "TCP Buffer Sizing Advice", .

Appendix A.  Previous IETF guidance for configuring network buffers

   This section reviews previous guidance for configuring network
   buffers and motivates the need to update these recommendations.

   Guidance for the use of buffers was provided in section 13 of
   RFC 3819:

   "each node should have enough buffering to hold one
   link_bandwidth*link_delay product's worth of data for each TCP
   connection sharing the link."

   However, in today's Internet, a deployment following this
   recommendation would over-allocate buffering for a network link
   that supports multiple flows.  This is discussed in the observations
   below:

   o  This buffering recommendation is appropriate for a device that
      supports a single or small number of bulk TCP flows [Villamizar].

   o  The buffering is unduly large when there are more than a small
      number of flows (e.g., >10).  The goal of sharing between TCP
      flows requires only that the buffering is sufficient to hold one
      link_bandwidth*path_delay product's worth of data for the longest
      path flow.  The more flows share a link, the less buffering is
      needed [Appenzeller], unless the egress link becomes congested
      with so many flows that there are only a few packets per flow.

   o  Many egress links have a higher level of multiplexing (e.g., >100
      uncorrelated flows).  This is often found beyond the edge of a
      network.  In this case, the buffer size may be inversely
      proportional to the square root of the number of flows (for
      medium numbers of flows).  For still higher levels of
      multiplexing, this may be of the order of the logarithm of the
      number of flows.

   o  Note that while optimal buffering may be a function of the number
      of concurrent flows, it is not recommended to tune buffering by
      dynamically estimating the number of flows sharing a network
      device or path, or by attempting to classify flows as "long",
      "short", etc.  Such estimates are difficult, due to the wide
      variety of flow behaviours and the use of aggregation methods
      (such as tunnels) that hide the traffic of individual flows.

   o  In deployed scenarios (apart from restricted deployments in
      operator-controlled subnetworks), it is usually impossible for a
      router or other network middlebox to know the path delay
      experienced by a flow.  In the Internet service model this
      information is only available to end points (e.g., using feedback
      provided by TCP [RFC0793] or RTCP [RFC3550]).  It is therefore
      not usually possible for operators to use the end-to-end path
      delay calculation to determine the size of buffering when
      configuring network devices.
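
   The scale of the difference between the sizing rules discussed in
   the observations above can be seen from a worked example.  The
   sketch below is only an illustration: the function names and the
   link rate, delay and flow count are hypothetical, and the same delay
   value is used for every rule to keep the comparison simple.  In line
   with the observations above, it is not a suggestion to tune buffers
   from a dynamic estimate of the number of flows.

```python
import math


def bdp_bytes(rate_bps: int, delay_ms: int) -> float:
    """Bandwidth-delay product in bytes."""
    return rate_bps * delay_ms / 8000


def per_connection_rule(rate_bps: int, delay_ms: int, flows: int) -> float:
    """One bandwidth-delay product per TCP connection (the quoted rule)."""
    return flows * bdp_bytes(rate_bps, delay_ms)


def single_bdp_rule(rate_bps: int, delay_ms: int) -> float:
    """One bandwidth-delay product for the longest-path flow."""
    return bdp_bytes(rate_bps, delay_ms)


def sqrt_rule(rate_bps: int, delay_ms: int, flows: int) -> float:
    """Bandwidth-delay product divided by the square root of the flow count."""
    if flows < 1:
        raise ValueError("need at least one flow")
    return bdp_bytes(rate_bps, delay_ms) / math.sqrt(flows)


# Assumed example: a 1 Gb/s link, 100 ms of delay, 10,000 flows.
rate_bps, delay_ms, flows = 1_000_000_000, 100, 10_000

results = (
    ("per connection", per_connection_rule(rate_bps, delay_ms, flows)),
    ("single BDP", single_bdp_rule(rate_bps, delay_ms)),
    ("BDP / sqrt(n)", sqrt_rule(rate_bps, delay_ms, flows)),
)
for name, size in results:
    print(f"{name:<16}{size / 1e6:>14.3f} MB")
```

   For these assumed values the three rules differ by several orders
   of magnitude, which is why the choice of rule matters far more than
   fine adjustment within any one of them.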

   The discussion in section 13 of RFC 3819 summarises:

   "In general, it is wise to err in favor of too much buffering rather
   than too little."

   While this advice may have been appropriate when routers and
   subnetworks carried small numbers of flows and had little buffer
   memory [Villamizar], it is no longer appropriate for many modern
   networks.

   Section 13 of RFC 3819 also motivates using methods such as Active
   Queue Management (AQM) and ECN [RFC3168].  However, at the time it
   was written there was little deployment experience, and little
   understanding of how to configure these methods.  We now argue that
   these methods should be considered for deployment in operational
   networks.

Appendix B.  Revision notes

   RFC-Editor: Please remove this section prior to publication

   Draft 00

   o  This contains the first draft for comment.

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE

   Email: gorry@erg.abdn.ac.uk
   URI:   http://www.erg.abdn.ac.uk/~gorry

   Bob Briscoe
   B54/77, Adastral Park
   Martlesham Heath, Ipswich  IP5 3RE

   Phone: +44 1473 645196
   Email: bob.briscoe@bt.com
   URI:   http://bobbriscoe.net/
