Network Working Group                                     L. Andrew, Ed.
Internet-Draft                             CAIA, Swinburne University of
Intended status: BCP                                          Technology
Expires: July 10, 2009                                     S. Floyd, Ed.
                                       ICSI Center for Internet Research
                                                            G. Wang, Ed.
                                                              NEC, China
                                                         January 6, 2009


                      Common TCP Evaluation Suite
                      draft-irtf-tmrg-tests-01.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 10, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   This document presents an evaluation test suite for the initial
   evaluation of proposed TCP modifications.  The goal of the test suite
   is to allow researchers to quickly and easily evaluate their proposed
   TCP extensions in simulators and testbeds using a common set of well-
   defined, standard test cases, in order to compare and contrast
   proposals against standard TCP as well as other proposed
   modifications.  This test suite is not intended to result in an
   exhaustive evaluation of a proposed TCP modification or new
   congestion control mechanism.  Instead, the focus is on quickly and
   easily generating an initial evaluation report that allows the
   networking community to understand and discuss the behavioral aspects
   of a new proposal, in order to guide further experimentation that
   will be needed to fully investigate the specific aspects of a new
   proposal.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Traffic generation . . . . . . . . . . . . . . . . . . . . . .  4
     2.1.  Loads  . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.2.  Equilibrium  . . . . . . . . . . . . . . . . . . . . . . .  6
     2.3.  Packet size distribution . . . . . . . . . . . . . . . . .  7
     2.4.  Round Trip Times . . . . . . . . . . . . . . . . . . . . .  7
   3.  Scenarios  . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     3.1.  Basic scenarios  . . . . . . . . . . . . . . . . . . . . .  9
       3.1.1.  Topology and background traffic  . . . . . . . . . . .  9
       3.1.2.  Flows under test . . . . . . . . . . . . . . . . . . . 11
       3.1.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 12
     3.2.  Delay/throughput tradeoff as function of queue size  . . . 12
       3.2.1.  Topology and background traffic  . . . . . . . . . . . 12
       3.2.2.  Flows under test . . . . . . . . . . . . . . . . . . . 13
       3.2.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 13
     3.3.  Ramp up time: completion time of one flow  . . . . . . . . 13
       3.3.1.  Topology and background traffic  . . . . . . . . . . . 13
       3.3.2.  Flows under test . . . . . . . . . . . . . . . . . . . 14
       3.3.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 14
     3.4.  Transients: release of bandwidth, arrival of many flows  . 14
       3.4.1.  Topology and background traffic  . . . . . . . . . . . 14
       3.4.2.  Flows under test . . . . . . . . . . . . . . . . . . . 15
       3.4.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 15
     3.5.  Impact on standard TCP traffic . . . . . . . . . . . . . . 15
       3.5.1.  Topology and background traffic  . . . . . . . . . . . 15
       3.5.2.  Flows under test . . . . . . . . . . . . . . . . . . . 16
       3.5.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 16
       3.5.4.  Suggestions  . . . . . . . . . . . . . . . . . . . . . 16
     3.6.  Intra-protocol and inter-RTT fairness  . . . . . . . . . . 17
       3.6.1.  Topology and background traffic  . . . . . . . . . . . 17
       3.6.2.  Flows under test . . . . . . . . . . . . . . . . . . . 17
       3.6.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 17
     3.7.  Multiple bottlenecks . . . . . . . . . . . . . . . . . . . 18
       3.7.1.  Topology and background traffic  . . . . . . . . . . . 18
       3.7.2.  Flows under test . . . . . . . . . . . . . . . . . . . 19
       3.7.3.  Outputs  . . . . . . . . . . . . . . . . . . . . . . . 20
     3.8.  Implementations  . . . . . . . . . . . . . . . . . . . . . 20
     3.9.  Conclusions  . . . . . . . . . . . . . . . . . . . . . . . 20
     3.10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . 20
   4.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 21
   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 21
   6.  Informative References . . . . . . . . . . . . . . . . . . . . 21
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23

1.  Introduction

   This document describes a common test suite for the initial
   evaluation of new TCP extensions.  It defines a small number of
   evaluation scenarios, including traffic and delay distributions,
   network topologies, and evaluation parameters and metrics.  The
   motivation for such an evaluation suite is to help researchers in
   evaluating their proposed modifications to TCP.  The evaluation suite
   will also enable independent duplication and verification of reported
   results by others, which is an important aspect of the scientific
   method that is not often put to use by the networking community.  A
   specific target is that the evaluations should be able to be
   completed in three days of simulations, or in a reasonable amount of
   effort in a testbed.

   This document is an outcome of a ``round-table'' meeting on TCP
   evaluation, held at Caltech on November 8-9, 2007.  It is the first
   step in constructing the evaluation suite; the goal is for the suite
   to be adapted in response to feedback from the networking community.


2.  Traffic generation

   Congestion control concerns the response of flows to bandwidth
   limitations or to the presence of other flows.  For a realistic
   testing of a congestion control protocol, we design scenarios to use
   reasonably-typical traffic; most scenarios use traffic generated from
   a traffic generator, with a range of start times for user sessions,
   connection sizes, and the like, mimicking the traffic patterns
   commonly observed in the Internet.  Cross-traffic and reverse-path
   traffic have the desirable effect of reducing the occurrence of
   pathological conditions such as global synchronization among
   competing flows that might otherwise be mis-interpreted as normal
   average behaviours of those protocols [FK03], [MV06].  This traffic
   must be reasonably realistic for the tests to predict the behaviour
   of congestion control protocols in real networks, and also well-
   defined so that statistical noise does not mask important effects.

   It is important that the same ``amount'' of congestion or cross-
   traffic be used for the testing scenarios of different congestion
   control algorithms.  This is complicated by the fact that packet
   arrivals and even flow arrivals are influenced by the behavior of the
   algorithms.  For this reason, a pure packet-level generation of
   traffic where generated traffic does not respond to the behaviour of
   other present flows is not suitable.  Instead, emulating application
   or user behaviours at the end points using reactive protocols such as
   TCP in a closed-loop fashion results in a closer approximation of
   cross-traffic, where user behaviours are modeled by well-defined
   parameters for source inputs (e.g., request sizes for HTTP),
   destination inputs (e.g., response size), and think times between
   pairs of source and destination inputs.  By setting appropriate
   parameters for the traffic generator, we can emulate non-greedy user-
   interactive traffic (e.g., HTTP 1.1, SMTP and Telnet) as well as
   greedy traffic (e.g., P2P and long file downloads).  This approach
   models protocol reactions to the congestion caused by other flows in
   the common paths, although it fails to model the reactions of users
   themselves to the presence of the congestion.

   While the protocols being tested may differ, it is important that we
   maintain the same ``load'' or level of congestion for the
   experimental scenarios.  To enable this, we use a hybrid of open-loop
   and closed-loop approaches.  For this test suite, network traffic
   consists of sessions corresponding to individual users.  Because
   users are independent, these session arrivals are well modeled by an
   open-loop Poisson process.  A session may consist of a single greedy
   TCP flow, multiple greedy flows separated by user ``think'' times, or
   a single non-greedy flow with embedded think times.  The session
   arrival process forms a Poisson process [HVA03].  Both the think
   times and burst sizes have heavy-tailed distributions, with the exact
   distribution based on empirical studies.  The think times and burst
   sizes will be chosen independently.  This is unlikely to be the case
   in practice, but we have not been able to find any measurements of
   the joint distribution.  We invite researchers to study this joint
   distribution, and future revisions of this test suite will use such
   statistics when they are available.
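
   As a non-normative illustration, the following Python sketch shows
   one way to generate the session structure described above: open-loop
   Poisson session arrivals, with heavy-tailed (here Pareto) burst
   sizes and think times drawn independently.  The rate and shape
   parameters are placeholders chosen for illustration, not values
   prescribed by this suite.

      import random

      def generate_sessions(duration_s, session_rate, seed=0):
          # Open-loop Poisson arrivals: exponential inter-arrival
          # times, independent of how flows later behave.
          rng = random.Random(seed)
          t, sessions = 0.0, []
          while True:
              t += rng.expovariate(session_rate)
              if t > duration_s:
                  return sessions
              n_bursts = 1 + rng.randint(0, 4)    # flows per session
              bursts = [rng.paretovariate(1.2) * 1e4   # bytes
                        for _ in range(n_bursts)]
              thinks = [rng.paretovariate(1.5)         # seconds
                        for _ in range(n_bursts - 1)]
              sessions.append({"start": t,
                               "bursts": bursts,
                               "thinks": thinks})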

   There are several traffic generators available that implement a
   similar approach to that discussed above.  For now, we are planning
   to use the Tmix [Tmix] traffic generator.  Tmix represents each TCP
   connection by a connection vector consisting of a sequence of
   (request-size, response-size, think-time) triples, thus representing
   bi-directional traffic.  Connection vectors used for traffic
   generation can be obtained from Internet traffic traces.  By taking
   measurement from various points of the Internet such as campus
   networks, DSL access links, and ISP core backbones, we can obtain
   sets of connection vectors for different levels of congested links.
   We plan to publish these connection vectors as part of this test
   suite.  A draft set of connection vectors is available at
   <http://wil.cs.caltech.edu/suite/TrafficTraces.php>.
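
   For concreteness, a Tmix-style connection vector can be represented
   as a start time plus a sequence of (request-size, response-size,
   think-time) triples.  The Python sketch below only illustrates this
   structure; it does not reproduce Tmix's actual trace-file format.

      from dataclasses import dataclass
      from typing import List, Tuple

      @dataclass
      class ConnectionVector:
          # One TCP connection: a start time and a list of
          # (request_bytes, response_bytes, think_seconds) triples,
          # modelling bi-directional request/response traffic.
          start_time: float
          epochs: List[Tuple[int, int, float]]

      # Illustrative example: two request/response exchanges
      # separated by a 2.5 s user think time.
      cv = ConnectionVector(start_time=10.0,
                            epochs=[(300, 15000, 2.5),
                                    (450, 80000, 0.0)])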

2.1.  Loads

   For most current traffic generators, the traffic is specified by an
   arrival rate for independent user sessions, along with specifications
   of connection sizes, number of connections per session, user wait
   times within sessions, and the like.  For many of the scenarios, such
   as the basic scenarios in Section 3.1, each scenario is run for a
   range of loads, where the load is varied by varying the rate of
   session arrivals.  For a given congestion control mechanism,
   experiments run with different loads are likely to have different
   packet drop rates, and different levels of statistical multiplexing.

   Because the session arrival times are specified independently of the
   transfer times, one way to specify the load would be as A =
   E[f]/E[t], where  E[f]  is the mean session size (in bits
   transferred),  E[t]  is the mean session inter-arrival time in
   seconds, and  A  is the load in bps.

   It is important to test congestion control in ``overloaded''
   conditions.  However, if  A>c , where  c  is the capacity of the
   bottleneck link, then the system has no equilibrium.  Such cases are
   studied in Section 3.4.  In long-running experiments with  A>c , the
   expected number of flows would increase without bound.  This means
   that the measured results would be very sensitive to the duration of
   the simulation.

   Instead, for equilibrium experiments, we measure the load as the
   ``mean number of jobs in an M/G/1 queue using processor sharing,''
   where a job is a user session.  This reflects the fact that TCP aims
   at processor sharing of variable sized files.  Because processor
   sharing is a symmetric discipline [Kelly79], the mean number of flows
   is equal to that of an M/M/1 queue, namely  rho/(1-rho) , where
    rho=lambda S/C , and  lambda  [flows per second] is the arrival rate
   of jobs/flows,  S  [bits] is the mean job size and  C  [bits per
   second] is the bottleneck capacity.  For small loads, say 10%, this
   is essentially equal to the fraction of the capacity.  However, for
   overloaded systems, the fraction of the bandwidth used will be much
   less than this measure of load.
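
   The two load measures above follow directly from the traffic
   parameters.  The sketch below computes both the offered load
   A = E[f]/E[t] in bps and the M/M/1-equivalent mean number of active
   sessions rho/(1-rho); the numeric values in the example are purely
   illustrative.

      def offered_load_bps(mean_session_bits, mean_interarrival_s):
          # A = E[f]/E[t]: offered load in bits per second.
          return mean_session_bits / mean_interarrival_s

      def mean_active_sessions(arrival_rate, mean_job_bits,
                               capacity_bps):
          # Mean number of jobs in an M/G/1 processor-sharing queue,
          # equal to the M/M/1 value rho/(1-rho) for rho < 1.
          rho = arrival_rate * mean_job_bits / capacity_bps
          if rho >= 1.0:
              raise ValueError("no equilibrium for rho >= 1")
          return rho / (1.0 - rho)

      # 1 Mbit mean session, 1 session/s, 10 Mbps bottleneck:
      # rho = 0.1, so about 0.11 sessions in progress on average.
      print(offered_load_bps(1e6, 1.0))            # 1000000.0 bps
      print(mean_active_sessions(1.0, 1e6, 10e6))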

   In order to improve the traffic generators used in these scenarios,
   we invite researchers to explore how the user behavior, as reflected
   in the connection sizes, user wait times, and number of connections
   per session, might be affected by the level of congestion experienced
   within a session [RMC03].

2.2.  Equilibrium

   In order to minimize the dependence of the results on the experiment
   durations, scenarios should be as stationary as possible.  To this
   end, experiments will start with  rho/(1-rho)  active cross-traffic
   flows, with traffic of the specified load.

   * This is insufficient if the traces have very long pauses between
   bursts, because this initial loading has all finished by the time the
   number of actual tmix sessions builds up.  It may be better to start
   with several (many?) pre-existing connection vectors instead of
   greedy sources. *

   *It is still an open issue whether to use tests with  rho>1 .  If
   such tests are used, the initial number of flows will need to be
   defined.*

   Note that the distribution of the durations of the active flows at a
   given time is (often significantly) different from the distribution
   of flow durations, skewed toward long flows.  For simplicity, this
   will be ignored and the initial flow sizes will be drawn from the
   general flow size distribution.

2.3.  Packet size distribution

   For flows generated by the traffic generator, 10% use 536-byte
   packets and 90% use 1500-byte packets.  The packet size of each flow
   will be specified along with the start time and duration, to maximize
   the repeatability.
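
   One simple way to meet the repeatability requirement is to assign
   the packet size deterministically from the flow index rather than
   drawing it at run time.  The sketch below uses a fixed 1-in-10
   pattern; this particular rule is an illustrative assumption, not a
   requirement of the suite.

      def packet_size_for_flow(flow_index):
          # Every tenth flow uses 536-byte packets; the rest use
          # 1500-byte packets, giving the 10%/90% split without any
          # run-time randomness.
          return 536 if flow_index % 10 == 0 else 1500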

2.4.  Round Trip Times

   Most tests use a simple dumbbell topology with a central link that
   connects two routers, as illustrated in Figure 1.  Each router is
   connected to three nodes by edge links.  In order to generate a
   typical range of round trip times, edge links have different delays.
   On one side, the one-way propagation delays are: 0 ms, 12 ms and
   25 ms; on the other: 2 ms, 37 ms, and 75 ms.  Traffic is uniformly
   shared among the nine source/destination pairs, giving a distribution
   of per-flow RTTs in the absence of queueing delay shown in Figure 2.
   These RTTs are computed for a dumbbell topology with a delay of 0 ms
   for the central link.  The delay for the central link is given in the
   specific scenarios in the next section.

   Node 1                                                      Node 4
          \_                                                _/
            \_                                            _/
              \_ __________   Intermediate   __________ _/
                |          |     link       |          |
   Node 2 ------| Router 1 |----------------| Router 2 |------ Node 5
               _|__________|                |__________|_
             _/                                          \_
           _/                                              \_
   Node 3 /                                                  \ Node 6


   A dumbbell topology

                                 Figure 1

   For dummynet experiments, delays can be obtained by specifying the
   delay of each flow.


            ------------------------------------------
            | Path | RTT || Path | RTT || Path | RTT |
            |------+-----++------+-----++------+-----|
            | 1-4  | 4   || 1-5  | 74  || 1-6  | 150 |
            | 2-4  | 28  || 2-5  | 98  || 2-6  | 174 |
            | 3-4  | 54  || 3-5  | 124 || 3-6  | 200 |
            ------------------------------------------


   RTTs of the paths between two nodes, in milliseconds. * These RTTs
   are subject to change, based on comparison between the resulting
   packet-weighted RTT distribution and measurements* * I'd like to
   change the RTT 1-4 to 3ms or 5ms instead of 4ms... -- LA*

                                 Figure 2
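
   The RTTs in Figure 2 follow directly from the edge-link delays of
   this section with a 0 ms central link: each path RTT is twice the
   sum of the two edge one-way delays.  The short sketch below
   recomputes the table as a consistency check; it is not itself part
   of the suite.

      # One-way edge delays in ms; the central link is 0 ms here.
      LEFT = {1: 0, 2: 12, 3: 25}
      RIGHT = {4: 2, 5: 37, 6: 75}

      def path_rtt_ms(src, dst, central_one_way_ms=0):
          # RTT = 2 * (source edge + central + destination edge).
          return 2 * (LEFT[src] + central_one_way_ms + RIGHT[dst])

      for s in LEFT:
          for d in RIGHT:
              print(s, d, path_rtt_ms(s, d))
      # Reproduces Figure 2: 4, 74, 150, 28, 98, 174, 54, 124, 200.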


3.  Scenarios

   It is not possible to provide TCP researchers with a complete set of
   scenarios for an exhaustive evaluation of a new TCP extension,
   especially because the characteristics of a new extension will often
   require experiments with specific scenarios that highlight its
   behavior.  On the other hand, an exhaustive evaluation of a TCP
   extension will need to include several standard scenarios, and it is
   the focus of the test suite described in this section to define this
   initial set of test cases.

3.1.  Basic scenarios

   The purpose of the basic scenarios is to explore the behavior of a
   TCP extension over different link types.  The scenarios use the
   dumbbell topology of Section 2.4, with the link delays modified as
   specified below.

   This basic topology is used to instantiate several basic scenarios,
   by appropriately choosing capacity and delay parameters for the
   individual links.  Depending on the configuration, the bottleneck
   link may be in one of the edge links or the central link.

3.1.1.  Topology and background traffic

   The basic scenarios are for a single topology, with a range of
   capacities and RTTs.  For each scenario, traffic levels of
   uncongested, mild congestion, and moderate congestion are specified;
   these are explained below.

   *Data Center:* The data center scenario models a case where bandwidth
   is plentiful and link delays are generally low.  It uses the same
   configuration for the central link and all of the edge links.  All
   links have a capacity of either 1 Gbps, 2.5 Gbps or 10 Gbps; links
   from nodes 1, 2 and 4 have a one-way propagation delay of 1 ms, while
   those from nodes 3, 5 and 6 have 10 ms [WCL05], and the common link
   has 0 ms delay.

   Uncongested: TBD     Mild congestion: TBD     Moderate congestion:
   TBD

   *Access Link:* The access link scenario models an access link
   connecting an institution (e.g., a university or corporation) to an
   ISP.  The central and edge links are all 100 Mbps.  The one-way
   propagation delay of the central link is 2 ms, while the edge links
   have the delays given in Section 2.4.  Our goal in assigning delays
   to edge links is only to give a realistic distribution of round-trip
   times for traffic on the central link.

   Uncongested: TBD     Mild congestion: TBD     Moderate congestion:
   TBD

   *Trans-Oceanic Link:* The trans-oceanic scenario models a test case
   where mostly lower-delay edge links feed into a high-delay central
   link.  The central link is 1 Gbps, with a one-way propagation delay
   of 65 ms.  The edge links have the same bandwidth as the central
   link, with the one-way delays given in Section 2.4.  An alternative
   would be to use smaller delays for the edge links, with one-way
   delays for each set of three edge links of 5, 10, and 25 ms.

   *Implementations may use a smaller bandwidth for the trans-oceanic
   link, for example to run a simulation in a feasible amount of time.
   In testbeds, one of the metrics should be the number of timeouts in
   servers, due to implementation issues when running at high speed.*

   Uncongested: TBD     Mild congestion: TBD     Moderate congestion:
   TBD

   *Geostationary Satellite:* The geostationary satellite scenario
   models an asymmetric test case with a high-bandwidth downlink and a
   low-bandwidth uplink [HK99], [GF04].  The capacity of the central
   link is 40 Mbps with a one-way propagation delay of 300 ms.  The
   downlink capacity of the edge links is also 40 Mbps, but their uplink
   capacity is only 4 Mbps.  Edge one-way delays are as given in
   Section 2.4.  Note that ``downlink'' is towards the router for edge
   links attached to the first router, and away from the router for edge
   links on the other router.

   Uncongested: TBD     Mild congestion: TBD     Moderate congestion:
   TBD

   *Wireless Access:* The wireless access scenario models wireless
   access to the wired backbone.  The capacity of the central link is
   100 Mbps with 2 ms of one-way delay.  All links to Router 1 are
   wired.  Router 2 has a shared wireless link of nominal bit rate
   11 Mbps (to model IEEE 802.11b links) or 54 Mbps (IEEE 802.11a/g)
   with a one-way delay of 1 us, connected to dummy nodes 4', 5' and 6',
   which are then connected to nodes 4, 5 and 6 by wired links of delays
   2, 37 and 75 ms.  This is to achieve the same RTT distribution as the
   other scenarios, while allowing a CSMA model to have realistic delay
   for a WLAN.

   Note that wireless links have many other unique properties not
   captured by delay and bitrate.  In particular, the physical layer
   might suffer from propagation effects that result in packet losses,
   and the MAC layer might add high jitter under contention or large
   steps in bandwidth due to adaptive modulation and coding.  Specifying
   these properties is beyond the scope of the current first version of
   this test suite.

   Uncongested: TBD     Mild congestion: TBD     Moderate congestion:
   TBD

   *Dial-up Link:* The dial-up link scenario models a network with a
   dial-up link of 64 kbps and a one-way delay of 5 ms for the central
   link. *modems are asymmetric, 56k downlink and 33.6k or 48k uplink.
   Should we change this?* This could be thought of as modeling a
   scenario reported as typical in Africa, with many users sharing a
   single low-bandwidth dial-up link.

   Uncongested: TBD     Mild congestion: TBD     Moderate congestion:
   TBD

   *Traffic:* For each of the basic scenarios, three cases are tested:
   uncongested, mild congestion, and moderate congestion.  All cases
   will use scaled versions of the traces available at
   <http://wil.cs.caltech.edu/suite>. *The exact traffic loads and run
   times for each scenario still need to be agreed upon.  There is
   ongoing debate about whether  rho>1  is needed to get moderate to
   high congestion.  If  rho>1  is used, note that the results will
   depend heavily on the run time, because congestion will progressively
   build up.  In those cases, metrics which consider this non-
   stationarity may be more useful than average quantities.* In the
   default case, the reverse path has a low level of traffic (10% load).
   The buffer size at the two routers is set to the maximum bandwidth-
   delay-product for a 100 ms flow (i.e., a maximum queueing delay of
   100 ms), with drop-tail queues in units of packets.  Each run will be
   for at least a hundred seconds, and the metrics will not cover the
   initial warm-up times of each run.  (Testbeds might use longer run
   times, as should simulations with smaller bandwidth-delay products.)
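
   The router buffer of one bandwidth-delay product for a 100 ms flow
   can be expressed in packets as below.  The helper assumes a
   1500-byte packet size and rounds down; the document does not mandate
   a particular rounding, so this is only one plausible convention.

      def buffer_packets(capacity_bps, delay_s=0.1, pkt_bytes=1500):
          # Drop-tail buffer sized to the bandwidth-delay product of
          # a 100 ms flow, i.e. a maximum queueing delay of ~100 ms.
          return int(capacity_bps * delay_s / (8 * pkt_bytes))

      print(buffer_packets(100e6))  # access link: 833 packets
      print(buffer_packets(1e9))    # trans-oceanic: 8333 packets
      print(buffer_packets(64e3))   # dial-up: under one packet, so
                                    # a small floor would be needed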

   As with all of the scenarios in this document, the basic scenarios
   could benefit from more measurement studies about characteristics of
   congested links in the current Internet, and about trends that could
   help predict the characteristics of congested links in the future.
   This would include more measurements on typical packet drop rates,
   and on the range of round-trip times for traffic on congested links.

   For the access link scenario, more extensive simulations or
   experiments will be run, with both drop-tail and RED queue
   management, with drop-tail queues in units of both bytes and packets,
   and with RED queue management both in byte mode and in packet mode.
   Specific TCP extensions may require the evaluation of associated AQM
   mechanisms.  For the access link scenario, simulations or experiments
   will also include runs with a reverse-path load equal to the forward-
   path load.  For the access link scenario, additional experiments will
   use a range of buffer sizes, including 20% and 200% of the bandwidth-
   delay product for a 100 ms flow.

3.1.2.  Flows under test

   For this basic scenario, there is no differentiation between ``cross-
   traffic'' and the ``flows under test''.  The aggregate traffic is
   under test, with the metrics exploring both aggregate traffic and
   distributions of flow-specific metrics.

3.1.3.  Outputs

   For each run, the following metrics will be collected, for the
   central link in each direction: the aggregate link utilization, the
   average packet drop rate, and the average queueing delay, all over
   the second half of the run. *This metric could be difficult to gather
   in emulated testbeds, since router statistics of queue utilization
   are not always reliable and depend on time-scale.* Separate
   statistics should be reported for each direction in the satellite and
   wireless access scenarios, since those networks are asymmetric.
   *Should "over the second half of the run" be "starting after 50s"?
   Sally used the second half of the run for 100s simulations, but to
   get non-random results, we should run for longer.  The warm-up
   time doesn't need to scale up with the run length.*

   Other metrics of interest for general scenarios can be grouped in two
   sets: flow-centric and stability.  The flow-centric metrics include
   the sending rate, goodput, cumulative loss and queueing delay
   trajectory for each flow over time, and the transfer time per flow
   versus file size. *Testbeds could use monitors in the TCP layer
   (e.g., Web100) to estimate the queueing delay and loss.* *NS2 flowmon
   has problems, because it seems not to release memory associated with
   terminated flows. * Stability properties of interest include the
   standard deviation of the throughput and the queueing delay for the
   bottleneck link and for flows [WCL05].  The worst case stability is
   also considered.
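
   The per-run aggregate statistics can be computed from a packet-
   level record of the central link.  The sketch below assumes a
   hypothetical record format (totals and per-packet queueing delays
   restricted to the measurement period); it illustrates the intended
   definitions rather than any particular simulator's output.

      def aggregate_link_stats(delivered_bits, capacity_bps,
                               duration_s, packets_dropped,
                               packets_arrived, queue_delays_s):
          # Metrics for one direction of the central link over the
          # measurement period (e.g. the second half of the run).
          util = delivered_bits / (capacity_bps * duration_s)
          drops = packets_dropped / max(packets_arrived, 1)
          delay = sum(queue_delays_s) / max(len(queue_delays_s), 1)
          return {"utilization": util,
                  "drop_rate": drops,
                  "avg_queueing_delay_s": delay}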

3.2.  Delay/throughput tradeoff as function of queue size

   Different queue management mechanisms have different delay-throughput
   tradeoffs.  For example, Adaptive Virtual Queue [KS01] gives low
   delay, at the expense of lower throughput.  Different congestion
   control mechanisms may have different tradeoffs, which these tests
   aim to illustrate.

3.2.1.  Topology and background traffic

   These tests use the topology of Section 2.4.  This test is run for
   the access link scenario in Section 3.1.

   For each Drop-Tail scenario set, five tests are run, with buffer
   sizes of 10%, 20%, 50%, 100%, and 200% of the Bandwidth Delay Product
   (BDP) for a 100 ms flow.  For each AQM scenario (if used), five tests
   are run, with a target average queue size of 2.5%, 5%, 10%, 20%, and
   50% of the BDP, with a buffer equal to the BDP.
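
   The resulting test matrix can be enumerated mechanically, as in the
   sketch below.  Sizes are expressed as fractions of the BDP for a
   100 ms flow; the function name and tuple layout are illustrative
   only.

      DROPTAIL_BUFFERS = [0.10, 0.20, 0.50, 1.00, 2.00]  # of BDP
      AQM_TARGETS = [0.025, 0.05, 0.10, 0.20, 0.50]      # of BDP

      def delay_throughput_matrix(bdp_packets):
          # Yields (discipline, buffer_pkts, aqm_target_pkts) for the
          # ten runs; AQM runs use a buffer of one full BDP.
          for f in DROPTAIL_BUFFERS:
              yield ("droptail", int(f * bdp_packets), None)
          for f in AQM_TARGETS:
              yield ("aqm", bdp_packets, int(f * bdp_packets))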

3.2.2.  Flows under test

   The level of traffic from the traffic generator will be specified so
   that when a buffer size of 100% of the BDP is used with Drop Tail
   queue management, there is a moderate level of congestion (e.g., 1-2%
   packet drop rates when Standard TCP is used).  Alternatively, a range
   of traffic levels could be chosen, with a scenario set run for each
   traffic level (as in the examples cited below).

3.2.3.  Outputs

   For each test, three figures are kept: the average throughput, the
   average packet drop rate, and the average queueing delay over the
   second half of the test.

   For each set of scenarios, the output is two graphs.  For the delay/
   bandwidth graph, the x-axis shows the average queueing delay, and the
   y-axis shows the average throughput.  For the drop-rate graph, the
   x-axis shows the average queueing delay, and the y-axis shows the
   average packet drop rate.  Each pair of graphs illustrates the delay/
   throughput/drop-rate tradeoffs for this congestion control mechanism.
   For an AQM mechanism, each pair of graphs also illustrates how the
   throughput and average queue size vary (or don't vary) as a function
   of the traffic load.  Examples of delay/throughput tradeoffs appear
   in Figures 1-3 of [FS01] and Figures 4-5 of [AHM08].

3.3.  Ramp up time: completion time of one flow

   These tests aim to determine how quickly existing flows make room for
   new flows.

3.3.1.  Topology and background traffic

   Dumbbell.  At least three capacities should be used, as close as
   possible to: 56 kbps, 10 Mbps and 1 Gbps.  The 56 kbps case is
   included to investigate performance for users on mobile handsets.

   For each capacity, three RTT scenarios should be tested, in which the
   existing and newly arriving flow have RTTs of (74 ms, 74 ms),
   (124 ms, 28 ms) and (28 ms, 124 ms). *Was (80,80), (120,30),
   (30,120), but the above are taken from Table 1 to simplify
   implementation.  OK? *

   Throughout the experiment, there is also 10% bidirectional cross
   traffic, as described in Section 2, using the mix of RTTs described
   in Section 2.4.  All traffic is from the new TCP extension.

3.3.2.  Flows under test

   Traffic is dominated by two long-lived flows, because we believe that
   to be the worst case, in which convergence is slowest.

   One flow starts in ``equilibrium'' (at least having finished normal
   slow-start).  A new flow then starts; slow-start is disabled by
   setting the initial slow-start threshold to the initial CWND.  Slow
   start is disabled because this is the worst case, and could happen if
   a loss occurred in the first RTT. * Roman Chertov has suggested doing
   some tests with slow start enabled too.  Will there be time?  Wait
   until initial NS2 implementation is available to test *

   The experiment ends once the new flow has run for five minutes.  Both
   of the flows use 1500-byte packets.

3.3.3.  Outputs

   The output of these experiments is the time until the  1500*(10^n)-th
   byte of the new flow is received, for  n = 1, 2, ... .  This measures
   how quickly the existing flow releases capacity to the new flow,
   without requiring a definition of when ``fairness'' has been
   achieved.  By leaving the upper limit on  n  unspecified, the test
   remains applicable to very high-speed networks.

   A single run of this test cannot achieve statistical reliability by
   running for a long time.  Instead, an average over at least three
   runs should be taken.  Each run must use different cross traffic, as
   specified in Section 2.
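
   Given a time-stamped record of the cumulative bytes received by the
   new flow, the ramp-up metric can be extracted as below.  The trace
   format, a list of (time, cumulative_bytes) pairs, is an assumption
   made for illustration.

      def rampup_times(trace, max_n=12):
          # Time until the 1500*10**n-th byte of the new flow is
          # received, for n = 1, 2, ...; `trace` is a list of
          # (time_s, cumulative_bytes) pairs in time order.
          results = {}
          for n in range(1, max_n + 1):
              threshold = 1500 * 10 ** n
              for t, b in trace:
                  if b >= threshold:
                      results[n] = t
                      break
              else:
                  break  # threshold never reached; stop here
          return results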

3.4.  Transients: release of bandwidth, arrival of many flows

   These tests investigate the impact of a sudden change of congestion
   level.  They differ from the "Ramp up time" test in that the
   congestion here is caused by unresponsive traffic.

3.4.1.  Topology and background traffic

   The network is a single bottleneck link, with bit rate 100 Mbps, with
   a buffer of 1024 packets (120% BDP at 100 ms).

   The transient traffic is generated using UDP, to avoid overlap with
   the scenario of Section 3.3 and isolate the behavior of the flows
   under study.  Three transients are tested:
   1.  step decrease from 75 Mbps to 0 Mbps,
   2.  step increase from 0 Mbps to 75 Mbps,
   3.  30 step increases of 2.5 Mbps at 1 s intervals, simulating a
       ``flash crowd'' effect.
   These transients occur after the flow under test has exited slow-
   start, and remain until the end of the experiment.

   There is no TCP cross traffic (as described in Section 2) in this
   experiment, because flow arrivals and departures occur on timescales
   long compared with these effects.
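
   The three transient patterns can be written as piecewise-constant
   UDP rate schedules, as sketched below.  Each entry is a (time, rate)
   breakpoint relative to the start of the run; the function name and
   encoding are illustrative only.

      def transient_schedule(kind, t_transient_s):
          # (time_s, udp_rate_bps) breakpoints; each rate holds until
          # the next breakpoint.  t_transient_s is the instant the
          # flow under test exits slow-start.
          if kind == "release":      # step decrease: 75 Mbps -> 0
              return [(0.0, 75e6), (t_transient_s, 0.0)]
          if kind == "arrival":      # step increase: 0 -> 75 Mbps
              return [(0.0, 0.0), (t_transient_s, 75e6)]
          if kind == "flash_crowd":  # 30 steps of 2.5 Mbps, 1 s apart
              return [(0.0, 0.0)] + [(t_transient_s + i,
                                      (i + 1) * 2.5e6)
                                     for i in range(30)]
          raise ValueError(kind)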

3.4.2.  Flows under test

   There is one flow under test: a long-lived flow in the same direction
   as the transient traffic, with a 100 ms RTT.

3.4.3.  Outputs

   For the decrease in cross traffic, the metrics are (i) the time taken
   for the flow under test to increase its window to 60%, 80% and 90% of
   its BDP, and (ii) the maximum change of the window in a single RTT
   while the window is increasing to that value.

   For cases with an increase in cross traffic, the metric is the number
   of packets dropped by the cross traffic from the start of the
   transient until 100 s after the transient.  This measures the harm
   caused by algorithms which reduce their rates too slowly on
   congestion.
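
   For the bandwidth-release case, these metrics can be read off a
   per-RTT window trace of the flow under test, as in the sketch below;
   the trace format is assumed for illustration.

      def release_metrics(window_trace, bdp_packets):
          # window_trace: (time_s, cwnd_packets) samples, one per
          # RTT, starting at the transient.  Returns the times to
          # reach 60/80/90% of the BDP and the largest single-RTT
          # window increase seen before the 90% threshold is reached.
          times, max_step = {}, 0.0
          prev = window_trace[0][1]
          for t, w in window_trace:
              if 0.9 not in times:
                  max_step = max(max_step, w - prev)
              prev = w
              for frac in (0.6, 0.8, 0.9):
                  if frac not in times and w >= frac * bdp_packets:
                      times[frac] = t
          return times, max_step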

3.5.  Impact on standard TCP traffic

   Many new TCP proposals achieve a gain,  G , in their own throughput
   partly at the expense of a loss,  L , in the throughput of standard
   TCP flows sharing a bottleneck, and partly by increasing the link
   utilization.  In this context a "standard TCP flow" is defined as a
   flow using SACK TCP [RFC2883], but without ECN [RFC3168]. *What
   about: * Window scaling [RFC1323] (yes), FRTO [RFC4138] (yes), ABC
   [RFC3465] (no)?  The intention is for a "standard TCP flow" to
   correspond to TCP as commonly deployed in the Internet today (with
   the notable exception of CUBIC, which runs by default on the majority
   of web servers).  This scenario quantifies this tradeoff.

3.5.1.  Topology and background traffic

   The dumbbell of Section 2.4 is used with the same capacities as for
   the convergence tests (Section 3.3).  All traffic in this scenario
   comes from the flows under test.


3.5.2.  Flows under test

   The scenario is performed by conducting pairs of experiments, with
   identical flow arrival times and flow sizes.  Within each experiment,
   flows are divided into two camps.  For every flow in camp A, there is
   a flow with the same size, source and destination in camp B, and vice
   versa.  The start times of the two flows are within 2 s of each other.

   The file sizes and start times are as specified in Section 2, with
   start times scaled to achieve loads of 50% and 100%.  In addition,
   both camps have a long-lived flow.  The experiments last for 1200
   seconds.

   In the first experiment, called BASELINE, both camp A and camp B use
   standard TCP.  In the second, called MIX, camp A uses standard TCP
   and camp B uses the new TCP extension.

   The rationale for having paired camps is to remove the statistical
   uncertainty which would come from randomly choosing half of the flows
   to run each algorithm.  This way, camp A and camp B have the same
   loads.

3.5.3.  Outputs

   The gain achieved by the new algorithm and loss incurred by standard
   TCP are given by  G=T(B)_Mix/T(B)_Baseline  and  L=T(A)_Mix/
   T(A)_Baseline , where  T(x)  is the throughput obtained by camp  x ,
   measured as the amount of data acknowledged by the receivers (that
   is, ``goodput''), and taken over the last 800 seconds of the
   experiment.

   The loss,  L , is analogous to the ``bandwidth stolen from TCP''
   in [SA03] and ``throughput degradation'' in [SSM07].

   A plot of  G  vs  L  represents the tradeoff between efficiency and
   loss.
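
   Given per-camp goodput totals from the two runs, the gain/loss pair
   is a direct ratio computation, as sketched below; the argument names
   are illustrative.

      def gain_and_loss(goodput_b_mix, goodput_b_baseline,
                        goodput_a_mix, goodput_a_baseline):
          # G: gain of the new extension (camp B) relative to the
          # BASELINE run; L: loss suffered by standard TCP (camp A)
          # when competing with the new extension in the MIX run.
          return (goodput_b_mix / goodput_b_baseline,
                  goodput_a_mix / goodput_a_baseline)

      # Example: camp B gains 30% while camp A keeps 85% of its
      # baseline goodput.
      print(gain_and_loss(1.3e9, 1.0e9, 0.85e9, 1.0e9))  # (1.3, 0.85)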

3.5.4.  Suggestions

   Other statistics of interest are the values of  G  and  L  for each
   quartile of file sizes.  This will reveal whether the new proposal is
   more aggressive in starting up or more reluctant to release its share
   of capacity.

   As always, testing at other loads and averaging over multiple runs
   are encouraged.

3.6.  Intra-protocol and inter-RTT fairness

   These tests aim to measure bottleneck bandwidth sharing among flows
   of the same protocol with the same RTT, representing flows that
   share the same routing path.  The tests also measure inter-
   RTT fairness, the bandwidth sharing among flows of the same protocol
   where routing paths have a common bottleneck segment but might have
   different overall paths with different RTTs.

3.6.1.  Topology and background traffic

   The topology, the capacity and cross traffic conditions of these
   tests are the same as in Section 3.3.  The bottleneck buffer is
   varied from 25% to 200% BDP for a 100 ms flow, increasing by factors
   of 2.

3.6.2.  Flows under test

   We use two flows of the same protocol for this experiment.  The RTTs
   of the flows range from 10 ms to 160 ms (10 ms, 20 ms, 40 ms, 80 ms,
   and 160 ms), so that the ratio of the minimum RTT to the maximum RTT
   can be as small as 1/16. *In case a testbed doesn't support up to
   160 ms RTT, the RTTs may be scaled down in proportion to the maximum
   RTT supported in that environment.*

   Intra-protocol fairness: For each run, two flows with the same RTT,
   taken from the range of RTTs above, start randomly within the first
   10% of the experiment.  The order in which these flows start doesn't
   matter.  An additional test of interest, but not part of this suite,
   would involve two extreme cases - two flows with very short or very
   long RTTs (e.g., a delay of less than 1-2 ms, representing
   communication within a data center, or a delay of more than 600 ms,
   representing communication over satellite).

   Inter-RTT fairness: For each run, one flow with a fixed RTT of 160 ms
   starts first, and another flow with a different RTT, taken from the
   range of RTTs above, joins afterward.  The starting times of both
   flows are randomly chosen within the first 10% of the experiment as
   before.
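
   The set of runs can be enumerated as in the sketch below, with both
   start times drawn uniformly from the first 10% of the experiment;
   the experiment duration and random seed are illustrative
   assumptions.

      import random

      RTTS_MS = [10, 20, 40, 80, 160]

      def fairness_runs(duration_s=600.0, seed=0):
          # Intra-protocol runs: two flows with the same RTT.
          # Inter-RTT runs: a 160 ms flow starts first (earlier of
          # the two start times), joined by a flow with another RTT.
          rng = random.Random(seed)
          runs = []
          for rtt in RTTS_MS:
              starts = sorted(rng.uniform(0, 0.1 * duration_s)
                              for _ in range(2))
              runs.append(("intra", (rtt, rtt), tuple(starts)))
          for rtt in RTTS_MS[:-1]:
              starts = sorted(rng.uniform(0, 0.1 * duration_s)
                              for _ in range(2))
              runs.append(("inter", (160, rtt), tuple(starts)))
          return runs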

3.6.3.  Outputs

   The output of this experiment is the ratio of the average throughput
   values of the two flows.  The output also includes the packet drop
   rate for the congested link.

3.7.  Multiple bottlenecks

   These experiments explore the relative bandwidth obtained by a flow
   that traverses multiple bottlenecks and by flows with the same
   round-trip time that each traverse only one of the bottleneck links.

3.7.1.  Topology and background traffic

   The topology is a ``parking-lot'' topology with three (horizontal)
   bottleneck links and four (vertical) access links.  The bottleneck
   links have a rate of 100 Mbps, and the access links have a rate of
   1 Gbps.

   All flows have a round-trip time of 60 ms, to enable the effect of
   traversing multiple bottlenecks to be distinguished from that of
   different round trip times.  This can be achieved as in Figure 3 by
   (a) the second access link having a one-way delay of 30 ms, (b) the
   bottleneck link to which it does not connect having a one-way delay
   of 30 ms, and (c) all other links having negligible delay.  This can
   be extended to more than three bottlenecks as shown in Figure 4, by
   assigning a delay of 30 ms to every alternate access link, and to
   zero or one of the bottleneck links.  For the special case of three
   hops, an alternative is for all links to have a one-way delay of
   10 ms, as shown in Figure 5.  It is not clear whether there are
   interesting performance differences between these two topologies,
   and if so, which is more typical of the actual Internet.


    > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - >
     _____ 100M 0ms ____________ 100M 0ms _____________ 100M 30ms ____
    |   ................  |   ................  |   ................  |
   1G   :              : 1G   :              : 1G   :              : 1G
    |   :              :  |   :              :  |   :              :  |
   0ms  :              : 30ms :              : 0ms  :              : 0ms
    |   ^              V  |   ^              V  |   ^              V  |


   Basic multi-hop topology.

                                 Figure 3
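
   The delay assignment of Figure 3 can be checked mechanically: every
   flow under test, and every cross-traffic flow, should see a 60 ms
   round-trip time.  The sketch below encodes the Figure 3 one-way
   delays and sums them along each path; the link names are local to
   the sketch.

      # One-way delays (ms): access links A1-A4, bottlenecks B1-B3.
      DELAY_MS = {"A1": 0, "A2": 30, "A3": 0, "A4": 0,
                  "B1": 0, "B2": 0, "B3": 30}

      PATHS = {
          "multi-bottleneck": ["B1", "B2", "B3"],
          "single B1": ["A1", "B1", "A2"],
          "single B2": ["A2", "B2", "A3"],
          "single B3": ["A3", "B3", "A4"],
      }

      for name, links in PATHS.items():
          rtt = 2 * sum(DELAY_MS[l] for l in links)
          print(name, rtt)   # each path gives 60 ms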

       -------+------+------+------+------+-------######
       |......#......|......#......|......#......|......|
       |:    :#:    :|:    :#:    :|:    :#:    :|:    :|
       |:    :#:    :|:    :#:    :|:    :#:    :|:    :|
       |:    :#:    :|:    :#:    :|:    :#:    :|:    :|
       |^    V#^    V|^    V#^    V|^    V#^    V|^    V|

          ### 30ms one-way       ---  0ms one-way


   Extension to 7-hop parking lot.  (Not part of the basic test suite.)

                                 Figure 4



    > - - - - - - - - - - - - - - - - - - - - - - - - - - -  - - - >
     _____ 100M 10ms __________ 100M 10ms ___________ 100M 10ms ___
    |   ...............  |   ...............  |   ...............  |
   1G   :             : 1G   :             : 1G   :             : 1G
    |   :             :  |   :             :  |   :             :  |
   10ms :             : 10ms :             : 10ms :             : 10ms
    |   ^             V  |   ^             V  |   ^             V  |


   Alternative highly symmetric multi-hop topology.

                                 Figure 5

   Throughout the experiment, there is 10% bidirectional cross traffic
   on each of the three bottleneck links, as described in Section 2.
   The cross-traffic flows all traverse two access links and a single
   bottleneck link.

   All traffic uses the new TCP extension.

3.7.2.  Flows under test

   In addition to the cross-traffic, there are four flows under test,
   all with traffic in the same direction on the bottleneck links.  The
   multiple-bottleneck flow traverses no access links and all three
   bottleneck links.  The three single-bottleneck flows each traverse
   two access links and a single bottleneck link, with one flow for each
   bottleneck link.  The flows start in quick succession, separated by
   approximately 1 second.  These flows last at least 5 minutes.

   An additional test of interest would be to have a longer, multiple-
   bottleneck flow competing against shorter single-bottleneck flows.

3.7.3.  Outputs

   The output for this experiment is the ratio between the average
   throughput of the single-bottleneck flows and the throughput of the
   multiple-bottleneck flow, measured over the second half of the
   experiment.  Output also includes the packet drop rate for the
   congested link.

3.8.  Implementations

   There are two on-going implementation efforts.

   A testbed implementation is jointly being developed by the Centre for
   Advanced Internet Architectures (CAIA) at Swinburne University of
   Technology and by Netlab at Caltech.  It will eventually be available
   for public use through the web interface
   <http://wil-ns.cs.caltech.edu/testing/benchmark/tmrg.php>.

   A simulation implementation in ns is being developed by NEC Labs,
   China.  Contributions can be made via its SourceForge page,
   <http://sourceforge.net/projects/tcpeval/>.

3.9.  Conclusions

   An initial specification of an evaluation suite for TCP extensions
   has been described.  Future versions will include: detailed
   specifications, with modifications for simulations and testbeds; more
   measurement results about congested links in the current Internet;
   alternate specifications; and specific sets of scenarios that can be
   run in a plausible period of time in simulators and testbeds,
   respectively.

   Several software and hardware implementations of these tests are
   being developed for use by the community.  An implementation is being
   developed on WAN-in-Lab [LATL07], which will allow users to upload
   Linux kernels via the web and will run tests similar to those
   described here.  Some tests will be modified to suit the hardware
   available in WAN-in-Lab.  An NS-2 implementation is also being
   developed at NEC.  We invite others to contribute implementations on
   other simulator platforms, such as OMNeT++ and OpNet.

3.10.  Acknowledgements

   This work is based on a paper by Lachlan Andrew, Cesar Marcondes,
   Sally Floyd, Lawrence Dunn, Romaric Guillier, Wang Gang, Lars Eggert,
   Sangtae Ha and Injong Rhee.

   The authors would also like to thank Roman Chertov, Doug Leith,
   Saverio Mascolo, Ihsan Qazi, Bob Shorten, David Wei and Michele
   Weigle for valuable feedback.


4.  IANA Considerations

   None.


5.  Security Considerations

   None.


6.  Informative References

   [AHM08]    Andrew, L., Hanly, S., and R. Mukhtar, "Active Queue
              Management for Fair Resource Allocation in Wireless
              Networks", IEEE Transactions on Mobile Computing vol. 7,
              2008.

   [FK03]     Floyd, S. and E. Kohler, "Internet Research Needs Better
              Models", SIGCOMM Computer Communication Review (CCR) vol.
              33, no. 1, pp. 29-34, 2003.

   [FS01]     Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An
              Algorithm for Increasing the Robustness of RED",  ICIR,
              Tech. Rep., 2001. [Online]. Available:
              http://www.icir.org/floyd/papers/adaptiveRed.pdf .

   [GF04]     Gurtov, A. and S. Floyd, "Modeling Wireless Links for
              Transport Protocols", SIGCOMM Computer Communication
              Review (CCR) vol. 34, no. 2, pp.  85-96, 2004.

   [HK99]     Henderson, T. and R. Katz, "Transport Protocols for
              Internet-Compatible Satellite Networks", IEEE Journal on
              Selected Areas in Communications (JSAC) vol. 17, no. 2,
              pp. 326-344, 1999.

   [HVA03]    Hohn, N., Veitch, D., and P. Abry, "The Impact of the Flow
              Arrival Process in Internet Traffic", Proc. IEEE
              International Conference on Acoustics, Speech, and Signal
              Processing (ICASSP'03) vol. 6, pp. 37-40, 2003.

   [KS01]     Kunniyur, S. and R. Srikant, "Analysis and Design of an
              Adaptive Virtual Queue (AVQ) Algorithm for Active Queue
              Management", Proc. SIGCOMM'01 pp.  123-134, 2001.

   [Kelly79]  Kelly, F., "Reversibility and stochastic networks",
              Wiley 1979.

   [LATL07]   Lee, G., Andrew, L., Tang, A., and S. Low, "A WAN-in-Lab
              for protocol development", PFLDnet 2007.

   [MV06]     Mascolo, S. and F. Vacirca, "The Effect of Reverse Traffic
              on the Performance of New TCP Congestion Control
              Algorithms for Gigabit Networks", PFLDnet 2006.

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2883]  Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An
              Extension to the Selective Acknowledgement (SACK) Option
              for TCP", RFC 2883, July 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
              Counting (ABC)", RFC 3465, February 2003.

   [RFC4138]  Sarolahti, P. and M. Kojo, "Forward RTO-Recovery (F-RTO):
              An Algorithm for Detecting Spurious Retransmission
              Timeouts with TCP and the Stream Control Transmission
              Protocol (SCTP)", RFC 4138, August 2005.

   [RMC03]    Rossi, D., Mellia, M., and C. Casetti, "User Patience and
              the Web: a Hands-on Investigation", IEEE Globecom 2003.

   [SA03]     Souza, E. and D. Agarwal, "A HighSpeed TCP Study:
              Characteristics and Deployment Issues", LBNL, Technical
              Report LBNL-53215, 2003.

   [SSM07]    Shimonishi, H., Sanadidi, M., and T. Murase, "Assessing
              Interactions among Legacy and High-Speed TCP Protocols",
              Proc. Workshop on Protocols for Fast Long-Delay Networks
              (PFLDNet) 2007.

   [Tmix]     Weigle, M., Adurthi, P., Hernandez-Campos, F., Jeffay, K.,
              and F. Smith, "Tmix: a tool for generating realistic TCP
              application workloads in ns-2", SIGCOMM Computer
               Communication Review (CCR) vol. 36, no. 3, pp. 65-76, 2006.

   [WCL05]    Wei, D., Cao, P., and S. Low, "Time for a TCP Benchmark
              Suite?", [Online]. Available:
               http://wil.cs.caltech.edu/pubs/DWei-TCPBenchmark06.ps,
               2006.


Authors' Addresses

   Lachlan Andrew
   CAIA, Swinburne University of Technology
   PO Box 218
   Hawthorn, Vic  3122
   Australia

   Email: landrew@swin.edu.au


   Sally Floyd
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA  94704
   USA

   Email: floyd@icir.org


   Gang Wang
   NEC, China
   Innovation Plaza, Tsinghua Science Park, 1 Zhongguancun East Road
   Beijing  100084
   China

   Email: wanggang@research.nec.com.cn