Network Working Group                                       H.Berkowitz
Internet Draft                                                 A.Retana
Expires February 2002                                           S.Hares
draft-ietf-bmwg-bgpbas-00.txt                            P.Krishnaswamy
                                                                 M. Lepp

                                                               June 2001



        Benchmarking Methodology for Basic BGP Convergence


    Status of this Memo

    This document is an Internet-Draft and is in full conformance with
    all provisions of Section 10 of RFC 2026.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups. Note that
    other groups may also distribute working documents as Internet-
    Drafts.

    Internet-Drafts are draft documents valid for a maximum of six
    months and may be updated, replaced, or made obsolete by other
    documents at any time. It is inappropriate to use Internet-Drafts
    as reference material or to cite them other than as "work in
    progress."

    The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

    The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.


Abstract
    This draft establishes standards for measuring BGP convergence
performance. Its initial emphasis is on the control plane of single
BGP routers.  We do not address forwarding plane performance.


Conventions used in this document

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
    this document are to be interpreted as described in [RFC 2119].

Table of Contents
1.  Introduction
1.1  Overview and Roadmap
1.2  Scope
1.3  Types of Single-Router Convergence
2.  Reference Configurations
3.  Basic eBGP tests
3.1  Connection Conditions
3.2  Test Streams
3.3  Order of Received Updates
3.4  Initial Convergence
3.4.1  Single Peer Initial Convergence Time
3.4.2  Multiple Peers
3.5  Incremental Re-convergence with a Single Peer
3.5.1  Explicit add of single new route
3.5.2  Sequential withdraw and reannounce
3.5.3  Time to Change to Alternate Path after Explicit Withdrawal
3.6  Incremental Re-convergence with Multiple Peers
4.  Flaps
4.1  Flap Isolation Test
4.2  Authentication
5.  Acknowledgements
6.  References
Appendix A.  Representative Scenarios
A.1  Default-free interprovider peering
A.2  Interprovider peering with transit
A.3  Provider edge router
A.4  Multihomed subscriber edge router

1.  Introduction
    This document describes a specific set of tests aimed at
    characterizing the convergence performance of BGP-4 processes in
    routers or other boxes that incorporate BGP functionality. A key
    objective is to propose methodology that will standardize the
    conducting and reporting of convergence-related measurements.

    Although both convergence and forwarding are essential to basic
    router operation, this document does not consider forwarding
    performance in the Device Under Test (DUT), for two reasons. First,
    forwarding performance is the primary focus of [RFC 2544], and it is
    expected to be dealt with in work that ensues from [Trotter].

    Second, because convergence characterization is a complex process,
    we deliberately restrict this document to basic measurements for
    characterizing BGP convergence.

    Subsequent documents will explore the more intricate aspects of
    convergence measurement, such as the presence of policy processing,
    simultaneous traffic on the control and data paths within the DUT,
    and other realistic performance modifiers. Convergence of
    Interior Gateway Protocols will  be considered in separate
    drafts.

1.1 Overview and Roadmap

Measurements of protocols can be classified as either internal or
external.  Internal measurements are time-stamped within the Device
Under Test (DUT).  External measurements infer that a process in the
DUT has converged when a downstream measurement device indicates that
the corresponding advertisement has been received.  An alternative type
of external measurement tests for data forwarded to the downstream
device that relies upon the new route just computed by the DUT.

Internal measurements are plagued with time synchronization issues,
since Network Time Protocol (NTP) hooks may be missing from products or
improperly implemented.  Of course, in a self-contained lab setting, or
in the self-contained measurement of internal processes themselves,
synchronized timing is not an issue.

For the purposes of this document, external techniques are more readily
applicable.  However, external measurements have their own problems,
because they include the time to advertise the new route downstream and
the transmission times for the advertisement within the DUT.  If data
forwarding were to feature in the measurement methodology, it too would
include some extraneous latency: at a minimum, that of the forwarding
lookup process in the DUT.  This document deals only with external
measurements limited to route propagation.

A characterization of the BGP convergence performance of a device must
take into account all distinct stages and aspects of BGP functionality.
This requires that the relevant terms and metrics be as specific as
possible.  A terminology that meets this objective was presented in
draft-ietf-bmwg-conterm-00.txt.

1.2  Scope

This document deals with eBGP convergence of a single router Device
Under Test (DUT). It restricts the measurement of convergence to events
in the control plane, and does not consider the interactions of
convergence and forwarding.

Convergence measurements among multiple iBGP-connected routers in an AS,
and Internet-wide convergence measurements, are outside the scope of
this document as well.

These additional topics are unquestionably of interest, and it is the
intention of this document to form a stepping stone toward them.

1.3  Types of Single-Router Convergence

Two significantly different types of convergence time tend to be lumped
together in product specifications.  The first is the time needed for a
BGP speaker to build a full table after initialization, or for a
particular peering session to rebuild its table after a hard reset. The
second is the time needed for a router to respond to a new announcement
or withdrawal.

As stated in the Roadmap, measurements can be classified as either
internal or external. Internal measurements examine the RIB/FIB of the
DUT directly. While they are more accurate in principle, they require
measurement hooks in the implementation, as described in [Ahuja 2000a].

External measurements start with a stimulus from one or more "upstream"
routers and end with a specific event causing an advertisement to be
sent to a "downstream" peer.  In the reference configuration of Figure 1
(Section 2), external measurements are defined with respect to TR3 as
the downstream router.
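
The definition above can be sketched in a few lines of Python. This is
only an illustration, assuming the test harness records one timestamp
when the upstream stimulus is sent and another when the downstream
router sees the resulting advertisement, both on the same clock. The
Event record and the function name are hypothetical and do not come
from any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One timestamped observation made outside the DUT (hypothetical)."""
    time_s: float   # seconds on the measurement device's clock
    router: str     # test router that sent or saw the UPDATE, e.g. "TR1"
    kind: str       # "announce" or "withdraw" (kept for reporting only)
    prefix: str     # prefix carried in the UPDATE


def external_convergence_time(events, prefix, upstream, downstream="TR3"):
    """Seconds from the first upstream stimulus for `prefix` to the first
    UPDATE for it observed at the downstream router afterwards."""
    stimuli = [e.time_s for e in events
               if e.router == upstream and e.prefix == prefix]
    if not stimuli:
        raise ValueError("no upstream stimulus recorded for " + prefix)
    start = min(stimuli)
    seen = [e.time_s for e in events
            if e.router == downstream and e.prefix == prefix
            and e.time_s >= start]
    if not seen:
        raise ValueError("downstream router never saw " + prefix)
    return min(seen) - start
```

Because the end event is observed at the downstream router, the result
includes the time the DUT needs to send the advertisement as well as
the time it needs to select the route.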

2.  Reference Configurations

For tests when the number of peers is not a performance parameter of
interest, use the configuration in Figure 1:

TR1==========+---------+==========TR3
|            |         |
D1           |         |
|            |   DUT   |
TR2==========|         |
             +---------+
Figure 1.  Basic Test Configuration.


D1 is a prefix reachable by both TR1 and TR2. Neither TR1 nor TR2 is the
originating AS for the announcement of D1.

More complex peering arrangements will involve up to n Test Routers, as
shown in Figure 2.  It is recommended that the Figure 1 configuration
always be tested as a baseline, and then additional reports made that
show the effect on performance of increasing the number of peers.

TR1==========+---------+==========TR3
|            |         |
D1           |         |
|            |   DUT   |
TR2==========|         |
             |         |
                 ...
TRn==========+---------+
Figure 2. Test Configuration with n Peers.

Interface speeds must be specified as part of the test report.  At
least 100 Mbps is recommended, so media delays are not a significant
component of convergence times.
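
As a rough worked example of why the interface speed matters, the
following sketch computes the time needed to clock one BGP message onto
the wire at several link speeds. It assumes a 4096-octet message, the
maximum BGP-4 message size, and ignores lower-layer headers and
framing. The function name is illustrative only.

```python
def serialization_delay_s(message_bytes, link_bps):
    """Seconds needed to clock `message_bytes` onto a link of `link_bps`."""
    return message_bytes * 8 / link_bps


MAX_BGP_MESSAGE_BYTES = 4096  # maximum BGP-4 message size, in octets

for label, bps in (("10 Mbps", 10e6), ("100 Mbps", 100e6), ("1 Gbps", 1e9)):
    delay_us = serialization_delay_s(MAX_BGP_MESSAGE_BYTES, bps) * 1e6
    print(f"{label:>8}: {delay_us:8.1f} microseconds per full-size message")
```

At 100 Mbps a full-size message takes about 0.33 ms to transmit, so a
long update train still contributes measurable media delay. This is one
reason the interface speed belongs in the test report.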

In the absence of other route selection criteria, TR1 shall have an IP
address that makes it most preferred.
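
A minimal sketch of checking the address assignment follows. It
assumes, and this should be verified for the implementation under test,
that the DUT's final tie-breaker prefers the numerically lowest peer
address. Some implementations compare BGP Identifiers first. The
addresses and the function name are illustrative only.

```python
import ipaddress


def preferred_peer(peer_addresses):
    """Return the name of the peer whose address is numerically lowest.

    ASSUMPTION: the DUT breaks ties in favor of the lowest peer address.
    Addresses are compared as numbers, not as strings, so 192.0.2.9
    sorts before 192.0.2.10.
    """
    return min(peer_addresses,
               key=lambda name: ipaddress.ip_address(peer_addresses[name]))


# Illustrative addresses from a documentation range.
peers = {"TR1": "192.0.2.9", "TR2": "192.0.2.10"}
assert preferred_peer(peers) == "TR1"
```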


3.  Basic eBGP tests

All routers in this configuration shall have a policy of ADVERTISE
ALL/ACCEPT ALL [RFC 2622].  Tests with prefix filtering, community-based
preferences, authentication, etc., as well as performance under flap,
are TBD.

Not all eBGP applications are alike.  While the tests in this section
are applicable to a wide range of configurations, testers may select
configurations that are most relevant to the intended product use.  Such
configurations include:

    1. Interprovider peering, characterized by an exchange of customer
       routes, which, in the case of major providers, may number in the
       tens of thousands of routes but fewer than the full default-free
       table.

    2. Provider/Subscriber edge peering, where transit service implies
       the subscriber advertises relatively few routes to the provider
       but may take, variously, full default-free routes, a limited
       subset thereof, or default only from the provider.

3.1  Connection Conditions

The DUT should be physically connected to the test routers over a medium
sufficiently fast that propagation time is not a significant factor. A
medium of at least 100 Mbps is recommended.

Multiple peers may be connected to a single physical interface using
802.1q VLANs or another appropriate multiplexing scheme.

TCP connections shall use slow start.  Any nonstandard initial or
maximum window sizes shall be indicated in the test report.

3.2  Test Streams

Packet trains presented to the DUT shall be random with respect to
prefix length or order of specificity.

The degree of update packing shall be specified. When long packet trains
are being sent, the usual case will be that maximum packing up to the
MTU size will be used.
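
To make the degree of packing concrete, the following sketch groups
IPv4 prefixes into UPDATE messages under a size limit. The 19-octet
header, the two 2-octet length fields, and the NLRI encoding (one
length octet plus the minimum number of prefix octets) follow the BGP-4
message format. The parameters attrs_bytes and max_bytes are
assumptions of this sketch: the first is the size of the shared path
attributes, and the second defaults to the 4096-octet BGP-4 message
limit but can be lowered to model packing to a smaller size such as the
MTU.

```python
import ipaddress

BGP_HEADER_BYTES = 19    # marker (16) + length (2) + type (1)
UPDATE_FIXED_BYTES = 4   # withdrawn-routes length (2) + attribute length (2)


def nlri_bytes(prefix):
    """Encoded size of one IPv4 NLRI entry: 1 length octet + prefix octets."""
    plen = ipaddress.ip_network(prefix).prefixlen
    return 1 + (plen + 7) // 8


def pack_updates(prefixes, attrs_bytes, max_bytes=4096):
    """Greedily group prefixes into UPDATEs no larger than `max_bytes`.

    `attrs_bytes` is the (assumed constant) size of the shared path
    attributes; prefixes with different attributes cannot share an UPDATE.
    """
    budget = max_bytes - BGP_HEADER_BYTES - UPDATE_FIXED_BYTES - attrs_bytes
    messages, current, used = [], [], 0
    for prefix in prefixes:
        size = nlri_bytes(prefix)
        if size > budget:
            raise ValueError("max_bytes too small for a single prefix")
        if used + size > budget:
            # Current message is full; start a new one.
            messages.append(current)
            current, used = [], 0
        current.append(prefix)
        used += size
    if current:
        messages.append(current)
    return messages
```

The number of prefixes per message returned by such a function is one
way to state the degree of packing in the test report.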


3.3 Order of Received Updates

Within a set of updates, there is a potential for ordering among the
prefixes.  For the fairest testing, randomize the order of prefixes in
update trains, so that no particular RIB data structure benefits from
the ordering.


Assume we have an Adj-RIB-Out that consists of

          1.0.0.0/8
          2.0.0.0/8
          3.0.0.0/8
          1.1.0.0/16
          2.1.0.0/16
          3.1.0.0/16
          3.2.0.0/16
          1.1.1.0/24
          1.1.2.0/24
          2.1.2.0/24

    If it were sent in this order, top to bottom, it would be sorted
    by prefix length and by prefix value within each length.  A radix
    tree implementation might benefit considerably from this order.

    But if it were sent out in the following order

          1.0.0.0/8
          1.1.0.0/16
          1.1.1.0/24
          1.1.2.0/24
          2.0.0.0/8
          2.1.0.0/16
          2.1.2.0/24
          3.0.0.0/8
          3.1.0.0/16
          3.2.0.0/16

    it would favor an implementation that orders its routing table as
    a strict tree, implemented as a linked list.

    A fairer test train would be

          1.0.0.0/8
          2.1.0.0/16
          1.1.0.0/16
          3.0.0.0/8
          1.1.1.0/24
          2.0.0.0/8
          1.1.2.0/24
          3.1.0.0/16
          2.1.2.0/24
          3.2.0.0/16
    which is randomly ordered and does not favor any particular
    implementation.

Measurement units:  A metric of randomness, TBD.
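
One way to produce a randomized train that can still be repeated
exactly is to shuffle with a seeded generator and record the seed in
the test report. The sketch below uses the ten prefixes from the
example above. The function name and the use of a seed are assumptions
of this sketch, not requirements of this document.

```python
import random

# The example Adj-RIB-Out from this section.
ADJ_RIB_OUT = [
    "1.0.0.0/8", "2.0.0.0/8", "3.0.0.0/8",
    "1.1.0.0/16", "2.1.0.0/16", "3.1.0.0/16", "3.2.0.0/16",
    "1.1.1.0/24", "1.1.2.0/24", "2.1.2.0/24",
]


def randomized_train(prefixes, seed):
    """Return a shuffled copy of `prefixes`.

    The same seed always reproduces the same order, so a run can be
    repeated against another DUT with an identical train.
    """
    rng = random.Random(seed)
    train = list(prefixes)   # copy, so the caller's list is untouched
    rng.shuffle(train)
    return train


train = randomized_train(ADJ_RIB_OUT, seed=2001)
```

Using the same seed for every DUT in a comparison keeps the trains
identical while still avoiding an order that favors one RIB structure.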


3.4 Initial Convergence

While initial convergence is relatively simple to measure, and often is
the basis of product specifications, it is operationally far less
significant than reconvergence after changes.  A "carrier-grade" router
should not
initialize often, and the soft reset option reduces the need to rebuild
views. The initialization time, therefore, can be amortized over a long
period of time and may disappear into the noise when compared to
reconvergence.

3.4.1 Single Peer Initial Convergence Time

This basic reference test uses a representatively sized and populated
target RIB and no other variable influences (e.g., authentication off,
filters off, no policy).

The test begins with OPEN requests sent from TR1 and TR2 to the DUT.
Each Test Router sends a standard routing table of TBD routes.

The test ends when the DUT begins to advertise the last route in the
routing table to TR3.
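
The end condition can be evaluated from a capture taken at TR3. The
sketch below assumes the harness knows the set of prefixes it expects
the DUT to advertise and has a list of (time, prefix) pairs in arrival
order. Both inputs and the function name are hypothetical. Because the
timestamps are taken at TR3, the result includes transit time from the
DUT to TR3.

```python
def initial_convergence_time(open_time_s, expected_prefixes, tr3_log):
    """Seconds from the OPEN to the first sighting at TR3 of the last
    outstanding expected prefix.

    `tr3_log` is an iterable of (time_s, prefix) pairs in arrival order.
    """
    outstanding = set(expected_prefixes)
    if not outstanding:
        raise ValueError("no expected prefixes given")
    for time_s, prefix in tr3_log:
        outstanding.discard(prefix)   # repeats and extras are ignored
        if not outstanding:
            return time_s - open_time_s
    raise ValueError(f"{len(outstanding)} expected prefixes never reached TR3")
```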

3.4.2 Multiple Peers

TBD

3.5  Incremental Re-convergence with a Single Peer

For all of these measurements, report any route filters, authentication,
and reverse path verification used.  It is recommended that these not be
used for initial testing.

3.5.1 Explicit add of single new route

This test measures the time required to add a route newly advertised by
a peer.  Such a route does not exist in the DUT's RIB, and will not
displace a route in the RIB.

The DUT has been initialized, with no path to D1. Measurement time
begins when TR1 announces D1 to the DUT.

Measurement time stops when the DUT advertises D1 to TR3.

3.5.2 Sequential withdraw and reannounce


The DUT has been initialized and has a path to D1 via TR1, not TR2.
Simultaneously, TR1 sends TDown(TR1) and TR2 announces the new route
with Tbest(TR2).

Measurement begins when Tbest is received at the DUT. Measurement time
stops when the DUT advertises D1 to TR3.

3.5.3  Time to Change to Alternate Path after Explicit Withdrawal

The DUT has been initialized and has paths to D1 via both TR1 and TR2.
TR1's path is preferred, but TR1 withdraws it with TDown(TR1).
Re-convergence occurs when the TR2-advertised path(s) become active.

Measurement time stops when the DUT advertises D1 to TR3.

3.6  Incremental Re-convergence with Multiple Peers

    The number of routes per BGP peer is an obvious stressor to the
    convergence process. The number, and relative proportion, of
    multiple route instances and distinct routes being added or
    withdrawn by each peer will affect the convergence process, as will
    the mix of overlapping route instances, and IGP routes.

4.  Flaps
The following tests evaluate convergence when route flap exists.



Let TRF be a router that will generate only flapping routes.


TR1==========+---------+==========TR3
|            |         |
D1           |         |
|            |   DUT   |
TR2==========|         |
             |         |
                 ...
TRF==========+---------+
Figure 3. Test Diagram with a Router, TRF, flapping.

4.1  Flap Isolation Test

TRF will advertise a continuously flapping route.  Repeat the eBGP
convergence tests of Section 3.  The objective is to determine whether
a single flapping route affects the operation of the router.
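
A continuously flapping route can be described as an alternating
schedule of announcements and withdrawals. The sketch below generates
such a schedule for TRF. The flap period and test duration are
parameters this document does not specify, so they are assumptions here
and should be stated in the test report. The function name is
illustrative only.

```python
def flap_schedule(prefix, period_s, duration_s):
    """Yield (time_s, action, prefix) tuples alternating announce/withdraw.

    One full flap (announce, then withdraw) takes `period_s` seconds.
    Events are generated for times strictly before `duration_s`.
    """
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    half = period_s / 2
    step = 0
    while step * half < duration_s:
        action = "announce" if step % 2 == 0 else "withdraw"
        yield (step * half, action, prefix)
        step += 1


# Illustrative values: one flap every 2 seconds for 4 seconds.
events = list(flap_schedule("198.51.100.0/24", period_s=2.0, duration_s=4.0))
```

If route flap damping is enabled on the DUT, the results will depend on
the damping parameters, which should then be reported as well.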

4.2 Authentication
Repeat all tests above with MD5 authentication.


5.  Acknowledgements

Thanks to Francis Ovenden for review and Abha Ahuja for encouragement. Much
appreciation to Jeff Haas, Matt Richardson, and Shane Wright at Nexthop for
comments and input.

6.  References

    [Ahuja 2000a] "An Experimental Study of Delayed Internet Routing
Convergence." Abha Ahuja, Farnam Jahanian, Abhijit Bose, Craig Labovitz.
RIPE 37 - Routing WG.
    [RFC 2119] "Key words for use in RFCs to Indicate Requirement
Levels." S. Bradner. March 1997.
    [RFC 2439] "BGP Route Flap Damping." C. Villamizar, R. Chandra, R.
Govindan. November 1998.
    [RFC 2544] "Benchmarking Methodology for Network Interconnect
Devices." S. Bradner, J. McQuaid. March 1999.
    [RFC 2622] "Routing Policy Specification Language (RPSL)." C.
Alaettinoglu, C. Villamizar, E. Gerich, D. Kessens, D. Meyer, T. Bates,
D. Karrenberg, M. Terpstra. June 1999.
    [RFC 2827] "Network Ingress Filtering: Defeating Denial of Service
Attacks which employ IP Source Address Spoofing." P. Ferguson, D.
Senie. May 2000.
    [RFC 2918] "Route Refresh Capability for BGP-4." E. Chen. September
2000.
    [Trotter] "Terminology for Forwarding Information Base (FIB) based
Router Performance Benchmarking." Work in Progress,
draft-ietf-bmwg-fib-term-00.txt.

7.  Authors' Addresses

    Howard Berkowitz
    Nortel Networks
    5012 S. 25th St
    Arlington VA 22206

    Phone: +1 703 998-5819 (ESN 451-5819)
    Fax:   +1 703 998-5058
    EMail: hberkowi@nortelnetworks.com
           hcb@clark.net


    Alvaro Retana
    Cisco Systems, Inc.
    7025 Kit Creek Rd.
    Research Triangle Park, NC 27709
    Email: aretana@cisco.com

    Susan Hares
    Nexthop Technologies
    517 W. William
    Ann Arbor, MI 48103
    Phone:
    Email: skh@nexthop.com

    Padma Krishnaswamy
    Nexthop  Technologies
    517 W William
    Ann Arbor, MI 48103
    Phone: 734 936 2656
    Email: kri@nexthop.com

    Marianne Lepp
    Juniper Networks
    51 Sawyer Road
    Waltham, MA 02453
    Phone: 617 645 9019
    Email: mlepp@juniper.net

Appendix A.  Representative Scenarios

The following describes sample BGP applications positioned at various
points in the network.

A.1  Default-free interprovider peering

The DUT exchanges 0.3 to 0.5 D routes with a small number of peers.
Typically, routers in this application are limited by bandwidth rather
than route processing.

A.2  Interprovider peering with transit

The DUT exchanges 1.3 D routes with a small number of peers.

A.3  Provider edge router

The DUT has a large number (>10) of eBGP peers.

To 10% of the peers, the DUT advertises 1.3 D.
To 20% of the peers, the DUT advertises 0.3 D.
To 70% of the peers, the DUT advertises default.

50% of the peers advertise an aggregate and a more-specific route to the
DUT.
20% of the peers advertise 10 or more routes to the DUT.
30% of the peers advertise a single route to the DUT.
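
For a given number of peers, the percentages above can be turned into
whole peer counts. The sketch below is one possible way to do so,
assuming integer percentages and using the largest-remainder method to
hand out peers left over after rounding down. The function name and the
profile labels are illustrative only.

```python
def allocate_peers(n_peers, shares):
    """Split `n_peers` among profiles by percentage share.

    `shares` maps profile name -> integer percentage, summing to 100.
    Counts are rounded down, then leftover peers go to the profiles with
    the largest fractional remainders (largest-remainder method).
    """
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100")
    counts = {name: n_peers * pct // 100 for name, pct in shares.items()}
    leftover = n_peers - sum(counts.values())
    by_remainder = sorted(shares,
                          key=lambda name: (n_peers * shares[name]) % 100,
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Percentages taken from the scenario above.
ADVERTISED_TO_PEERS = {"1.3 D": 10, "0.3 D": 20, "default": 70}
RECEIVED_FROM_PEERS = {"aggregate + more-specific": 50,
                       "10 or more routes": 20,
                       "single route": 30}

to_peers = allocate_peers(20, ADVERTISED_TO_PEERS)
from_peers = allocate_peers(20, RECEIVED_FROM_PEERS)
```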

A.4  Multihomed subscriber edge router

The DUT connects to 2 peers.  It advertises an aggregate and a
more-specific route to each.


Full Copyright Statement

    Copyright (C) The Internet Society (2001).  All Rights Reserved.

    This document and translations of it may be copied and furnished to
    others, and derivative works that comment on or otherwise explain it
    or assist in its implementation may be prepared, copied, published
    and distributed, in whole or in part, without restriction of any
    kind, provided that the above copyright notice and this paragraph are
    included on all such copies and derivative works.  However, this
    document itself may not be modified in any way, such as by removing
    the copyright notice or references to the Internet Society or other
    Internet organizations, except as needed for the purpose of
    developing Internet standards in which case the procedures for
    copyrights defined in the Internet Standards process must be
    followed, or as required to translate it into languages other than
    English.

    The limited permissions granted above are perpetual and will not be
    revoked by the Internet Society or its successors or assigns.

    This document and the information contained herein is provided on an
    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.