Benchmarking Methodology WG                                 Rajiv Asati
Internet Draft                                                    Cisco
Updates: 2544 (if approved)                            Carlos Pignataro
Intended status: Informational                                    Cisco
Expires: August 2010                                  Fernando Calabria
                                                                  Cisco
                                                           Cesar Olvera
                                                            Consulintel

                                                      February 19, 2010

                      Device Reset Characterization
                        draft-asati-bmwg-reset-03




Abstract

   An operational forwarding device may need to be re-started
   (automatically or manually) for a variety of reasons, an event that
   we call a "reset" in this document. Since there may be an
   interruption in the forwarding operation during a reset, it is
   useful to know how long a device takes to begin forwarding packets
   again.

   This document specifies a methodology for characterizing reset
   during benchmarking of forwarding devices, and provides clarity and
   consistency in reset test procedures beyond what's specified in
   RFC2544. It therefore updates RFC2544.



Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."




Asati, et al.          Expires August 19, 2010                 [Page 1]


Internet-Draft          Reset Characterization            February 2010

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html



   This Internet-Draft will expire on August 19, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.


























Asati, et al.          Expires August 19, 2010                 [Page 2]


Internet-Draft          Reset Characterization            February 2010



Table of Contents


   1. Introduction...................................................4
      1.1. Scope.....................................................4
   2. Key Words to Reflect Requirements..............................5
   3. Reset Test.....................................................5
      3.1. Hardware Reset............................................5
         3.1.1. Routing Processor (RP) / Routing Engine reset........6
            3.1.1.1. RP Failure for a single-RP device (mandatory)...6
            3.1.1.2. RP Failure for a multiple-RP device (optional)..7
         3.1.2. Line Card (LC) Removal and Insertion (mandatory).....9
      3.2. Software Reset...........................................11
         3.2.1. Operating System (OS) reset (mandatory).............11
         3.2.2. Process reset (optional)............................13
      3.3. Power interruption.......................................15
         3.3.1. Power Interruption (mandatory)......................15
   4. Security Considerations.......................................16
   5. IANA Considerations...........................................17
   6. Acknowledgments...............................................17
   7. References....................................................18
      7.1. Normative References.....................................18
      7.2. Informative References...................................18
   Authors' Addresses...............................................19
























Asati, et al.          Expires August 19, 2010                 [Page 3]


Internet-Draft          Reset Characterization            February 2010

1. Introduction

   An operational forwarding device (or one of its components) may need
   to be re-started for a variety of reasons, an event that we call a
   "reset" in this draft. Since there may be an interruption in the
   forwarding operation during a reset, it is useful to know how long a
   device takes to begin forwarding packets again.

   However, the answer to this question is no longer simple and
   straight-forward as the modern forwarding devices employ many
   hardware advancements (distributed forwarding, etc.) and software
   advancements (graceful restart, etc.) that influence the recovery
   time after the reset.

   Additionally, there are other factors that influence the recovery
   time after the reset:

     1. Type of reset - Hardware (line-card crash, etc.) vs. Software
        (protocol reset, process crash, etc.) or even complete power
        failures

     2. Manual vs. Automatic reset

     3. Local vs. Remote reset

     4. Scale - Number of line cards present vs. in-use

     5. Scale - Number of physical and logical interfaces

     6. Scale - Number of routing protocol instances

   This document specifies a methodology for characterizing reset
   during benchmarking of forwarding devices, and provides clarity and
   consistency in reset procedures beyond what's specified in
   [RFC2544]. These procedures may be used by other benchmarking
   documents such as [RFC2544], [RFC5180], [RFC5695], etc.

   This document updates Section 26.6 of [RFC2544].

1.1. Scope

   This document focuses on only the reset criterion of benchmarking,
   and presumes that it would be beneficial to [RFC2544], [RFC5180],
   [RFC5695], and other BMWG benchmarking efforts.






Asati, et al.          Expires August 19, 2010                 [Page 4]


Internet-Draft          Reset Characterization            February 2010

2. Key Words to Reflect Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119].  RFC 2119 defines the use of these key words to help make
   the intent of standards track documents as clear as possible.  While
   this document uses these keywords, this document is not a standards
   track document.



3. Reset Test

   This section contains the description of the tests that are related
   to the characterization of DUT's (Device Under Test) / SUT's (System
   Under Test) speed to recover from a reset. There are three types of
   reset considered in this document:

     1. Hardware resets

     2. Software resets

     3. Power interruption

   Section 3.1 describes various hardware resets, whereas Section 3.2
   describes various software resets. Additionally, Section 3.3
   describes power interruption tests. These sections define and
   characterize these resets.

   Additionally, since device specific implementations may vary for
   hardware and software type resets, it is desirable to classify each
   test case as "mandatory" or "optional".



3.1. Hardware Reset

   A test designed to characterize the time it takes a DUT to recover
   from the hardware reset.

   A "hardware reset" generally involves the re-initialization of one
   or more physical components in the DUT, but not the entire DUT.

   A hardware reset is executed by the operator for example by physical
   removal of a physical component, by pressing on a "reset" button for




Asati, et al.          Expires August 19, 2010                 [Page 5]


Internet-Draft          Reset Characterization            February 2010

   the component, or could even be triggered from the command line
   interface.

   For routers that do not contain separate Routing Processor and Line
   Card modules, the hardware reset tests are not performed since they
   are not relevant; instead, the power interruption tests are
   mandatory to be performed (see Section 3.3) in these cases.

3.1.1. Routing Processor (RP) / Routing Engine reset

   The Routing Processor (RP) is the DUT module that is primarily
   concerned with Control Plane functions.

3.1.1.1. RP Failure for a single-RP device (mandatory)

   Objective

     To characterize the speed at which a DUT recovers from a Route
     processor hardware reset in a single RP environment.

   Procedure

     First, ensure that the RP is in a permanent state to which it will
     return to after the reset, by performing some or all of the
     following operational tasks: save the current DUT configuration,
     specify boot parameters, ensure the appropriate software files are
     available, or perform additional Operating System or hardware
     related task.

     Second, ensure that the DUT is able to forward the traffic for at
     least 15 seconds before any test activities are performed. The
     traffic should use the minimum frame size possible on the media
     used in the testing and rate should be sufficient for the DUT to
     attain the maximum forwarding throughput. This enables a finer
     granularity in the recovery time measurement.

     Third, perform the Route Processor (RP) hardware reset at this
     point. This entails for example physically removing the RP to
     later re-insert it, or triggering a hardware reset by other means
     (e.g., command line interface, physical switch, etc.)

     Finally, the characterization is completed by measuring the frame
     loss and recovery time from the moment the RP is re-initialized or
     reinserted.

   Reporting format




Asati, et al.          Expires August 19, 2010                 [Page 6]


Internet-Draft          Reset Characterization            February 2010

     The reset results are reported in a simple statement including the
     frame loss and recovery times.

     For each test case, it is RECOMMENDED that the following
     parameters be reported in these units:

           Parameter                Units or Examples

           Throughput               Frames per second and bits per

                                    second

           Loss                     Frames

           Time                     Seconds, with sufficient resolution

                                    to convey meaningful info

           Protocol                 IPv4, IPv6, MPLS, etc.

           Frame Size               Octets

           Port Media               Ethernet, GigE (Gigabit Ethernet),

                                    POS (Packet over SONET), etc.

           Port Speed               10 Gbps, 1 Gbps, 100 Mbps, etc.

           Interface Encap.         Ethernet, Ethernet VLAN,

                                    PPP, HDLC, etc.

     The reporting of results MUST regard repeatability considerations
     from Section 4 of [RFC2544]. It is RECOMMENDED to perform multiple
     trials and report average results.



3.1.1.2. RP Failure for a multiple-RP device (optional)

   Objective

     To characterize the speed at which a "secondary" Route Processor
     (sometimes referred to as "backup" RP) of a DUT becomes active
     after a "primary" (or "active") Route Processor hardware reset.
     This process is often referred to as "RP Switchover". The
     characterization in this test should be done for the default DUT



Asati, et al.          Expires August 19, 2010                 [Page 7]


Internet-Draft          Reset Characterization            February 2010

     behavior as well as a DUT's non-default configuration that
     minimizes frame loss.

   Procedure

     This test characterizes "RP Switchover". Many implementations
     allow for optimized switchover capabilities that minimize the
     downtime during the actual switchover. This test consists of two
     sub-cases from a switchover characteristics standpoint: First, a
     default behavior (with no switchover-specific configurations); and
     second, a non-default behavior with switchover configuration to
     minimize frame loss. Therefore, the procedures hereby described
     are executed twice, and reported separately.

     First, ensure that the RPs are in a permanent state such that the
     secondary will be activated to the same state as the active is, by
     performing some or all of the following operational tasks: save
     the current DUT configuration, specify boot parameters, ensure the
     appropriate software files are available, or perform additional
     Operating System or hardware related task.

     Second, ensure that the DUT is able to forward the traffic for at
     least 15 seconds before any test activities are performed. The
     traffic should use the minimum frame size possible on the media
     used in the testing and rate should be sufficient for the DUT to
     attain the maximum forwarding throughput. This enables a finer
     granularity in the recovery time measurement.

     Third, perform the primary Route Processor (RP) hardware reset at
     this point. This entails for example physically removing the RP,
     or triggering a hardware reset by other means (e.g., command line
     interface, physical switch, etc.) Is up to the Operator to decide
     if the active RP needs to be re-inserted after a grace period or
     not.

     Finally, the characterization is completed by measuring the
     complete frame loss and recovery time from the moment the active
     RP is hardware-reset.

   Reporting format

     The reset results are reported twice, one for the default
     switchover behavior and the other for the non-default one. For
     each, the report consists of a simple statement including the
     frame loss and recovery times, as well as any specific redundancy
     scheme in place.




Asati, et al.          Expires August 19, 2010                 [Page 8]


Internet-Draft          Reset Characterization            February 2010

     For each test case, it is RECOMMENDED that the following
     parameters be reported in these units:

           Parameter                Units or Examples

           Throughput               Frames per second and bits per

                                    second

           Loss                     Frames

           Time                     Seconds, with sufficient resolution

                                    to convey meaningful info

           Protocol                 IPv4, IPv6, MPLS, etc.

           Frame Size               Octets

           Port Media               Ethernet, GigE (Gigabit Ethernet),

                                    POS (Packet over SONET), etc.

           Port Speed               10 Gbps, 1 Gbps, 100 Mbps, etc.

           Interface Encap.         Ethernet, Ethernet VLAN,

                                    PPP, HDLC, etc.

     The reporting of results MUST regard repeatability considerations
     from Section 4 of [RFC2544]. It is RECOMMENDED to perform multiple
     trials and report average results.





3.1.2. Line Card (LC) Removal and Insertion (mandatory)

   The Line Card (LC) is the DUT component that is responsible with
   packet forwarding.

   Objective

     To characterize the speed at which a DUT recovers from a Line Card
     removal and insertion event.




Asati, et al.          Expires August 19, 2010                 [Page 9]


Internet-Draft          Reset Characterization            February 2010

   Procedure

     For this test, the Line Card that is being hardware-reset MUST be
     on the forwarding path and all destinations MUST be directly
     connected.

     First, complete some or all of the following operational tasks:
     save the current DUT configuration, specify boot parameters,
     ensure the appropriate software files are available, or perform
     additional Operating System or hardware related task.

     Second, ensure that the DUT is able to forward the traffic for at
     least 15 seconds before any test activities are performed. The
     traffic should use the minimum frame size possible on the media
     used in the testing and rate should be sufficient for the DUT to
     attain the maximum forwarding throughput. This enables a finer
     granularity in the recovery time measurement.

     Third, perform the Line Card (LC) hardware reset at this point.
     This entails for example physically removing the LC to later re-
     insert it, or triggering a hardware reset by other means (e.g.,
     command line interface, physical switch, etc.)

     Finally, the characterization is completed by measuring the frame
     loss and recovery time from the moment the LC is reinitialized or
     reinserted.

   Reporting Format

     The reset results are reported in a simple statement including the
     frame loss and recovery times.

     For each test case, it is RECOMMENDED that the following
     parameters be reported in these units:

           Parameter                Units or Examples

           Throughput               Frames per second and bits per

                                    second

           Loss                     Frames

           Time                     Seconds, with sufficient resolution

                                    to convey meaningful info




Asati, et al.          Expires August 19, 2010                [Page 10]


Internet-Draft          Reset Characterization            February 2010

           Protocol                 IPv4, IPv6, MPLS, etc.

           Frame Size               Octets

           Port Media               Ethernet, GigE (Gigabit Ethernet),

                                    POS (Packet over SONET), etc.

           Port Speed               10 Gbps, 1 Gbps, 100 Mbps, etc.

           Interface Encap.         Ethernet, Ethernet VLAN,

                                    PPP, HDLC, etc.

     The reporting of results MUST regard repeatability considerations
     from Section 4 of [RFC2544]. It is RECOMMENDED to perform multiple
     trials and report average results.



3.2. Software Reset

   To characterize the speed at which a DUT recovers from the software
   reset.

   In contrast to a "hardware reset", a "software reset" involves only
   the re-initialization of the execution, data structures, and partial
   state within the software running on the DUT module(s).

   A software reset is initiated for example from the DUT's Command
   Line Interface (CLI).

3.2.1. Operating System (OS) reset (mandatory)

   Objective

     To characterize the speed at which a DUT recovers from an
     Operating System (OS) software reset.

   Procedure

     First, complete some or all of the following operational tasks:
     save the current DUT configuration, specify software boot
     parameters, ensure the appropriate software files are available,
     or perform additional Operating System task.





Asati, et al.          Expires August 19, 2010                [Page 11]


Internet-Draft          Reset Characterization            February 2010

     Second, ensure that the DUT is able to forward the traffic for at
     least 15 seconds before any test activities are performed. The
     traffic should use the minimum frame size possible on the media
     used in the testing and rate should be sufficient for the DUT to
     attain the maximum forwarding throughput. This enables a finer
     granularity in the recovery time measurement.

     Third, trigger an Operating System re-initialization in the DUT,
     by operational means such as use of the DUT's Command Line
     Interface (CLI) or other management interface.

     Finally, the characterization is completed by measuring the
     complete frame loss and recovery time from the moment the reset
     instruction was given until the Operating System finished the
     reload and re-initialization (inferred by the re-establishing of
     traffic).

   Reporting format

     The reset results are reported in a simple statement including the
     frame loss and recovery times.

     For each test case, it is RECOMMENDED that the following
     parameters be reported in these units:

           Parameter                Units or Examples

           Throughput               Frames per second and bits per

                                    second

           Loss                     Frames

           Time                     Seconds, with sufficient resolution

                                    to convey meaningful info

           Protocol                 IPv4, IPv6, MPLS, etc.

           Frame Size               Octets

           Port Media               Ethernet, GigE (Gigabit Ethernet),

                                    POS (Packet over SONET), etc.

           Port Speed               10 Gbps, 1 Gbps, 100 Mbps, etc.




Asati, et al.          Expires August 19, 2010                [Page 12]


Internet-Draft          Reset Characterization            February 2010

           Interface Encap.         Ethernet, Ethernet VLAN,

                                    PPP, HDLC, etc.

     The reporting of results MUST regard repeatability considerations
     from Section 4 of [RFC2544]. It is RECOMMENDED to perform multiple
     trials and report average results.



3.2.2. Process reset (optional)

   Objective

     To characterize the speed at which a DUT recovers from a software
     process reset.

     Such speed may depend upon the number and types of process running
     in the DUT and which ones are tested. Different implementations of
     forwarding devices include various common processes. A process
     reset should be performed only in the processes most relevant to
     the tester.

   Procedure

     First, complete some or all of the following operational tasks:
     save the current DUT configuration, specify software parameters or
     environmental variables, or perform additional Operating System
     task.

     Second, ensure that the DUT is able to forward the traffic for at
     least 15 seconds before any test activities are performed. The
     traffic should use the minimum frame size possible on the media
     used in the testing and rate should be sufficient for the DUT to
     attain the maximum forwarding throughput. This enables a finer
     granularity in the recovery time measurement.

     Third, trigger a process reset for each process running in the DUT
     and considered for testing from a management interface (e.g., by
     means of the Command Line Interface (CLI), etc.)

     Finally, the characterization for each individual process is
     completed by measuring the complete frame loss and recovery time
     from the moment the reset instruction was given until the
     Operating System finished the reload and re-initialization
     (inferred by the re-establishing of traffic).




Asati, et al.          Expires August 19, 2010                [Page 13]


Internet-Draft          Reset Characterization            February 2010

   Reporting format

     The reset results are reported in a simple statement including the
     frame loss and recovery times for each process running in the DUT
     and tested. Given the implementation nature of this test, details
     of the actual process tested should be included along with the
     statement.

     For each test case, it is RECOMMENDED that the following
     parameters be reported in these units:

           Parameter                Units or Examples

           Throughput               Frames per second and bits per

                                    second

           Loss                     Frames

           Time                     Seconds, with sufficient resolution

                                    to convey meaningful info

           Protocol                 IPv4, IPv6, MPLS, etc.

           Frame Size               Octets

           Port Media               Ethernet, GigE (Gigabit Ethernet),

                                    POS (Packet over SONET), etc.

           Port Speed               10 Gbps, 1 Gbps, 100 Mbps, etc.

           Interface Encap.         Ethernet, Ethernet VLAN,

                                    PPP, HDLC, etc.

     The reporting of results MUST regard repeatability considerations
     from Section 4 of [RFC2544]. It is RECOMMENDED to perform multiple
     trials and report average results.










Asati, et al.          Expires August 19, 2010                [Page 14]


Internet-Draft          Reset Characterization            February 2010

3.3. Power interruption

   "Power interruption" refers to the complete loss of power on the
   DUT. It can be viewed as a special case of a hardware reset,
   triggered by the loss of the power supply to the DUT or its
   components, and is characterized by the re-initialization of all
   hardware and software in the DUT.



3.3.1. Power Interruption (mandatory)



   Objective

     To characterize the speed at which a DUT recovers from a complete
     loss of electric power or complete power interruption. This test
     simulates a complete power failure or outage, and should be
     indicative of the DUT/SUT's behavior during such event.

   Procedure

     First, ensure that the entire DUT is at a permanent state to which
     it will return to after the power interruption, by performing some
     or all of the following operational tasks: save the current DUT
     configuration, specify boot parameters, ensure the appropriate
     software files are available, or perform additional Operating
     System or hardware related task.

     Second, ensure that the DUT is able to forward the traffic for at
     least 15 seconds before any test activities are performed. The
     traffic should use the minimum frame size possible on the media
     used in the testing and rate should be sufficient for the DUT to
     attain the maximum forwarding throughput. This enables a finer
     granularity in the recovery time measurement.

     Third, interrupt the power (AC or DC) that feeds the corresponding
     DUT's power supplies at this point. This entails for example
     physically removing the power supplies in the DUT to later re-
     insert them, or simply disconnecting or switching off their power
     feeds (AC or DC as applicable). The actual power interruption
     should last at least 15 seconds.

     Finally, the characterization is completed by measuring the frame
     loss and recovery time from the moment the power is restored or
     the power supplies reinserted in the DUT.



Asati, et al.          Expires August 19, 2010                [Page 15]


Internet-Draft          Reset Characterization            February 2010

   Reporting format

     The reset results are reported in a simple statement including the
     frame loss and recovery times.

     For each test case, it is RECOMMENDED that the following
     parameters be reported in these units:

           Parameter                Units or Examples

           Throughput               Frames per second and bits per

                                    second

           Loss                     Frames

           Time                     Seconds, with sufficient resolution

                                    to convey meaningful info

           Protocol                 IPv4, IPv6, MPLS, etc.

           Frame Size               Octets

           Port Media               Ethernet, GigE (Gigabit Ethernet),

                                    POS (Packet over SONET), etc.

           Port Speed               10 Gbps, 1 Gbps, 100 Mbps, etc.

           Interface Encap.         Ethernet, Ethernet VLAN,

                                    PPP, HDLC, etc.

     The reporting of results MUST regard repeatability considerations
     from Section 4 of [RFC2544]. It is RECOMMENDED to perform multiple
     trials and report average results.





4. Security Considerations

   Benchmarking activities, as described in this memo, are limited to
   technology characterization using controlled stimuli in a laboratory




Asati, et al.          Expires August 19, 2010                [Page 16]


Internet-Draft          Reset Characterization            February 2010

   environment, with dedicated address space and the constraints
   specified in the sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or misroute traffic to the test
   management network.

   Furthermore, benchmarking is performed on a "black-box" basis,
   relying solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

   There are no specific security considerations within the scope of
   this document.



5. IANA Considerations

   There is no IANA consideration for this document.

6. Acknowledgments

   The authors would like to thank Ron Bonica, who motivated us to
   write this document. The authors would also like to thank Al Morton
   and Andrew Yourtchenko for providing review, suggestions, and
   valuable input.

   This document was prepared using 2-Word-v2.0.template.dot.

















Asati, et al.          Expires August 19, 2010                [Page 17]


Internet-Draft          Reset Characterization            February 2010



7. References

    7.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2544] Bradner, S. and McQuaid, J., "Benchmarking Methodology for
             Network Interconnect Devices", RFC 2544, March 1999.



    7.2. Informative References

   [RFC5180] Popoviciu, C., et al, "IPv6 Benchmarking Methodology for
             Network Interconnect Devices", RFC 5180, May 2008.

   [RFC5695] Akhter, A., Asati, R., and C. Pignataro, "MPLS Forwarding
             Benchmarking Methodology for IP Flows", RFC 5695, November
             2009.




























Asati, et al.          Expires August 19, 2010                [Page 18]


Internet-Draft          Reset Characterization            February 2010

Authors' Addresses

   Rajiv Asati
   Cisco Systems
   7025-6 Kit Creek Road
   RTP, NC 27709
   USA

   Email: rajiva@cisco.com


   Carlos Pignataro
   Cisco Systems
   7200-12 Kit Creek Road
   RTP, NC 27709
   USA

   Email: cpignata@cisco.com


   Fernando Calabria
   Cisco Systems
   7200-12 Kit Creek Road
   RTP, NC 27709
   USA

   Email: fcalabri@cisco.com


   Cesar Olvera
   Consulintel
   Joaquin Turina, 2
   Pozuelo de Alarcon, Madrid, E-28224
   Spain

   Phone: +34 91 151 81 99
   Email: cesar.olvera@consulintel.es













Asati, et al.          Expires August 19, 2010                [Page 19]