Internet Engineering Task Force                              M. Hamilton
Internet-Draft                                     BreakingPoint Systems
Intended status: Informational                             March 5, 2009
Expires: September 6, 2009


       Benchmarking Methodology for Content-Aware Network Devices
                  draft-hamilton-bmwg-ca-bench-meth-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 6, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   The purpose of this document is to define a series of test scenarios
   that may be used to generate statistics which help characterize the
   performance of network devices under realistic loading conditions.
   Additionally, this document suggests which statistics may be the
   most useful for determining network device performance under
   realistic deployment scenarios.


Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scope
   3.  Test Setup
     3.1.  Test Considerations
     3.2.  Clients and Servers
     3.3.  Traffic Generation Requirements
     3.4.  Multiple Client/Server Testing
     3.5.  Network Address Translation
     3.6.  TCP Stack Considerations
     3.7.  Other Considerations
   4.  Benchmarking Tests
     4.1.  Maximum Application Connection Establishment Rate
       4.1.1.  Objective
       4.1.2.  Setup Parameters
         4.1.2.1.  Transport-Layer Parameters
         4.1.2.2.  Application-Layer Parameters
       4.1.3.  Procedure
       4.1.4.  Measurement
         4.1.4.1.  Maximum Application Session Establishment Rate
         4.1.4.2.  Application Session Setup Time
         4.1.4.3.  Application Session Response Time
         4.1.4.4.  Application Session Time To Close
         4.1.4.5.  Application Latency
     4.2.  Application Throughput
       4.2.1.  Objective
       4.2.2.  Setup Parameters
         4.2.2.1.  Parameters
       4.2.3.  Procedure
       4.2.4.  Measurement
         4.2.4.1.  Maximum Throughput
         4.2.4.2.  Packet Loss
         4.2.4.3.  Application Setup Time
         4.2.4.4.  Application Response Time
         4.2.4.5.  Application Session Time To Close
         4.2.4.6.  Application Latency
     4.3.  Denial of Service Attack Mitigation
       4.3.1.  Objective
       4.3.2.  Setup Parameters
       4.3.3.  Procedure
       4.3.4.  Measurement
         4.3.4.1.  False Positives
         4.3.4.2.  False Negatives
     4.4.  Malicious Traffic Mitigation
       4.4.1.  Objective
       4.4.2.  Setup Parameters
       4.4.3.  Procedure
       4.4.4.  Measurement
         4.4.4.1.  False Positives
         4.4.4.2.  False Negatives
     4.5.  Malformed Traffic Mitigation
       4.5.1.  Objective
       4.5.2.  Setup Parameters
       4.5.3.  Procedure
       4.5.4.  Measurement
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address


1.  Introduction

   The purpose of this Internet-Draft is to define and provide a set of
   benchmarks useful for evaluating content-aware network devices.  As
   processing resources have become faster and cheaper, network devices
   now utilize information far deeper inside the network packet than
   ever before.  No longer do devices look simply at TCP/IP header
   information and bits of application headers; devices now decode
   application-layer protocols and inspect them for conformance to a
   given rule set, for anomalies, and even for security signatures.
   Such devices have commonly become known as content-aware.

   Many of the terms used throughout this draft have previously been
   defined in "Benchmarking Terminology for Firewall Performance", RFC
   2647 [1].  That document SHOULD be consulted prior to using this
   one.  The Benchmarking Methodology Working Group (BMWG) has
   previously defined benchmarking methodologies for network
   interconnect devices in RFC 2544 [2] and for firewall performance
   in RFC 3511 [3].  This draft seeks to enhance those methodologies to
   provide even more realistic results.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [4].


2.  Scope

   Content-aware devices take many forms, shapes and architectures.
   These devices are advanced network interconnect devices that inspect
   deep into the application payload of network data packets to perform
   classification.  They may be as simple as a firewall that uses
   application data inspection for rule set enforcement, or they may
   offer advanced functionality such as protocol decoding and
   validation, anti-virus, anti-spam and even application exploit
   filtering.

   This document is strictly focused on examining performance and
   robustness across a wide range of metrics that will help to predict
   device performance when deployed in live networks.  These metrics
   will be, wherever possible, implementation independent.

   It should also be noted that the purpose of this document is neither
   to define functional testing of the potential features in the
   Device/System Under Test (DUT/SUT) [1] nor to specify the
   configurations that should be tested.  Various definitions of proper
   operation and configuration may be appropriate for different
   deployments; those parameters are therefore outside the scope of
   this document.


3.  Test Setup

   This document is applicable to most test configurations and will not
   be confined to a discussion of specific test configurations.  Since
   each DUT/SUT will have its own unique configuration, users MUST
   configure their device with the same parameters that would be used
   in a live deployment of the device.

   The lines between network boundaries are rapidly blurring.  Every
   port on a device could be content-aware when using a fully meshed
   network topology.  Organizations deploying content-aware devices are
   doing so throughout their network infrastructure.  These devices
   inspect deep into the application flow to perform quality of service
   monitoring, filtering, metering, threat mitigation and more.

   Figure 1 illustrates a network topology that is fully meshed.

                +----------+    +----------+    +----------+
                |          |    |          |    |          |
                | Servers/ |____|   DUT    |____| Servers/ |
                | Clients  |    |          |    | Clients  |
                |          |    |          |    |          |
                +----------+    +----------+    +----------+

                            Fully Meshed Device

                       Figure 1: Fully Meshed Device

   This document will also apply to the network configurations specified
   by Figures 1 and 2 in RFC 3511.

3.1.  Test Considerations

3.2.  Clients and Servers

   Content-aware device testing SHOULD involve multiple clients and
   multiple servers.  As with RFC 3511 [3], this methodology uses the
   terms virtual clients/servers throughout.  As defined in RFC 3511
   [3], a single data source may emulate multiple clients and/or
   servers within the context of the same test scenario.  The test
   report MUST indicate the number of virtual clients/servers used
   during the test.  Appendix C of RFC 2544 [2] lists the range of IP
   addresses assigned to the BMWG by the IANA.  This address range
   SHOULD be adhered to in accordance with RFC 2544 [2].
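
   As an illustration only (not part of this methodology), the sketch
   below shows how a test tool might enumerate virtual client and
   server addresses.  The 198.18.0.0/15 block is assumed here to be
   the benchmarking range referenced above, and the /24 split between
   clients and servers is an arbitrary example.

      # Illustrative sketch: enumerate virtual client/server addresses
      # from the 198.18.0.0/15 benchmarking block (assumed).  The /24
      # split between clients and servers is an arbitrary example.
      import ipaddress

      clients = list(ipaddress.ip_network("198.18.0.0/24").hosts())
      servers = list(ipaddress.ip_network("198.19.0.0/24").hosts())

      # The counts below would be included in the test report, which
      # MUST state how many virtual clients/servers were emulated.
      print(len(clients), "virtual clients,", len(servers),
            "virtual servers")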

3.3.  Traffic Generation Requirements

   The explicit purposes of content-aware devices vary widely, but most
   of these devices use information deeper inside the application flow
   to make decisions and classify traffic.  Because of this, users MUST
   utilize real application traffic when benchmarking the performance
   of such devices.

   Due to the dynamic nature of the environment in which these devices
   are being deployed, this document will not explicitly state the
   application protocols or versions to be used for this methodology.
   While this is left to the discretion of the end user, there are
   several guidelines that SHOULD be used when determining the breadth
   and depth of application protocols to be used:

   o  The traffic generation pattern SHOULD contain all protocols that
      may be present in the final production deployment.

   o  The percentage of each protocol SHOULD approximate the percentage
      seen in the final production deployment.

   o  The application traffic SHOULD consist of unique traffic flows
      with realistic payload content, not simply repeated ones and
      zeros.

   There are numerous tools available providing detailed information on
   the traffic flows in networks.  A description or definition of these
   tools is outside the scope of this document.
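
   As a purely illustrative sketch (the protocols and percentages below
   are hypothetical examples, not recommendations), a traffic mix
   derived from observation of the production network might be
   represented and sanity-checked as follows:

      # Hypothetical traffic mix; the percentages SHOULD approximate
      # the production deployment and SHOULD cover every protocol
      # expected to be present.
      traffic_mix = {"HTTP": 55.0, "SMTP": 20.0, "DNS": 15.0,
                     "FTP": 10.0}

      assert round(sum(traffic_mix.values()), 2) == 100.0, \
          "protocol percentages must sum to 100%"

      for protocol, percent in traffic_mix.items():
          print(f"{protocol}: {percent}% of offered load")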

3.4.  Multiple Client/Server Testing

   In actual network deployments, connections are being established
   between multiple clients and multiple servers simultaneously.  RFC
   3511 [3] specifies that connections must be initiated in a round-
   robin fashion, but in order to replicate performance in live
   networks, this method SHOULD NOT be used.  The connection sequence
   ordering scenarios a device will see on a live network will likely be
   much less deterministic.  Thus, users SHOULD set up the test
   equipment to issue requests to the virtual servers at random rather
   than in a predictable round-robin fashion.  This method helps the
   test setup more closely reflect the behavior of a live network
   deployment.
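
   The difference between the two scheduling approaches can be
   illustrated with the following sketch (illustrative only; the server
   addresses are hypothetical and a real test tool would implement its
   own scheduling):

      # Illustrative comparison of server-selection strategies.
      import itertools
      import random

      servers = ["198.19.0.1", "198.19.0.2", "198.19.0.3"]

      # RFC 3511-style round-robin ordering (deterministic).
      round_robin = itertools.cycle(servers)
      deterministic = [next(round_robin) for _ in range(6)]

      # Randomized ordering recommended by this document.
      randomized = [random.choice(servers) for _ in range(6)]

      print("round-robin:", deterministic)
      print("random:     ", randomized)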

3.5.  Network Address Translation

   Many content-aware devices are capable of performing Network Address
   Translation (NAT)[1].  If the final deployment of the DUT will have
   this functionality enabled, then the DUT MUST also have it enabled
   during the execution of this methodology.  It MAY be beneficial to
   perform the test series in both modes in order to determine the
   performance differential when using NAT.  The test report MUST
   indicate whether NAT was enabled during the testing process.

3.6.  TCP Stack Considerations

   As with RFC 3511 [3], TCP options SHOULD remain constant across all
   devices under test in order to ensure truly comparable results.  This
   document does not attempt to specify which TCP options should be
   used, but all devices tested SHOULD be subject to the same
   configuration options.
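
   As one hedged example (assuming a Linux test host where the
   TCP_MAXSEG socket option is available; a dedicated test tool would
   normally expose these settings directly), client-side TCP options
   can be pinned so that every DUT is tested with an identical
   configuration:

      # Illustrative sketch: pin selected TCP options so that each DUT
      # is exercised with the same client-side TCP configuration.
      import socket

      def make_test_socket(mss=1460):
          s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          # TCP_MAXSEG caps the MSS advertised for this connection
          # (assumed available, e.g., on Linux).
          s.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)
          s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
          return s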

3.7.  Other Considerations

   Content-aware devices have widely varying feature sets.  In the
   interest of realistic test results, the DUT features that will
   likely be enabled in the final deployment SHOULD be enabled during
   testing.  This methodology is not intended to advise on which
   features should be enabled, but to suggest using the actual
   deployment configuration.


4.  Benchmarking Tests

4.1.  Maximum Application Connection Establishment Rate

4.1.1.  Objective

   To determine the maximum rate at which a device is able to establish
   application-specific sessions as defined by RFC 2647 [1].

4.1.2.  Setup Parameters

   The following parameters MUST be defined for all tests:

4.1.2.1.  Transport-Layer Parameters

   o  Aging Time: The time, expressed in seconds, that the DUT will
      keep a connection in its state table after receiving a TCP FIN
      or RST packet.

   o  Maximum Segment Size: The size, in bytes, of the largest segment
      that may be sent over a TCP connection.

4.1.2.2.  Application-Layer Parameters

   o  Protocol List: A listing of the layer 4 through 7 protocols
      present in a given test run.

   o  Protocol Mix: A listing of the percentage of total throughput
      absorbed by a given protocol.

4.1.3.  Procedure

   The test SHOULD generate application network traffic that meets the
   conditions of Section 3.3.  The traffic pattern SHOULD begin with an
   application session establishment rate of 5% of expected maximum.
   The test SHOULD be configured to increase the attempt rate in units
   of 5% up through 110% of expected maximum.  The duration of each
   loading phase SHOULD be at least 30 seconds.  This test MAY be
   repeated, each subsequent iteration beginning at 5% of expected
   maximum and increasing session establishment rate to 10% more than
   the maximum observed from the previous test run.

   This procedure MAY be repeated any number of times with the results
   being averaged together.
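
   The loading schedule described above can be expressed concretely.
   The following sketch (illustrative only) generates the attempt-rate
   steps for a single iteration, assuming a hypothetical expected
   maximum of 1000 sessions per second:

      # Illustrative loading schedule: 5% steps from 5% through 110%
      # of the expected maximum, each phase lasting at least 30
      # seconds.
      expected_max = 1000      # sessions/second, hypothetical value
      phase_duration = 30      # seconds per loading phase

      schedule = [(pct, round(expected_max * pct / 100))
                  for pct in range(5, 115, 5)]

      for pct, rate in schedule:
          print(f"offer {rate} sessions/s ({pct}%) "
                f"for {phase_duration}s")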

4.1.4.  Measurement

   The following metrics MAY be determined from this test, and SHOULD be
   observed for each application protocol within the traffic mix:

4.1.4.1.  Maximum Application Session Establishment Rate

   The test tool SHOULD report the maximum rate at which application
   sessions were established.

4.1.4.2.  Application Session Setup Time

   The test tool SHOULD report the minimum, maximum and average
   application setup time.

4.1.4.3.  Application Session Response Time

   The test tool SHOULD report the minimum, maximum, and average
   application session response times.

4.1.4.4.  Application Session Time To Close

   The test tool SHOULD report the minimum, maximum, and average
   application session time to close.

4.1.4.5.  Application Latency

   The test tool SHOULD report the minimum, maximum and average amount
   of time an application packet takes to traverse the DUT.
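
   The per-session statistics listed above reduce to simple minimum,
   maximum and average computations over per-session measurements.  A
   minimal sketch follows; the field names and values are hypothetical:

      # Illustrative reduction of per-session measurements (seconds)
      # to the minimum/maximum/average values reported above.
      sessions = [
          {"setup": 0.012, "response": 0.034,
           "close": 0.050, "latency": 0.002},
          {"setup": 0.015, "response": 0.031,
           "close": 0.047, "latency": 0.003},
      ]

      def summarize(metric):
          values = [s[metric] for s in sessions]
          return min(values), max(values), sum(values) / len(values)

      for metric in ("setup", "response", "close", "latency"):
          lo, hi, avg = summarize(metric)
          print(f"{metric}: min={lo:.3f}s max={hi:.3f}s "
                f"avg={avg:.3f}s")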

4.2.  Application Throughput

4.2.1.  Objective

   To determine the maximum rate at which a device is able to forward
   packets when using realistic and stateful applications.

4.2.2.  Setup Parameters

   The following parameters MUST be defined and reported for all tests:

4.2.2.1.  Parameters

   The same transport and application parameters as described in
   Section 4.1.2 MUST be used.

4.2.3.  Procedure

   This test will attempt to send application data through the device
   at a session rate of 30% of the maximum session establishment rate
   observed in Section 4.1.  This procedure MAY be repeated with the
   results from each iteration averaged together.
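
   For clarity, the offered session rate for this test is derived
   directly from the result of Section 4.1.  The numbers in the sketch
   below are hypothetical:

      # Illustrative derivation of the offered load for the throughput
      # test, with per-iteration throughput results averaged.
      max_session_rate = 2000          # sessions/s, from Section 4.1
      offered_rate = 0.30 * max_session_rate

      runs = [940.0, 955.0, 948.0]     # Mbit/s per iteration (example)
      average_throughput = sum(runs) / len(runs)

      print(f"offered rate: {offered_rate:.0f} sessions/s")
      print(f"average throughput: {average_throughput:.1f} Mbit/s")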

4.2.4.  Measurement

   The following metrics MAY be determined from this test, and SHOULD be
   observed for each application protocol within the traffic mix:

4.2.4.1.  Maximum Throughput

   The test tool SHOULD report the minimum, maximum and average
   application throughput.

4.2.4.2.  Packet Loss

   The test tool SHOULD report the number of network packets lost or
   dropped from source to destination.

4.2.4.3.  Application Setup Time

   The test tool SHOULD report the minimum, maximum, and average amount
   of time necessary before an application may begin transmitting data.

4.2.4.4.  Application Response Time

   The test tool SHOULD report the minimum, maximum, and average
   response time of the application session.

4.2.4.5.  Application Session Time To Close

   The test tool SHOULD report the minimum, maximum and average amount
   of time for an application session to fully close.

4.2.4.6.  Application Latency

   The test tool SHOULD report the minimum, maximum and average amount
   of time an application packet takes to traverse the DUT.

4.3.  Denial of Service Attack Mitigation

4.3.1.  Objective

   To determine the effects of a TCP SYN Flood Denial-of-Service (DoS)
   attack on application session performance.

4.3.2.  Setup Parameters

   The same Transport-Layer and Application-Layer Parameters previously
   specified in Section 4.1.2 and Section 4.2.2 must be used.
   Additionally, the following parameters MUST be defined and reported
   for all tests:

   o  SYN attack rate: The rate, expressed in packets per second, at
      which the DUT will receive TCP SYN packets [3].

4.3.3.  Procedure

   This test will utilize the procedures specified previously in
   Section 4.1.3 and Section 4.2.3.  When performing the procedures
   listed previously, during the steady-state time, the test should
   generate TCP SYN packets at the rate defined by the SYN attack rate
   parameter described above.  The test tool MUST NOT respond to the TCP
   SYN packets with TCP SYN/ACK packets.  This procedure SHOULD be
   performed with the TCP SYN packets originating from a single host, as
   well as from multiple hosts.

   Both procedures SHOULD be run with and without the DoS mitigation
   feature enabled on the DUT to determine the effects of the DoS
   attack on the baseline metrics previously derived.  Additionally,
   the test MAY be configured to generate other denial-of-service
   attacks, including distributed attacks.  This document does not
   attempt to specify which additional scenarios should be tested.

4.3.4.  Measurement

   For each protocol present in the traffic mix, in addition to the
   metrics specified by Section 4.1.4 and Section 4.2.4, the following
   metrics MAY be determined from this test:

4.3.4.1.  False Positives

   Record the number of application sessions that failed because they
   were falsely detected as part of a denial-of-service attack.

4.3.4.2.  False Negatives

   Record the number of TCP SYN packets from the DoS stream that were
   allowed to pass through the DUT.
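
   The two error counts above may also be expressed as rates.  A
   minimal sketch with hypothetical counter values follows:

      # Illustrative false-positive/false-negative accounting for the
      # DoS mitigation test.  All counter values are hypothetical.
      legitimate_sessions_attempted = 10000
      legitimate_sessions_blocked = 12        # false positives
      attack_syns_sent = 500000
      attack_syns_forwarded = 750             # false negatives

      false_positive_rate = (legitimate_sessions_blocked /
                             legitimate_sessions_attempted)
      false_negative_rate = attack_syns_forwarded / attack_syns_sent

      print(f"false positives: {legitimate_sessions_blocked} "
            f"({false_positive_rate:.2%})")
      print(f"false negatives: {attack_syns_forwarded} "
            f"({false_negative_rate:.2%})")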

4.4.  Malicious Traffic Mitigation

4.4.1.  Objective

   To determine the effects on performance that malicious traffic may
   have on the DUT.

4.4.2.  Setup Parameters

   The same Transport-Layer and Application-Layer Parameters previously
   specified in Section 4.1.2 and Section 4.2.2 must be used.
   Additionally, the following parameters MUST be defined and reported
   for all tests:

   o  Attack List: A listing of the malicious traffic that was generated
      by the test.

4.4.3.  Procedure

   This test will utilize the procedures specified previously in
   Section 4.1.3 and Section 4.2.3.  When performing the procedures
   listed previously, during the steady-state time, the tester should
   generate malicious traffic representative of the final network
   deployment.  The mix of attacks MAY include software vulnerability
   exploits, network worms, back-door access attempts, network probes
   and other malicious traffic.

   If the DUT can be run with and without attack mitigation, both
   procedures SHOULD be run with the feature enabled and disabled to
   determine the effects of the malicious traffic on the baseline
   metrics previously derived.  If a DUT does not have active attack
   mitigation capabilities, this procedure SHOULD be run regardless;
   certain malicious traffic could affect device performance even if
   the DUT does not actively inspect packet data for malicious content.

4.4.4.  Measurement

   For each protocol present in the traffic mix, in addition to the
   metrics specified by Section 4.1.4 and Section 4.2.4, the following
   metrics MAY be determined from this test:

4.4.4.1.  False Positives

   Record the number of application transactions that failed due to
   false detection as malicious traffic.  This measurement has little
   meaning for DUTs that do not actively block malicious traffic.

4.4.4.2.  False Negatives

   Record the number of malicious attacks that were allowed to pass
   through the DUT.  This measurement has little meaning for DUTs that
   do not actively block malicious traffic.

4.5.  Malformed Traffic Mitigation

4.5.1.  Objective

   To determine the effects on performance and stability that malformed
   traffic may have on the DUT.

4.5.2.  Setup Parameters

   The same Transport-Layer and Application-Layer Parameters previously
   specified in Section 4.1.2 and Section 4.2.2 must be used.

4.5.3.  Procedure

   This test will utilize the procedures specified previously in
   Section 4.1.3 and Section 4.2.3.  When performing the procedures
   listed previously, during the steady-state time, the tester should
   generate malformed traffic at all protocol layers.  This is commonly
   known as fuzzed traffic.  This test SHOULD be run on a DUT regardless
   of whether it has built-in mitigation capabilities.
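
   As one hedged illustration of what is meant by fuzzed traffic (real
   fuzzing tools are far more sophisticated and malform every protocol
   layer, not just the payload), an otherwise valid request can be
   randomly mutated before transmission:

      # Illustrative byte-level mutation of an otherwise valid request.
      import random

      def mutate(payload: bytes, mutations: int = 3) -> bytes:
          data = bytearray(payload)
          for _ in range(mutations):
              data[random.randrange(len(data))] = random.randrange(256)
          return bytes(data)

      valid_request = (b"GET /index.html HTTP/1.1\r\n"
                       b"Host: example.com\r\n\r\n")
      fuzzed_request = mutate(valid_request)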

4.5.4.  Measurement

   For each protocol present in the traffic mix, the metrics specified
   by Section 4.1.4 and Section 4.2.4 MAY be determined.  This data may
   be used to ascertain the effects of fuzzed traffic on the DUT.


5.  IANA Considerations

   This memo includes no request to IANA.

   All drafts are required to have an IANA considerations section (see
   the update of RFC 2434 [6] for a guide).  If the draft does not
   require IANA to do anything, the section contains an explicit
   statement that this is the case (as above).  If there are no
   requirements for IANA, the section will be removed during conversion
   into an RFC by the RFC Editor.


6.  Security Considerations

   The purpose of this document is to provide a methodology for
   benchmarking content-aware network interconnect devices.  While this
   document does suggest running some tests utilizing software
   vulnerability exploits and network attacks, the primary purpose is to
   determine the effects on performance rather than assess the security
   of the DUTs themselves.  Thus, security is outside the scope of this
   document.


7.  References

7.1.  Normative References

   [1]  Newman, D., "Benchmarking Terminology for Firewall Performance",
        RFC 2647, August 1999.

   [2]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
        Network Interconnect Devices", RFC 2544, March 1999.

   [3]  Hickman, B., Newman, D., Tadjudin, S., and T. Martin,
        "Benchmarking Methodology for Firewall Performance", RFC 3511,
        April 2003.

   [4]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [5]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on
        Security Considerations", BCP 72, RFC 3552, July 2003.

7.2.  Informative References

   [6]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
        Considerations Section in RFCs",
        draft-narten-iana-considerations-rfc2434bis-09 (work in
        progress), March 2008.


Author's Address

   Mike Hamilton
   BreakingPoint Systems
   Austin, TX  78717
   US

   Phone: +1 512 636 2303
   Email: mhamilton@breakingpoint.com
