Network Working Group                                           A. Clark
Internet-Draft                                     Telchemy Incorporated
Intended status: BCP                                        July 4, 2008
Expires: January 5, 2009

              Framework for Performance Metric Development

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on January 5, 2009.


   This memo describes a framework and guidelines for the development of
   performance metrics that are beyond the scope of existing working
   group charters in the IETF.  In this version, the memo refers to a
   Performance Metrics Entity, or PM Entity, which may in future be a
   working group or directorate or a combination of these two.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   document are to be interpreted as described in RFC 2119 [RFC2119].

Clark                    Expires January 5, 2009                [Page 1]

Internet-Draft           Perf. Metric Framework                July 2008

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Background and Motivation  . . . . . . . . . . . . . . . .  3
     1.2.  Organization of this memo  . . . . . . . . . . . . . . . .  4
   2.  Purpose and Scope  . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Metrics Development  . . . . . . . . . . . . . . . . . . . . .  4
     3.1.  Audience for Metrics . . . . . . . . . . . . . . . . . . .  5
     3.2.  Definitions of a Metric  . . . . . . . . . . . . . . . . .  5
     3.3.  Composed Metrics . . . . . . . . . . . . . . . . . . . . .  6
     3.4.  Metric Specification . . . . . . . . . . . . . . . . . . .  6
       3.4.1.  Outline  . . . . . . . . . . . . . . . . . . . . . . .  6
       3.4.2.  Normative parts of metric definition . . . . . . . . .  6
       3.4.3.  Informative parts of metric definition . . . . . . . .  7
       3.4.4.  Metric Definition Template . . . . . . . . . . . . . .  8
       3.4.5.  Examples . . . . . . . . . . . . . . . . . . . . . . .  9
     3.5.  Qualifying Metrics . . . . . . . . . . . . . . . . . . . . 10
     3.6.  Reporting Models . . . . . . . . . . . . . . . . . . . . . 10
     3.7.  Dependencies . . . . . . . . . . . . . . . . . . . . . . . 10
       3.7.1.  Timing accuracy  . . . . . . . . . . . . . . . . . . . 10
       3.7.2.  Dependencies of metric definitions on related
               events or metrics  . . . . . . . . . . . . . . . . . . 11
       3.7.3.  Relationship between application performance and
               lower layer metrics  . . . . . . . . . . . . . . . . . 11
   4.  Performance Metric Development Process . . . . . . . . . . . . 11
     4.1.  New Proposals for Metrics  . . . . . . . . . . . . . . . . 11
     4.2.  Proposal Approval  . . . . . . . . . . . . . . . . . . . . 12
     4.3.  PM Entity Interaction with other WGs . . . . . . . . . . . 12
     4.4.  Standards Track Performance Metrics  . . . . . . . . . . . 12
   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 13
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 13
   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13
   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
     8.1.  Normative References . . . . . . . . . . . . . . . . . . . 13
     8.2.  Informative References . . . . . . . . . . . . . . . . . . 13
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 14
   Intellectual Property and Copyright Statements . . . . . . . . . . 15

Clark                    Expires January 5, 2009                [Page 2]

Internet-Draft           Perf. Metric Framework                July 2008

1.  Introduction

   Many applications are distributed in nature, and their performance
   may be impacted by a IP impairments, server capacity, congestion and
   other factors.  It is important to measure the performance of
   applications and services to ensure that quality objectives are being
   met and to support problem diagnosis.  Standardized metrics help to
   ensure that performance measurement is implemented consistently and
   to facilitate interpretation and comparison.

   There are at least three phases in the development of performance
   standards.  They are:

   1.  Definition of a Performance Metric and its units of measure

   2.  Specification of a Method of Measurement

   3.  Specification of the Reporting Format

   During the development of metrics it is often useful to define
   performance objectives and expected value ranges however this is not
   defined as part of the metric specification.

   This memo refers to a Performance Metrics Entity, or PM Entity, which
   may in future be a working group or directorate or a combination of
   these two.

1.1.  Background and Motivation

   Although the IETF has two Working Groups dedicated to the development
   of performance metrics, they each have strict limitations in their

   - The Benchmarking Methodology WG has addressed a range of networking
   technologies and protocols in their long history (such as IEEE 802.3,
   ATM, Frame Relay, and Routing Protocols), but the charter strictly
   limits their performance characterizations to the laboratory

   - The IP Performance Metrics WG has the mandate to develop metrics
   applicable to live IP networks, but it is specifically prohibited
   from developing metrics that characterize traffic (such as a VoIP

   A BOF held at IETF-69 introduced the IETF community to the
   possibility of a generalized activity to define standardized
   performance metrics.  The existence of a growing list of Internet-
   Drafts on performance metrics (with community interest in

Clark                    Expires January 5, 2009                [Page 3]

Internet-Draft           Perf. Metric Framework                July 2008

   development, but in un-chartered areas) illustrates the need for
   additional performance work.  The majority of people present at the
   BOF supported the proposition that IETF should be working in these
   areas, and no one objected to any of the proposals.

   The IETF does have current and completed activities related to the
   reporting of application performance metrics (e.g.  RAQMON) and is
   also actively involved in the development of reliable transport
   protocols which would affect the relationship between IP performance
   and application performance.

   Thus there is a gap in the currently chartered coverage of IETF WGs:
   development of performance metrics for non-IP-layer protocols that
   can be used to characterize performance on live networks.

1.2.  Organization of this memo

   This memo is divided in two major sections beyond the Purpose and
   Scope section.  The first is a definition and description of a
   performance metric and its key aspects.  The second defines a process
   to develop these metrics that is applicable to the IETF environment.

2.  Purpose and Scope

   The purpose of this memo is to define a framework and a process for
   developing performance metrics for IP-based applications that operate
   over reliable or datagram transport protocols, and that can be used
   to characterize traffic on live networks and services.

   The scope of this memo includes the support of metric definition for
   any protocol developed by the IETF, however this memo is not intended
   to supercede existing working methods within WGs that have existing
   chartered work in this area.

   This process is not intended to govern performance metric development
   in existing IETF WG that are focused on metrics development, such as
   IPPM and BMWG.  However, the framework and guidelines may be useful
   in these activities, and MAY be applied where appropriate.

3.  Metrics Development

   This section provides key definitions and qualifications of
   performance metrics.

Clark                    Expires January 5, 2009                [Page 4]

Internet-Draft           Perf. Metric Framework                July 2008

3.1.  Audience for Metrics

   Metrics are intended for use in measuring the performance of an
   application, network or service.  A key first step in metric
   definition is to identify what metrics are needed by the "user" in
   order to properly maintain service quality and to identify and
   quantify problems, i.e. to consider the audience for the metrics.

3.2.  Definitions of a Metric

   A metric is a measure of an observable behavior of an application,
   protocol or other system.  The definition of a metric often assumes
   some implicit or explicit underlying statistical process, and a
   metric is an estimate of a parameter of this process.  If the assumed
   statistical process closely models the behavior of the system then
   the metric is "better" in the sense that it more accurately
   characterizes the state or behavior of the system.

   A metric should serve some defined purpose.  This may include the
   measurement of capacity, quantifying how bad some problem is,
   measurement of service level, problem diagnosis or location and other
   such uses.  A metric may also be an input to some other process, for
   example the computation of a composite metric or a model or
   simulation of a system.  Tests of the "usefulness" of a metric

      (i) the degree to which its absence would cause significant loss
      of information on the behavior or state of the application or
      system being measured

      (ii) the correlation between the metric and the quality of service
      / experience delivered to the user (person or other application)

      (iii) the degree to which the metric is able to support the
      identification and location of problems affecting service quality.

   For example, consider a distributed application operating over a
   network connection that is subject to packet loss.  A Packet Loss
   Rate (PLR) metric is defined as the mean packet loss rate over some
   time period.  If the application performs poorly over network
   connections with high packet loss rate and always performs well when
   the packet loss rate is zero then the PLR metric is useful to some
   degree.  Some applications are sensitive to short periods of high
   loss (bursty loss) and are relatively insensitive to isolated packet
   loss events; for this type of application there would be very weak
   correlation between PLR and application performance.  A "better"
   metric would consider both the packet loss rate and the distribution
   of loss events.  If application performance is degraded when the PLR

Clark                    Expires January 5, 2009                [Page 5]

Internet-Draft           Perf. Metric Framework                July 2008

   exceeds some rate then a useful metric may be a measure of the
   duration and frequency of periods during which the PLR exceeds that

3.3.  Composed Metrics

   Some metrics may not be measured directly, but may be composed from
   metrics that have been measured.  Usually the contribution metrics
   have a limited scope in time or space, and they can be combined to
   estimate the performance of some larger entity.  Some examples of
   composed metrics and composed metric definitions are:

   Spatial Composition is defined as the composition of metrics of the
   same type with differing spatial domains [Ref ?].  For spatially
   composed metrics to be meaningful, the spatial domains should be non-
   overlapping and contiguous, and the composition operation should be
   mathematically appropriate for the type of metric.

   Temporal Composition is defined as the composition of metrics of the
   same type with differing time spans [Ref ?].  For temporally composed
   metrics to be meaningful, the time spans should be non-overlapping
   and contiguous, and the composition operation should be
   mathematically appropriate for the type of metric.

   Temporal Aggregation is a summarization of metrics into a smaller
   number of metrics, each of which has a greater time span than the
   original metrics.  An example would be to compute the minimum,
   maximum and average values of a series of time sampled values of a

3.4.  Metric Specification

3.4.1.  Outline

   A metric definition MUST have a normative part that defines what the
   metric is and how it is measured or computed and SHOULD have an
   informative part that describes the metric and its application.

3.4.2.  Normative parts of metric definition

   The normative part of a metric definition MUST define at least the

   (i) Metric Name

   Metric names MUST be unique within the set of metrics being defined
   and MAY be descriptive.

Clark                    Expires January 5, 2009                [Page 6]

Internet-Draft           Perf. Metric Framework                July 2008

   (ii) Metric Description

   The description MUST explain what the metric is, what is being
   measured and how this relates to the performance of the system being

   (iii) Measurement Method

   This MUST define what is being measured, estimated or computed and
   the specific algorithm to be used.  Terms such as "average" should be
   qualified (e.g. running average or average over some interval).  It
   is important to also define exception cases and how these are
   handled.  For example, there are a number of commonly used metrics
   related to packet loss; these often don't define the criteria by
   which a packet is determined to be lost (vs very delayed) or how
   duplicate packets are handled.  For example, if the average packet
   loss rate during a time interval is reported, and a packet's arrival
   is delayed from one interval to the next then was it "lost" during
   the interval during which it should have arrived or should it be
   counted as received?

   (iv) Units of measurement

   The units of measurement must be clearly stated.

   (v) Measurement timing

   The acceptable range of timing intervals or sampling intervals for a
   measurement and the timing accuracy required for such intervals must
   be specified.  Short intervals or frequent sampling provides a richer
   source of information that can be helpful in assessing application
   performance however can lead to excessive measurement data.  Long
   measurement or sampling intervals reduce that amount of reported and
   collected data however may be insufficient to truly understand
   application performance or service quality if this varies with time.

3.4.3.  Informative parts of metric definition

   The informative part of a metric specification is intended to support
   the implementation and use of the metric.  This part SHOULD provide
   the following data: (i) Implementation The implementation description
   may be in the form of text, algorithm or example software.  The
   objective of this part of the metric definition is to assist
   implementers to achieve a consistent result. (ii) Conformance Testing
   The metric definition SHOULD provide guidance on conformance testing.
   This may be in the form of test vectors, a formal conformance test
   plan or informal advice.

Clark                    Expires January 5, 2009                [Page 7]

Internet-Draft           Perf. Metric Framework                July 2008

   (iii) Use and Applications The Use and Applications description is
   intended to assist the "user" to understand how, when and where the
   metric can be applied, and what significance the value range for the
   metric may have.  This would typically involve a definition of the
   "typical" and "abnormal" range of the metric, if this was not
   apparent from the nature of the metric.  For example, it is fairly
   intuitive that a lower packet loss rate would equate to better
   performance however the user may not know the significance of some
   given packet loss rate.  For example, the speech level of a telephone
   signal is commonly expressed in dBm0.  If the user is presented with:
   Speech level = -7 dBm0 this is not intuitively understandable, unless
   the user is a telephony expert.  If the metric definition explains
   that the typical range is -18 to -28 dBm0, a value higher than -18
   means the signal may be too high (loud) and less than -28 means that
   the signal may be too low (quiet), it is much easier to interpret the
   metric. (iv) Reporting Model

   There are often implied relationships between the method of reporting
   metrics and the metric itself.  For example, if the metric is a short
   term running average packet delay variation (e.g.  PPDV as defined in
   RFC3550) however this value is reported at intervals of 6-10 seconds
   this results in a sampling model which may have limited accuracy if
   packet delay variation is non-stationary.

3.4.4.  Metric Definition Template


   Metric Name

   Metric Description

   Measurement Method

   Units of measurement

   Measurement Timing


   Implementation Guidelines

   Use and Applications

   Reporting Model

Clark                    Expires January 5, 2009                [Page 8]

Internet-Draft           Perf. Metric Framework                July 2008

3.4.5.  Examples

   Example definition

   Metric Name: BurstPacketLossFrequency

   Metric Description: A burst of packet loss is defined as a longest
   period starting and ending with lost packets during which no more
   than Gmin consecutive packets are received.  The
   BurstPacketLossFrequency is defined as the number of bursts of packet
   loss occurring during a specified time interval (e.g. per minute, per
   hour, per day).  If Gmin is set to 0 then a burst of packet loss
   would comprise only consecutive lost packets, whereas a Gmin of 16
   would define bursts as periods of both lost and received packets
   (sparse bursts) having a loss rate of greater than 5.9%.

   Measurement Method: Bursts may be detected using the Markov Model
   algorithm defined in RFC3611.  The BurstPacketLossFrequency is
   calculated by counting the number of burst events within the defined
   measurement interval.  A burst that spans the boundary between two
   time intervals shall be counted within the later of the two

   Units of Measurement: Bursts per time interval (e.g. per second, per
   hour, per day)

   Measurement Timing: This metric can be used over a wide range of time
   intervals.  Using time intervals of longer than one hour may prevent
   the detection of variations in the value of this metric due to time-
   of-day load network load changes.  Timing intervals should not vary
   in duration by more than +/- 2%.

   Implementation Guidelines: See RFC3611.

   Conformance Testing: See Appendix for C code to generate test

   Use and Applications: This metric is useful to detect IP network
   transient problems that affect the quality of applications such as
   Voice over IP or IP Video.  The value of Gmin may be selected to
   ensure that bursts correspond to a packet loss rate that would
   degrade the performance of the application of interest (e.g. 16 for

   Reporting Model: This metric needs to be associated with a defined
   time interval, which could be defined by fixed intervals or by a
   sliding window.

Clark                    Expires January 5, 2009                [Page 9]

Internet-Draft           Perf. Metric Framework                July 2008

3.5.  Qualifying Metrics

   Each metric SHOULD be assessed according to the following list of

   o  Unambiguously defined?

   o  Units of Measure Specified?

   o  Measurement Interval Specified?

   o  Measurement Errors Identified?

   o  Repeatable?

   o  Implementable?

   o  Assumptions concerning underlying process?

   o  Use cases?

   o  Correlation with application performance/ user experience?

3.6.  Reporting Models

   A metric, or some set of metrics, may be measured over some time
   period, measured continuously, sampled, aggregated and/or combined
   into composite metrics and then reported using a "push" or "pull"
   model.  Reporting protocols typically introduce some limitations and
   assumptions with regard to the definition of a metric.

3.7.  Dependencies

3.7.1.  Timing accuracy

   The accuracy of the timing of a measurement may affect the accuracy
   of the metric.  This may not materially affect a sampled value metric
   however would affect an interval based metric.  Some metrics, for
   example the number of events per time interval, would be directly
   affected; for example a 10 percent variation in time interval would
   lead directly to a 10 percent variation in the measured value.  Other
   metrics, such as the average packet loss rate during some time
   interval, would be affected to a lesser extent.

   If it is necessary to correlate sampled values or intervals then it
   is essential that the accuracy of sampling time and interval start/
   stop times is sufficient for the application (for example +/- 2%).

Clark                    Expires January 5, 2009               [Page 10]

Internet-Draft           Perf. Metric Framework                July 2008

3.7.2.  Dependencies of metric definitions on related events or metrics

   Metric definitions may explicitly or implicitly rely on factors that
   may not be obvious.  For example, the recognition of a packet as
   being "lost" relies on having some method to know the packet was
   actually lost (e.g.  RTP sequence number), and some time threshold
   after which a non-received packet is declared as lost.  It is
   important that any such dependencies are recognized and incorporated
   into the metric definition.

3.7.3.  Relationship between application performance and lower layer

   Lower layer metrics may be used to compute or infer the performance
   of higher layer applications, potentially using an application
   performance model.  The accuracy of this will depend on many factors

   (i) The completeness of the set of metrics - i.e. are there metrics
   for all the input values to the application performance model?

   (ii) Correlation between input variables (being measured) and
   application performance

   (iii) Variability in the measured metrics and how this variability
   affects application performance

4.  Performance Metric Development Process

4.1.  New Proposals for Metrics

   The following entry criteria will be considered for each proposal.

   Proposals SHOULD be prepared as Internet Drafts, describing the
   metrics and conforming to the qualifications above as much as

   Proposals SHOULD be vetted by the corresponding protocol development
   Working Group prior to discussion by the PM Entity.  This aspect of
   the process includes an assessment of the need for the metrics
   proposed and assessment of the support for their development in IETF.

   Proposals SHOULD include an assessment of interaction and/or overlap
   with work in other Standards Development Organizations.

   Proposals SHOULD specify the intended audience and users of the
   metrics.  The development process encourages participation by members

Clark                    Expires January 5, 2009               [Page 11]

Internet-Draft           Perf. Metric Framework                July 2008

   of the intended audience.

   Proposals SHOULD survey the existing standards work in the area and
   identify additional expertise that might be consulted, or possible
   overlap with other standards development orgs.

   Proposals SHOULD identify any security and IANA requirements.
   Security issues could potentially involve revealing of user
   identifying data or the potential misuse of active test tools.  IANA
   considerations may involve the need for a metrics registry.

4.2.  Proposal Approval

   Who does this???

   The IETF/IESG/Relevant ADs/Relevant WG/PM Entity ???

   This section depends on the direction of the solution, or form that
   the PM Entity takes.

4.3.  PM Entity Interaction with other WGs

   The PM Entity SHALL work in partnership with the related protocol
   development WG when considering an Internet Draft that specifies
   performance metrics for a protocol.  A sufficient number of
   individuals with expertise must be willing to consult on the draft.
   If the related WG has concluded, comments on the proposal should
   still be sought from key RFC authors and former chairs, or from the
   WG mailing list if it was not closed.

   A dedicated mailing list MAY be initiated for each work area, so that
   protocol experts can subscribe to and receive the message traffic
   that is relevant to their work.

   In some cases, it will be appropriate to have the IETF session
   discussion during the related protocol WG session, to maximize
   visibility of the effort to that WG and expand the review.

4.4.  Standards Track Performance Metrics

   The PM Entity will manage the progression of PM RFCs along the
   Standards Track.  See [I-D.bradner-metricstest].  This may include
   the preparation of test plans to examine different implementations of
   the metrics to ensure that the metric definitions are clear and
   unambiguous (depending on the final form of the draft above).

Clark                    Expires January 5, 2009               [Page 12]

Internet-Draft           Perf. Metric Framework                July 2008

5.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an

6.  Security Considerations

   In general, the existence of framework for performance metric
   development does not constitute a security issue for the Internet.
   Metric definitions may introduce security issues and this framework
   recommends that those defining metrics should identify any such risk

   The security considerations that apply to any active measurement of
   live networks are relevant here as well.  See [RFC4656].

7.  Acknowledgements

   The authors would like to thank Al Morton and Benoit Claise for their
   comments and contributions.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
              Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, September 2006.

8.2.  Informative References

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [Casner]   "A Fine-Grained View of High Performance Networking, NANOG
              22 Conf.;", May
              20-22 2001.


Clark                    Expires January 5, 2009               [Page 13]

Internet-Draft           Perf. Metric Framework                July 2008

              Bradner, S. and V. Paxson, "Advancement of metrics
              specifications on the IETF Standards Track",
              draft-bradner-metricstest-03 (work in progress),
              August 2007.

              Morton, A., "Framework for Metric Composition",
              draft-ietf-ippm-framework-compagg-06 (work in progress),
              February 2008.

              Morton, A. and E. Stephan, "Spatial Composition of
              Metrics", draft-ietf-ippm-spatial-composition-06 (work in
              progress), February 2008.

Author's Address

   Alan Clark
   Telchemy Incorporated
   2905 Premiere Parkway, Suite 280
   Duluth, Georgia  30097


Clark                    Expires January 5, 2009               [Page 14]

Internet-Draft           Perf. Metric Framework                July 2008

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at

Clark                    Expires January 5, 2009               [Page 15]