INTERNET-DRAFT                                               David Black
Diffserv Working Group                                    The Open Group
Expires: November 1998                                      Steven Blake
                                                         IBM Corporation
                                                            Mark Carlson
                                                        Redcape Software
                                                            Elwyn Davies
                                                               Nortel UK
                                                              Zheng Wang
                                           Bell Labs Lucent Technologies
                                                            Walter Weiss
                                                     Lucent Technologies

                                                                May 1998


               An Architecture for Differentiated Services

                    <draft-ietf-diffserv-arch-00.txt>


Status of This Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).


Abstract

   This document defines an architecture for implementing scalable
   service differentiation in the Internet.  This architecture achieves
   scalability by aggregating traffic classification state which is
   conveyed by means of IP-layer packet marking using the DS field
   [DSFIELD].  Packets are classified and marked to receive a particular
   per-hop forwarding behavior on routers along their path.
   Sophisticated classification, policing, and shaping operations need
   only be implemented at network boundaries or hosts.  Network
   resources are allocated to traffic streams by service provisioning
   policies which govern how traffic is conditioned upon entry to a
   differentiated services-capable network, and how that traffic is


Black, et al.             Expires: November 1998               [Page  1]

INTERNET-DRAFT   An Architecture for Differentiated Services    May 1998


   forwarded within that network.  A wide variety of services can be
   implemented on top of these building blocks.

   This document should be read along with its companion documents, the
   differentiated services framework [DSFWK], the definition of the
   DS field [DSFIELD], and other documents which specify per-hop
   behaviors, such as [Baker].


1.  Introduction

1.1  Overview

   This document defines an architecture for implementing scalable
   service differentiation in the Internet.  "Service" is taken to
   signify some significant characteristics of packet transmission
   across a set of one or more paths within a network.  These
   characteristics may be specified in quantitative or statistical terms
   of throughput, delay, jitter, and/or loss, or may otherwise be
   specified in terms of some relative priority of access to network
   resources.  Service differentiation is desired to accommodate
   heterogeneous application requirements and user expectations, and to
   permit differentiated pricing of Internet service.

   This architecture is composed of a number of functional elements
   implemented in network nodes, including a small set of well-defined
   per-hop forwarding behaviors, and traffic conditioning functions
   including classification, metering, marking, shaping, and policing.
   This architecture achieves scalability by implementing complex
   conditioning functions only at network edge nodes, and by applying
   per-hop behaviors to aggregates of traffic which have been
   appropriately marked using the DS field in the IPv4 or IPv6 headers
   [DSFIELD].  Per-hop behaviors are defined to permit a reasonably
   granular means of allocating buffer and bandwidth resources among
   competing traffic streams.  Per-application flow or per-customer
   forwarding state need not be maintained within the core of the
   network.  Service provisioning and traffic conditioning policies are
   sufficiently decoupled from the forwarding behaviors within the
   network interior to permit a wide variety of service behaviors to be
   implemented, with room for future expansion.

   Section 1.2 is a glossary of terms used within this document.
   Section 1.3 lists requirements for this architecture, and Section 1.4
   provides a brief comparison to other approaches for service
   differentiation.  Section 2 discusses the components of the
   architecture in detail.  Section 3 proposes requirements for per-hop
   behavior specifications.  Section 4 discusses interoperability issues
   with networks which do not implement differentiated services as
   defined in this document and [DSFIELD].  Section 5 discusses issues
   with multicast traffic (this section is currently left for future
   study).  Section 6 addresses security and tunnel considerations.

   This document should be read along with its companion documents, the
   differentiated services framework [DSFWK], the definition of the DS
   field [DSFIELD], and other documents which specify per-hop behaviors,
   such as [Baker].  It has been heavily influenced by the thoughtful
   proposals of previous authors [Clark97, Ellesson, Ferguson, Heinanen,
   SIMA, 2BIT, Weiss].


1.2  Terminology

   This section gives a general conceptual overview of the terms used
   in this document.  Some of these terms are more precisely defined in
   later sections of this document.  The choice of terms and definitions
   was influenced by [MPLSFWK].

   Behavior Aggregate (BA)   a DS behavior aggregate.

   BA classifier             a classifier that selects packets based
                             only on the contents of the DS field.  Such
                             classifiers are used in DS interior nodes,
                             and are typically used for policing at a DS
                             ingress node.

   Boundary                  a link connecting the edge nodes of two
                             domains.

   Classifier                a logical element of traffic conditioning
                             that selects packets based on the content
                             of packet headers according to defined
                             rules.

   Customer DS domain        a DS domain that has an SLA in place with
                             another directly attached DS domain (the
                             provider DS domain) governing the rules by
                             which traffic from the customer DS domain
                             will be serviced within the provider DS
                             domain.  A single DS domain may be both a
                             customer DS domain and a provider DS domain
                             for different directions of traffic at the
                             same time.

   Differentiated Services   a paradigm for providing quality-of-service
   (DS)                      (QoS) in the Internet by employing a small,
                             well-defined set of building blocks from
                             which a variety of services may be built.

   DS behavior aggregate     a stream of packets that have the same DS
                             codepoint.

   DS field                  the IPv4 TOS octet or IPv6 Traffic Class
                             octet when interpreted according to
                             [DSFIELD].

   DS capable                able to support differentiated services
                             functions and behaviors as defined in
                             [DSFIELD], this document, and other
                             documents.

   DS codepoint              a specific bit-pattern of the DS field.

   DS edge node              a DS node that connects one DS domain to a
                             node either in another DS domain or in a
                             domain that is not DS capable.

   DS egress node            a DS edge node in its role in handling
                             traffic as it leaves a DS domain.

   DS destination host       a DS host that acts as a DS egress node.

   DS domain                 a contiguous set of nodes which operate
                             with a common set of service provisioning
                             policies and PHB definitions.

   DS host                   a host computer that can perform certain
                             traffic conditioning functions and
                             therefore acts as a special DS edge node.

   DS ingress node           a DS edge node in its role in handling
                             traffic as it enters a DS domain.

   DS interior node          a DS node that is not a DS edge node.

   DS node                   a DS capable node.

   DS region                 a set of contiguous DS domains which can
                             offer differentiated services over paths
                             across those DS domains.

   DS source host            a DS host that acts as a DS ingress node.

   Legacy node               a node which implements IPv4 Precedence as
                             defined in [RFC791] but which is otherwise
                             not DS capable.

   Marker                    a logical element of traffic conditioning
                             that sets the DS codepoint in the DS field
                             based on defined rules.

   MF classifier             a classifier which selects packets based on
                             the content of some arbitrary number of
                             header fields; typically some combination
                             of source address, destination address,
                             protocol ID, source port and destination
                             port.

   Mechanism                 a specific algorithm or operation (e.g.,
                             queueing discipline) that is implemented in
                             a node to realize a set of one or more per-
                             hop behaviors.

   Meter                     a logical element of traffic conditioning
                             that measures the properties (e.g., rate)
                             of a packet stream selected by a
                             classifier.

   Microflow                 a single instance of an application-to-
                             application flow of packets which is
                             identified by source address, source port,
                             destination address, destination port and
                             protocol ID.

   Per-Hop Behavior (PHB)    the externally observable forwarding
                             behavior applied at a DS capable node to a
                             DS behavior aggregate.

   PHB group                 a set of one or more PHBs that can only be
                             meaningfully specified and implemented
                             simultaneously, due to a common constraint
                             applying to all PHBs in the set such as a
                             packet scheduling or discard policy.

   Policing                  the process of applying traffic
                             conditioning functions such as marking or
                             discarding to a traffic stream in
                             accordance with the state of a
                             corresponding meter.

   Provider DS domain        a DS domain that has an SLA in place with
                             another directly attached DS domain (the
                             customer DS domain) governing the rules by
                             which traffic from the customer DS domain
                             will be serviced within the provider DS
                             domain.  A single DS domain may be both a
                             customer DS domain and a provider DS domain
                             for different directions of traffic at the
                             same time.

   Service                   the overall treatment of a defined subset
                             of a customer's traffic within a DS domain
                             or end-to-end.

   Service Level Agreement   a service contract between a customer and a
   (SLA)                     service provider that specifies the details
                             of a TCA and the corresponding service
                             behavior a customer should receive.  A
                             customer may be a user organization or
                             another DS domain.

   Service Provisioning      a policy which defines how traffic
   Policy                    conditioners are configured on DS edge
                             nodes and how traffic streams are mapped to
                             DS behavior aggregates to achieve a range
                             of service behaviors.

   Shaper                    a logical element of traffic conditioning
                             that delays packets within a traffic stream
                             to cause it to conform to some defined
                             traffic properties.

   Traffic conditioner       an entity that performs traffic
                             conditioning and which may contain
                             classifiers, markers, meters, and shapers.

   Traffic conditioning      control functions performed to enforce
                             rules specified in a TCA and to prepare
                             traffic for differentiated services,
                             including classifying, metering, marking,
                             policing, and shaping.

   Traffic Conditioning      an agreement specifying classifier rules
   Agreement (TCA)           and the corresponding traffic profiles and
                             metering, marking, policing and/or shaping
                             rules which are to apply to the traffic
                             streams selected by the classifier.

   Traffic profile           a description of the expected properties
                             of a traffic stream such as rate and burst
                             size.

   Traffic stream            an administratively significant set of one
                             or more microflows which traverse a path
                             segment.  A traffic stream may consist of
                             the set of active microflows which are
                             selected by a particular classifier.


1.3  Requirements

   The history of the Internet has been one of continuous growth in the
   number of hosts, the number and variety of applications, and the
   capacity of the network infrastructure, and this growth is expected
   to continue for the foreseeable future.  A scalable architecture for
   service differentiation must be able to accommodate this continued
   growth.

   The following requirements were identified and are addressed in this
   architecture:

   o  must accommodate a wide variety of service behaviors and
      provisioning policies, extending end-to-end or within a particular
      (set of) network(s),

   o  must allow decoupling of the service behavior from the particular
      application in use,

   o  must work with existing applications (assuming suitable deployment
      of traffic conditioners),

   o  must decouple traffic conditioning and service provisioning
      functions from forwarding behaviors implemented within the core
      network routers,

   o  must not depend on hop-by-hop application signaling,

   o  must require only a small set of forwarding behaviors whose
      implementation complexity does not dominate the cost of a network
      device, and which will not introduce bottlenecks for future high-
      speed system implementations,

   o  must avoid per-microflow or per-customer state within core network
      routers,

   o  must utilize only aggregated classification state within the
      network core,

   o  must permit simple packet classification implementations in core
      network routers (BA classifier),

   o  must permit reasonable interoperability with non-compliant network
      nodes,

   o  must accommodate incremental deployment.


1.4  Comparisons with Other Approaches

   The differentiated services architecture specified in this document
   can be contrasted with other existing models of traffic management
   and service differentiation.  We classify these alternative models
   into the following categories: relative priority, virtual circuit,
   Integrated Services/RSVP, and service marking.

   Implementations of the relative priority model include IPv4
   Precedence marking as defined in [RFC791], 802.5 Token Ring priority
   [TR], and 802.1p priority [802.1p].  In this model the application,
   host, or proxy node selects a relative priority or "precedence" for a
   packet (e.g., delay or discard priority), and the network nodes along
   the transit path apply the appropriate priority forwarding behavior
   corresponding to the priority value within the packet's header.  Our
   architecture can be considered as a refinement to this model, since
   we more clearly specify the role and importance of edge nodes and
   traffic conditioners, and since our per-hop behavior model permits
   more general forwarding behaviors than relative delay or discard
   priority.

   Implementations of the virtual circuit model include Frame Relay,
   ATM, and MPLS [FRELAY, ATM, PASTE].  In this model path forwarding
   state and traffic management or QoS state are established for traffic
   streams on each hop along a path.  Traffic aggregates of varying
   granularity are associated with a virtual circuit, and packets/cells
   within each virtual circuit are marked with a forwarding label that
   is used to look up the next hop, the per-hop forwarding behavior, and
   the replacement label at each hop.  This model permits finer
   granularity resource allocation to traffic streams, but the amount
   of forwarding state scales linearly with the number of edges of the
   network in the best case (assuming multipoint-to-point virtual
   circuits), and it scales with the square of the number of edges in
   the worst case, when edge-edge traffic streams with provisioned
   resources are employed.
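
   As a rough illustration of the scaling argument above, the following
   Python sketch counts virtual circuits for a network with a given
   number of edge nodes.  The function names, and the assumption of one
   circuit per ordered pair of edge nodes in the worst case, are ours
   and are only meant to make the linear versus quadratic growth
   concrete.

```python
def best_case_circuits(n_edges: int) -> int:
    """Best case: one multipoint-to-point circuit ending at each edge node."""
    return n_edges


def worst_case_circuits(n_edges: int) -> int:
    """Worst case (assumed): one circuit per ordered (ingress, egress) pair."""
    return n_edges * (n_edges - 1)


if __name__ == "__main__":
    # Print edge count, best-case circuits, and worst-case circuits.
    for n in (10, 100, 1000):
        print(n, best_case_circuits(n), worst_case_circuits(n))
```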

   The Integrated Services/RSVP model relies upon traditional datagram
   forwarding in the default case, but allows sources and receivers to
   exchange signaling messages which establish classification and
   forwarding state on each node along the path between them [IntServ,
   RSVP].  In the absence of state aggregation, the amount of state on
   each node scales in proportion to the ratio of the link rate to the
   average reservation size (in bps), multiplied by some fraction of the
   link rate which is "reservable".  This model also requires
   application support for the RSVP signaling protocol.
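
   The proportionality described above can be written out as a small
   calculation.  This is only a sketch: the function name and the
   example figures used to exercise it are hypothetical, and it
   estimates an upper bound on per-flow reservation state for one link
   rather than describing any particular RSVP implementation.

```python
def max_reservations(link_rate_bps: float,
                     avg_reservation_bps: float,
                     reservable_fraction: float) -> int:
    """Upper bound on simultaneous reservations (state entries) on one link.

    The bound is (link rate / average reservation size) multiplied by the
    fraction of the link rate that may be reserved.
    """
    if avg_reservation_bps <= 0:
        raise ValueError("average reservation size must be positive")
    if not 0.0 <= reservable_fraction <= 1.0:
        raise ValueError("reservable fraction must lie in [0, 1]")
    return int((link_rate_bps * reservable_fraction) // avg_reservation_bps)
```

   For a hypothetical 155 Mbps link with half of its rate reservable and
   an average reservation of 64 kbps, max_reservations(155e6, 64e3, 0.5)
   gives 1210 entries; the count grows in direct proportion to the link
   rate.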

   An example of a service marking model is IPv4 TOS as defined in
   [RFC1349].  In this example each packet is marked with a request for
   a "type of service", which may include "minimize delay", "maximize
   throughput", "maximize reliability", or "minimize cost".  Network
   nodes may select routing paths or forwarding behaviors which are
   suitably provisioned to satisfy the service request.  This model is
   subtly different from our architecture.  The defined TOS markings are
   very generic and do not span the range of possible service semantics.
   Furthermore, the service request is associated with each individual
   packet, whereas some service semantics may depend on the aggregate
   forwarding behavior of a sequence of packets.  The service marking
   model does not easily accommodate growth in the number and range of
   future services, and involves configuration of the "TOS->forwarding
   behavior" association in each core network router.


2.  Differentiated Services Architectural Model

   The differentiated services architecture is based on a simple model
   where traffic entering a network is conditioned at the edges of the
   network, and assigned to different behavior aggregates.  Each
   behavior aggregate is identified with a single DS codepoint.  Within
   the core of the network, packets are forwarded according to the per-
   hop behavior associated with the DS codepoint.  In this section, we
   discuss the key components in a differentiated services region,
   traffic conditioning functions, and how differentiated services are
   achieved through the combination of traffic conditioning and
   PHB-based forwarding.


2.1  Differentiated Services Regions

   A differentiated services region (DS region) is a set of contiguous
   DS domains, where each DS domain consists of a set of edge nodes and
   interior nodes.

2.1.1  DS Domain

   A DS domain is a contiguous set of DS nodes which operate with a
   common service provisioning policy and set of PHB group definitions.
   A DS domain has a well-defined boundary consisting of DS edge nodes
   which condition ingress traffic and ensure that packets which transit
   the domain are only marked using one of the PHB groups supported in
   the domain.  All nodes inside the DS domain select the forwarding
   behavior for packets based solely on the DS codepoint as defined for
   the PHB groups supported in the domain.  Inclusion of non-DS capable
   nodes within a DS domain may result in unpredictable performance and
   may impede the ability to satisfy SLAs.
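
   The selection of forwarding behavior by DS codepoint alone can be
   sketched as a simple table lookup.  The table contents below are
   placeholders chosen for illustration (they are not codepoints or
   PHBs defined by [DSFIELD]), and the fallback to a default behavior
   for an unlisted codepoint is an assumption of this sketch.

```python
from typing import Dict

# Hypothetical table for one DS domain.  The integer codepoints and the
# PHB names are placeholders, not values defined by [DSFIELD].
PHB_TABLE: Dict[int, str] = {0: "default", 1: "phb-a", 2: "phb-b"}


def select_phb(ds_codepoint: int, table: Dict[int, str] = PHB_TABLE) -> str:
    """Choose the PHB from the DS codepoint alone, with no per-flow state.

    Falling back to "default" for an unlisted codepoint is an assumption
    made for this sketch.
    """
    return table.get(ds_codepoint, "default")
```

   Because the lookup key is only the codepoint, the size of the table
   is bounded by the number of codepoints in use in the domain and does
   not grow with the number of microflows or customers.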

   A DS domain normally consists of one or more networks under the same
   administration, for example, an organization's intranet or an ISP.
   Multiple DS domains may be inter-connected through mutual agreements
   to form a DS region.  DS domains in a DS region may implement
   different PHB groups.  However, to permit services which span across
   the domains, the peering DS domains must each establish a peering SLA
   which includes a Traffic Conditioning Agreement (TCA) which specifies
   how transit traffic from one DS domain to another DS domain is
   conditioned at the boundary of the two DS domains.

   It is possible that several DS domains within a DS region may adopt a
   common service provisioning policy and PHB group definitions, thus
   eliminating the need for traffic conditioning between those DS
   domains.  In such cases, those DS domains are effectively under a
   single administration and may be considered as a single DS domain.

   The administration of the domain is responsible for ensuring that
   adequate resources are provisioned and/or reserved to support the
   SLAs offered by the domain.

2.1.2  DS Edge Nodes and Interior Nodes

   A DS domain consists of DS edge nodes and DS interior nodes. While
   DS edge nodes connect the DS domain to other DS or non-DS domains, DS
   interior nodes only connect to other DS interior or edge nodes within
   the DS domain.

   Both DS edge nodes and interior nodes must be able to forward packets
   based on the DS codepoint as defined by the PHB groups supported in
   the domain; otherwise unpredictable behavior may result. In addition,
   DS edge nodes must be able to perform traffic conditioning functions
   as described by the TCA between their DS domain and the peering
   domain which they connect to.

   Interior nodes may be able to perform limited traffic conditioning
   functions such as DS codepoint mutation.

   A host within a DS domain may act as a DS edge node for traffic to
   and from applications running on that host.  If a host is embedded in
   a DS domain and does not act as an edge node, then the host's first-
   hop router acts as the DS edge node for the host's traffic.

2.1.3  DS Ingress Node and Egress Node

   DS edge nodes may act both as a DS ingress node and as a DS egress
   node.  Traffic enters a DS domain at a DS ingress node and leaves a
   DS domain at a DS egress node. A DS ingress node is responsible for
   ensuring that the traffic entering the DS domain conforms to the TCA
   between it and the other domain which the ingress node is connected
   to.  A DS egress node may perform traffic conditioning functions on
   traffic forwarded to the peering domain, depending on the details of
   the TCA between two domains.


2.2  Traffic Conditioning

   Traffic conditioning functions are performed by DS edge nodes in a DS
   domain to ensure that the traffic entering a DS domain conforms to
   the rules specified in the TCA, in accordance with the domain's
   service provisioning policy, and to prepare the traffic for the PHB-
   based forwarding treatment in the interior routers.

2.2.1  General Architecture of Traffic Conditioners

   A traffic conditioner may contain the following elements: classifier,
   meter, marker, and shaper.  The classifier and the meter select the
   packets within a traffic stream and measure the stream against a
   traffic profile.  The marker and shaper perform control actions on
   the packets depending on whether the traffic stream is within its
   associated profile.

   A packet stream normally passes to a classifier first, and the
   matched packets are measured by a meter against the profile as
   defined in the TCA.  The packets within the profile may leave the
   traffic conditioner or may be marked by the marker. The packets that
   are out-of-profile may be either marked or shaped according to the
   rules specified in the TCA.  Note that discard policing can be
   performed by a specially configured shaper (see Sec. 2.2.3.4).  When
   packets leave the traffic conditioner of a DS ingress node, the DS
   field of each packet must be set to one of the DS codepoints defined
   by the PHB groups supported in the DS domain.

   Fig. 1 shows the block diagram of a traffic conditioner.  Note that a
   traffic conditioner need not contain all four elements.
   For example, packets may pass from the classifier directly to the
   marker or shaper (null meter).


                                                    +-------+
                                                 -->|       |---->
                    +-------+       +-------+  /    +-------+
                    |       |       |       |/        marker
     packets -----> |       |------>|       |-------------------->
                    |       |       |       |\
                    +-------+       +-------+  \    +-------+
                    classifier        meter      -->|       |---->
                                                    +-------+
                                                      shaper


                Fig. 1: Logical View of a Traffic Conditioner
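
   The logical view in Fig. 1 can be approximated by the following
   Python sketch.  All names (Packet, TrafficConditioner, and the field
   names) are hypothetical.  Shaping is omitted for brevity; an
   out-of-profile packet is either re-marked or, when no out-of-profile
   codepoint is configured, discarded, which stands in for a shaper with
   no buffer.  A meter of None models the null meter.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Packet:
    ds_codepoint: int
    length: int  # bytes


@dataclass
class TrafficConditioner:
    classifier: Callable[[Packet], bool]              # selects packets
    meter: Optional[Callable[[Packet], bool]] = None  # True means in-profile
    in_profile_codepoint: Optional[int] = None        # None: leave unchanged
    out_of_profile_codepoint: Optional[int] = None    # None: discard

    def process(self, pkt: Packet) -> Optional[Packet]:
        """Return the (possibly re-marked) packet, or None if discarded."""
        if not self.classifier(pkt):
            return pkt  # not selected by this conditioner; left untouched
        # A null meter treats every selected packet as in-profile.
        in_profile = True if self.meter is None else self.meter(pkt)
        if in_profile:
            if self.in_profile_codepoint is not None:
                pkt.ds_codepoint = self.in_profile_codepoint
            return pkt
        if self.out_of_profile_codepoint is None:
            return None  # discard policing
        pkt.ds_codepoint = self.out_of_profile_codepoint
        return pkt
```

   In practice the meter would be stateful (for example, a token
   bucket); a stateless predicate is enough to exercise the data path.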


2.2.2  Traffic Conditioning Agreement (TCA)

   Differentiated services are extended across a DS domain boundary by
   establishing an SLA between the customer and provider DS domains.
   The SLA includes a traffic conditioning agreement which usually
   specifies traffic profiles and the actions to apply to in-profile and
   out-of-profile packets.

   2.2.2.1  Traffic Profiles

   A traffic profile specifies rules for classifying and measuring a
   traffic stream.  It identifies which packets are eligible and the
   rules for determining whether a particular packet is in-profile or
   out-of-profile.  For example, a profile based on a token bucket may
   look like:

     codepoint=X, use token-bucket r, b

   The above profile indicates that all packets in the behavior
   aggregate with DS codepoint X should be measured against a token
   bucket meter with rate r and burst size b.  In this example out-of-
   profile packets are those packets in the behavior aggregate which
   arrive when insufficient tokens are available in the bucket.
   Different conditioning actions may be applied to the in-profile
   packets and out-of-profile packets, or different accounting actions
   may be triggered.
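
   A token bucket meter of the kind referred to above might be sketched
   as follows.  This is one common formulation, given here under stated
   assumptions: tokens are counted in bytes, the bucket starts full, and
   the caller supplies arrival times in seconds.  The class and method
   names are hypothetical.

```python
class TokenBucketMeter:
    """Token bucket with rate r (bytes per second) and burst size b (bytes)."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # r: tokens added per second
        self.burst = burst    # b: maximum number of tokens in the bucket
        self.tokens = burst   # assumption: the bucket starts full
        self.last = 0.0       # time of the most recent refill, in seconds

    def in_profile(self, now: float, size: int) -> bool:
        """Return True if a packet of `size` bytes arriving at `now` conforms."""
        if now > self.last:
            # Add tokens for the elapsed time, capped at the burst size.
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
        if size <= self.tokens:
            self.tokens -= size  # in-profile packets consume tokens
            return True
        return False             # insufficient tokens: out-of-profile
```

   With r = 1000 and b = 1500, a 1500-byte packet at time 0 is
   in-profile and empties the bucket, so a second packet at time 0 is
   out-of-profile; one second later 1000 tokens have accumulated and a
   1000-byte packet is in-profile again.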

   2.2.2.2  Actions to In-Profile and Out-of-Profile Packets

   In-profile packets may be allowed to enter the DS domain without
   further conditioning as they conform to the TCA; or, alternatively,
   their DS field may be marked with a new DS codepoint.  The latter
   happens when the DS field is set to a non-Default value for the first
   time [DSFIELD], or when the packets enter a DS domain that uses a
   different PHB group for this traffic stream, so the DS codepoint has
   to be mapped to the new PHB group.

   The actions applied to out-of-profile packets may include delaying
   the packets until they are in-profile (shaping), discarding the
   packets, marking the DS field to a particular codepoint, or
   triggering some accounting action.

2.2.3  Components of a Traffic Conditioner

   2.2.3.1  Classifiers

   Packet classifiers select packets in a traffic stream based on the
   content of some portion of the packet header.  The classification may
   be based on the DS field only (Behavior Aggregate Classification), or
   on any combination of one or several fields in the packet header such
   as source address, destination address, DS field, protocol ID, and
   transport-layer header fields such as source port and destination
   port numbers (Multi-Field Classification).  Classifiers are used to
   steer packets matching some specified rule to another element of the
   traffic conditioner for further processing.  Classifiers must be
   configured by some management procedure in accordance with the
   appropriate TCA.
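
   The difference between the two kinds of classification can be
   sketched as follows.  The Header type and its field names are
   hypothetical stand-ins for parsed packet header fields; the addresses
   used to exercise the sketch are documentation addresses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Header:
    src: str
    dst: str
    protocol: int
    src_port: Optional[int]
    dst_port: Optional[int]
    ds_codepoint: int


def ba_classifier(codepoint: int) -> Callable[[Header], bool]:
    """Behavior Aggregate classification: match on the DS codepoint only."""
    def match(h: Header) -> bool:
        return h.ds_codepoint == codepoint
    return match


def mf_classifier(**fields) -> Callable[[Header], bool]:
    """Multi-Field classification: every named field must equal its value."""
    def match(h: Header) -> bool:
        return all(getattr(h, name) == value
                   for name, value in fields.items())
    return match
```

   For example, mf_classifier(protocol=6, dst_port=80) selects packets
   by two header fields, whereas ba_classifier(0) needs to inspect only
   the DS field.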

   The classifier should authenticate the information which it uses to
   classify the packet (see Sec. 6).

   Note that in the event of upstream packet fragmentation, multi-field
   classifiers which examine the contents of transport-layer header
   fields may incorrectly classify packet fragments subsequent to the
   first.  A possible solution to this problem is to maintain
   fragmentation state; however, this is not a general solution due to
   the possibility of upstream fragment re-ordering or divergent routing
   paths.

   2.2.3.2  Meters

   Traffic meters measure the traffic properties of the set of packets
   selected by a classifier against a traffic profile specified in the
   TCA.  A meter indicates to other conditioning functions whether each
   individual packet is in- or out-of-profile.

   A null meter will identify all packets as in-profile.  Such a meter
   may be used when the traffic profile does not specify conforming rate
   or burst parameters.
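
   A token bucket is one common way to realize such a meter.  The
   following is a minimal sketch, assuming a profile expressed as a
   rate in bytes per second and a burst size in bytes; the class and
   method names are hypothetical, and the caller supplies the arrival
   time so that the behavior is deterministic.

```python
class TokenBucketMeter:
    """Sketch of a token bucket meter (rate in bytes/s, burst in bytes)."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst      # the bucket starts full
        self.last = 0.0          # time of the previous update, in seconds

    def in_profile(self, size, now):
        # Add the tokens earned since the last packet, capped at the burst.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        # A packet is in-profile only if enough tokens are available.
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


class NullMeter:
    """A null meter identifies every packet as in-profile."""

    def in_profile(self, size, now):
        return True
```

   The result of in_profile() is the per-packet indication that the
   meter passes to the marker, shaper, or other conditioning function.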

   2.2.3.3  Markers

   Packet markers set the DS field of a packet to a particular
   codepoint, adding the marked packet to a particular DS behavior
   aggregate.  The marker may be configured to mark all packets which
   are steered to it to a single codepoint, or may be configured to mark
   a packet to one of a set of codepoints within a PHB group according
   to the state of a meter.
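
   The two marker configurations described above can be sketched as
   follows.  The function names are hypothetical, and the codepoint
   values passed in are placeholders chosen by the caller rather than
   values assigned by [DSFIELD].

```python
def make_fixed_marker(codepoint):
    """Marker that sets every packet steered to it to one codepoint."""
    def mark(in_profile):
        return codepoint
    return mark


def make_metered_marker(in_codepoint, out_codepoint):
    """Marker that picks a codepoint within a PHB group from meter state."""
    def mark(in_profile):
        return in_codepoint if in_profile else out_codepoint
    return mark
```

   In this sketch the meter state is reduced to a single in-profile
   flag; a PHB group with more than two codepoints would need a richer
   meter result.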

   2.2.3.4  Shapers

   Shapers delay some or all of the packets in a traffic stream in order
   to bring the stream into compliance with its associated traffic
   profile.  A shaper usually has a finite-size buffer, and packets may
   be discarded if there is not enough buffer space to hold the delayed
   packets.  Note that discard policers can be implemented as a special
   case of a shaper by setting the shaper buffer size to zero (or a few)
   packets.
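
   The relationship between shapers and discard policers can be
   illustrated with a token bucket shaper whose buffer is counted in
   packets.  This is a sketch under stated assumptions: the class name,
   the offer()/release() interface, and the use of a token bucket as
   the traffic profile are all choices made for the example.

```python
from collections import deque


class Shaper:
    """Sketch of a token bucket shaper with a finite packet buffer."""

    def __init__(self, rate, burst, buffer_packets):
        self.rate = rate                  # bytes per second
        self.burst = burst                # bytes
        self.buffer_packets = buffer_packets
        self.tokens = burst
        self.last = 0.0
        self.queue = deque()              # sizes of the delayed packets

    def _refill(self, now):
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def offer(self, size, now):
        """Return 'sent', 'queued' or 'dropped' for an arriving packet."""
        self._refill(now)
        # Send at once only if nothing is already waiting (keeps order).
        if not self.queue and size <= self.tokens:
            self.tokens -= size
            return "sent"
        if len(self.queue) < self.buffer_packets:
            self.queue.append(size)
            return "queued"
        return "dropped"

    def release(self, now):
        """Return the sizes of delayed packets that are now in-profile."""
        self._refill(now)
        out = []
        while self.queue and self.queue[0] <= self.tokens:
            size = self.queue.popleft()
            self.tokens -= size
            out.append(size)
        return out
```

   With buffer_packets set to zero, offer() can only answer "sent" or
   "dropped", which is the discard policer described above.  Note that
   in this sketch a packet larger than the burst size would never be
   released; a real shaper would have to handle that case.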

2.2.4  Location of Traffic Conditioners

   Traffic conditioners may be located within a customer DS domain, and
   at the boundary of a DS domain.  Traffic conditioners may also be
   located in nodes in a non-DS domain.

   2.2.4.1  Traffic Conditioners within a Customer DS Domain

   Traffic sources and nodes within a customer DS domain may perform
   traffic conditioning functions.  The packets originating from the
   customer DS domain across a boundary may have their DS field marked
   by the traffic sources or by intermediate routers before leaving the
   customer DS domain.

   For example, suppose that a customer domain has a policy that the
   CEO's packets should have higher priority. The CEO's host may mark
   the DS field of all outgoing packets with a DS codepoint that
   indicates higher priority.  Alternatively, the first-hop router
   directly connected to the CEO's host may classify the traffic and
   mark the CEO's packets with the correct DS codepoint.
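
   As a sketch of the host-marking alternative, an application on the
   CEO's host could request a marking through the sockets interface.
   The example assumes that the codepoint occupies the six most
   significant bits of the former IPv4 TOS octet (the layout later
   published in RFC 2474) and that the host's stack honors the IP_TOS
   socket option; the function names are hypothetical, and the
   codepoint to use is a matter of local policy.

```python
import socket


def ds_octet(codepoint):
    """Place a 6-bit codepoint in the upper bits of the TOS octet.

    Assumes the RFC 2474 layout; the low two bits are left as zero.
    """
    if not 0 <= codepoint <= 0x3F:
        raise ValueError("codepoint must fit in six bits")
    return codepoint << 2


def open_marked_udp_socket(codepoint):
    """Open a UDP socket whose outgoing packets carry the codepoint."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, ds_octet(codepoint))
    return sock
```

   Whether the marking survives beyond the host still depends on the
   conditioning performed by the first-hop router and the edge node.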

   There are some advantages to marking the DS field close to the
   traffic source.  First, a traffic source can more easily take an
   application's preferences into account when deciding which packets
   should receive better forwarding treatment.  Also, classification of
   packets is much simpler before the traffic has been aggregated with
   packets from other sources, since the number of classification rules
   which need to be applied within a single node is reduced.

   Since packet marking may be distributed across different nodes, the
   customer DS domain is responsible for ensuring that the aggregated
   traffic towards its provider DS domain conforms to the appropriate
   TCA.  Additional allocation mechanisms such as bandwidth brokers or
   RSVP may be used to dynamically allocate resources for a particular
   DS behavior aggregate within the customer's network. The edge node of
   the customer DS domain should also monitor conformance to the TCA,
   and triage packets as necessary.

   2.2.4.2  Traffic Conditioners at the Boundary of a DS Domain

   Traffic streams may be marked and otherwise conditioned on either end
   of a boundary link (the DS egress node of the customer DS domain or
   the DS ingress node of the provider DS domain).  The TCA between the
   domains should specify which domain has responsibility for mapping
   traffic streams to DS behavior aggregates and conditioning those
   aggregates in conformance with the TCA.  However, a DS ingress node
   must assume that the incoming traffic may not conform to the TCA and
   must be prepared to enforce the TCA in accordance with local policy.

   There is an advantage to performing complex conditioning operations
   in the customer DS domain since it is then no longer necessary to
   divulge the local classification and service provisioning rules to
   the provider DS domain.  In this circumstance the provider domain may
   only need to re-mark or police incoming behavior aggregates to
   enforce the TCA.  However, more sophisticated services which are
   path- or source-dependent may require multi-field classification in
   the provider's ingress nodes.

   If a DS ingress node is connected to a non-DS domain, the DS ingress
   node must be able to perform all traffic conditioning functions on
   the incoming traffic.

   2.2.4.3  Traffic Conditioners in non-DS Domains

   Traffic sources or intermediate nodes in a non-DS domain may employ
   traffic conditioners to pre-mark traffic before it reaches the
   ingress of a provider DS domain.


2.3  Per-Hop Behaviors

   A per-hop behavior (PHB) is a description of the externally
   observable forwarding behavior of a DS node applied to a particular
   DS behavior aggregate.  "Forwarding behavior" is a general concept in
   this context.  For example, in the event that only one behavior
   aggregate occupies a link, the observable forwarding behavior (i.e.,
   loss, delay, jitter) will usually depend only on the relative loading
   of the link (i.e., in the event that the behavior assumes a work-
   conserving scheduling discipline).  Useful behavioral distinctions
   are only observed when multiple behavior aggregates compete for
   buffer and bandwidth resources on a node.  The PHB is the means by
   which a node allocates resources to behavior aggregates, and it is on
   top of this basic hop-by-hop resource allocation mechanism that
   useful differentiated services may be constructed.

   The simplest example of a PHB is one which guarantees a minimal
   bandwidth allocation of X% of a link (over some reasonable time
   interval) to a behavior aggregate.  This PHB can be fairly easily
   measured under a variety of competing traffic conditions.  A slightly
   more complex PHB would guarantee a minimal bandwidth allocation of X%
   of a link, with proportional fair sharing of any excess link
   capacity.  Another simple example is taken from [DSFIELD]: the
   Expedited Forwarding PHB.  This PHB provides negligible loss, delay,
   and delay jitter (similar to that observed by a single packet
   traversing an otherwise idle router) for a behavior aggregate which
   is the multiplex of multiple peak-rate regulated traffic streams,
   under the constraint that the load of the behavior aggregate is a
   small fraction of the link capacity.  This last constraint is a
   consequence of queueing physics; a multiplex of peak-rate regulated
   traffic streams may still exhibit arrival burstiness, and the
   resulting delay and jitter will only be negligible under the
   circumstance where the relative load of the aggregated traffic is
   small, even when there is no competing traffic from other behavior
   aggregates.  In general, the observable behavior of a PHB may depend
   on certain constraints on the traffic characteristics of the
   associated behavior aggregate, or the characteristics of other
   behavior aggregates.
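
   To make the "minimal bandwidth with sharing of excess" example
   concrete, the following sketch computes the long-run allocation such
   a PHB would be expected to deliver.  It assumes one interpretation
   of "proportional fair sharing", namely that spare capacity is
   divided among aggregates with unmet demand in proportion to their
   guaranteed fractions; the function name and its arguments are
   hypothetical.

```python
def allocate(capacity, guarantees, demands):
    """Expected allocation under a guaranteed-minimum PHB (sketch).

    guarantees maps each aggregate to its guaranteed fraction of the
    link; demands maps each aggregate to its offered load, in the same
    units as capacity.
    """
    # Each aggregate first receives its demand, up to its guarantee.
    alloc = {k: min(demands[k], guarantees[k] * capacity) for k in guarantees}
    spare = capacity - sum(alloc.values())
    active = {k for k in guarantees if demands[k] - alloc[k] > 1e-9}
    # Spare capacity is shared by weight until demand or capacity runs out.
    while spare > 1e-9 and active:
        weight = sum(guarantees[k] for k in active)
        if weight <= 0:
            break
        granted = 0.0
        for k in list(active):
            offer = spare * guarantees[k] / weight
            take = min(offer, demands[k] - alloc[k])
            alloc[k] += take
            granted += take
            if demands[k] - alloc[k] <= 1e-9:
                active.discard(k)
        spare -= granted
    return alloc
```

   For example, with a capacity of 100, guarantees of 30%, 20% and 50%,
   and offered loads of 80, 50 and 10, the third aggregate uses only 10
   and the remaining 40 units of spare capacity are split 24/16 between
   the first two, giving allocations of 54, 36 and 10.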

   PHBs may be specified in terms of their resource (e.g., buffer,
   bandwidth) priority relative to other PHBs, or in terms of their
   relative observable traffic characteristics (e.g., delay, loss)
   [Baker].  These PHBs should be specified as a group (PHB group) for
   consistency.  The priority relationship within a PHB group will tend
   to be hierarchical, and the associated DS codepoints should be
   assigned in increasing order of relative priority for clarity of
   interpretation.  The priority relationship between PHBs in the group
   may be absolute (e.g., absolute discard priority) or may be less
   rigid (e.g., higher probability of loss).  A single PHB defined in
   isolation is a degenerate form of a PHB group.

   PHBs are implemented in nodes by means of some buffer management and
   packet scheduling mechanisms.  PHBs should be defined in terms of
   behavior characteristics relevant to service provisioning policies,
   and not in terms of particular implementation mechanisms.  In
   general, a variety of implementation mechanisms may be suitable for
   implementing a particular PHB group.  Furthermore, it is likely that
   more than one PHB group may be implemented on a node and utilized
   within a domain.  PHB groups should be defined such that the proper
   resource allocation between groups can be inferred, and integrated
   mechanisms can be implemented which can simultaneously support two
   or more groups.


2.4  Network Resource Allocation

   The implementation, configuration, operation and administration of
   the supported PHB groups in the nodes of a DS Domain should
   effectively partition the resources of those nodes and the inter-node
   links between the traffic aggregates, in accordance with the domain's
   service provisioning policy.  Traffic conditioners control the usage
   of these resources through the administrative control of TCAs and
   possibly through operational feedback from the nodes and traffic
   conditioners in the domain.

   The configuration of and interaction between the traffic conditioners
   and the interior nodes should be managed by the administrative
   control of the domain and may require operational control through
   protocols and a control entity.  There is a wide range of possible
   control models [DSFWK].  The precise nature and implementation of the
   interaction between these components is outside the scope of this
   architecture.  However, scalability requires that the control of the
   domain does not require micro-management of the network resources.
   The most scalable control model would operate nodes in open-loop in
   the operational timeframe, and would only require administrative-
   timescale management as SLAs are varied.  This simple model may be
   unsuitable in some circumstances, and some automated but relatively
   long time-constant operational control (minutes rather than seconds)
   may be desirable to balance the utilization of the network against
   the recent load profile.


3.  Per-Hop Behavior Definition Requirements

   In order for a per-hop behavior (PHB) group to be considered for
   standardization, a detailed definition of the behavior should be
   provided as a basis for implementation consistency.  This section
   provides a template for defining a new PHB group.  Before a PHB group
   is considered for standardization it should satisfy the PHB
   definition requirements in this section, to preserve the integrity of
   this architecture.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   3.1.  A PHB definition MUST NOT require inspection or modification of
   any part of the packet other than the DS field.

   3.2.  The definition of each newly proposed PHB group MUST include an
   overview of the behavior and the purpose of the behavior being
   proposed.  The overview MUST include a statement of the problem or
   problems that the PHB group targets.  The overview MUST include the
   basic concepts behind the PHB group.  These concepts SHOULD include,
   but are not restricted to, queueing behavior, discard behavior, and
   output link selection behavior.  Lastly, the overview MUST specify
   the method by which the PHB group solves the problem or problems
   specified in the problem statement.

   Any configuration or management issues which affect the basic PHB
   definition MUST be specified in the overview of the behavior.  The
   actual details of the management and configuration of PHB groups in
   routers or hosts MUST be addressed in a separate, parallel document.

   3.3.  A PHB group definition MUST indicate whether a PHB group
   consists of one or more codepoints.  In the event that multiple
   codepoints are specified, the interactions between the codepoints
   within the PHB group and constraints that must be respected globally
   across all the codepoints within the PHB group MUST be clearly
   explained in the description of the PHB group.  As an example, the
   definition MUST specify whether packet reordering within a microflow
   with packets marked by two or more codepoints within the group is
   likely.

   3.4.  A PHB group may be standardized for local use within a domain
   in order to provide some domain specific functionality or domain
   specific services.  In this event, the PHB definition is useful for
   providing vendors with a consistent definition of the PHB group.  The
   PHB definition can also provide semantics for PHB translation and
   service mappings with peer domains which do not support the PHB
   group.  However, any PHB group which is defined as local use MUST be
   considered as an informational standard.  In contrast, a PHB group
   which is proposed for general use will follow a stricter
   standardization process.  Therefore all proposed PHB definitions MUST
   specifically state whether they are to be considered for general use
   or local use.

   It is recognized that PHB groups can be designed with the intent of
   providing host-to-host, WAN edge-to-WAN edge, or domain edge-to-
   domain edge services.  Use of the term "end-to-end" in a PHB
   definition MUST be interpreted to mean "host-to-host".

   Other PHB groups may be defined and deployed locally within domains,
   for experimental or operational purposes.  There is no requirement
   that these PHB groups must be publicly documented, but they SHOULD
   utilize DS codepoints from one of the EXP/LU pools as defined in
   [DSFIELD].

   3.5.  It may be possible or appropriate for a packet marked with a
   codepoint within a PHB group to be re-marked to another codepoint
   within that group either within a domain or across two cooperating
   domains.  Typically there are three reasons for PHB group mutability:

   1. The codepoints of the PHB group are collectively intended to carry
      state about the network.
   2. Changes in the network state require promotion or demotion of
      traffic marked with a codepoint within the PHB group.
   3. A PHB group is not implemented on both sides of a domain
      boundary; all codepoints of the PHB group have to be mapped to
      some other PHB or PHB group at the boundary.

   In contrast, it may also be necessary for specific PHB groups to be
   preserved within a domain and/or across multiple domains.  Typically
   this is because the PHB groups carry some host-to-host, WAN edge-to-
   WAN edge, or domain edge-to-domain edge semantics which are difficult
   to duplicate when the PHB group is mapped to a different PHB group.
   Further, these semantics may also be difficult to duplicate if packet
   markings are promoted or demoted within the same PHB group.

   A PHB definition MUST clearly state whether packets marked by a
   codepoint within a PHB group MAY, or SHOULD be promoted, demoted (to
   another codepoint within the group), or preserved within a domain.  A
   PHB definition MUST clearly state whether packets marked by a
   codepoint within a PHB group MAY, or SHOULD be promoted, demoted, or
   preserved across multiple, cooperating domains.  A PHB definition
   MUST clearly state whether codepoints within a PHB group MAY, or
   SHOULD be mapped to a different PHB group.

   If it is desirable for a PHB group to be changed, the definition
   SHOULD clearly state the circumstances under which a change is
   desirable.  If it is undesirable for a PHB group to be changed, the
   definition MUST clearly state what the risks are when a PHB group is
   modified.  A PHB definition may include constraints on actions that
   change the PHB group.  These constraints may be specified as actions
   the router SHOULD, or MUST perform.

   3.6.  The PHB definition MUST also include a section defining the
   implications of tunneling on the PHB group.  This section should
   specify the implications on the PHB group of a newly created outer
   header when the original PHB group of the inner header is
   encapsulated in a tunnel.  This section should also discuss what
   possible changes should be applied to the inner header at the egress
   of the tunnel, when both the PHB groups from the inner header and the
   outer header are accessible.

   3.7.  The process of defining PHB groups is incremental in nature.
   When new PHB groups are defined, their known interactions with
   previously defined PHB groups MUST be documented.  When a new PHB
   group is created, it can be entirely new in scope or it can be an
   extension to an existing PHB group.  If the PHB group is entirely
   independent of some or all of the existing PHB definitions, a section
   MUST be included in the PHB definition which details how the new PHB
   group co-exists with those PHB groups already defined.  For example,
   this section might indicate the possibility of packet re-ordering
   within a microflow with packets marked by codepoints within two
   separate PHB groups.  If concurrent operation of two (or more)
   different PHB groups in the same router is impossible or detrimental
   this MUST be stated.  If the concurrent operation of two (or more)
   different PHB groups requires some specific behaviors by the router
   when traffic specifying these different PHB groups is in the router
   at the same time, these behaviors MUST be stated.

   If the proposed PHB group is an extension to an existing PHB group, a
   section MUST be included in the PHB group definition which details
   how this extension inter-operates with the behavior being extended.
   Further, if the extension alters or more narrowly defines the
   existing behavior in some way, this MUST also be clearly specified in
   the PHB definition.

   3.8.  Each PHB definition MUST include a section specifying minimal
   conformance to the PHB group.  This conformance section is intended
   to provide a means for specifying the details of a behavior while
   allowing for implementation variation to the extent permitted by the
   PHB definition.  This conformance section can take the form of rules,
   tables, pseudo-code or tests.

   3.9.  A PHB definition MUST include a section detailing the security
   implications of the behavior.  This section should include a
   discussion of the mutability of the inner header's PHB group at the
   egress of a tunnel.  Further, this section should also discuss how
   the proposed PHB group could be used in denial-of-service attacks,
   reduction of service contract attacks, and service contract violation
   attacks.  Lastly, this section should discuss the means for detecting
   such attacks as they are relevant to the proposed behavior.

   3.10.  It is strongly RECOMMENDED that an appendix be provided for
   each PHB definition that considers the implications of the proposed
   behavior on current and potential services.  These services could
   include, but are not restricted to, user-specific, device-specific,
   domain-specific, or end-to-end services.  It is also strongly
   RECOMMENDED that the appendix include a section describing how the
   services are verified by users, devices, and/or domains.

   3.11.  If the PHB definition is targeted for local use within a
   domain, it is RECOMMENDED that the appendix include a description of
   how the PHB group is mapped to existing general use PHB groups as
   well as other local use PHB groups.

   3.12.  It is RECOMMENDED that an appendix be provided for each PHB
   definition which considers the impact of the proposed new PHB groups
   on existing higher-layer protocols.  Under some circumstances PHB
   definitions may allow for possible changes to higher-layer protocols
   which may increase or decrease the utility of the proposed PHB group.


4.  Interoperability with Non-Differentiated Services-Compliant Nodes

   We define a non-differentiated services-capable node (non-DS-capable
   node) as a node which does not interpret the DS field as specified in
   [DSFIELD] and/or does not implement some or all of the standardized
   PHBs.  This may be due to the capabilities or configuration of the
   node.  We distinguish such a node from one which does not implement
   differentiated forwarding behaviors which can be selected by the
   value of the IPv4 TOS byte or the IPv6 Traffic Class byte.  We define
   a legacy node as one which implements IPv4 Precedence as defined in
   [RFC791], but which is otherwise non-DS capable.

   Differentiated services depend on the resource allocation mechanisms
   provided by per-hop behavior implementations on nodes.  The quality
   or statistical assurance level of a service may break down in the
   event that traffic transits a non-DS-capable node, or a non-DS-
   capable domain.

   We will examine two separate cases.  The first case concerns the use
   of non-DS-capable nodes within a DS domain.  Note that PHB forwarding
   is primarily useful for allocating scarce node and link resources in
   a controlled manner.  On high-speed, lightly loaded links, the worst-
   case packet delay, jitter, and loss may be negligible, and the use of
   a non-DS-capable node on the upstream end of such a link may not
   result in service degradation.  In more realistic circumstances, the
   lack of PHB forwarding in a node may make it impossible to offer low-
   delay, low-loss, or provisioned bandwidth services across paths which
   traverse the node.  However, use of a legacy node may be an
   acceptable alternative, assuming that the DS domain restricts itself
   to using only the precedence-compatible PHBs defined in [Baker], and
   assuming that the particular precedence implementation results in
   forwarding behaviors which are compatible with the services offered
   along paths which traverse that node.

   The second case concerns the behavior of services which traverse non-
   DS-capable domains.  We assume for the sake of argument that a non-
   DS-capable domain does not deploy traffic conditioning functions on
   domain edge nodes; therefore, even in the event that the domain
   consists of legacy or DS-capable interior nodes, the lack of traffic
   enforcement at the edges will limit the ability to consistently
   deliver some types of services across the domain.  A DS domain and a
   non-DS-capable domain may negotiate an agreement which governs how
   egress traffic from the DS-domain should be marked before entry into
   the non-DS-capable domain.  This agreement might be monitored for
   compliance by traffic sampling instead of by rigorous traffic
   conditioning.  Alternatively, where there is knowledge that the non-
   DS-capable domain consists of legacy nodes, the upstream DS domain
   may opportunistically re-mark differentiated services traffic to one
   or more IPv4 precedence values.  Where there is no knowledge of the
   traffic management capabilities of the domain, and no agreement in
   place, a DS domain egress node may choose to re-mark the DS field to
   zero, under the assumption that the non-DS-capable domain will treat
   the traffic uniformly with best-effort service.
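
   The three egress options described above can be sketched as a
   single re-marking function.  The function and parameter names are
   hypothetical, the octet values in any mapping are chosen by the
   operator, and re-marking to zero when an agreement has no entry for
   a value is an assumption of this sketch.  The only layout assumed is
   that IPv4 Precedence occupies the three most significant bits of the
   TOS octet [RFC791].

```python
def egress_remark(ds_octet, agreed_marking=None, precedence_for=None):
    """Choose the octet to send into a non-DS-capable domain (sketch).

    agreed_marking: mapping negotiated with the downstream domain.
    precedence_for: mapping to IPv4 Precedence values (0..7), used when
                    the downstream domain is known to be legacy.
    """
    if agreed_marking is not None:
        # An agreement governs the marking; unknown values fall to zero.
        return agreed_marking.get(ds_octet, 0)
    if precedence_for is not None:
        # Opportunistic re-marking to an IPv4 Precedence value.
        return precedence_for.get(ds_octet, 0) << 5
    # No knowledge and no agreement: assume uniform best-effort service.
    return 0
```

   The order of the checks reflects the text: an explicit agreement
   takes priority, knowledge of legacy nodes comes next, and zero is
   the fallback.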

   In the event that a non-DS-capable domain peers with a DS domain,
   traffic flowing from the non-DS-capable domain should be conditioned
   at the DS ingress node of the DS domain according to the appropriate
   SLA or policy.


5.  Multicast Considerations

   For future study.


6.  Security and Tunneling Considerations

   This section addresses security issues raised by the introduction of
   differentiated services, primarily the potential for denial-of-
   service attacks, and the related potential for theft of service by
   unauthorized traffic (Section 6.1).  In addition, the operation of
   differentiated services in the presence of IPsec and its interaction
   with IPsec are also discussed (Section 6.2), as well as auditing
   requirements (Section 6.3).  This section considers issues introduced
   by the use of both IPsec and non-IPsec tunnels.


6.1  Theft and Denial of Service

   The primary goal of differentiated services is to allow different
   levels of service to be provided for traffic streams on a common
   network infrastructure.  A variety of resource management techniques
   may be used to achieve this, but the end result will be that some
   packets receive different (e.g., better) service than others.  The
   mapping of network traffic to the specific behaviors that result in
   different (e.g., better or worse) service is indicated primarily by
   the DS field, and hence an adversary may be able to obtain better
   service by modifying the DS field to values indicating behaviors used
   for enhanced services or by injecting packets with DS fields set to
   such values.  Taken to its limits, this theft of service becomes a
   denial-of-service attack when the modified or injected traffic
   depletes the resources available to forward it and other traffic
   streams.  The defense against such theft- and denial-of-service
   attacks consists of a combination of edge policing and security of
   the network infrastructure within a DS domain.

   As described in Section 2.1, DS ingress nodes must ensure that all
   traffic entering a DS domain has DS field values that are acceptable
   to that domain's service provisioning policy.  This makes the ingress
   nodes the first line of defense against theft-of-service and denial-
   of-service attacks based on modified DS field values (e.g., values to
   which the traffic is not entitled).  An important special case is
   that any traffic-originating node in a DS domain is the ingress node
   for that traffic, and must ensure that the traffic carries acceptable
   DS field values.

   A domain's service provisioning policy may require the ingress nodes
   to change the DS field values on some entering packets (e.g., an
   ingress router may set the DS field values of a customer's traffic in
   accordance with the appropriate SLA).  Ingress nodes should police
   accordance with the appropriate SLA).  Ingress nodes should police
   all other inbound traffic to ensure that the DS field values are
   acceptable; packets found to have unacceptable values must either be
   discarded or must have their DS fields modified to acceptable values
   before being forwarded.  For example, an ingress node receiving
   traffic from a domain with which no enhanced service agreement exists
   may reset the DS field to DE(fault) service [DSFIELD].  A service
   provisioning policy may require traffic authentication to validate
   the use of some DS field values (e.g., those corresponding to
   enhanced services), and such authentication may be performed by
   technical means (e.g., IPsec) and/or non-technical means (e.g., the
   inbound link is known to be connected to exactly one customer site).
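
   The ingress policing rule described above (discard, or re-mark to an
   acceptable value) can be sketched as follows.  The function name,
   the use of zero for the Default value, and the choice of the Default
   value as the re-marking target are assumptions of this sketch; the
   set of acceptable values would come from the SLA or local policy.

```python
DEFAULT_DS_FIELD = 0  # assumed value for Default (best-effort) service


def police_ingress(ds_field, acceptable, discard=False):
    """Return the DS field to forward with, or None to discard (sketch)."""
    if ds_field in acceptable:
        return ds_field
    if discard:
        return None
    # Otherwise the field is modified to an acceptable value.
    return DEFAULT_DS_FIELD
```

   For traffic from a domain with which no enhanced service agreement
   exists, the acceptable set would contain only the Default value, so
   every other marking is reset or discarded.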

   An inter-domain agreement may reduce or eliminate the need for
   ingress node traffic policing by making the upstream domain partly or
   completely responsible for ensuring that traffic has DS field values
   acceptable to the downstream domain.  In this case, the ingress node
   may still perform redundant acceptability checks to reduce the
   dependence on the upstream domain (e.g., such checks can prevent
   theft-of-service attacks from propagating across the domain
   boundary).  If an acceptability check fails because the upstream
   domain is not fulfilling its responsibilities, that failure is an
   auditable event; the generated audit log entry should include the
   date/time the packet was received, the source and destination IP
   addresses, and the DS field value that caused the failure.  In
   practice, the limited gains from such checks need to be weighed
   against their potential performance impact in determining what, if
   any, checks to perform under these circumstances.

   Interior nodes in a DS domain may rely on the DS field to associate
   differentiated services traffic with the behaviors used to implement
   enhanced services.  Any node doing so depends on the correct
   operation of the DS domain to prevent the arrival of traffic with
   unacceptable DS field values.  Robustness concerns dictate that the
   arrival of packets with unacceptable DS field values must not cause
   the failure (e.g., crash) of network nodes.  Interior nodes are not
   responsible for enforcing the service provisioning policy (or
   individual SLAs) and hence are not required to check DS field values
   for acceptability.  Interior nodes may perform some acceptability
   checks on DS field values (e.g., check for DS field values that are
   never used for traffic on a specific link, never used with a source/
   destination address outside a specific range, etc.) to improve
   security and robustness (e.g., resistance to theft of service attacks
   based on DS field modifications).  Any detected failure of such an
   acceptability check is an auditable event and the generated audit log
   entry should include the date/time the packet was received, the
   source and destination IP addresses, and the DS field value that
   caused the failure.  In practice, the limited gains from such checks
   need to be weighed against their potential performance impact in
   determining what, if any, checks to perform at interior nodes.
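
   The audit log entry described above carries four items: the time of
   receipt, the source and destination IP addresses, and the offending
   DS field value.  A minimal sketch of an acceptability check that
   produces such entries follows; the AuditEntry and check_ds_field
   names, and the representation of a link's permitted values as a
   set, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    received_at: datetime
    src: str
    dst: str
    ds_field: int


def check_ds_field(ds_field, src, dst, allowed_on_link, audit_log,
                   received_at=None):
    """Return True if acceptable; otherwise log an entry and return False."""
    if ds_field in allowed_on_link:
        return True
    when = received_at if received_at is not None else datetime.now(timezone.utc)
    audit_log.append(AuditEntry(when, src, dst, ds_field))
    return False
```

   What happens to a packet that fails the check (forwarding, re-
   marking, or discard) is left to the caller, since that is a matter
   of local policy.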

   Any link that cannot be adequately secured against modification of DS
   field values or traffic injection by adversaries should be treated as
   a boundary link (and hence any arriving traffic on that link is
   treated as if it were entering the domain at an ingress node).  Local
   security policy provides the definition of "adequately secured," and
   such a definition may include a determination that the risks and
   consequences of DS field modification and/or traffic injection do not
   justify any additional security measures for a link.  Link security
   can be enhanced via physical access controls and/or software means
   such as tunnels that ensure packet integrity.


6.2  IPsec and Tunneling interactions

   The IPsec protocol, as defined in [ESP, AH], does not include the IP
   header's DS field in any of its cryptographic calculations (in the
   case of tunnel mode, it is the outer IP header's DS field that is not
   included).  Hence modification of the DS field by a network node has
   no effect on IPsec's end-to-end security, because it cannot cause any
   IPsec integrity check to fail.  As a consequence, IPsec does not
   provide any defense against an adversary's modification of the DS
   field (i.e., a man-in-the-middle attack); the adversary's
   modification will also have no effect on IPsec's end-to-end security.
   In some environments, the ability to modify the DS field without
   affecting IPsec integrity checks may constitute a covert channel; if
   it is necessary to eliminate such a channel or reduce its bandwidth,
   the DS domains should be configured so that the required processing
   (e.g., set all DS fields on sensitive traffic to a single value) can
   be performed at DS egress nodes where traffic exits higher security
   domains.
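
   One way to picture the egress processing mentioned above is the
   following sketch, which overwrites the DS field of an IPv4 packet
   with a fixed value and recomputes the header checksum.  The function
   names and the choice of 0 as the fixed value are assumptions for
   illustration; for simplicity the sketch treats the whole second octet
   of the IPv4 header as the DS field.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Internet checksum (one's complement of the one's-complement sum
    of the 16-bit words) over an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_ds_field(packet: bytes, value: int = 0) -> bytes:
    """Return a copy of an IPv4 packet with its DS field set to value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("DS field value must fit in one octet")
    ihl = (packet[0] & 0x0F) * 4            # header length in bytes
    header = bytearray(packet[:ihl])
    header[1] = value                       # second octet: DS field
    header[10:12] = b"\x00\x00"             # zero checksum before summing
    header[10:12] = struct.pack("!H", ipv4_header_checksum(bytes(header)))
    return bytes(header) + packet[ihl:]
```

   Applying set_ds_field to every packet of the sensitive traffic at the
   DS egress node leaves an observer downstream with a single DS field
   value, whatever values were used inside the higher security domain.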

   IPsec's tunnel mode provides security for the encapsulated IP
   header's DS field.  A tunnel mode IPsec packet contains two IP
   headers: an outer header supplied by the ingress node and an
   encapsulated inner header supplied by the original source of the
   packet.  When an IPsec tunnel is hosted (in whole or in part) on a
   differentiated services network, the intermediate network nodes
   operate on the DS field in the outer header.  At the tunnel egress
   node, IPsec processing includes stripping the outer header and
   forwarding the packet (if required) using the inner header.  Since
   the inner IP header has not been processed by a DS ingress node, the
   tunnel egress node is the DS ingress node for traffic exiting the
   tunnel, and hence must carry out the corresponding responsibilities
   (see Section 6.1).  If the IPsec processing includes a sufficiently
   strong cryptographic integrity check of the encapsulated packet
   (where sufficiency is determined by local security policy), the
   tunnel egress node can safely assume that the DS field in the inner
   header has the same value as it had at the tunnel ingress node.  If
   the tunnel ingress node is in the same DS domain as the tunnel egress
   node, the tunnel egress node can safely treat a packet passing such
   an integrity check as if it had arrived from another node within the
   same DS domain and hence omit the DS ingress node policing that would
   otherwise be required.  An important consequence is that otherwise
   insecure internal links within DS domains can be secured by a
   sufficiently strong IPsec tunnel.
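
   The decision made by the tunnel egress node in the preceding
   paragraph can be summarized in a short sketch.  The enumeration and
   parameter names are hypothetical; whether an integrity check is
   sufficiently strong remains a matter of local security policy and is
   simply passed in as a flag here.

```python
from enum import Enum


class Treatment(Enum):
    INTERIOR = "treat as arriving from another node in the same DS domain"
    INGRESS = "apply DS ingress node policing to the inner header"


def inner_header_treatment(passed_sufficient_integrity_check: bool,
                           tunnel_ingress_in_same_domain: bool) -> Treatment:
    """Choose how a tunnel egress node treats a decapsulated packet."""
    if passed_sufficient_integrity_check and tunnel_ingress_in_same_domain:
        return Treatment.INTERIOR
    # In every other case the egress node acts as a DS ingress node.
    return Treatment.INGRESS
```

   Only the combination of a sufficient integrity check and a tunnel
   ingress node in the same DS domain allows the ingress policing to be
   omitted.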

   This analysis and its implications apply to any tunneling protocol
   that performs integrity checks, but the level of assurance of the
   inner header's DS field depends on the strength of the integrity
   check performed by the tunneling protocol.  In the absence of
   sufficient assurance for a tunnel that may transit nodes outside the
   current DS domain (or is otherwise vulnerable), the encapsulated
   packet must be treated as if it had arrived at a DS ingress node from
   outside the domain.

   IPsec currently specifies that the inner header's DS field must not
   be changed by IPsec decapsulation processing at the tunnel egress
   node.  This ensures that an adversary's modifications to the outer
   header's DS field cannot be used to launch theft- or
   denial-of-service attacks across an IPsec tunnel endpoint, as any
   such modifications are discarded along with the outer header at the
   tunnel endpoint.

   Note: the following paragraph requires coordination with and approval
   by the Security Area of the IETF, and may result in the need for
   brief modifications of the appropriate security RFCs.

   A tunnel egress node in a DS domain may modify the DS field in an
   inner IP header based on the DS field value in the outer header,
   including copying part or all of the outer DS field to the inner DS
   field.  For a tunnel contained entirely within a single DS domain and
   for which the links are adequately secured against modifications of
   the outer DS field, the only limits on modifications are those
   imposed by the domain's service provisioning policy.  Otherwise, the
   tunnel egress node performing such modifications is acting as a DS
   ingress node for traffic exiting the tunnel, and must carry out the
   responsibilities of an ingress node, including ensuring that the
   resulting DS field values are acceptable (see Section 6.1).
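
   The following sketch illustrates one possible form of the
   modification described above: selected bits of the outer DS field are
   copied into the inner DS field, and the result is checked for
   acceptability.  The bit mask, the set of acceptable values, and the
   fallback of re-marking to a default value are all assumptions of this
   example; this document does not prescribe them.

```python
from typing import AbstractSet


def inner_ds_after_decapsulation(inner: int, outer: int, copy_mask: int,
                                 acceptable: AbstractSet[int],
                                 default: int = 0) -> int:
    """Copy the bits selected by copy_mask from the outer DS field into
    the inner DS field; fall back to default (a hypothetical policy
    choice) if the resulting value is not acceptable."""
    copy_mask &= 0xFF
    # Keep the unselected inner bits, take the selected outer bits.
    candidate = (inner & ~copy_mask & 0xFF) | (outer & copy_mask)
    return candidate if candidate in acceptable else default
```

   A copy_mask of 0xFF copies the whole outer DS field, while a
   copy_mask of 0 leaves the inner DS field untouched apart from the
   acceptability check.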

   If the tunnel enters the DS domain at a node different from the
   tunnel egress node, the tunnel egress node may depend on the upstream
   DS ingress node having ensured the acceptability of the outer DS
   field value.  Even in this case, there are some acceptability checks
   that can only be performed by the tunnel egress node (e.g., a
   consistency check between the inner and outer DS field values for an
   encrypted tunnel).  Any detected failure of such a check is an
   auditable event and the generated audit log entry should include the
   date/time the packet was received, the source and destination IP
   addresses, and the DS field value that was unacceptable.  The
   requirements in this paragraph apply to any future use of the
   currently unused (CU) bits in the IPv4 TOS byte and the IPv6 Traffic
   Class byte [DSFIELD].


6.3  Auditing

   Not all systems that support differentiated services will implement
   auditing.  However, if differentiated services support is
   incorporated into a system that supports auditing, then the
   differentiated services implementation must also support auditing and
   must allow a system administrator to enable or disable auditing for
   differentiated services.  For the most part, the granularity of
   auditing is a local matter.  However, several auditable events are
   identified in this document and for each of these events a minimum
   set of information that should be included in an audit log is
   defined.  Additional information also may be included in the audit
   log for each of these events, and additional events, not explicitly
   called out in this specification, also may result in audit log
   entries.  There is no requirement for the receiver to transmit any
   message to the purported sender in response to the detection of an
   auditable event, because of the potential to induce denial of service
   via such action.
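
   A minimal sketch of the administrative control described above, using
   Python's standard logging module, is given below.  The class name
   DsAuditor and the message format are assumptions for illustration.
   Note that the sketch only writes a local log entry and sends nothing
   back to the purported sender.

```python
import logging
from datetime import datetime


class DsAuditor:
    """Audit log for differentiated services events that a system
    administrator can enable or disable at run time."""

    def __init__(self, logger: logging.Logger, enabled: bool = True):
        self.logger = logger
        self.enabled = enabled

    def record(self, received_at: datetime, src: str, dst: str,
               ds_field: int) -> bool:
        """Log the minimum information for an unacceptable DS field.
        Returns True if auditing is enabled and the entry was passed to
        the logger, False otherwise."""
        if not self.enabled:
            return False
        self.logger.warning(
            "unacceptable DS field: time=%s src=%s dst=%s ds=0x%02x",
            received_at.isoformat(), src, dst, ds_field)
        return True
```

   Setting the enabled attribute to False turns differentiated services
   auditing off without affecting any other auditing the system
   performs.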


7.  Acknowledgements

   The authors would like to acknowledge the following individuals for
   their helpful comments and suggestions: Kathleen Nichols, Brian
   Carpenter, Konstantinos Dovrolis, Shivkumar Kalyana, Wu-chang Feng,
   Marty Borden, Yoram Bernet, Ronald Bonica, James Binder, and Borje
   Ohlman.


8.  References

   [802.1p]    ISO/IEC Final CD 15802-3 Information technology -
               Telecommunications and information exchange between
               systems - Local and metropolitan area networks - Common
               specifications - Part 3: Media Access Control (MAC)
               bridges, (current draft available as IEEE P802.1D/D15).

   [AH]        S. Kent and R. Atkinson, "IP Authentication Header",
               Internet Draft <draft-ietf-ipsec-auth-header-06.txt>,
               May 1998.

   [ATM]       ATM Traffic Management Specification Version 4.0
               <af-tm-0056.000>, April 1996.

   [Baker]     F. Baker, S. Brim, T. Li, F. Kastenholz, S. Jagannath,
               and J. Renwick, "IP Precedence in Differentiated
               Services Using the Assured Service", Internet Draft
               <draft-ietf-diffserv-precedence-00.txt>, April 1998.

   [DSFIELD]   K. Nichols and S. Blake, "Definition of the
               Differentiated Services Field (DS Byte) in the IPv4 and
               IPv6 Headers", Internet Draft
               <draft-ietf-diffserv-header-00.txt>, May 1998.

   [DSFWK]     Differentiated Services Framework Document (work in
               preparation).

   [Clark97]   D. Clark and J. Wroclawski, "An Approach to Service
               Allocation in the Internet", Internet Draft
               <draft-clark-diff-svc-alloc-00.txt>, July 1997.

   [Ellesson]  E. Ellesson and S. Blake, "A Proposal for the Format and
               Semantics of the TOS Byte and Traffic Class Byte in IPv4
               and IPv6", Internet Draft <draft-ellesson-tos-00.txt>,
               November 1997.

   [ESP]       S. Kent and R. Atkinson, "IP Encapsulating Security
               Payload", Internet Draft
               <draft-ietf-ipsec-esp-v2-05.txt>, May 1998.

   [Ferguson]  P. Ferguson, "Simple Differential Services: IP TOS and
               Precedence, Delay Indication, and Drop Preference",
               Internet Draft <draft-ferguson-delay-drop-02.txt>,
               April 1998.

   [FRELAY]    ANSI T1S1, "DSSI Core Aspects of Frame Relay", March
               1990.

   [Heinanen]  J. Heinanen, "Use of the IPv4 TOS Octet to Support
               Differentiated Services", Internet Draft
               <draft-heinanen-diff-tos-octet-01.txt>, November 1997.

   [IntServ]   R. Braden, D. Clark, and S. Shenker, "Integrated Services
               in the Internet Architecture: An Overview", Internet RFC
               1633, July 1994.

   [MPLSFWK]   R. Callon, P. Doolan, N. Feldman, A. Fredette, G.
               Swallow, and A. Viswanathan, "A Framework for
               Multiprotocol Label Switching", Internet Draft
               <draft-ietf-mpls-framework-02.txt>, November 1997.

   [PASTE]     T. Li and Y. Rekhter, "Provider Architecture for
               Differentiated Services and Traffic Engineering (PASTE)",
               Internet Draft <draft-li-paste-00.txt>, January 1998.

   [RFC791]    Information Sciences Institute, "Internet Protocol",
               Internet RFC 791, September 1981.

   [RFC1349]   P. Almquist, "Type of Service in the Internet Protocol
               Suite", Internet RFC 1349, July 1992.

   [RFC2119]   S. Bradner, "Key words for use in RFCs to Indicate
               Requirement Levels", Internet RFC 2119, March 1997.

   [RSVP]      R. Braden et al., "Resource ReSerVation Protocol (RSVP)
               -- Version 1 Functional Specification", Internet RFC
               2205, September 1997.

   [SIMA]      K. Kilkki, "Simple Integrated Media Access (SIMA)",
               Internet Draft <draft-kalevi-simple-media-access-01.txt>,
               June 1997.

   [2BIT]      K. Nichols, V. Jacobson, and L. Zhang, "A Two-bit
               Differentiated Services Architecture for the Internet",
               Internet Draft <draft-nichols-diff-svc-arch-00.txt>,
               November 1997.

   [TR]        ISO/IEC 8802-5 Information technology -
               Telecommunications and information exchange between
               systems - Local and metropolitan area networks - Common
               specifications - Part 5: Token Ring Access Method and
               Physical Layer Specifications, (also ANSI/IEEE Std
               802.5-1995), 1995.

   [Weiss]     W. Weiss, "Providing Differentiated Services Through
               Cooperative Dropping and Delay Indication", Internet
               Draft <draft-weiss-cooperative-drop-00.txt>, March 1998.


Authors' Addresses

   David Black
   The Open Group Research Institute
   Eleven Cambridge Center
   Cambridge, MA  02142
   Phone:  +1-617-621-7347
   E-mail: d.black@opengroup.org

   Steven Blake
   IBM Corporation
   800 Park Offices Drive
   Research Triangle Park, NC  27709
   Phone:  +1-919-254-2030
   E-mail: slblake@raleigh.ibm.com

   Mark A. Carlson
   Redcape Software, Inc.
   2990 Center Green Court South
   Boulder, CO 80301
   Phone:  +1-303-448-0048 x115
   E-mail: mac@redcape.com

   Elwyn Davies
   Nortel UK
   London Road
   Harlow, Essex CM17 9NA, UK
   Phone:  +44-1279-405498
   E-mail: elwynd@nortel.co.uk

   Zheng Wang
   Bell Labs Lucent Technologies
   101 Crawfords Corner Road
   Holmdel, NJ 07733
   E-mail: zhwang@bell-labs.com

   Walter Weiss
   Lucent Technologies
   300 Baker Avenue, Suite 100
   Concord, MA  01742-2168
   E-mail: wweiss@lucent.com