Internet Engineering Task Force                             Steven Blake
INTERNET-DRAFT                                           IBM Corporation
Expires: June 1998
                                                           December 1997



              Some Issues and Applications of Packet Marking
                       for Differentiated Services

                   <draft-blake-diffserv-marking-00.txt>


Status of This Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).


Abstract

   "Packet marking" is proposed as an architectural generalization of
   the type of service (TOS) and precedence facilities of IPv4 [RFC795,
   RFC1349], as well as the traffic class facilities of IPv6 [IPv6].  It
   is intended to encompass all mechanisms by which a host or a router
   may mark a packet to invoke some differentiated packet handling
   behavior by another node along the transit path of the packet.  This
   memo examines several proposed applications of a packet marking
   facility and attempts to categorize each application in terms of the
   behavioral requirements it imposes on hosts and routers.  In
   addition, issues related to the deployment of packet marking,
   including provisioning, authorization, and security, are examined.
   This memo is proposed as a framework to focus discussion on
   implementation issues and mechanisms as new differentiated services
   enabled by a packet marking facility are introduced into the
   Internet.






Blake                      Expires: June 1998                  [Page  1]

INTERNET-DRAFT               Packet Marking                December 1997


Table of Contents

1.  Introduction ....................................................  3

2.  Motivation ......................................................  4

3.  Some Proposed Applications of Packet Marking ....................  6
  3.1  Explicit Priority ............................................  6
    3.1.1  Delay Priority ...........................................  6
    3.1.2  Drop Priority ............................................  7
    3.1.3  Network Control Priority .................................  8
  3.2  Explicit Service Class Indication ............................  8
    3.2.1  Precedence Service Classes ...............................  9
    3.2.2  Transport Isolation ...................................... 10
    3.2.3  Aggregated Integrated Services Classes ................... 10
    3.2.4  Service-based Route Selection ............................ 11
  3.3  Best-Effort Service Allocation ............................... 11
  3.4  Integrated Services Conformant Packet Indication ............. 13
  3.5  Forward Explicit Congestion Notification ..................... 14

4.  Differentiation Mechanism Categorization ........................ 16
  4.1  Host Packet Processing Mechanism Categorization .............. 16
  4.2  Router Packet Processing Mechanism Categorization ............ 17
  4.3  Biased vs. Substitute Best-Effort Router Mechanisms .......... 19
    4.3.1  Transmission Mechanism Categorization .................... 19
    4.3.2  Path Selection Mechanism Categorization .................. 22

5.  Service Categorization .......................................... 22
  5.1  Service Granularity .......................................... 22
  5.2  Service Invocation ........................................... 23
  5.3  Service Behavior ............................................. 24
  5.4  Direction of Value ........................................... 25

6.  Fairness and Congestion Control Considerations .................. 26

7.  Provisioning Considerations ..................................... 27

8.  Authorization Considerations .................................... 29

9.  Routing Considerations .......................................... 30

10. System Implementation Considerations ............................ 31

11. Standardization Considerations .................................. 33

12. Security Considerations ......................................... 34

13. Acknowledgements ................................................ 35

14. References ...................................................... 35

Author's Address .................................................... 39


1.  Introduction

   Best-effort networks such as the current Internet provide a service
   to users which can best be characterized as delivering connectivity
   plus a weak guarantee of fair access to network resources.  In this
   sort of network the performance of individual applications is highly
   dependent on the instantaneous demand for network resources.
   Although this level of service has proven satisfactory for a wide
   variety of uses, there exist both applications and users which would
   benefit substantially from a more predictable level of performance,
   or from an "inequitable" share of the network resources.  As a
   consequence, there has been significant effort to define new
   mechanisms to enable differentiated services within the Internet.

   This document examines a particular class of differentiation
   mechanisms that are triggered by a facility we refer to loosely as
   "packet marking".  Network nodes (routers and hosts) can implement,
   in addition to the classical best-effort service, a variety of packet
   processing, forwarding, buffer management, and scheduling behaviors
   to differentiate packet queueing delay, packet loss, and application
   flow throughput.  These alternative differentiation mechanisms can
   be invoked for a particular packet by marking it; i.e., by setting
   some combination of one or more bits in a "packet handling" field in
   the packet header.  Note that both the IPv4 and IPv6 protocols
   possess header fields intended specifically for this purpose (TOS/
   Precedence [RFC1349], and Class [IPv6], respectively).  We will use
   the term "PH field" to signify the packet handling field in the
   general case.
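
   As a concrete illustration of a PH field, the following minimal
   sketch splits the IPv4 TOS octet into the precedence, TOS, and
   must-be-zero subfields laid out in [RFC1349].  The type and function
   names are hypothetical and chosen only for this example; they are not
   part of any specification.

```python
from typing import NamedTuple


class PHField(NamedTuple):
    precedence: int  # bits 0-2 of the octet (most significant)
    tos: int         # bits 3-6
    mbz: int         # bit 7, "must be zero"


def parse_ipv4_tos_octet(octet: int) -> PHField:
    """Split an IPv4 TOS octet into its RFC 1349 subfields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("TOS octet must fit in 8 bits")
    return PHField(precedence=octet >> 5,
                   tos=(octet >> 1) & 0x0F,
                   mbz=octet & 0x01)


def build_ipv4_tos_octet(precedence: int, tos: int) -> int:
    """Inverse of parse_ipv4_tos_octet; the MBZ bit is left clear."""
    if not 0 <= precedence <= 7 or not 0 <= tos <= 15:
        raise ValueError("precedence is 3 bits, TOS is 4 bits")
    return (precedence << 5) | (tos << 1)
```

   For example, the octet 0x10 parses to precedence 0 with TOS value
   1000 (binary), which [RFC1349] defines as "minimize delay".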

   A differentiated service is provided for a stream of packets by
   marking (possibly a subset of) the constituent packets, thereby
   invoking some differentiation mechanism for each marked packet on the
   nodes along the stream's path.  A stream here may represent various
   granularities of traffic ranging from an individual application flow
   all the way to the aggregate traffic exchanged between a pair of
   service providers.  The differentiated service when invoked is
   visible to the application or user as some change in a quantitative
   characteristic of the aggregate packet transport (e.g., average delay,
   loss, or throughput).

   A distinguishing feature of the differentiated services mechanisms
   examined here is that they treat packets with identical markings
   equivalently; the mechanisms act on aggregated classes of packets
   (where a class represents those packets with particular markings) and
   they operate without per-flow state in every node along a packet's
   path.  This is in contrast to the Integrated Services model
   [RFC1633], where per-flow classification, policing, and scheduling
   state is installed and maintained on nodes along the path either via
   application signaling [RSVP] or via administrative configuration.
   While the Integrated Services mechanisms provide more granular QoS
   guarantees to individual application flows, the requirement for
   application signaling and per-flow state in the network introduces
   performance, scalability, and application compatibility issues.
   Many applications can still benefit while utilizing a simpler and
   more scalable set of differentiation mechanisms.  Note also that
   packet marking may help facilitate Integrated Services state
   aggregation in the interior of the Internet (see Sec. 3.2.3).

   The concept of packet marking described in this document should be
   distinguished from the efforts of the Multiprotocol Label Switching
   working group [MPLS].  MPLS is primarily intended to simplify router
   forwarding implementations and to enable enhanced routing services.
   However, some of the issues discussed in this document may be
   relevant to MPLS as well.

   Section 2 of this document elaborates on the motivations for
   deploying packet marking differentiated services.  Section 3 outlines
   some of the proposed applications of packet marking.  Section 4
   introduces a categorization of differentiation mechanisms for hosts
   and routers and describes how the proposals examined in Sec. 3 fit
   into this categorization.  Section 5 provides a categorization of
   differentiated services enabled by packet marking.  Sections 6-12
   address various issues of fairness, congestion control, provisioning,
   authorization, routing, implementation, standardization and security
   which are introduced with the implementation and deployment of packet
   marking differentiated services.


2.  Motivation

   As discussed in [Shenker], different applications realize utility as
   different functions of the bandwidth provided to them by the network
   infrastructure.  These different applications (e.g.,
   elastic (best-effort); hard, delay-adaptive, and rate-adaptive real-
   time) exhibit varying sensitivity to the transient and steady-
   state level of resources available from the network.  This
   sensitivity is often a function of human factors considerations, but
   can also result from the fundamental characteristics of the
   application (e.g., distributed database synchronization).

   In addition, users and organizations may realize greater utility from
   a particular subset of the applications (either elastic or inelastic)
   in use, due to personal or business objectives.  As an example, many
   corporate networks prioritize transaction processing and interactive
   traffic from some applications over batch and other applications'
   traffic due to their immediate impact on business operations.

   Finally, different customers of a network service provider may
   realize greater or lesser utility from the network service they
   receive.  These customer utilities are not necessarily proportional
   to the speeds of their access links to the service provider.

   Network owners, be they private network managers, or public Internet
   service providers, wish to maximize their return on investment in
   network infrastructure.  Their ability to increase revenue is
   constrained by the elasticity of demand, the mixture of application
   types, and the means of pricing and cost-recovery available to them,
   but in general they stand to benefit if they can tailor their service
   offerings and pricing structures to satisfy the entire range of
   application and customer service requirements, by allowing each
   customer to maximize his utility/cost function.  Because of the
   variation in customer utility functions, differentiated pricing
   (e.g., by Service Level Agreements) is a key revenue generating
   mechanism, but its success depends on the ability to engineer the
   network to satisfy the diversity of application/customer
   service requirements.

   One means of satisfying these requirements is to engineer
   differentiated packet handling mechanisms into the network.  These
   mechanisms, which can be conceptualized as mechanisms for
   prioritization and resource allocation, allow the service provider
   to provision the network for each of the offered classes of service
   so as to meet the application/customer requirements at the level of
   statistical assurance promised.

   Another means of satisfying these requirements is to over-provision
   the network, so that it tends to run at low utilization with minimal
   congestion.  One problem with over-provisioning as a strategy for
   enabling differentiated services is that any best-effort network
   (i.e., without admission control) with concentration points can
   experience transient congestion and loss, which will make it
   difficult to support the most rigorous of application requirements.
   Another more fundamental difficulty is that over-provisioning
   provides better service to some customers than they would be willing
   to pay for (as judged by their utility functions).  Whether over-
   provisioning is a cost-effective method of service differentiation
   as compared to providing differentiation mechanisms within the
   network depends on the level and type of application/customer demand,
   the incremental cost of additional network infrastructure, and the
   rate of change in demand.  However, it appears clear that in an
   environment (such as today's) of explosive growth in both the users
   of and the traffic in the Internet, low-complexity differentiation
   mechanisms offer a more rapid and effective means of tailoring
   network service offerings than network over-provisioning.

   Classifying packets to determine their service class can be
   implemented by a number of means, including source/destination
   address, protocol, and TCP/UDP port filtering.  A problem with
   filtering at this level of granularity is that each router along the
   path of a packet will require the necessary filtering rules to
   determine the service class.  This has scaling problems, both in
   terms of the number of filtering rules, as well as in the need for
   mechanisms to dynamically add and delete new rules according to
   changes in customer and application traffic.  A further problem is
   that, with the deployment of IP Security, transport payloads
   encrypted within an Encapsulated Security Payload (ESP) cannot be
   identified by a router, because the protocol and port values are
   obscured [ESP].  This will prevent any service differentiation of
   encrypted traffic at the application level.

   A motivation for deploying packet marking is that the network routers
   need only filter on the value of the PH field to determine the
   appropriate differentiation mechanisms to apply to a packet.  When
   coupled with aggregate buffer management and packet scheduling
   mechanisms, as well as network authorization of the PH values and
   adequate provisioning, packet marking provides a scalable mechanism
   for offering differentiated transport services to different traffic
   streams.  Even without explicit network authorization and
   provisioning, these differentiation mechanisms may be useful in
   allowing best-effort applications to trade some fraction of their
   fair share packet rate for a lower loss rate or for lower average
   queueing delay [RFC1046].
   Packet marking may also be useful as a means for improving the
   scalability of per-flow Integrated Services by simplifying the
   implementation of flow aggregation and by improving the efficiency of
   the Integrated Services packet classification mechanism [CLASSY,
   GBH97].


3.  Some Proposed Applications of Packet Marking

   In this section we describe some proposed applications of a packet
   marking facility.  We are using the term "application" here to refer
   to one or more services which could be delivered by specifying the
   semantics of one or more PH bits and by specifying the
   differentiation mechanisms they invoke within the network.  Some of
   the proposed semantics are explicit regarding the service requested
   (e.g., transport isolation) while others could be used to provide a
   variety of services.  Several proposals overlap in terms of the
   differentiation mechanisms utilized, and as such, a common set of
   PH bits could be used to enable these proposed services.  We will
   examine the proposals individually here and will then categorize them
   more rigorously in Sec. 4.

3.1  Explicit Priority

   One application of packet marking is to provide an explicit request
   for priority handling of the packet by routers.  Priorities are
   usually ranked in a strict hierarchy relative to some metric (e.g.,
   delay, loss probability).  The priority value is usually intended to
   reflect the importance of a packet relative to other packets from the
   source as well as to packets from other sources [RFC795, RFC1046,
   RFC1812 Sec. 5.3.3].  In this section we assume the router does not
   perform priority-based path selection (this is discussed in Sec.
   3.2.4).

3.1.1  Delay Priority

   Delay priority indication within a packet is intended to convey the
   sensitivity of an application to router queueing delay.  A possible
   range of values is the following:

     o High        -- prefer low maximum queueing delay; jitter
                      sensitive,
     o Interactive -- prefer low average queueing delay; jitter
                      insensitive,
     o Regular     -- can tolerate the normal delay distribution
                      delivered by best-effort FIFO queueing,
     o Low         -- can tolerate extensive queueing delay or jitter.

   A greater granularity of delay priority values is possible.  However,
   without strict per-flow admission control and policing, quantifiable
   bounds on the delay distribution at a particular priority level are
   difficult to determine.  Delay priority is useful for allowing
   applications which are delay sensitive to avoid large queues,
   possibly at the expense of packet loss rate, while permitting
   applications which are not sensitive to queueing delay to utilize
   router buffers to avoid packet loss and achieve higher throughput (or
   to avoid much worse round-trip time (RTT) delays).  One simple
   implementation mechanism is to provide a separate queue for each
   delay priority level, with strict priority service between the
   levels.  A problem with this sort of implementation is possible
   starvation of service at the lower delay priority levels.
   Implementation issues of delay priority are further discussed in Sec.
   4.
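
   The simple implementation mentioned above (one queue per delay
   priority level, served in strict priority order) can be sketched as
   follows.  This is a minimal illustration, not a recommended design;
   the class and level names are hypothetical, and packets are
   represented as plain strings.

```python
from collections import deque
from typing import Deque, Dict, List, Optional

# Delay priority levels from the list above, highest priority first.
LEVELS: List[str] = ["high", "interactive", "regular", "low"]


class StrictPriorityScheduler:
    """One FIFO per delay priority level, served in strict order."""

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[str]] = {lvl: deque() for lvl in LEVELS}

    def enqueue(self, level: str, packet: str) -> None:
        self.queues[level].append(packet)

    def dequeue(self) -> Optional[str]:
        # Always serve the highest non-empty level.  Lower levels wait
        # for as long as any higher level has traffic, which is the
        # starvation risk noted in the text.
        for lvl in LEVELS:
            if self.queues[lvl]:
                return self.queues[lvl].popleft()
        return None
```

   Because dequeue() never looks past the first non-empty level, a
   sustained load at the "high" level would hold back every other level
   indefinitely.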

3.1.2  Drop Priority

   Drop or discard priority indication within a packet is intended to
   convey the sensitivity of an application (or a sub-layer of its
   traffic) to packet losses.  A possible range of values is the
   following:

     o Critical -- extremely loss sensitive; do not discard while other
                   non-critical packets are queued,
     o High     -- loss sensitive; drop only under extreme congestion,
     o Regular  -- can tolerate normal loss rates under active buffer
                   management [RED, ACTIVE],
     o Low      -- not sensitive to loss; discard under light
                   congestion.

   A greater granularity of drop priority values is possible.  However,
   as with delay priority, in the absence of strict per-flow admission
   control and policing, quantifiable bounds on the loss probabilities
   at a particular priority level are difficult to determine.
   Furthermore, it may be difficult to engineer several levels of drop
   priority without introducing delay for the higher drop priority
   levels under congestion.  One possible implementation of drop
   priority is to use multiple thresholds of packet occupancy in a
   single FIFO queue to trigger the discard action for incoming packets
   at a particular drop priority level.  These thresholds could be based
   on the instantaneous queue occupancy with deterministic discard or on
   an averaged queue occupancy with stochastic discard [Clark97, Feng97]
   (see Sec. 4 for further discussion of implementation issues).  Drop
   priority is useful both for improving the throughput of more
   important application flows as well as in enabling rate-adaptive
   multi-layer audio and video applications, which can adjust their
   rates after detecting impending congestion due to the drop of lower
   priority packets of the encoded signal, while still protecting the
   higher quality components of the signal from loss.  Such an approach
   to layering has superior control-plane scalability to alternatives
   such as receiver-driven layered multicast [McCanne] (however, there
   are issues of fairness and congestion which may bias an application
   toward the alternative method (see Sec. 6)).
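
   The multiple-threshold implementation described above can be
   sketched for the deterministic case (discard decided on the
   instantaneous queue occupancy).  The threshold values and names below
   are hypothetical, chosen only to make the example concrete.  The
   sketch does not implement the averaged-occupancy, stochastic variant,
   nor the stronger "critical" behavior of never discarding while
   non-critical packets are queued.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple

# Hypothetical occupancy thresholds (in packets) for a 100-packet FIFO.
DEFAULT_THRESHOLDS: Dict[str, int] = {
    "low": 25,
    "regular": 50,
    "high": 90,
    "critical": 100,
}


class ThresholdFifo:
    """Single FIFO with one discard threshold per drop priority level."""

    def __init__(self, thresholds: Optional[Dict[str, int]] = None) -> None:
        self.thresholds = dict(thresholds or DEFAULT_THRESHOLDS)
        self.queue: Deque[Tuple[str, str]] = deque()

    def enqueue(self, level: str, packet: str) -> bool:
        """Return True if the packet was queued, False if discarded."""
        # Discard on arrival when the queue already holds at least the
        # threshold number of packets for this drop priority level.
        if len(self.queue) >= self.thresholds[level]:
            return False
        self.queue.append((level, packet))
        return True

    def dequeue(self) -> Optional[Tuple[str, str]]:
        return self.queue.popleft() if self.queue else None
```

   With these values, low drop priority packets are refused once 25
   packets are queued, while regular packets are still accepted until
   the occupancy reaches 50.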

3.1.3  Network Control Priority

   Network Control priority indication within a packet is intended to
   indicate that the packet is a component of a network control protocol
   exchange whose correct and timely operation is critical to the
   stability of the network.  It is primarily intended for use with
   routing protocols (e.g., RIP, OSPF, IS-IS, BGP), but could also be
   used for other network signaling and control protocols (e.g., SNMP,
   RSVP, MPLS) [RFC1812 Sec. 7.1.2].  The value of prioritizing routing
   traffic over data traffic is to prevent routing collapse under heavy
   load (e.g., preventing BGP connection timeouts due to excessive TCP
   losses and retransmits).  The value of prioritizing SNMP traffic is
   to prevent a form of denial of service in which the network manager
   cannot monitor or configure a network element.  A sensible
   implementation will both guarantee an extremely low loss rate for
   network control packets (i.e., by never discarding a network control
   packet when other types of packets are queued) and will attempt to
   bound the queueing delay they experience.  This could be accomplished
   by implementing a separate network control queue with strict
   priority, or by providing priority pushout within a single FIFO queue
   (implementation issues are further discussed in Sec. 4).  Because
   network control traffic is usually a small fraction of the total
   traffic within a network, this prioritization should not have a
   noticeable impact on data transport performance.  However, because of
   the high priority provided for this class of traffic, only routers
   and network management stations should be allowed to set the network
   control priority indication, and the network should take steps to
   authenticate the source of a packet with the priority indication set
   (see Sec. 12).
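
   The priority pushout alternative mentioned above can be sketched as
   follows.  This is an illustration under stated assumptions: the queue
   is bounded by a packet count, and when it is full an arriving network
   control packet evicts the most recently queued non-control packet.
   The choice of which packet to evict, and all names used, are
   assumptions made for this example.

```python
from typing import List, Optional, Tuple

Packet = Tuple[bool, str]  # (is_network_control, payload)


class PushoutFifo:
    """Bounded FIFO in which control packets can push out data packets."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.queue: List[Packet] = []

    def enqueue(self, is_control: bool, payload: str) -> bool:
        """Return True if the arriving packet was queued."""
        if len(self.queue) < self.capacity:
            self.queue.append((is_control, payload))
            return True
        if not is_control:
            return False  # full: ordinary arrivals are tail-dropped
        # Full and the arrival is network control: evict the newest
        # non-control packet, if any, to make room.
        for i in range(len(self.queue) - 1, -1, -1):
            if not self.queue[i][0]:
                del self.queue[i]
                self.queue.append((is_control, payload))
                return True
        return False  # the queue holds only control packets

    def dequeue(self) -> Optional[Packet]:
        return self.queue.pop(0) if self.queue else None
```

   A network control packet is therefore lost only when the queue is
   full of other network control packets, which matches the goal of
   never discarding a network control packet while other types of
   packets are queued.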

3.2  Explicit Service Class Indication

   Another application of packet marking is to explicitly indicate a
   "service class" for a packet.  A service class is a more general
   concept than delay or drop priority.  It can be associated with a set
   of resources provisioned within the interior of the network (e.g.,
   bandwidth, buffers, routes) for a particular set of application/
   customer traffic flows which are mapped onto that class.  The service
   class concept does not impose any strict hierarchy of delay, loss,
   or throughput priority between classes, but instead may permit the
   specification of quantitative bounds on delay, loss, or throughput
   performance for a class for a particular traffic profile within that
   class.  Issues regarding the implementation of explicit service
   classes are discussed in Sec. 4.

3.2.1  Precedence Service Classes

   One instantiation of the service class concept is to provide a set of
   "precedence" service classes, in a manner very similar to the delay
   and drop priorities discussed in Secs. 3.1.1 and 3.1.2, but with
   potentially more flexibility in the provisioning of the classes.
   Each individual class would be provisioned to provide "better"
   service than the class of immediately lower rank, where the precise
   definition of better service could for instance be defined as a
   higher probability of timely delivery; i.e., lower probability of
   loss and lower average delay.  Each class in the hierarchy could be
   engineered to provide a statistically quantifiable service for some
   expected or regulated load, while also being engineered to prevent
   starvation of service to the lowest precedence classes.  Regulation
   could be implemented in the form of dynamic service class mapping and
   policing at the edge of the network.  The result of implementing this
   range of services would likely be improved throughput for application
   flows in the higher precedence classes.

   One application of this scheme would be to map business-critical
   transaction traffic to a service class of high precedence, while
   mapping casual web browsing traffic to a lower precedence class.
   These classes could be implemented using a variety of methods,
   including some variant of Weighted Fair Queueing (WFQ) or Class-based
   Queueing (CBQ) [HPFQA, HFSC, CBQ].  Depending on the precise service
   guarantees promised for the classes, they could potentially be
   implemented using combinations of the explicit delay and drop
   priority PH markings and router mechanisms described in Secs. 3.1.1
   and 3.1.2.

   [TWOBIT] describes a particular precedence service class
   implementation which relies on authorization and policing/shaping at
   the network edge and strict delay priority queueing in the interior
   routers.  Flows or flow aggregates assigned to the "Premium" service
   class are policed based on a peak rate limit and any residual bursts
   are shaped at the network edge; this smooths the characteristics of
   the Premium traffic which results in minimal accumulated queueing
   delay in the interior routers (when the total Premium service load
   is moderate).  Packets which exceed the negotiated peak-rate limit
   are discarded.  Per-router service class provisioning is not
   required in this scheme since two-level strict priority queueing is
   used as the differentiation mechanism.  However, Premium service
   must be conservatively allocated to prevent starvation of the best-
   effort service queue.
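
   The edge policing step described above (discarding packets which
   exceed a negotiated rate limit) can be illustrated with a token
   bucket.  This is a generic sketch and is not taken from [TWOBIT]; the
   bucket depth parameter, the use of explicit timestamps, and all names
   are assumptions made for this example.

```python
class PeakRatePolicer:
    """Token-bucket policer: packets beyond the rate limit are dropped."""

    def __init__(self, rate_bps: float, bucket_bytes: int) -> None:
        self.rate = rate_bps / 8.0        # token fill rate, bytes/second
        self.depth = float(bucket_bytes)  # largest tolerated burst, bytes
        self.tokens = float(bucket_bytes)
        self.last = 0.0                   # time of previous update (s)

    def conforms(self, now: float, size_bytes: int) -> bool:
        """Return True to forward the packet, False to discard it."""
        # Refill for the elapsed time, capped at the bucket depth.
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False
```

   With a rate of 8000 bit/s and a 1500-byte bucket, a 1500-byte packet
   at time 0 is forwarded, a 1000-byte packet half a second later is
   discarded (only 500 bytes of tokens have accumulated), and a
   1000-byte packet at time 1.0 is forwarded.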



3.2.2  Transport Isolation

   Although TCP is the most widely used transport protocol in the public
   Internet, there are alternative transport protocols which may have a
   more or less aggressive response to network congestion or packet
   loss.  When run in parallel with TCP traffic, these transport
   protocols, some of which are also used in alternative protocol packet
   networks, may be unable to achieve their fair share rate (if they are
   less aggressive), or may prevent TCP flows from achieving their fair
   share rates (if they are more aggressive).  These transport protocols
   can be isolated from each other by mapping the flows which utilize
   them to different service classes which are appropriately provisioned
   for some level of minimal service.  A router may queue these
   transport-isolated service classes separately.  The flows within each
   service class queue then only compete with each other for the minimal
   guaranteed bandwidth which is provisioned for that class, and can
   temporarily consume the bandwidth provisioned for other classes
   whenever those other classes are unloaded.  An advantage of
   transport isolation is that it can protect normal best-effort TCP
   traffic from some well-known misbehaved transport protocols.  The
   minimal bandwidth for each transport service class must be
   provisioned at each router.
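
   The sharing behavior described above (a guaranteed minimum per class,
   with idle bandwidth lent to loaded classes) can be illustrated by a
   static allocation calculation.  This is a sketch under stated
   assumptions: the guarantees sum to no more than the link rate, and
   spare bandwidth is lent in proportion to the guarantees.  The memo
   does not prescribe a sharing rule, and a real router would realize
   this through its packet scheduler rather than by direct computation.

```python
from typing import Dict


def allocate(link: float,
             guaranteed: Dict[str, float],
             demand: Dict[str, float]) -> Dict[str, float]:
    """Return the bandwidth each class receives on one link."""
    # Each class first receives its demand, up to its guarantee.
    alloc = {c: min(demand[c], guaranteed[c]) for c in guaranteed}
    spare = link - sum(alloc.values())
    # Lend spare capacity to classes with unmet demand, in proportion
    # to their guarantees, until the spare or the demand runs out.
    while spare > 1e-9:
        hungry = [c for c in guaranteed if demand[c] - alloc[c] > 1e-9]
        weight = sum(guaranteed[c] for c in hungry)
        if not hungry or weight <= 0:
            break
        granted = 0.0
        for c in hungry:
            extra = min(spare * guaranteed[c] / weight,
                        demand[c] - alloc[c])
            alloc[c] += extra
            granted += extra
        spare -= granted
    return alloc
```

   On a 10 Mb/s link with guarantees of 6 and 4 Mb/s, a class offering
   10 Mb/s beside a class offering only 1 Mb/s receives 9 Mb/s; if both
   classes offer 10 Mb/s, each is held to its guarantee.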

3.2.3  Aggregated Integrated Services Classes

   Integrated Services and RSVP as currently specified depend on per-
   flow signaling and per-flow packet classification, using the
   destination address, protocol, and destination port of a packet, and
   often also the source address and source port [RFC1633, RSVP].  As
   has been mentioned previously, the requirement for per-flow control
   and classification state may introduce scalability problems in the
   interior of the Internet, where the demand for reservations on high-
   speed links may exceed several thousand simultaneous flows.
   Scalability can be improved by aggregating both the control and
   packet classification state generated by a set of (unicast) flows
   transiting a particular path through a segment of the Internet.  This
   approach was introduced in [CLASSY], and is examined in more detail
   in [GBH97] and [TWOBIT].  It should be noted that no feasible design
   for the aggregation of multicast flows has been published.

   The proposals for control-state aggregation are not the topic of this
   memo (the reader is encouraged to see [GBH97] and [TWOBIT]).
   However, the details of packet classification aggregation are
   relevant here.  The basic concept is to mark traffic corresponding to
   an Integrated Services traffic class (e.g., Controlled Load,
   Guaranteed) with a particular PH service class marking at the router
   along the flows' path which initiates aggregation.  Subsequent
   routers within the aggregating region do not classify the aggregated
   reserved packets using the normal per-flow/session packet filters but
   instead classify them based on their PH service class marking.  These
   flows are then serviced using a queue which is either statically or
   dynamically provisioned to provide the required QoS for the total
   set of aggregated Integrated Services flows of a particular traffic
   class which traverses that link.  Aggregated RSVP control messages
   between routers on the edge of the aggregating region could be used
   to specify the aggregated reservation request between those nodes,
   and the interior routers could use this information to perform
   admission control and to dynamically adjust the resources (e.g.,
   bandwidth, buffers) which are allocated to each aggregated Integrated
   Services traffic class.  More than two service classes could be used
   for aggregation, each provisioned to deliver a particular QoS for the
   flows utilizing that class.
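
   The division of classification work described above can be sketched
   as follows.  The router at the edge of the aggregating region keeps
   per-flow filters and applies the PH marking; interior routers consult
   only a small fixed table indexed by the PH value.  The PH code
   points, queue names, and class names are hypothetical, since this
   memo assigns no values.

```python
from typing import Dict, NamedTuple

# Hypothetical PH service class code points.
PH_BEST_EFFORT = 0
PH_CONTROLLED_LOAD = 1
PH_GUARANTEED = 2


class FlowKey(NamedTuple):
    dst_addr: str
    protocol: int
    dst_port: int


class AggregatingEdgeRouter:
    """Holds per-flow state and marks packets of reserved flows."""

    def __init__(self) -> None:
        self.reserved: Dict[FlowKey, int] = {}

    def install_reservation(self, key: FlowKey, ph_class: int) -> None:
        self.reserved[key] = ph_class

    def mark(self, key: FlowKey) -> int:
        # Flows without a reservation are left as best effort.
        return self.reserved.get(key, PH_BEST_EFFORT)


class InteriorRouter:
    """Classifies on the PH marking alone; no per-flow filters."""

    QUEUE_FOR_PH = {
        PH_BEST_EFFORT: "best-effort",
        PH_CONTROLLED_LOAD: "controlled-load-aggregate",
        PH_GUARANTEED: "guaranteed-aggregate",
    }

    def classify(self, ph_value: int) -> str:
        return self.QUEUE_FOR_PH.get(ph_value, "best-effort")
```

   The state held by the interior router is independent of the number of
   reserved flows, whereas the edge router's table grows with each
   reservation it aggregates.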

   One additional packet marking requirement introduced by aggregation
   is the need to explicitly mark those packets of a reserved flow which
   do not conform to the flow's reservation Tspec (see Sec. 3.4).  This
   marking is necessary since per-flow policing is not possible within
   the interior of the aggregating region, and a single non-conformant
   flow could reduce the QoS delivered to all other flows aggregated
   into its service class.  The alternative of marking these non-
   conformant flows as best-effort could lead to unnecessary packet re-
   ordering.  Furthermore, it is critical that flows not policed at an
   RSVP aggregation point not be marked with one of the aggregated
   Integrated Service class PH markings and serviced using the resources
   dedicated to the aggregated flows.  These aggregated service classes
   require isolation from other (potentially real-time) traffic, since
   resources have been specifically dedicated to them based on
   advertised and regulated traffic loads.  The routers on the edge of
   the aggregating region must prevent unauthorized use of these PH
   markings by non-reserved flows.
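
   As an illustration only, the edge-router behavior described above
   can be sketched in Python.  The PH code point values, the Packet
   type, and the function name are hypothetical; this memo does not
   define them.  The sketch assumes that per-flow Tspec policing has
   already produced a conformance verdict for the packet.

```python
from dataclasses import dataclass

# Hypothetical PH code points; the memo does not assign values.
PH_BEST_EFFORT = 0
PH_CONTROLLED_LOAD = 1
PH_GUARANTEED = 2


@dataclass
class Packet:
    flow_id: str
    ph_class: int = PH_BEST_EFFORT
    non_conformant: bool = False


def mark_at_aggregation_edge(pkt, reservations, conforms):
    """Mark a packet entering the aggregating region.

    reservations maps flow_id -> aggregated PH class for reserved flows.
    conforms is the verdict of per-flow Tspec policing at this router.
    """
    service_class = reservations.get(pkt.flow_id)
    if service_class is None:
        # Non-reserved flow: it must not use an aggregated-class marking.
        pkt.ph_class = PH_BEST_EFFORT
        return pkt
    # Reserved flow: keep it in its class, but flag excess packets so
    # that interior routers can treat them differently without
    # per-flow state.
    pkt.ph_class = service_class
    pkt.non_conformant = not conforms
    return pkt
```

   Note that a non-conformant packet keeps its service class marking
   rather than being remarked as best-effort, which is the choice the
   text above motivates by the risk of re-ordering.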

3.2.4  Service-based Route Selection

   An alternative means of differentiating the service provided to a
   given class of traffic is to implement service-specific routes (i.e.,
   TOS routes) [RFC1349, RFC1583].  Service-specific routes can be
   defined based on those characteristics of packet transport that are
   largely affected by the path selected between two end-nodes.  The
   canonical set of characteristics used are delay, reliability,
   throughput, and cost [RFC1349], although routes based on a more
   detailed set of characteristics could in principle be defined.
   Routing protocols can then be used to compute service-specific routes
   by factoring in different link and path metrics (e.g., propagation
   delay, bit error rate, link rate, transport cost).
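
   The idea of computing one route per service can be sketched as
   follows.  This is a minimal illustration, not a description of any
   routing protocol: the topology, the metric names, and the function
   name are invented for the example, and only additive metrics (such
   as propagation delay or transport cost) are handled.  Metrics such
   as link rate or reliability combine differently along a path and
   would need a different computation.

```python
import heapq


def shortest_path(graph, src, dst, metric):
    """Dijkstra over graph[u][v] = {metric_name: value}.

    Returns (total, path) for the chosen additive metric.
    """
    heap = [(0.0, src, [src])]
    visited = set()
    while heap:
        total, node, path = heapq.heappop(heap)
        if node == dst:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, metrics in graph.get(node, {}).items():
            if nbr not in visited:
                heapq.heappush(
                    heap, (total + metrics[metric], nbr, path + [nbr])
                )
    return float("inf"), []


# Hypothetical four-node topology with two metrics per link.
graph = {
    "A": {"B": {"delay": 10, "cost": 1}, "C": {"delay": 2, "cost": 5}},
    "B": {"D": {"delay": 10, "cost": 1}},
    "C": {"D": {"delay": 2, "cost": 5}},
    "D": {},
}

low_delay_route = shortest_path(graph, "A", "D", "delay")
low_cost_route = shortest_path(graph, "A", "D", "cost")
```

   In this topology the low-delay route runs through C while the
   low-cost route runs through B, so the same destination has two
   different service-specific next hops.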

   Issues surrounding service-specific route selection are examined in
   Sec. 9.

3.3  Best-Effort Service Allocation

   There have been several recent proposals to use packet marking
   mechanisms to provide best-effort service allocation [Clark97, SIMA,
   Crow96, May97, Feng97, Bohn93, Ferg97, TWOBIT].  The term best-effort
   service allocation refers to the notion of providing different
   expectations of best-effort service to different categories of users
   or applications based on some negotiated service profile with the
   network.  These expectations could be characterized in terms of
   average throughput, loss, or delay, along with variance estimates
   (statistical assurance levels) for these metrics.  These schemes are
   primarily motivated to provide different tiers of service to elastic
   best-effort applications; in their simplest form they rely on network
   dimensioning with authorization and enforcement at the network edge
   to provide statistical assurances on performance which may not be
   suitable for all application types.  Explicit provisioning of
   resources in the interior of the network is not precluded, but the
   proposals are designed to work effectively using only simple
   differentiation mechanisms in the interior routers, such as strict
   drop or delay priority.  We outline three of the proposed schemes
   here.

   [Clark97] describes two proposals for service allocation; one relying
   on marking at the network edge near the sender (sender based scheme),
   and the other relying on marking within the network and reaction at
   the network edge near the receiver (receiver scheme).  We describe
   the sender scheme here and the receiver scheme in Sec. 3.5.

   In the sender scheme, a profile meter monitors the traffic load
   generated by a source and marks traffic which falls within the
   negotiated profile (which might be defined by a token bucket, for
   example) with an in-profile indicator.  The in-profile marking is
   interpreted as the inverse of a drop preference indicator by the
   interior routers, which preferentially discard drop preference
   (out-of-profile) traffic whenever impending congestion is
   detected.  The proposed differentiation
   mechanism is a weighted variant of the RED algorithm [RED], termed
   "RED with In and Out" (RIO), which uses a single FIFO queue and two
   RED drop thresholds, the lower being assigned to the drop preference
   traffic.  This mechanism is designed to shut down out-of-profile
   flows as the in-profile traffic utilization on a link approaches
   one.  No explicit provisioning of resources to the two levels of
   service is required in the interior.  This basic scheme can be
   augmented by defining two or more levels of service assurance (e.g.,
   statistical, assured).  The service provider must dimension the
   network and provision the profiles of assured sources carefully to
   reduce the probability of congestion loss.  In addition, some form of
   differentiation must be implemented in the router (such as separate
   provisioned queues) to preferentially deliver in-profile assured
   packets under congestion.
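
   A minimal sketch of the two pieces of the sender scheme follows: a
   token-bucket profile meter at the edge, and a two-threshold RED
   drop function in the interior.  All names and numeric thresholds
   are assumptions made for the example.  The drop function is also a
   simplification in that it uses one average queue length for both
   kinds of traffic; the algorithm in [Clark97] should be consulted
   for the actual RIO definition.

```python
class ProfileMeter:
    """Token-bucket profile meter: conforming packets are marked 'in'."""

    def __init__(self, rate_bytes_per_s, bucket_bytes):
        self.rate = rate_bytes_per_s
        self.depth = bucket_bytes
        self.tokens = float(bucket_bytes)
        self.last = 0.0

    def mark(self, now, size):
        """Return True for in-profile, False for out-of-profile."""
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


def red_drop_prob(avg_queue, min_th, max_th, max_p):
    """RED ramp: 0 below min_th, linear up to max_p, 1 from max_th on."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def rio_drop_prob(avg_queue, in_profile):
    """Two threshold sets; out-of-profile traffic gets the lower one."""
    # Hypothetical thresholds, in packets.
    if in_profile:
        return red_drop_prob(avg_queue, 40, 70, 0.02)
    return red_drop_prob(avg_queue, 10, 30, 0.5)
```

   With these example thresholds, an average queue of 20 packets
   already discards a quarter of the out-of-profile packets while
   in-profile packets are not discarded at all.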

   A conceptually similar scheme is described in [SIMA].  In this
   proposal, the user contracts with the network for a Nominal Bit Rate
   (NBR) of service.  A monitor at the edge of the network measures the
   traffic load relative to the NBR and dynamically computes a level of
   drop preference (out of seven) for the packet.  A packet flow at rate
   NBR would be marked with a mid-range drop-preference; the network
   would be dimensioned to provide a small loss ratio at this level.
   Pricing of service is governed by the size of the NBR; users can
   achieve better service by purchasing a larger NBR and underutilizing
   it (thereby receiving low drop-preference for their packets).  The
   network routers implement preferential discard of traffic based on
   a series of thresholds for each drop-preference level.  The system
   also allows the user to select real-time or non-real-time service.
   Real-time packets are served by a small queue which receives strict-
   priority service over the non-real-time queue.  The small size of the
   queue and the potentially higher loss rate give the user incentive
   not to utilize the real-time service for elastic applications.
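
   To make the relationship between measured load and drop preference
   concrete, the sketch below maps the ratio of measured rate to NBR
   onto seven levels.  The logarithmic mapping (one level per doubling
   of rate) is an assumption chosen for illustration and is not taken
   from [SIMA]; the only properties borrowed from the description
   above are that there are seven levels and that a flow sending at
   exactly its NBR lands in the middle of the range.

```python
import math

NUM_LEVELS = 7  # level 0 = least likely to be dropped, 6 = most likely


def drop_preference(measured_rate, nbr):
    """Map measured_rate / nbr to a drop-preference level (assumed rule)."""
    if measured_rate <= 0:
        return 0
    mid = NUM_LEVELS // 2
    level = mid + round(math.log2(measured_rate / nbr))
    return max(0, min(NUM_LEVELS - 1, level))
```

   Under this assumed rule a user who buys a large NBR and sends at a
   quarter of it receives a level two steps below mid-range, which
   matches the incentive described above.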

   [Feng97] describes a design that is similar to [Clark97].  One
   salient difference is that the profile-meter at the network edge
   (termed the Packet Marking Gateway (PMG)) statistically marks packets
   for priority service based on some computation of the number of
   priority marked packets needed to achieve a target throughput.  The
   network routers perform service differentiation using a weighted RED
   implementation.  One extension of this work is the incorporation of
   the packet marking facility within the TCP congestion control
   algorithm.
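
   The statistical marking performed by a PMG can be pictured with the
   following sketch.  The control rule (step the marking probability
   up when observed throughput is below target and down when it is
   above) is an assumption made for illustration; [Feng97] should be
   consulted for the actual computation.  The class and parameter
   names are likewise hypothetical.

```python
import random


class MarkingGateway:
    """Marks a fraction of packets for priority service (assumed rule)."""

    def __init__(self, target_bps, step=0.05, rng=None):
        self.target = target_bps
        self.step = step
        self.prob = 0.0
        # A seeded generator keeps the example reproducible.
        self.rng = rng or random.Random(0)

    def update(self, observed_bps):
        """Adjust the marking probability toward the target throughput."""
        if observed_bps < self.target:
            self.prob = min(1.0, self.prob + self.step)
        elif observed_bps > self.target:
            self.prob = max(0.0, self.prob - self.step)
        return self.prob

    def mark(self):
        """Return True if the next packet should be priority-marked."""
        return self.rng.random() < self.prob
```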

3.4  Integrated Services Conformant Packet Indication

   Packet marking can be used to simplify and enhance the implementation
   of Integrated Services, by marking packets within an Integrated
   Services flow which conform to the flow's Tspec (or conversely by
   marking non-conformant packets).  As was mentioned in Sec. 3.2.3,
   non-conformant packet marking is essential to permitting RSVP
   aggregation, as per-flow policing is not possible when the control-
   state is aggregated and non-conformant packets can degrade the QoS of
   other aggregated flows.  Packet marking of conformant packets may be
   useful for non-aggregated Integrated Services flows as well, as it
   can provide a hint to routers as to which packets may require
   classification (a computationally expensive procedure) as well as
   providing an indication as to which packets of the flow have
   failed policing upstream.  We describe both a conformant packet
   marking scheme and its dual below.

   In the first scheme, a single bit is used to indicate that the packet
   belongs to a flow and that the packet has not failed a policing
   function (it may be conformant).  Only packets which have this bit
   set are flow-classified by the routers, and only these packets are
   counted against the flow's Tspec.  There are three alternatives for
   where and when this bit should be set:

     o the bit is set by the source when it has sent a PATH message for
       the flow,

     o the bit is set by the source only when it has received a RESV
       message for the flow,

     o the bit is set in the network by the farthest upstream router
       which accepted a RESV for the flow (often the source's first hop
       router).


   An issue with alternative one is that the bit may be set for flows
   which are never reserved.  An issue for alternative two is that the
   semantics of the bit do not permit partial reserved paths (where the
   reservation succeeds partially upstream from the receiver but fails
   before reaching the source) since the bit will never be set (by the
   host) and the routers will never classify the packet.  This issue can
   be addressed in part by alternative three, but in this case, the
   farthest upstream router must classify every packet received on the
   same interface as traversed by the flow to identify its packets; this
   offsets the main advantage of providing the indication.  Furthermore,
   this introduces a dependency on the behavior of an upstream router
   (since the farthest upstream router which accepted the reservation
   must wait for RESVERR messages to guess whether an upstream node has
   accepted the reservation and will mark the packets).

   In the second scheme, the bit is only used to mark packets which have
   failed a policing function (non-conformant).  Every packet which does
   not have the bit set is flow-classified, eliminating a potential
   performance advantage of the first scheme since in this case all
   best-effort packets are also classified.  The bit is set if a packet
   fails a flow's Tspec policing function (token bucket).  Downstream
   routers could choose not to classify these packets or could choose
   not to count them against the flow's Tspec.

   There are potential re-ordering hazards for both schemes, depending
   on how non-conformant packets are serviced at the router.  If non-
   conformant packets are not classified and are serviced in the best-
   effort queue, then re-ordering is likely whenever there is a
   disparity in the queueing delay between the flow's normal service
   queue and the best-effort queue.  The first scheme only appears to be
   able to reduce the number of packets which are classified while
   preserving the defined RSVP network behavior if alternative one is
   chosen.  The second scheme can only reduce the number of packets
   which are classified significantly if a large fraction of the
   Integrated Services packets are non-conformant.  However, the
   semantics of the second scheme much more closely match the
   requirements for aggregated flows (where flow-classification is
   eliminated).  Both schemes are mutually compatible if separate PH
   bits are utilized for each.
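
   The duality of the two schemes can be summarized in a short sketch.
   The function names are hypothetical.  For the first scheme the
   sketch assumes that a policing router clears the bit when a packet
   fails, which is implied but not stated above; for the second
   scheme it takes the option in which downstream routers do not
   classify packets that carry the bit.

```python
def should_classify(scheme, bit_set):
    """Decide whether a router flow-classifies a packet.

    Scheme 1: bit set means "belongs to a flow, has not failed policing".
    Scheme 2: bit set means "has failed policing".
    """
    if scheme == 1:
        return bit_set
    if scheme == 2:
        return not bit_set
    raise ValueError("scheme must be 1 or 2")


def police(scheme, bit_set, conforms):
    """Return the bit value after a policing router has seen the packet."""
    if scheme == 1:
        # Assumed: the bit is cleared once the packet fails policing.
        return bit_set and conforms
    if scheme == 2:
        # The bit is set on failure and is never cleared downstream.
        return bit_set or not conforms
    raise ValueError("scheme must be 1 or 2")
```

   The sketch makes the cost difference visible: under the second
   scheme every unmarked packet, including all best-effort traffic,
   is classified.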

3.5  Forward Explicit Congestion Notification

   The final packet marking application we will discuss is Forward
   Explicit Congestion Notification (FECN).  FECN is one-half of a
   bi-directional scheme where the network marks packets which are
   transmitted across a congested link, and some process at the
   receiving node sends a Backwards Explicit Congestion Notification
   (BECN) back to the source, to influence its rate of transmission so
   that the congestion within the network will subside.  The BECN need
   not be returned in the PH field but may be sent in some higher-layer
   message.  There are two proposed implementations of FECN which we
   will examine.


   In the approach described in [ECN94] and [ECN97], a router sets the
   FECN bit stochastically based on the RED algorithm, which computes a
   probability of packet detection which is an increasing function of
   the queue's average packet occupancy [RED].  The router sets this bit
   as an alternative action to discarding the packet (which it would
   have done if the associated transport protocol had not advertised its
   ECN-capability).  There are two variants proposed; one a single-bit
   scheme and the other a two-bit scheme.  In the single bit scheme, the
   application transport protocol (which is ECN-capable) sets the FECN
   bit, which is reset by any router which randomly detects the packet
   due to a build up of queued packets.  The packet can then no longer
   be distinguished from packets utilizing non-ECN-capable transports,
   and if it is detected downstream at another congested router, it will
   be discarded.  In the two-bit scheme, the transport protocol's ECN-
   capability is advertised explicitly in a separate bit, and packets
   which are detected at multiple routers are not discarded.  Upon
   receipt of a packet with the FECN, the receiver sends a BECN back to
   the source, either as an IP-layer message (e.g., ICMP Source Quench)
   or as a transport-layer acknowledgement (e.g., TCP ACK option).  The
   source transport protocol is supposed to treat the receipt of a BECN
   equivalently to the loss of a packet and back off its transmission
   rate accordingly.  Such a mechanism when widely deployed may
   significantly reduce the number of lost packets and retransmissions
   in the network.  This reduction in the number of packet losses is
   especially beneficial to interactive applications like Telnet which
   are sensitive to RTT delays which result from packet loss and
   retransmission intervals.
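
   The difference between the single-bit and two-bit variants is
   easiest to see as a per-packet router decision.  The sketch below
   follows the description above; the function names and the
   ("forward" / "drop") return convention are assumptions, and the RED
   detection decision itself is passed in as a boolean rather than
   computed.

```python
def single_bit_action(fecn_bit, detected):
    """Single-bit scheme.

    An ECN-capable sender sets the bit.  A router that detects the
    packet resets the bit; a packet whose bit is already clear looks
    like non-ECN traffic and is discarded instead.
    Returns (action, new_bit).
    """
    if not detected:
        return "forward", fecn_bit
    if fecn_bit:
        return "forward", False
    return "drop", fecn_bit


def two_bit_action(ect_bit, fecn_bit, detected):
    """Two-bit scheme.

    ect_bit advertises an ECN-capable transport and is never changed
    by routers, so a packet detected at several routers is still
    forwarded.  Returns (action, new_fecn_bit).
    """
    if not detected:
        return "forward", fecn_bit
    if ect_bit:
        return "forward", True
    return "drop", fecn_bit
```

   Applying single_bit_action twice with detected=True to a packet
   from an ECN-capable sender yields a forward followed by a drop,
   which is the downstream discard case described above.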

   The approach described in [Clark97] is the receiver-based best-effort
   service allocation scheme mentioned in Sec. 3.3.  In this (one-bit)
   scheme, routers set the FECN bit for every packet which experiences
   congestion (deterministic marking vs. stochastic marking).  A
   receiver profile meter further downstream monitors the number of
   packets which are marked by FECN and resets the FECN of all those
   packets within the receiver's service profile.  Packets with FECN in
   excess of the profile are forwarded to the receiver with the FECN
   set.  The receiver transport protocol is supposed to take the same
   action as described for the first scheme (send a BECN), and the
   source transport protocol is also supposed to behave as in the first
   scheme (back-off).  In addition, the receiver profile meter may take
   explicit action against flows from mis-behaving sources (those which
   do not appear to honor the BECN).
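
   A receiver profile meter of this kind might be sketched as below.
   This is an illustration under stated assumptions: the profile is
   modelled as a token bucket counted in marked packets, and the class
   and method names are hypothetical.  [Clark97] should be consulted
   for how the receiver's profile is actually expressed.

```python
class ReceiverProfileMeter:
    """Clears FECN on marked packets while the receiver is in profile."""

    def __init__(self, rate_pkts_per_s, burst_pkts):
        self.rate = rate_pkts_per_s
        self.depth = burst_pkts
        self.tokens = float(burst_pkts)
        self.last = 0.0

    def forward(self, now, fecn):
        """Return the FECN value delivered toward the receiver."""
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if not fecn:
            return False
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return False  # within profile: the FECN is reset
        return True  # in excess of profile: the FECN stays set
```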

   The key difference between these two proposals is the marking action
   in the interior routers (stochastic vs. deterministic marking).  As
   such, it does not appear that the schemes are compatible using the
   same PH FECN bit, unless the network is configured such that there is
   a receiver profile meter downstream of every interior router, in
   which case the interior routers can be configured to mark the FECN
   bit as described in [Clark97] and the receiver profile meters can
   filter the FECN indications as appropriate prior to forwarding the
   packet further downstream.


   It is not clear how well ECN will scale for multicast traffic, due to
   the potential implosion of BECNs at a multicast source [CCBES].


4.  Differentiation Mechanism Categorization

   Packet marking is intended to invoke one or more differentiation
   mechanisms -- either in a host (source/destination) or in the routers
   along the packet's transit path -- so as to differentiate the data
   transport performance.  In the above statement we are using the term
   "router" generally to refer to any node in the packet's path which
   forwards it towards the destination; e.g., a profile-meter [Clark97]
   or firewall.  In Sec. 3 we examined several proposed uses of a packet
   marking facility and the differentiation mechanisms they might
   invoke.  In this section we reverse the analysis by examining the
   differentiation mechanisms which could be implemented within a host
   or router, and then detail which of the mechanisms are invoked by the
   proposals described in Sec. 3.  It should be noted that, for ease of
   deployment, and with the exception of FECN, most of the proposals
   attempt to provide differentiated services using only router
   mechanisms, without substantial changes to the host (if any).

4.1  Host Packet Processing Mechanism Categorization

   Host packet processing mechanisms relevant to differentiated services
   can be categorized into the following functions:

     o Path selection          -- selection of the output interface or
                                  next-hop router,
     o Transmission scheduling -- selection of the next packet to
                                  forward from the transmission queue;
                                  selection of the link-layer priority,
     o Reception scheduling    -- selection of the next packet to
                                  process and deliver to the transport
                                  layer from the reception queue,
     o Congestion control      -- selection of the transmission rate, or
                                  of the interval over which to suspend
                                  transmission, based on congestion
                                  indications from the network.

   Host path selection is rarely invoked since most hosts are single-
   homed (single network interface) and most do not run a routing
   protocol which would allow them to intelligently select the next-hop
   router for service-specific routing (Sec. 3.2.4); however, nothing
   precludes a properly configured host from making service-specific
   route selections.

   Non-FIFO host transmission scheduling may be invoked to promote a
   delay prioritized packet, or one within a precedence service class
   (or within an aggregated Integrated Services class if the host
   supports aggregation).  It may also be used to control the
   transmission rate of a flow, if that flow's service class is known to
   be rate regulated by the network.  Host transmission scheduling may
   also invoke link-layer prioritization features; e.g., by selecting a
   particular ATM QoS VC, or by marking the packet with a particular
   802.1p priority [IS802].

   Non-FIFO reception scheduling may in principle be invoked by a delay
   prioritized or precedence service class packet, or by a transport-
   isolated class (where one transport protocol has priority over
   others).  Loss priority may be utilized if the receiving host's
   buffer is saturated and packets must be discarded.  However, any non-
   FIFO or drop-prioritized reception processing may introduce
   complexity in the receiving host's networking protocol stack that is
   not justified in practice by improved performance.

   Receipt of a backwards explicit congestion notification should
   directly affect the congestion control function of the source's
   transport protocol (causing it to reduce its rate of transmission).
   Also, a transport protocol modified as described in [Feng97] will
   mark packets for priority as required to achieve a negotiated
   throughput.
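
   As a sketch of the host congestion control reaction, the function
   below treats a BECN exactly as it would treat a loss.  The halving
   of the window and the per-acknowledgement growth term are the
   familiar TCP-style rules and are assumed here for illustration;
   this memo only requires that the source reduce its rate.  The
   window is expressed in segments and the function name is
   hypothetical.

```python
def next_cwnd(cwnd, becn_received, min_cwnd=1.0):
    """Return the congestion window after one acknowledgement."""
    if becn_received:
        # Same reaction as to a lost packet: multiplicative decrease.
        return max(min_cwnd, cwnd / 2.0)
    # No congestion indication: additive increase, spread over a window.
    return cwnd + 1.0 / cwnd
```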

4.2  Router Packet Processing Mechanism Categorization

   Router packet processing mechanisms relevant to differentiated
   services can be categorized into the following functions:

     o Reception scheduling    -- selection of the next packet to
                                  process from the reception queue,
     o Packet classification   -- identification of the flow by header
                                  filtering; identification of the
                                  differentiation mechanisms to apply
                                  by PH lookup,
     o Path selection          -- selection of the next-hop node,
     o Traffic policing        -- setting of PH based on the monitored
                                  rate of traffic within a flow/class,
     o Buffer management       -- selection of the queue/discard action;
                                  pushout under congestion,
     o Transmission scheduling -- selection of the next queue to service
                                  for transmission.

   Note that buffer management and transmission scheduling can be
   strongly coupled.

   Reception scheduling is normally FIFO in routers.  This is usually
   because the forwarding subsystem is integrated with the header
   processing subsystem, and the packet must be received from the
   reception queue (if any) before the PH field can be examined.  In
   addition, most wire-speed routers do not encounter significant
   reception queues.

   Any differentiation mechanism which utilizes packet marking will
   require that the routers check the PH field to determine the
   differentiation mechanisms to apply to the packet.  An Integrated
   Services conformant packet indication may invoke flow classification.
   Also, some explicit service class may be defined which invokes some
   packet classification function at some point in the network for
   authorization purposes or for finer-granularity service
   differentiation.

   Path selection may be affected if service-class based routing is
   configured.  In this case the PH field will determine the set of
   routes to search first.
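
   A minimal sketch of PH-directed path selection follows.  The table
   layout, the PH key, and the function name are hypothetical, and an
   exact-match dictionary lookup stands in for a real longest-prefix
   match.  The fallback to the default table reflects the "search
   first" wording above.

```python
def select_next_hop(ph, dest, service_routes, default_routes):
    """Search the route set chosen by the PH value, then the default."""
    table = service_routes.get(ph, {})
    if dest in table:
        return table[dest]
    return default_routes.get(dest)


# Hypothetical tables: one low-delay route plus the default routes.
service_routes = {"low_delay": {"10.0.0.0/8": "r2"}}
default_routes = {"10.0.0.0/8": "r1", "192.168.0.0/16": "r3"}
```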

   Traffic monitoring and policing is required by several of the best-
   effort service allocation proposals, to determine whether the traffic
   of a flow is within a negotiated profile (or how it varies relative
   to it).  Integrated Services policing can utilize a non-conformant
   packet indication to signal an out-of-profile packet within a
   reserved flow.

   Any of the delay priority or explicit service class markings may
   direct the buffer management subsystem to queue the packet into a
   non-default queue.  Services requiring isolation and/or provisioned
   resources in the router (e.g., buffer space, bandwidth) generally
   require a separate queue.  Within a queue, an explicit drop priority
   or precedence service class marking, or an out-of-profile indication,
   may invoke a buffer management discard action, depending on the
   current state and history of buffer occupancy in the queue [RED].
   The same buffer discard logic may be utilized to set FECN (if the
   transport protocol is ECN-capable).

   When multiple queues are supported to provide delay prioritization,
   provisioned service class bandwidth, or isolation, a queue scheduling
   algorithm must be implemented to determine which queue's head-of-line
   packet to transmit next.  The scheduling algorithm could vary in
   complexity from simple strict priority, to one of the more
   sophisticated rate-based scheduling algorithms such as WFQ or CBQ
   [HPFQA, HFSC, CBQ].  While the complexity of strict priority queueing
   is usually O(1), the complexity of the more sophisticated rate-based
   scheduling algorithms is usually O(log N) for N queues.  This may
   impose an upper bound on the number of economically implementable
   delay priorities or service classes.
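
   The complexity contrast can be illustrated with two small
   schedulers.  The first is strict priority over a fixed list of
   queues.  The second is a simplified finish-tag scheduler that keeps
   one heap entry per backlogged queue, so that each selection costs
   O(log N); it is a self-clocked approximation written for this
   example and is not the algorithm of [HPFQA], [HFSC], or [CBQ].  All
   names are hypothetical.

```python
import heapq
from collections import deque


def strict_priority_pick(queues):
    """queues is a list of deques, index 0 being the highest priority."""
    for q in queues:
        if q:
            return q.popleft()
    return None


class FinishTagScheduler:
    """Serves the queue whose head packet has the smallest finish tag."""

    def __init__(self, weights):
        self.weights = weights
        self.queues = {q: deque() for q in weights}
        self.last_tag = {q: 0.0 for q in weights}
        self.vtime = 0.0
        self.heap = []  # at most one (tag, queue) entry per queue

    def enqueue(self, q, pkt, size):
        start = max(self.vtime, self.last_tag[q])
        tag = start + size / self.weights[q]
        self.last_tag[q] = tag
        if not self.queues[q]:
            heapq.heappush(self.heap, (tag, q))
        self.queues[q].append((tag, pkt))

    def dequeue(self):
        if not self.heap:
            return None
        tag, q = heapq.heappop(self.heap)
        _, pkt = self.queues[q].popleft()
        self.vtime = tag  # self-clocking: virtual time follows service
        if self.queues[q]:
            heapq.heappush(self.heap, (self.queues[q][0][0], q))
        return pkt
```

   With weights of 2 for queue "a" and 1 for queue "b" and equal
   packet sizes, queue "a" is served twice before queue "b" is served
   once.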

   It is useful to examine the amount of state that each packet marking
   application imposes within routers.  Some of the stateless
   applications include an explicit delay/drop priority (when
   implemented as strict priority) or service class or out-of-profile
   indicator that indicates drop-preference.  These impose no per-flow
   or per-class state in the routers, although any network authorization
   or policing/monitoring function which sets these indicators may
   require per-flow state (these functions are usually located
   near the source or receiver).  An Integrated Services aggregating
   router requires per-flow classification and policing state prior to
   service class aggregation.  Any provisioned explicit service class
   mechanism will impose per-class buffer management and scheduling
   state in the routers.  In the general case, packet marking
   applications only impose per-flow state at aggregation points in the
   network where the number of flows is not large.

4.3  Biased vs. Substitute Best-Effort Router Mechanisms

   We have examined in a general way the different router subsystems
   which may be parameterized to differentiate packet transport
   behavior.  This analysis gives us the basis to more specifically
   examine the scope of differentiation that can be provided.  We
   characterize a router's differentiation capabilities into two broad
   categories: biased and substitute best-effort mechanisms.

   Normal best-effort service is generally considered to be fair (under
   the assumption of cooperating, well-behaved applications utilizing
   transport protocols with TCP-like congestion control).  This service
   is often implemented using a single FIFO queue, with no per-flow
   identification, or path-selection, or special buffer management or
   scheduling (we accept the possibility of active buffer management,
   perhaps incorporating fairness enforcement mechanisms [ACTIVE, RED,
   FRED, Floyd97]).  Such a service can be considered fair because there
   are no explicit biases in the packet handling behavior.  It is in
   fact the case that there are biases in best-effort service, even
   among well-behaved applications (e.g., flows with large RTTs achieve
   lower throughput [Floyd91]), but these biases are artifacts of the
   congestion control algorithms utilized and are not due to explicit
   biasing mechanisms in the network.  We define a biased mechanism here
   as one which explicitly allocates more resources (e.g., buffers,
   queue service rate) to some set of traffic flows, permitting these
   flows to achieve superior (loss/delay/throughput) performance over
   other flows.  We will more carefully define a substitute best-effort
   mechanism in Sec. 4.3.1, but for now we define it as a mechanism
   which does not provide additional resources to flows over any long
   time scale, but which may temporarily provide such resources over
   short time scales.

4.3.1  Transmission Mechanism Categorization

   Router transmission mechanisms -- buffer management, packet
   scheduling, and link-layer priority selection -- are one basic means
   by which a router can differentiate the service of a flow.  To
   provide a framework for the discussion of the possible axes of
   differentiation we will first describe a hypothetical router
   transmission subsystem which enforces per-flow link-fairness.  This
   design is motivated by the discussion in [CCBES].  Such a system
   would incorporate per-flow queueing with a fair (equal service weight
   per queue) scheduling algorithm; for example a variant of WFQ
   [HPFQA].  In a system with finite buffers, per-flow buffer management
   would also be implemented.  The buffer management system might impose
   absolute bounds on the instantaneous buffer occupancy of a flow.  To
   account for bursty flows, a RED-based discard policy might be
   implemented based on the long-term traffic history of the flow (and
   not the flow's short-term average queue occupancy).  The goal of this
   buffer management system would be to deliver equivalent packet loss
   rates to flows with equal long-term average rates (within some time
   horizon), with minimal packet loss for flows under-utilizing their
   fair-share rate, and higher loss rates for flows attempting to exceed
   their fair-share rate.  The goal of the scheduling system is to
   provide equal service rates to all flows under backlogged conditions.
   In practice, once flows had stabilized to their fair-share rate, only
   the bursty flows would queue more than one packet at a time.  Note
   that maintaining per-flow state (dynamically generated in the case of
   best-effort sources) is probably too complex for economical
   implementation; it is only proposed as a hypothetical example to
   highlight some properties of flow service differentiation.  Note also
   that link-fairness is not the only, nor necessarily the best
   definition of fairness in a network; it is used here only for
   illustrative purposes.
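
   The notion of a flow's fair-share rate on a link can be made
   concrete with a small computation.  The sketch assumes max-min
   fairness as the definition of fair share, which is one common
   reading of link-fairness but is not prescribed by this memo; the
   function name and the example demands are hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Max-min fair allocation of link capacity among flows.

    demands maps flow -> offered rate.  A flow offering less than an
    equal share keeps its demand; the remainder is split equally
    among the flows that want more.
    """
    alloc = {}
    remaining = float(capacity)
    pending = sorted(demands.items(), key=lambda kv: kv[1])
    while pending:
        share = remaining / len(pending)
        flow, demand = pending[0]
        if demand <= share:
            # Under-utilizing flow: fully satisfied.
            alloc[flow] = float(demand)
            remaining -= demand
            pending.pop(0)
        else:
            # Every remaining flow wants more than the equal share.
            for name, _ in pending:
                alloc[name] = share
            break
    return alloc
```

   For a link of capacity 10 and demands of 2, 4, and 10, the
   allocation is 2, 4, and 4: the two smaller flows are satisfied and
   the largest is held to its fair-share rate.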

   Flow service can be differentiated by modifying any of the parameters
   of our hypothetical transmission subsystem.  For example, to increase
   the nominal throughput of a flow, that flow's queue weight could be
   increased, thereby increasing its backlog drain rate.  To decrease
   the probability of packet loss, a flow's maximum allowable buffer
   consumption could be increased, and the parameters of the discard
   policy could be modified to preferentially allow the flow's packets
   to have access to buffer slots under congestion.  To increase the
   probability of loss under congestion, the reverse actions could be
   taken, and threshold mechanisms could be implemented if there are
   packets of multiple drop priorities within a flow (as in best-effort
   service allocation).  To reduce the queueing delay of a flow's
   packets (without increasing the long-term service rate of the flow),
   the scheduling algorithm could be amended so that a delay prioritized
   packet could be transmitted prior to that flow's queue receiving its
   turn at the scheduler.  This delay prioritization capability would be
   bounded by a token bucket with a token pool scaled proportionally to
   the typical burst size of the flow, and with a token rate equal to
   the flow's fair-share rate (thus turning the flow's queue into a
   variable bit-rate shaper).  Each of these modifications is an
   example of a biased differentiation mechanism.  Note that the impact
   of this biasing is degradation of the service of other flows under
   contention for transmission resources.

   We can further categorize biased transmission mechanisms into
   provisioned and non-provisioned mechanisms.  Provisioned mechanisms
   are specifically configured to provide a particular service for a
   flow, such as explicit delay or drop priority, or throughput priority
   within a precedence service class.  A non-provisioned biased
   mechanism such as FECN implements a biased discard policy for packets
   which are marked as ECN-capable.  Such a mechanism is not intended to
   enable biased service for well-behaved applications; however, it
   introduces the possibility of service bias for badly behaved
   applications (e.g., those that do not honor BECN) by allowing them
   to achieve better-than-fair throughput due to the lower loss rate.

   In contrast to fair and biased transmission mechanisms, we may also
   hypothesize the possibility of substitute best-effort mechanisms.
   The stability of the current Internet depends on the fact that its
   existing service model is fair [CCBES].  Introduction of biased
   service capabilities will require provisioning and traffic
   regulation.  However, the normal "best-effort" service available to
   applications may not suit all of their needs, and it may be the case
   that applications could improve their performance without subscribing
   to any particular provisioned differentiated service from the
   network.  This would only be possible if these alternative mechanisms
   did not aggravate network stability, which implies that they must
   also be fair.  For the purposes of this discussion we define a
   substitute best-effort mechanism as fair if, when selected by a flow,
   it does not degrade the overall performance of other active flows,
   where we define "performance" for normal best-effort flows as average
   throughput, loss, and delay.  One example of a substitute best-effort
   mechanism would be queueing isolation to protect a flow with a long
   RTT (note that this does not strictly meet our definition of fairness
   since, if selected, other flows are not able to achieve an unfair
   share of the link capacity).  Another example would be a mechanism
   which allowed a flow to trade its fair-share service rate or its
   average packet loss rate for low queueing delay [RFC1046].  Trading
   rate for low delay could be achieved by giving a flow delay priority
   within a token bucket whose token rate was less than the flow's fair-
   share service rate.  Trading loss for low delay could be achieved by
   queueing the flow in a delay-prioritized queue with a small per-flow
   buffer slot quota.  Such a capability might be useful for low-
   throughput interactive applications like IP telephony.
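
   As an illustration of trading loss for low delay, the following
   sketch models a delay-prioritized queue with a small per-flow buffer
   slot quota.  The class LowDelayQueue and the quota value are
   hypothetical; the sketch assumes that the scheduler (not shown)
   serves this queue ahead of the normal best-effort queues.

```python
from collections import deque


class LowDelayQueue:
    """Delay-prioritized queue with a small per-flow buffer slot quota."""

    def __init__(self, quota: int) -> None:
        self.quota = quota        # max packets buffered per flow
        self.packets = deque()    # (flow_id, packet) in arrival order
        self.held = {}            # flow_id -> packets currently buffered

    def enqueue(self, flow_id: str, packet: bytes) -> bool:
        """Buffer the packet, or drop it if the flow's quota is exhausted."""
        if self.held.get(flow_id, 0) >= self.quota:
            return False          # loss is the price paid for low delay
        self.held[flow_id] = self.held.get(flow_id, 0) + 1
        self.packets.append((flow_id, packet))
        return True

    def dequeue(self):
        """Return the oldest buffered (flow_id, packet), or None if empty."""
        if not self.packets:
            return None
        flow_id, packet = self.packets.popleft()
        self.held[flow_id] -= 1
        return flow_id, packet
```

   Because each flow can hold at most `quota` slots, the queueing delay
   seen by any packet in this queue is bounded by the small total
   backlog, at the cost of a higher drop probability during bursts.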

   An alternative mechanism would allow a flow to trade queueing delay
   or service rate for lower loss.  The former could be achieved by
   queueing the packet in a queue with a larger per-flow quota but with
   low delay priority.  The latter could be achieved by queueing the
   flow in a larger buffer with a lower fair-share service rate.  This
   capability might be useful for some short-term transaction traffic
   (e.g., RPC, some WWW) which is insensitive to queueing delay but
   which is sensitive to RTT delays.

   The third alternative, allowing a flow to trade delay or loss for an
   improved service rate, does not make sense in the context of a
   congestion-controlled best-effort network (see Sec. 6).

   It is not known to the author whether the substitute best-effort
   mechanisms proposed have been researched, and whether they degrade
   fairness and stability within a best-effort network.  Furthermore,
   although we have discussed the transmission service mechanisms in the
   context of per-flow queueing and buffer management, in fact one of
   the goals of packet marking differentiated services is to eliminate
   per-flow state in the core of the network.  Aggregate queueing and
   buffer management mechanisms which provide differentiated transport
   services may suffer from fairness problems within a service class
   similar to those of the current best-effort Internet with single FIFO
   queues [Floyd97].

4.3.2  Path Selection Mechanism Categorization

   We can categorize path selection mechanisms using the same framework
   as was used in Sec. 4.3.1.  A provisioned biased path selection
   mechanism would compute a route based on metrics (e.g., delay, loss,
   link rate, and transmission cost) that would suit the requirements of
   a particular class of traffic.  The flows that had access to these
   paths would be authorized prior to entry, and their traffic would be
   regulated.  The paths would be provisioned to satisfy the regulated
   traffic load (perhaps using statistical assumptions).  The "out-of-
   class" traffic taking these paths would also be regulated to preserve
   the service level of the in-class flows.

   Non-provisioned biased path selection mechanisms (which also fit our
   definition of substitute fair mechanisms) would not utilize per-flow
   authorization and traffic regulation.  Examples include computing
   paths which avoid satellite hops for delay sensitive traffic, or
   which avoid wireless hops for loss sensitive traffic.  Whether these
   alternative paths would actually improve the service of the flows
   which took them may depend on the relative load on the paths from
   other traffic flows.  The assumption that would justify their use by
   non-regulated flows is that these paths are in some other way
   inferior to the normal shortest-hop path (longer delay, higher loss,
   or lower link rate).


5.  Service Categorization

   Marking a packet invokes a particular router or host differentiation
   mechanism on that packet.  This facility is used to instantiate a
   service for a flow of traffic.  Some of the packet marking
   applications discussed in Sec. 3 imply a specific differentiation
   mechanism (e.g., FECN); others imply a general service from the
   network (e.g., precedence) without implying any particular
   differentiation mechanism implementation.

   In this section we propose a set of criteria for categorizing
   differentiated service implementations.

5.1  Service Granularity

   We can categorize differentiated services implementations by the
   granularity at which they act (at which they differentiate transport
   performance).  At the finest level of granularity are per-packet
   mechanisms such as the services described in Sec. 3.3 and 3.4, where
   packets are marked based on the characteristics of the corresponding
   packet flow relative to some traffic specification.  These
   differentiation mechanisms may be invoked for a subset of the flow's
   packets, with the aggregate effect (and its interaction with host
   congestion control) delivering the desired service.  FECN is another
   example of a per-packet differentiation service.

   Per-flow differentiated services utilize the same packet marking for
   each packet of a flow.  Examples of this type include explicit delay/
   drop/network control priority, explicit service class indication, and
   service-based route selection.  This granularity can be relaxed to
   provide per-source host differentiation, where all of the packets
   transmitted from a particular source receive the same packet marking
   (or in the case of the schemes described in Sec. 3.3, the individual
   flows of the source are not distinguished).  Per-source
   differentiation is particularly suitable when there is no need for
   per-application differentiation (for example when all of the source's
   flows have homogeneous service requirements).  Note that the
   distinction between per-packet services and per-flow/source services
   is not crisp.

   Per-network differentiated services act on the aggregate of flows
   from a particular cluster of nodes, from a particular subnet, or from
   a particular site (e.g., VPN service).  As is the case for per-source
   services, per-network services are appropriate whenever the
   aggregated flows have homogeneous service requirements.

   Finally, we include per-receiver services such as that described in
   [Clark97] (and also RSVP aggregation).  This class of services would
   require tight integration with host congestion control or network
   policing mechanisms to ensure appropriate behavior (i.e., reduction
   in transmission rate due to congestion experienced by out-of-profile
   packets).
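
   The coarser granularities above can be viewed as different choices
   of the key under which packets are aggregated.  The sketch below is
   illustrative only: the Packet type, the granularity names, and the
   default /24 prefix length are assumptions, and per-packet services
   are omitted because they do not reduce to a fixed key.

```python
import ipaddress
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


def aggregation_key(pkt: Packet, granularity: str, prefix_len: int = 24):
    """Return the key under which `pkt` is grouped at the given granularity."""
    if granularity == "flow":
        return tuple(pkt)                 # full 5-tuple
    if granularity == "source":
        return pkt.src                    # all flows of one source host
    if granularity == "network":
        # All hosts in the source's subnet; the prefix length is assumed.
        net = ipaddress.ip_network(f"{pkt.src}/{prefix_len}", strict=False)
        return str(net)
    if granularity == "receiver":
        return pkt.dst                    # all flows toward one receiver
    raise ValueError(f"unknown granularity: {granularity}")
```

   Moving from "flow" to "network" shrinks the number of distinct keys,
   and therefore the amount of state, that a node must maintain.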

5.2  Service Invocation

   Another dimension of service categorization is the point of service
   invocation.  The earliest point of possible invocation is at the
   source (at the application layer).  Examples of possible source-
   invoked services are explicit delay/drop/network control priority,
   explicit service class indication, and Integrated Services conformant
   packet indication.  Source-invoked services are particularly useful
   where end-to-end differentiated service is required, since this
   exposes the service interface to the application [QOSP].

   An alternative service invocation point is at some point(s) within
   the network.  Examples here may include the services described in
   Sec. 3.2 and 3.3, Integrated Services non-conformant packet
   indication and aggregation, and FECN.  The service may be invoked on
   any granularity of traffic (see Sec. 5.1) and requires configuration
   within the network to identify the flows or aggregate of flows to
   which the service should be applied.  The scope of such a service is
   usually intermediate (across a network or network-to-network [QOSP]).
   Network-invoked services are useful whenever network authorization
   and policing are required, or whenever a set of flows with
   homogeneous service requirements can be aggregated.

   A hybrid invocation model would permit the source to set the PH field
   to request a particular differentiated service, while allowing the
   network to authorize and police the traffic from any source which is
   allowed to utilize the service.  Such a service model might permit
   less granular configuration and authorization state in the network
   (i.e., no per-flow and only per-source state).

   Receiver-invocation of differentiated service is also possible, but
   requires some signaling mechanism to allow the receiver to control
   the sending rate of a source or the packet markings used across the
   network.  The canonical example of a receiver-invoked service is
   the Integrated Services via RSVP signaling.  Another example is ECN
   via a receiver-generated BECN (which could be influenced by a
   receiver profile meter as described in [Clark97]).  A receiver-to-
   network signaling protocol similar to RSVP which did not rely on the
   appropriate behavior of the source to enable the differentiated
   service (or make it deployable) is conceivable, although the author
   is not aware of a proposal for such a signaling mechanism in the
   context of packet marking differentiated services.

   Service invocation can also be characterized in time as well as in
   space.  An application, source, or group of sources may have
   negotiated an ongoing arrangement with a service provider to provide
   a differentiated service for marked packets.  This type of static
   service allocation may involve a variety of time- and destination-
   specific constraints which limit service availability, but it does
   not require signaling or any form of immediate configuration to
   permit utilization of the service; sources meeting these constraints
   as well as constraints on traffic levels may begin to utilize the
   service immediately by marking packets (or the network may
   automatically mark them).  In contrast, dynamic service allocation
   involves some form of pre-negotiation (i.e., via signaling) between
   the source(s)/receiver(s) and the network for service prior to
   availability.  This negotiation may involve service start/stop times,
   traffic levels, service characteristics, pricing, etc.  The network
   will be required to dynamically configure authorization and policing
   policy mechanisms to instantiate the service, and may also have to
   dynamically provision resources within the network interior.

5.3  Service Behavior

   A differentiated service's behavior can be categorized by whether it
   is biased or offers a substitute best-effort service.  Biased
   services can be further categorized as to whether they are
   provisioned or non-provisioned.  For a more detailed discussion of
   the differences see Sec. 4.3.

   A major implementation issue with biased services is the need to
   regulate the amount of traffic which can invoke them, usually via
   source traffic shaping, or network authorization and traffic
   policing, as well as appropriate provisioning within the network.
   This is required to satisfy whatever delay/loss/throughput
   performance guarantees are associated with the service.  The behavior
   of the service in the presence of non-conformant traffic can be
   characterized as to how the non-conformant traffic is handled.  Such
   traffic could be discarded automatically by the network [TWOBIT], or
   it could be handled with lower priority and suffer a higher
   probability of loss or delay [Clark97].  The effect of the presence
   of non-conformant traffic on the conformant subset is also relevant.

   In general, a provisioned biased differentiated service could be
   defined as a set of probability density functions of packet delay,
   loss, and flow throughput relative to some statistical traffic model
   (and time interval).  In practice, however, determining this level of
   information detail for each differentiated services customer would be
   difficult if not impossible.  From these density functions service
   assurance levels, measured as the probability of service
   availability, could be inferred, although without stationarity
   assumptions the service failure modes could not be predicted.  A
   differentiated services user will be primarily interested in the
   degradation behavior of the service.  Differentiated services
   implementations can be characterized by whether service failure
   (e.g., due to under-provisioning or network infrastructure failure)
   results in a soft degradation in the delay/loss/throughput metrics,
   a complete degradation to traditional best-effort service, or total
   interconnectivity failure.  Furthermore, implementations can be
   characterized by the promised maximal duration and frequency of
   service failure.
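
   Where the full density functions are unavailable, an assurance level
   can at least be estimated empirically from measurements.  The
   following sketch is one hypothetical way to do so: the function name,
   the per-interval sampling, and the delay and loss bounds are all
   assumptions, and the estimate is only meaningful to the extent that
   the measured intervals are representative.

```python
def assurance_level(delays_ms, losses, delay_bound_ms, loss_bound):
    """Fraction of measurement intervals in which the service met its bounds.

    delays_ms[i] and losses[i] are the measured delay and loss ratio for
    interval i; both bounds are hypothetical service targets.
    """
    if len(delays_ms) != len(losses) or not delays_ms:
        raise ValueError("need equal-length, non-empty samples")
    met = sum(1 for d, l in zip(delays_ms, losses)
              if d <= delay_bound_ms and l <= loss_bound)
    return met / len(delays_ms)
```

   Such an estimate says nothing about how the service degrades in the
   intervals where the bounds are missed; that must be characterized
   separately.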

5.4  Direction of Value

   An important means of categorizing differentiated services is by
   examining in which direction value flows when a differentiated
   service is provided for a packet flow (either to the source or to the
   receiver).  In any point-to-point exchange of traffic there is
   usually a benefit to both ends of the conversation; however, it is
   often the case that the level of direct benefit to one party exceeds
   the level to the other.  This is important to a service provider as
   it might be more appropriate to implement pricing policies which
   target the primary beneficiary.

   Most of the packet marking applications examined provide benefit to
   a source of traffic by preferentially handling that source's packets
   for improved transport performance.  Since these mechanisms are not
   necessarily destination-specific, they can be viewed as primarily
   benefiting the source.  As such one would expect that the source (or
   its proxy) would be the entity charged for the differentiated service
   (e.g., source-purchased traffic profile).  When the traffic profile
   is associated with a particular set of destinations, and when
   reverse-path services are utilized, we can consider the value to be
   bi-directional, and the charges for the service can be distributed
   between the end-points (often the same organizational entity).

   The model of source-pricing of differentiated service may not suit
   WWW-based information delivery, since the value of service flows
   primarily to the receiver of information and there is no incentive
   for the information source to request service differentiation for
   a particular subset of receivers (this changes if there is some
   charge to the receiver associated with information retrieval).  For
   such applications a receiver-invoked service such as described in
   [Clark97] may be most appropriate.  Signaling across the network to
   (or near) the source to initiate differentiation may permit more
   sophisticated receiver-invoked services (e.g., RSVP).  However, there
   must likely be some associated settlement mechanism to incent the
   service provider or source to deploy such a protocol, and scalability
   and interoperability factors must also be weighed.

   An interesting problem arises for multicast applications where the
   receiver is the main beneficiary (arguably commercial broadcast
   applications are an example).  When packet marking is invoked at or
   near the traffic source, all receivers of the transmission receive
   the benefit of the differentiated service (if we assume that the
   network does not remark the packet along different branches of the
   multicast path).  This may deliver greater value to some receivers
   than they are willing to pay for.  Conversely, if the service
   provider charges more for differentiated multicast service, this may
   make it difficult for the source to provide the desired service to
   one or more particular receivers (in an efficient way).  Receiver-
   invoked service mechanisms such as described in [Clark97] may scale
   poorly in a multicast environment due to BECN implosion at the
   source.  Also, Integrated Services aggregation suffers a variety of
   scaling and heterogeneity problems for IP multicast reservations,
   since the granularity of service is often too coarse (and due to
   control-plane scaling problems).


6.  Fairness and Congestion Control Considerations

   In the absence of traffic regulation and associated network
   provisioning, the stability of the Internet still depends on the use
   of cooperative congestion control by all applications [CCBES,
   Floyd97].  This is true even for application flows (or packet
   subsets) which specify (non-provisioned) drop-preference service.  A
   form of congestion collapse can occur in the Internet if applications
   rely on the network to discard excess packets and do not implement
   closed-loop congestion control, because packets which will later be
   discarded downstream could utilize bandwidth on a link which could be
   better utilized by non-drop-preference congestion-controlled flows
   (e.g., normal TCP) [Floyd97].  The normal router mechanisms which
   would allow the non-drop-preference traffic to ramp up to the link-
   rate (e.g., weighted RED) may not function effectively if the drop-
   preference traffic is not well-behaved.  One solution to this problem
   might be to introduce aggregate bounds on the amount of drop-
   preference traffic transmitted which would incent the application not
   to abuse the service.  However, if a drop-preference marked packet
   has such a low probability of being delivered (due to aggregate
   constraints), the drop-preference facility is not very useful.

   A similar form of congestion collapse can occur if badly behaved
   applications which advertise ECN-capability do not respond to a
   receiver BECN.  This is because the router gives these packets
   preferential drop priority.  This could allow non-conformant
   transport protocols to achieve better throughput than conformant ECN-
   capable transports and non-ECN-capable transports.  Although it is
   not clear whether ECN introduces a fairness problem that is any worse
   than the existing problem of badly behaved transports, its deployment
   should be approached cautiously.  One way to alleviate these fairness
   problems might be to implement fairness enforcement mechanisms such
   as described in [Floyd97] and [FRED] (note that these mechanisms
   might contradict the scalability objectives addressed by packet
   marking).

   In Sec. 4.3 we introduced the concept of a substitute best-effort
   service.  Because substitute best-effort differentiation mechanisms
   provide short-term biasing at the expense of long-term throughput,
   delay, or loss, the router implementation must take active measures
   to ensure that these mechanisms do not jeopardize network fairness.
   The mechanisms should be engineered to ensure that sources cannot
   achieve an unfair share of network resources by modulating between
   substitute best-effort services.  Badly behaved applications should
   not be able to achieve better throughput (or to further degrade
   the service of other flows) by selecting a substitute best-effort
   service.  One possible means of achieving these objectives is to
   make the mechanisms non-work-conserving, thereby incenting the
   application to select these substitute services only if the trade-off
   they provide is absolutely beneficial, and to penalize badly behaved
   applications which select these services.

   One of the scalability objectives of packet marking differentiated
   services is to eliminate per-flow state in the core of the network.
   We have examined a hypothetical per-flow router transmission system
   to highlight how differentiation might be provided.  However, in a
   scalable system, only aggregated state would be maintained (or per-
   flow state would only be maintained for a small subset of the
   active flows).  Aggregated buffer management and queueing
   implementations may suffer the same fairness problems between flows
   within a service class as exist today with best-effort traffic
   and single-queue FIFO routers.  Provisioning and traffic regulation
   might alleviate these problems, but techniques such as described in
   [Floyd97] and [FRED] might also be required in some circumstances.


7.  Provisioning Considerations

   We have used the term "provisioning" in this document to describe the
   deployment and assignment of network resources for the exclusive or
   preferential use by certain (sets of) traffic flows.  Aggregate
   differentiation mechanisms in and of themselves cannot deliver a
   quantifiable service without constraints on the aggregate amount of
   traffic which invokes those mechanisms.  The resource allocation
   policy can be implemented in each interior router, for example by
   service class-specific queues with provisioned minimal bandwidth
   levels and buffer quotas.  Alternatively, resource allocation can be
   implemented more globally, relying on traffic authorization and
   policing at the network edge and stateless differentiation mechanisms
   in the network interior.  Which choice is preferred depends on
   scalability concerns, the aggregate amount of traffic utilizing a
   particular differentiated service, as well as the level of
   statistical performance assurance associated with the service.

   A basic motivation of differentiated services is to provide tiered
   levels of statistical assurance of service for a particular traffic
   load, with tiered pricing to match the service provider cost and
   customer utility associated with each level.  Service assurance was
   discussed in detail in Sec. 5.3.  To elaborate on that discussion,
   the statistical assurance of a differentiated service depends upon
   uncertainty about the interior transit path taken by appropriately
   marked packets, on the statistical multiplexing gain assumed in the
   service allocation policy, and on the instantaneous behavior of other
   differentiated services users.  As such, service quality is
   potentially strongly time-dependent.

   When provisioning a differentiated service, a provider must take into
   account the dimensioning of the network, as well as statistical
   models of customer activity and traffic levels.  Defining a service
   allocation policy which satisfies a particular statistical assurance
   level is equivalent to an admission control problem.  The primary
   design choices in a service allocation policy are static vs. dynamic
   allocation, and domain-wide vs. hop-by-hop admission control.  When
   using a static allocation policy, a provider must provision more
   resources than when using a dynamic allocation policy to achieve the
   same level of statistical assurance since the admission control
   decision must be made on historical data or traffic models and not on
   instantaneous measurements of network activity.  However, dynamic
   admission control requires some means of signaling and/or dynamic
   configuration to convey the service request to the network (and its
   affected elements).  This dynamic admission control decision could be
   made by a centralized administrative entity (e.g., the Bandwidth
   Broker in [TWOBIT]) which bases its decision on a domain-wide view of
   existing service allocations (and possibly a coarse view of the
   instantaneous traffic activity).  Alternatively, the decision could
   be made at each node along the hop-by-hop path taken by the affected
   packets (e.g., the RSVP/Integrated Services model).  It is difficult
   to provide a strict guarantee of service along an unspecified path
   at an unspecified time [Clark97], which implies that services which
   promise strict performance guarantees will usually include
   constraints on the available destinations or network egresses as well
   as the interval of service availability.  Note also that the choices
   between static vs. dynamic allocation and domain-wide vs. hop-by-hop
   dynamic admission control are not mutually exclusive.
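
   A dynamic, domain-wide admission decision of the kind attributed to a
   centralized entity above might be sketched as follows.  The class
   DomainAdmission, the link names, and the assumption that the path of
   each request is known in advance are all hypothetical; no
   statistical multiplexing gain is modeled.

```python
class DomainAdmission:
    """Centralized admission control over per-link provisioned capacity."""

    def __init__(self, capacity: dict) -> None:
        self.capacity = dict(capacity)              # link -> provisioned rate
        self.allocated = {link: 0.0 for link in capacity}

    def admit(self, path: list, rate: float) -> bool:
        """Admit the request only if every link on `path` has room for it."""
        if any(self.allocated[l] + rate > self.capacity[l] for l in path):
            return False
        for l in path:
            self.allocated[l] += rate
        return True

    def release(self, path: list, rate: float) -> None:
        """Return a previously admitted allocation to the pool."""
        for l in path:
            self.allocated[l] -= rate
```

   A hop-by-hop model would instead keep only the local entry of
   `allocated` at each node and make the same comparison independently
   as the request travels along the path.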

   Implementation of a dynamic, domain-wide admission control policy, as
   well as long term service planning, depends on the availability of
   statistics on service utilization and performance.  Means of
   capturing the characteristics of marked traffic, such as the
   utilization of a particular service class or differentiation
   mechanism, packet discard distributions, queue delay distributions,
   etc., are required (e.g., via new router MIB variables).  Service
   providers and customers may need to deploy test and measurement
   applications to characterize and validate the assurance level of a
   service.  These mechanisms may also be needed to facilitate inter-
   provider monitoring and settlement.

   Service provisioning must also take into account the scalability of
   the mechanisms used to provide the service.  Scalability may be
   affected by the amount of configuration state, network monitoring
   state, router processing and transmission state, and dynamic service
   signaling traffic levels and state which is required for a particular
   service implementation.


8.  Authorization Considerations

   Authorization of use of packet marking-based biased differentiated
   services is required to permit any level of service assurance.
   Authorization is required at whichever point(s) in the network
   the service is invoked (see Sec. 5.2).  We break the authorization
   problem down into the components of packet classification, traffic
   policing, and PH marking.

   The packet classification component matches received packets to
   statically or dynamically allocated service profiles, based on any
   combination of per-source address/subnet, per-destination address/
   subnet, or per-flow packet header filters.  The classification
   function may be deployed on site subnet boundaries, on site backbone
   boundaries, on the site border to a service provider, on provider
   ingress boundaries, and/or on inter-provider boundaries.  The
   granularity of packet classification will generally be relaxed as
   the classification component moves into the interior of the network
   to facilitate scalability.  The classification component may also
   honor source service requests (based on the PH value set by the
   source).

   The traffic policing component measures the instantaneous load of
   packets matching a classification entry relative to some traffic
   profile.  This traffic profile is configured based on some source/
   network or network/network agreement on the amounts and signature of
   marked traffic.  The traffic policing component could be based on a
   simple token bucket filter [TWOBIT], or on a more sophisticated
   monitoring function which takes into account the congestion-control
   behavior of TCP [Feng97, Clark97].  Non-conformant packets may be
   discarded, depending on the service policy.

   The PH marking component sets the PH field of each service profile-
   conformant packet to invoke the differentiation mechanism(s) deployed
   within the network to instantiate the appropriate service.  Non-
   conformant packets may be re-marked, depending on the service policy.
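
   The three components can be composed into a single per-packet
   decision.  The sketch below is a simplification under stated
   assumptions: the Profile type and the PH values are hypothetical,
   classification is reduced to a textual source-prefix match, the
   policer is a simple token bucket, and unclassified packets are
   assumed to receive a default PH value of 0.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    src_prefix: str      # classification filter (string prefix, for brevity)
    rate: float          # token rate, bytes per second
    depth: float         # bucket depth, bytes
    in_ph: int           # PH value for conformant packets
    out_ph: int          # PH value for re-marked packets; -1 means discard
    tokens: float = 0.0
    last: float = 0.0


def authorize(profiles, src: str, size: int, now: float):
    """Return the PH value to set, or None if the packet is discarded."""
    for p in profiles:                           # packet classification
        if src.startswith(p.src_prefix):
            break
    else:
        return 0                                 # no profile: default PH
    # Traffic policing: refill the token bucket, then test conformance.
    p.tokens = min(p.depth, p.tokens + (now - p.last) * p.rate)
    p.last = now
    if size <= p.tokens:
        p.tokens -= size
        return p.in_ph                           # PH marking, conformant
    return None if p.out_ph < 0 else p.out_ph    # discard or re-mark
```

   Whether a non-conformant packet is discarded or re-marked is decided
   purely by the profile, which reflects the service policy agreed
   between the source and the network.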

   The classification, policing, and PH marking components within a
   network must be configured whenever a service is allocated to a
   source or set of sources.  This may involve manual configuration for
   statically allocated services, or dynamic signaling (e.g., SNMP,
   RSVP) for dynamically allocated services.

   Interoperability of packet marking differentiated services between
   service providers depends on a joint agreement on PH semantics,
   traffic profiles, and authorization policies for each service
   supported.


9.  Routing Considerations

   Service assurance may depend on the stability of the routing system
   (e.g., the prevalence of routing flap or the frequency of routing
   melt-down).  The deployment of service-based routing (Sec. 3.2.4)
   introduces a variety of additional routing considerations.  One issue
   involves the existence of an incomplete service-specific path between
   a source and destination (or across a domain which deploys service-
   specific routing).  This incomplete path might exist due to router
   misconfiguration, due to different policy decisions among service
   providers, or due to routing transients.  In the event that a
   service-specific route is not available at a router along the transit
   path, we assume that the default routing entry is followed [RFC1583].
   The problem occurs when the service-specific and default routes are
   calculated using a different set of metrics.  It may be possible that
   if the default route is followed, then the packet may loop back to
   a node which has a matching service-specific route entry, and a
   stable routing loop may form.  Although the effects of router
   misconfiguration and routing transients are hard to mitigate, this
   behavior may be particularly onerous since it could be hard to
   detect.  This problem could be avoided if, whenever a router which
   implements service-specific routing has to forward a packet (with a
   PH marking associated with a service-specific routing class) using
   the default routing entry, the PH field is reset to indicate the
   default routing class.  This solution avoids the stable routing loop
   problem; however, if the same PH bits are overloaded to specify both
   service class queueing and route selection (based on some request
   such as "Minimize Delay"), then whenever these PH bits are reset, the
   service class queueing indicators are erased, and the service
   provided to the flow may be degraded at downstream nodes.
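
   The PH-reset rule proposed above can be sketched as a forwarding
   function.  The names forward and DEFAULT_CLASS, the value 0 for the
   default routing class, and the exact-match destination lookup are
   assumptions made for brevity; the sketch also assumes that a default
   entry always exists for the destination.

```python
DEFAULT_CLASS = 0   # hypothetical PH value for the default routing class


def forward(routes: dict, dest: str, ph: int):
    """Return (next_hop, ph) for a packet, resetting PH on fallback.

    `routes` maps (dest, routing_class) -> next hop.  Destination matching
    is exact here for brevity; a real router would use longest-prefix match.
    """
    if (dest, ph) in routes:
        return routes[(dest, ph)], ph        # service-specific route
    # Fall back to the default route and reset the PH field so that no
    # downstream router re-applies a service-specific route.
    return routes[(dest, DEFAULT_CLASS)], DEFAULT_CLASS
```

   Once the PH field has been reset, every downstream router forwards
   the packet along default routes only, which is what prevents the
   stable loop; it is also why any queueing indication carried in the
   same bits is lost.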

   Another issue is the choice of behavior in the event that a matching
   service-specific routing entry is not available.  The basic choices,
   "Strong TOS", "Weak TOS", and "Very Weak TOS", are defined in
   [RFC1349].  The "Strong TOS" model requires that a router only
   forward a packet if a matching service-specific route is available
   (otherwise the packet is discarded).  The "Weak TOS" model requires
   that the router use the best-matching default routing entry if a
   matching service-specific route is not available (this is the
   behavior assumed in the example above).  The "Very Weak TOS" model
   requires that the router attempt to utilize the best-matching,
   numerically lowest TOS entry if neither a matching service-specific
   nor a matching default entry is available.  The "Very Weak TOS"
   model only makes sense if the services are somehow ranked in
   numerical order of precedence.  Historically, the "Weak TOS" model
   has been favored, since the "Strong TOS" model may penalize packets
   utilizing a service class when service-specific routing is not
   deployed or a particular service-specific path is broken [RFC1349].
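
   The three fallback models can be compared with a short sketch.  It
   assumes that the entries passed in are the routing entries for the
   best-matching destination prefix, and that TOS value 0 denotes the
   default entry; the function and model names are illustrative only.

```python
# Hypothetical sketch of the fallback choices for one destination.
# entries maps a TOS value to a next hop; TOS 0 is the default entry.

def select_route(entries, tos, model):
    """Pick a next hop under the "strong", "weak", or "very_weak" model.

    Returns None when the packet would be discarded.
    """
    if tos in entries:
        return entries[tos]              # matching service-specific route
    if model == "strong":
        return None                      # no match: discard
    if 0 in entries:
        return entries[0]                # fall back to the default entry
    if model == "very_weak" and entries:
        return entries[min(entries)]     # numerically lowest TOS entry
    return None
```

   The models differ only when no matching entry exists: "strong"
   discards, "weak" requires a default entry, and "very_weak" accepts
   any remaining entry, which is why it presumes a meaningful numerical
   ordering of the services.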

   Service-specific routing adds a criterion to the route lookup
   algorithm: the service class match.
   The route lookup algorithm may choose to prefer an exact matching
   service class value above a longest-prefix address match which does
   not support the specific service (e.g., the default service class).
   Whichever criterion is preferred must be standardized to prevent the
   formation of routing loops amongst routers which implement contrary
   policies.  However, both [RFC1349] and [RFC1583] mandate the use of
   the longest-prefix match as the preferred criterion, as this appears
   to be the more robust option.
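
   A minimal sketch of a lookup which prefers the longest-prefix match
   follows.  The route tuple layout, the DEFAULT_CLASS value, and the
   function name are assumptions made for illustration; this is not
   the lookup procedure of any particular routing protocol.

```python
import ipaddress

# Hypothetical sketch: longest-prefix match is applied first, and the
# service class is matched only among routes sharing that prefix.
# The route tuple layout (prefix, service_class, next_hop) is assumed.

DEFAULT_CLASS = 0


def lookup(routes, dest, service_class):
    addr = ipaddress.ip_address(dest)
    matching = [r for r in routes
                if addr in ipaddress.ip_network(r[0])]
    if not matching:
        return None
    longest = max(ipaddress.ip_network(r[0]).prefixlen for r in matching)
    best = {r[1]: r[2] for r in matching
            if ipaddress.ip_network(r[0]).prefixlen == longest}
    # Prefer the exact service class, else the default class entry.
    return best.get(service_class, best.get(DEFAULT_CLASS))
```

   In this sketch a class 4 route for 10.0.0.0/8 is not used for a
   class 4 packet destined to 10.1.2.3 when a default-class route for
   10.1.0.0/16 exists, since the longer prefix takes precedence over
   the service class match.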

   Whenever service-specific routing is deployed, interoperability
   between service providers must be considered.  There must exist some
   compatibility between service-specific route calculation mechanisms
   in the deployed IGP and EGP routing protocols to prevent interdomain
   routing loops, and peering service providers must agree to implement
   compatible policies (including the resetting of routing-sensitive PH
   bits) to avoid routing loops or sub-optimal routing paths.

   PH bits which affect route selection should not be modified
   dynamically within a flow (on a per-packet basis) since this may
   affect packet ordering and RTT estimation.  Best-effort service
   allocation mechanisms such as described in Sec. 3.3 should not
   utilize routing-sensitive PH bit combinations to indicate the
   conformance of a packet.

   Deployment of service-specific routing may introduce scalability
   issues due to the increased amount of routing protocol state
   maintained in, as well as the increased amount of routing table
   computations performed by, the network routers.


10. System Implementation Considerations

   We reiterate that the goal of packet marking is to provide a
   simplified, scalable mechanism for invoking service differentiation
   which avoids per-flow state in the interior of the network to the
   maximum extent possible, as this appears to be a more scalable
   approach than alternatives such as the RSVP/Integrated Services
   model.  Marked packets are handled as an aggregate.  Note that the
   link-fairness model described in Sec. 4.3.1 is an idealized example
   which implies far too much dynamic per-flow state for practical
   deployment on high-speed nodes.  Note also that aggregate
   differentiation mechanisms may suffer fairness problems within a
   service class (see Sec. 6).

   Various differentiation mechanisms may introduce performance and
   scalability problems within a router implementation.  One particular
   example is the impact on router forwarding implementations which
   rely on dynamic per-flow caching of forwarding state (e.g., IPv6
   Source Address/Flow Label caching as described in [RFC1883]).  Such
   implementations may enjoy a performance advantage since the first
   packet of a flow is searched using the traditional router forwarding
   and classification algorithms to determine the next outgoing link,
   the appropriate service class, the appropriate delay/drop priority,
   etc., while subsequent packets of the flow can be forwarded using a
   cache lookup which can usually be performed using an O(1) algorithm
   (although this advantage may come at the cost of scalability in terms
   of the number of simultaneous flows supported at wire-speed).  The
   caching algorithm as described in [RFC1883] assumes that the PH field
   of a packet remains constant within the six second caching window of
   a flow.  If the PH field is used to affect the delay or drop priority
   of a packet, and if the PH field is modified dynamically to indicate
   conformance of the packet to some service profile (see Sec. 3.3),
   then the caching algorithm may prevent the router from taking this
   indication into account.  This problem can be avoided if the PH field
   is defined as part of the cache lookup key.  Modified values in the
   PH field signify a separate "flow" which will require traditional
   classification (at least for the first modified packet header).
   Alternatively, any differentiation mechanism which is determined by
   the PH value may be excluded from the set of cached state and checked
   for each individual packet.
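
   The first alternative (making the PH field part of the cache lookup
   key) can be sketched as follows.  The class name, the classify()
   callback standing in for the traditional forwarding path, and the
   omission of cache entry expiry are all simplifying assumptions.

```python
# Hypothetical sketch of a per-flow forwarding cache whose lookup key
# includes the PH field. classify() stands in for the full (slow)
# forwarding and classification path and is an assumption.

class FlowCache:
    def __init__(self, classify):
        self._classify = classify
        self._cache = {}
        self.slow_path_lookups = 0

    def forward(self, src, flow_label, ph):
        key = (src, flow_label, ph)      # PH is part of the key
        state = self._cache.get(key)
        if state is None:
            # First packet with this PH value: full classification.
            self.slow_path_lookups += 1
            state = self._classify(src, flow_label, ph)
            self._cache[key] = state
        return state
```

   A packet of an existing flow which arrives with a modified PH value
   misses the cache once and is then cached under its own key, at the
   cost of one additional cache entry per PH value in use.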

   Another example is the requirement to recompute the IPv4 header
   checksum whenever the PH field is modified.  This would be required
   for profile meters as defined for example in [Clark97], [SIMA], and
   [TWOBIT].  This would also be required for IPv4 routers which deploy
   FECN.  IPv4 routers already recompute the IPv4 header checksum
   whenever they decrement the TTL of a packet.  However, FECN
   introduces a potential implementation constraint for routers which
   utilize distributed forwarding across a switching fabric, since the
   processing component which performs routing and packet classification
   and which decrements the packet's TTL may lie across a switching
   fabric from the output interface queues.  In general, the processing
   component which recomputes the IPv4 header checksum must have
   knowledge of the state of the targeted output queue whenever FECN is
   implemented.
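
   The checksum need not be recomputed over the whole header when only
   the PH field changes.  The sketch below uses the one's complement
   incremental update HC' = ~(~HC + ~m + m') (the form given in RFC
   1624), where m and m' are the old and new values of the 16-bit word
   containing the Precedence/TOS octet.  The function names and the
   assumption of a 20-byte header without options are illustrative.

```python
import struct

# Sketch of updating the IPv4 header checksum after the second header
# byte (Precedence/TOS, i.e. the PH field) is rewritten, using the
# one's-complement incremental update HC' = ~(~HC + ~m + m').

def ones_sum(words):
    """One's complement sum of 16-bit words, with end-around carry."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(header):
    """Checksum over a 20-byte header whose checksum field is zero."""
    return ~ones_sum(struct.unpack("!10H", header)) & 0xFFFF


def update_checksum(old_csum, old_word, new_word):
    """Incrementally update the checksum for one changed 16-bit word."""
    total = ones_sum([~old_csum & 0xFFFF, ~old_word & 0xFFFF, new_word])
    return ~total & 0xFFFF
```

   The update needs only the old checksum and the old and new values of
   the changed word, so it can be applied at whichever processing
   component finally sets the PH bits.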

   Router implementation complexity and performance scalability will be
   affected by the number of output interface queues which are
   implemented to provide service differentiation, as well as by the
   complexity of the scheduling algorithms used.  In addition, per-class
   memory requirements and the processing requirements to maintain per-
   class state will also have an impact. Maintenance of configuration
   parameters (e.g., for flow/source/destination classification) and
   network management counters (e.g., for service performance
   monitoring) may increase memory requirements and introduce additional
   performance constraints.

   Compatibility with application expectations for network behavior is
   critical.  Routers may implement aggregated service differentiation
   mechanisms using multiple queues.  As a consequence, modulating
   between different PH markings may cause different packets of a flow
   to be serviced using different queues, which may result in packet
   reordering.  Applications which modulate between PH markings (e.g.,
   to signify drop priority for multiple layers of a video signal) will
   generally expect packet ordering to be maintained.  Consequently,
   application-visible differentiation mechanisms, as well as network-
   invoked differentiation mechanisms should utilize sets of PH markings
   which are guaranteed to be serviced within the same queue (and with
   the same routing metrics).
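
   A configuration check for this constraint might look like the
   following sketch.  The PH values and the ph_to_queue mapping are
   assumed router configuration and do not correspond to any defined
   PH encoding.

```python
# Hypothetical sketch: check that every PH marking an application may
# modulate between is served by the same queue. The PH values and the
# ph_to_queue mapping are assumed configuration, not defined values.

def same_queue(ph_to_queue, markings):
    """Return True if all markings map to one queue, so that queue
    selection alone cannot reorder packets of the flow."""
    queues = {ph_to_queue[ph] for ph in markings}
    return len(queues) == 1
```

   For a mapping in which PH values 0 and 1 share a queue and value 2
   uses another, a flow alternating between 0 and 1 passes the check,
   while a flow alternating between 0 and 2 does not.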


11. Standardization Considerations

   Packet marked differentiated services cannot be deployed within the
   public Internet without some level of standardization.  In
   particular, the semantics of some of the PH bits must be defined to
   allow deployment of interoperable routers, authorization components,
   admission control components, and network management agents (we
   assume that some of the available PH bits may be reserved for
   network-specific use).  Specification of those PH bits which may be
   changed dynamically in-flight is needed to avoid packet reordering
   problems (see Sec. 10).  Specification of the PH bits which are
   allowed to affect route selection is required for interoperability of
   routing protocol implementations.  Specification of the PH bits which
   may be set by the application or source host (e.g., for substitute
   best-effort services) and which are not likely to be changed or reset
   in-flight is required for interoperable application development.
   Network MIB variables and dynamic signaling protocols necessary for
   service configuration and monitoring must be specified.  Furthermore,
   basic implementation requirements which are essential for the stable
   operation of the network should also be specified (e.g., thou shalt
   prevent drop-preference traffic from starving normal best-effort
   traffic).

   Incremental deployment strategies for packet marked differentiated
   services may be required if the IPv4 Precedence/TOS field semantics
   are redefined from their specification in [RFC795] and [RFC1349].
   This may be needed for example to preserve routing protocol traffic
   prioritization based on the IPv4 Precedence field in networks where
   the new PH semantics are incrementally honored.  This may also be
   required where a "worse-than Routine" drop priority level must be
   defined to implement a particular differentiated service, and packets
   arrive at the network which are not sourced or re-marked to use the
   new PH semantics and are instead marked using the "Routine"
   Precedence value (the routers may interpret the "Routine" Precedence
   value to indicate "worse-than Routine" drop priority).

   Another area of potential standardization is the interaction and
   compatibility between packet marked differentiated services and the
   traditional Integrated Services.  Also, service measurement
   methodologies may be defined and specified as a Best Current
   Practice.

   Interoperability of packet marked differentiated services between
   different service providers may require the standardization of the
   semantics and expected behavior of a small set of differentiation
   mechanisms and/or service classes to allow compatible exchange of
   traffic.

   Those aspects of packet marking which should remain implementation-
   dependent include the particular buffer management, scheduling, and
   authorization mechanisms and policies used to instantiate a set of
   differentiated services.


12. Security Considerations

   As discussed in Sec. 2, the widespread deployment of IP Security
   obscures the header fields which are traditionally used for per-flow
   packet classification.  Therefore, deployment of packet marking
   differentiated services eliminates a disincentive to the deployment
   of IP Security.

   Because the differentiation mechanisms which are deployed will likely
   introduce service bias, new denial-of-service attacks may be
   introduced.  As examples, host transport protocols which advertise
   ECN capability but which do not respond appropriately to a BECN may
   degrade the performance of other users and applications, as may
   unauthorized use of priority or service class indications.
   Unauthorized use of a network control priority indication may permit
   an attacker to severely degrade the performance of the network.
   Furthermore, an attack on the differentiated services authorization,
   signaling, or configuration mechanisms may permit theft-of-service or
   may enable a severe denial-of-service attack.  As a consequence,
   authorization, signaling, and configuration mechanisms must be
   strongly protected (e.g., by authentication).  Access to provisioned
   biased services must always be authorized, and routers must implement
   active measures (or intrinsic mechanism design) to enforce fairness
   amongst users of substitute best-effort services.  Network control
   priority in particular must be authorized, for example by always
   resetting the associated PH bit(s) on host access links (this may be
   difficult to implement on shared-media subnets), or by only honoring
   the network control priority indication from configured peers.
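
   Both approaches amount to an ingress check, sketched below.  The
   sketch assumes the IPv4 Precedence encoding (the top three bits of
   the TOS octet, with binary 111 meaning Network Control); the peer
   identifiers and the function name are hypothetical.

```python
# Hypothetical sketch: clear the network control priority indication on
# packets that do not arrive from a configured peer. It assumes the
# IPv4 Precedence encoding (top three bits of the TOS octet, with
# binary 111 meaning Network Control); the peer set is illustrative.

PRECEDENCE_MASK = 0xE0
NETWORK_CONTROL = 0xE0


def police_tos(tos, neighbor, trusted_peers):
    """Return the TOS octet to use after ingress policing."""
    is_control = (tos & PRECEDENCE_MASK) == NETWORK_CONTROL
    if is_control and neighbor not in trusted_peers:
        return tos & ~PRECEDENCE_MASK & 0xFF   # reset to Routine
    return tos
```

   Only the Precedence bits are cleared; the remaining bits of the
   octet pass through unchanged, and packets from configured peers are
   not modified.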

   The IP Security Authentication Header (AH) does not cover the IPv4
   Precedence/TOS field in the integrity check value computation [AH].
   This behavior is in fact essential for the deployment of network-
   invoked differentiated services where the source host is unaware of
   the PH value which will be delivered to the destination host, since
   it may be changed in-flight.  In the case where the source host is
   authorized to select the PH value, this AH behavior does not provide
   end-to-end authentication and integrity of the PH value.  The AH
   header format and integrity check value computation could be
   redefined to incorporate an application-selectable mask on the PH
   field which would allow the application to specify the particular PH
   bits which might require end-to-end authentication (so as to help
   detect denial-of-service attacks within the network).  However,
   end-to-end integrity of the PH field does not guarantee that a
   differentiated service has been delivered, since the network is free
   to ignore the PH field.  Separate measurement and assurance
   mechanisms are needed to ensure that any negotiated differentiated
   services are being provided.
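
   The idea of an application-selectable mask can be sketched as
   follows.  This is not the AH computation defined in [AH]; the use
   of HMAC-SHA-256, the input layout, and the function name are
   assumptions chosen only to illustrate the masking step.

```python
import hashlib
import hmac

# Hypothetical sketch of an integrity check value (ICV) that covers
# only the PH bits selected by a mask. This is NOT the AH computation
# defined in [AH]; HMAC-SHA-256, the key handling, and the input layout
# are assumptions chosen for illustration.

def masked_icv(key, ph, ph_mask, payload):
    protected = ph & ph_mask             # bits outside the mask are zeroed
    data = bytes([ph_mask, protected]) + payload
    return hmac.new(key, data, hashlib.sha256).digest()
```

   PH bits outside the mask may be changed in-flight without
   invalidating the value, while a change to any masked bit causes
   verification at the destination to fail.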


13. Acknowledgements

   The issues examined in this memo have been topics of discussion
   within the Internet community for many years.  As such, the author
   does not claim credit for the originality of any of the ideas herein,
   and has made an earnest attempt to reference their original
   proponents.  Assistance from the community in documenting the origins
   of these ideas is appreciated.

   The author would like to specifically acknowledge the assistance of
   Janet Andersen, Ed Bowen, Charles Burton, Ed Ellesson, Brian
   Haberman, and Hal Sandick.  The author would also like to thank Fred
   Baker and Steve Deering for insights obtained during both public and
   private conversations.


14. References

   [ACTIVE]   B. Braden et al., "Recommendations on Queue Management
              and Congestion Avoidance in the Internet", Internet Draft
              <draft-irtf-e2e-queue-mgt-recs.txt>, March 1997.

   [AH]       S. Kent and R. Atkinson, "IP Authentication Header",
              Internet Draft <draft-ietf-ipsec-auth-header-02.txt>,
              October 1997.

   [Bohn93]   R. Bohn, H. Braun, K. Claffy, and S. Wolff, "Mitigating
              the coming Internet crunch: multiple service levels via
              Precedence", submitted for publication, November 1993,
              ftp://ftp.sdsc.edu/pub/sdsc/anr/papers/precedence.ps.Z.

   [CBQ]      S. Floyd and V. Jacobson, "Link-sharing and Resource
              Management Models for Packet Networks", IEEE/ACM
              Transactions on Networking, Vol. 3 no. 4, pp. 365-386,
              August 1995.

   [CCBES]    C. Lefelhocz, B. Lyles, S. Shenker, and L. Zhang,
              "Congestion Control for Best-Effort Service: Why We Need a
              New Paradigm", IEEE Network, Vol. 10, no. 1, January 1996.

   [Clark97]  D. Clark and J. Wroclawski, "An Approach to Service
              Allocation in the Internet", Internet Draft
              <draft-clark-diff-svc-alloc-00.txt>, July 1997.

   [CLASSY]   S. Berson and S. Vincent, "A "Classy" Approach to
              Aggregation for Integrated Services", Internet Draft
              <draft-berson-classy-approach-00.txt>, March 1997.

   [Crow97]   J. Crowcroft, "All You Need Is Just One Bit", keynote
              presentation, IFIP Conf. on Protocols for High Speed
              Networks, October 1996,
              http://www.cs.ucl.ac.uk/staff/jon/hipparch/dollarbit.

   [ECN94]    S. Floyd, "TCP and Explicit Congestion Notification",
              ACM Computer Communications Review, Vol. 24 no. 5, pp.
              10-23, October 1994.

   [ECN97]    K. Ramakrishnan and S. Floyd, "A Proposal to Add Explicit
              Congestion Notification (ECN) to IPv6 and to TCP",
              Internet Draft <draft-kksjf-ecn-00.txt>, November 1997.

   [ESP]      S. Kent and R. Atkinson, "IP Encapsulating Security
              Payload", Internet Draft <draft-ietf-ipsec-esp-v2-01.txt>,
              October 1997.

   [Feng97]   W. Feng, D. Kandlur, D. Saha, and K. Shin, "Adaptive
              Packet Marking for Providing Differentiated Services in
              the Internet", Univ. Michigan Technical Report
              CSE-TR-347-97, October 1997,
              http://www.eecs.umich.edu/~wuchang/work/pmg.ps.Z.

   [Ferg97]   P. Ferguson, "Simple Differential Services: IP TOS and
              Precedence, Delay Indication, and Drop Preference",
              Internet Draft <draft-ferguson-delay-drop-00.txt>,
              November 1997.

   [Floyd91]  S. Floyd, "Connections with Multiple Congested Gateways
              in Packet-Switched Networks Part 1: One-way Traffic",
              Computer Communications Review, Vol.21, No.5, October
              1991, p. 30-47, ftp://ftp.ee.lbl.gov/papers/gates1.ps.Z.

   [Floyd97]  S. Floyd and K. Fall, "Router Mechanisms to Support End-
              to-End Congestion Control", LBNL Technical Report,
              February 1997, http://ftp.ee.lbl.gov/papers/collapse.ps.

   [FRED]     D. Lin and R. Morris, "Dynamics of Random Early
              Detection", Proc. ACM SIGCOMM 1997, September 1997.

   [GBH97]    R. Guerin, S. Blake, and S. Herzog, "Aggregating RSVP-
              based QoS Requests", Internet Draft
              <draft-guerin-aggreg-rsvp-00.txt>, November 1997.

   [HFSC]     I. Stoica, H. Zhang, and T. Ng, "A Hierarchical Fair
              Service Curve Algorithm for Link-Sharing, Real-Time and
              Priority Services", Proc. ACM SIGCOMM 97, September 1997.

   [HPFQA]    J. Bennett and Hui Zhang, "Hierarchical Packet Fair
              Queueing Algorithms", Proc. ACM SIGCOMM 96, August 1996.

   [IPv6]     S. Deering and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", Internet Draft
              <draft-ietf-ipngwg-ipv6-spec-v2-01.txt>, November 1997.

   [IS802]    M. Seaman, A. Smith, and E. Crawley, "Integrated Services
              Mappings on IEEE 802 Networks", Internet Draft
              <draft-ietf-issll-is802-svc-mapping-00.txt>, July 1997.

   [May97]    M. May, J. Bolot, C. Diot, and A. Jean-Marie, "1-Bit
              Schemes for Service Discrimination in the Internet:
              Analysis and Evaluation", INRIA Research Report, August
              1997,
             http://www.inria.fr/rodeo/personnel/mmay/papers/rr_1bit.ps.

   [McCanne]  S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven
              Layered Multicast", Proc. ACM SIGCOMM 96, August 1996.

   [MPLS]     R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow,
              and A. Viswanathan, "A Framework for Multiprotocol Label
              Switching", Internet Draft
              <draft-ietf-mpls-framework-01.txt>, July, 1997.

   [QOSP]     S. Bradner, editor, "Internet Protocol Quality of Service
              Problem Statement", Internet Draft
              <draft-bradner-qos-problem-00.txt>, September 1997.

   [RED]      S. Floyd and V. Jacobson, "Random Early Detection Gateways
              for Congestion Avoidance", IEEE/ACM Transactions on
              Networking, August 1993.

   [RFC795]   J. Postel, "Service Mappings", Internet RFC 795, September
              1981.

   [RFC1046]  W. Prue and J. Postel, "A Queuing Algorithm to Provide
              Type-of-Service for IP Links", Internet RFC 1046, February
              1988.

   [RFC1349]  P. Almquist, "Type of Service in the Internet Protocol
              Suite", Internet RFC 1349, July 1992.

   [RFC1583]  J. Moy, "OSPF Version 2", Internet RFC 1583, March 1994.

   [RFC1633]  R. Braden, D. Clark, and S. Shenker, "Integrated Services
              in the Internet Architecture: An Overview", Internet RFC
              1633, July 1994.

   [RFC1812]  F. Baker, editor, "Requirements for IP Version 4 Routers",
              Internet RFC 1812, June 1995.

   [RFC1883]  S. Deering and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", Internet RFC 1883, December 1995.

   [RSVP]     B. Braden et al., "Resource ReSerVation Protocol (RSVP)
              -- Version 1 Functional Specification", Internet RFC 2205,
              September 1997.

   [Shenker]  S. Shenker, "Fundamental Design Issues for the Future
              Internet", IEEE Journal on Selected Areas in
              Communications, Vol. 13, no. 7, September 1995.

   [SIMA]     K. Kilkki, "Simple Integrated Media Access (SIMA)",
              Internet Draft <draft-kalevi-simple-media-access-01.txt>,
              June 1997.

   [TWOBIT]   K. Nichols, V. Jacobson, and L. Zhang, "A Two-bit
              Differentiated Services Architecture for the Internet",
              Internet Draft <draft-nichols-diff-svc-arch-00.txt>,
              November 1997.


Author's Address

   Steven Blake
   E95/664
   IBM Corporation
   800 Park Offices Drive
   Research Triangle Park, NC  27709
   Phone:  +1-919-254-2030
   Fax:    +1-919-254-5483
   E-mail: slblake@raleigh.ibm.com