                                                          Neil Harrison
   Internet Draft                                          Peter Willis
   Document: draft-harrison-mpls-oam-00.txt             British Telecom
   Expires: August 2001
                                                         Shahram Davari
                                                             PMC-Sierra

                                                         Ben Mack-Crane
                                                                Tellabs

                                                           Hiroshi Ohta
                                                                    NTT

                                                          February 2001


                    OAM Functionality for MPLS Networks


Status of this Memo

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026.


   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.


Copyright Notice

   Copyright(C) The Internet Society (2001). All Rights Reserved.


Abstract

   This Internet draft provides requirements and mechanisms for OAM
   (Operation and Maintenance) for the user-plane in MPLS networks. A
   connectivity verification "CV" OAM packet is defined, which is
   transmitted periodically from LSP source to LSP sink. The CV flow
   could be used to detect defects related to misrouting of LSPs as
   well as link and nodal failure, and if required to trigger
   protection switching to the protection path.

   Harrison et.al        Expires August 2001                   Page 1
                 OAM Functionality for MPLS Networks    February 2001


   A forward defect indicator "FDI" and a backward defect indicator
   "BDI" are defined, which carry the defect type and location to the
   near end and far end respectively. At every LSP terminating node,
   the FDI is mapped from server layer to client layer. By doing so FDI
   could suppress the alarm storm, and let the appropriate layer take
   control of protection switching. BDI is used by the LSP source to
   start or stop the QoS aggregation, depending on whether the LSP is
   in the available or unavailable state. The criteria for entry to and
   exit from the available and unavailable states are also defined in
   this document.


Table of Contents

   1.    Introduction..................................................3
   2.    Definitions...................................................4
   3.    Symbols and Abbreviations.....................................5
   4.    Requirements for MPLS OAM.....................................5
   5.    Principles of OAM Function....................................6
   5.1   Client/Server Recursion-Layering..............................6
   5.2   OAM Functionality and Layer Independence......................7
   5.3   Defects.......................................................7
   5.4   Availability..................................................7
   5.5   Decoupling of User behavior from Connectivity Assessment......8
   5.6   Forward and Backward Defect Indicators........................8
   5.7   Connectivity Verification.....................................9
   5.8   Customers Should not be Used as Defect Detectors.............10
   5.9   The Reliability of OAM Functionality Under Fault Conditions..10
   6.    Mechanisms of MPLS OAM.......................................10
   6.1   Special MPLS Label Values....................................10
   6.2   Handling of Errored OAM Packets..............................10
   6.3   Label Stack Overhead Encoding Rules for OAM Packets..........11
   6.3.1 For CV OAM Packets...........................................11
   6.3.2 For P OAM Packets............................................12
   6.3.3 For FDI and BDI OAM Packets..................................12
   6.3.4 MPLS OAM Function Types for the OAM Alert Label..............13
   6.4   MPLS OAM Packets.............................................14
   6.4.1 Connectivity Verification (CV) Packets.......................15
   6.4.2 Performance "P" Packets......................................16
   6.4.3 Forward defect Indicator "FDI" packets.......................16
   6.4.4 Backward Defect Indicator "BDI"..............................17
   6.5   Defect Types and their Entry/Exit Criteria...................18
   6.5.1 Defect Type Codepoints.......................................18
   6.5.2 dLOCV Entry Criteria.........................................20
   6.5.3 dTTSI Entry Criteria.........................................21
   6.5.4 dLoop Entry Criteria.........................................21
   6.5.5 dLOCV, dTTSI and dLoop exit criteria.........................22
   6.6   Available and unavailable state processing...................23
   6.6.1 Short Break definition.......................................23
   6.6.2 Available/Unavailable State Definition.......................24
   6.6.3 Near-end and Far-end Measurements of Availability............24
   6.6.4 Near-End State Processing Flow-chart.........................25
   6.6.5 Far-End State Processing Flow-chart..........................27
   6.6.6 A pictorial view of near-end and far-end state processing....28
   7.    Security Considerations......................................29
   8.    References...................................................29
   9.    Author's Addresses...........................................29


   Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119 [1].


1. Introduction

   This Internet draft provides requirements and mechanisms for OAM
   (Operation and Maintenance) for the user-plane in MPLS networks. It
   is recognized that OAM functionality is important in public networks
   for ease of network operation, for verifying network performance and
   for reducing operational costs. OAM functionality is especially
   important for networks that are required to deliver (and hence be
   measurable against) QoS (Quality of Service) and availability
   performance parameters/objectives.

   A connectivity verification "CV" OAM packet is defined in this
   document, which is transmitted periodically from LSP source to LSP
   sink. The CV flow could be used to detect defects related to
   misrouting of LSPs as well as link and nodal failure, and if
   required to trigger protection switching to the protection path. A
   forward defect indicator "FDI" and a backward defect indicator
   "BDI" are defined, which carry the defect type and location to the
   near end and far end respectively. At every LSP terminating node,
   the FDI is mapped from server layer to client layer. By doing so FDI
   could suppress the alarm storm, and let the appropriate layer take
   control of protection switching. BDI is used by the LSP source to
   start or stop the QoS aggregation, depending on whether the LSP is
   in the available or unavailable state. The criteria for entry to and
   exit from the available and unavailable states are also defined in
   this document.

   The OAM functionality defined herein is limited to point-to-point
   LSP tunnels. OAM functionality for multipoint-to-point and
   point-to-multipoint LSP tunnels is FFS.



2. Definitions

   This document introduces some new terminology, which is required to
   discuss the functional network components associated with OAM.


       Functional Architecture                    Meaning
       Term
       ------------------                   ------------------

       Client/server               A term referring to the transparent
       (relationship between       transport of a client (ie higher)
       layer networks)             layer link connection by a server
                                   (ie lower) layer network trail.

       Link connection             A partition of a layer N trail that
                                   exists between two logically
                                   adjacent switching points within the
                                   layer N network.

       LSP Tunnel                  An LSP Tunnel is an LSP with well-
                                   defined source (ingress point) and
                                   sink (egress point)

       Subnetwork                  A subnetwork is a contiguous
                                   topological region of a network
                                   delimited by its set of peripheral
                                   access points, and is characterized
                                   by the possible routing across the
                                   subnetwork between those access
                                   points.  A network is the largest
                                   subnetwork and a node is the
                                   smallest subnetwork (at least in
                                   practical physical terms, though
                                   there are smaller sub-networks
                                   within nodes).

       Trail                       A generic transport entity at layer
                                   N which is composed of a client
                                   payload (which can be a packet from
                                   a client at higher layer N-1) with
                                   specific overhead added at layer N
                                   to ensure the forwarding integrity
                                   of the server transport entity at
                                   layer N.

       Trail termination point     A source or sink point of a trail at
                                   layer N, at which the trail overhead
                                   is added or removed respectively.  A
                                   trail termination point must have a
                                   unique means of identification
                                   within the layer network.




3. Symbols and Abbreviations

   This list is not exhaustive of all the abbreviations used in this
   draft.  In particular, those in common usage within the MPLS
   community (like 'MPLS' itself) have been excluded.


       Abbreviation               Meaning
    ---------------    ----------------------------
       AIS             Alarm Indication Signal

       BDI             Backward Defect Indication

       CV Packet       Connectivity Verification Packet

       FDI             Forward Defect Indication

       FFS             For Further Study

       OAM             Operations and Maintenance

       P Packets       Performance Packets

       QoS             Quality of Service

       SLA             Service Level Agreement

       TTSI            Trail Termination Source Identifier



4. Requirements for MPLS OAM

   MPLS layer OAM functionality is not a substitute for physical or
   server layer OAM (e.g., SDH/SONET) or client layer OAM (e.g., IP).
   MPLS LSPs create layer networks in their own right, and will have
   defects that are only relevant to the MPLS LSP layer networks.

   OAM functionality is useful because:

   1)   It allows the Operator to verify whether Quality of Service
        guarantees given in SLAs (Service Level Agreements) are in fact
        being met by the connection.
   2)   It allows the Operator to reduce the network's operating costs,
        by allowing more efficient detection and handling of defects.
        Long-term statistics show that the costs of operating a public
        network are higher than the initial installation costs.
   3)   It gives support for improved accounting/billing procedures.
   4)   It helps provide security for customer traffic by the detection
        of traffic mis-connections (which may otherwise be
        undetectable).


   The following functions are required:

   1)   Connectivity Verification of LSPs to confirm that defects do
        not exist on the target LSPs.
   2)   Fast and efficient defect detection, notification and
        localization.
   3)   Measurement of availability performance.

   The need for additional functions is for further study. In
   particular, the need for in-service measurement of LSP QoS
   performance (measurement of packet losses, spurious packets, errored
   packets, delay and delay variation) is for further study. Note that
   an LSP needs to be in the available state for QoS assessment to be
   valid.

   Defects include the following cases:

   1)   Simple loss of LSP connectivity (due to a server layer failure
        or a failure within the MPLS layer network);
   2)   Swapped LSP trails;
   3)   Unintended LSP mismerging (of 2 or more LSP trails);
   4)   Unintended replication of LSP packets (of the same LSP trail
        due, for example, to routing loops).


5. Principles of OAM Function

   The following principles can, for the most part, be applied to any
   layer network, ie not just MPLS. This document defines specific
   embodiments of these principles, as functional OAM entities, for
   MPLS layer networks. Although it is recommended that all the OAM
   functional entities are deployed network-wide, operators are free to
   choose whether to apply all or only some of these OAM functional
   entities (e.g. CV flows but not P flows), and whether deployment is
   network-wide or limited in scope to LSPs of certain types, e.g.
   only important LSPs such as those supporting VPNs. Where OAM
   functional entity deployment or scope is limited, operators should
   be aware that there could be deficiencies in their ability to
   detect/handle certain defect cases.


5.1 Client/Server Recursion-Layering

   A very important functional architecture feature of layer networks
   is client/server recursion (also known as layering). That is, a
   client layer link connection (ie a partition of a longer client
   layer trail between two logically adjacent client layer nodes) is
   created by a server layer trail. This is the basis of client layer
   topology construction. This recursion principle extends between
   various client/server layer relationships and ultimately 'to the
   duct'. Note also that a single server layer trail entity can support
   multiple client layer link connections.


   The key points to note here are:

   (1)  The client and server layer trail termination points will
        generally not be congruent.  Since the trail termination
        points are associated with the addressable access points of a
        layer network, it follows that the addressing of the two layers
        will also generally not be congruent.
   (2)  The 'duct' (or more precisely the environment of physical
        occupancy and connectivity) is the lowest layer network. The
        degree of connectivity in this layer effectively defines the
        degree of independent connectivity in all client layers. This
        could be put another way, by saying that the availability
        performance of any client layer network design is determined
        by (and inherited from) the physical infrastructure. This means
        that if one cannot state which link connections have a common
        lower server layer trail, then one cannot say anything with
        certainty about the resilience design of a client layer
        network.


5.2 OAM Functionality and Layer Independence

   The OAM functionality of a layer network must not be dependent on
   any specific server or client layer technology. This is critical to
   ensure that layer networks can evolve (or new/old layer networks be
   added/removed) without impacting other layer networks.

   The control-plane of a given layer network must also have its own
   OAM.

   [Note - Control-plane OAM is outside the scope of this draft.]


5.3 Defects

   All the major defect conditions must be identified with in-service
   measurable entry and exit criteria, and all consequent actions must
   be specified.  The entry and exit criteria of various defects should
   be temporally harmonized as far as possible to simplify trail
   defect-state processing.  Attention should be paid to relating the
   defect entry/exit criteria to 'short-breaks', which are generally
   accepted by many operators as 3-9s periods of gross signal
   disturbance from which the network may self-recover. An event that
   lasts for >=10s reaches the normally accepted threshold for entering
   the unavailable state (see also Section 5.4).
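
   As a rough illustration of the thresholds described above, the
   following Python sketch classifies a disturbance by its duration.
   The function name and the 'transient' label for events shorter than
   3s are assumptions made for this example only; the normative entry
   and exit criteria are those given in Sections 6.5 and 6.6.

```python
def classify_disturbance(duration_s: float) -> str:
    """Classify a gross signal disturbance by how long it lasts.

    Thresholds follow the text: 3-9s is a short-break, >=10s meets the
    normally accepted threshold for entering the unavailable state.
    'transient' is a hypothetical label for anything shorter than 3s.
    """
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    if duration_s >= 10:
        return "unavailable"
    if duration_s >= 3:
        return "short-break"
    return "transient"
```

   In this sketch a disturbance of 9s is still a short-break, while one
   of 10s is treated as the onset of the unavailable state.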


5.4 Availability

   The most important performance metric of a trail (or a subnetwork
   partition thereof) is availability.  This means that the entry and
   exit criteria for the available state must be defined. It is also
   important to understand how unavailable/available state transitions
   relate to the stopping/starting of the aggregation of available
   state QoS metrics; noting that from pragmatic considerations this
   may be effectively applied at an earlier point to preserve the
   integrity of the available state metrics, e.g. after 3s say, which
   marks the onset of (at least) a short-break, and which from
   operational experience is a good practical rule-of-thumb for setting
   a point beyond which a network is unlikely to self-recover.


5.5 Decoupling of User behavior from Connectivity Assessment

   User traffic behavior must not be a factor in connectivity status
   assessment. In practical terms, this means decoupling user traffic
   behavior from all defects and (the dependent) available state
   entry/exit criteria.


5.6 Forward and Backward Defect Indicators

   The node in the layer network that first detects a defect (sourced
   from within that layer) should apply a well-known 'Forward Defect
   Indication' (FDI) signal in the downstream direction. In the
   majority of current transport network technologies such a signal has
   been termed AIS (Alarm Indication Signal). At the trail termination
   point where the appropriate FDI signal is generated:

   (1)  There should be a complementary Backward Defect Indication
        (BDI) signal (which is removed at the upstream trail
        termination point) and
   (2)  There must be a mapping of the FDI signal from the server layer
        to the appropriate FDI signal of the client layer(s) as part of
        the server->client adaptation process.

   The primary purpose of the FDI signal is to suppress client layer
   alarms (which would otherwise create an 'alarm storm' in places
   which could be geographically and organizationally far removed from
   the originating defect source location).

   Three secondary purposes of FDI (and in some cases BDI) are:

   (1)  To allow correct processing of available state performance
        metrics.
   (2)  To inform applications that the connection is no longer
        functioning correctly and to take appropriate action, e.g.
        perhaps invoke a 're-connect' action, or in the case of voice
        perhaps mute the speech path.
   (3)  To inform client layer trails (e.g. nested LSPs in the case of
        MPLS) that a defect has occurred in a lower server layer trail,
        and hence to provide some indication that protection-switching
        in the affected client layer trails could be postponed to give
        the server layer trail an opportunity to effect protection
        switching.

   FDI/BDI signals should also provide information on the defect
   location and type. Such information is very useful to the lead
   operator in a co-operating domain scenario, and can also
   differentiate between failures that are internal and failures that
   are external to public and private domains.

   Note that, if being used, the BDI signal must be generated (in the
   backward direction) in response to detecting a defect at a trail
   sink termination point (in the forward direction) and not from some
   intermediate point, such as where the defect might be actually
   located. The reasons for this are that:

   (1)  In the case of bi-directional trails and unidirectional
        defects, each trail direction might not be congruently routed.
   (2)  In the case of unidirectional trails the BDI signal may be
        provided out-of-band, e.g. perhaps via a control-plane or
        management-plane mechanism. [Note: The exact means for
        providing the BDI functionality in this case is FFS.]

   The above requirements mean that the FDI/BDI architecture is valid
   for all routing cases.


5.7 Connectivity Verification

   An essential characteristic of the trails in a layer network is that
   their trail termination points must have a unique identifier (at
   least within that layer network). However, on link connections
   between nodes within the layer network, relative identifiers are
   commonly used for traffic forwarding. These relative identifiers
   only have to be unique per interface, e.g. the VPI/VCI of ATM, the
   DLCI of FR, the 'label' of MPLS.

   When relative identifiers are used for traffic forwarding there is a
   possibility of trail misconnectivity due to defects.  These cover a
   variety of connectivity failure modes, including:

   1)   Simple loss of continuity (due to a server layer failure or a
        failure within the layer network considered);
   2)   Swapped connections;
   3)   Unintended mismerging (of 2 or more trails);
   4)   Unintended replication (of the same trail due, for example, to
        routing loops).

   Although some of these defects may be rare in practice, unless
   detected/corrected their consequences can be very severe for an
   operator; ranging from simple availability/QoS SLA violations
   through to more serious security, censorship and mis-billing
   implications.

   It is therefore required that a unique trail source identifier be
   periodically transmitted from the trail source to the trail sink to
   detect these types of defect.
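
   The principle can be illustrated with a minimal Python sketch of a
   check performed at the trail sink.  It only shows how comparing
   received trail source identifiers against the expected one separates
   the four failure modes listed above.  The function name, the
   per-window list of received identifiers and the expected_count
   parameter are all assumptions of this example; the actual defect
   entry and exit criteria are those defined in Section 6.5.

```python
from typing import List


def classify_window(expected_ttsi: str,
                    received: List[str],
                    expected_count: int = 1) -> str:
    """Classify the source identifiers seen in one observation window.

    expected_count is a hypothetical parameter: the number of packets
    from the expected source that should arrive in one window.
    """
    own = received.count(expected_ttsi)   # packets from the right source
    foreign = len(received) - own         # packets from any other source
    if not received:
        return "loss of continuity"
    if own == 0:
        return "swapped connection"
    if foreign > 0:
        return "mismerge"
    if own > expected_count:
        return "replication"
    return "ok"
```

   Note that none of these checks depends on user traffic, which is
   consistent with the decoupling requirement of Section 5.5.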



5.8 Customers Should not be Used as Defect Detectors

   The OAM tools provided should ensure (as far as reasonably
   practicable) that customers do not have to act as failure
   detectors for the operator.


5.9 The Reliability of OAM Functionality Under Fault Conditions

   Under fault conditions a layer network cannot, by definition, be
   expected to behave in a predictable manner. Therefore care should be
   exercised when specifying and using OAM functions that require a
   layer network to function in a reliable and predictable manner for
   fault diagnosis.


6. Mechanisms of MPLS OAM


6.1 Special MPLS Label Values

   The label structure defined in [1] indicates a single label field of
   20 bits.  Label field values 0-3 have already been reserved for
   special functions. A special label, the 'OAM Alert Label', is
   defined as follows:

                        Table 1: OAM Alert Label

        Label value
         (Decimal)                        Meaning
        ------------              -----------------------
             4           OAM Alert Label.  This indicates that the
                         first octet following the OAM Alert Label
                         in the OAM payload (ie octet 5) is an OAM
                         Function Type field whose value defines
                         the type of defect handling OAM function
                         (ie CV, P, FDI or BDI), which follows in
                         the payload area.

        [Note: this label value is yet to be officially assigned by
        IANA.]


   All OAM packets must have a minimum payload length of 40 octets to
   facilitate ease of processing.  This is achieved by padding with all
   0s when necessary. All padding bits are reserved for future operator
   defined usage.


6.2 Handling of Errored OAM Packets

   Each OAM packet uses a BIP16 (in the last two octets of the OAM
   payload area) to detect errors.  The BIP16 is computed over all the
   fields of the OAM payload, including the initial octet, which
   specifies the Function Type and the BIP16 bit positions (which are
   all pre-set to zero for initial calculation purposes).

   BIP16 processing must be performed on all OAM packets prior to being
   able to reliably pass their payload for further processing.  Any OAM
   packets that show a BIP16 violation upon reception processing should
   be discarded.
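
   The following Python sketch illustrates one way the padding and
   BIP16 rules could be applied.  It assumes the conventional
   bit-interleaved parity calculation (even parity per bit position,
   ie the XOR of all 16-bit words), and it assumes that the 40 octet
   minimum includes the two BIP16 octets; neither detail is stated
   explicitly in the text above.  The function names are hypothetical,
   and the Function Type value is supplied by the caller.

```python
MIN_OAM_PAYLOAD = 40  # minimum OAM payload length in octets (Section 6.1)


def bip16(data: bytes) -> int:
    """XOR all 16-bit words of data (even parity per bit position)."""
    if len(data) % 2:
        raise ValueError("sketch assumes an even number of octets")
    parity = 0
    for i in range(0, len(data), 2):
        parity ^= (data[i] << 8) | data[i + 1]
    return parity


def build_oam_payload(function_type: int, body: bytes = b"") -> bytes:
    """Pad with all 0s to the minimum length, then fill in the BIP16."""
    unpadded = bytes([function_type]) + body
    # The last two octets hold the BIP16 and are pre-set to zero.
    pad_len = max(0, MIN_OAM_PAYLOAD - 2 - len(unpadded))
    zeroed = unpadded + bytes(pad_len) + b"\x00\x00"
    return zeroed[:-2] + bip16(zeroed).to_bytes(2, "big")


def bip16_ok(payload: bytes) -> bool:
    """Recompute the BIP16 with its own octets zeroed and compare."""
    received = int.from_bytes(payload[-2:], "big")
    return bip16(payload[:-2] + b"\x00\x00") == received
```

   A receiver following the rule above would discard any OAM packet
   for which bip16_ok() returns False before examining its payload.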

   In the case of the CV packet flow, persistent BIP16 violations will
   cause a Loss of Connectivity Verification; this defect is defined
   later, but for now we can note that it would occur after nominally
   3s.  This behavior is consistent with the nature of the defect.
   However, it is recommended that at a local equipment level some
   notification is given to the Network Management System to indicate
   that BIP16 discards are occurring.

   In the case of the other OAM packet types, ie the FDI, BDI and P
   packets (these are defined later), it is again recommended that at a
   local equipment level some indication is given to the Network
   Management System that BIP16 discards are occurring.  The threshold
   to be used for recording/reporting such BIP16 discard activity for
   these OAM packets should be programmable, and is outside the scope
   of this document.


6.3 Label Stack Overhead Encoding Rules for OAM Packets


6.3.1   For CV OAM Packets

   CV OAM packets are differentiated from normal user-plane traffic by
   an increase of one in the label stack depth at a given LSP level at
   which they are inserted. Therefore, they maintain this label stack
   difference of one (from normal user-plane traffic) as they traverse
   any lower layer server LSPs.

   The OAM Alert Labeled header is added before (ie below) the normal
   user-plane forwarding labeled header at the LSP trail source point.
   The S bit is set only in the OAM Alert Label.

   The CV OAM packet can be used on both E-LSPs and L-LSPs. However,
   the coding of the EXP field is different in the two cases.

   In the case of L-LSPs, the EXP field should be set to all 0s in both
   the OAM Alert Labeled header and the preceding normal user-plane
   forwarding header.  This is to ensure the CV OAM packets have a Per
   Hop Behavior (PHB) that gives the lowest drop probability [2].

   In the case of E-LSPs, the EXP field should be set to all 0s in the
   OAM Alert Labeled header and to whatever is the 'minimum
   loss-probability PHB' in the preceding normal user-plane forwarding
   header for that E-LSP.  This is again to ensure the CV OAM packets
   have a PHB that gives the lowest drop probability [2].


   The TTL field should be set to 1 in the OAM Alert Labeled header.
   The reasons for this are:

   -    CV OAM packets should never travel beyond the LSP trail
        termination sink point at the LSP level at which they were
        originally generated (noting that they are not examined by
        intermediate label-swapping LSRs, and are only observed at LSP
        sink points), and
   -    The TTL of the immediately prior normal user-plane forwarding
        header is used to mitigate damage from looping packets.
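
   The encoding rules of this section can be sketched in Python as
   follows.  The 32-bit entry layout used here (20-bit label, 3-bit
   EXP, 1-bit S, 8-bit TTL) is taken from the MPLS label stack
   encoding specification rather than from this document.  The
   function names and the lsp_label, lsp_exp and lsp_ttl parameters
   are hypothetical, and label value 4 is the provisional value from
   Table 1.

```python
import struct

OAM_ALERT_LABEL = 4  # Table 1; not yet officially assigned by IANA


def label_stack_entry(label: int, exp: int, s: int, ttl: int) -> bytes:
    """Pack one entry as label(20) | EXP(3) | S(1) | TTL(8)."""
    if not (0 <= label < 1 << 20 and 0 <= exp < 8
            and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    return struct.pack("!I", (label << 12) | (exp << 9) | (s << 8) | ttl)


def cv_label_stack(lsp_label: int, lsp_exp: int, lsp_ttl: int) -> bytes:
    """Forwarding header on top, OAM Alert Labeled header below it."""
    # Normal user-plane forwarding header: S bit clear, caller's EXP/TTL.
    forwarding = label_stack_entry(lsp_label, lsp_exp, s=0, ttl=lsp_ttl)
    # OAM Alert Labeled header: EXP all 0s, S bit set, TTL of 1.
    oam_alert = label_stack_entry(OAM_ALERT_LABEL, exp=0, s=1, ttl=1)
    return forwarding + oam_alert
```

   For an L-LSP the caller would pass lsp_exp=0; for an E-LSP it would
   pass the EXP coding of that E-LSP's minimum loss-probability PHB.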


6.3.2   For P OAM Packets

   The label stack overhead encoding rules of performance P OAM packets
   are FFS.


6.3.3   For FDI and BDI OAM Packets

   FDI and BDI OAM packets are generated, at a nominal rate of one per
   second, when defects are detected. The FDI packet traces forward and
   upward through any nested LSP stack. The BDI packet is sent
   backwards towards its peer-level LSP trail termination sink point in
   the reverse direction (assuming a bi-directional in-band LSP exists)
   for each LSP at and above the level of the defect.

   The OAM Alert labeled header is inserted before (ie below) a normal
   user-plane forwarding labeled header, and a label stack of 2 is only
   ever required for either the FDI or BDI packet at their origin.
   Note that in the case of FDI, it is assumed that the server->client
   LSP adaptation mappings that were in existence prior to the failure
   are recursively used to ensure correct FDI forwarding.  It is
   therefore important that the LSP sink point remembers any
   server->client LSP label mappings that were in existence prior to
   the failure.  Although the exact means for achieving this are
   outside the scope of this document, some examples of how these
   server->client layer label mappings could be configured are as
   follows:

   -    Manually, via the NMS say;
   -    Automatically on LSP set-up via extensions to LDP/RSVP
        signaling;
   -    By an automatic 'learning process', ie if, during the
        establishment of the client LSPs, the signaling is tunneled
        through the server layer, then the server trail terminating
        node could keep the information about the established LSPs in
        memory as they occur.

   When server->client layer LSP relationships are changed (e.g.
   existing client layer LSP removed, or new client LSP added say),
   then it is important that the server->client label mappings are also
   updated to reflect the new relationships.
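
   The recursive use of the remembered server->client mappings can be
   illustrated with the following Python sketch.  It models the
   mappings as a single dictionary for simplicity, whereas in practice
   each LSP trail termination sink point would hold only its own
   mappings.  The function name and the LSP identifiers are
   hypothetical.

```python
from typing import Dict, List


def fdi_targets(server_lsp: str,
                clients_of: Dict[str, List[str]]) -> List[str]:
    """List every client LSP into which an FDI should be mapped.

    clients_of maps a server LSP to the client LSPs it carried before
    the failure; the walk is depth first, following the nesting upward.
    """
    affected: List[str] = []
    for client in clients_of.get(server_lsp, []):
        affected.append(client)
        affected.extend(fdi_targets(client, clients_of))
    return affected
```

   If a client LSP is removed from the mappings when it is torn down,
   it no longer appears in the result, which is why the mappings must
   track changes in the server->client relationships.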


   The S bit is set only in the OAM Alert Labeled header. The FDI OAM
   packet is recursively mapped upwards, through a client/server
   adaptation process at LSP trail termination sink points, into any
   further affected higher client layer LSPs.  When this arrives at the
   top LSP it needs to be mapped into an equivalent FDI for whatever
   client layer is then being carried.  In the case of IP (or indeed
   any other client layer), this is outside the scope of this document.

   Note that higher level LSPs will also see failures (as a result of
   corruption of their own CV flow) but they will also see an incoming
   FDI OAM packet flow from the lowest level LSP where the failure
   originates.  This dynamic behavior allows for correct identification
   of the true source of the defect and is explained in more detail
   later.  But for now it is sufficient to note that the incoming FDI
   is needed to:

   o    Suppress unnecessary alarms in the affected higher layer LSPs.
   o    Give an indication to affected higher-level LSPs that they may
        need to hold off protection switching as the defect is at a
        lower level LSP.
   o    Allow the appropriate BDI coding at the affected higher layer.

   It is assumed that when a BDI OAM packet is returned in-band it
   follows a bi-directional LSP and, like the CV and P OAM packets,
   that it should never travel beyond the LSP trail termination sink
   point (of the return LSP).

   The coding of the EXP field associated with the OAM Alert Labeled
   header and the preceding normal user-plane forwarding labeled header
   at the LSP level at which the FDI or BDI is inserted is the same as
   that previously described for the CV OAM packet.

   The TTL field should be set to 1 in the OAM Alert Labeled packet
   header. The reasons for this are:

   o    The FDI OAM packet is recursively regenerated at each LSP trail
        termination sink point into all affected client layer LSPs (if
        any); so the TTL field is recursively regenerated with a value
        of 1;
   o    The BDI OAM packet should never travel beyond the LSP trail
        termination sink point of the return LSP at the LSP level at
        which it was originally generated;
   o    The TTL of the immediately prior normal user-plane forwarding
        header is used to mitigate damage from looping packets.


6.3.4   MPLS OAM Function Types for the OAM Alert Label

   The first octet of the OAM packet payload specifies the OAM Function
   Type as follows:

                        Table 2: OAM Function Types

       OAM Function Type
       codepoint (Hex),
       first octet of
       OAM packet payload   Function Type and Purpose
       ------------------   ----------------------------------
              00            Reserved

              01            CV (Connectivity Verification).  Used
                            to detect/diagnose all types of LSP
                            connectivity defect (sourced either
                            from below or within the MPLS
                            network).  This will be the main in-
                            service OAM defect detection tool.

              02            P (Performance).  Used to measure
                            user-plane loss of packets and their
                            aggregate octets.

              03            FDI (Forward Defect Indicator).  This
                            is generated by an MPLS node detecting
                            any defect (defined later) and
                            inserted into affected client layers.
                            Its primary purpose is to suppress
                            alarms being raised within affected
                            higher level client LSPs and (in turn)
                            their client layers.  It includes
                            fields to indicate the nature of the
                            defect and its location.

              04            BDI (Backward Defect Indicator).  This
                            is generated at a return LSP trail
                            termination source point in response
                            to a defect being detected at an LSP
                            trail termination sink point in the
                            other direction.  The defect type and
                            location codepoints of the
                            complementary FDI are mapped into
                            similar fields of the BDI.  The BDI
                            may be realized either in the user-
                            plane if bi-directional LSPs are being
                            used (the case considered in this
                            document) or out-of-band (e.g. via
                            management-plane function) in the case
                            of uni-directional LSPs.  The latter
                            scenario is outside the scope of this
                            document.


   All other OAM Function Type codepoints are reserved for possible
   future standardization.


6.4 MPLS OAM Packets



6.4.1   Connectivity Verification (CV) Packets

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Func Type (1) |                 (must be 0)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                       Ingress Router ID                       |
   +                                                               +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           LSP ID                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   \\                   Reserved (0) 14 bytes                     \\
   |                                                               |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |         BIP 16                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1: CV Payload Structure

   The intention is that the CV OAM packet is transmitted from the LSP
   trail termination source point at a nominal rate of 1 CV per second.
   It is important that the rate of CV OAM packet generation is
   constant so that simple and deterministic defect processing can be
   carried out at the LSP trail termination sink point.

   CV OAM packets within a given LSP are not synchronous to any other
   CV OAM packets in any other LSP (this includes all nested LSPs, and
   CV OAM packets from the remote end of an LSP at level N but in the
   other direction when bi-directional LSPs at level N are being used).

   The structure of the LSP Trail Termination Source Identifier (TTSI)
   is defined by using a 16 octet Router ID IPv6 address plus a 4 octet
   LSP Tunnel ID [3].  Note that the first 2 octets of the LSP Tunnel
   ID are currently padded with all 0s to allow for any future increase
   in the Tunnel ID field.

   For nodes that do not support IPv6 addressing, an IPv4 address can
   be used for the Router ID using the format described in RFC1884 [4].

   That is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                            (0)                                |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |               (FF)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       IPv4 Address                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 2: IPv6 Compatible IPv4 Address

   On LSP establishment the LSP trail termination sink point should be
   configured with the expected TTSI (Ingress router ID + LSP ID).
   Ideally this should be done automatically via LSP signaling at LSP
   set-up time (e.g. via a CR-LDP or RSVP control-plane mechanism), but
   it could also be configured manually.  The mechanism for achieving
   this configuration is outside the scope of this document.
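
   As an informal illustration of Figures 1 and 2 (not part of this
   specification), the sketch below assembles a 40-octet CV payload.
   The function names are hypothetical.  It assumes that the "(FF)"
   field of Figure 2 denotes 16 one-bits, as in the IPv4-mapped form
   of RFC1884, and that the Tunnel ID occupies the last 2 octets of
   the LSP ID.  The BIP 16 computation is not defined in this section,
   so the value is simply passed in by the caller.

```python
import ipaddress
import struct

CV_FUNC_TYPE = 0x01  # Table 2: CV (Connectivity Verification)


def router_id_bytes(addr: str) -> bytes:
    """Return the 16-octet Router ID for an IPv6 or IPv4 address string."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        return ip.packed
    # Assumption: 80 zero bits, 16 one bits, then the 32-bit IPv4 address.
    return b"\x00" * 10 + b"\xff\xff" + ip.packed


def build_ttsi(router_id: str, tunnel_id: int) -> bytes:
    """TTSI = 16-octet Router ID + 4-octet LSP ID (first 2 octets zero)."""
    if not 0 <= tunnel_id <= 0xFFFF:
        raise ValueError("tunnel_id must fit in 2 octets")
    return router_id_bytes(router_id) + struct.pack("!HH", 0, tunnel_id)


def build_cv_payload(router_id: str, tunnel_id: int, bip16: int = 0) -> bytes:
    """Assemble the CV payload: type, 3 zero octets, TTSI, reserved, BIP 16."""
    header = struct.pack("!B3x", CV_FUNC_TYPE)
    reserved = b"\x00" * 14
    trailer = struct.pack("!H", bip16)
    return header + build_ttsi(router_id, tunnel_id) + reserved + trailer
```

   The sink point could compare octets 4 to 23 of a received payload
   against the configured expected TTSI produced by build_ttsi().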


6.4.2   Performance "P" Packets

   The structure of the P OAM packet is FFS.


6.4.3   Forward Defect Indicator "FDI" Packets


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Func Type (3) | (must be 0)   |        Defect Type            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Defect Location                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   \\                 Reserved (0) 30 bytes                       \\
   |                                                               |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |         BIP 16                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 3: FDI Payload Structure

   The FDI is sent downstream from the first node detecting the defect.
   In the case of MPLS server layer failures (i.e. in a lower layer
   technology such as SDH) this would be the first MPLS node downstream
   of the server layer failure (as a consequence of the appropriate
   client/server adaptation of the server FDI signal). In the case of
   MPLS layer failures (i.e. failures within the MPLS fabric) this
   would be the first LSP trail termination sink point at the same LSP
   level as the failure.

   The primary function of the FDI is to stop downstream client layer
   alarm storms and hence correctly focus the attention of Operational
   personnel.  However, FDI can also have an important role in:

   o    Facilitating correctly targeted nested LSP protection schemes,
        i.e. one would want a lower level (server) LSP to protection
        switch before a higher level (client) LSP if the fault was
        sourced from within the lower level LSP, and

   o    Identifying availability/short-break events and hence
        suspending up-state QoS metric aggregation.

   The format of the Defect Location field and its handling at
   inter-domain NNI boundaries is FFS.

   The Defect Type field is set at 2 octets here. This is currently
   considered sufficient, but it should be confirmed once all the
   Defect Types have been identified and fully specified. A candidate
   set of Defect Types and their codepoints is given later.

   The handling of the Defect Type field at inter-domain NNI boundaries
   is FFS. However, 2 octets have been reserved for this function.

   When an FDI is to be passed from a server layer LSP to its client
   layer LSP(s) (i.e. at the client/server adaptation function
   following the server layer LSP trail termination sink point), the
   Defect Location and Defect Type fields should be copied from the
   server layer LSP FDI into the client layer LSP(s) FDI.

   The mapping of MPLS layer sourced FDI from the highest-level LSP
   into its client layer (e.g. IP) is outside the scope of this
   document.
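
   As an informal illustration of Figure 3 and of the server-to-client
   copy rule above (not part of this specification), the sketch below
   packs and parses the 40-octet FDI payload.  The function names are
   hypothetical.  Since the Defect Location format is FFS, it is
   treated here as an opaque 4-octet value, and the BIP 16 value is
   left to the caller because its computation is not defined in this
   section.

```python
import struct

FDI_FUNC_TYPE = 0x03  # Table 2: FDI
# Function type, one zero octet, Defect Type, Defect Location,
# 30 reserved zero octets, BIP 16.
_FDI_FORMAT = "!BxH4s30xH"


def build_fdi(defect_type: int, defect_location: bytes, bip16: int = 0) -> bytes:
    """Pack an FDI payload laid out as in Figure 3."""
    if len(defect_location) != 4:
        raise ValueError("Defect Location is carried in 4 octets")
    return struct.pack(_FDI_FORMAT, FDI_FUNC_TYPE, defect_type,
                       defect_location, bip16)


def parse_fdi(payload: bytes):
    """Return (Defect Type, Defect Location) from an FDI payload."""
    func_type, defect_type, defect_location, _bip16 = struct.unpack(
        _FDI_FORMAT, payload)
    if func_type != FDI_FUNC_TYPE:
        raise ValueError("not an FDI payload")
    return defect_type, defect_location


def client_fdi_from_server(server_fdi: bytes) -> bytes:
    """Copy DT and DL from a server layer LSP FDI into a client LSP FDI."""
    defect_type, defect_location = parse_fdi(server_fdi)
    return build_fdi(defect_type, defect_location)
```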


6.4.4   Backward Defect Indicator "BDI" Packets

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Func Type (4) |   (must be 0) |        Defect Type            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Defect Location                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   \\                  Reserved(0) 30 bytes                       \\
   |                                                               |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |         BIP 16                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 4: BDI Payload Structure


   For the case of bi-directional LSPs, the BDI is sent from the LSP
   trail source point of the return LSP as a mirror of the appropriate
   (see Note) FDI at the LSP trail sink point of the other direction.

   The Defect Location and Defect Type fields are a direct mapping of
   those set in the appropriate (see Note) FDI and have identical
   formats as described previously for the FDI OAM packet.

   Note - The word 'appropriate' here signifies that any incoming FDI
   (i.e. from a lower layer) takes precedence over any FDI that would
   have been generated at the layer being considered due to detecting
   defects at this layer (where these defects are only consequential as
   a result of a lower layer defect).

   The BDI does not propagate beyond its return LSP trail termination
   sink point, and it is discarded at that point after any processing
   based on its observation is carried out, e.g. for single-ended
   short-break and/or availability measurements.
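
   As an informal illustration of the 'appropriate FDI' rule in the
   Note above (not part of this specification), the sketch below
   selects the FDI to mirror and packs the BDI payload of Figure 4.
   The function name and the (DT, DL) tuple representation are
   hypothetical.  It covers only the case where the local defect is a
   consequence of a lower layer defect, and it leaves the BIP 16 field
   as zero because its computation is not defined in this section.

```python
import struct
from typing import Optional, Tuple

BDI_FUNC_TYPE = 0x04  # Table 2: BDI
Fdi = Tuple[int, bytes]  # (Defect Type, 4-octet Defect Location)


def build_bdi(incoming_fdi: Optional[Fdi],
              local_fdi: Optional[Fdi]) -> Optional[bytes]:
    """Mirror the appropriate FDI into a BDI payload.

    An FDI arriving from a lower layer takes precedence over an FDI
    that would have been generated locally at this layer.
    """
    chosen = incoming_fdi if incoming_fdi is not None else local_fdi
    if chosen is None:
        return None  # no defect, so no BDI is sent
    defect_type, defect_location = chosen
    return struct.pack("!BxH4s30xH", BDI_FUNC_TYPE, defect_type,
                       defect_location, 0)
```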


6.5 Defect Types and their Entry/Exit Criteria

6.5.1   Defect Type Codepoints

   The following coding structure is proposed for the various defect
   types so far identified:

                        Table 3: Defect Types

              DT code in FDI/BDI
              OAM packets (Hex)
              Note: first octet
              indicates layer and
    Defect    second octet
     Type     indicates defect                Meaning
    -------   --------------------    ------------------------
    dServer          01 01         Any server layer defect
                                   arising below the MPLS layer
                                   network.  It is not suggested
                                   that these are individually
                                   identified and defined for
                                   each type of server layer,
                                   since this function is only
                                   appropriate to the server
                                   layer itself.  Hence, we only
                                   need an indication that it is
                                   the server layer and not the
                                   MPLS layer.

    dLOCV            02 01         Simple Loss of Connectivity
                                   Verification due to missing
                                   CV OAM packets with expected
                                   TTSI.  Note that if the cause
                                   of dLOCV is the server layer
                                   (i.e. there is also an incoming
                                   FDI signal from the server
                                   layer) then the DT codepoint
                                   01 01_H is used.  The dLOCV
                                   codepoint 02 01_H is used
                                   only for MPLS layer simple
                                   connectivity failures.

    dTTSI            02 02         Trail Termination Source
                                   Identifier Mismatch due to an
                                   unexpected TTSI observed in
                                   the incoming CV OAM packets.
                                   This detects swapped
                                   connections and unintended
                                   mismerging failures, which
                                   can be differentiated by
                                   noting whether an expected
                                   TTSI is also missing or
                                   present respectively.  Note
                                   that in the case of the
                                   former (i.e. swapped
                                   connections), the dTTSI
                                   defect condition takes
                                   priority over the dLOCV
                                   defect condition, which is
                                   also present.

    dLoop            02 03         This detects an unintended
                                   replication Looping defect
                                   from observation of an
                                   increased rate of expected CV
                                   OAM packets above the nominal
                                   1/sec. (Note this defect is
                                   added for completeness, but
                                   it is expected to be rare)

    dUnknown         02 FF         Unknown defect detected in
                                   the MPLS layer.  This is
                                   expected to be used for MPLS
                                   nodal failures, which are
                                   detected within the node
                                   (probably by proprietary
                                   means) and affect user-plane
                                   traffic.

    None             00 00         Reserved

    None             FF FF         Reserved


   There are 3 MPLS layer user-plane defects, i.e. dLOCV, dTTSI and
   dLoop, which we now define in more detail.
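
   As an informal illustration of Table 3 (not part of this
   specification), the codepoints can be held as 16-bit values and
   split into their layer and defect octets.  The identifier names are
   hypothetical.  Reading the layer octet 01 as 'server layer' and 02
   as 'MPLS layer' is an inference from the table entries.

```python
from enum import IntEnum


class DefectType(IntEnum):
    """DT codepoints of Table 3 as 16-bit values."""
    D_SERVER = 0x0101
    D_LOCV = 0x0201
    D_TTSI = 0x0202
    D_LOOP = 0x0203
    D_UNKNOWN = 0x02FF


RESERVED_CODEPOINTS = frozenset({0x0000, 0xFFFF})


def split_codepoint(dt: int):
    """Return (layer octet, defect octet) of a 2-octet DT codepoint."""
    return dt >> 8, dt & 0xFF


def is_mpls_layer_defect(dt: int) -> bool:
    """True if the first octet indicates the MPLS layer (assumed 0x02)."""
    return split_codepoint(dt)[0] == 0x02
```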



6.5.2   dLOCV Entry Criteria

   Entry to the dLOCV condition, and hence entry to the LSP Trail Sink
   Near-End Defect State, occurs when there are no expected CV OAM
   packets observed in any period of 3 consecutive seconds.

   In terms of consequent actions:

   o    If there is an incoming FDI signal from a server layer below
        the MPLS network, then this is mapped to the DT codepoint 01
        01_H in the FDI OAM packets sent forwards and the BDI OAM
        packets sent backwards.  The local DL codepoint is also
        inserted in these FDI and BDI OAM packets. There are no alarms
        associated with the MPLS layer itself but only the server
        layer, which sourced the FDI signal.

   Else:

   o    If there is an incoming FDI signal from a lower level LSP
        within the MPLS network, then that FDI signal's DL/DT
        codepoints are mapped into the FDI sent to any further client
        layers (i.e. suppresses generation of FDI DL/DT codepoints from
        this point) and the BDI OAM packet sent backwards.  There are
        no alarms generated regarding this LSP (the alarm will be
        associated with the lowest layer LSP within which the defect
        originated).

   Else:

   o    If there is no FDI signal incoming from the server layer or a
        lower level LSP AND there are no CV OAM packets observed with
        an unexpected TTSI which give rise to the dTTSI defect, then
        the DT codepoint 02 01_H is inserted in the FDI OAM packets
        sent downstream and the BDI OAM packets sent upstream.  The
        local DL codepoint is also inserted in these FDI and BDI OAM
        packets. A local alarm is raised relevant to this defect
        condition.
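
   As an informal illustration (not part of this specification), the
   three cases above can be sketched as a single decision function.
   The names and the (DT, DL) tuple representation are hypothetical,
   and the Defect Location is treated as an opaque value.

```python
from typing import NamedTuple, Optional, Tuple

Fdi = Tuple[int, bytes]  # (Defect Type, Defect Location)


class Action(NamedTuple):
    defect_type: int        # DT to insert in FDI and BDI OAM packets
    defect_location: bytes  # DL to insert in FDI and BDI OAM packets
    raise_alarm: bool       # whether a local alarm is raised


def dlocv_actions(local_dl: bytes,
                  server_fdi_present: bool,
                  lower_lsp_fdi: Optional[Fdi],
                  dttsi_present: bool) -> Optional[Action]:
    """Select the consequent actions once dLOCV has been entered."""
    if server_fdi_present:
        # Defect lies below the MPLS network: DT 01 01, local DL, no alarm.
        return Action(0x0101, local_dl, False)
    if lower_lsp_fdi is not None:
        # Defect lies in a lower level LSP: pass its DT/DL on, no alarm.
        defect_type, defect_location = lower_lsp_fdi
        return Action(defect_type, defect_location, False)
    if not dttsi_present:
        # Defect originates at this LSP level: DT 02 01, local DL, alarm.
        return Action(0x0201, local_dl, True)
    return None  # the dTTSI defect takes priority and is handled separately
```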

   Note:

        (i)    Since OAM packet flows are not synchronized in LSPs at
               different hierarchical levels (i.e. when LSPs are
               nested), there is a possibility that a client layer LSP
               detects a defect before its server layer LSP. This
               error could be up to 1s due to CV packet arrival time
               differences plus some additional uncertainty due to
               network delay effects. This could result in an error
               of judgment as to the type of defect that is present
               and hence which consequent actions are appropriate;
               especially whether the raising of a local alarm is
               appropriate and the correct setting of the DL and DT
               codepoints in FDI/BDI OAM packets.  To mitigate this
               effect, it is recommended that the raising of an alarm
               is deferred for at least 2 seconds after a defect state
               is detected (the exact value is FFS). This will also
               allow the network to settle into a stable state as
               regards defect detection behavior.

        (ii)   The starting/stopping of aggregation of any LSP
               user-plane packet/octet loss metrics (e.g. if using the
               P OAM packet) is dependent on whether the LSP is in the
               available or unavailable state.


6.5.3   dTTSI Entry Criteria

   Entry to the dTTSI condition, and hence entry to the LSP Trail Sink
   Near-End Defect State, occurs when there are >= 2 CV OAM packets
   observed in any period of 3 consecutive seconds each with an
   unexpected TTSI. Any expected CV OAM packets or any incoming FDI
   signals (from either the server layer or a lower level LSP) are
   ignored, and it should be noted that the dTTSI defect overrides the
   dLOCV defect if both are present (as would be the case, for example,
   with swapped LSPs). The DT codepoint 02 02_H is inserted in the FDI
   OAM packets sent forwards and the BDI OAM packets sent backwards.
   The local DL codepoint is also inserted in these FDI and BDI OAM
   packets. A local alarm is raised relevant to this defect condition
   and the unexpected TTSI is captured locally (this may optionally
   also be sent to the NMS, e.g. as an exception report). The
   downstream traffic must also be suppressed.

   Note:

   (i)  Since OAM packet flows are not synchronized in LSPs at
        different hierarchical levels (i.e. when LSPs are nested), there
        is a possibility that a client layer LSP detects a defect
        before its server layer LSP.  This error could be up to 1s due
        to CV packet arrival time differences plus some additional
        uncertainty due to network delay effects. This could result in
        an error of judgment as to the type of defect that is present
        and hence which consequent actions are appropriate; especially
        whether the raising of a local alarm is appropriate and the
        correct setting of the DL and DT codepoints in FDI/BDI OAM
        packets.  To mitigate this effect, it is recommended that the
        raising of an alarm is deferred for at least 2 seconds after a
        defect state is detected (the exact value is FFS).  This will
        also allow the network to settle into a stable state as regards
        defect detection behavior.

   (ii) The starting/stopping of aggregation of any LSP user-plane
        packet/octet loss metrics (e.g. if using the P OAM packet)
        is dependent on whether the LSP is in the available or
        unavailable state.


6.5.4   dLoop Entry Criteria


   Entry to the dLoop condition, and hence entry to the LSP Trail Sink
   Near-End Defect State, occurs when there are >= 5 CV OAM packets
   observed in any period of 3 consecutive seconds each with an
   expected TTSI.  The DT codepoint 02 03_H is inserted in the FDI OAM
   packets sent forwards and the BDI OAM packets sent backwards.  The
   local DL codepoint is also inserted in these FDI and BDI OAM
   packets. A local alarm is raised relevant to this defect condition.
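
   As an informal illustration (not part of this specification), the
   dLOCV, dTTSI and dLoop entry criteria can be sketched as a
   classification of the CV OAM packet counts seen in one period of 3
   consecutive seconds.  The function name is hypothetical.  Testing
   dTTSI before dLoop is an assumption: the text only states that
   dTTSI overrides dLOCV and that expected CV OAM packets are ignored
   when evaluating dTTSI.

```python
from typing import Optional


def entry_defect(expected_cv: int, unexpected_cv: int) -> Optional[str]:
    """Classify the CV counts observed in one 3-second window.

    expected_cv   -- CV OAM packets carrying the expected TTSI
    unexpected_cv -- CV OAM packets carrying an unexpected TTSI
    """
    if unexpected_cv >= 2:
        return "dTTSI"   # expected CVs and incoming FDI are ignored
    if expected_cv >= 5:
        return "dLoop"   # rate well above the nominal 1 CV per second
    if expected_cv == 0:
        return "dLOCV"   # no expected CV seen in the whole window
    return None          # no defect entry criterion is met
```

   For example, a swapped LSP would typically yield no expected CV
   OAM packets and about three unexpected ones, which this sketch
   reports as dTTSI rather than dLOCV.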

   Note:

   (i)  Since OAM packet flows are not synchronized in LSPs at
        different hierarchical levels (i.e. when LSPs are nested), there
        is a possibility that a client layer LSP detects a defect
        before its server layer LSP. This error could be up to 1s due
        to CV packet arrival time differences plus some additional
        uncertainty due to network delay effects. This could result in
        an error of judgment as to the type of defect that is present
        and hence which consequent actions are appropriate; especially
        whether the raising of a local alarm is appropriate and the
        correct setting of the DL and DT codepoints in FDI/BDI OAM
        packets.  To mitigate this effect, it is recommended that the
        raising of an alarm is deferred for at least 2 seconds after a
        defect state is detected (the exact value is FFS). This will
        also allow the network to settle into a stable state as regards
        defect detection behavior.

   (ii) The starting/stopping of aggregation of any LSP user-plane
        packet/octet loss metrics (e.g. if using the P OAM packet)
        is dependent on whether the LSP is in the available or
        unavailable state.


6.5.5   dLOCV, dTTSI and dLoop exit criteria

   Exit of the dLOCV, dTTSI or dLoop condition, and hence exit of the
   LSP Trail Sink Near-End Defect State, occurs when, in any period of
   3 consecutive seconds, there are:

   o    >= 2 but <= 4 CV OAM packets observed each with an expected
        TTSI, AND
   o    No CV OAM packets observed with an unexpected TTSI.

   Note that the numbers of CV OAM packets observed each with an
   expected TTSI are suggested values.  Whether these numbers are
   appropriate is for further study.

   All the consequent actions invoked when entering the LSP Trail Sink
   Near-End Defect State (i.e. sending of FDI and BDI OAM packets, the
   raising of local alarms and the suppression of traffic in the dTTSI
   case only) are stopped when we exit the LSP Trail Sink Near-End
   Defect State.
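
   As an informal illustration (not part of this specification), the
   exit criteria and the stopping of the consequent actions can be
   sketched as follows.  The class and function names are
   hypothetical, and the counts are those observed in one period of 3
   consecutive seconds.

```python
from dataclasses import dataclass


@dataclass
class SinkActions:
    """Consequent actions that are active while in the defect state."""
    sending_fdi: bool = False
    sending_bdi: bool = False
    alarm_raised: bool = False
    traffic_suppressed: bool = False  # used in the dTTSI case only


def exit_criteria_met(expected_cv: int, unexpected_cv: int) -> bool:
    """True if the 3-second window satisfies the defect exit criteria."""
    return 2 <= expected_cv <= 4 and unexpected_cv == 0


def on_window(actions: SinkActions, expected_cv: int,
              unexpected_cv: int) -> bool:
    """Stop all consequent actions if the exit criteria are met."""
    if not exit_criteria_met(expected_cv, unexpected_cv):
        return False
    actions.sending_fdi = False
    actions.sending_bdi = False
    actions.alarm_raised = False
    actions.traffic_suppressed = False
    return True
```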

   Note - The starting/stopping of aggregation of any LSP user-plane
   packet/octet loss metrics (e.g. if using the P OAM packet) is
   dependent on whether the LSP is in the available or unavailable
   state.


6.6 Available and unavailable state processing

   The main purpose of defining harmonized defect entry/exit criteria
   as noted above is in order to significantly simplify:

   o    Near-end/far-end LSP Trail Sink Defect State processing;
   o    Near-end/far-end LSP Available State processing (which will
        shortly be discussed);
   o    The decision point at which any LSP user-plane traffic QoS
        metrics (if being collected) are stopped/started with respect
        to aggregation into long-term registers.

   In all sections where the evaluation of events is described, the
   measurement technique is based on a sliding-window with a 1 second
   granularity of advance.  Note that the datum for the commencement of
   the sliding window is an arbitrary point in time decided by each
   node independently and is not synchronized to OAM packet arrival
   events on any LSPs.  This is deemed acceptable to allow simpler
   nodal processing.
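
   As an informal illustration (not part of this specification), a
   sliding window with a 1 second granularity of advance can be
   sketched as a fixed-length queue of per-second counts.  The class
   and method names are hypothetical.  The caller is assumed to invoke
   close_second() once per second from the node's own clock, with the
   number of packets counted in the second just ended.

```python
from collections import deque


class SlidingWindow:
    """Packet counts over the most recent `width` one-second slots."""

    def __init__(self, width: int = 3):
        self.width = width
        self.slots = deque(maxlen=width)  # oldest slot drops automatically

    def close_second(self, count: int) -> None:
        """Record the count for the second that has just ended."""
        self.slots.append(count)

    def full(self) -> bool:
        """True once a whole window of seconds has been recorded."""
        return len(self.slots) == self.width

    def total(self) -> int:
        """Number of packets seen across the current window."""
        return sum(self.slots)
```

   One such window per packet class (expected TTSI, unexpected TTSI)
   would supply the counts used by the entry and exit criteria.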

   It should be noted that this document uses the traditional
   functional dependency relationship between QoS and availability.
   That is:

   o    QoS is a unidirectional metric, i.e. if QoS metrics are being
        measured then each direction is measured independently.
   o    Availability is a bi-directional metric in the case of
        bi-directional LSPs, in the sense that if any direction enters
        the unavailable state (defined later) then both directions are
        deemed to be unavailable.  In the case of unidirectional LSPs,
        availability can only have unidirectional significance.
   o    QoS measurements must be suspended (as regards aggregation into
        long-term available state registers) if an LSP enters the
        unavailable state; noting that, in the case of bi-directional
        LSPs, this means the QoS measurements of both directions, from
        the definition of the availability metric above.

   However, it should also be noted that (for both pragmatic reasons
   and to preserve their statistical significance) QoS metric
   aggregation is actually suspended after detecting a short-break
   event.


6.6.1   Short Break definition

   We first define a short-break event.  This is defined as a period
   where the entry to and exit from any of the previously defined
   defect conditions both occur within 9s, i.e. the LSP Trail Sink
   Near-End Defect State lasts for <= 9s.  The start of the short-break
   occurs at the beginning of the defect entry criteria and the end of
   the short-break occurs at the beginning of the defect exit criteria.
   Clearly this has a minimum period of 3s.  Short-breaks are only
   defined to exist when the LSP is in the Available State.

   Note - Short-breaks are more common than many people realize (in one
   operator's network a study of SES (Severely Errored Second) events
   showed that about 50% of these would have been classified as short-
   breaks).  They can cause severe disruption to some applications and
   are therefore an important performance metric (perhaps second in
   importance after availability).  Since they exist at the physical
   layers they will exist (by inheritance) in client layers, such as
   MPLS and IP.  An important property of the short-break, which we
   will exploit, is that it yields a pragmatic harmonized threshold for
   defect evaluation (across all defect types as noted previously) and
   the stopping/starting of QoS metric aggregation into long-term up-
   state performance registers.


6.6.2   Available/Unavailable State Definition

   If the LSP Trail Sink Near-End Defect State exceeds 10 consecutive
   seconds in duration then the LSP enters the Unavailable State. The
   start point of the Unavailable State is deemed to be at the
   beginning of these 10 consecutive seconds. We therefore no longer
   have a short-break (and the event should not be registered as such).

   An LSP re-enters the Available State once it has exited the LSP
   Trail Sink Near-End Defect State and there has then been a period
   of 10 consecutive seconds in which there have been:

   o    >= 9 and <= 11 CV OAM packets each with an expected TTSI, AND
   o    No CV OAM packets with an unexpected TTSI.

   Note that the numbers of CV OAM packets observed each with an
   expected TTSI are suggested numbers.  Whether these numbers are
   appropriate is for further study.

   The start point of the Available State is deemed to be at the
   beginning of these 10 consecutive seconds.
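
   As an informal illustration (not part of this specification), the
   short-break and unavailability thresholds can be sketched as below.
   The function names are hypothetical.  Treating a defect state of
   exactly 10 seconds as unavailable is an assumption, since the text
   defines a short-break as <= 9s and the Unavailable State in terms
   of 10 consecutive seconds.

```python
SHORT_BREAK_MAX_S = 9   # defect state lasting <= 9s is a short-break
AVAILABLE_WINDOW_S = 10  # clean period needed to re-enter Available


def classify_defect_period(duration_s: int) -> str:
    """Classify a near-end defect state that began in the Available State."""
    if duration_s <= SHORT_BREAK_MAX_S:
        return "short-break"
    # Assumption: 10 or more whole seconds of defect state.
    return "unavailable"


def can_reenter_available(expected_cv: int, unexpected_cv: int) -> bool:
    """Check the counts seen in 10 consecutive defect-free seconds."""
    return 9 <= expected_cv <= 11 and unexpected_cv == 0
```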


6.6.3   Near-end and Far-end Measurements of Availability

   All of the above discussion is strictly only relevant to the near-
   end processing when the LSP trail termination sink point is in the
   LSP Trail Sink Near-End Defect State as discussed previously.  We
   can also measure the far-end availability behavior (useful when only
   a single end is accessible for measurement) by using the BDI signal
   (when bi-directional LSPs are being used) since this is a reflected
   upstream mirror of the duration over which FDI is sent downstream.

   We therefore define the LSP Trail Sink Far-End Defect State to be
   the period over which BDI OAM packets are observed subject to the
   following entry and exit criteria:

   Harrison et. al.      Expires August 2001                  Page 24

   -    Entry of the LSP Trail Sink Far-End Defect State occurs on the
        first BDI OAM packet observed.
   -    Exit of the LSP Trail Sink Far-End Defect State occurs after a
        period of 3 consecutive seconds in which no BDI OAM packets
        have been received.

   Note that this 3s processing delay on exit is to cater for cases in
   which perhaps a single BDI is lost (say due to congestion or
   errors).  Its effect must be catered for in the far-end processing
   state machine as discussed later.

   Since we have fixed the temporal duration of the far-end state to be
   directly related to the near-end state (albeit with a +3s exit
   checking period) we can therefore measure both short-breaks and
   unavailability of both directions from a single end (on the
   assumption that bi-directional LSPs are being used).
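
   As a minimal sketch, the entry and exit criteria for the LSP Trail
   Sink Far-End Defect State can be applied to a list of BDI OAM
   packet arrival times.  The function name and the use of arrival
   times in seconds are assumptions made for illustration.  A BDI
   arriving exactly 3s after the previous one is treated here as
   starting a new defect state, which the text leaves open.

```python
def far_end_defect_intervals(bdi_times, exit_delay=3.0):
    """Return (entry, exit) pairs for the Far-End Defect State.

    `bdi_times` are arrival times, in seconds, of BDI OAM packets.
    Entry occurs on the first BDI observed; exit occurs once
    `exit_delay` seconds have passed with no further BDI.
    """
    intervals = []
    entry = last = None
    for t in sorted(bdi_times):
        if entry is None:
            entry = last = t
        elif t - last >= exit_delay:
            # The previous defect state ended before this BDI arrived.
            intervals.append((entry, last + exit_delay))
            entry = last = t
        else:
            last = t
    if entry is not None:
        intervals.append((entry, last + exit_delay))
    return intervals
```

   For example, BDI packets seen at 3s, 4s and 10s give two defect
   states, (3, 7.0) and (10, 13.0), since the gap between 4s and 10s
   exceeds the 3s exit period.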


6.6.4   Near-End State Processing Flow-chart

   The following figure summarizes many of the key points regarding the
   near-end state-processing algorithm for a given LSP.

                Figure 5: LSP Near-End State Processing Flow Chart

   1.   Assume we start in the available state in the box marked
        'Start'.  All timers (shown later) can conceptually be assumed
        reset at this point.  If any QoS metrics are being collected
        (e.g. packet/octet loss measurements from the P OAM packet)
        then this collection is assumed to be active at this time.
   2.   The first decision box is 'dLOCV, dTTSI or dLoop?'.  These
        defects were defined previously.  If none of these defects is
        present we keep checking for this condition and stay in the
        available state.  However, if one of these defects is present
        we enter the Trail Sink Near-End Defect State.
   3.   The consequent actions now required depend on the nature of the
        defect observed and on whether there is any incoming FDI from a
        lower layer, and should follow the rules given previously.  But
        note that any QoS metrics that are being collected are
        suppressed from aggregation into the long-term registers
        against available time.  The registers are effectively
        backdated 3s to allow for the defect detection time (at this
        stage we cannot judge whether the event will be a Short-Break,
        in which case the LSP remains in the Available State, or
        whether the LSP will enter the Unavailable State).
   4.   We now start timer T1.  This timer is used to determine the
        duration of the Trail Sink Near-End Defect State, and if this
        persists for a sufficient time (i.e. a further 10s) then this
        timer is used to branch the flow-chart into the Unavailable
        State processing region.
   5.   Below (timer) T1, we loop round the decision boxes 'T1<10s?'
        and 'End dLOCV, dTTSI or dLoop?'.  We can exit this loop if the
        defect state ends (in accordance with criteria given
        previously) before T1 reaches 10s.  Since we are still in the
        available state, we restart any QoS metric aggregation into the
        long-term registers (noting the last 3s must be accounted for),
        we stop FDI/BDI OAM packet generation and capture the short-
        break event in the local registers.  Additionally, if the event
        was due to a dTTSI, then we should also capture the TTSI of the
        offending LSP and cease the suppression of traffic.  The
        timestamp of the event should be related to the onset of the
        defect that caused it.  If however T1 reaches 10s we enter the
        Unavailable State.  Note that it is not possible to enter the
        Unavailable State unless the Trail Sink Near-End Defect State
        has persisted for at least 10s in the Available State.
   6.   We now record a date/time-stamped Unavailable State entry event
        in the local registers together with information on the nature
        of the defect that caused it.  Note that the date/timestamp
        must be backdated 13s (the 3s defect detection time plus the
        10s measured by T1).  Optionally, we may also send an exception
        report to the NMS with the Unavailable State entry
        date/timestamp noted above, together with any other relevant
        information about the defect that caused it, e.g. in the case
        of dTTSI this should include the TTSI of the offending LSP.  We
        now stop timer T1 and start timer T2, whose purpose is to
        record the duration of the Unavailable State.  Note that when
        we enter the Unavailable State we also remain in the Trail Sink
        Near-End Defect State.
   7.   We now loop on the decision box 'End dLOCV, dTTSI or dLoop?',
        located just below the point where we started timer T2, which
        checks for the end of the defect state.  When the defect ends
        (in accordance with the criteria given previously) we stop
        FDI/BDI OAM packet generation and exit the Trail Sink Near-End
        Defect State.  Any QoS metric aggregation is still inhibited.
   8.   We now run round the decision loop comprising the two boxes
        '>=9 but <= 11 expected CV OAM packets in last 10s AND no
        unexpected CV OAM packets?' and 'dLOCV, dTTSI or dLoop?'.  If a
        further defect occurs before we meet the exit criteria of the
        former decision box, we re-enter the Trail Sink Near-End Defect
        State and hence restart the generation of FDI/BDI OAM packets
        (with DL/DT codepoints and other consequent actions relevant to
        the specific defect observed).  Any QoS metric aggregation
        continues to be inhibited.  In this case we are back at point 7
        above in the state processing and recommence checking for the
        end of the defect.  Note that timer T2 continues to run.
   9.   To get out of the Unavailable State we must first have exited
        the Trail Sink Near-End Defect State as noted in 7 above, and
        then met the criteria of the decision box '>=9 but <= 11
        expected CV OAM packets in last 10s AND no unexpected CV OAM
        packets?' as noted in 8 above.  Note that the 'last 10s'
        referred to here includes the 3s interval required to check for
        the end of the Trail Sink Near-End Defect State as noted above
        in item 7.
   10.  We now stop timer T2 and record the duration of the
        unavailability event in the local registers.  We recommence any
        QoS metric aggregation into the local registers and cease all
        consequent actions associated with the Unavailable State.  Note
        that T2 will record an Unavailable State duration that is 3s
        less than the true duration of the unavailability event.  Note
        also that the last 10s belong to the Available State and so any
        QoS metric aggregation will need to take these 10s into
        account.  Optionally, we may also send an exception report to
        the NMS with the Unavailable State exit date/timestamp suitably
        corrected as noted above.
   11.  This now takes us back to our starting point in the Available
        State.
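
   The timer and timestamp handling in steps 1 to 11 can be sketched
   as follows.  This is a simplified illustration rather than a
   definitive implementation: the class and method names are
   hypothetical, defect detection and the CV window check are assumed
   to be done elsewhere, and the 10s correction applied to the exit
   timestamp is our reading of step 10.

```python
DETECT_S = 3     # defect detection time (step 3)
T1_LIMIT_S = 10  # defect persistence that triggers unavailability
CLEAR_S = 10     # clean period that ends the Unavailable State


class NearEndTracker:
    """Hypothetical tracker for the near-end flow chart."""

    def __init__(self):
        self.in_defect = False
        self.unavailable = False
        self.t1_start = None  # timer T1
        self.t2_start = None  # timer T2
        self.events = []

    def defect_declared(self, now):
        """Steps 2-4, or re-entry of the defect state per step 8."""
        self.in_defect = True
        if not self.unavailable:
            self.t1_start = now

    def tick(self, now):
        """Steps 5-6: enter the Unavailable State once T1 reaches 10s."""
        if (self.in_defect and not self.unavailable
                and now - self.t1_start >= T1_LIMIT_S):
            self.unavailable = True
            self.t2_start = self.t1_start + T1_LIMIT_S
            # Entry timestamp is backdated 13s from the expiry of T1.
            entry = self.t2_start - (DETECT_S + T1_LIMIT_S)
            self.events.append(("unavailable_entry", entry))

    def defect_cleared(self, now):
        """Steps 5 and 7: the defect exit criteria have been met."""
        self.tick(now)
        self.in_defect = False
        if not self.unavailable:
            # Short-break, stamped at the onset of the defect.
            self.events.append(("short_break", self.t1_start - DETECT_S))

    def clean_window_seen(self, now):
        """Steps 9-10: the 10s CV criteria have been satisfied."""
        if self.unavailable and not self.in_defect:
            t2 = now - self.t2_start
            # T2 reads 3s less than the true unavailability duration.
            self.events.append(
                ("unavailable_exit", now - CLEAR_S, t2 + DETECT_S))
            self.unavailable = False
```

   For a defect declared at 203s and cleared at 303s, with the clean
   window completing at 310s, this records an entry stamped 200s, an
   exit stamped 300s and a corrected duration of 100s.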


6.6.5   Far-End State Processing Flow-chart

   The following figure summarizes many of the key points regarding the
   far-end state-processing algorithm for a given LSP.

                Figure 6: LSP Far-End State Processing Flow Chart

   1.   Assume we start in the available state at the box marked
        'Start'.  All timers shown later in the flow chart can
        conceptually be assumed to be reset at this point.  If there is
        any backward QoS aggregation activated on the return direction
        LSP then this will be via a separate P OAM packet flow on the
        return LSP.
   2.   The first decision box is 'BDI OAM packet?'.  If the answer is
        'No', then we keep looping this check condition and stay in the
        Available State.  If the answer is 'Yes', then this implies
        that the near-end processing at the other end of the (outgoing)
        LSP has entered the Trail Sink Near-End Defect State.  Note
        that this also implies that the defect has already existed for
        3s at the other end of this LSP.
   3.   We then enter the Trail Sink Far-End Defect State and inhibit
        any backward QoS metric aggregation.  The QoS registers will
        need to be corrected for the previous 3s, which should not be
        aggregated into the long-term Available State counts.
   4.   We now start timer T3, and run round the loop composed of the
        decision boxes 'T3 <13s?' and '3s BDI-Free?'.  T3 is used to
        check the duration of the Trail Sink Far-End Defect State.  If
        T3 does not reach 13s and we get 3s that are BDI-Free, then
        we re-start any backward packet level metric aggregation.  Note
        that the last 6s must be accounted for in any backward QoS
        metric aggregation registers.  This arises since it takes the
        near-end processing 3s to declare the end of the defect at the
        other end of the (outgoing) LSP, and a further 3s to declare
        the end of the Trail Sink Far-End Defect State at this end of
        the (return) LSP, and all this time should count towards the
        Available State at this end of the LSP to ensure correct QoS
        metric aggregation.  A Short-Break date/time-stamped event
        should also be recorded in the local registers together with
        the DL/DT information of the defect as given in the BDI OAM
        packet.  This Short-Break event must be date/time-stamped
        relative to 3s before the time at which the first BDI OAM
        packet was observed.  This now takes us back to the initial
        start position.  If however T3 reaches 13s we enter the far-end
        Unavailable State.  Note that it is not possible to enter the
        Unavailable State unless the Trail Sink Far-End Defect State
        has effectively persisted for at least 13s (which means that at
        the other end of the (outgoing) LSP the Trail Sink Near-End
        Defect State has persisted for at least 10s) in available time.
   5.   Optionally, we may now send a date/time-stamped unavailability
        entry exception report to the NMS, which includes the relevant
        BDI OAM packet DL/DT information.  Note that the date/timestamp
        of any such exception report should be backdated by 16s (i.e. 3s
        prior to the first BDI OAM packet being observed for this
        event) to align the far-end processing with that of the near-
        end processing at the other end.  We now stop timer T3 and
        start a timer T4, whose purpose is to record the duration of
        this unavailability event.  Note that when we enter the
        Unavailable State we also remain in the Trail Sink Far-End
        Defect State.
   6.   We now run round a loop that checks for 3s which are BDI-Free.
        This is used to take us out of the Trail Sink Far-End Defect
        State.  Note that this check is not strictly necessary: it
        could have been omitted, leaving only the following one, which
        checks for a continuous (i.e. overall) 10s of BDI-Free
        behavior.  However, it has been shown like this to harmonize
        the 'look' of the near-end and far-end Trail Sink Defect State
        processing.
   7.   If we get 3s which are BDI-Free then we exit the Trail Sink
        Far-End Defect State and run a loop which checks if we have had
        an overall continuous period of 10s which are BDI-Free.  If any
        further BDI OAM packets appear within this overall 10s checking
        period then we re-enter the Trail Sink Far-End Defect State and
        need to repeat the process from step 6 above.  If, however, no
        further BDI OAM packets appear within the 10s checking period
        we exit the far-end Unavailable State.
   8.   We stop timer T4 and record the duration of the unavailability
        event.  T4 will record a time that is 3s less than the true
        duration of the unavailability event.  A date/time-stamped
        unavailability exit event, backdated 13s, together with the
        unavailability duration should now be recorded in the local
        registers.  Optionally, this information may also be sent to
        the NMS as an exception report.
   9.   Any backward QoS metric aggregation can now be restarted,
        noting that the last 13s belong to available time and so the
        aggregate registers should be corrected accordingly.
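
   The far-end classification and the timestamp corrections in steps
   1 to 9 can be sketched as a function over BDI OAM packet arrival
   times.  The function name, the tuple layout and the handling of
   exact boundary cases are assumptions made for illustration.  A
   defect state whose first and last BDI are at least 10s apart is
   treated as reaching T3 = 13s.

```python
DETECT_S = 3     # near-end defect detection time
BDI_FREE_S = 3   # BDI-free time that ends the Far-End Defect State
T3_LIMIT_S = 13  # defect state duration that triggers unavailability
CLEAR_S = 10     # BDI-free time that ends the Unavailable State


def classify_far_end(bdi_times):
    """Classify BDI arrival times (seconds) into far-end events.

    Returns ("short_break", onset) or
    ("unavailable", entry, exit, duration) tuples, with every
    timestamp already backdated as the flow chart requires.
    """
    events = []
    times = sorted(bdi_times)
    i = 0
    while i < len(times):
        first = last = times[i]  # step 2: first BDI, start T3
        i += 1
        # Steps 3-4: remain in the defect state while BDIs keep coming.
        while i < len(times) and times[i] - last < BDI_FREE_S:
            last = times[i]
            i += 1
        if last + BDI_FREE_S < first + T3_LIMIT_S:
            # 3s BDI-free before T3 reached 13s: a short-break,
            # stamped 3s before the first BDI was observed.
            events.append(("short_break", first - DETECT_S))
            continue
        # Steps 5-7: unavailable until 10s are continuously BDI-free.
        while i < len(times) and times[i] - last < CLEAR_S:
            last = times[i]
            i += 1
        t3_expiry = first + T3_LIMIT_S
        t4_stop = last + CLEAR_S
        entry = t3_expiry - (T3_LIMIT_S + DETECT_S)  # backdated 16s
        exit_ = t4_stop - (CLEAR_S + DETECT_S)       # backdated 13s
        t4 = t4_stop - t3_expiry                     # 3s short (step 8)
        events.append(("unavailable", entry, exit_, t4 + DETECT_S))
    return events
```

   With BDI packets observed every second from 203s to 303s, the
   entry is stamped 200s, the exit 300s and the corrected duration is
   100s, which matches the near-end view of the same event.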

6.6.6   A pictorial view of near-end and far-end state processing

   The following figure is given to help clarify the temporal
   relationships between the near-end and far-end state processing
   given in the previous flow-charts for a short-break event and an
   unavailability event.

   Figure 7:  Near-End and Far-End Temporal Processing of a Short-Break
   and Unavailability event




7. Security Considerations

   The OAM function described in this document enhances the security of
   MPLS networks by detecting mis-connections, thereby preventing
   customers' traffic from being exposed to other customers.

   The MPLS OAM functions as defined in this document do not raise any
   new security issues for MPLS networks.


8. References


   [1]  Rosen E, et al, RFC 3032, "MPLS label stack encoding".

   [2]  Le Faucheur et al, "MPLS support of Differentiated Services",
   draft-ietf-mpls-ext-08.txt, work in progress.

   [3]  Awduche et al, "RSVP-TE: Extensions to RSVP for LSP Tunnels",
   draft-ietf-mpls-rsvp-lsp-tunnel-05.txt, work in progress.

   [4]  Hinden and Deering, RFC 1884, "IP Version 6 Addressing
   Architecture".

9. Authors' Addresses

   Neil Harrison
   British Telecom              Phone: 44-1604-845933
   Heath Bank                   Email: neil.2.Harrison@bt.com
   Iugby Road, Harleston
   South Hampton, UK

   Peter Willis
   British Telecom              Phone: 44-1473-645178
   BT, PP RSB10/PP3 B81         Email: peter.j.willis@bt.com
   Adastrial Park
   Martlesham, Ipswich, UK

   Shahram Davari
   PMC-Sierra
   411 Legget Drive             Phone: 1-613-271-4018
   Kanata, ON, Canada           Email: Shahram_Davari@pmc-sierra.com

   Ben Mack-Crane
   Tellabs
   4951 Indiana Ave             Phone: 1-630-512-7255
   Lisle, IL, USA               Email: ben.mack-crane@tellabs.com

   Hiroshi Ohta
   NTT
   Y-709A, 1-1 Hikarino'ka      Phone: 81-468-59-8840
   Yokosuka-Shi                 Email: ohta.hiroshi@nslab.ntt.co.jp
   Kanagawa, Japan
