Internet Working Group                                          Y. Jiang
                                                                   W. Xu
Internet Draft                                                    Huawei
                                                                  Z. Cao
Intended status: Standards Track                            China Mobile


Expires: January 2015                                       July 4, 2014


               Fault Management in Service Function Chaining
                          draft-jxc-sfc-fm-00.txt


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 4, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of



Jiang and et al        Expires January 4, 2015                [Page 1]


Internet-Draft            SFC Fault Management               July 2014


   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   Service Function Chaining (SFC) provides a flexible and agile
   approach to service innovation, but whether an SFC path is
   constructed as expected and whether the chain functions correctly
   still need to be verified. This document discusses fault management
   requirements in SFC and provides a fault management solution for
   service function chaining.

Table of Contents

   1.   Introduction .............................................. 2
      1.1. Conventions used in this document ...................... 3
      1.2. Terminology ............................................ 3
      1.3. SFC OAM Requirements ................................... 4
   2.   Packet Format ............................................. 5
   3.   Theory of Operation ....................................... 8
      3.1. Continuity Check and Connectivity Verification of SFC .. 8
      3.1.1.  MEP sending an SFC CC-CV packet ..................... 8
      3.1.2.  MEP terminating an SFC CC-CV packet ................. 9
      3.2. SFC Route Tracing ...................................... 9
      3.2.1.  MEP sending an SFC Trace Request ................... 11
      3.2.2.  SFE/SFF processing an SFC Trace Route Request ...... 11
      3.2.3.  Service Function treating an SFC Trace Request ..... 11
      3.2.4.  MEP receiving an SFC Trace Reply ................... 11
   4.   Security Considerations .................................. 12
   5.   IANA Considerations ...................................... 12
   6.   References ............................................... 12
      6.1. Normative References .................................. 12
      6.2. Informative References ................................ 12
   7.   Acknowledgments .......................................... 13



1. Introduction

   This document discusses Operations, Administration and Maintenance
   (OAM) for Service Function Chaining (SFC), specifically its fault
   management requirements, and provides a solution that can be used to
   detect data plane failures in SFC paths.

   A requisite of SFC OAM is that SFC OAM messages must follow the same
   data path as normal SFC packets would traverse. SFC OAM request and
   reply messages are used primarily to validate the SFC data plane, and
   may further be used to verify the SFC data plane against the SFC
   control plane.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].



1.2. Terminology

   Maintenance Entity Group (MEG): The set of one or more maintenance
   entities that maintain and monitor a section or a transport path in
   an OAM domain.

   MEP: MEG End Point, an OAM end point capable of initiating (source
   MEP) and terminating (sink MEP) OAM packets for fault management and
   performance monitoring.

   MIP: MEG Intermediate Point, an OAM intermediate point that
   terminates and processes OAM packets that are sent to this particular
   MIP and may generate OAM packets in reaction to received OAM packets.

   Service Function (SF): a logical entity which can provide one or
   more service processing functions for packets/frames, such as
   firewall, DPI (Deep Packet Inspection), LI (Lawful Intercept), etc.
   Usually these processing functions are computation intensive. This
   entity may also provide packet/frame encapsulation/decapsulation
   capability.

   Service Forwarding Entity (SFE): a logical entity which forwards
   packets/frames to one or more SFs in the same service chain.
   Optionally, it provides mapping, insertion and removal of header(s)
   in packets/frames. Note that the service forwarding path may not be
   the shortest path to its destination.

   Service Function Forwarder (SFF):  A service function forwarder is
   responsible for delivering traffic received from the SFC network
   forwarder to one or more connected service functions via
   information carried in the SFC encapsulation.

   Service Chaining Header: a header placed in front of a packet by an
   SFE/SFF. An SFE/SFF uses the information in the service chaining
   header to forward service chaining packets.

   Service Chaining Packet: an original packet to which a service
   chaining header has been added.

1.3. SFC OAM Requirements

   The following SFC OAM requirements MUST be supported:

   (R1) SFC OAM MUST allow for continuity check between SFEs/SFFs.

   (R2) SFC OAM MUST allow for connectivity verification between
   SFEs/SFFs.

   (R3) SFC OAM MUST support trace routing in a service function path.

   (R4) SFC OAM MUST support connectivity verification between SFs in a
   service function chain.

   (R5) SFC OAM MUST support performance measurements in SFs and
   SFEs/SFFs.

   (R6) SFC OAM MUST support monitoring of unidirectional and
   bidirectional SFC paths.

   (R7) SFC OAM MUST support fate sharing of SFC OAM packets and SFC
   service packets on the same SFC path (congruent path).



   Since a control plane is not a prerequisite for SFC, fault detection
   cannot rely on control plane hello sessions. Furthermore, OAM packets
   need to be transported on the same data path as the SFC packets, so
   that any data plane failure can be identified.

   Therefore, there is a need to provide an OAM tool that would enable
   users to detect failures in the SFC data plane, and a mechanism to
   isolate and identify faults.

   This document discusses the fault management problem in SFC. The
   basic idea is to verify that packets in a particular Service Function
   Chain actually pass through the SFEs/SFFs and SFs along the
   respective SFC path.

   It is proposed that this test be carried out by sending an OAM
   message (called an "SFC trace request message") across an SFC path.
   The SFC trace request message carries the identifier of the SFC whose
   path is being verified.  This SFC OAM request message is forwarded on
   the SFC path just like any SFC data packet belonging to that Service
   Function Chain.

   The OAM message is processed by each SFE/SFF along the SFC, and the
   SFE/SFF responds with an SFC trace reply message carrying
   information such as the previous SF identifier and its position in
   the SFC.



2. Packet Format

   OAM messages are encapsulated in an SFC packet in the following
   format (they should not be combined with any SFC data traffic in the
   same SFC packet):

   +------------+-------------+
   | SFC Header | OAM message |
   +------------+-------------+

   where the SFC header is formatted as shown in Figure 1:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|O|                 other parameters                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    other parameters                           |
   .                                                               .
   .                                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                            Figure 1 SFC header



   O: The O flag indicates that an SFC OAM message follows the SFC
   header.
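
   As a non-normative illustration, the Version and O fields shown in
   Figure 1 might be read from the first 32-bit word of an SFC header
   as sketched below in Python.  The sketch assumes network byte order
   with bit 0 as the most significant bit, which is the usual
   convention for such diagrams.  The function name is hypothetical,
   and the remaining 27 bits are ignored because their layout is
   defined in [I-D.niu-sfc-mechanism] rather than in this document.

```python
import struct


def parse_sfc_first_word(header: bytes):
    """Return (version, o_flag) from the first 32-bit word of a header.

    Hypothetical helper: only the two fields named in Figure 1 are
    decoded; all other bits are left uninterpreted.
    """
    if len(header) < 4:
        raise ValueError("SFC header shorter than 4 octets")
    (word,) = struct.unpack("!I", header[:4])  # network byte order
    version = word >> 28                 # bits 0-3 of the first word
    o_flag = bool((word >> 27) & 1)      # bit 4: OAM message follows
    return version, o_flag
```

   For example, a first octet of 0x18 would decode as version 1 with
   the O flag set.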



   An SFC OAM message is depicted in Figure 2.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Version   | Message Type  |         Reserved              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Originator Handle                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Sequence Number                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        TLVs                                   |
   .                                                               .
   .                                                               .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                          Figure 2 SFC OAM message

   o Version: version of the SFC OAM message. This field is 8 bits
      long, and the current version is set to 0x01.

   o Message Type: indicates the type of the SFC OAM message. The
      following types are defined:

          Value        Meaning
          -----        -------
              1        continuity check message
              2        trace request message
              3        trace reply message



   o Originator Handle: The Originator Handle is filled in by the
      original sender of the packet.

   o Sequence Number: The Sequence Number is assigned by the sender of
      the SFC request message and can be used to track the correct reply
      message.

   o The Sending Timestamp is the time-of-day (in seconds and
      microseconds, according to the sender's clock) when the SFC OAM
      request is sent.  The Receiving Timestamp in an SFC OAM reply
      message is the time-of-day (according to the receiver's clock)
      that the corresponding request was received.

   o TLVs (Type-Length-Value) have the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Type              |           Length              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Value                                  |
   .                                                               .
   .                                                               .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                           Figure 3 SFC OAM TLVs

   The Types of SFC OAM TLVs will be defined in the next revision.
   Length is the length of the Value field in octets. The Value field is
   variable in length depending on its Type, and is zero padded to align
   to a 4-octet boundary.
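
   The layout in Figures 2 and 3 could be serialized as in the
   following Python sketch, which is offered as an illustration rather
   than a reference implementation.  It assumes network byte order, and
   it assumes that Length counts the Value octets before padding, which
   this document does not state explicitly.  All function names are
   hypothetical, and TLV type 1 in the usage note is a placeholder
   since no TLV types are defined yet.  The Sending and Receiving
   Timestamps are left out because they do not appear in Figure 2.

```python
import struct

OAM_VERSION = 0x01
MSG_CC, MSG_TRACE_REQUEST, MSG_TRACE_REPLY = 1, 2, 3

# Version (8), Message Type (8), Reserved (16),
# Originator Handle (32), Sequence Number (32)
_FIXED = struct.Struct("!BBHII")
# TLV Type (16), TLV Length (16)
_TLV_HDR = struct.Struct("!HH")


def encode_tlv(tlv_type, value):
    """Encode one TLV, zero padding the value to a 4-octet boundary."""
    pad = (-len(value)) % 4
    return _TLV_HDR.pack(tlv_type, len(value)) + value + b"\x00" * pad


def encode_oam(msg_type, handle, seq, tlvs=()):
    """Encode the fixed fields followed by zero or more TLVs."""
    body = b"".join(encode_tlv(t, v) for t, v in tlvs)
    return _FIXED.pack(OAM_VERSION, msg_type, 0, handle, seq) + body


def decode_oam(data):
    """Return (version, msg_type, handle, seq, [(type, value), ...])."""
    version, msg_type, _reserved, handle, seq = _FIXED.unpack_from(data, 0)
    tlvs, offset = [], _FIXED.size
    while offset < len(data):
        tlv_type, length = _TLV_HDR.unpack_from(data, offset)
        start = offset + _TLV_HDR.size
        if start + length > len(data):
            raise ValueError("truncated TLV")
        tlvs.append((tlv_type, data[start:start + length]))
        # Skip the value and its padding to reach the next TLV.
        offset = start + length + (-length) % 4
    return version, msg_type, handle, seq, tlvs
```

   Under these assumptions, a trace request carrying a single 5-octet
   TLV occupies 24 octets: 12 for the fixed fields, 4 for the TLV
   header, and 8 for the padded value.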

3. Theory of Operation

   In order to describe SFC OAM in an abstract way, this document reuses
   some nomenclature from the MPLS WG. SFC OAM operates in the context
   of Maintenance Entities (MEs) that define a relationship between two
   points of a service function path to which maintenance and monitoring
   operations apply. The two points that define a maintenance entity are
   called Maintenance Entity Group End Points (MEPs).

   An abstract reference model for an ME is illustrated in Figure 4
   below:

                         +-+    +-+    +-+    +-+
                         |A|----|B|----|C|----|D|
                         +-+    +-+    +-+    +-+
                      Figure 4 SFC OAM Reference Model

   In Figure 4, node A can be a classifier or an entry SFE/SFF, node D
   can be an exit SFE/SFF, and nodes B and C can be any SFE/SFF or SF
   on the SFC path (with the restriction that no two SFs can be directly
   connected in the SFC forwarding layer).

   In general, MEG End Points (MEPs) are the source and sink points of a
   MEG for SFC OAM.

3.1. Continuity Check and Connectivity Verification of SFC

   Proactive Continuity Check (CC) can be used to detect a loss of
   continuity defect between two MEPs in a MEG.

   Proactive Connectivity Verification (CV) can be used to detect an
   unexpected connectivity defect between two MEGs or unexpected
   connectivity within the MEG with an unexpected MEP.

   Bidirectional Forwarding Detection (BFD) can also be used as a tool
   for proactive CC and CV in SFC, in which case BFD Control packets
   must be sent along the same path as the monitored SFC path.


3.1.1. MEP sending an SFC CC-CV packet

   A source MEP can proactively send CC-CV packets periodically to its
   peer sink MEP. An SFC CC-CV packet is an SFC CC-CV message
   encapsulated with an SFC header. The SFC header is set as described
   in [I-D.niu-sfc-mechanism] and its O flag MUST be set to 1.

   The SFC OAM message is set as follows:

   - The Message Type MUST be set to 1.

   - The Originator Handle is set by the original sender, and MUST be
      set to the sender's identifier.

   - The Sequence Number is set to a random value.


3.1.2. MEP terminating an SFC CC-CV packet

   A sink MEP detects a loss of continuity defect when it fails to
   receive proactive CC-CV OAM packets from the source MEP for a
   consecutive period of time.

   When a CC-CV packet is received by a sink MEP, it is parsed. If any
   mis-connectivity defect is detected, a warning should be raised and
   the fault management system should be notified of the detected
   defect.
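
   The sink MEP behaviour described above can be sketched as follows.
   This document does not define the CC-CV transmission interval or how
   long a sink MEP waits before declaring a defect, so the interval and
   multiplier parameters, like the class and method names, are
   assumptions made only for this illustration.  Mis-connectivity is
   modelled simply as an unexpected Originator Handle, and time is
   passed in explicitly so that the logic does not depend on a clock.

```python
class CcSinkMonitor:
    """Hypothetical sink MEP state for proactive CC-CV monitoring."""

    def __init__(self, expected_source, interval, multiplier=3.0, now=0.0):
        self.expected_source = expected_source
        # Assumed rule: declare loss after `multiplier` silent intervals.
        self.timeout = interval * multiplier
        self.last_rx = now

    def on_cc_packet(self, originator_handle, now):
        """Return a defect name, or None if the packet is as expected."""
        if originator_handle != self.expected_source:
            return "mis-connectivity"
        self.last_rx = now
        return None

    def loss_of_continuity(self, now):
        """True when no valid CC-CV packet arrived within the timeout."""
        return now - self.last_rx > self.timeout
```

   With an interval of 1 second and the assumed multiplier of 3, a sink
   MEP that last heard from its peer at t=1.0 would report a loss of
   continuity at t=4.5 but not at t=3.5.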


3.2. SFC Route Tracing

   According to the SFC architecture described in Figure 2 of
   [I-D.jiang-sfc-arch] and Figure 2 of [I-D.quinn-sfc-arch], SFC can be
   categorized into two abstraction layers: the service function layer
   and the SFC forwarding layer. In the service function layer, a
   service function chain is actually a service function graph, where
   service functions are connected to one another in sequence. In the
   SFC forwarding layer, service functions are further attached to
   SFE/SFF nodes and thus form a more detailed forwarding graph. As
   defects can be located on either service functions or SFE/SFF nodes,
   it is critical to trace the route through both service functions and
   SFE/SFF nodes in order to detect and isolate any defects in an SFC.

   In order to trace the route of a service function chain, different
   layers of the chain can be monitored:

   o Service-function-layer, that is, only SF identifiers can be set as
      the destination MEP in the trace route request and response
      messages. The trace routing operation collects all the SFs'
      identifiers along an SFC path. By comparing this SF list with the
      pre-configured service function graph, an operator could determine
      whether there is any fault in the SF connectivity and, if so,
      locate the defect on an SF.

   o SFC-forwarding-layer, that is, both SF identifiers and SFE/SFF
      identifiers can be set as the destination MEP in the trace route
      request and response messages.  The trace routing operation
      collects all the SFs' identifiers and SFE/SFF identifiers along an
      SFC path. By comparing this SF and SFE/SFF list with the
      pre-configured SFC forwarding graph, an operator could determine
      whether there is any fault in the forwarding layer and locate the
      defect on an SFE/SFF or an SF.
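
   The comparison step common to both layers can be illustrated with a
   short Python sketch.  Treating the pre-configured graph as a simple
   ordered list of identifiers is an assumption made here for brevity,
   and the function name and the identifiers in the example are
   hypothetical.

```python
def first_divergence(expected, observed):
    """Return the index of the first mismatch, or None if lists agree.

    `expected` is the pre-configured sequence of identifiers and
    `observed` is the sequence collected by the trace operation.
    """
    for index, (want, got) in enumerate(zip(expected, observed)):
        if want != got:
            return index
    if len(expected) != len(observed):
        # One list is a strict prefix of the other: the path diverges
        # right after the shared prefix.
        return min(len(expected), len(observed))
    return None
```

   For instance, if the expected list is (fw, dpi, nat) and the trace
   collects only (fw, dpi), the function returns 2, pointing an
   operator at the third element of the chain.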

   Furthermore, two different mechanisms may be used to trace the route
   of a service function chain:

   o TTL mechanism

   Similar to IP traceroute, the detection node launches a number of
   trace request messages in sequence to detect a fault in a specific
   path. The TTL of successive request messages is set to 1, 2, and so
   on.

   The trace route request passes the SFs along the service function
   graph, and each SF decreases the TTL value by 1.

   A trace route reply message is generated and sent back to the
   launcher when the resulting TTL is equal to zero.

   In this way, the launcher of trace routing can get the list of SFs
   that the trace route request messages pass by parsing all the trace
   route reply messages, and isolate the fault location if there is any.

   o Record route mechanism

   The detection node launches a single trace route request message, and
   this message is transported over the specific SFC path.

   When the trace route request message is received by an SF in the SFC
   path, the SF adds its SF identifier to the end of an SF list carried
   in the message. Moreover, a trace route reply message should be
   generated and sent back to the launcher, and the updated record route
   SF list MUST be copied to the trace route reply message.

   In this way, the launcher of trace routing can get the list of SFs
   that the trace route request message passes by parsing all the trace
   route reply messages, and isolate the fault location if there is any.
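
   The two mechanisms can be contrasted with a small simulation.  This
   is a sketch under simplifying assumptions, not a description of an
   implementation: the SFC path is modelled as an ordered list of SF
   identifiers, a fault is modelled by the hypothetical parameter
   broken_after (the index of the last SF that requests can still
   reach), and every reply is assumed to arrive at the launcher.

```python
def reachable(path, broken_after):
    """Return the SFs a request can reach, given an optional fault."""
    if broken_after is None:
        return list(path)
    return list(path[:broken_after + 1])


def ttl_trace(path, max_ttl=16, broken_after=None):
    """TTL mechanism: one request per TTL value, at most one reply each."""
    alive = reachable(path, broken_after)
    hops = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        reply = None
        for sf in alive:
            remaining -= 1          # each SF decreases the TTL by 1
            if remaining == 0:
                reply = sf          # TTL reached zero here: this SF replies
                break
        if reply is None:
            break                   # no reply: end of path or a fault
        hops.append(reply)
    return hops


def record_route_trace(path, broken_after=None):
    """Record route mechanism: one request, one reply per SF reached."""
    record, replies = [], []
    for sf in reachable(path, broken_after):
        record.append(sf)               # the SF appends its identifier
        replies.append(list(record))    # the reply carries a copy
    return replies
```

   In this model the TTL mechanism needs one request per hop, whereas
   the record route mechanism obtains the same SF list from a single
   request, at the cost of a request that grows at every hop.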

3.2.1. MEP sending an SFC Trace Request

   An MEP initiates a trace route request packet to detect and track any
   fault in a Service Function Chain.

   An SFC trace route request packet is an SFC trace route request
   message encapsulated with an SFC header. The SFC header is set as
   described in [I-D.niu-sfc-mechanism] and its O flag MUST be set to 1.

   The SFC OAM message is further set as follows:

   - The Message Type MUST be set to 2.

   - The Originator Handle MUST be set to the sender's identifier.

   - The Receiver's Handle can be set to the exit SFE/SFF's identifier.


3.2.2. SFE/SFF processing an SFC Trace Route Request

   When an SFE/SFF receives a trace route request packet with the O flag
   set in the SFC header, it first adds its identifier to the end of the
   record route list in the trace request. It then performs its service
   forwarding function, and sends the updated trace route request packet
   to the next SF or the next SFE/SFF.

   Furthermore, the SFE/SFF sends a trace reply packet back to the
   source MEP with a copy of the updated record route list.



3.2.3. Service Function treating an SFC Trace Request

   An SF can only be configured as an MIP in an MEG of SFC. When an SF
   (being an MIP) receives a trace request packet with the O flag set in
   the SFC header from an SFE/SFF, it simply sends the packet back to
   the SFE/SFF transparently.



3.2.4. MEP receiving an SFC Trace Reply

   An MEP should only process an SFC trace reply packet in response to
   an SFC trace request that it has sent. Thus, upon receipt of an SFC
   trace reply packet, an MEP should try to match the trace reply packet
   with a trace request that it has previously sent, by checking the
   corresponding path identifier and Sequence Number in the SFC OAM
   packets. If no match is found, then the MEP MUST drop the trace reply
   packet silently.

   Since each SFE/SFF in the SFC path will send a trace reply packet
   when the trace request packet passes it, a source MEP will receive a
   sequence of trace reply packets from the SFEs/SFFs (other than the
   MEP itself) along the SFC path. Thus, the source MEP can obtain the
   full service topology and SFC path if there is no defect in the SFC
   data plane, and can detect and locate data plane defects if there
   are any.
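
   The reply handling in this section might be organized as in the
   sketch below.  The class and its methods are hypothetical, and the
   pair (path identifier, Sequence Number) is assumed to be sufficient
   to identify an outstanding request.  Because replies may arrive in
   any order, the sketch takes the longest record route list received
   as the best view of the path.

```python
class TraceSession:
    """Hypothetical source MEP bookkeeping for trace requests."""

    def __init__(self):
        # (path_id, seq) -> list of record route lists from replies
        self.pending = {}

    def sent(self, path_id, seq):
        """Remember a trace request that this MEP has sent."""
        self.pending[(path_id, seq)] = []

    def on_reply(self, path_id, seq, record_route):
        """Store a matching reply; return False if it was dropped."""
        key = (path_id, seq)
        if key not in self.pending:
            return False    # no matching request: drop silently
        self.pending[key].append(list(record_route))
        return True

    def longest_route(self, path_id, seq):
        """Return the longest record route list seen for a request."""
        replies = self.pending.get((path_id, seq), [])
        return max(replies, key=len, default=[])
```

   A source MEP could then compare the result of longest_route with the
   expected SFC path to decide whether the trace reached the exit
   SFE/SFF.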



4. Security Considerations

   Security considerations will be addressed in a future revision.

5. IANA Considerations

   IANA considerations will be addressed in a future revision.



6. References

6.1. Normative References

   [RFC2119] S. Bradner; Key words for use in RFCs to Indicate
             Requirement Levels; BCP 14, RFC 2119; March 1997




6.2. Informative References

   [sfc-ps] P. Quinn, and T. Nadeau; Service Function Chaining Problem
             Statement; April 2014; Work in Progress

   [I-D.jiang-sfc-arch] Y. Jiang, H. Li; An Architecture of Service
             Function Chaining; February 2014; Work in Progress

   [I-D.niu-sfc-mechanism] L. Niu, H. Li, Y. Jiang; A Service Function
             Chaining Header and its Mechanism; March 2014; Work in
             Progress

   [I-D.quinn-sfc-arch] P. Quinn, J. Halpern; Service Function Chaining
             (SFC) Architecture; May 2014; Work in Progress

7. Acknowledgments

   TBD




   Authors' Addresses

   Yuanlong Jiang
   Huawei Technologies Co., Ltd.
   Bantian, Longgang district
   Shenzhen 518129, China
   Email: jiangyuanlong@huawei.com


   Weiping Xu
   Huawei Technologies Co., Ltd.
   Bantian, Longgang district
   Shenzhen 518129, China
   Email: xuweiping@huawei.com

   Zhen Cao
   China Mobile
   Xuanwumenxi Ave, Xuanwu District
   Beijing 100053, China
   Email: caozhen@chinamobile.com