Network Working Group                                          M. Bhatia
Internet-Draft                                            Alcatel-Lucent
Intended status: Standards Track                                 M. Chen
Expires: July 9, 2012                                            Z. Wang
                                            Huawei Technologies Co., Ltd
                                                                  L. Guo
                                                           China Telecom
                                                         M. Binderberger
                                                         January 6, 2012


Bidirectional Forwarding Detection (BFD) on Link Aggregation Group (LAG)
                               Interfaces
                        draft-mmm-bfd-on-lags-02

Abstract

   This document proposes a mechanism to run BFD on Link Aggregation
   Group (LAG) interfaces.  It does so by running an independent BFD
   session on every LAG member link.

   (For IP/UDP encapsulation)
   A dedicated well-known multicast IP address for both IPv4 and IPv6 is
   introduced as the destination IP address of the BFD packets when
   running BFD on the member links of the LAG.

   (For Ethernet encapsulation)
   A new Ethernet type is introduced to send BFD packets directly in
   Ethernet frames when running BFD on the member links of the LAG.

   There is currently also no standard that describes how BFD runs on a
   LAG interface as a whole.  This draft proposes a definition for this
   problem too while taking into consideration existing implementations.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-



Bhatia, et al.            Expires July 9, 2012                  [Page 1]


Internet-Draft           BFD for LAG Interfaces             January 2012


   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 9, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  BFD over LAG with a single session . . . . . . . . . . . . . .  4
     2.1.  BFD over Big Pipe  . . . . . . . . . . . . . . . . . . . .  5
     2.2.  Examples of existing implementations . . . . . . . . . . .  5
   3.  BFD on LAG member links  . . . . . . . . . . . . . . . . . . .  6
     3.1.  BFD BLM session  . . . . . . . . . . . . . . . . . . . . .  6
     3.2.  Micro BFD sessions . . . . . . . . . . . . . . . . . . . .  7
       3.2.1.  BFD packet details (IP/UDP Encapsulation,
               Multicast destination address) . . . . . . . . . . . .  7
       3.2.2.  BFD packet details (IP/UDP Encapsulation, Unicast
               destination address) . . . . . . . . . . . . . . . . .  8
       3.2.3.  BFD packet details (Ethernet encapsulation)  . . . . .  8
     3.3.  Concluded BFD state  . . . . . . . . . . . . . . . . . . .  8
     3.4.  User interface for BFD packets . . . . . . . . . . . . . .  9
       3.4.1.  User interface (IP/UDP encapsulation)  . . . . . . . .  9
       3.4.2.  User interface (Ethernet encapsulation)  . . . . . . .  9
   4.  BFD on LAG members and layer-3 applications  . . . . . . . . . 10
   5.  Application example: LMM using BLM . . . . . . . . . . . . . . 10
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 12
     6.1.  (IP/UDP encapsulation) . . . . . . . . . . . . . . . . . . 12
     6.2.  (Ethernet encapsulation) . . . . . . . . . . . . . . . . . 12
   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 12
     7.1.  (IP/UDP encapsulation, multicast)  . . . . . . . . . . . . 12
     7.2.  (IP/UDP encapsulation, unicast)  . . . . . . . . . . . . . 12
     7.3.  (Ethernet encapsulation) . . . . . . . . . . . . . . . . . 12
   8.  IEEE Considerations  . . . . . . . . . . . . . . . . . . . . . 12
   9.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
     10.1. Normative References . . . . . . . . . . . . . . . . . . . 13
     10.2. Informative References . . . . . . . . . . . . . . . . . . 13
   Appendix A.  IETF discussion status  . . . . . . . . . . . . . . . 14
     A.1.  Unicast vs. Multicast IP address . . . . . . . . . . . . . 14
     A.2.  Design Using Unicast IP encapsulation  . . . . . . . . . . 15
     A.3.  Discussion about the BFD packet encapsulation  . . . . . . 16
     A.4.  Details of an example User interface for BFD packets . . . 16
     A.5.  BLM sessions and the address family  . . . . . . . . . . . 18
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18


1.  Introduction

   The Bidirectional Forwarding Detection (BFD) protocol [RFC5880]
   provides a mechanism to detect faults in the bidirectional path
   between two forwarding engines, including interfaces, data link(s),
   and to the extent possible the forwarding engines themselves, with
   potentially very low latency.

   BFD can be used for detecting failures of the path between two
   network devices.  Its clients are typically layer-3 applications such
   as Open Shortest Path First (OSPF) [RFC2328] or Border Gateway
   Protocol (BGP) [RFC4271], which are not aware of any inner structure
   of the underlying interface.  While this works for interfaces like
   Ethernet and Packet Over SONET (POS), it causes problems for bundled
   interfaces like LAG.

   A LAG is used to bind together several physical ports between two
   adjacent nodes so they appear to higher-layer protocols as a single,
   higher bandwidth "virtual" pipe.  A LAG interface thereby allows
   aggregation of multiple network interfaces as one virtual interface
   for the purpose of providing fault-tolerance and higher bandwidth.

   The problem with running BFD over a LAG is that with a single BFD
   session and without internal knowledge of the LAG structure it is
   impossible for BFD to guarantee detection of anything but a full LAG
   shutdown within the BFD timeout period.  The LAG shutdown is
   typically initiated by some LAG module, which we will refer to as the
   LAG Management Module (LMM) in the rest of the document.  LAG timers
   are typically several times slower than the BFD detection timers
   (hundreds of milliseconds for the LMM vs. tens of milliseconds for
   BFD).  There is thus a need to bring some determinism to how BFD runs
   over a LAG.  There is also a need to detect member link failures much
   faster than Link Aggregation Control Protocol (LACP) allows.

   The document proposes establishing a BFD session over every member
   link the LAG is built upon.  BFD can combine the information from
   these sessions to provide fast detection for layer-3 applications.

   While there are native Ethernet mechanisms to detect failures (IEEE
   802.1AX, 802.3ah) that could be used for LAG, the solution proposed
   in this document enables operators who have already deployed BFD over
   different technologies (e.g., IP, MPLS) to use a common failure
   detection mechanism.


2.  BFD over LAG with a single session

2.1.  BFD over Big Pipe

   The simplest approach to run BFD on a LAG interface is to ignore the
   internal structure and treat the LAG as one "big pipe".  We call this
   mode of operation "BFD over Big Pipe" or "BBP" for short.  It
   corresponds to section 7.1 in RFC 5882 [RFC5882].

   We need to standardize the BBP approach.  The following requirements
   define what it means to treat a LAG interface as a single interface
   with no additional structure:

   o  BFD can send packets on any member link

   o  BFD must accept packets from every member link

   o  The Rx/Tx link can change at any time, once or repeatedly and in
      any pattern, without causing BFD to fail

   The BFD session on the LAG interface then follows RFC 5880 and
   RFC 5881 in all details.

2.2.  Examples of existing implementations

   Because there is no standard, vendors have implemented their own
   proprietary mechanisms to run BFD over LAG interfaces.  Two examples
   are shown here.  Both satisfy the requirements in Section 2.1.

   Some implementations send BFD packets only over one member link.
   Others spray BFD packets over all member links of the LAG.  There are
   issues with each of these approaches.

   In the first approach, BFD sends packets onto the LAG and the LAG
   load-balancing algorithm selects a member port, which may be the same
   port for all the packets of this BFD session.  BFD will remain up as
   long as this "primary" port is alive.  If the primary port goes down,
   the BFD session goes down with it and stays down until another port
   is selected as the primary.  A further problem with this design is
   that BFD is oblivious to the other member links in the LAG.  If a
   non-primary link goes down, the BFD session remains unaffected as it
   can still send and receive BFD packets over the primary link.  As a
   result, all traffic sent over the failed member link is dropped until
   the LMM removes the failed link from the LAG.

   In the second approach, BFD packets are sprayed over all the member
   links of a LAG.  This is done naively via round-robin, where each BFD
   packet is sent on the next member link in turn.  It solves the
   problem of BFD going down because the primary port goes down, but it
   still does not solve the problem of traffic getting lost when one of
   the member links goes down.  This is because, when a member link goes
   down, BFD remains up and traffic continues to go over the failed link
   until a higher-layer protocol detects this and removes the offending
   link from the LAG.

   Between the two approaches the second one is RECOMMENDED as it is
   much more flexible and is not prone to single link failures.  To
   completely solve all issues we RECOMMEND running BFD on all member
   links as described in Section 3.
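
   The limitation shared by both single-session approaches can be
   illustrated with a small simulation.  The sketch below is not taken
   from any implementation: the function name, the link names and the
   loss model (the session fails once Detect Mult consecutive packets
   are lost) are assumptions made purely for illustration.

```python
from itertools import cycle, islice


def session_survives(member_up, tx_links, detect_mult=3):
    """Return True if a single BFD session stays Up in this toy model.

    member_up: dict mapping link name -> True if the link still forwards.
    tx_links:  sequence of link names, one per transmitted BFD packet.
    The session is declared Down once detect_mult consecutive packets
    are lost (a simplification of the RFC 5880 detection time).
    """
    missed = 0
    for link in tx_links:
        if member_up[link]:
            missed = 0          # packet arrived, detection restarts
        else:
            missed += 1         # packet was sent on a failed link
            if missed >= detect_mult:
                return False
    return True


links = {"p1": True, "p2": False, "p3": True}   # hypothetical LAG, p2 failed

pinned_to_p1 = ["p1"] * 12                       # first approach, primary p1
pinned_to_p2 = ["p2"] * 12                       # first approach, primary p2
sprayed = list(islice(cycle(links), 12))         # second approach

print(session_survives(links, pinned_to_p1))     # True: p2 failure unnoticed
print(session_survives(links, pinned_to_p2))     # False: session goes down
print(session_survives(links, sprayed))          # True: p2 failure unnoticed
```

   In this toy model a failed non-primary link, or a single failed link
   under round-robin spraying, leaves the session Up even though
   traffic hashed onto the failed link is lost.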


3.  BFD on LAG member links

   The mechanism proposed for a fast detection of LAG member link
   failure is running BFD sessions on every LAG member link.  We name
   this mode of BFD operations "BFD on LAG members" or "BLM" for short.
   It corresponds to section 7.3 in RFC 5882 [RFC5882].

3.1.  BFD BLM session

   The overall BLM session consists of the LAG interface, i.e. the
   aggregated link, a set of BFD sessions running on the member links
   and a new BFD state for the LAG; this state is explained in more
   detail in Section 3.3.  We call the member-link sessions micro BFD
   sessions; their details are discussed in Section 3.2.

   The set of micro sessions is such that we have one micro session per
   member link.  This set can change over the lifetime of a BLM session.
   For example, BFD receives updates for the micro session set when
   links are physically added to or removed from the LAG and will
   accordingly create or delete micro BFD sessions.

   The details of how the update happens are implementation specific and
   outside the document's scope.  For example the client requesting the
   BLM session could provide these updates.

   (The following paragraph applies only when IP/UDP encapsulation is in
   use) Only one address family MUST be used per BLM session, i.e. the
   set of micro BFD sessions belonging to the BLM session MUST either
   all use IPv4 or all use IPv6.

   Multiple BLM session requests for the same LAG interface result in a
   shared BLM session.  The set of micro sessions finally used is the
   superset of the individual micro session sets.  If conflicting
   session parameters are requested then it is a local issue as to how
   to resolve the parameter conflicts, as explained in RFC 5882, Section
   2.
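
   The bookkeeping described in this section can be sketched as
   follows.  This is a minimal illustration only: the class name, its
   methods and the policy of rejecting a request with a different
   address family are assumptions of the sketch, not part of this
   document.

```python
from dataclasses import dataclass, field


@dataclass
class BlmSession:
    """Hypothetical bookkeeping for one BLM session on one LAG."""
    lag: str
    address_family: str                      # "ipv4" or "ipv6"
    micro_sessions: dict = field(default_factory=dict)  # link -> state

    def add_member(self, link):
        # One micro session per member link; an existing one is kept.
        self.micro_sessions.setdefault(link, "Down")

    def remove_member(self, link):
        # The micro session is deleted together with the member link.
        self.micro_sessions.pop(link, None)

    def merge_request(self, address_family, links):
        # A further request for the same LAG shares this BLM session.
        # Rejecting a second address family is an assumption of this
        # sketch; the text only requires one family per BLM session.
        if address_family != self.address_family:
            raise ValueError("one address family per BLM session")
        for link in links:
            self.add_member(link)


blm = BlmSession(lag="lag1", address_family="ipv4")
blm.add_member("p1")
blm.merge_request("ipv4", ["p1", "p2"])      # superset of both requests
blm.remove_member("p1")
print(sorted(blm.micro_sessions))            # ['p2']
```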

3.2.  Micro BFD sessions

   A single micro BFD session runs on every member link of the LAG.
   These micro BFD sessions follow RFC 5880 [RFC5880].

   Only asynchronous mode is considered in this document.  The echo
   function is outside the document's scope.  At least one system MUST
   take the Active role (possibly both).  The micro BFD sessions on the
   member links are independent BFD sessions.  They use their own
   unique, local discriminator values, maintain their own set of state
   variables and have their own independent state machine.  Timer values
   MAY be different, even among the micro sessions belonging to the same
   LAG, although it is expected that micro sessions belonging to the
   same LAG use the same timer values.

   The demultiplexing of a received packet is solely based on the Your
   Discriminator field, if this field is nonzero.  For the initial Down
   packet of a micro session this value may be zero.  In this case
   demultiplexing MUST be based on some combination of other fields
   which MUST include the interface information of the member link.

   When a BFD packet for a micro session is received with a valid,
   non-zero Your Discriminator, the receiver MUST check that the packet
   was received on the correct member link interface.  If the check
   fails then the packet MUST be discarded.  This test needs to be done
   before state variables for the micro sessions are updated by the
   received packet.
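
   The demultiplexing rules above can be sketched as follows.  The
   class and field names are hypothetical, and using the member link
   alone as the fallback key when Your Discriminator is zero is an
   assumption; an implementation may combine it with further fields.

```python
from dataclasses import dataclass


@dataclass
class MicroSession:
    member_link: str
    my_discriminator: int
    state: str = "Down"


class Demux:
    """Maps a received micro BFD packet to its session (illustrative)."""

    def __init__(self, sessions):
        self.by_disc = {s.my_discriminator: s for s in sessions}
        self.by_link = {s.member_link: s for s in sessions}

    def lookup(self, your_discriminator, rx_link):
        """Return the matching session, or None if the packet is dropped."""
        if your_discriminator != 0:
            session = self.by_disc.get(your_discriminator)
            # Discard unknown discriminators and packets that arrived
            # on a member link other than the one the session runs on.
            if session is None or session.member_link != rx_link:
                return None
            return session
        # Your Discriminator is zero: fall back to the member link.
        return self.by_link.get(rx_link)


demux = Demux([MicroSession("p1", 101), MicroSession("p2", 102)])
print(demux.lookup(101, "p1"))   # session on p1
print(demux.lookup(101, "p2"))   # None: right session, wrong member link
print(demux.lookup(0, "p2"))     # session on p2, found via the link
```

   The lookup runs before any state variable is touched, so a packet
   that fails the member link check cannot influence the session.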

3.2.1.  BFD packet details (IP/UDP Encapsulation, Multicast destination
        address)

   [Either this section or the alternative sections Section 3.2.3,
   Section 3.2.2 should remain in the final document.  There is no
   intention to support multiple encapsulations.]

   The BFD Control packets for each micro BFD session are IP/UDP
   encapsulated as defined in [RFC5881].  They use a well-known link-
   local multicast IP address (224.0.0.X for IPv4, FF02::X for IPv6, to
   be assigned by IANA).

   On Ethernet-based LAG member links the corresponding destination
   multicast MACs will be 01:00:5e:00:00:XX for IPv4 and
   33:33:00:00:00:XX for IPv6.  Each member link uses its own MAC
   address as the source MAC address.
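
   The destination MAC addresses follow from the standard mapping of IP
   multicast groups onto Ethernet (the low 23 bits of an IPv4 group
   after 01:00:5e, the low 32 bits of an IPv6 group after 33:33).  The
   sketch below shows that mapping; the group addresses used in the
   example are arbitrary stand-ins for the values still to be assigned
   by IANA, and the function name is hypothetical.

```python
import ipaddress


def multicast_mac(group):
    """Derive the Ethernet destination MAC for an IP multicast group."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError("not a multicast address: %s" % group)
    raw = addr.packed
    if addr.version == 4:
        # 01:00:5e followed by the low 23 bits of the IPv4 group
        mac = bytes([0x01, 0x00, 0x5E, raw[1] & 0x7F, raw[2], raw[3]])
    else:
        # 33:33 followed by the low 32 bits of the IPv6 group
        mac = bytes([0x33, 0x33]) + raw[12:]
    return ":".join("%02x" % b for b in mac)


# Stand-in values only; the real addresses are to be assigned by IANA.
print(multicast_mac("224.0.0.200"))   # 01:00:5e:00:00:c8
print(multicast_mac("ff02::c8"))      # 33:33:00:00:00:c8
```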





3.2.2.  BFD packet details (IP/UDP Encapsulation, Unicast destination
        address)

   [Either this section or the alternative sections Section 3.2.1,
   Section 3.2.3 should remain in the final document.  There is no
   intention to support multiple encapsulations.]

   The BFD Control packets for each micro BFD session are IP/UDP
   encapsulated as defined in [RFC5881], but with one major change: the
   UDP destination port will not be 3784 but "BfdBndlPort" (to be
   assigned by IANA).  Control packets use a destination IP address that
   is the peer's remote IP address.  How this destination IP address is
   learnt is beyond the scope of this document.

   On Ethernet-based LAG member links the destination MAC is the MAC
   assigned to the peer's LAG aggregator.

3.2.3.  BFD packet details (Ethernet encapsulation)

   [Either this section or the alternative sections Section 3.2.1,
   Section 3.2.2 should remain in the final document.  There is no
   intention to support multiple encapsulations.]

   The BFD packet is directly encapsulated into the Ethernet frame.  The
   frame has the following format: Ethernet header according to
   [IEEE802.3], with the Type/Length field set to "BfdEtherType",
   followed by the BFD packet.

   The Ethernet payload must be padded with zeros to reach 46 bytes if
   the BFD packet is shorter than that.

   When receiving an Ethernet frame the payload is used for further BFD
   processing.  Additional padding data MUST be ignored if it was
   required to reach the minimum payload length of 46 bytes.
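
   A minimal sketch of this encapsulation is given below.  Neither
   "BfdEtherType" nor the destination MAC has been assigned, so the
   values used here (an IEEE local experimental EtherType and an
   arbitrary multicast MAC) are placeholders, and all function names
   are hypothetical.  The BFD Control packet layout follows RFC 5880
   without an authentication section.

```python
import struct

BFD_ETHERTYPE = 0x88B5    # placeholder: IEEE local experimental EtherType
BFD_DST_MAC = bytes.fromhex("01005e901000")   # placeholder destination MAC
MIN_PAYLOAD = 46          # minimum Ethernet payload in bytes
BFD_MIN_LEN = 24          # BFD Control packet without authentication


def bfd_control(my_disc, your_disc, state=1, detect_mult=3,
                min_tx_us=100_000, min_rx_us=100_000):
    """Build a 24-byte BFD Control packet (state 1 = Down, 3 = Up)."""
    vers_diag = 1 << 5              # version 1, diagnostic 0
    sta_flags = state << 6          # state in the top two bits, no flags
    return struct.pack("!BBBBIIIII", vers_diag, sta_flags, detect_mult,
                       BFD_MIN_LEN, my_disc, your_disc,
                       min_tx_us, min_rx_us, 0)


def encapsulate(src_mac, bfd_packet):
    """Place the BFD packet directly in an Ethernet frame, zero padded."""
    payload = bfd_packet.ljust(MIN_PAYLOAD, b"\x00")
    header = BFD_DST_MAC + src_mac + struct.pack("!H", BFD_ETHERTYPE)
    return header + payload


def decapsulate(frame):
    """Return the BFD packet from a frame, or None if it is not usable."""
    if len(frame) < 14 + BFD_MIN_LEN:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != BFD_ETHERTYPE:
        return None
    payload = frame[14:]
    length = payload[3]             # BFD Length field
    if not BFD_MIN_LEN <= length <= len(payload):
        return None
    return payload[:length]         # padding beyond Length is ignored


src = bytes.fromhex("020000000001")           # hypothetical member link MAC
packet = bfd_control(my_disc=1, your_disc=0)
frame = encapsulate(src, packet)
print(len(frame))                             # 60 = 14 header + 46 payload
print(decapsulate(frame) == packet)           # True
```

   The receiver relies on the BFD Length field rather than the frame
   size, which is one way to ignore the padding.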

   IANA needs to assign an L2 MAC address according to [RFC5342] that
   would be used as the destination MAC for all control packets in the
   micro BFD sessions.

   A new Ethertype must be assigned by the IEEE Registration Authority
   to the BFD over Ethernet protocol that will be used for all micro BFD
   sessions.

3.3.  Concluded BFD state

   An additional state variable is introduced for BFD on LAG members:
   the concluded state.  The state values are Down, Up and AdminDown.
   This state is not part of the micro session state machine.  Instead
   it describes the overall state of the LAG.  It is a local state and
   does not appear (directly) in any BFD packet on any link.

   The concluded state may be set to AdminDown for administrative
   purposes, to keep the BLM and the micro sessions down indefinitely.
   When the concluded state enters AdminDown, all micro sessions
   belonging to the BLM MUST enter the AdminDown state as well.

   A function must be defined that evaluates the states of all micro
   sessions that belong to the BLM.  This function has two possible
   output values, Down and Up, and the concluded state is updated with
   the latest evaluation result unless it is already in the AdminDown
   state.  The evaluation takes place whenever a micro session is added,
   removed or changes state.

   The details of the evaluation function are outside the scope of the
   document.  The function could for example test for a minimum number
   of micro sessions in Up state.  The function could even be
   "outsourced" and e.g. the decision logic of the LMM module could be
   used.
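
   One possible evaluation function, the "minimum number of micro
   sessions in Up state" test mentioned above, can be sketched as
   follows.  The names and the default threshold are assumptions of
   this sketch; the document deliberately leaves the function open.

```python
def evaluate(micro_states, min_up=1):
    """Return "Up" if at least min_up micro sessions are Up, else "Down"."""
    up = sum(1 for state in micro_states if state == "Up")
    return "Up" if up >= min_up else "Down"


class ConcludedState:
    """Local concluded state of one BLM session (illustrative only)."""

    def __init__(self, min_up=1):
        self.min_up = min_up
        self.value = "Down"

    def update(self, micro_states):
        # Called whenever a micro session is added, removed or changes
        # state.  AdminDown is sticky and is never overwritten here.
        if self.value != "AdminDown":
            self.value = evaluate(micro_states, self.min_up)
        return self.value


concluded = ConcludedState(min_up=2)
print(concluded.update(["Up", "Down", "Down"]))   # Down: only one link Up
print(concluded.update(["Up", "Up", "Down"]))     # Up: threshold reached
concluded.value = "AdminDown"
print(concluded.update(["Up", "Up", "Up"]))       # AdminDown is kept
```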

   The concluded state is important for layer-3 clients requesting BFD
   sessions over the LAG or over VLANs on the LAG.  Details are
   discussed in Section 4.

3.4.  User interface for BFD packets

3.4.1.  User interface (IP/UDP encapsulation)

   [Either this section or the next section Section 3.4.2 should remain
   in the final document.  There is no intention to support both
   encapsulations.]

   The user interface for BFD micro sessions encapsulated in IP/UDP MUST
   allow an IP/UDP packet to be sent on a specified LAG port.  When BFD
   packets for micro sessions are received, each IP/UDP packet MUST be
   delivered together with an indication of the LAG port on which it was
   received.
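
   The shape of such an interface can be sketched as below.  The
   interface and its method names are purely hypothetical, and the
   loopback class is an in-memory stand-in used only to exercise the
   interface; it does not touch a real network stack.

```python
from typing import Callable, Optional, Protocol


class MicroBfdIo(Protocol):
    """Hypothetical interface between BFD and the LAG member ports."""

    def send(self, lag_port: str, packet: bytes) -> None:
        """Send one IP/UDP packet on exactly this LAG member port."""

    def set_receiver(self, callback: Callable[[str, bytes], None]) -> None:
        """Register callback(lag_port, packet) for received packets."""


class LoopbackIo:
    """In-memory stand-in that reflects each packet on the same port."""

    def __init__(self) -> None:
        self._callback: Optional[Callable[[str, bytes], None]] = None

    def set_receiver(self, callback: Callable[[str, bytes], None]) -> None:
        self._callback = callback

    def send(self, lag_port: str, packet: bytes) -> None:
        # The receive side always learns which LAG port was used.
        if self._callback is not None:
            self._callback(lag_port, packet)


received = []
port_io: MicroBfdIo = LoopbackIo()
port_io.set_receiver(lambda port, pkt: received.append((port, pkt)))
port_io.send("p2", b"bfd-control-packet")
print(received)        # [('p2', b'bfd-control-packet')]
```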

3.4.2.  User interface (Ethernet encapsulation)

   [Either this section or the previous section Section 3.4.1 should
   remain in the final document.  There is no intention to support both
   encapsulations.]

   The user interface for BFD directly transported in an Ethernet frame
   MUST allow a complete Ethernet frame with the specific "BfdEtherType"
   type value to be sent and received.  The information that specifies
   the LAG port on which a frame is sent or received is either an
   explicit extension of this API or is implicitly given by binding the
   API to a specific port.

   As an example, the API could be identical to the MA_DATA request and
   indication as defined in section 2.3 of [IEEE802.3].


4.  BFD on LAG members and layer-3 applications

   Layer-3 protocols such as OSPF may use BFD on LAG members in one of
   the following ways:

   a.  The session request from the client creates a virtual session.
       This virtual session is not sending actual BFD packets.  Instead
       the state, which is reported to the layer-3 client, is based on
       the concluded state.

       Implementations compliant with this standard MUST support this
       mode.  This is the default mode in which BFD over LAG works.

   b.  The session request from the client creates a BBP session, as
       described in Section 2.1.  BFD SHOULD update the state of the BBP
       session with the concluded state of the corresponding LAG in the
       following way:

       1.  when the concluded state is Down, the BBP session state
           transitions to Down as well

       2.  for a concluded state of Up or AdminDown the BBP session
           state is unaffected

       This state update allows the BBP session to run with more relaxed
       timer values, as the more intense liveness detection is done by
       the micro BFD sessions.

       Compliant implementations MUST support this mode.

   An implementation MUST provide a configuration knob which lets the
   user select the mode.
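
   The two modes can be sketched as follows.  The function names and
   the mode labels "virtual" and "bbp" are hypothetical, and reporting
   the concluded state unchanged in mode (a) is an assumption; the text
   only says that the reported state is based on it.

```python
def bbp_state_after_update(bbp_state, concluded_state):
    """Mode (b): a concluded state of Down forces the BBP session Down."""
    if concluded_state == "Down":
        return "Down"
    return bbp_state        # Up and AdminDown leave the session untouched


def reported_state(mode, concluded_state, bbp_state=None):
    """State reported to the layer-3 client for the configured mode."""
    if mode == "virtual":   # mode (a): no BFD packets for this session
        return concluded_state
    if mode == "bbp":       # mode (b): real BBP session plus state update
        return bbp_state_after_update(bbp_state, concluded_state)
    raise ValueError("unknown mode: %s" % mode)


print(reported_state("virtual", "Up"))              # Up
print(reported_state("bbp", "Down", "Up"))          # Down
print(reported_state("bbp", "AdminDown", "Up"))     # Up
```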


5.  Application example: LMM using BLM

   There are certainly many ways to use BLM.  Here is one example
   envisioned by the authors.

   The LAG Management Module (LMM) could be envisaged as a client of
   BFD, i.e. the LMM requests a BLM session and takes responsibility for
   updating the set of micro sessions.

   The LMM then uses BFD, instead of or in parallel with LACP, to
   monitor the health of the individual member links of the LAG.
   Details are outlined below.

   Bringing a member link up:

   When the status of a port is about to change to Distribution TRUE
   (see section 5.3.15 in [IEEE802.1AX]), i.e. before the port is added
   to the distribution function of the LAG, then the particular BFD
   micro session is requested.  An implementation MAY wait for the micro
   BFD session to reach Up state before adding the port to the LAG's
   distribution function and changing the port status to Distribution
   TRUE.

   In case LACP is in use then the steps of the previous paragraph are
   executed in the "Distributing" state (see the Mux machine state
   diagram Figure 5-14 in [IEEE802.1AX]).  That is, LACP is in the
   Distributing state before the implementation potentially waits for
   the BFD micro session to reach Up state.

   Detecting a member link failure:

   When running in parallel operation the logic for failure is that both
   LACP and BFD can indicate a failure.

   When a micro BFD session that runs on a member link of a LAG goes
   down, this member link MUST be taken out of the distribution function
   of the particular LAG and the port status MUST change to Distribution
   FALSE.  The BFD micro session for the link MUST be deleted when the
   link has been taken out of distribution.

   In case LACP is in use then the variable "Selected" MUST be set to
   UNSELECTED when BFD reports a Down state.  The steps of the previous
   paragraph are executed in the "Collecting" state (see the Mux machine
   state diagram Figure 5-14 in [IEEE802.1AX]).

   If waiting for a BFD status of Up before adding a member link is
   supported, the behaviour of the LMM MUST be configurable to allow an
   alternative mode that adds the member link irrespective of the BFD
   state, for interoperability purposes.  Bringing the member link up
   without waiting for BFD is then the default behaviour.
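
   The LMM behaviour of this example (without LACP) can be sketched as
   follows.  The class, its method names and the event model are
   hypothetical; only the decisions they encode are taken from the text
   above.

```python
class Lmm:
    """Illustrative LAG Management Module acting as a BFD client."""

    def __init__(self, wait_for_bfd=False):
        self.wait_for_bfd = wait_for_bfd    # default: do not wait for BFD
        self.distributing = set()           # ports with Distribution TRUE
        self.pending = set()                # ports waiting for BFD Up
        self.micro_sessions = {}            # port -> last reported state

    def port_ready(self, port):
        """The port is about to change to Distribution TRUE."""
        self.micro_sessions[port] = "Down"  # request the micro session
        if self.wait_for_bfd:
            self.pending.add(port)
        else:
            self.distributing.add(port)

    def bfd_state_change(self, port, state):
        """BFD reports a state change of the micro session on port."""
        if port not in self.micro_sessions:
            return
        self.micro_sessions[port] = state
        if state == "Up" and port in self.pending:
            self.pending.discard(port)
            self.distributing.add(port)
        elif state == "Down" and port in self.distributing:
            # Take the link out of distribution, then delete the session.
            self.distributing.discard(port)
            del self.micro_sessions[port]


lmm = Lmm(wait_for_bfd=True)
lmm.port_ready("p1")
print(sorted(lmm.distributing))     # []: still waiting for BFD
lmm.bfd_state_change("p1", "Up")
print(sorted(lmm.distributing))     # ['p1']
lmm.bfd_state_change("p1", "Down")
print(sorted(lmm.distributing))     # []: link removed on BFD failure
```

   With wait_for_bfd left at its default of False the port is added to
   the distribution function immediately, which corresponds to the
   default behaviour required for interoperability.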







6.  Security Considerations

6.1.  (IP/UDP encapsulation)

   This document does not introduce any additional security issues and
   the security mechanisms defined in [RFC5880] apply in this document.

   Routers compliant with this standard will now need to process packets
   addressed to a new multicast address.  This, however, should not open
   any new attack vector as it is a link-local multicast address and the
   attacker would have to be on the same link as the router to send such
   packets.

6.2.  (Ethernet encapsulation)

   This document does not introduce any additional security issues and
   the security mechanisms defined in [RFC5880] apply in this document.

   If no mechanism exists to transport Ethernet frames from a node other
   than a directly connected node then the security is identical to the
   TTL=255 check for IP packets.


7.  IANA Considerations

7.1.  (IP/UDP encapsulation, multicast)

   The IANA is requested to assign a well-known link-local multicast IP
   address: 224.0.0.X for IPv4 and FF02::X for IPv6.

7.2.  (IP/UDP encapsulation, unicast)

   The IANA is requested to assign a well-known port number for the UDP
   encapsulated micro BFD sessions.

7.3.  (Ethernet encapsulation)

   IANA needs to assign an L2 MAC address according to RFC 5342
   [RFC5342] that would be used as the destination MAC for all control
   packets in the micro BFD sessions.


8.  IEEE Considerations

   (The following applies only in case of Ethernet encapsulation) A new
   Ethertype must be assigned by the IEEE Registration Authority to the
   BFD over Ethernet protocol that will be used for all micro BFD
   sessions.


9.  Acknowledgements

   Most of the text for this document came originally from
   draft-chen-bfd-interface-00.

   We would like to thank Dave Katz, Alexander Vainshtein, Greg Mirsky
   and Jeff Tantsura for their comments on this draft.

   We would also like to thank the members of the BFD WG who expressed
   strong support for running BFD on all the member links of a LAG.


10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              June 2010.

   [RFC5882]  Katz, D. and D. Ward, "Generic Application of
              Bidirectional Forwarding Detection (BFD)", RFC 5882,
              June 2010.

10.2.  Informative References

   [IEEE802.1AX]
              IEEE Std. 802.1AX, "IEEE Standard for Local and
              metropolitan area networks - Link Aggregation",
              November 2008.

   [IEEE802.3]
              IEEE Std. 802.3, "IEEE Standard for Information technology
              - Telecommunications and information exchange between
              systems - Local and metropolitan area networks - Specific
              requirements Part 3: Carrier sense multiple access with
              collision detection (CSMA/CD) access method and physical
              layer specifications", December 2008.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC5342]  Eastlake, D., "IANA Considerations and IETF Protocol Usage
              for IEEE 802 Parameters", BCP 141, RFC 5342,
              September 2008.


Appendix A.  IETF discussion status

   [This section will eventually be removed.  It documents some of the
   discussions and decisions made recently on the BFD mailing list.]

A.1.  Unicast vs. Multicast IP address

   The destination IP address for the BFD control packets for the micro
   BFD sessions can be Unicast or Multicast.  Each has its set of
   advantages and disadvantages.

   Advantages with using a Unicast IP destination address:

   o  Minimal code changes to support micro BFD sessions per member
      link.  A new UDP port number can be used to differentiate BFD
      control packets associated with the micro BFD sessions and the
      regular BFD sessions.

   Disadvantages with using a Unicast IP destination address:

   o  It is configuration intensive.  Each LAG needs to be configured
      with the remote end's IP address for BFD to bootstrap.
      Similarly, a change in the IP address of the interface would need
      all LAGs to be reconfigured.  While one could minimize the amount
      of human intervention required, it cannot be completely
      eliminated.

   o  ARP needs to be resolved for sending Unicast packets.  This means
      that ARP must be resolved even before the first control packet is
      sent to bring up the micro BFD session.  There are multiple ways
      to achieve this.  The most logical approach is to mandate LACP on
      this LAG.  This way, LACP will bring up the links so that ARP
      resolution can begin.  However, this requires running LACP along
      with BFD on all member links.  The other option is to allow ARP
      processing even when the port state is down.  This means that
      implementations would have to allow all packets with broadcast MAC
      and port MAC to be sent to the CPU for processing.  This violates
      the basic tenets of IP layering and opens a hole for a DoS attack.
      This also requires a huge change to the IP stack to allow packet
      Rx and Tx on ports that are down.

   o  Not possible to support unnumbered IP interfaces.

   Advantages with using a Multicast IP destination address:

   o  No additional configuration is required, and the micro BFD
      sessions are set up automatically.  It remains independent of the
      LAG IP addressing scheme.  The member links get added to the LAG
      as soon as the micro BFD sessions come up.

   o  This involves minimal modifications to the data plane and the L3
      stack.  Currently, ports that are down do process packets coming
      with certain well-known L2 MAC addresses.  This solution requires
      such ports to process packets addressed to another well-known L2
      MAC address (derived from the multicast IP address assigned by
      IANA).

   o  Can support unnumbered IP interfaces.
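
   As an illustration of how a well-known L2 MAC address is derived
   from a multicast IP address, the following Python sketch applies the
   standard mappings (RFC 1112 for IPv4, RFC 2464 for IPv6).  The
   function name is hypothetical, and the addresses used are arbitrary
   placeholders, not the values to be assigned by IANA.

```python
import ipaddress


def multicast_ip_to_mac(addr: str) -> str:
    """Map a multicast IP address to its Ethernet multicast MAC address."""
    ip = ipaddress.ip_address(addr)
    if not ip.is_multicast:
        raise ValueError(f"{addr} is not a multicast address")
    raw = ip.packed
    if ip.version == 4:
        # RFC 1112: 01-00-5E followed by the low-order 23 bits of the address.
        mac = bytes([0x01, 0x00, 0x5E, raw[1] & 0x7F, raw[2], raw[3]])
    else:
        # RFC 2464: 33-33 followed by the low-order 32 bits of the address.
        mac = bytes([0x33, 0x33]) + raw[-4:]
    return ":".join(f"{b:02x}" for b in mac)


# Placeholder addresses only; the real ones would be assigned by IANA.
print(multicast_ip_to_mac("224.0.0.120"))
print(multicast_ip_to_mac("ff02::120"))
```

   A port that is down would then only need to accept frames sent to
   this one additional destination MAC address.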

   Disadvantages with using a Multicast IP destination address:

   o  It is unfamiliar.  Some parts of the data plane and the source
      code would need to be modified to accept multicast BFD control
      packets.

   o  A new link-local multicast address needs to be allocated by IANA.

   Based on the above analysis, we decided to use the multicast IP
   addressing scheme for the micro BFD sessions.

A.2.  Design Using Unicast IP encapsulation

   While we personally think that the Multicast solution for micro BFD
   sessions is better than the Unicast one, we briefly describe how
   Unicast could be made to work.

   Once LACP has brought up the links, routers will begin establishing
   a Unicast BFD session over each component link of the LAG.  The
   remote destination addresses could either be configured on the
   routers or be discovered via some discovery protocol (that can be
   standardized later).  The exact mechanism for obtaining the
   destination IP address is beyond the scope of this document.

   Some service providers have expressed interest in running BBP on top
   of the micro BFD sessions.  In this case, it is imperative that
   Unicast BFD packets corresponding to the micro sessions use a
   different UDP port (assigned by IANA) lest they get mixed up with the
   BFD packets meant for the BBP sessions.
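
   A minimal Python sketch of such port-based separation is shown
   below.  Port 3784 is the UDP destination port of single-hop BFD
   control packets (RFC 5881).  MICRO_BFD_PORT is a made-up
   placeholder, since the real value would be assigned by IANA, and the
   function name and return labels are hypothetical.

```python
BFD_SINGLE_HOP_PORT = 3784   # RFC 5881 BFD control packets
MICRO_BFD_PORT = 65000       # placeholder only; real value would come from IANA


def classify_bfd_packet(udp_dst_port: int) -> str:
    """Return which kind of BFD session a received UDP packet belongs to."""
    if udp_dst_port == MICRO_BFD_PORT:
        return "micro"       # handled by the per-member-link micro session
    if udp_dst_port == BFD_SINGLE_HOP_PORT:
        return "regular"     # handled by the other BFD sessions
    return "not-bfd"
```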

   This design requires LACP to be present so that it brings up the
   links and ARP processing can begin.  Operators, however, have also
   expressed interest in a solution that works in the absence of LACP.
   This could be done by using a well-known L2 MAC address to carry the
   micro session BFD packets.  This way routers do not have to depend
   upon ARP to bootstrap the micro BFD sessions.

A.3.  Discussion about the BFD packet encapsulation

   With at least three implementations using IP/UDP for the BFD packet
   encapsulation on the LAG member links, there cannot be any doubt that
   IP/UDP encapsulation technically works for this purpose.  What such a
   view misses, though, is the need for some kind of standardized packet
   send and receive API that allows everyone to implement the new
   standard.

   The user interface for IP/UDP packets would be either the UDP user
   interface, defined in RFC 0768 [RFC0768], or the IP user interface,
   defined in RFC 0791 [RFC0791].  Neither of them provides any control
   over the LAG member port on which a packet is transmitted, nor does
   either provide information about the port on which a packet was
   received.  Thus an agreement is required to extend these APIs to
   control the sending port and to learn the receiving port.

   If we don't use the layer-3 user interface then we need to look at
   the 802.3 and 802.1AX standards, as they describe what is "below"
   the IP layer in the case of a LAG.  In this case we are already on
   the Ethernet layer, and adding IP and UDP headers to the BFD packet
   may either conflict with IP/UDP itself or serve no function.  Thus
   encapsulating BFD directly in Ethernet and using a user interface
   that fits into 802.3 and 802.1AX seems a viable approach.

A.4.  Details of an example User interface for BFD packets

   An additional sublayer is inserted between the MAC or MAC control of
   the physical port(s) and the Link aggregation sublayer.  This allows
   BFD packets to be received and injected on every LAG port.

   This additional sublayer allows Ethernet frames with a specific
   Ethernet type (named "BfdEtherType" from here on) to be dropped off
   the stream of frames coming from the MAC layer.  It hands the
   dropped-off frames over to the BFD module.  The new sublayer also
   allows Ethernet frames with the specific Ethernet type to be injected
   into the stream of frames towards the MAC layer.  All other frames
   pass transparently between the MAC and the link aggregation layer.

                        +------------+
                        | MAC Client |
                        +------------+
                          ^        |
                          |        |
                        ...................
                          |        |
                          |        V
             +---------------------------------+
             |        Link aggregation         |
             |            sublayer             |
             +---------------------------------+
                ^                   ^
                |                   |
             ........................................
                |                   |
                |                   |                  +-----+
                |       +---------- | ---------------->| BFD |
                |       |           |       +--------->|     |
                V (A)   V (B)       V       V          +-----+
             +-------------+     +-------------+
             |  inject/    |     |  inject/    |
             |  drop-off   | ... |  drop-off   |
             +-------------+     +-------------+
                    ^ (C)               ^
                    |                   |
             ........................................
                    |                   |
                    V                   V
             +-------------+     +-------------+
             | MAC control | ... | MAC control |
             | (optional)  |     | (optional)  |
             +-------------+     +-------------+
             |    MAC      |     |    MAC      |
             +-------------+     +-------------+
             |  Physical   |     |  Physical   |
             |   layer     |     |   layer     |
             +-------------+     +-------------+

        Inject/drop-off mechanism for specific BFD Ethernet frames

                                 Figure 1

   The API in (A) behaves like the MAC side of the API defined in
   section 2.3 of [IEEE802.3].  All MA_DATA and MA_CONTROL requests are
   passed transparently to the API in (C), which behaves like the MAC
   Client side.  Conversely, all MA_CONTROL indications received at (C)
   are passed transparently to (A).  MA_DATA indications received at
   (C) are passed to (A) when the Ethernet Type is not BfdEtherType.
   Otherwise the MA_DATA indication is passed to API (B), which behaves
   like the MAC side of the API in section 2.3 of [IEEE802.3] but
   without any MAC Control support.

   A MA_DATA request received at (B) is passed to (C) if the Ethernet
   Type field in the frame is set to BfdEtherType; otherwise the frame
   is dropped.
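
   The forwarding rules between (A), (B) and (C) can be sketched in
   Python as follows.  This is only an illustrative model under stated
   assumptions: the class and method names are hypothetical, the
   primitives are modelled as plain strings, and BFD_ETHER_TYPE uses an
   IEEE local experimental EtherType as a placeholder because the real
   BfdEtherType has not been assigned.

```python
from dataclasses import dataclass, field

BFD_ETHER_TYPE = 0x88B5  # placeholder (IEEE local experimental EtherType)


@dataclass
class Frame:
    ether_type: int
    payload: bytes = b""


@dataclass
class InjectDropOff:
    """Per-port sublayer between the MAC (C) and link aggregation (A)."""
    to_a: list = field(default_factory=list)  # indications passed up to (A)
    to_b: list = field(default_factory=list)  # indications passed to BFD (B)
    to_c: list = field(default_factory=list)  # requests passed down to (C)

    def request_from_a(self, primitive: str, frame: Frame) -> None:
        # All MA_DATA and MA_CONTROL requests from (A) pass transparently.
        self.to_c.append((primitive, frame))

    def request_from_b(self, frame: Frame) -> bool:
        # MA_DATA requests from BFD are injected only with BfdEtherType.
        if frame.ether_type != BFD_ETHER_TYPE:
            return False  # frame is dropped
        self.to_c.append(("MA_DATA", frame))
        return True

    def indication_from_c(self, primitive: str, frame: Frame) -> None:
        # MA_CONTROL always goes to (A); MA_DATA is split on the EtherType.
        if primitive == "MA_DATA" and frame.ether_type == BFD_ETHER_TYPE:
            self.to_b.append(frame)
        else:
            self.to_a.append((primitive, frame))
```

   One instance of such a sublayer would exist per LAG member port, so
   the BFD module always knows on which port a packet was received and
   can choose the port on which a packet is sent.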

A.5.  BLM sessions and the address family

   When the BFD encapsulation is Ethernet, the following discussion
   does not apply.  In the case of IP/UDP encapsulation, it should be
   highlighted that the way a BLM session is defined above means that a
   BLM request for a LAG with IPv4 and a BLM request for the same LAG
   with IPv6 are considered a shared session, with the obvious conflict
   that the micro sessions must all be either IPv4 or IPv6, but not
   both.  One could instead consider allowing BLM-v4 and BLM-v6 for the
   same LAG, which would mean that the concluded states have to be kept
   separate.  This would require more details in Section 4.


Authors' Addresses

   Manav Bhatia
   Alcatel-Lucent
   Bangalore,   560045
   India

   Email: manav.bhatia@alcatel-lucent.com


   Mach(Guoyi) Chen
   Huawei Technologies Co., Ltd
   Q14 Huawei Campus, No. 156 Beiqing Road, Hai-dian District
   Beijing  100095
   China

   Email: mach@huawei.com


   Zuliang Wang
   Huawei Technologies Co., Ltd
   Q15 Huawei Campus, No. 156 Beiqing Road, Hai-dian District
   Beijing  100095
   China

   Email: liang_tsing@huawei.com


   Liang Guo
   China Telecom
   Guangzhou
   China

   Email: guoliang@gsta.com


   Marc Binderberger
   Lausanne,
   Switzerland

   Email: marc@sniff.de