Network Working Group M. Bhatia
Internet-Draft Alcatel-Lucent
Intended status: Standards Track M. Chen
Expires: July 9, 2012 Z. Wang
Huawei Technologies Co., Ltd
L. Guo
China Telecom
M. Binderberger
January 6, 2012
Bidirectional Forwarding Detection (BFD) on Link Aggregation Group (LAG)
Interfaces
draft-mmm-bfd-on-lags-02
Abstract
This document proposes a mechanism to run BFD on Link Aggregation
Group (LAG) interfaces. It does so by running an independent BFD
session on every LAG member link.
(For IP/UDP encapsulation)
A dedicated well-known multicast IP address for both IPv4 and IPv6 is
introduced as the destination IP address of the BFD packets when
running BFD on the member links of the LAG.
(For Ethernet encapsulation)
A new Ethernet type is introduced to send BFD packets directly in
Ethernet frames when running BFD on the member links of the LAG.
There is currently also no standard that describes how BFD runs on a
LAG interface as a whole. This draft also proposes a definition for
that mode of operation, taking existing implementations into account.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Bhatia, et al. Expires July 9, 2012 [Page 1]
Internet-Draft BFD for LAG Interfaces January 2012
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 9, 2012.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. BFD over LAG with a single session
2.1. BFD over Big Pipe
2.2. Examples of existing implementations
3. BFD on LAG member links
3.1. BFD BLM session
3.2. Micro BFD sessions
3.2.1. BFD packet details (IP/UDP Encapsulation,
Multicast destination address)
3.2.2. BFD packet details (IP/UDP Encapsulation, Unicast
destination address)
3.2.3. BFD packet details (Ethernet encapsulation)
3.3. Concluded BFD state
3.4. User interface for BFD packets
3.4.1. User interface (IP/UDP encapsulation)
3.4.2. User interface (Ethernet encapsulation)
4. BFD on LAG members and layer-3 applications
5. Application example: LMM using BLM
6. Security Consideration
6.1. (IP/UDP encapsulation)
6.2. (Ethernet encapsulation)
7. IANA Considerations
7.1. (IP/UDP encapsulation, multicast)
7.2. (IP/UDP encapsulation, unicast)
7.3. (Ethernet encapsulation)
8. IEEE Considerations
9. Acknowledgements
10. References
10.1. Normative References
10.2. Informative References
Appendix A. IETF discussion status
A.1. Unicast vs. Multicast IP address
A.2. Design Using Unicast IP encapsulation
A.3. Discussion about the BFD packet encapsulation
A.4. Details of an example User interface for BFD packets
A.5. BLM sessions and the address family
Authors' Addresses
1. Introduction
The Bidirectional Forwarding Detection (BFD) protocol [RFC5880]
provides a mechanism to detect faults in the bidirectional path
between two forwarding engines, including interfaces, data link(s),
and to the extent possible the forwarding engines themselves, with
potentially very low latency.
BFD can be used for detecting failures of the path between two
network devices. Typically the application clients are not aware of
any inner structure of the underlying interface, being layer 3
applications themselves like Open Shortest Path First (OSPF)
[RFC2328] or Border Gateway Protocol (BGP)[RFC4271]. While this
works for interfaces like Ethernet and Packet Over SONET (POS), it
causes problems for bundled interfaces like LAG.
A LAG is used to bind together several physical ports between two
adjacent nodes so they appear to higher-layer protocols as a single,
higher bandwidth "virtual" pipe. A LAG interface thereby allows
aggregation of multiple network interfaces as one virtual interface
for the purpose of providing fault-tolerance and higher bandwidth.
The problem with running BFD over a LAG is that with a single BFD
session and without internal knowledge of the LAG structure it is
impossible for BFD to guarantee a detection of anything but a full
LAG shutdown within the BFD timeout period. The LAG shutdown is
typically initiated by some LAG module, which we will refer to as the
LAG Management Module (LMM) in the rest of the document. LAG timers
are typically several times slower than the BFD detection timers
(hundreds of milliseconds for the LMM versus tens of milliseconds for
BFD). There is thus a need to bring some determinism to how BFD runs
over a LAG.
There is also a need to detect member link failures much faster than
what Link Aggregation Control Protocol (LACP) allows.
This document proposes establishing a BFD session over every member
link the LAG is built upon. BFD can combine the state of these
sessions to provide fast detection for layer-3 applications.
While there are native Ethernet mechanisms to detect failures (IEEE
802.1AX, IEEE 802.3ah) that could be used for LAG, the solution
proposed in this document enables operators who have already
deployed BFD over
different technologies (e.g. IP, MPLS) to use a common failure
detection mechanism.
2. BFD over LAG with a single session
2.1. BFD over Big Pipe
The simplest approach to run BFD on a LAG interface is to ignore the
internal structure and treat the LAG as one "big pipe". We call this
mode of operation "BFD over Big Pipe" or "BBP" for short. It
corresponds to section 7.1 in RFC 5882 [RFC5882].
We need to standardize the BBP approach. The following requirements
define what it means to treat a LAG interface as a single interface
with no additional structure:
o BFD can send packets on any member link
o BFD must accept packets from every member link
o The Rx/Tx link can change at any time, regularly or not, and in
any pattern, without causing BFD to fail
The BFD session on the LAG interface then follows RFC 5880 and
RFC 5881 in all details.
2.2. Examples of existing implementations
Because there is no standard, vendors have implemented their own
proprietary mechanisms to run BFD over LAG interfaces. Two examples
are shown here. Both satisfy the requirements in Section 2.1.
Some implementations send BFD packets only over one member link.
Others spray BFD packets over all member links of the LAG. There are
issues with each of these approaches.
In the first approach, BFD sends packets onto the LAG and the LAG
load balance algorithm will select a member port, which may be the
same port for all the packets of this BFD session. BFD will remain
up as long as this "primary" port is alive. It will go down once the
primary port goes down and stay down until another port is selected
as the primary.
Problems arise with this design as BFD is oblivious to the presence
of other member links in the LAG. If a non-primary link goes down,
the BFD session remains unaffected as it can still send and receive
BFD packets over the primary link. This results in all traffic sent
over the failed member link getting dropped, till the LMM removes the
failed link from the LAG. Conversely if the primary link goes down,
then the BFD session will go down, till a new member link is elected
as the primary link.
In the second approach, BFD packets are sprayed over all the member
links of a LAG. This is done naively via round-robin, where each BFD
packet is sent using the subsequent member link, in a round-robin
fashion. It solves the problem of BFD going down because of the
primary port going down, but it still does not solve the problem of
traffic getting lost when one of the member links goes down. This is
because, when a member link goes down, BFD remains up and traffic
continues to go over the link that has failed till a higher layer
protocol detects this and removes the offending link from the LAG.
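The weakness of the round-robin approach can be illustrated with a
small simulation. This is only a sketch: it assumes that a session
times out after `detect_mult` consecutive lost packets (a
simplification of the detection time in RFC 5880), and the function
names are hypothetical.

```python
def session_survives(num_links: int, failed_link: int,
                     detect_mult: int = 3, packets: int = 1000) -> bool:
    """Return True if a round-robin sprayed BFD session never times out.

    Assumption: the session goes down once `detect_mult` consecutive
    packets are lost.
    """
    consecutive_lost = 0
    for seq in range(packets):
        link = seq % num_links            # round-robin member selection
        if link == failed_link:
            consecutive_lost += 1
            if consecutive_lost >= detect_mult:
                return False
        else:
            consecutive_lost = 0          # a packet got through
    return True


def traffic_share_lost(num_links: int) -> float:
    """Fraction of evenly distributed data traffic lost on one dead link."""
    return 1 / num_links
```

With four member links and one of them failed, the session stays up
because no three consecutive packets are lost, while a quarter of
evenly distributed data traffic is dropped.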
Between the two approaches, the second one is RECOMMENDED as it is
much more flexible and is not prone to single link failures. To
completely solve all issues we RECOMMEND running BFD on all member
links as described in Section 3.
3. BFD on LAG member links
The mechanism proposed for a fast detection of LAG member link
failure is running BFD sessions on every LAG member link. We name
this mode of BFD operations "BFD on LAG members" or "BLM" for short.
It corresponds to section 7.3 in RFC 5882 [RFC5882].
3.1. BFD BLM session
The overall BLM session consists of the LAG interface, i.e. the
aggregated link, a set of BFD sessions running on the member links
and a new BFD state for the LAG; this state is explained in more
detail in Section 3.3. We call the member-link sessions micro BFD
sessions; their details are discussed in Section 3.2.
The set of micro sessions is such that we have one micro session per
member link. This set can change over the lifetime of a BLM session.
For example, BFD receives updates for the micro session set when
links are physically added to or removed from the LAG and will
accordingly create or delete micro BFD sessions.
The details of how the update happens are implementation specific and
outside the document's scope. For example the client requesting the
BLM session could provide these updates.
(The following paragraph applies only when IP/UDP encapsulation is in
use) Only one address family MUST be used per BLM session, i.e. the
set of micro BFD sessions belonging to the BLM session MUST either
all use IPv4 or all use IPv6.
Multiple BLM session requests for the same LAG interface result in a
shared BLM session. The set of micro sessions finally used is the
superset of the individual micro session sets. If conflicting
session parameters are requested then it is a local issue as to how
to resolve the parameter conflicts, as explained in RFC 5882, Section
2.
3.2. Micro BFD sessions
A single micro BFD session runs on every member link of the LAG.
These micro BFD sessions follow RFC 5880 [RFC5880].
Only asynchronous mode is considered in this document. The echo
function is outside the document's scope. At least one system MUST
take the Active role (possibly both). The micro BFD sessions on the
member links are independent BFD sessions. They use their own
unique, local discriminator values, maintain their own set of state
variables and have their own independent state machine. Timer values
MAY be different, even among the micro sessions belonging to the same
LAG, although it is expected that micro sessions belonging to the
same LAG use the same timer values.
The demultiplexing of a received packet is solely based on the Your
Discriminator field, if this field is nonzero. For the initial Down
packet of a micro session this value may be zero. In this case
demultiplexing MUST be based on some combination of other fields
which MUST include the interface information of the member link.
When receiving a BFD packet for a micro session with a valid,
non-zero Your Discriminator, a check MUST be made that the packet was
received on the correct member link interface. If the check fails
then the packet MUST be discarded. This test needs to be done before
state variables for the micro sessions are updated by the received
packet.
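The demultiplexing rules above can be sketched as follows. This is a
minimal illustration, not a definitive implementation: the class and
field names are hypothetical, and for a zero Your Discriminator the
sketch uses the receiving member link alone, which is only one
possible "combination of other fields".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MicroSession:
    local_discr: int
    member_link: str      # hypothetical member-link identifier, e.g. "eth1"


class MicroSessionTable:
    def __init__(self) -> None:
        self._by_discr: Dict[int, MicroSession] = {}
        self._by_link: Dict[str, MicroSession] = {}

    def add(self, session: MicroSession) -> None:
        self._by_discr[session.local_discr] = session
        self._by_link[session.member_link] = session

    def demux(self, your_discr: int, rx_link: str) -> Optional[MicroSession]:
        """Return the session for a received packet, or None to discard it."""
        if your_discr != 0:
            session = self._by_discr.get(your_discr)
            # The member-link check runs before any session state is updated.
            if session is None or session.member_link != rx_link:
                return None
            return session
        # Your Discriminator is zero: select by receiving member link.
        return self._by_link.get(rx_link)
```

A packet carrying a valid discriminator but arriving on the wrong
member link yields None, so the caller discards it without touching
the state variables of any micro session.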
3.2.1. BFD packet details (IP/UDP Encapsulation, Multicast destination
address)
[Either this section or one of the alternative sections, Section
3.2.2 or Section 3.2.3, should remain in the final document. There
is no intention to support multiple encapsulations.]
The BFD Control packets for each micro BFD session are IP/UDP
encapsulated as defined in [RFC5881]. They use a well-known link-
local multicast IP address (224.0.0.X for IPv4, FF02::X for IPv6, to
be assigned by IANA).
On Ethernet-based LAG member links the corresponding destination
multicast MACs will be 01:00:5e:00:00:XX for IPv4 and
33:33:00:00:00:XX for IPv6. Each member link uses its own MAC
address as the source MAC address.
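The mapping from a multicast IP address to the destination MAC can
be sketched as below, using the standard IPv4 mapping (01:00:5e plus
the low-order 23 bits) and IPv6 mapping (33:33 plus the low-order 32
bits). Since the group addresses are not yet assigned, the usage note
uses 224.0.0.5 and ff02::5 purely as stand-ins; the function name is
hypothetical.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Derive the Ethernet destination MAC for a multicast IP address."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if addr.version == 4:
        # 01:00:5e followed by the low-order 23 bits of the group address
        low = int(addr) & 0x7FFFFF
        octets = [0x01, 0x00, 0x5E,
                  (low >> 16) & 0xFF, (low >> 8) & 0xFF, low & 0xFF]
    else:
        # 33:33 followed by the low-order 32 bits of the group address
        low = int(addr) & 0xFFFFFFFF
        octets = [0x33, 0x33] + list(low.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)
```

For the stand-in addresses, multicast_mac("224.0.0.5") gives
01:00:5e:00:00:05 and multicast_mac("ff02::5") gives
33:33:00:00:00:05.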
3.2.2. BFD packet details (IP/UDP Encapsulation, Unicast destination
address)
[Either this section or one of the alternative sections, Section
3.2.1 or Section 3.2.3, should remain in the final document. There
is no intention to support multiple encapsulations.]
The BFD Control packets for each micro BFD session are IP/UDP
encapsulated as defined in [RFC5881], but with one major change: the
UDP destination port will not be 3784 but "BfdBndlPort" (to be
assigned by IANA). Control packets use a destination IP address that
is the peer's remote IP address. The details of how this destination
IP address is learnt are beyond the scope of this document.
On Ethernet-based LAG member links the destination MAC is the MAC
assigned to the peer's LAG aggregator.
3.2.3. BFD packet details (Ethernet encapsulation)
[Either this section or one of the alternative sections, Section
3.2.1 or Section 3.2.2, should remain in the final document. There
is no intention to support multiple encapsulations.]
The BFD packet is directly encapsulated into the Ethernet frame. The
frame has the following format: Ethernet header according to
[IEEE802.3], then Type/Length field set to "BfdEtherType", followed
by the BFD packet.
The Ethernet payload must be padded with zeros to reach 46 bytes if
the BFD packet size is not already larger.
When receiving an Ethernet frame the payload is used for further BFD
processing. Additional padding data MUST be ignored if it was
required to reach the minimum payload length of 46 bytes.
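The padding rules can be sketched as follows. Several things here
are assumptions made for illustration only: the EtherType value is a
placeholder (0x88B5, an IEEE local experimental EtherType) because
"BfdEtherType" is not yet assigned, the frame is built without
preamble and FCS, and the receiver strips padding by trusting the
Length field of the BFD Control packet (RFC 5880, Section 4.1), which
is one possible way to ignore the padding.

```python
import struct

BFD_ETHERTYPE = 0x88B5    # placeholder, NOT an assigned BfdEtherType
MIN_PAYLOAD = 46          # minimum Ethernet payload in bytes
MIN_BFD_LEN = 24          # BFD Control packet without authentication


def build_frame(dst_mac: bytes, src_mac: bytes, bfd_packet: bytes) -> bytes:
    """Encapsulate a BFD packet, zero-padding the payload to 46 bytes."""
    payload = bfd_packet.ljust(MIN_PAYLOAD, b"\x00")
    return dst_mac + src_mac + struct.pack("!H", BFD_ETHERTYPE) + payload


def extract_bfd(frame: bytes) -> bytes:
    """Return the BFD packet from a received frame, ignoring padding."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != BFD_ETHERTYPE:
        raise ValueError("not a BFD frame")
    payload = frame[14:]
    if len(payload) < 4:
        raise ValueError("truncated BFD packet")
    length = payload[3]   # BFD Length field
    if length < MIN_BFD_LEN or length > len(payload):
        raise ValueError("invalid BFD Length field")
    return payload[:length]
```

A 24-byte BFD Control packet therefore produces a 60-byte frame
(14 bytes of header plus 46 bytes of padded payload), and
extract_bfd() returns the original 24 bytes.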
IANA needs to assign an L2 MAC address according to [RFC5342] that
would be used as the destination MAC for all control packets in the
micro BFD sessions.
A new Ethertype must be assigned by the IEEE Registration Authority
to the BFD over Ethernet protocol that will be used for all micro BFD
sessions.
3.3. Concluded BFD state
An additional state variable is introduced for BFD on LAG members:
the concluded state. The state values are Down, Up and AdminDown.
This state is not part of the micro session state machine. Instead
it describes the overall state of the LAG. It is a local state and
does not appear (directly) in any BFD packet on any link.
The concluded state may be set to AdminDown for administrative
purpose, to keep the BLM and the micro sessions indefinitely down.
When the concluded state enters AdminDown, all micro
sessions belonging to the BLM MUST enter the AdminDown state as well.
A function must be defined that evaluates the states of all micro
sessions that belong to the BLM. This function has two output
values, Down and Up, and the concluded state is updated with the
latest evaluation result unless it is already in AdminDown state.
The evaluation takes place whenever a micro session is added,
removed or changes state.
The details of the evaluation function are outside the scope of the
document. The function could for example test for a minimum number
of micro sessions in Up state. The function could even be
"outsourced" and e.g. the decision logic of the LMM module could be
used.
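As an illustration only, an evaluation function based on a minimum
number of micro sessions in Up state could look like the sketch
below. The threshold parameter `min_up` and all names are
hypothetical; the actual function is implementation specific.

```python
from enum import Enum
from typing import Iterable


class State(Enum):
    ADMIN_DOWN = "AdminDown"
    DOWN = "Down"
    INIT = "Init"          # used by micro sessions only
    UP = "Up"


def evaluate(micro_states: Iterable[State], min_up: int = 1) -> State:
    """Example evaluation function: Up if at least min_up sessions are Up."""
    up = sum(1 for s in micro_states if s is State.UP)
    return State.UP if up >= min_up else State.DOWN


def update_concluded(current: State, micro_states: Iterable[State],
                     min_up: int = 1) -> State:
    """Re-evaluate the concluded state; AdminDown is left untouched."""
    if current is State.ADMIN_DOWN:
        return current
    return evaluate(micro_states, min_up)
```

update_concluded() would be called whenever a micro session is
added, removed or changes state.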
The concluded state is important for layer-3 clients requesting BFD
sessions over the LAG or over VLANs on the LAG. Details will be
discussed in Section 4.
3.4. User interface for BFD packets
3.4.1. User interface (IP/UDP encapsulation)
[Either this section or the next section Section 3.4.2 should remain
in the final document. There is no intention to support both
encapsulations.]
The user interface for BFD micro sessions encapsulated in IP/UDP MUST
allow sending an IP/UDP packet on a specified LAG port. When BFD
packets for micro sessions are received, the IP/UDP packet MUST be
provided together with information identifying the LAG port on which
the packet was received.
3.4.2. User interface (Ethernet encapsulation)
[Either this section or the previous section Section 3.4.1 should
remain in the final document. There is no intention to support both
encapsulations.]
The user interface for BFD directly transported in an Ethernet frame
MUST allow sending and receiving a complete Ethernet frame with the
specific "BfdEtherType" type value. The information that specifies
the LAG port on which a frame is sent or received is either an
explicit extension of this API or is implicitly given by binding the
API to a specific port.
As an example, the API could be identical with the MA_DATA request
and indication as defined in section 2.3 of [IEEE802.3].
4. BFD on LAG members and layer-3 applications
Layer 3 protocols such as OSPF may use BFD on LAG members in one
of the following ways:
a. The session request from the client creates a virtual session.
This virtual session is not sending actual BFD packets. Instead
the state, which is reported to the layer-3 client, is based on
the concluded state.
Implementations compliant with this standard MUST support this
mode. This is the default mode in which BFD over LAG works.
b. The session request from the client creates a BBP session, as
described in Section 2.1. BFD SHOULD update the state of the BBP
session with the concluded state of the corresponding LAG in the
following way:
1. when the concluded state is Down then the BBP session state
transitions to Down as well
2. for a concluded state of Up or AdminDown the BBP session
state is unaffected
This state update allows the BBP session to run with more relaxed
timer values as the more intense liveness detection is done by
the micro BFD sessions.
Compliant implementations MUST support this mode.
An implementation MUST provide a configuration knob which lets the
user select the mode.
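The two modes can be summarised by the state that is reported to the
layer-3 client. The sketch below is illustrative only; the mode
names "virtual" and "bbp" and the function name are hypothetical
labels for modes (a) and (b).

```python
def reported_state(mode: str, concluded: str, bbp_state: str = "Down") -> str:
    """Return the BFD state reported to the layer-3 client.

    mode "virtual" is mode (a): no BFD packets, concluded state only.
    mode "bbp" is mode (b): a real BBP session, forced Down when the
    concluded state is Down and otherwise left unaffected.
    """
    if mode == "virtual":
        return concluded
    if mode == "bbp":
        return "Down" if concluded == "Down" else bbp_state
    raise ValueError(f"unknown mode: {mode}")
```

In mode (b) a concluded state of Up or AdminDown leaves the BBP
session state as computed by its own state machine.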
5. Application example: LMM using BLM
There are certainly many ways to use BLM. Here is one example
envisioned by the authors.
The LAG Management Module (LMM) could be envisaged as a client of
BFD, i.e. the LMM requests a BLM session and takes responsibility for
updating the set of micro sessions.
The LMM then uses BFD, instead of or in parallel with LACP, to monitor
the health of the individual member links of the LAG. Details are
outlined below.
Bringing a member link up:
When the status of a port is about to change to Distribution TRUE
(see section 5.3.15 in [IEEE802.1AX]), i.e. before the port is added
to the distribution function of the LAG, then the particular BFD
micro session is requested. An implementation MAY wait for the micro
BFD session to reach Up state before adding the port to the LAG's
distribution function and changing the port status to Distribution
TRUE.
In case LACP is in use then the steps of the previous paragraph are
executed in the "Distributing" state (see the Mux machine state
diagram Figure 5-14 in [IEEE802.1AX]). I.e. LACP is in Distributing
state before the implementation potentially waits for the BFD micro
session to reach Up state.
Detecting a member link failure:
When running in parallel operation the logic for failure is that both
LACP and BFD can indicate a failure.
When a micro BFD session that runs on a member link of a LAG goes
down, this member link MUST be taken out of the distribution
function of the particular LAG and the port status MUST change to
Distribution FALSE. The BFD micro session for the link MUST be
deleted when the link has been taken out of distribution.
In case LACP is in use then the variable "Selected" MUST be set to
UNSELECTED when BFD reports a Down state. The steps of the previous
paragraph are executed in the "Collecting" state (see the Mux machine
state diagram Figure 5-14 in 802.1AX).
The behaviour of the LMM MUST be configurable if waiting for a BFD
status of Up to add a member link is supported, to allow an
alternative mode of adding the member link irrespective of the BFD
state for interoperability purpose. Bringing the member link up
without waiting for BFD is then the default behaviour.
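The LMM behaviour described in this section can be sketched as a
small state holder. This is a simplified illustration under stated
assumptions: LACP interaction is not modelled, "goes down" is
interpreted as a transition from Up to Down, and the class, method
and attribute names are hypothetical.

```python
from typing import Dict, Set


class LagManager:
    def __init__(self, wait_for_bfd: bool = False) -> None:
        # Default: add the member link without waiting for BFD.
        self.wait_for_bfd = wait_for_bfd
        self.distributing: Set[str] = set()
        self.micro_sessions: Dict[str, str] = {}   # link -> last BFD state

    def link_ready(self, link: str) -> None:
        """Port is about to change to Distribution TRUE."""
        self.micro_sessions[link] = "Down"          # request micro session
        if not self.wait_for_bfd:
            self.distributing.add(link)

    def bfd_state_changed(self, link: str, state: str) -> None:
        """Handle a state report for the micro session on `link`."""
        if link not in self.micro_sessions:
            return
        previous = self.micro_sessions[link]
        self.micro_sessions[link] = state
        if state == "Up":
            self.distributing.add(link)
        elif state == "Down" and previous == "Up":
            # Take the link out of distribution, then delete the session.
            self.distributing.discard(link)
            del self.micro_sessions[link]
```

With wait_for_bfd=True a link joins the distribution function only
after its micro session reports Up; in both modes a session going
down removes the link and deletes the micro session.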
6. Security Consideration
6.1. (IP/UDP encapsulation)
This document does not introduce any additional security issues and
the security mechanisms defined in [RFC5880] apply in this document.
Routers compliant with this standard will now need to process packets
addressed to a new multicast address. This, however, should not open
any new attack vector as it is a link-local multicast address and the
attacker would have to be on the same link as the router to launch
such packets.
6.2. (Ethernet encapsulation)
This document does not introduce any additional security issues and
the security mechanisms defined in [RFC5880] apply in this document.
If no mechanism exists to transport Ethernet frames from a node other
than a directly connected node then the security is identical to the
TTL=255 check for IP packets.
7. IANA Considerations
7.1. (IP/UDP encapsulation, multicast)
The IANA is requested to assign a well-known link-local multicast IP
address: 224.0.0.X for IPv4 and FF02::X for IPv6.
7.2. (IP/UDP encapsulation, unicast)
The IANA is requested to assign a well-known port number for the UDP
encapsulated micro BFD sessions.
7.3. (Ethernet encapsulation)
IANA needs to assign an L2 MAC address according to RFC 5342 [RFC5342]
that would be used as the destination MAC for all control packets in
the micro BFD sessions.
8. IEEE Considerations
(The following applies only in case of Ethernet encapsulation) A new
Ethertype must be assigned by the IEEE Registration Authority to the
BFD over Ethernet protocol that will be used for all micro BFD
sessions.
9. Acknowledgements
Most of the text for this document came originally from
draft-chen-bfd-interface-00.
We would like to thank Dave Katz, Alexander Vainshtein, Greg Mirsky
and Jeff Tantsura for their comments on this draft.
We would also like to thank the members of the BFD WG who expressed
strong support for the need to run BFD on all the member links of a
LAG.
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD)", RFC 5880, June 2010.
[RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
June 2010.
[RFC5882] Katz, D. and D. Ward, "Generic Application of
Bidirectional Forwarding Detection (BFD)", RFC 5882,
June 2010.
10.2. Informative References
[IEEE802.1AX]
IEEE Std. 802.1AX, "IEEE Standard for Local and
metropolitan area networks - Link Aggregation",
November 2008.
[IEEE802.3]
IEEE Std. 802.3, "IEEE Standard for Information technology
- Telecommunications and information exchange between
systems - Local and metropolitan area networks - Specific
requirements Part 3: Carrier sense multiple access with
collision detection (CSMA/CD) access method and physical
layer specifications", December 2008.
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
August 1980.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
September 1981.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC5342] Eastlake, D., "IANA Considerations and IETF Protocol Usage
for IEEE 802 Parameters", BCP 141, RFC 5342,
September 2008.
Appendix A. IETF discussion status
[This section will finally go away. It documents some of the
discussions and decisions made recently on the BFD mailing list.]
A.1. Unicast vs. Multicast IP address
The destination IP address for the BFD control packets for the micro
BFD sessions can be Unicast or Multicast. Each has its set of
advantages and disadvantages.
Advantages with using a Unicast IP destination address:
o Minimal code changes to support micro BFD sessions per member
link. A new UDP port number can be used to differentiate BFD
control packets associated with the micro BFD sessions and the
regular BFD sessions.
Disadvantages with using a Unicast IP destination address:
o It is configuration intensive. Each LAG needs to be configured
with the remote end's IP address for BFD to bootstrap.
Similarly, a change in the IP address of the interface would need
all LAGs to be reconfigured. While one could minimize the amount
of human intervention required, it cannot be completely
eliminated.
o ARP needs to be resolved for sending Unicast packets. This means
that ARP must be resolved even before the first control packet is
sent to bring up the micro BFD session. There are multiple ways
to achieve this. The most logical approach is to mandate LACP on
this LAG. This way, LACP will bring up the links so that ARP
resolution can begin. However, this requires running LACP along
with BFD on all member links. The other option is to allow ARP
processing even when the port state is down. This means that
implementations would have to allow all packets with broadcast MAC
and port MAC to be sent to the CPU for processing. This violates
the basic tenets of IP layering and opens a hole for a DoS attack.
This also requires a huge change to the IP stack to allow packet
Rx and Tx on ports that are down.
o Not possible to support unnumbered IP interfaces.
Advantages with using a Multicast IP destination address:
o No additional configuration is required, and the micro BFD
sessions are set up automatically. It remains independent of the
LAG IP addressing scheme. The member links get added to the LAG
as soon as the micro BFD sessions come up.
o This involves minimal modifications to data plane and L3 stack.
Currently, ports that are down do process packets coming with
certain well known L2 MAC addresses. This solution requires such
ports to process packets addressed to another well known L2 MAC
address (derived from the multicast IP address assigned by IANA).
o Can support unnumbered IP interfaces.
Disadvantages with using a Multicast IP destination address:
o Unfamiliar. Some bit of the data plane and the source code would
need to be modified to accept BFD control packets that are
multicast.
o Need to allocate a new link local multicast address from IANA.
Based on the above analysis, we decided to go with the multicast IP
addressing scheme for the micro BFD sessions.
A.2. Design Using Unicast IP encapsulation
While we personally think that the Multicast solution for micro BFD
sessions is better than the Unicast one, we briefly describe how we
could make Unicast work.
Once LACP has brought up the links, routers will initiate
establishing a Unicast BFD session over each component link of the
LAG. The remote destination addresses could either be configured on
the routers or could be discovered via some discovery protocol (that
can be standardized later). The exact mechanism to get the
destination IP address is beyond the scope of this document.
Some service providers have expressed interest in running BBP on top
of the micro BFD sessions. In this case, it is imperative that
Unicast BFD packets corresponding to the micro sessions use a
different UDP port (assigned by IANA) lest they get mixed up with the
BFD packets meant for the BBP sessions.
This design requires LACP to be present so that it brings up the
links and ARP processing can begin. Operators however have also
expressed interest in a solution that works in the absence of LACP.
This could be done by using a well known L2 MAC address to carry the
micro session BFD packets. This way routers don't have to depend upon
ARP to bootstrap the micro BFD sessions.
A.3. Discussion about the BFD packet encapsulation
With at least three implementations using IP/UDP for the BFD packet
encapsulation on the LAG member links there cannot be any doubt that
technically IP/UDP encapsulation works for this purpose. What such a
view is missing though is the requirement to have some kind of
standardized packet send and receive API to allow everyone to
implement the new standard.
The user interface for IP/UDP packets would be either the UDP user
interface, defined in RFC 0768 [RFC0768], or the IP user interface,
defined in RFC 0791 [RFC0791]. Neither of them provides any control
over the LAG member port on which a packet is transmitted, nor does
either provide the information on which port a packet was received.
Thus an agreement is required to extend these APIs to control the
sending port and to learn the receiving port.
If we don't use the layer-3 user interface then we need to look at
802.3 and 802.1AX standards, as they describe in the case of a LAG
what is "below" the IP layer. In this case we are already on the
Ethernet layer and adding IP and UDP headers to the BFD packet may
either conflict with IP/UDP itself or may be without any function.
Thus encapsulating BFD directly in Ethernet and using a user
interface fitting into 802.3 and 802.1AX seems a viable approach.
A.4. Details of an example User interface for BFD packets
An additional sublayer is inserted between the MAC or MAC control of
the physical port(s) and the Link aggregation sublayer. This allows
BFD packets to be received and injected on every LAG port.
This additional sublayer allows Ethernet frames with a specific
Ethernet type (we name the value "BfdEtherType" from now on) to be
dropped off the stream of frames coming from the MAC layer. It hands
the dropped-off frames over to the BFD module. The new sublayer also
allows Ethernet frames with the specific Ethernet type to be injected
into the stream of frames towards the MAC layer. All other frames
pass transparently between the MAC and the link aggregation layer.
                      +------------+
                      | Mac Client |
                      +------------+
                           ^  |
                           |  |
                    ...................
                           |  |
                           |  V
            +---------------------------------+
            |        Link aggregation         |
            |            sublayer             |
            +---------------------------------+
                  ^                     ^
                  |                     |
         ........................................
                  |                     |
                  |                     |           +-----+
                  |     +-------------- | --------->| BFD |
                  |     |               |   +------>|     |
                  V (A) V (B)           V   V       +-----+
           +-------------+       +-------------+
           |   inject/   |       |   inject/   |
           |  drop-off   |  ...  |  drop-off   |
           +-------------+       +-------------+
                  ^ (C)                 ^
                  |                     |
         ........................................
                  |                     |
                  V                     V
           +-------------+       +-------------+
           | MAC control |  ...  | MAC control |
           | (optional)  |       | (optional)  |
           +-------------+       +-------------+
           |     MAC     |       |     MAC     |
           +-------------+       +-------------+
           |  Physical   |       |  Physical   |
           |    layer    |       |    layer    |
           +-------------+       +-------------+

      Inject/drop-off mechanism for specific BFD Ethernet frames

                               Figure 1
The API in (A) behaves like the MAC side of the API defined in
section 2.3 of [IEEE802.3]. All MA_DATA and MA_CONTROL requests are
passed transparently to the API in (C), which behaves like the MAC
Client side. Vice versa, all MA_CONTROL indications received at (C)
are passed transparently to (A). MA_DATA indications received at (C)
are passed to (A) when the Ethernet Type is not BfdEtherType.
Otherwise the MA_DATA indication is passed to API (B), which behaves
like the MAC side of the API in section 2.3 of [IEEE802.3] but
without any MAC Control support.
A MA_DATA request received at (B) is passed to (C) if the Ethernet
Type field in the frame is set to BfdEtherType; otherwise the frame
is dropped.
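The forwarding rules of the inject/drop-off sublayer can be sketched
as a pair of dispatch functions. This is an illustration only: the
EtherType value is a placeholder (0x88B5, an IEEE local experimental
EtherType) because "BfdEtherType" is not yet assigned, frames are
assumed to be untagged, and the function names are hypothetical.

```python
from typing import Optional

BFD_ETHERTYPE = 0x88B5    # placeholder, NOT an assigned BfdEtherType


def ethertype(frame: bytes) -> int:
    """Return the Type field of an untagged Ethernet frame."""
    return int.from_bytes(frame[12:14], "big")


def route_indication_from_c(kind: str, frame: bytes = b"") -> str:
    """Return the upper API, "A" or "B", for an indication received at (C)."""
    if kind == "MA_CONTROL":
        return "A"
    if kind == "MA_DATA":
        return "B" if ethertype(frame) == BFD_ETHERTYPE else "A"
    raise ValueError(f"unknown indication: {kind}")


def route_request_from_b(frame: bytes) -> Optional[str]:
    """Return "C" for a BFD frame injected at (B), or None to drop it."""
    return "C" if ethertype(frame) == BFD_ETHERTYPE else None


def route_request_from_a(kind: str) -> str:
    """MA_DATA and MA_CONTROL requests from (A) always pass to (C)."""
    if kind not in ("MA_DATA", "MA_CONTROL"):
        raise ValueError(f"unknown request: {kind}")
    return "C"
```

Only frames carrying the BFD EtherType ever reach (B) or are
accepted from it; everything else passes between (A) and (C)
unchanged.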
A.5. BLM sessions and the address family
When the BFD encapsulation is Ethernet then the following discussion
is obsolete. In case of IP/UDP encapsulation it should be
highlighted that the way a BLM session is defined above means that a
BLM request for a LAG with IPv4 and a BLM request for the same LAG
with IPv6 are considered a shared session, with the obvious conflict
that the micro sessions must be either all IPv4 or all IPv6. One
could consider allowing BLM-v4 and BLM-v6 for the same LAG instead,
which would mean maintaining separate concluded states. This would
require more details in Section 4.
Authors' Addresses
Manav Bhatia
Alcatel-Lucent
Bangalore, 560045
India
Email: manav.bhatia@alcatel-lucent.com
Mach(Guoyi) Chen
Huawei Technologies Co., Ltd
Q14 Huawei Campus, No. 156 Beiqing Road, Hai-dian District
Beijing 100095
China
Email: mach@huawei.com
Zuliang Wang
Huawei Technologies Co., Ltd
Q15 Huawei Campus, No. 156 Beiqing Road, Hai-dian District
Beijing 100095
China
Email: liang_tsing@huawei.com
Liang Guo
China Telecom
Guangzhou
China
Email: guoliang@gsta.com
Marc Binderberger
Lausanne,
Switzerland
Email: marc@sniff.de