TRILL Working Group                                            Yizhou Li
INTERNET-DRAFT                                           Donald Eastlake
Intended Status: Informational                                Weiguo Hao
                                                     Huawei Technologies
Expires: January 13, 2014                                  July 12, 2013
Problems of Active-Active connection at the TRILL Edge
draft-yizhou-trill-active-active-connection-prob-00
Abstract
The IETF TRILL (Transparent Interconnection of Lots of
Links) protocol provides support for flow level multi-pathing with
rapid failover for both unicast and multi-destination traffic in
networks with arbitrary topology and link technology between TRILL
switches. Active-active at the TRILL edge is the extension, in so far
as practical, of these characteristics to end stations that are
multiply connected to a TRILL campus. This informational document
discusses some of the high level problems to be overcome in providing
active-active at the TRILL edge.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Yizhou, et al [Page 1]
INTERNET DRAFT Problems of Active-Active connection July 2013
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1 Introduction
  1.1 Terminology
2. Target Scenario
3. Problems in active-active connection at the edge
  3.1 Frame duplications
  3.2 Address flip-flop
  3.3 Packet drop due to RPF check
  3.4 Loops
  3.5 Member RBridges info synchronization
4 Current Work
5 Security Considerations
6 IANA Considerations
7 References
  7.1 Normative References
  7.2 Informative References
Authors' Addresses
1 Introduction
The IETF TRILL (Transparent Interconnection of Lots of Links)
[RFC6325] protocol provides loop free and per hop based multipath
data forwarding with minimum configuration. TRILL uses IS-IS
[RFC6165] [RFC6326bis] as its control plane routing protocol and
defines a TRILL specific header for user data. In a TRILL campus,
communications between TRILL switches can
(1) use multiple parallel links and/or paths,
(2) load spread over different links and/or paths at a fine grained
flow level through equal cost multipathing of unicast traffic and
multiple distribution trees for multi-destination traffic, and
(3) rapidly re-configure to accommodate link or node failures or
additions.
Active-active connection is the extension, to the extent practical,
of similar load spreading and robustness to the connections between
end stations and the TRILL campus. Such end stations may have
multiple ports and will be connected, directly or via bridges, to
multiple edge TRILL switches. It must be possible, except in some
failure conditions, to load spread end station traffic at the flow
level across links to such multiple edge TRILL switches and rapidly
re-configure to accommodate topology changes.
1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
The acronyms and terminology in [RFC6325] are used herein with the
following additions:
CE - customer equipment. Could be a bridge or end station.
TRILL switch - an alternative term for an RBridge.
2. Target Scenario
The TRILL appointed forwarder [RFC6325] [RFC6327bis] [RFC6439]
mechanism provides per VLAN active-standby traffic spreading and loop
avoidance at the same time. One and only one appointed RBridge can
ingress/egress native frames into/from the TRILL campus for a given
VLAN among all edge RBridges connecting a legacy network to the TRILL
campus. This is true whether the legacy network is a simple
point-to-point link, a complex bridged LAN, or anything in between. By
carefully selecting different RBridges as appointed forwarders for
different sets of VLANs, load spreading over different edge RBridges
across different VLANs can be achieved.
This section presents a typical scenario of active-active connections
to the TRILL campus via multiple edge RBridges where the current TRILL
appointed forwarder mechanism is not applicable.
The appointed forwarder mechanism [RFC6439] requires the edge
RBridges to exchange TRILL IS-IS Hello packets through their access
ports. As Figure 1 shows, when access links of multiple edge RBridges
are bundled as an MC-LAG (Multi-Chassis Link Aggregation Group), a
Hello sent by RB1 via its access port to CE1 will not be forwarded to
RB2 by CE1. RB2 (and the other members of MC-LAG1) will therefore not
see the Hello from RB1. Every member RBridge of MC-LAG1 considers
itself the appointed forwarder on the MC-LAG1 link for all VLANs and
will ingress/egress frames for all VLANs. Hence the appointed
forwarder mechanism is not applicable in such an active-active
scenario.
                 ----------------------
                |                      |
                |     TRILL Campus     |
                |                      |
                 ----------------------
                     |       |     |
                -----        |      --------
               |             |              |
            +------+      +------+      +------+
            |      |      |      |      |      |
            |(RB1) |      |(RB2) |      | (RBk)|
            +------+      +------+      +------+
              |..|          |..|          |..|
              |  +----+     |  |          |  |
              |   +---|-----|--|----------+  |
              | +-|---|-----+  +-----------+ |
     MC-      | | |   +------------------+ | |
     LAG1--->(| | |)                    (| | |) <---MC-LAG2
            +-------+       . . .      +-------+
            |  CE1  |                  |  CEn  |
            |       |                  |       |
            +-------+                  +-------+

     Figure 1: MC-LAG connections between CEs and edge RBridges
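
The Hello visibility problem described in this section can be
illustrated with a minimal Python sketch. It does not implement the
procedures of [RFC6439]; it assumes a simplified, hypothetical rule
under which an RBridge considers itself forwarder on a link when no
RBridge it has heard a Hello from has a higher priority. The names
believes_forwarder, priority and heard are illustrative only.

```python
def believes_forwarder(rb, priority, heard):
    """Return True if rb considers itself forwarder on the link.

    Simplified, hypothetical rule (not the RFC 6439 procedure): rb wins
    if no RBridge it has heard a Hello from has a higher priority.
    """
    return all(priority[rb] >= priority[other] for other in heard[rb])


priority = {"RB1": 3, "RB2": 2, "RBk": 1}
members = list(priority)

# Shared LAN: every RBridge hears the Hellos of every other RBridge.
shared_lan = {rb: [o for o in members if o != rb] for rb in members}
# MC-LAG: the CE does not relay Hellos between member links.
mc_lag = {rb: [] for rb in members}

lan_forwarders = [rb for rb in members
                  if believes_forwarder(rb, priority, shared_lan)]
lag_forwarders = [rb for rb in members
                  if believes_forwarder(rb, priority, mc_lag)]

print(lan_forwarders)  # ['RB1']
print(lag_forwarders)  # ['RB1', 'RB2', 'RBk']
```

On the shared LAN exactly one RBridge forwards, while on the MC-LAG
every member reaches the same conclusion about itself.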
Although MC-LAG implementations vary by vendor, active-active
connection is useful when the following are required:
- Flow based rather than VLAN based load balancing.
- Rapid failure recovery. The current appointed forwarder mechanism
relies on Hello timer expiration to detect that another edge RBridge
connected to the same local Ethernet link has become unreachable. The
forwarder for specific VLANs may then have to be re-appointed. Such
procedures take time on the order of seconds. Active-active
connection should minimize frame loss and recovery time upon
failure.
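
The difference between VLAN based and flow based load balancing can be
sketched as follows. This is an illustration under stated assumptions,
not a description of any MC-LAG implementation: the modulo assignment
of VLANs to RBridges and the CRC32 hash over source MAC, destination
MAC and VLAN are both hypothetical choices made for the example.

```python
import zlib


def vlan_based(vlan, rbridges):
    """Appointed forwarder style: one RBridge per VLAN (assumed modulo rule)."""
    return rbridges[vlan % len(rbridges)]


def flow_based(src_mac, dst_mac, vlan, links):
    """MC-LAG style: hash the flow identifiers onto one member link."""
    key = f"{src_mac}|{dst_mac}|{vlan}".encode()
    return links[zlib.crc32(key) % len(links)]


rbridges = ["RB1", "RB2"]
# Sixteen flows from different sources, all in VLAN 10.
flows = [(f"00:00:00:00:00:{i:02x}", "00:00:00:00:ff:01", 10)
         for i in range(16)]

vlan_choice = {vlan_based(vlan, rbridges) for _, _, vlan in flows}
flow_choice = {flow_based(s, d, v, rbridges) for s, d, v in flows}

print(vlan_choice)  # every flow of VLAN 10 uses the same RBridge
print(flow_choice)  # flows of VLAN 10 can be spread over the members
```

With VLAN based assignment all traffic of one VLAN is pinned to a
single edge RBridge, whereas a per-flow hash can use several member
links for the same VLAN.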
3. Problems in active-active connection at the edge
This section presents the problems that need to be addressed in the
active-active connection scenario.
3.1 Frame duplications
When an MC-LAG is formed to multiple RBridges, a CE may receive
duplicate copies of a frame. Two possible scenarios are as follows.
1. Looping back: CE1 forwards a multi-destination frame from a user
device. As shown in Figure 1, the frame enters the TRILL campus via a
member of an MC-LAG (say RB1) and is then forwarded through the
campus to another member (say RB2) of the same MC-LAG. CE1 then
receives a duplicate copy from RB2.
2. Duplication from remote: A remote RBridge sends a
multi-destination frame in VLAN x. All members of MC-LAG1 will receive
the frame. As each of them considers itself the appointed forwarder
for all VLANs, they all forward the frame to CE1. As a result, CE1
receives multiple copies.
Frame duplication only happens in multi-destination frame forwarding.
Unicast does not have this issue.
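
The two duplication scenarios above can be counted with a small
sketch. It is a deliberately simplified model: the function
copies_to_ce and its parameters are hypothetical, and it assumes that
every member RBridge receives the multi-destination frame from the
distribution tree and egresses it if it considers itself forwarder.

```python
def copies_to_ce(members, self_appointed, ingress=None):
    """Count native copies of one multi-destination frame sent to the CE.

    members:        RBridges attached to the CE
    self_appointed: members that consider themselves forwarder for the VLAN
    ingress:        member that ingressed the frame from the CE, or None
                    when the frame originates at a remote RBridge
    """
    return sum(1 for rb in members
               if rb in self_appointed and rb != ingress)


members = ["RB1", "RB2", "RBk"]

# Appointed forwarder working as intended: RB1 is the only forwarder.
normal_remote = copies_to_ce(members, {"RB1"})
normal_local = copies_to_ce(members, {"RB1"}, ingress="RB1")

# MC-LAG: every member considers itself forwarder for all VLANs.
lag_remote = copies_to_ce(members, set(members))
lag_local = copies_to_ce(members, set(members), ingress="RB1")

print(normal_remote, normal_local)  # 1 0
print(lag_remote, lag_local)        # 3 2
```

In the MC-LAG case the CE receives three copies of a remote frame and
two looped-back copies of its own frame, instead of one and zero.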
3.2 Address flip-flop
Consider RB1 and RB2 each using its own nickname as the ingress
nickname when ingressing data frames into a TRILL campus. As shown in
Figure 1, CE1 may send data frames with the same source MAC address
to any member RBridge of MC-LAG1. If an egress RBridge receives TRILL
packets from different ingress RBridges but with the same source MAC
address, it learns a different MAC address to nickname correspondence
from each of them. The correspondence for the same MAC address in the
same VLAN may then keep flip-flopping among the nicknames of the
member RBridges of the MC-LAG. Some TRILL switches may behave badly
under these circumstances and, for example, interpret this as a severe
network problem. It may also cause return traffic to take different
paths to the destination, resulting in persistent re-ordering of
frames.
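
The flip-flop behavior can be reproduced with a minimal model of a
learning table at the egress RBridge. The function count_moves and the
tuple layout (source MAC, VLAN, ingress nickname) are assumptions made
for this sketch and are not taken from any TRILL implementation.

```python
def count_moves(frames):
    """Replay (src_mac, vlan, ingress_nickname) tuples through a table.

    Returns the number of times an existing (MAC, VLAN) entry had to be
    moved to a different ingress nickname, and the final table.
    """
    table = {}
    moves = 0
    for mac, vlan, nickname in frames:
        key = (mac, vlan)
        if key in table and table[key] != nickname:
            moves += 1
        table[key] = nickname
    return moves, table


mac = "00:11:22:33:44:55"
# CE1 spreads frames from one source MAC over RB1 and RB2 alternately.
frames = [(mac, 10, nick) for nick in ["RB1", "RB2", "RB1", "RB2", "RB1"]]

moves, table = count_moves(frames)
print(moves)  # 4
print(table)  # {('00:11:22:33:44:55', 10): 'RB1'}
```

Every frame after the first one relearns the same MAC address against
a different nickname.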
3.3 Packet drop due to RPF check
To solve the problems above, a pseudonode nickname solution [TRILLPN]
has been proposed. The basic idea is to represent all member links of
the MC-LAG as a virtual RBridge with a single pseudonode nickname. Any
member RBridge of the MC-LAG should use this pseudonode nickname,
rather than its own nickname, as the ingress nickname when it injects
TRILL data frames. This solves the above-mentioned problems reasonably
well; however, it introduces another issue: packet drop due to the
Reverse Path Forwarding (RPF) check.
When forwarding multi-destination frames, different member RBridges
of an MC-LAG may choose the same distribution tree. An arbitrary
RBridge RBn in the TRILL campus may then receive frames on a single
tree, with the pseudonode nickname as ingress, on different incoming
ports. The RPF check fails for frames arriving on any port other than
the expected one, and those frames are dropped.
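
The RPF failure can be sketched as a table lookup at RBn. The sketch
assumes, for illustration only, that RBn keeps exactly one expected
arrival port per (tree, ingress nickname) pair; the names rpf_accept,
rpf_table, "T1" and "PN" are hypothetical.

```python
def rpf_accept(rpf_table, tree, ingress_nickname, arrival_port):
    """Accept a frame only if it arrived on the expected port."""
    return rpf_table.get((tree, ingress_nickname)) == arrival_port


# RBn expects frames on tree T1 with ingress nickname PN on port 1 only.
rpf_table = {("T1", "PN"): 1}

# RB1 and RB2 both ingress with the pseudonode nickname PN on tree T1,
# but their frames reach RBn on different ports.
via_rb1 = rpf_accept(rpf_table, "T1", "PN", arrival_port=1)
via_rb2 = rpf_accept(rpf_table, "T1", "PN", arrival_port=2)

print(via_rb1, via_rb2)  # True False
```

Frames ingressed by RB2 are discarded at RBn even though they carry a
legitimate ingress nickname.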
3.4 Loops
Active-active connection does not by itself introduce extra looping
risk, because an MC-LAG behaves like a single link. A frame will
therefore not be repeatedly ingressed into and egressed from the TRILL
campus via a single MC-LAG in normal operation. However, any solution
for the active-active connection scenario must ensure that the campus
remains loop-free.
3.5 Member RBridges info synchronization
When multiple edge RBridges are bundled in an MC-LAG to multi-home a
CE to the TRILL campus, the member RBridges need to be aware of the
status of each link in the MC-LAG. The following information therefore
needs to be synchronized.
1. Member RBridge configuration synchronization: configuration
parameters have to be synchronized among the edge RBridges of an
MC-LAG. Such configuration may include system ID, system priority,
port key, port priority, partner information, etc. If the above-mentioned
[TRILLPN] and/or [CMT] is employed, there are more configuration items
to be synchronized, for instance, the pseudonode nickname of the
virtual RBridge. Without a synchronization mechanism, each member
RBridge has to be provisioned manually to guarantee consistency. In
addition, some of the configuration may change dynamically upon
failure, for instance, the tree-id selected by member RBridges [CMT].
Manual consistency checking is not practical in this case.
2. Member RBridge state synchronization: a link failure or node
failure on a member RBridge may introduce packet loss. Link failure
includes both access port and trunk port link failure. When a failure
occurs, the MC-LAG may need to invoke re-selection logic to spread the
traffic across the remaining links/nodes. Fast failure detection and
recovery, supported by state synchronization, is therefore required.
Existing mechanisms could be employed, for example, TRILL BFD support
[TRILLBFD], with which trunk port and node failures can be detected.
However, access port/link failure needs special care. An RBridge that
has an access port/link failure should notify the other member
RBridges, including the port information, so that they can adjust the
corresponding MC-LAG.
3. Learnt MAC address synchronization among member RBridges: member
RBridges need to share the MAC address to egress nickname
correspondences they have learnt. Such synchronization reduces
flooding due to unknown unicast.
If an inter-chassis protocol is employed among member RBridges for
MC-LAG member discovery, information synchronization and failure
handling, it needs to run correctly over the TRILL campus. Such a
protocol may use IP addresses to identify the other members, in which
case its packets need to be correctly TRILL encapsulated.
If no such inter-chassis protocol is available, TRILL has to provide
its own mechanisms to support the information synchronization.
4 Current Work
Some solution drafts have been presented in the TRILL WG. [TRILLPN],
[CMT] and [TRILLBFD] address parts of the problems above.
5 Security Considerations
This draft presents the problems in a particular scenario. It does
not introduce any extra security risks. For general TRILL Security
Considerations, see [RFC6325].
6 IANA Considerations
No IANA action is required. RFC Editor: please delete this section
before publication.
7 References
7.1 Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC6165] Banerjee, A. and D. Ward, "Extensions to IS-IS for Layer-2
Systems", RFC 6165, April 2011.
[RFC6325] Perlman, R., et al., "Routing Bridges (RBridges): Base
Protocol Specification", RFC 6325, July 2011.
[RFC6326bis] Eastlake, D., Banerjee, A., Dutt, D., Perlman, R., and
A. Ghanwani, "TRILL Use of IS-IS",
draft-eastlake-isis-rfc6326bis, work in progress.
[RFC6327bis] Eastlake 3rd, D., Perlman, R., Ghanwani, A., Yang, H.,
and V. Manral, "TRILL: Adjacency",
draft-ietf-trill-rfc6327bis, work in progress.
[RFC6439] Perlman, R., et al., "Routing Bridges (RBridges): Appointed
Forwarders", RFC 6439, November 2011.
7.2 Informative References
[TRILLPN] Zhai, H., et al., "RBridge: Pseudonode Nickname",
draft-hu-trill-pseudonode-nickname, work in progress, November 2011.
[CMT] Senevirathne, T., Pathangi, J., and J. Hudson, "Coordinated
Multicast Trees (CMT) for TRILL", draft-ietf-trill-cmt-01.txt,
work in progress, November 2012.
[TRILLBFD] Manral, V., et al., "TRILL (Transparent Interconnection of
Lots of Links): Bidirectional Forwarding Detection (BFD)
Support", draft-ietf-trill-rbridge-bfd-07.txt, work in
progress, July 2012.
[8021AX] IEEE, "Link Aggregation", IEEE Std 802.1AX-2008, 2008.
[8021Q] IEEE, "Media Access Control (MAC) Bridges and Virtual Bridged
Local Area Networks", IEEE Std 802.1Q-2011, August 2011.
Authors' Addresses
Yizhou Li
Huawei Technologies
101 Software Avenue,
Nanjing 210012
China
Phone: +86-25-56625375
EMail: liyizhou@huawei.com
Donald Eastlake
Huawei R&D USA
155 Beaver Street
Milford, MA 01757 USA
Phone: +1-508-333-2270
EMail: d3e3e3@gmail.com
Weiguo Hao
Huawei Technologies
101 Software Avenue,
Nanjing 210012
China
Phone: +86-25-56623144
EMail: haoweiguo@huawei.com