INTERNET-DRAFT Mingui Zhang
Intended Status: Proposed Standard Donald Eastlake
Expires: May 3, 2012 Huawei
October 31, 2011
RBridge Aggregation
draft-zhang-trill-aggregation-01.txt
Abstract
TRILL supports multi-access TRILL links that can have multiple
RBridges attached. This draft specifies RBridge Aggregation, which
enables concurrent data forwarding by multiple RBridges for the end
stations in the same VLAN on a TRILL link without partitioning the
link. RBridge Aggregation offers active/active multi-homing to
multi-access TRILL links, which improves their reliability and
increases the access bandwidth of the RBridge campus.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
Mingui Zhang Expires May 3, 2012 [Page 1]
INTERNET-DRAFT RBridge Aggregation October 31, 2011
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Terminology
2. Aggregation
3. Frame Processing
3.1. Unicast Ingressing
3.2. Unicast Egressing
3.3. Multicast Ingressing
3.4. Multicast Egressing
4. Address Synchronization
5. Configuration of RBridge Aggregation
5.1. Hashing Function Determination
6. Resilience
6.1. Failure Recovery
6.2. Failover
6.3. Connectivity of Wiring Closet Topology
7. Load Balance
8. Security Considerations
9. IANA Considerations
10. References
10.1. Normative References
10.2. Informative References
Authors' Addresses
1. Introduction
The multipathing feature of TRILL addresses a limitation of the
Spanning Tree Protocol, which often results in inefficient use of the
link topology. It is common for a TRILL link to be attached to
multiple edge RBridges, all of which offer packet forwarding for
this multi-access TRILL link. This multiple attachment provides
load balancing to the TRILL link. However, the traffic load of a
TRILL link can currently only be balanced among different VLANs
[RFC6325], while the traffic of end stations in a specific VLAN goes
through only a single RBridge, i.e., the appointed forwarder of this
VLAN. This still inherits two limitations of the Spanning Tree
Protocol: under-utilization of bandwidth and lack of reliability.
RBridge Aggregation is proposed to address the above two
limitations. With RBridge Aggregation, multiple edge RBridges process
the frames of the same VLAN on a TRILL link concurrently. They
ingress frames using the same ingress nickname (say RBv), as if the
frames were ingressed into the TRILL campus by a separate virtual
RBridge. The virtual links between the aggregated members and the
virtual RBridge are advertised in LSPs to other RBridges; therefore
the aggregated members always act as the penultimate hop to the
virtual RBridge. When the aggregated members receive frames destined
to this virtual RBridge, they decapsulate these frames and egress
them to the local link.
The frame processing procedures are carefully designed in this
document to avoid traffic duplication and forwarding loops. MAC
addresses learned by any of the aggregated members MUST be
immediately synchronized among all members. Simple configuration at
the RBridge port and access switch port is required to realize
RBridge Aggregation.
With RBridge Aggregation, a TRILL link can achieve reliable
active/active multihoming to a TRILL campus, which enables fast
failure recovery and failover. Traffic load is balanced at a finer
granularity: the traffic load for a specific VLAN can go through any
of the aggregated members.
Familiarity with [RFC6325], [RFC6327], and [RBaf] is assumed in this
document. As in [RFC6325], in this document the word "link" means a
"bridged LAN", unless otherwise qualified.
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Aggregation
For loop avoidance, there can be only a single appointed forwarder
ingressing and egressing native frames on a link for a specific
VLAN-x at any one time [RBaf]. This single forwarder mechanism does
not take full advantage of the "multiple attachment" character of
TRILL links. This can waste the available access bandwidth and reduce
network resilience. Take Figure 2.1 as an example: although both
RB1 and RB2 are able to forward frames for VLAN-x, the DRB can
appoint only one of them as the appointed forwarder; the other one
is inhibited from ingressing and egressing native frames of VLAN-x.
+-----+ +-----+
| RBi | | RBi |
+-----+ +-----+
| |
/\/\/\/\/\/\ /\/\/\/\/\/\
/ Transit \ / Transit \
< RBridges > < RBridges >
\ Campus / \ Campus /
\/\/\/\/\/\/ --> \/\/\/\/\/\/
| | | |
+-----+ +-----+ +-----+ +-----+
| RB1 |--| RB2 | | RB1 |--| RB2 |
+-----+ +-----+ +-----+ +-----+
\ / \ /
+----+ *******
+-| B1 |-+ * RBv *
| +----+ | *******
| | ||
|[H] [H]| +----+
+--------+ +-| B1 |-+
VLAN-x | +----+ |
| |
|[H] [H]|
+--------+
VLAN-x
Figure 2.1: Illustration of RBridge Aggregation
The RBridges can be aggregated to break the above limitations. Figure
2.1 illustrates RBridge Aggregation. RB1 and RB2 are both attached to
the local link which carries VLAN-x. We assume that there is a
virtual RBridge acting as VLAN-x's forwarder and using the nickname
RBv. When RB1 or RB2 ingresses frames from the local link to TRILL
networks, they will use RBv as the ingress nickname. The two virtual
links between RB1, RB2 and RBv will be announced in LSPs. Other
RBridges will believe there is an RBridge node RBv connecting RB1,
RB2 and the local link. When packets are sent to the local link, RBv
will serve as the egress RBridge (i.e., the last hop) while RB1 or
RB2 will serve as the penultimate hop. Note that although the
examples used to illustrate RBridge Aggregation in this document
include two edge RBridges, the RBridge Aggregation solution supports
more than two aggregation members.
The frame processing will be defined in Section 3. To ease the
implementation of RBridge aggregation, limited changes are introduced
to the aggregated RBridge members while no new feature is added to
the access bridge B1 and other RBridges in the campus.
3. Frame Processing
RBridge Aggregation introduces multiple forwarders for the same
TRILL link. Without further measures, this can cause two problems:
1. Traffic duplication - The members of the aggregated RBridges
ingress or egress the same frame at the same time for the local
link, so that end stations may receive duplicated frames.
2. Forwarding loops - Take Figure 2.1 as an example: RB1 ingresses a
multicast frame, which reaches RB2 across the campus; if RB2
egresses the frame back to the local link, a forwarding loop
results. Here, the RBridge Aggregation can be viewed as a shortcut
between the leaf nodes of a spanning tree. This problem is called
"flooding rebirth". The forwarding loop caused by flooding rebirth
can further cause harmful broadcast storms on the local link.
Frame processing is carefully designed in the following subsections
to eliminate the above problems. Although all the aggregated RBridges
are entitled to deliver frames for the local link at the same
time, it is still necessary to determine a single responsible
forwarder for each specific frame.
3.1. Unicast Ingressing
When a unicast frame is received from the local link by one of the
aggregated RBridges, this ingress RBridge fills RBv into the TRILL
header of the frame as the ingress nickname and then sends it to its
corresponding egress RBridge as a normal unicast frame.
There is no problem until we consider unknown unicast from the local
link. When the access bridge receives a frame destined to a MAC
address not in the address table, it will flood this frame to all
other ports. The aggregation members will all receive this unicast
frame. Nevertheless, the members do not know that this unicast frame
is flooded to them. If the aggregated RBridges already have the
destination MAC address in their address tables, this frame will
simply be sent as a known unicast by all the aggregated RBridges, so
that the remote egress RBridge will receive duplicated frames.
One solution is to configure the access links as "link aggregation"
[802-1AX] on the bridge side (see Section 7). Alternatively, an
unknown unicast blocking technique can be used: among the access
links to the aggregated RBridges, one and only one is selected to
let unknown unicast frames through, while the ports of all the other
access links suppress the egress of unknown unicast frames. Since
only one aggregated RBridge will receive a given unknown unicast
frame, traffic duplication is avoided.
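The unknown unicast blocking technique can be sketched as follows. This
is only an illustration under assumptions that are not part of this
document: the port names, the `designated_port` parameter and the
function itself are hypothetical, and a real access bridge would
express the same rule through its own filtering configuration.

```python
def flood_ports_for_unknown_unicast(all_ports, aggregation_ports,
                                    designated_port, in_port):
    """Return the ports on which the access bridge floods an unknown unicast.

    Exactly one access link towards the aggregated RBridges
    (designated_port) lets unknown unicast through; the other
    aggregation ports suppress it.
    """
    out = []
    for port in all_ports:
        if port == in_port:
            continue  # never flood back to the receiving port
        if port in aggregation_ports and port != designated_port:
            continue  # suppressed towards the other aggregated RBridges
        out.append(port)
    return out


# Hypothetical ports of bridge B1 in Figure 2.1.
ports = ["h1", "h2", "to_RB1", "to_RB2"]
flooded = flood_ports_for_unknown_unicast(
    ports, {"to_RB1", "to_RB2"}, "to_RB1", in_port="h1")
```

With these inputs only `to_RB1` among the aggregation ports receives
the flooded frame, so RB2 never sees a second copy.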
3.2. Unicast Egressing
When an aggregated RBridge member receives a unicast frame whose
egress nickname is the nickname of the virtual RBridge of the
aggregated members, this RBridge will decapsulate the frame and
egress it to the local link.
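The unicast ingress and egress rules of Sections 3.1 and 3.2 can be
sketched with a simplified frame model. The classes, the nickname
value `RBV_NICKNAME` and the initial hop count below are hypothetical
and chosen only for illustration; they do not reproduce the TRILL
header encoding.

```python
from dataclasses import dataclass
from typing import Optional

RBV_NICKNAME = 0x00AA  # hypothetical nickname of the virtual RBridge RBv


@dataclass
class NativeFrame:
    dst_mac: str
    src_mac: str
    vlan: int
    payload: bytes = b""


@dataclass
class TrillFrame:
    ingress_nickname: int
    egress_nickname: int
    hop_count: int
    inner: NativeFrame


def ingress_unicast(frame: NativeFrame, egress_nickname: int) -> TrillFrame:
    """Encapsulate a known unicast frame; every member uses RBv as ingress."""
    # 0x3F is an arbitrary illustrative initial hop count.
    return TrillFrame(RBV_NICKNAME, egress_nickname, hop_count=0x3F, inner=frame)


def egress_unicast(frame: TrillFrame) -> Optional[NativeFrame]:
    """Decapsulate a frame addressed to RBv; return None for other frames."""
    if frame.egress_nickname == RBV_NICKNAME:
        return frame.inner
    return None
```

Because both RB1 and RB2 would run the same two functions, a remote
RBridge cannot tell which member ingressed or egressed a given frame.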
3.3. Multicast Ingressing
As with unicast ingressing, the ingress nickname of the frames is
set to RBv. In order to avoid duplicated multicast frames, each
multicast ingress frame can be forwarded by only one of the RBridges.
To achieve this, the aggregated RBridges forward multicast frames
based on their locally implemented hashing function. As an example,
the last bit of the source MAC address is used as the input of the
hashing function. Frames with a source MAC address whose last bit is
0 will be forwarded by RB1, while RB2 will simply discard such
frames. Frames with a source MAC address whose last bit is 1 will be
forwarded by RB2, while RB1 will discard such frames. To realize
finer-grained load balancing, more bits can be used by the hashing
function of the aggregated RBridges, which can be manually
configured.
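The two-member example above can be written as a short sketch. It
assumes that "last bit" means the least significant bit of the final
octet of the source MAC address, and that RB1 and RB2 are identified
by the hypothetical indices 0 and 1.

```python
def last_bit_of_mac(mac: str) -> int:
    """Least significant bit of the final octet of a MAC address string."""
    return int(mac.replace(":", "").replace("-", ""), 16) & 1


def should_ingress_multicast(member_index: int, src_mac: str) -> bool:
    """True if this member (0 for RB1, 1 for RB2) ingresses the frame."""
    return last_bit_of_mac(src_mac) == member_index


# Each source address is handled by exactly one of the two members.
even_src = "00:00:5e:00:53:54"
odd_src = "00:00:5e:00:53:55"
rb1_takes_even = should_ingress_multicast(0, even_src)
rb2_takes_odd = should_ingress_multicast(1, odd_src)
```

Since both members evaluate the same bit, one of them forwards the
frame and the other discards it, without any per-frame coordination.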
3.4. Multicast Egressing
It is likely that both of the aggregated RBridges will receive the
multicast frames destined to the local link. However, only one of
them will act as the forwarder of these frames, according to their
local hashing. Again, as an example, the last bit of the source MAC
address of the multicast frames is used to break the tie: RB1 only
forwards frames whose source MAC address ends in bit 0, while RB2
only forwards frames whose source MAC address ends in bit 1.
When a multicast frame originated by the local link is forwarded
across the TRILL network and received by the peer RBridge, it is
important that the peer RBridge does not egress this frame back to
the local link; otherwise it will cause a forwarding loop on the
local link (flooding rebirth). The above hashing function will be
used by the peer RBridge, which will thereby determine not to forward
this multicast frame. In order to stay consistent with the hashing
result of the ingress RBridge, bits that may change as the frame is
forwarded, such as bits from the hop count field, should not be used
in hashing.
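A small sketch can show why the egress decision must depend only on
fields that stay constant in transit. The frame model and member
names below are hypothetical; the hash is the last-bit example from
this section and deliberately ignores the hop count.

```python
from dataclasses import dataclass, replace

MEMBERS = ("RB1", "RB2")  # hypothetical aggregation list


@dataclass(frozen=True)
class MulticastTrillFrame:
    inner_src_mac: str
    inner_vlan: int
    hop_count: int


def responsible_member(frame: MulticastTrillFrame) -> str:
    """Pick the member from the inner source MAC only, never the hop count."""
    last_bit = int(frame.inner_src_mac.replace(":", ""), 16) & 1
    return MEMBERS[last_bit]


def should_egress(me: str, frame: MulticastTrillFrame) -> bool:
    """A member egresses to the local link only frames hashed to itself."""
    return responsible_member(frame) == me


# RB1 ingresses a frame from the local link; RB2 later receives it
# across the campus with a decremented hop count.
sent = MulticastTrillFrame("00:00:5e:00:53:10", 10, hop_count=20)
received_by_rb2 = replace(sent, hop_count=19)
```

The frame still hashes to RB1 after the hop count changes, so RB2
declines to egress it and the frame is not reborn on the local link.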
4. Address Synchronization
MAC addresses SHOULD be synchronized between the aggregated members
through ESADI immediately after they are learned from data plane
frame processing. A MAC address received in an ESADI message from a
peer is stored in the MAC table as if it were locally learned.
Afterwards, a frame destined to this MAC address can be delivered to
the local link or TRILL network by either of the aggregated members.
In the corner case where a unicast frame is received by an
aggregated member while the ESADI message is still in flight, so that
the destination MAC address has not yet been learned from its peer,
this frame will be sent as an unknown unicast by this member.
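The effect of address synchronization can be sketched as below. The
`Member` class is hypothetical, and the direct method call between
peers merely stands in for an ESADI message; no ESADI encoding or
timing is modeled.

```python
class Member:
    """Toy aggregated member with a MAC table shared with its peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.mac_table = {}  # (vlan, mac) -> where the address was seen

    def learn_local(self, vlan, mac, location):
        """Learn from the data plane, then synchronize to every peer."""
        self.mac_table[(vlan, mac)] = location
        for peer in self.peers:
            peer.receive_sync(vlan, mac, location)  # stand-in for ESADI

    def receive_sync(self, vlan, mac, location):
        """Store a synchronized address as if it were locally learned."""
        self.mac_table[(vlan, mac)] = location

    def classify(self, vlan, dst_mac):
        if (vlan, dst_mac) in self.mac_table:
            return "known unicast"
        return "unknown unicast"


rb1, rb2 = Member("RB1"), Member("RB2")
rb1.peers, rb2.peers = [rb2], [rb1]
before_sync = rb2.classify(10, "00:00:5e:00:53:01")
rb1.learn_local(10, "00:00:5e:00:53:01", "RBi")
after_sync = rb2.classify(10, "00:00:5e:00:53:01")
```

The `before_sync` result corresponds to the corner case described
above, in which a frame arrives before the peer's update.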
5. Configuration of RBridge Aggregation
RBridge Aggregation should be configured by network managers when
they configure the RBridge ports. Only RBridge ports connected to
the same bridge can be configured to be aggregated, and all VLANs
carried on this TRILL link will be treated as aggregated. The
pseudonode nickname is used as the nickname of the aggregated virtual
RBridge. If the TRILL link does not have a pseudonode nickname, the
nickname for the virtual RBridge is required to be manually
configured and used by all the aggregated members.
The members of an aggregated group should report connections to the
aggregated VLANs so that the multicast traffic of these VLANs will
reach all the members.
In [RFC6325], in order to suppress loops, multiple appointed
forwarders for the same VLAN on the same local link are prohibited.
This limitation should be relaxed in the RBridge Aggregation
solution.
5.1. Hashing Function Determination
Hashing functions are well supported by hardware. The network manager
should determine the TRILL data frame fields that are used as the
hashing input. It is important that all aggregated members get
consistent output on the same native data frame. Therefore the fields
that are to be changed during frame processing MUST NOT be used as
the hashing input. The source MAC address, destination MAC address
and inner VLAN ID are all candidates for this kind of hashing input.
Each aggregated member maintains a circular list of the aggregated
members. Assume the hashing function is H(T), where T is the chosen
hashing input, and there are "A" members in the aggregated RBridges
group. The responsible forwarder
is chosen as RBr = H(T) mod A for multicast and broadcast packets.
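The selection rule RBr = H(T) mod A can be sketched as follows. The
draft does not specify H; CRC-32 from the Python standard library is
used here purely as a stand-in, and the choice of T as source MAC,
destination MAC and inner VLAN ID is one of the candidate inputs
named above. The result is treated as an index into the member list.

```python
import zlib


def hash_fields(src_mac: str, dst_mac: str, vlan_id: int) -> int:
    """H(T): hash only fields that do not change during forwarding."""
    key = f"{src_mac.lower()}|{dst_mac.lower()}|{vlan_id}".encode()
    return zlib.crc32(key)  # stand-in hash, not mandated by the draft


def responsible_forwarder(members, src_mac, dst_mac, vlan_id):
    """RBr = H(T) mod A, used as an index into the list of members."""
    return members[hash_fields(src_mac, dst_mac, vlan_id) % len(members)]


members = ["RB1", "RB2", "RB3"]  # hypothetical aggregation list, A = 3
chosen = responsible_forwarder(
    members, "00:00:5e:00:53:01", "ff:ff:ff:ff:ff:ff", 10)
```

Every member that holds the same list and the same H computes the
same `chosen` value for a given frame, which is the consistency
property this section requires.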
6. Resilience
RBridge Aggregation offers active/active multi-homing to a
multi-access TRILL link, which increases its reliability. In the
event of access link failures, the TRILL link need not wait for the
time-consuming forwarder re-appointment to recover its connectivity
to the TRILL campus.
6.1. Failure Recovery
Without RBridge Aggregation, if a local link is disconnected from its
appointed forwarder, data forwarding can be restored only after the
DRB successfully chooses a new appointed forwarder for this link.
However, it may take a long time before the new appointed forwarder
begins to function properly, and the disruption continues until it
does.
With RBridge Aggregation, if an aggregated member is no longer
connected to the local link, it will send out an LSP to announce
that it is not connected to the virtual RBridge RBv. Since all
aggregated RBridges had reported their connection to RBv, remote
RBridges in the TRILL campus can still send frames to RBv via any of
the other aggregated RBridges, where the frames will be egressed to
the local link. The connection to the local link remains
uninterrupted.
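The recovery behavior above can be illustrated with a toy link-state
view. This is only a sketch: the `lsdb` dictionary and the
`penultimate_hops` helper are hypothetical stand-ins for the
adjacencies that remote RBridges learn from LSPs, and the transit
campus of Figure 2.1 is collapsed into direct links.

```python
def penultimate_hops(lsdb, virtual="RBv"):
    """Return the RBridges that currently advertise a link to RBv."""
    return sorted(rb for rb, neighbors in lsdb.items() if virtual in neighbors)


# Hypothetical adjacencies advertised in LSPs.
lsdb = {
    "RBi": {"RB1", "RB2"},
    "RB1": {"RBi", "RB2", "RBv"},
    "RB2": {"RBi", "RB1", "RBv"},
}
before = penultimate_hops(lsdb)

# RB1 loses its access link and announces that it no longer reaches RBv.
lsdb["RB1"].discard("RBv")
after = penultimate_hops(lsdb)
```

After the withdrawal, RBv is still reachable through RB2, so remote
RBridges keep a path to the local link without any forwarder
re-appointment.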
For ingressing unicast frames, if the link between the access bridge
and one of the aggregated RBridges fails, the access bridge will send
these frames to the other RBridge, where they will be delivered
directly without disruption. Take Figure 2.1 as an example: suppose
link B1-RB1 fails. The packets originally sent through link B1-RB1
will be sent as unknown unicast to all the interfaces of B1. Since
RB2 stores all of VLAN-x's addresses learned by RB1, the packets
going through link B1-RB2 will be regarded as known unicast by RB2
and forwarded to their destination.
+-+ +-->0->RB1<-+ +-+ +-->0->RB1<-+
| | | 1->RB2 | | | | 1->RB2 |
|H| | 2->RB3 | |H| | 2->RB3 |
|A|->| ...->RB...| --> |A|->| ...->RB...|
|S| | k-1->RBk | |S| | k-1->RBk+1|
|H| | k->RBk+1| |H| | k->RBk+1|
| | | ...->RB...| | | | ...->RB...|
+-+ +-n-1->RBn--+ +-+ +-n-1->RBn--+
Figure 6.1: Hashing function change during a link failure
In the normal case, in order to avoid duplicate frames and forwarding
loops, an aggregated member will not send multicast frames that
the hashing function assigns to another member. However,
when an aggregation member cannot forward these frames due to link
failures, the next aggregated member on the aggregation list should
take over the responsibility to deliver these multicast frames. This
can be realized through a local change of its hashing function, as
shown in Figure 6.1. Originally, the member only delivers the frames
for which the output of the hashing function points to itself. When
this member (RBk+1) learns that a member (RBk) has failed, it takes
over the responsibility to deliver the frames that were originally
delivered by the failed member. Take Figure 2.1 as an example: in
the normal case, RB2 only delivers packets with source MAC addresses
whose last bit is 1. When link RB1-B1 fails and RB1 cannot deliver
multicast frames from/to the local link, RB2 will take the
responsibility to deliver packets with source MAC addresses whose
last bit is either 0 or 1.
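The takeover rule of Figure 6.1 can be sketched as a walk along the
circular list of members. The function name and the `failed` set are
hypothetical; how a member learns that a peer has failed is outside
this sketch.

```python
def responsible_after_failures(members, failed, hash_value):
    """Return the member that delivers a frame with the given hash value.

    Starts at index hash_value mod A and moves to the next member on
    the circular list while the candidate is marked as failed.
    """
    count = len(members)
    start = hash_value % count
    for step in range(count):
        candidate = members[(start + step) % count]
        if candidate not in failed:
            return candidate
    return None  # every member has lost the local link


pair = ["RB1", "RB2"]
normal = [responsible_after_failures(pair, set(), h) for h in (0, 1)]
rb1_down = [responsible_after_failures(pair, {"RB1"}, h) for h in (0, 1)]
```

In the two-member case, `rb1_down` shows RB2 delivering frames for
both hash outputs, matching the example in which RB2 takes over
source MAC addresses ending in either 0 or 1.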
If the failed link is the link that lets unknown unicast through, the
access bridge should change the link connected to the next aggregated
member to let unknown unicast through. This mechanism can be
implemented through configuration of the ACL of the access bridge.
6.2. Failover
When an aggregated member detects that it has been disconnected from
the local link while data frames are in flight, it can transmit the
frames to the other aggregated member for delivery. In this way, the
links connected to the aggregated RBridges protect each other.
Unicast frames will be redirected directly during the failover. For a
multi-destination frame or unknown unicast frame that should be
delivered by one of the aggregated RBridges according to the hashing
function, this RBridge can send the frame to the other RBridge
through a reserved outer VLAN. The other RBridge will deliver
multi-destination frames from this reserved VLAN without considering
the hashing function.
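The failover decisions described above can be summarized in a small
decision function. This is a sketch under assumptions: the VLAN value
`RESERVED_VLAN`, the action strings and the function signature are
hypothetical, and the member that receives a redirected frame is
assumed to still be connected to the local link.

```python
RESERVED_VLAN = 4000  # hypothetical value; this document does not assign one


def failover_action(connected, multi_destination, hash_selects_me,
                    outer_vlan=None):
    """Decide what a member does with a frame bound for the local link.

    multi_destination covers multi-destination and unknown unicast frames.
    """
    if outer_vlan == RESERVED_VLAN:
        # Redirected by the peer: deliver without consulting the hash.
        return "egress"
    if connected:
        if multi_destination and not hash_selects_me:
            return "discard"
        return "egress"
    # This member has lost its link to the local link.
    if not multi_destination:
        return "redirect to peer"
    if hash_selects_me:
        return "redirect to peer on reserved VLAN"
    return "discard"
```

The reserved VLAN is what lets the receiving member tell a redirected
frame apart from an ordinary copy that its own hash would reject.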
6.3. Connectivity of Wiring Closet Topology
According to the solution defined in Section A.3.3 of [RFC6325], the
edge RBridges connected to a wiring closet topology act as the roots
of spanning trees at the same time. The TRILL link will be
partitioned into several spanning trees.
With RBridge Aggregation, the access bridge will treat the aggregated
members as leaf nodes of the spanning tree. The edge RBridges no
longer have to emit BPDUs or participate in the Spanning Tree
Protocol. Possible forwarding loops are broken at the aggregated
RBridges and the bridged TRILL link need not be partitioned, which
defines a clearer boundary between the TRILL campus and the
traditional bridged LAN.
7. Load Balance
When a TRILL link is attached to aggregated RBridges, its packets can
be forwarded by each of these RBridges. The access switch can
configure the access links as "link aggregation", in which case it can
balance the load among these links through the link aggregation
technique [802-1AX].
However, the access switch may equally configure these links as
normal links. This does not mean that the traffic is unbalanced in
this case. The load will instead be balanced in the manner of
multipathing (ECMP and Multi-Topology Routing). Take Figure 2.1 as an
example: bridge B1 is attached to RB1 and RB2 through link B1-RB1 and
link B1-RB2. Suppose host Ha is attached to RBridge RBi and is
sending packets to a host located on the local link. If the remote
RBridge RBi selects RB1 as the egress RBridge, then B1 will learn
Ha's source MAC address at the port attached to link B1-RB1.
Therefore the packets destined to Ha from the local link will
naturally be sent via link B1-RB1. Otherwise, if RB2 is selected as
the egress RBridge, the packets will be sent through link B1-RB2.
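The learning behavior of bridge B1 in this example can be sketched
with a toy learning bridge. The class, the port names and the MAC
address used for Ha are hypothetical; aging and VLAN handling are
omitted.

```python
class LearningBridge:
    """Toy model of source-address learning on the access bridge B1."""

    def __init__(self):
        self.table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac):
        """Learn the source address on the port the frame arrived on."""
        self.table[src_mac] = in_port

    def out_port(self, dst_mac):
        """Return the learned port, or None if the frame must be flooded."""
        return self.table.get(dst_mac)


HA = "00:00:5e:00:53:0a"  # hypothetical MAC address of host Ha
b1 = LearningBridge()
unknown = b1.out_port(HA)

# A frame from Ha is egressed by RB1 and arrives on link B1-RB1.
b1.receive("B1-RB1", HA)
via_rb1 = b1.out_port(HA)

# A later frame from Ha is egressed by RB2 instead.
b1.receive("B1-RB2", HA)
via_rb2 = b1.out_port(HA)
```

The return path for Ha follows whichever member most recently
egressed Ha's traffic, which is how the remote RBridge's path choice
carries over to the access links.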
8. Security Considerations
This document raises no new security issues for IS-IS.
9. IANA Considerations
This document requires no IANA actions. RFC Editor: please remove
this section before publication.
10. References
10.1. Normative References
[RFC6325] R. Perlman, D. Eastlake, et al, "Routing Bridges (RBridges):
Base Protocol Specification", RFC 6325, July 2011.
[RBaf] R. Perlman, D. Eastlake, et al, "RBridges: Appointed
Forwarders", draft-ietf-trill-rbridge-af-05.txt, work in
progress.
10.2. Informative References
[802-1AX] "IEEE Standard for Local and metropolitan area networks -
Link Aggregation", IEEE Std 802.1AX-2008, 3 November
2008.
Authors' Addresses
Mingui Zhang
Huawei Technologies Co.,Ltd
HuaWei Building, No.3 Xinxi Rd., Shang-Di
Information Industry Base, Hai-Dian District,
Beijing, 100085 P.R. China
Email: zhangmingui@huawei.com
Donald E. Eastlake, 3rd
Huawei Technologies
155 Beaver Street
Milford, MA 01757 USA
Phone: +1-508-333-2270
EMail: d3e3e3@gmail.com