RBridge Aggregation
draft-zhang-trill-aggregation-01

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Expired".
Authors Donald E. Eastlake 3rd , Mingui Zhang
Last updated 2011-10-31 (Latest revision 2011-10-24)
INTERNET-DRAFT                                              Mingui Zhang
Intended Status: Proposed Standard                       Donald Eastlake
Expires: May 3, 2012                                              Huawei
                                                        October 31, 2011

                          RBridge Aggregation
                  draft-zhang-trill-aggregation-01.txt

Abstract

   TRILL supports multi-access TRILL links that can have multiple
   RBridges attached. This draft specifies RBridge Aggregation, which
   enables concurrent data forwarding by multiple RBridges for the end
   stations in the same VLAN on a TRILL link without partitioning the
   link. RBridge Aggregation offers active/active multi-homing to
   multi-access TRILL links, which improves their reliability and
   increases the access bandwidth of the RBridge campus.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
 

Mingui Zhang              Expires May 3, 2012                   [Page 1]
INTERNET-DRAFT            RBridge Aggregation           October 31, 2011

   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . .  3
   2. Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3. Frame Processing  . . . . . . . . . . . . . . . . . . . . . . .  5
     3.1. Unicast Ingressing  . . . . . . . . . . . . . . . . . . . .  5
     3.2. Unicast Egressing . . . . . . . . . . . . . . . . . . . . .  6
     3.3. Multicast Ingressing  . . . . . . . . . . . . . . . . . . .  6
     3.4. Multicast Egressing . . . . . . . . . . . . . . . . . . . .  6
   4. Address Synchronization . . . . . . . . . . . . . . . . . . . .  7
   5. Configuration of RBridge Aggregation  . . . . . . . . . . . . .  7
     5.1. Hashing Function Determination  . . . . . . . . . . . . . .  7
   6. Resilience  . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     6.1. Failure Recovery  . . . . . . . . . . . . . . . . . . . . .  8
     6.2. Failover  . . . . . . . . . . . . . . . . . . . . . . . . .  9
     6.3. Connectivity of Wiring Closet Topology  . . . . . . . . . . 10
   7. Load Balance  . . . . . . . . . . . . . . . . . . . . . . . . . 10
   8. Security Considerations . . . . . . . . . . . . . . . . . . . . 10
   9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 10
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     10.1. Normative References . . . . . . . . . . . . . . . . . . . 10
     10.2. Informative References . . . . . . . . . . . . . . . . . . 11
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12

 

1. Introduction

   The multipathing feature of TRILL addresses a limitation of the
   Spanning Tree Protocol, which often results in inefficient use of
   the link topology. It is common for a TRILL link to be attached to
   multiple edge RBridges, all of which offer packet forwarding for
   this multi-access TRILL link. This multiple attachment provides load
   balancing to the TRILL link. Currently, however, the traffic load of
   a TRILL link can only be balanced among different VLANs [RFC6325],
   while the traffic of end stations in a specific VLAN goes through
   only a single RBridge, i.e., the appointed forwarder of this VLAN.
   A TRILL link therefore still inherits two limitations of the
   Spanning Tree Protocol: under-utilization of bandwidth and lack of
   reliability.

   RBridge Aggregation is proposed to address the above two
   limitations. With RBridge Aggregation, multiple edge RBridges process
   the frames of the same VLAN on a TRILL link concurrently. They
   ingress frames using the same ingress nickname (say RBv), as if the
   frames were ingressed into the TRILL campus by a single virtual
   RBridge. The virtual links between the aggregated members and the
   virtual RBridge are advertised in LSPs to other RBridges, so the
   aggregated members always act as the penultimate hop to the virtual
   RBridge. When the aggregated members receive frames destined to this
   virtual RBridge, they decapsulate these frames and egress them to the
   local link.

   The frame processing procedures are carefully designed in this
   document to avoid traffic duplication and forwarding loops. MAC
   addresses learned by any of the aggregated members MUST be
   immediately synchronized among all members. Simple configuration at
   the RBridge port and access switch port is required to realize
   RBridge Aggregation.

   With RBridge aggregation, a TRILL link can achieve reliable
   active/active multihoming to a TRILL campus, which realizes fast
   failure recovery and failover. Traffic load is balanced in a finer
   granularity: the traffic load for a specific VLAN can freely go
   through any of the aggregated members.

   Familiarity with [RFC6325], [RFC6327], and [RBaf] is assumed in this
   document. As in [RFC6325], in this document the word "link" means a
   "bridged LAN", unless otherwise qualified.

1.1. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
 

2. Aggregation

   For loop avoidance, there can ONLY be a single appointed forwarder
   ingressing and egressing native frames on a link for a specific
   VLAN-x at any one time [RBaf]. This single-forwarder mechanism does
   not take full advantage of the "multiple attachment" characteristic
   of TRILL links. This can waste the available access bandwidth and
   reduce network resilience. Taking Figure 2.1 as an example, although
   both RB1 and RB2 are able to forward frames for VLAN-x, the DRB can
   appoint only one of them as the appointed forwarder; the other is
   inhibited from ingressing and egressing native frames of VLAN-x.

                   +-----+                    +-----+
                   | RBi |                    | RBi |
                   +-----+                    +-----+
                      |                          |
                 /\/\/\/\/\/\               /\/\/\/\/\/\
                /   Transit  \             /   Transit  \
               <    RBridges  >           <    RBridges  >
                \   Campus   /             \   Campus   /
                 \/\/\/\/\/\/      -->      \/\/\/\/\/\/
                  |        |                 |        |
               +-----+  +-----+           +-----+  +-----+
               | RB1 |--| RB2 |           | RB1 |--| RB2 |
               +-----+  +-----+           +-----+  +-----+
                    \    /                      \   /
                    +----+                     *******
                  +-| B1 |-+                   * RBv *
                  | +----+ |                   *******
                  |        |                     ||
                  |[H]  [H]|                   +----+
                  +--------+                 +-| B1 |-+
                    VLAN-x                   | +----+ |
                                             |        |
                                             |[H]  [H]|
                                             +--------+
                                               VLAN-x

            Figure 2.1: Illustration of RBridge Aggregation

   The RBridges can be aggregated to break the above limitations. Figure
   2.1 illustrates RBridge Aggregation. RB1 and RB2 are both attached to
   the local link which carries VLAN-x. We assume that there is a
   virtual RBridge acting as VLAN-x's forwarder and using the nickname
   RBv. When RB1 or RB2 ingresses frames from the local link to TRILL
   networks, they will use RBv as the ingress nickname. The two virtual
   links between RB1, RB2 and RBv will be announced in LSPs. Other
   RBridges will believe there is an RBridge node RBv connecting RB1, RB2
   and the local link. When packets are sent to the local link, RBv will
   serve as the egress RBridge (i.e., the last hop) while RB1 or RB2
   will serve as the penultimate hop. Note that although the examples
   used to illustrate RBridge Aggregation in this document include two
   edge RBridges, the RBridge Aggregation solution supports more than
   two aggregation members.

   The frame processing will be defined in Section 3. To ease the
   implementation of RBridge aggregation, limited changes are introduced
   to the aggregated RBridge members while no new feature is added to
   the access bridge B1 and other RBridges in the campus.

3. Frame Processing 

   RBridge Aggregation introduces more than one forwarder for the same
   TRILL link. Without further measures, this can cause two problems:

   1. Traffic duplication - The members of the aggregated RBridges
      ingress or egress the same frame at the same time for the local
      link. Then end stations may receive duplicated frames.

   2. Forwarding loops - Taking Figure 2.1 as an example, RB1 ingresses
      a multicast frame, which reaches RB2 through the campus; RB2 then
      egresses the frame back to the local link, which causes a
      forwarding loop. Here, RBridge Aggregation can be viewed as a
      shortcut between the leaf nodes of a spanning tree. This problem
      is called "flooding rebirth". The forwarding loop caused by
      flooding rebirth can further cause a harmful broadcast storm on
      the local link.

   Frame processing is carefully designed in the following subsections
   to eliminate the above problems. Although all the aggregated RBridges
   have the right to deliver the frames for the local link at the same
   time, it's still necessary to determine a single responsible
   appointed forwarder for a specific frame. 

3.1. Unicast Ingressing

   When a unicast frame is received from the local link by one of the
   aggregated RBridges, this ingress RBridge fills RBv into the TRILL
   header of the frame as the ingress nickname and then sends it to its
   corresponding egress RBridge as a normal unicast frame.

   There is no problem until we consider unknown unicast from the local
   link. When the access bridge receives a frame destined to a MAC
   address not in the address table, it will flood this frame to all
   other ports. The aggregation members will all receive this unicast
   frame. Nevertheless, the members do not know that this unicast frame
   is flooded to them. If the aggregated RBridges happen to have the
   destination MAC address in their address tables, this frame will
   simply be sent as a known unicast by all the aggregated RBridges, so
   the remote egress RBridge will receive duplicated frames.

   Our solution is to configure the access links as "link aggregation"
   [802-1AX] on the bridge side (see Section 7). An unknown unicast
   blocking technique can also solve this problem: among the access
   links to the aggregated RBridges, one and only one is chosen to let
   unknown unicast frames through, while all the other ports suppress
   the egress of unknown unicast frames. Since only one aggregated
   RBridge will receive a given unknown unicast frame, traffic
   duplication is avoided.
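
   The unknown unicast blocking rule above can be sketched as follows.
   This is a minimal illustration in Python, not part of the
   specification: the function flood_ports, the port labels, and the
   idea of a single "designated" port parameter are hypothetical names
   introduced only for this example.

```python
def flood_ports(ports, ingress_port, aggregated_ports, designated_port):
    """Return the ports on which the access bridge floods an unknown unicast frame."""
    out = []
    for port in ports:
        if port == ingress_port:
            continue  # never send a frame back out of the port it arrived on
        if port in aggregated_ports and port != designated_port:
            continue  # blocked: only the designated access link carries unknown unicast
        out.append(port)
    return out


# Hypothetical port names for bridge B1 in Figure 2.1.
B1_PORTS = ["host-a", "host-b", "to-RB1", "to-RB2"]
AGGREGATED = {"to-RB1", "to-RB2"}
```

   With the port towards RB1 designated, an unknown unicast frame
   arriving from a host port is flooded to the other host port and to
   RB1 only, so RB2 never sees a second copy.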

3.2. Unicast Egressing

   When an aggregated RBridge member receives a unicast frame whose
   egress nickname is the nickname of the virtual RBridge of the
   aggregated members, this RBridge will decapsulate the frame and
   egress it to the local link.

3.3. Multicast Ingressing

   As with unicast ingressing, the ingress nickname of the frames is
   set to RBv. In order to avoid duplicated multicast frames, each
   multicast ingress frame can ONLY be forwarded by one of the
   RBridges. To achieve this, the aggregated RBridges forward multicast
   frames based on their locally implemented hashing function. As an
   example, the last bit of the source MAC address is used as the input
   of the hashing function. Frames with a source MAC address whose last
   bit is 0 will be forwarded by RB1, while RB2 will simply discard
   such frames. Frames with a source MAC address whose last bit is 1
   will be forwarded by RB2, while RB1 will discard such frames. To
   realize finer-grained load balancing, more bits can be used by the
   hashing function of the aggregated RBridges, which can be manually
   configured.
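
   The two-member example above can be sketched as follows. This is an
   illustration only. It assumes that "last bit" means the least
   significant bit of the final octet of the MAC address and that MAC
   addresses are written in colon-separated hexadecimal; the names
   MEMBERS, last_bit and should_forward_multicast are hypothetical.

```python
MEMBERS = ["RB1", "RB2"]  # index 0 handles last bit 0, index 1 handles last bit 1


def last_bit(mac: str) -> int:
    """Return the least significant bit of the final octet of a MAC address."""
    return int(mac.split(":")[-1], 16) & 1


def should_forward_multicast(member: str, src_mac: str) -> bool:
    """True if `member` is the one RBridge that forwards frames from src_mac."""
    return MEMBERS[last_bit(src_mac)] == member
```

   For any source MAC address exactly one of RB1 and RB2 answers True,
   so each multicast frame is ingressed once.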

3.4. Multicast Egressing

   It is likely that both of the aggregated RBridges will receive the
   multicast frames destined to the local link. However, only one of
   them will act as the forwarder of these frames according to their
   local hashing. Again, as an example, the last bit of the source MAC
   address of the multicast frames is used to break the tie: RB1 only
   forwards frames whose source MAC address has last bit 0, while RB2
   only forwards frames whose source MAC address has last bit 1.

   When a multicast frame originated by the local link is forwarded
   across the TRILL network and received by the peer RBridge, it is
   important that the peer RBridge does not egress this frame back to
   the local link; otherwise it will cause a forwarding loop on the
   local link (flooding rebirth). The peer RBridge applies the above
   hashing function and thereby determines not to forward this
   multicast frame. To stay consistent with the hashing result of the
   ingress RBridge, bits that may change as the frame is forwarded,
   such as bits from the hop count field, should not be used in
   hashing.

4. Address Synchronization

   MAC addresses SHOULD be synchronized between the aggregated members
   through ESADI immediately after they are learned from data plane
   frame processing. A MAC address received in an ESADI message from a
   peer is stored in the MAC table as if it were locally learned.
   Afterwards, a frame destined to this MAC address can be delivered to
   the local link or the TRILL network by any of the aggregated
   members. In the corner case where a unicast frame is received by an
   aggregated member while the ESADI message is still in flight, so
   that the destination MAC address has not yet been learned from its
   peer, this frame will be sent as an unknown unicast by this member.
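
   The synchronization behaviour above, including the corner case, can
   be sketched as follows. This is a simplified model, not an ESADI
   implementation: no message encoding or transport is shown, and the
   class MacTable, its methods, and the (VLAN, MAC, location) record
   are hypothetical names chosen for the example.

```python
class MacTable:
    """Per-member MAC table keyed by (VLAN, MAC address)."""

    def __init__(self):
        self.entries = {}

    def learn_local(self, vlan, mac, where):
        """Learn from the data plane; return the record to advertise to peers."""
        self.entries[(vlan, mac)] = where
        return (vlan, mac, where)

    def learn_from_peer(self, record):
        """Store a peer's record exactly as if it had been learned locally."""
        vlan, mac, where = record
        self.entries[(vlan, mac)] = where

    def classify(self, vlan, dst_mac):
        """Report how a unicast frame to dst_mac would be sent."""
        return "known unicast" if (vlan, dst_mac) in self.entries else "unknown unicast"


rb1, rb2 = MacTable(), MacTable()
record = rb1.learn_local(10, "00:1b:21:00:00:0a", "local link")
before = rb2.classify(10, "00:1b:21:00:00:0a")  # record still in flight
rb2.learn_from_peer(record)
after = rb2.classify(10, "00:1b:21:00:00:0a")   # record has arrived
```

   Here "before" models the corner case in which RB2 must treat the
   frame as unknown unicast, and "after" models the synchronized state.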

5. Configuration of RBridge Aggregation

   RBridge Aggregation should be configured by network managers when
   they configure the RBridge ports. Only RBridge ports connected to
   the same bridge can be configured to be aggregated, and all VLANs
   carried on this TRILL link will be treated as aggregated. The
   pseudonode nickname is used as the nickname of the aggregated virtual
   RBridge. If the TRILL link does not have a pseudonode nickname, the
   nickname for the virtual RBridge is required to be manually
   configured and used by all the aggregated members.

   The members of an aggregated group should report connections to the
   aggregated VLANs so that the multicast traffic of these VLANs will
   reach all the members. 

   In [RFC6325], in order to suppress loops, multiple appointed
   forwarders for the same VLAN on the same local link are prohibited.
   This limitation should be relaxed in the RBridge Aggregation
   solution.

5.1. Hashing Function Determination

   Hashing functions are well supported by hardware. The network manager
   should determine the TRILL data frame fields that are used as the
   hashing input. It is important that all aggregated members get
   consistent output on the same native data frame. Therefore the fields
   that are to be changed during frame processing MUST NOT be used as
   the hashing input. The source MAC address, destination MAC address,
   and inner VLAN ID are all candidates for this kind of hashing input.

   Each aggregated member maintains a circular list of the aggregated
   members. Assume the hashing function is H(T), where T is the hashing
   input, and there are "A" members in the aggregated RBridges group.
   The responsible forwarder for multicast and broadcast packets is the
   member at index RBr = H(T) mod A of this list.
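
   The selection rule RBr = H(T) mod A can be sketched as follows. The
   draft does not name a hashing function, so CRC-32 from the Python
   standard library is used here purely as a stand-in for H. The input
   T is assumed to be the concatenation of the source MAC address,
   destination MAC address and inner VLAN ID, and the member list is
   assumed to be held in the same order by every member. The function
   names are hypothetical.

```python
import zlib


def hash_input(src_mac: str, dst_mac: str, inner_vlan: int) -> bytes:
    """Build T from fields that stay the same while the frame is forwarded."""
    src = bytes.fromhex(src_mac.replace(":", ""))
    dst = bytes.fromhex(dst_mac.replace(":", ""))
    return src + dst + inner_vlan.to_bytes(2, "big")


def responsible_forwarder(members, src_mac, dst_mac, inner_vlan):
    """Return the member at index H(T) mod A of the shared member list."""
    t = hash_input(src_mac, dst_mac, inner_vlan)
    return members[zlib.crc32(t) % len(members)]
```

   Because the hop count and other mutable TRILL header fields are not
   part of T, the ingress member and a peer that later receives the
   same frame from the campus compute the same responsible forwarder.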

6. Resilience

   RBridge Aggregation offers active/active multi-homing to a
   multi-access TRILL link, which increases its reliability. In the
   event of access link failures, the TRILL link need not wait for the
   time-consuming forwarder re-appointment to recover connectivity to
   the TRILL campus.

6.1. Failure Recovery

   Without RBridge aggregation, if a local link is disconnected from its
   Appointed Forwarder, data forwarding can be restored after the DRB
   successfully chooses a new appointed forwarder for this link.
   However, it may take a long time before the new appointed forwarder
   begins to function properly. Until then, the disruption continues.

   In RBridge aggregation, if an aggregated member is no longer
   connected to the local link, it will send out an LSP to announce
   that it is not connected to the virtual RBridge RBv. Since all
   aggregated RBridges had reported the connection to RBv, remote
   RBridges in the TRILL campus can send frames to RBv via any of the
   other aggregated RBridges, where the frames will be egressed to the
   local link. The connection to the local link remains uninterrupted.

   For ingressing unicast frames, if the link between the access bridge
   and one of the aggregated RBridges fails, the access bridge will
   send these frames to the other RBridge, where they will be delivered
   directly without disruption. Taking Figure 2.1 as an example,
   suppose link B1-RB1 fails. The packets originally sent through link
   B1-RB1 will be sent as unknown unicast to all the interfaces of B1.
   Since RB2 stores all of VLAN-x's addresses learned by RB1, the
   packets going through link B1-RB2 will be regarded as known unicast
   by RB2 and forwarded to their destination.

 

          +-+  +-->0->RB1<-+       +-+  +-->0->RB1<-+
          | |  |   1->RB2  |       | |  |   1->RB2  |
          |H|  |   2->RB3  |       |H|  |   2->RB3  |
          |A|->| ...->RB...|  -->  |A|->| ...->RB...|
          |S|  | k-1->RBk  |       |S|  | k-1->RBk+1|
          |H|  |   k->RBk+1|       |H|  |   k->RBk+1|
          | |  | ...->RB...|       | |  | ...->RB...|
          +-+  +-n-1->RBn--+       +-+  +-n-1->RBn--+

       Figure 6.1: Hashing function change during a link failure 

   In the normal case, in order to avoid duplicate frames and forwarding
   loops, an aggregated member will not send multicast frames that the
   hashing function assigns to another member. However, when an
   aggregation member cannot forward these frames due to link failures,
   the next aggregated member on the aggregation list should take over
   the responsibility to deliver these multicast frames. This can be
   realized through a local change of its hashing function, as shown
   in Figure 6.1. Originally, the member only delivers the frames for
   which the output of the hashing function points to itself. When this
   member (RBk+1) learns that a member (RBk) has failed, it takes over
   the responsibility to deliver the frames that were originally
   delivered by the failed member. Taking Figure 2.1 as an example, in
   the normal case RB2 only delivers packets whose source MAC address
   has last bit 1. When link RB1-B1 fails and RB1 cannot deliver
   multicast frames from/to the local link, RB2 will take the
   responsibility to deliver packets whose source MAC address has last
   bit 0 or 1.
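
   The takeover rule of Figure 6.1 can be sketched as follows. The
   function effective_forwarder and its parameters are hypothetical.
   The text above only describes the next member on the list taking
   over for one failed member; continuing around the circular list when
   several consecutive members have failed is an assumption made for
   this example.

```python
def effective_forwarder(members, index, failed):
    """Pick the member that delivers frames whose hash output is `index`.

    Starting at the member the hash points to, walk the circular list
    until a member that has not failed is found.
    """
    n = len(members)
    for step in range(n):
        candidate = members[(index + step) % n]
        if candidate not in failed:
            return candidate
    return None  # every member has lost its access link
```

   In the two-member example, a call with index 0 and RB1 marked as
   failed returns RB2, which matches RB2 taking over the frames whose
   source MAC address has last bit 0.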

   If the failed link is the one that lets unknown unicast through, the
   access bridge should change the link connected to the next aggregated
   member to let unknown unicast through. This mechanism can be
   implemented through configuration of the ACL of the access bridge.

6.2. Failover

   When an aggregated member detects that it is disconnected from the
   local link while data frames are in flight, it can transmit the
   frames to the other aggregated member for delivery. In this way, the
   links connected to the aggregated RBridges protect each other.
   Unicast frames will be redirected directly during the failover. For
   a multi-destination frame or unknown unicast frame that should be
   delivered by one of the aggregated RBridges according to the hashing
   function, this RBridge can send the frame to the other RBridge
   through a reserved outer VLAN. The other RBridge will deliver
   multi-destination frames from this reserved VLAN without considering
   the hashing function.

 

6.3. Connectivity of Wiring Closet Topology

   According to the solution defined in Section A.3.3 of [RFC6325], the
   edge RBridges connected to a wiring closet topology act as the roots
   of spanning trees at the same time. The TRILL link will be
   partitioned into several spanning trees.

   With RBridge Aggregation, the access bridge will treat the aggregated
   members as leaf nodes of the spanning tree. The edge RBridges no
   longer have to emit BPDUs or participate in the Spanning Tree
   Protocol. Possible forwarding loops are broken at the aggregated
   RBridges and the bridged TRILL link need not be partitioned, which
   defines a clearer boundary between the TRILL campus and the
   traditional bridged LAN.

7. Load Balance

   When a TRILL link is attached to aggregated RBridges, its packets can
   be forwarded by each of these RBridges. The access switch can
   configure the access links as "link aggregation" and then balance
   the load among these links through the link aggregation technique
   [802-1AX].

   However, the access switch can equally well configure these links as
   normal links. This does not mean that the traffic is unbalanced in
   this case: the load will be balanced in the manner of multipathing
   (ECMP and Multi-Topology Routing). Taking Figure 2.1 as an example,
   bridge B1 is attached to RB1 and RB2 through link B1-RB1 and link
   B1-RB2. Suppose host Ha is attached to RBridge RBi and is sending
   packets to a host located on the local link. If the remote RBridge
   RBi selects RB1 as the egress RBridge, then B1 will learn the source
   MAC address at the port attached to link B1-RB1. Therefore the
   packets destined to Ha from the local link will naturally be sent via
   link B1-RB1. Otherwise, if RB2 is selected as the egress RBridge, the
   packets will be sent through link B1-RB2.
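
   The learning behaviour of B1 described above can be sketched as
   follows. This models only ordinary source address learning on a
   bridge; the class LearningBridge and the port labels are
   hypothetical, and VLANs and address aging are left out for brevity.

```python
class LearningBridge:
    """A bridge that remembers the port on which each source MAC was last seen."""

    def __init__(self):
        self.table = {}

    def receive(self, src_mac, in_port):
        """Learn (or refresh) the port behind which src_mac is reachable."""
        self.table[src_mac] = in_port

    def output_port(self, dst_mac):
        """Return the learned port, or None if the frame would be flooded."""
        return self.table.get(dst_mac)
```

   If a frame from Ha arrives on the port attached to link B1-RB1, later
   frames destined to Ha leave through that port; if RB2 egresses the
   next frame from Ha, the entry moves to the port of link B1-RB2.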

8. Security Considerations

   This document raises no new security issues for IS-IS.

9. IANA Considerations

   This document requires no IANA actions. RFC Editor: please remove
   this section before publication.

10. References 

10.1. Normative References

 

   [RFC6325] R. Perlman, D. Eastlake, et al, "RBridges: Base Protocol
             Specification", RFC 6325, July 2011.

   [RBaf]    R. Perlman, D. Eastlake, et al, "RBridges: Appointed
             Forwarders", draft-ietf-trill-rbridge-af-05.txt, work in
             progress.

10.2. Informative References

   [802-1AX] "IEEE Standard for Local and metropolitan area networks -
             Link Aggregation", IEEE Std 802.1AX-2008, 3 November
             2008.

 

Authors' Addresses

   Mingui Zhang
   Huawei Technologies Co.,Ltd
   HuaWei Building, No.3 Xinxi Rd., Shang-Di
   Information Industry Base, Hai-Dian District, 
   Beijing, 100085 P.R. China
        
   Email: zhangmingui@huawei.com

   Donald E. Eastlake, 3rd
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   EMail: d3e3e3@gmail.com
