INTERNET-DRAFT                                              Mingui Zhang
Intended Status: Proposed Standard                                Huawei
                                                           Radia Perlman
                                                  Individual Contributor
                                                            Hongjun Zhai
                                                                     ZTE
                                                         Mukhtiar Shaikh
                                                        Muhammad Durrani
                                                                 Brocade
Expires: September 7, 2014                                 March 6, 2014

        TRILL Active-Active Edge Using Multiple MAC Attachments
                draft-zhang-trill-aa-multi-attach-01.txt

Abstract

   TRILL active-active service provides end stations with flow-level
   load balancing and resilience against link failures at the edge of
   TRILL campuses.

   This draft proposes that member RBridges in an active-active edge
   RBridge group use their own nicknames as ingress RBridge nicknames to
   encapsulate frames from attached end systems. Thus, remote edge
   RBridges are required to learn multiple locations of one MAC address
   in one VLAN. Design goals of this proposal are discussed in the
   document.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html




Mingui Zhang, et al    Expires September 7, 2014                [Page 1]


INTERNET-DRAFT     MAC Multi-Attach for Active/Active      March 6, 2014


Copyright and License Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2. Acronyms and Terminology  . . . . . . . . . . . . . . . . . . .  3
     2.1. Acronyms  . . . . . . . . . . . . . . . . . . . . . . . . .  4
     2.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . .  4
   3. Overview  . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
   4. Backward Compatibility  . . . . . . . . . . . . . . . . . . . .  5
   5. Design Goals  . . . . . . . . . . . . . . . . . . . . . . . . .  5
     5.1. No MAC Flip-Flopping (Normal Unicast Egress)  . . . . . . .  6
     5.2. Regular Unicast/Multicast Ingress . . . . . . . . . . . . .  6
     5.3. Right Multicast Egress  . . . . . . . . . . . . . . . . . .  6
       5.3.1. No Duplication (Single Exit Point)  . . . . . . . . . .  6
       5.3.2. No Echo (Split Horizon) . . . . . . . . . . . . . . . .  6
     5.4. No Black-hole & No Triangular Forwarding  . . . . . . . . .  7
     5.5. Load Balance Towards the AAE  . . . . . . . . . . . . . . .  7
   6. Security Considerations . . . . . . . . . . . . . . . . . . . .  7
   7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . .  7
   Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . .  7
   8. References  . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     8.1. Normative References  . . . . . . . . . . . . . . . . . . .  8
     8.2. Informative References  . . . . . . . . . . . . . . . . . .  8
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . .  9













1. Introduction

   In the TRILL Active-Active Edge (AAE) topology, a Multi-Chassis Link
   Aggregation Group (MC-LAG) is used to connect multiple RBridges to a
   switch or a vSwitch. An endnode clump is attached to this switch or
   vSwitch. Data traffic within a specific VLAN from this endnode clump
   must be able to be ingressed and egressed by any of these RBridges
   simultaneously. End systems in the clump can spread their traffic
   among these edge RBridges at the flow level. When a link fails, end
   systems can keep using the remaining links in the MC-LAG without
   waiting for TRILL to converge, which provides resilience against
   link failures.

   Since a packet from each endnode can be ingressed by any RBridge in
   the AAE group, a remote edge RBridge may observe multiple attachment
   points (i.e., egress RBridges) for this endnode identified by its MAC
   address. This issue is known as "MAC flip-flopping". Three potential
   solutions arise to address this issue:

      1) AAE member RBridges use a pseudonode nickname, instead of their
      own, as the ingress nickname for end systems attached to the MC-
      LAG. [CMT] is based on this solution.

      2) AAE member RBridges divide responsibility for MAC addresses
      among themselves. A member RBridge encapsulates the packet using
      its own nickname if it is responsible for the source MAC address.
      Otherwise, if the frame is known unicast, it encapsulates the
      packet using the nickname of the responsible RBridge; if the frame
      is multicast, it redirects the packet to the responsible RBridge
      for encapsulation.

      3) AAE member RBridges keep using their own nicknames. Remote edge
      RBridges are required to learn multiple points of attachment per
      VLAN for a MAC address attached to the AAE, and separately time
      each one out.

   The purpose of this document is to develop an approach based on
   solution 3. Although it focuses on solution 3, the major design
   goals discussed here are common to AAE. By mirroring the scenarios
   studied in this draft, other potential solutions may benefit as well.

   The main body of the document is organized as follows. Section 2
   lists the acronyms and terminology. Section 3 gives the overview
   model. Section 4 gives three options for incremental deployment.
   Section 5 describes how this approach meets the design goals.

2. Acronyms and Terminology




2.1. Acronyms

   TRILL: TRansparent Interconnection of Lots of Links
   AAE: Active/Active Edge
   MC-LAG: Multi-Chassis Link Aggregation Group
   IS-IS: Intermediate System to Intermediate System

2.2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Familiarity with [RFC6325], [RFC6327], [6327bis] and [RFC6439] is
   assumed in this document.

3. Overview

                               +-----+
                               | RB4 |
                    +----------+-----+----------+
                    |                           |
                    |                           |
                    |       Rest of campus      |
                    |                           |
                    |                           |
                    +-+-----+--+-----+--+-----+-+
                      | RB1 |  | RB2 |  | RB3 |
                      +-----\  +-----+  /-----+
                              \   |   /
                                \ | /
                                 |||MC-LAG
                                 |||
                                +---+
                                | B |
                                +---+
                             H1 H2 H3 H4: vlan 10

      Figure 3.1: An example topology of TRILL Active-Active Edge

   Figure 3.1 shows an example network of TRILL Active-Active Edge. In
   this figure, endnodes (H1, H2, H3 and H4) are attached to a bridge
   (B) which communicates with multiple RBridges (RB1, RB2 and RB3) via
   the MC-LAG. Suppose RB4 is a 'remote' RBridge outside the AAE group
   in the TRILL campus. This connection model is also applicable to a
   virtualized environment where the physical bridge is replaced with a
   vSwitch and the bare-metal hosts are replaced with virtual machines
   (VMs).



   For a packet received from their attached endnode clumps, member
   RBridges of the AAE group always encapsulate it using their own
   nickname, regardless of whether it is unicast or multicast.

   In this proposal, all edge RBridges in the entire campus need to
   learn multiple attachment points for each MAC address, and separately
   time each one out.
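
   As an illustration only, the learning behaviour described above can
   be sketched as a table keyed by (VLAN, MAC) whose value holds one
   timestamp per ingress nickname. The class name, method names, the
   injectable clock and the 300-second ageing default are assumptions
   made for this sketch; they are not defined by this document.

```python
import time
from collections import defaultdict


class MultiAttachMacTable:
    """Hypothetical MAC table keeping several attachment points per (VLAN, MAC)."""

    def __init__(self, ageing_seconds=300.0, clock=time.monotonic):
        # ageing_seconds is an assumed default, not a value from this draft.
        self.ageing_seconds = ageing_seconds
        self.clock = clock
        # (vlan, mac) -> {ingress nickname: time last seen}
        self.entries = defaultdict(dict)

    def learn(self, vlan, mac, ingress_nickname):
        """Record or refresh one attachment point without touching the others."""
        self.entries[(vlan, mac)][ingress_nickname] = self.clock()

    def expire(self):
        """Time out each attachment point independently of its siblings."""
        now = self.clock()
        for key in list(self.entries):
            seen = self.entries[key]
            for nickname in list(seen):
                if now - seen[nickname] > self.ageing_seconds:
                    del seen[nickname]
            if not seen:
                del self.entries[key]

    def attachments(self, vlan, mac):
        """Return the nicknames currently known for this (VLAN, MAC), sorted."""
        return sorted(self.entries.get((vlan, mac), {}))
```

   Learning the same MAC address from a second AAE member adds a second
   entry instead of overwriting the first, and each entry ages out on
   its own timer.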

4. Backward Compatibility

   Three options are listed below to cope with incremental deployment
   scenarios.

   -- Option A

      A new capability announcement, "I can cope with multiple endnode
      attachments", would appear in LSPs. The AAE group can use this
      approach only if all edge RBridges announce this capability.
      RBridges that support this approach do not establish connectivity
      with legacy RBridges that cannot cope with multiple endnode
      attachments, so the legacy RBridges are isolated from them. Note
      that only edge RBridges (those that are Appointed Forwarders
      [RFC6439]) need to support this capability. Pure transit RBridges
      are not affected.

   -- Option B

      Each edge RBridge in the AAE group ingresses data frames from any
      MC-LAG into a specific topology. In this way, the topology ID is
      used as the discriminator of different locations of a specific MAC
      address at the remote RBridge. TRILL MAY reserve a list of
      topology IDs to be dedicated to AAE. RBridges which do not support
      this reserved list MUST NOT establish connectivity with edge
      RBridges in the AAE group.

   -- Option C

      If the data plane learning of the RBridges does not support the
      multi-location learning feature, it is possible to make use of
      the ESADI protocol [ESADI] to distribute MAC addresses. In
      contrast to data plane learning, TRILL ESADI allows one RBridge
      to remember multiple locations of a MAC address in the control
      plane.
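
   The all-or-nothing precondition of Option A can be sketched as
   follows. This is a minimal illustration: the function name and the
   representation of the announced capability as a set of nicknames
   are assumptions, since this document does not define how the
   capability is encoded in LSPs.

```python
def aae_may_use_multi_attach(edge_rbridges, announced):
    """Return True only if every edge RBridge announced the capability.

    edge_rbridges: iterable of edge RBridge nicknames (hypothetical input).
    announced: set of nicknames whose LSPs carried the capability.
    """
    return all(nickname in announced for nickname in edge_rbridges)
```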

5. Design Goals

   Proposals for the major design goals of AAE are explored in this
   section.



5.1. No MAC Flip-Flopping (Normal Unicast Egress)

   Since all RBridges talking with the AAE RBridges in the campus are
   able to keep multiple locations for one MAC address, a MAC address
   learnt from one AAE member will not be overwritten by the same MAC
   address learnt from another AAE member. Multiple entries for this MAC
   address will be created. The remote RBridge can adhere to one of the
   locations (e.g., the closest one) for each MAC address rather than
   flip-flopping among them.
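
   A minimal sketch of "adhering to the closest location" is given
   below. The function name, the distance map (for example, an IS-IS
   path cost per nickname) and the tie-break on nickname are
   assumptions of this sketch rather than part of the proposal.

```python
def choose_location(attachments, distance):
    """Pick one attachment point for a MAC address and stick to it.

    attachments: nicknames learnt for the MAC address.
    distance: dict mapping nickname -> cost from the local RBridge
              (hypothetical input, e.g. taken from the routing table).
    Ties are broken by nickname so the choice is stable.
    """
    if not attachments:
        return None
    return min(attachments, key=lambda nickname: (distance[nickname], nickname))
```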

5.2. Regular Unicast/Multicast Ingress

   MC-LAG guarantees that each frame will be sent upward to the AAE via
   exactly one uplink. RBridges in the AAE can simply follow the process
   per [RFC6325] to ingress the frame. For example, each RBridge uses
   its own nickname as the ingress nickname to encapsulate the packet.
   In this scenario, each RBridge takes for granted that it is the
   Appointed Forwarder for the VLANs enabled on this MC-LAG.

5.3. Right Multicast Egress

   A fundamental design goal of AAE is that multi-destination frames
   are neither duplicated nor looped.

5.3.1. No Duplication (Single Exit Point)

   When multi-destination packets for a specific VLAN are received from
   the campus, it is important that exactly one RBridge in the AAE
   group lets each multicast packet through, so that no duplication
   occurs. The single exit point can be selected by a static algorithm,
   e.g., the VLAN ID or source MAC address 'mod' the number of AAE
   members. Alternatively, AAE member RBridges may listen to the LACP
   PDUs and make use of the hashing function of the MC-LAG to determine
   this single exit point.
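
   The static "VLAN mod number of members" selection can be sketched as
   below. The sketch assumes that every member holds the same member
   list and orders it identically (here by sorting nicknames); the
   function names are hypothetical.

```python
def designated_egress(members, vlan):
    """Return the single AAE member that egresses multicast for this VLAN."""
    ordered = sorted(members)  # identical ordering on every member
    return ordered[vlan % len(ordered)]


def should_egress(local_nickname, members, vlan):
    """True if the local RBridge is the single exit point for this VLAN."""
    return designated_egress(members, vlan) == local_nickname
```

   Because every member evaluates the same function on the same inputs,
   exactly one of them lets a given multi-destination packet through.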

5.3.2. No Echo (Split Horizon)

   When a multicast frame originated from an MC-LAG is ingressed by an
   RBridge of an AAE group, forwarded across the TRILL network and then
   received by another RBridge in the same AAE group, it is important
   that this RBridge does not egress this frame back to this MC-LAG.
   Otherwise, it will cause a forwarding loop (echo). The well-known
   'split horizon' technique can be used to eliminate the echo issue.
   The essential point of split horizon is that the MC-LAG is assigned
   a unique identifier across the AAE group. When an AAE member
   receives a multicast packet that carries this identifier, the
   receiver MUST NOT egress it to the MC-LAG with the same identifier.




   This document proposes to perform split horizon based on the tuple
   consisting of the Fine Grained Label (FGL) plus the ingress RBridge
   nickname. When there are multiple MC-LAGs connected to the same
   RBridge, each MC-LAG MUST be assigned a unique FGL. RBridges in an
   AAE group should discover and remember the nicknames of the other
   members. If a multicast packet is from an edge RBridge in the same
   AAE group as RB1, its FGL is read and RB1 MUST NOT egress it out of
   the interface configured with the same FGL. For other interfaces,
   RB1 SHOULD egress the packet.
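
   The split horizon check on the (FGL, ingress nickname) tuple can be
   sketched as follows. The function name, the interface names and the
   FGL values are illustrative assumptions. The sketch covers only the
   echo check and does not include the single exit point selection of
   Section 5.3.1.

```python
def egress_interfaces(ingress_nickname, fgl, group_members, interface_fgl):
    """Return the interfaces on which a multicast packet may be egressed.

    ingress_nickname: ingress RBridge nickname of the received packet.
    fgl: Fine Grained Label carried by the packet.
    group_members: nicknames of the RBridges in the local AAE group.
    interface_fgl: dict mapping local interface name -> configured FGL.
    """
    from_same_group = ingress_nickname in group_members
    return sorted(
        name
        for name, configured in interface_fgl.items()
        # Suppress only the MC-LAG the packet originally came from.
        if not (from_same_group and configured == fgl)
    )
```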

5.4. No Black-hole & No Triangular Forwarding

   If a sub-link of the MC-LAG fails while remote RBridges continue to
   send packets to those MAC addresses they have learnt via the failed
   port, a black hole occurs.

   The proposal in this draft may make use of MAC withdrawal. When a
   member RBridge detects that the port connected to a sub-link of the
   MC-LAG has failed, all MAC addresses attached to this RBridge
   through the failed sub-link are flushed. After that, no traffic is
   sent via the failed port, hence no black hole occurs.
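
   The local flush step can be sketched as below. The class and method
   names are hypothetical, and how the flushed addresses are then
   withdrawn from remote RBridges is not specified here, so the sketch
   only returns the list that such a withdrawal would need to carry.

```python
class LocalMacTable:
    """Hypothetical per-port record of locally attached (VLAN, MAC) pairs."""

    def __init__(self):
        self.by_port = {}  # port name -> set of (vlan, mac)

    def learn(self, port, vlan, mac):
        """Remember that (vlan, mac) is reachable through this port."""
        self.by_port.setdefault(port, set()).add((vlan, mac))

    def port_failed(self, port):
        """Flush everything learnt on the failed port.

        Returns the flushed (vlan, mac) pairs, sorted, so that the
        caller can withdraw them; an unknown port yields an empty list.
        """
        return sorted(self.by_port.pop(port, set()))
```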

5.5. Load Balance Towards the AAE

   Since a remote RBridge can record multiple attachments of one MAC
   address, this remote RBridge can choose to spread the traffic to this
   MAC towards any of the AAE members. Each of them is able to egress
   the traffic. Flow-level load-balancing mechanisms can be implemented
   to optimize the distribution of the traffic load towards the AAE
   group.
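
   One possible flow-level mechanism is to hash a flow key onto the
   learnt attachment points, as sketched below. The choice of the
   source and destination MAC addresses as the flow key, the use of
   CRC-32 and the function name are assumptions of this sketch; this
   document does not mandate a particular hash.

```python
import zlib


def pick_egress(attachments, src_mac, dst_mac):
    """Map one flow to one AAE member, keeping each flow on a single path.

    attachments: nicknames learnt for the destination MAC address.
    src_mac, dst_mac: strings identifying the flow (assumed flow key).
    """
    if not attachments:
        return None
    ordered = sorted(attachments)  # make the result independent of input order
    flow_key = f"{src_mac}>{dst_mac}".encode()
    return ordered[zlib.crc32(flow_key) % len(ordered)]
```

   All packets of a flow hash to the same member, which preserves
   packet order within the flow while different flows may be spread
   across the AAE group.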

6. Security Considerations

   Security issues should be considered whenever a specific extension
   is made to TRILL.

   Authenticity of contents transported in IS-IS PDUs is enforced using
   the regular IS-IS security mechanisms [ISIS][RFC5310].

   For security considerations pertaining to extensions hosted by TRILL
   ESADI, refer to the Security Considerations in [ESADI].

7. IANA Considerations

   This document requires no IANA actions. RFC Editor: please remove
   this section before publication.

Acknowledgements




   The authors would like to thank Donald Eastlake, Erik Nordmark,
   Fangwei Hu and Liang Xia for their comments and suggestions.

8. References

8.1. Normative References

   [RFC6325] Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
             Ghanwani, "Routing Bridges (RBridges): Base Protocol
             Specification", RFC 6325, July 2011.

   [RFC6327] Eastlake 3rd, D., Perlman, R., Ghanwani, A., Dutt, D., and
             V. Manral, "Routing Bridges (RBridges): Adjacency", RFC
             6327, July 2011.

   [6327bis] D. Eastlake, R. Perlman, et al, "TRILL: Adjacency", draft-
             ietf-trill-rfc6327bis-04.txt, January 2014, in RFC Ed
             Queue.

   [RFC6439] Perlman, R., Eastlake, D., Li, Y., Banerjee, A., and F. Hu,
             "Routing Bridges (RBridges): Appointed Forwarders", RFC
             6439, November 2011.

   [ESADI]   H. Zhai, F. Hu, et al, "TRILL (Transparent Interconnection
             of Lots of Links): ESADI (End Station Address Distribution
             Information) Protocol", draft-ietf-trill-esadi-05.txt,
             February 2014, work in progress.

8.2. Informative References

   [CMT]     T. Senevirathne, J. Pathangi, et al, "Coordinated Multicast
             Trees (CMT) for TRILL", draft-ietf-trill-cmt-02.txt,
             November 2012, work in progress.

   [ISIS]    ISO, "Intermediate system to Intermediate system routeing
             information exchange protocol for use in conjunction with
             the Protocol for providing the Connectionless-mode Network
             Service (ISO 8473)", ISO/IEC 10589:2002.

   [RFC5310] Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
             and M. Fanto, "IS-IS Generic Cryptographic Authentication",
             RFC 5310, February 2009.









Authors' Addresses


   Mingui Zhang
   Huawei Technologies
   No.156 Beiqing Rd. Haidian District,
   Beijing 100095 P.R. China

   Email: zhangmingui@huawei.com

   Radia Perlman
   Individual Contributor

   Email: radiaperlman@gmail.com

   Hongjun Zhai
   ZTE Corporation
   68 Zijinghua Road
   Nanjing 200012 China

   Phone: +86-25-52877345
   Email: zhai.hongjun@zte.com.cn

   Mukhtiar Shaikh
   Brocade

   Email: mshaikh@brocade.com

   Muhammad Durrani
   Brocade

   Email: mdurrani@brocade.com


















