TRILL Working Group                                            Yizhou Li
INTERNET-DRAFT                                           Donald Eastlake
Intended Status: Informational                                Weiguo Hao
                                                     Huawei Technologies
Expires: January 13, 2014                                  July 12, 2013


         Problems of Active-Active connection at the TRILL Edge
          draft-yizhou-trill-active-active-connection-prob-00


Abstract

   The IETF TRILL (Transparent Interconnection of Lots of Links)
   protocol provides support for flow-level multipathing with rapid
   failover for both unicast and multi-destination traffic in networks
   with arbitrary topology and link technology between TRILL switches.
   Active-active at the TRILL edge is the extension, insofar as
   practical, of these characteristics to end stations that are
   multiply connected to a TRILL campus. This informational document
   discusses some of the high-level problems to be overcome in providing
   active-active at the TRILL edge.


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


Copyright and License Notice




Yizhou, et al                                                   [Page 1]


INTERNET DRAFT    Problems of Active-Active connection         July 2013


   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



Table of Contents

   1  Introduction
     1.1  Terminology
   2.  Target Scenario
   3. Problems in active-active connection at the edge
     3.1 Frame duplications
     3.2 Address flip-flop
     3.3 Packet drop due to RPF check
     3.4 Loops
     3.5 Member RBridges info synchronization
   4 Current Work
   5 Security Considerations
   6  IANA Considerations
   7  References
     7.1  Normative References
     7.2  Informative References
   Authors' Addresses


1  Introduction

   The IETF TRILL (Transparent Interconnection of Lots of Links)
   [RFC6325] protocol provides loop-free, per-hop multipath data
   forwarding with minimal configuration. TRILL uses IS-IS [RFC6165]
   [RFC6326bis] as its control plane routing protocol and defines a
   TRILL-specific header for user data. In a TRILL campus,
   communications between TRILL switches can

   (1) use multiple parallel links and/or paths,

   (2) load spread over different links and/or paths at a fine grained
   flow level through equal cost multipathing of unicast traffic and
   multiple distribution trees for multi-destination traffic, and

   (3) rapidly re-configure to accommodate link or node failures or
   additions.

   Active-active connection is the extension, to the extent practical,
   of similar load spreading and robustness to the connections between
   end stations and the TRILL campus. Such end stations may have
   multiple ports and will be connected, directly or via bridges, to
   multiple edge TRILL switches. It must be possible, except in some
   failure conditions, to load spread end station traffic at the flow
   level across links to such multiple edge TRILL switches and rapidly
   re-configure to accommodate topology changes.


1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The acronyms and terminology in [RFC6325] are used herein with the
   following additions:

      CE - customer equipment. Could be a bridge or end station.

      TRILL switch - an alternative term for an RBridge.



2.  Target Scenario

   The TRILL appointed forwarder [RFC6325] [RFC6327bis] [RFC6439]
   mechanism provides per-VLAN active-standby traffic spreading and loop
   avoidance at the same time. One and only one appointed RBridge can
   ingress/egress native frames into/from the TRILL campus for a given
   VLAN among all edge RBridges connecting a legacy network to the TRILL
   campus. This is true whether the legacy network is a simple
   point-to-point link, a complex bridged LAN, or anything in between.
   By carefully selecting different RBridges as appointed forwarder for
   different sets of VLANs, load spreading over different edge RBridges
   across different VLANs can be achieved.
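
   The per-VLAN spreading described above can be illustrated with a
   short sketch. It is not the appointment procedure of [RFC6439]; the
   round-robin policy and all function names are assumptions made for
   illustration only. The point it shows is that each VLAN maps to
   exactly one edge RBridge, so load is spread per VLAN, not per flow.

```python
def assign_appointed_forwarders(vlans, rbridges):
    """Map each VLAN to exactly one RBridge (hypothetical round-robin)."""
    ordered = sorted(rbridges)
    return {
        vlan: ordered[i % len(ordered)]
        for i, vlan in enumerate(sorted(vlans))
    }


def load_per_rbridge(assignment):
    """Count how many VLANs each RBridge is forwarder for."""
    counts = {}
    for rb in assignment.values():
        counts[rb] = counts.get(rb, 0) + 1
    return counts


# Six VLANs spread over three edge RBridges attached to the same link.
assignment = assign_appointed_forwarders(range(1, 7), ["RB1", "RB2", "RB3"])
print(assignment)
print(load_per_rbridge(assignment))
```

   In this example each RBridge is forwarder for two VLANs, yet all
   traffic of any single VLAN still enters and leaves through one
   RBridge.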


   This section presents a typical scenario of active-active connections
   to a TRILL campus via multiple edge RBridges where the current TRILL
   appointed forwarder mechanism is not applicable.

   The appointed forwarder mechanism [RFC6439] requires the edge
   RBridges to exchange TRILL IS-IS Hello packets through their access
   ports. As Figure 1 shows, when access links of multiple edge RBridges
   are bundled as an MC-LAG (Multi-Chassis Link Aggregation Group), a
   Hello sent by RB1 via its access port to CE1 will not be forwarded to
   RB2 by CE1. RB2 (and the other members of MC-LAG1) will not see that
   Hello from RB1. Every member RBridge of MC-LAG1 therefore considers
   itself the appointed forwarder on the MC-LAG1 link for all VLANs and
   will ingress/egress frames for all VLANs. Hence the appointed
   forwarder mechanism is not applicable in such an active-active
   scenario.

                ----------------------
               |                      |
               |   TRILL Campus       |
               |                      |
                ----------------------
                    |       |    |
               -----        |     --------
              |             |             |
          +------+      +------+      +------+
          |      |      |      |      |      |
          |(RB1) |      |(RB2) |      | (RBk)|
          +------+      +------+      +------+
            |..|          |..|          |..|
            |  +----+     |  |          |  |
            |   +---|-----|--|----------+  |
            | +-|---|-----+  +-----------+ |
   MC-      | | |   +------------------+ | |
   LAG1--->(| | |)                    (| | |) <---MC-LAG2
          +-------+    .  .  .       +-------+
          | CE1   |                  | CEn   |
          |       |                  |       |
          +-------+                  +-------+

          Figure 1: Active-active connections via MC-LAGs

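   The effect of the missing Hellos can be sketched as follows. This is
   a deliberately simplified model, not the election procedure of
   [RFC6439]: the priority values, the tie-break by name, and the rule
   that the elected RBridge keeps all VLANs for itself are assumptions
   made for illustration only.

```python
def believed_forwarders(rbridges, hellos_relayed):
    """Return the RBridges that believe they are appointed forwarder.

    rbridges: dict of name -> priority (higher wins, ties by name).
    hellos_relayed: True for a shared bridged LAN; False for an MC-LAG,
    where the CE does not relay Hellos between member links.
    """
    forwarders = set()
    for name in rbridges:
        # Without relayed Hellos, each RBridge only knows about itself.
        heard = set(rbridges) if hellos_relayed else {name}
        winner = max(heard, key=lambda rb: (rbridges[rb], rb))
        # Simplification: the winner forwards for all VLANs.
        if winner == name:
            forwarders.add(name)
    return forwarders


members = {"RB1": 64, "RB2": 64, "RB3": 32}
on_shared_lan = believed_forwarders(members, hellos_relayed=True)
on_mc_lag = believed_forwarders(members, hellos_relayed=False)
print(sorted(on_shared_lan))
print(sorted(on_mc_lag))
```

   On the shared LAN exactly one RBridge ends up forwarding, whereas on
   the MC-LAG every member considers itself the forwarder.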

   Although MC-LAG implementations vary by vendor, active-active
   connection is useful for meeting the following requirements.

   - Flow-based rather than VLAN-based load balancing.

   - Rapid failure recovery. The current appointed forwarder mechanism
   relies on Hello timer expiration to detect the unreachability of
   another edge RBridge connected to the same local Ethernet link.
   Re-appointing the forwarder for specific VLANs may then be required.
   Such procedures take time on the scale of seconds. Active-active
   connection should minimize frame loss and recovery time on failure.
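
   Flow-based load balancing can be sketched as below. How a CE hashes
   flows onto LAG member links is vendor-specific; the choice of hash,
   the fields that make up a flow, and the link names here are
   assumptions made for illustration only.

```python
import hashlib


def pick_member_link(flow, links):
    """Pick one LAG member link for a flow using a deterministic hash."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]


links = ["to-RB1", "to-RB2", "to-RB3"]

# Eight flows in the same VLAN: (source MAC, destination MAC, VLAN).
flows = [
    ("00:00:5e:00:53:01", "00:00:5e:00:53:%02x" % i, 10)
    for i in range(0x10, 0x18)
]
chosen = {flow: pick_member_link(flow, links) for flow in flows}
for flow, link in chosen.items():
    print(flow[1], "->", link)
```

   All frames of one flow hash to the same member link, which preserves
   ordering within the flow, while different flows of the same VLAN may
   use different edge RBridges.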


3. Problems in active-active connection at the edge

   This section presents the problems that need to be addressed in the
   active-active connection scenario.


3.1 Frame duplications

   When an MC-LAG spans multiple RBridges, a CE may receive duplicate
   copies of a frame. Two possible scenarios are presented below.

   1. Looping back: CE1 forwards a multi-destination frame from a user
   device. As shown in Figure 1, the frame enters the TRILL campus via a
   member of an MC-LAG (say RB1) and then is forwarded through the
   campus to another member (say RB2) of the same MC-LAG. Then CE1
   receives a duplicated copy from RB2.

   2. Duplication from remote: A remote RBridge sends a
   multi-destination frame of VLAN x. All members of MC-LAG1 will
   receive the frame. As each of them thinks it is the appointed
   forwarder for all VLANs, they would all forward the frame to CE1. As
   a consequence, CE1 receives multiple copies.

   Frame duplication only happens in multi-destination frame forwarding.
   Unicast does not have this issue.
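
   Both duplication scenarios can be reproduced with a small model. It
   assumes, for illustration only, that every member RBridge receives
   the multi-destination frame from the campus and that the RBridge
   that ingressed a frame does not send it back to the CE; the function
   name is hypothetical.

```python
def egress_copies(members, forwarders, ingress=None):
    """Return the members that egress one multi-destination frame to the CE.

    forwarders: members that believe they are appointed forwarder.
    ingress: the member that ingressed the frame from this CE, or None
    if the frame originated at a remote RBridge.
    """
    return [rb for rb in members if rb in forwarders and rb != ingress]


members = ["RB1", "RB2", "RB3"]
everyone = set(members)

looped_back = egress_copies(members, everyone, ingress="RB1")
from_remote = egress_copies(members, everyone)
single_forwarder = egress_copies(members, {"RB1"})

print("looping back:", looped_back)
print("duplication from remote:", from_remote)
print("one appointed forwarder:", single_forwarder)
```

   With a single appointed forwarder the CE receives one copy of a
   remote frame; when all members forward, it receives one copy per
   member, and its own frames come back from the other members.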

3.2 Address flip-flop

   Consider RB1 and RB2 each using its own nickname as the ingress
   nickname when ingressing data frames into a TRILL campus. As shown in
   Figure 1, CE1 may send data frames with the same source MAC address
   to any member RBridge of MC-LAG1. If an egress RBridge receives TRILL
   packets from different ingress RBridges but with the same source MAC
   address, it learns a different address correspondence from each data
   frame. The address correspondence may keep flip-flopping among the
   nicknames of the member RBridges of the MC-LAG for the same MAC
   address in the same VLAN. Some TRILL switches may behave badly under
   these circumstances and, for example, interpret this as a severe
   network problem. It may also cause the returning traffic to take
   different paths to reach the destination, resulting in persistent
   re-ordering of the frames.
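
   The flip-flop can be made concrete with a minimal model of data-plane
   learning at an egress RBridge. The table layout and the function name
   are assumptions made for illustration; real implementations differ.

```python
def count_flips(ingress_nicknames, vlan=10, mac="00:00:5e:00:53:01"):
    """Learn (VLAN, MAC) -> ingress nickname and count how often it changes."""
    table = {}
    flips = 0
    key = (vlan, mac)
    for nickname in ingress_nicknames:
        previous = table.get(key)
        if previous is not None and previous != nickname:
            flips += 1  # the correspondence moved to another nickname
        table[key] = nickname
    return flips, table.get(key)


# CE1 spreads frames from one source MAC over RB1 and RB2.
alternating = count_flips(["RB1", "RB2", "RB1", "RB2"])
# The same frames all ingressed under one nickname.
stable = count_flips(["RB1", "RB1", "RB1", "RB1"])
print(alternating)
print(stable)
```

   Four frames arriving alternately under two nicknames change the
   learnt correspondence three times; under one nickname it never
   changes.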



3.3 Packet drop due to RPF check

   In order to solve the problems above, a pseudonode nickname [TRILLPN]
   solution was proposed. The basic idea is to represent all member
   links of the MC-LAG as a virtual RBridge with a single pseudonode
   nickname. Any member RBridge of the MC-LAG should use this pseudonode
   nickname rather than its own nickname as the ingress nickname when
   injecting TRILL data frames. This addresses the problems described
   above; however, it introduces another issue: packet drop due to the
   RPF (Reverse Path Forwarding) check.

   When forwarding multi-destination frames, different member RBridges
   of an MC-LAG may choose the same distribution tree. An arbitrary
   RBridge RBn in the TRILL campus may then receive frames on a single
   tree from the pseudonode nickname on different incoming ports. The
   RPF check fails in this case and frames will be dropped.
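
   The RPF failure can be sketched with a table that, for each
   (tree, ingress nickname) pair, accepts frames on exactly one port.
   The table contents, port names, and the nickname "PN" are
   hypothetical values chosen for illustration.

```python
def rpf_accept(rpf_table, tree, ingress_nickname, arrival_port):
    """Accept a frame only on the one port expected for (tree, nickname)."""
    return rpf_table.get((tree, ingress_nickname)) == arrival_port


# RBn expects frames from pseudonode nickname "PN" on tree1 via port1.
rpf_table = {("tree1", "PN"): "port1"}

# RB1 and RB2 both ingress with "PN" on tree1; their frames reach RBn
# on different ports.
arrivals = [
    ("tree1", "PN", "port1"),  # frame ingressed by RB1
    ("tree1", "PN", "port2"),  # frame ingressed by RB2
]
results = [rpf_accept(rpf_table, *arrival) for arrival in arrivals]
print(results)
```

   The frame arriving on the unexpected port fails the check and is
   dropped, even though it was legitimately ingressed by a member of
   the MC-LAG.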

3.4 Loops

   Active-active connection does not introduce extra looping risk, as
   an MC-LAG behaves like a single link. In normal operation a frame
   will not be repeatedly ingressed into and egressed from the TRILL
   campus via a single MC-LAG link. However, any solution for the
   active-active connection scenario must ensure that the campus
   remains loop-free.


3.5 Member RBridges info synchronization

   When multiple edge RBridges are bundled as an MC-LAG to multi-home a
   CE to the TRILL campus, the RBridges need to be aware of the status
   of each link in the MC-LAG. Synchronization of information is
   therefore necessary.

   1. Member RBridges configuration synchronization: configuration
   parameters must be synchronized among the edge RBridges of an
   MC-LAG. Such configuration may include system ID, system priority,
   port key, port priority, partner information, etc. If [TRILLPN]
   and/or [CMT] is employed, there is more configuration to be
   synchronized, for instance, the pseudonode nickname of the virtual
   RBridge. Without a synchronization mechanism, each member RBridge
   has to be manually provisioned to guarantee consistency. In
   addition, some of the configuration may change dynamically during
   failure, for instance, the tree-id selected by member RBridges
   [CMT]. Manual consistency checking is not practical in this case.

   2. Member RBridges state synchronization: link failure or node
   failure on a member RBridge may introduce packet loss. Link failure
   includes both access port and trunk port link failure. When a failure
   occurs, the MC-LAG may need to invoke re-selection logic to spread
   the traffic across the remaining links/nodes. Fast failure detection
   and recovery, based on state synchronization, is therefore required.
   Some existing mechanism could be employed, for example, TRILL BFD
   support [TRILLBFD], which can detect trunk port and node failure.
   Access port/link failure, however, needs special care. An RBridge
   that has an access port/link failure should notify the other member
   RBridges with the port information so that they can adjust the
   corresponding MC-LAG.

   3. Member RBridges learnt MAC address synchronization: member
   RBridges are required to share the MAC address and egress nickname
   correspondences they have learnt. Such synchronization reduces
   flooding due to unknown unicast.
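
   Two of the synchronization tasks listed above can be sketched as
   follows. This is not a protocol definition: the parameter names, the
   dictionary layout, and both helper functions are assumptions made
   for illustration only.

```python
def find_inconsistencies(member_configs, shared_keys):
    """Return {key: {value: [members]}} for shared keys whose values differ."""
    problems = {}
    for key in shared_keys:
        by_value = {}
        for member, config in sorted(member_configs.items()):
            by_value.setdefault(config.get(key), []).append(member)
        if len(by_value) > 1:
            problems[key] = by_value
    return problems


def merge_learnt_macs(tables):
    """Union the (VLAN, MAC) -> egress nickname tables of all members."""
    merged = {}
    for table in tables:
        merged.update(table)
    return merged


configs = {
    "RB1": {"system_id": "00:00:5e:00:53:aa", "port_key": 10, "pn_nickname": 0x1234},
    "RB2": {"system_id": "00:00:5e:00:53:aa", "port_key": 11, "pn_nickname": 0x1234},
    "RB3": {"system_id": "00:00:5e:00:53:aa", "port_key": 10, "pn_nickname": 0x1234},
}
problems = find_inconsistencies(
    configs, ["system_id", "port_key", "pn_nickname"]
)
print(problems)

learnt_by_rb1 = {(10, "00:00:5e:00:53:01"): "RBx"}
learnt_by_rb2 = {(10, "00:00:5e:00:53:02"): "RBy"}
shared = merge_learnt_macs([learnt_by_rb1, learnt_by_rb2])
print(shared)
```

   Here the check reports that RB2 uses a different port key from RB1
   and RB3, and after merging each member knows both remote addresses,
   so neither needs to flood frames destined to them.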

   If some inter-chassis protocol is employed among member RBridges for
   MC-LAG member discovery, information synchronization and failure
   handling, it must be able to run smoothly over the TRILL campus. The
   protocol may use IP addresses to identify the other members, so such
   packets need to be correctly TRILL encapsulated.

   If no such inter-chassis protocol is available, TRILL has to provide
   its own mechanisms to support the information synchronization.



4 Current Work

   Some solution drafts have been presented in the TRILL WG. [TRILLPN],
   [CMT] and [TRILLBFD] address parts of the problems above.


5 Security Considerations

   This draft presents the problems in a particular scenario. It does
   not introduce any extra security risks. For general TRILL Security
   Considerations, see [RFC6325].


6  IANA Considerations

   No IANA action is required. RFC Editor: please delete this section
   before publication.


7  References

7.1  Normative References


   [RFC6165]  Banerjee, A. and D. Ward, "Extensions to IS-IS for Layer-2
              Systems", RFC 6165, April 2011.

   [RFC6325] Perlman, R., et al., "RBridge: Base Protocol Specification",
              RFC 6325, July 2011.

   [RFC6326bis] Eastlake, D., Banerjee, A., Dutt, D., Perlman, R., and
              A. Ghanwani, "TRILL Use of IS-IS", draft-eastlake-isis-
              rfc6326bis, work in progress.

   [RFC6327bis] Eastlake 3rd, D., R. Perlman, A. Ghanwani, H. Yang, and
              V. Manral, "TRILL: Adjacency", draft-ietf-trill-
              rfc6327bis, work in progress.

   [RFC6439] Eastlake, D., et al., "RBridge: Appointed Forwarder", RFC
              6439, November 2011.

7.2  Informative References


   [TRILLPN] Zhai, H., et al., "RBridge: Pseudonode Nickname",
              draft-hu-trill-pseudonode-nickname, work in progress,
              November 2011.

   [CMT] Senevirathne, T., Pathangi, J., and J. Hudson, "Coordinated
              Multicast Trees (CMT) for TRILL",
              draft-ietf-trill-cmt-01.txt, work in progress, November
              2012.

   [TRILLBFD] Manral, V., et al., "TRILL (Transparent Interconnection of
              Lots of Links): Bidirectional Forwarding Detection (BFD)
              Support", draft-ietf-trill-rbridge-bfd-07.txt, work in
              progress, July 2012.

   [8021AX] IEEE, "Link Aggregation", IEEE Std 802.1AX-2008, 2008.

   [8021Q] IEEE, "Media Access Control (MAC) Bridges and Virtual Bridged
              Local Area Networks", IEEE Std 802.1Q-2011, August 2011.


Authors' Addresses


   Yizhou Li
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Phone: +86-25-56625375
   EMail: liyizhou@huawei.com

   Donald Eastlake
   Huawei R&D USA
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   EMail: d3e3e3@gmail.com

   Weiguo Hao
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Phone: +86-25-56623144
   EMail: haoweiguo@huawei.com