INTERNET-DRAFT                                              Mingui Zhang
Intended Status: Proposed Standard                                Huawei
Expires: August 2, 2012                                 January 30, 2012

      Reverse Path Forwarding Check under Multiple Topology TRILL
                draft-zhang-trill-multi-topo-rpfc-00.txt

Abstract

   Multi-homing (RBridge Aggregation) is a promising approach to
   increase the reliability and access bandwidth of TRILL edge. Active-
   active forwarding in multi-homing allows multiple RBridges forward
   data frames for VLAN-x on a LAN link, which creates the possibility
   that multicast frames from a specific ingress RBridge may arrive at
   multiple incoming ports of a remote RBridge. This violates the
   Reverse Path Forwarding Check and multicast frames arrives at
   unexpected incoming ports will be discarded by this RBridge. This
   document makes use of multiple topology TRILL to solve this problem.
   Multiple topology TRILL provides physical separation of traffic from
   different members of aggregation. Multicast frames from aggregation
   members comply with the Reverse Path Forwarding Check per topology.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.




Mingui Zhang             Expires August 2, 2012                 [Page 1]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1. Content . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.3. Acronyms  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2. RPFC Issue in Active-Active Multi-homing  . . . . . . . . . . .  5
   3. Multi-Topology for Aggregation  . . . . . . . . . . . . . . . .  6
     3.1. Multicast Ingressing  . . . . . . . . . . . . . . . . . . .  7
     3.2. Multicast Egressing . . . . . . . . . . . . . . . . . . . .  7
     3.3. Address Flip-Flop Avoidance by Asymmetric Topologies  . . .  7
     3.4. Tunneling Approach  . . . . . . . . . . . . . . . . . . . .  8
   4. Incremental Deployment  . . . . . . . . . . . . . . . . . . . .  9
     4.1. Intra-Topology Communication  . . . . . . . . . . . . . . .  9
     4.2. Inter-Topology Communication  . . . . . . . . . . . . . . . 10
     4.3. A Hybrid Scenario . . . . . . . . . . . . . . . . . . . . . 10
   5. Security Considerations . . . . . . . . . . . . . . . . . . . . 11
   6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 11
   7. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 11
   8. References  . . . . . . . . . . . . . . . . . . . . . . . . . . 11
     8.1. Normative References  . . . . . . . . . . . . . . . . . . . 11
     8.2. Informative References  . . . . . . . . . . . . . . . . . . 12
   Author's Addresses . . . . . . . . . . . . . . . . . . . . . . . . 13

















Mingui Zhang             Expires August 2, 2012                 [Page 2]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


1. Introduction

   With the link state routing of IS-IS (Intermediate System to
   Intermediate System), TRILL provides a solution of least cost
   forwarding of data frames to replace the Spanning Tree Protocol (STP)
   running in traditional bridge networks.

   RBridge Aggregation provides active-active multi-homing at the edge
   of TRILL [RBAgg]. It increases the access bandwidth and reliability
   of TRILL edge but creates the possibility that multiple RBridges
   ingress/egress data frames for end-stations from VLAN-x on a LAN
   link. A typical use of RBridges Aggregation is to represent a LAN
   link with a single virtual RBridge. RBridges participating the
   aggregation ingress/egress data frames on behalf of this virtual
   RBridge using a pseudonode nickname.

   Reverse Path Forwarding Check (RPFC) is used by TRILL to suppress
   forwarding loops of multicast frames. Based on a Distribution Tree
   (DT), a multicast frame from a specific ingress RBridge arrives at a
   single expected link of an RBridge. RBridges MUST drop multicast
   frames that fail the RPFC [RFC6325]. When multiple RBridges ingress
   multicast frames for end-stations from VLAN-x on a LAN link
   simultaneously, it can not guarantee that these frames always arrive
   at the expected link of a remote RBridge.

   Multiple Topology (MT) TRILL provides a physical separation of
   traffic [RFC5120] [MTc] [MTd]. An MT aware RBridge can participate
   data forwarding in multiple topologies at the same time. This feature
   is utilized in this document to resolve the issue that active-active
   multi-homing may fail RPFC. Each RBridge of the aggregation uses an
   individual topology to ingress/egress data frames for the target LAN
   link. Since distribution trees are calculated per topology by MT
   aware RBridges [MTd], multicast frames will be forwarded along these
   distribution tress separately, which helps the arriving multicast
   frames pass RPFC. To be backward compatible, the solution provided in
   this draft does not require all RBridges in a campus to upgrade to
   support multiple topology TRILL. Legacy RBridges that do not support
   multiple topology TRILL can inter-operate with the MT aware RBridges
   participating the RBridge Aggregation.

   This document focus on solving the RPFC issues caused by active-
   active multi-homing. Other issues of multi-homing, such as failure
   recovery and load balance, are in the scope of RBridge Aggregation
   [RBagg]. One advantage of the adoption of multiple topology TRILL is
   that approaches for failure recovery developed for multiple topology
   routing ([RFC5714]) can be reused in RBridge Aggregation without the
   reinvention of the wheel.




Mingui Zhang             Expires August 2, 2012                 [Page 3]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


1.1. Content

   Section 2 explains why active-active multi-homing may cause trouble
   in Reverse Path Forwarding Check of TRILL.

   Section 3 describes the approach of configuration for the edge
   RBridges to achieve RBridge Aggregation through multi-topology
   TRILL.

   Backward compatibility is an essential requirement for the inter-
   operation between legacy RBridges and RBridges participating in
   aggregation. Section 4 describes solutions for three incremental
   deployment scenarios.

1.2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.3. Acronyms

   IS-IS - Intermediate System to Intermediate System

   TRILL - TRansparent Interconnection of Lots of Links

   STP - Spanning Tree Protocol

   MT - Multiple Topology

   DT - Distribution Tree

   LAG - Link Aggregation

   RPFC - Reverse Path Forwarding Check
















Mingui Zhang             Expires August 2, 2012                 [Page 4]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


2. RPFC Issue in Active-Active Multi-homing

                             +-----+
          RBi                | RBi |(Remote RBridge)
         /   \               +-----+
       RB1   RB2                |
       /                   /\/\/\/\/\/\
     RBv                  /   Transit  \
                         <    RBridges  >
   Distribution Tree(DT)  \   Campus   /
                           \/\/\/\/\/\/
                            |        |
                         +-----+  +-----+
                         | RB1 |--| RB2 |(Aggregation Members)
                         +-----+  +-----+
                               \   /
                              *******
                              * RBv * (Virtual RBridge)
                              *******
                                ||(LAG)
                              +----+
                          +---| CE |---+
                          |   +----+   |
                          |            |
                          |[H1][H2]... |
                          +------------+
                              VLAN-x

        Figure 2.1: An Example Topology of RBridge Aggregation

   RBridge Aggregation is first proposed in [RBagg]. RBridge Aggregation
   enables active-active multi-homing for LAN links [RBagg]. Several
   RBridges can ingress/egress data frames for end-stations of one VLAN
   on a LAN link, which increases the access bandwidth and reliability
   of TRILL edge.

   Figure 2.1 shows an example topology of RBridge Aggregation and the
   distribution tree is shown on the left (Suppose the transit RBridge
   campus is null.). Based on the distributions tree, multicast frames
   from RBv to RBi is expected to be received at the port attaching to
   RB1.

   Under RBridge Aggregation, RB2 can really ingress native data frames
   from the LAG links, therefore multicast frames from RBv to RBi may
   legally be received at the port attaching to RB2. These frames will
   be discarded according to the rule of Reverse Path Forwarding Check
   [RFC6325]. Active-active forwarding of multicast frames is the root
   cause of this issue. The rest of this document will make use of



Mingui Zhang             Expires August 2, 2012                 [Page 5]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


   multiple topology TRILL to solve this problem.

3. Multi-Topology for Aggregation

   Documents [MTc] and [MTd] define the protocol extensions, data plane
   encoding and procedures to make use of the multiple topology routing
   supported by ISIS. Multiple topology routing provides physical
   traffic segregation to TRILL, which is utilized to solve the RPFC
   issue caused by RBridge Aggregation. RPFC will be done based on
   distribution trees which are calculated per topology abbreviation by
   MT aware RBridges.

   Topology IDs are used to identify the aggregation members. If the
   number of available topologies is greater than the number of
   aggregation members, several topology IDs can be assigned to one
   aggregation member which can make use of these topologies to realize
   load-balancing. If available topologies are less than aggregation
   members, some of these members get no topology ID. These standby
   aggregation members can make use of the tunneling approach defined in
   Section 3.3 to redirect arriving data frames to other members for
   forwarding.

           Table 3.1: A Sample Configuration for Aggregation

                     +------------+---------+-------+
                     |Aggregation |  RBv's  |  LAG  |
                     |  Members   |Nicknames|Members|
                     +------------+---------+-------+
                     |   RB1      |...001...|RB1-RBv|
                     +------------+---------+-------+
                     |   RB2      |...010...|RB2-RBv|
                     +------------+---------+-------+
                     |   RB3      |...011...|RB3-RBv|
                     +------------+---------+-------+
                     |   RB4      |...100...|RB4-RBv|
                     +------------+---------+-------+

   Since multiple topology TRILL identifies a topology using the ingress
   nickname [MTd], the topology assignment among aggregation members is
   embodied through the nickname configuration of RBv. Figure 3.1 shows
   a typical configuration of RBridge Aggregation with 4 members. Each
   aggregation member ingress native frames using one nickname of RBv.
   These frames will be confined to the topology as these nicknames
   indicate. For example, when RB1 ingress the native frames from the
   local link, it will use RBv001 as the ingress nickname and these
   frames will be forwarded in topology 1.

   The rest of this section discusses multicast forwarding. For the



Mingui Zhang             Expires August 2, 2012                 [Page 6]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


   detail of unicast forwarding, one may refer to [RBagg].

3.1. Multicast Ingressing

                    RBi                   RBi
                   /   \                 /   \
                 RB1   RB2             RB1   RB2
                 /                             \
              RBv001                          RBv010

             DT for Topology_1       DT for Topology_2

       Figure 3.1: Sample Distribution Trees for Topology 1 and 2

   LAG may use any of its links as the active link to send frames to a
   member of RBridge Aggregation. The receiver SHOULD encapsulate the
   native frames on behalf of RBv. Take Figure 3.1 as an example, RB1
   and RB2 encapsulates native frame using RBv001 and RBv010 as their
   ingress nicknames respectively. If these frames are multicast frames,
   they will be forwarded according to the distribution trees calculated
   per topology. Since RBi calculates two different distributions trees
   for RBv001 and RBv010, multicast frames arriving at the ports
   attached to RB1 and RB2 can all pass the RPFC.

3.2. Multicast Egressing

   Since distribution trees are built per topology, a multicast frame
   will be received by only one aggregation member. This member should
   egress the multicast frame to the local link on behalf of RBv. But
   remote RBridges is not aware that RBv actually does not exist. All
   aggregation members act as penultimate hops to RBv in the campus.

3.3. Address Flip-Flop Avoidance by Asymmetric Topologies

             +-------+------+------+      +-------+------+------+
             |VLAN ID|MacDA |Egress|      |VLAN ID|MacDA |Egress|
             +-------+----- +------+ ---> +-------+----- +------+
             |VLAN-x |Mac_H1|RBv001|      |VLAN-x |Mac_H1|RBv010|
             +-------+------+------+      +-------+------+------+

            Figure 3.2: An Example of MAC Address Flip-Flop

   In the above ingressing procedure, native data frames from one end
   station may be ingressed to the campus by different aggregation
   members. Current RBridges do not have the topology abbreviation as a
   separate column in their MAC tables. Therefore, when a remote RBridge
   receives multicast frames with the same source MAC address from
   different aggregation members, these multicast frames will create



Mingui Zhang             Expires August 2, 2012                 [Page 7]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


   only one entry in the MAC table of this remote RBridge.

   As illustrated in Figure 3.2, when frames originated from H1 is sent
   to RBi from RB1, RBi will learn that the egress RBridge nickname for
   Mac_H1 is RBv001. Afterwards, if RB2 sends frames originated from H1
   to RBi, the egress RBridge nickname will change to RBv010. It seems
   that the use of multiple topology TRILL brings a MAC address flip-
   flop issue. If RBv001 and RBv010 are regarded as two different egress
   RBridges and RBi prepares paths to them separately, it is possible
   RBi gets different forwarding paths. In other words, RBi will use
   different forwarding paths in different topologies for the data
   frames destined to the same end-station, which may cause packet
   disorder.

   However, MT aware RBridges support asymmetric use of topologies
   [MTd]. In the above example, RBi can send data frames to Mac_H1
   according to topology 1 even if it learns Mac_H1 from the data flow
   in topology 2. That is to say RBi can send return data frames to
   RBv001 all the time. In practical use, remote RBridges SHOULD adhere
   to a specific topology to send return data frames destined to a
   specific MAC address.

3.4. Tunneling Approach

   If available topologies are less than the aggregation members, there
   will be standby members who get no topology ID. These members can
   still ingress native frames from the LAG directly. But they should
   redirect them to other members through the following tunneling
   approach.

   Suppose RB5 is a standby member of the aggregation. So it is not a
   parent of RBv on the distribution tree of any topology. Assume RB5
   tunnels native frames from the LAG to RB1 which is the parent of RBv
   in topology 1. RB5 should ingress the native frame, fill its egress
   nickname as RB1 and fill its ingress nickname as the nickname of
   RBv001 which is used by RB1. Then RB5 sends this frame as a unicast
   frame to RB1. When RB1 receives this unicast frame, it can judge from
   its ingress nickname that this frame should be actually ingressed by
   RB1. Therefore, RB1 decapsulates this frame and re-capsulate it as if
   it is received from the LAG link RBv-RB1.

   For the sake of load-balancing and resilience, it is recommended that
   standby RBridges tunnel their multicast frames evenly among those
   aggregation members who get topology IDs. The optimization of the
   tunneling configuration is out the scope of this document. Tunneling
   approach can also be used for any other purpose such as fail-over.
   However, in this document, tunneling is used only for redirecting
   ingress multicast frames to pass through the RPFC in TRILL.



Mingui Zhang             Expires August 2, 2012                 [Page 8]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


4. Incremental Deployment

   When RBridge Aggregation is put to use in a TRILL campus, it is
   probably that MT unaware RBridges have already been deployed in this
   campus. It is therefore necessary to enable the inter-operation of
   these two types of RBridges. On one hand, MT aware aggregation
   members MUST be backward compatible to those legacy MT unaware
   RBridges. On the other hand, legacy RBridges need not make any change
   in order to communicate with aggregation members.

   With multi-topology TRILL, RBridge Aggregation can be incrementally
   deploy in an RBridge campus. This rest of this section provides
   approaches for three incremental deployment scenarios: (1)
   aggregation members need not to talk with MT unaware RBridges; (2)
   aggregation members need to communicate with MT unaware RBridges; (3)
   a combination of the above two scenarios.

4.1. Intra-Topology Communication

                       +------------------------+
                       |Topology_0    RBx       |
                       |                        |
                       +------------------------+
                       |Topology_1    RBi       |
                       |             /   \      |
                       |           RB1   RB2    |
                       |           /            |
                       |        RBv001          |
                       +------------------------+
                       |Topology_2    RBi       |
                       |             /   \      |
                       |           RB1   RB2    |
                       |                   \    |
                       |                  RBv010|
                       +------------------------+

    Figure 4.1: Aggregation Members Talk with MT aware RBridges Only

   If MT aware aggregated RBridges do not talk with MT unaware RBridges,
   aggregation traffic can be confined to non-zero topologies. This kind
   of traffic segregation is achieved through multi-topology routing. As
   illustrated in Figure 4.1, when RB1 and RB2 forward multicast frames
   to RBi according to distribution trees for topology 1 and topology 2
   respectively, the MT unaware RBx will not receive these frames from
   RBv. When RB1 and RB2 advertise LSPs in the base topology, they will
   not include their adjacencies to RBv001 and RBv010, therefore RBx
   will not be aware of RBv001 and RBv010. In particular, nickname
   RBv000 SHOULD be reserved and not used in aggregation configuration



Mingui Zhang             Expires August 2, 2012                 [Page 9]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


   so that even RBx can reach RBv000, they will not talk with each
   other.

4.2. Inter-Topology Communication

                       +------------------------+
                       |Topology_0              |
                       |              RBi       |
                       |             /   \      |
                       |           RB1   RB2    |
                       |           /       \    |
                       |        RBv001    RBv010|
                       +------------------------+

  Figure 4.2: MT aware RBridges Need to Talk with MT unaware RBridges

   If MT aware aggregated RBridges need to talk with MT unaware
   RBridges, the traffic segregation method in Section 4.1 can not be
   used again. Since MT unaware RBridges only understand the base
   topology, all aggregated RBridges advertise their connections to RBv
   in the base topology. Figure 4.2 illustrate this approach. Assume RBi
   is the MT unaware RBridge. RB1 and RB2 advertise their adjacencies to
   RBv, i.e., "RB1-RBv001" and "RB2-RBv010", in their LSPs. When RBi
   calculate the distribution tree, it should calculate as what is shown
   in Figure 4.2. RBv001 and RBv010 are regarded as two different
   RBridges by RBi.

   Hashing function is widely used in LAG for the purpose of load
   balancing. In a corner case, native data packets from one end-station
   may be mapped to any aggregated member. Similar as the example shown
   in Figure 3.3, the egress of Mac_H1 at RBi may change between RBv001
   and RBv010 back and forth. When MAC address flip-flop happens, the MT
   unaware RBi is unable to use asymmetric topologies to send return
   TRILL data frames back to aggregated members. TRILL data frames
   destined to RBv001 and RBv010 may go through two different forwarding
   paths. Although this kind of MAC flip-flop is rare in real TRILL
   campus, it is recommended that the hashing function is configured to
   completely avoid it. The configuration of LAG should guarantee that
   native data frames from one end-station are mapped to only one
   aggregated member (one active link in the LAG). Destination IP/MAC
   address fields of a native data frame SHOULD not be used as the input
   of hashing function.

4.3. A Hybrid Scenario

   It is allowable that some aggregated members report their connections
   to RBv in the base topology while others do not. For aggregated
   members which do not report the connections to RBv in the based



Mingui Zhang             Expires August 2, 2012                [Page 10]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


   topology, they need tunnel multicast frames to those members who
   report their connections to RBv in the based topology in order to
   communicate with MT unaware RBridges. For example, RB1 and RB2
   advertise their connections  to RBv001 and RBv010 in topology 1 and
   2, while RB0 advertises the adjacency to RBv000 in topology 0. Assume
   RBi is an MT unaware RBridge. The distribution tree calculated by RBi
   will include RBv000 while does not include RBv001 or RBv010. RB0 can
   talk with RBi directly on behalf of RBv000. When RB1 and RB2
   communicates with MT aware RBridges, they can confine the traffic in
   topology 1 and 2. If RB1 and RB2 need to send TRILL data frames to MT
   unaware RBrdiges, such as RBi, they should redirect these frames to
   RB0 using the tunneling approach described in Section 3.3. RB0 will
   send these frames with RBv000 as their ingress nickname.

5. Security Considerations

   This document raises no new security issues for IS-IS.

6. IANA Considerations

   No new registry is requested to be assigned by IANA.

7. Acknowledgements

   Discussions with authors and contributors of [Pseudo] and [CMT]
   provide a great help to the write up of this draft. This document is
   by no means to replace such kind of solutions used for RPFC relaxing.
   These solutions are designed for TRILL base topology and can be used
   in parallel in the same RBridge campus with the solution presented in
   this document.

8. References

8.1. Normative References

   [RBAgg]   M. Zhang, D. Eastlake, et al, "RBridge Aggregation", draft-
             zhang-trill-aggregation-01.txt, working in progress.

   [RFC6325] R. Perlman, D. Eastlake, et al, "RBridges: Base Protocol
             Specification", RFC 6325, July 2011.

   [MTc]     Vishwas Manral, D. Eastlake, et al, "Multiple Topology
             Routing Extensions for Transparent Interconnection of Lots
             of Links (TRILL)", draft-manral-isis-trill-multi-topo-
             03.txt, working in progress.

   [MTd]     D. Eastlake, M. Zhang, et al, "Multiple Topology TRILL",
             draft-eastlake-trill-rbridge-multi-topo-02.txt, working in



Mingui Zhang             Expires August 2, 2012                [Page 11]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


             progress.

8.2. Informative References

   [RFC5120] Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
             Topology (MT) Routing in Intermediate System to
             Intermediate Systems (IS-ISs)", RFC 5120, February 2008.

   [RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC
             5714, January 2010.

   [Pseudo]  H. Zhai, F. Hu, et atl, "RBridge: Pseudonode Nickname",
             draft-hu-trill-pseudonode-nickname-01.txt, working in
             progress.

   [CMT]     T. Senevirathne, J. Pathangi, et al, "Coordinated Multicast
             Trees (CMT)for TRILL", draft-tissa-trill-cmt-00.txt,
             working in progress.

































Mingui Zhang             Expires August 2, 2012                [Page 12]


INTERNET-DRAFT      RFPC under Multi-Topology TRILL     January 30, 2012


Author's Addresses


   Mingui Zhang
   Huawei Technologies Co.,Ltd
   Huawei Building, No.156 Beiqing Rd.
   Z-park ,Shi-Chuang-Ke-Ji-Shi-Fan-Yuan,Hai-Dian District,
   Beijing 100095 P.R. China

   Email: zhangmingui@huawei.com









































Mingui Zhang             Expires August 2, 2012                [Page 13]