BESS                                                              W. Lin
Internet-Draft                                                  Z. Zhang
Intended status: Standards Track                                J. Drake
Expires: October 5, 2015                          Juniper Networks, Inc.
                                                           April 3, 2015


                 EVPN Inter-subnet Multicast Forwarding
                  draft-lin-bess-evpn-irb-mcast-00.txt

Abstract

   This document describes inter-subnet multicast forwarding procedures
   for Ethernet VPNs (EVPN).

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 5, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect



Lin, et al.              Expires October 5, 2015                [Page 1]


Internet-Draft               evpn-irb-mcast                   April 2015


   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Solution  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.1.  IGMP/MLD Snooping Consideration . . . . . . . . . . . . .   5
     2.2.  Receiver sites not connected to a source subnet . . . . .   5
     2.3.  Receiver sites without IRB  . . . . . . . . . . . . . . .   5
     2.4.  Multi-homing Support  . . . . . . . . . . . . . . . . . .   6
   3.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   4.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   7
   5.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   7
     5.1.  Normative References  . . . . . . . . . . . . . . . . . .   7
     5.2.  Informative References  . . . . . . . . . . . . . . . . .   7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   7

1.  Introduction

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network.  When forwarding among hosts/VMs across different IP subnets
   is required, Integrated Routing and Bridging (IRB) can be used
   [I-D.ietf-bess-evpn-inter-subnet-forwarding].

   A Network Virtualization Edge (NVE) device supporting IRB is called
   an L3 Gateway.  In a centralized approach, a single gateway provides
   all L3 routing functionality, so even traffic between two Tenant
   Systems (TSs) on different subnets connected to the same NVE must go
   through the central gateway, which is inefficient.  In a distributed
   approach, each NVE (or most NVEs) has IRB configured, and
   inter-subnet traffic is routed locally without having to go through
   a central gateway.

   Inter-subnet multicast forwarding is more complicated and is not
   covered in [I-D.ietf-bess-evpn-inter-subnet-forwarding].  This
   document describes the procedures for inter-subnet multicast
   forwarding.

   For multicast traffic sourced from a TS in subnet 1, EVPN Broadcast,
   Unknown unicast, and Multicast (BUM) forwarding delivers it to all
   sites, and NVEs with IRB interfaces for the subnet associate the
   traffic with the corresponding IRB interfaces.  From the L3 point of
   view, those NVEs are routers connected to the virtual LAN via the
   IRB interfaces, and the source is locally attached.  Nothing differs
   from a traditional LAN, and regular IGMP/MLD/PIM procedures apply.

   If a TS is a multicast receiver, it uses IGMP/MLD to signal its
   interest in some multicast flows.  One of the gateways is the
   IGMP/MLD querier and sends queries out of its IRB interfaces; the
   queries are forwarded throughout the subnet following EVPN BUM
   procedures.  TSs send IGMP/MLD joins via multicast, which are also
   forwarded throughout the subnet following EVPN BUM procedures.  The
   gateways receive the joins via their IRB interfaces.  From the L3
   point of view, this is again no different from a traditional LAN.

   On a traditional LAN, only one router can send multicast to the LAN.
   That is either the PIM Designated Router (DR) or the IGMP/MLD
   querier (when PIM is not needed, e.g., when the LAN is a stub
   network).  On the source network, PIM is typically needed so that
   traffic can be delivered to other routers.  For example, in the case
   of PIM-SM, the DR on the source network encapsulates the initial
   packets of a particular flow in PIM Register messages and sends them
   to the Rendezvous Point (RP), triggering the necessary state for
   that flow to be built throughout the network.

   That also works in the EVPN scenario, although not efficiently.
   Consider the following example, where a tenant has two subnets
   (corresponding to two VLANs realized by two EVPN instances (EVIs))
   at three sites.  A multicast source is located at site 1 on
   VLAN/subnet 1, and three receivers are located at site 2 on
   VLAN/subnet 1 (rcvr1), site 2 on VLAN/subnet 2 (rcvr2), and site 1
   on VLAN/subnet 2 (rcvr3).  On subnet 1, NVE1 is the PIM DR, while on
   subnet 2, NVE3 is the PIM DR.  The connections drawn among the NVEs
   are L3 connections (typically via L3VPN).

   Multicast traffic from the source at site 1 on subnet 1 is forwarded
   to all three sites on VLAN 1 following EVPN procedures.  Rcvr1 gets
   the traffic when NVE2 sends it out of its local Attachment Circuit
   (AC).  The three gateways for EVI1 also receive the traffic on their
   IRB interfaces, to potentially route it to other subnets.  NVE3 is
   the DR on subnet 2, so it routes the local traffic (from the L3
   point of view) to subnet 2, while NVE1 and NVE2 are not the DR on
   subnet 2, so they do not.  Once traffic gets onto subnet 2, it is
   forwarded back to NVE1 and NVE2 and delivered to rcvr3 and rcvr2,
   respectively, following EVPN procedures.

   Notice that both NVE1 and NVE2 receive the multicast traffic from
   subnet 1 on their IRB interfaces for subnet 1, but they do not route
   it to subnet 2, where they are not the DR.  Instead, they wait to
   receive the traffic at L2 from NVE3.  For example, for receiver 3,
   which is connected to NVE1 but is on a different IP subnet from the
   multicast source, the traffic has to go from NVE1 to NVE3 and then
   back to NVE1 before it is delivered to receiver 3.  This is similar
   to the hair-pinning issue of the centralized approach for unicast
   (here, forwarding is centralized via the DR), even though the
   distributed approach is used for unicast (in that each NVE supports
   IRB and routes inter-subnet unicast traffic locally).


           site 1     .      site 2      .       site 3
                      .                  .
            src       .      rcvr1       .
             |        .        |         .
         --------------------------------------------  VLAN 1 (EVI1)
             |        .        |         .         |
         IRB1| DR     .    IRB1|         .     IRB1|
            NVE1------------NVE2-----------------NVE3---RP
         IRB2|        .    IRB2|         .     IRB2| DR
             |        .        |         .         |
         --------------------------------------------  VLAN 2 (EVI2)
             |        .        |         .
            rcvr3     .       rcvr2      .
                      .                  .
           site 1     .     site 2       .      site 3
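
   The detour described above can be illustrated with a small model.
   The following Python sketch is not part of any procedure; it only
   assumes the three-site topology of the figure, and all names in it
   (DR, RECEIVERS, nves_traversed) are hypothetical.  It lists the NVEs
   a packet visits when only the DR routes between subnets.

```python
# Illustrative model of the example topology; all names are hypothetical.
SOURCE_NVE = "NVE1"                  # src attaches to NVE1 on subnet 1
SOURCE_SUBNET = 1
DR = {1: "NVE1", 2: "NVE3"}          # PIM DR per subnet
RECEIVERS = {                        # receiver -> (attached NVE, subnet)
    "rcvr1": ("NVE2", 1),
    "rcvr2": ("NVE2", 2),
    "rcvr3": ("NVE1", 2),
}


def nves_traversed(receiver):
    """NVEs a packet visits when only the DR routes between subnets."""
    nve, subnet = RECEIVERS[receiver]
    path = [SOURCE_NVE]
    if subnet != SOURCE_SUBNET:
        # L2 delivery on the source subnet to the DR of the receiver
        # subnet, the only router allowed to route onto that subnet.
        router = DR[subnet]
        if router != path[-1]:
            path.append(router)
    # L2 delivery on the receiver subnet to the receiver's own NVE.
    if nve != path[-1]:
        path.append(nve)
    return path


for name in sorted(RECEIVERS):
    print(name, " -> ".join(nves_traversed(name)))
```

   For rcvr3 the model yields NVE1 -> NVE3 -> NVE1, which is the
   hair-pinning path discussed in the text.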

2.  Solution

   This multicast hair-pinning can be avoided if the following
   procedures are followed:

   o  On the IRB interfaces, each gateway forwards multicast traffic as
      long as there are receivers for the traffic, regardless of
      whether it is the DR.

   o  On the IRB interfaces, each gateway sends PIM joins towards the
      RP or the source if it has IGMP/MLD group membership, regardless
      of whether it is the DR/querier.

   o  Multicast data traffic sent out of the IRB interfaces is forwarded
      to local ACs only and not to other EVPN sites.

   Essentially, each router on an IRB interface behaves as a DR/querier
   for receivers (but only the true DR behaves as a DR for sources), and
   multicast data traffic from IRB interfaces is limited to local
   receivers.

   Note that link-local multicast traffic (e.g., addressed to 224.0.0.x
   in the case of IPv4), typically used by protocols, is not subject to
   the above procedures and is still forwarded to remote sites
   following EVPN procedures.
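
   The rules above can be sketched as a handful of predicates.  This is
   a minimal illustration under stated assumptions, not a normative
   algorithm: the function names and the returned strings are
   hypothetical, and only the IPv4 link-local block 224.0.0.0/24 is
   modeled.

```python
import ipaddress

# 224.0.0.0/24 covers the IPv4 link-local multicast groups 224.0.0.x.
LINK_LOCAL_V4 = ipaddress.ip_network("224.0.0.0/24")


def routes_out_of_irb(has_membership, is_dr):
    """A gateway routes a flow out of an IRB whenever receivers exist."""
    return has_membership          # is_dr is deliberately ignored


def sends_pim_join(has_membership, is_dr_or_querier):
    """PIM joins follow IGMP/MLD membership, not the DR/querier role."""
    return has_membership          # the role is deliberately ignored


def sends_register(is_dr_on_source_subnet):
    """Only the true DR behaves as a DR for sources."""
    return is_dr_on_source_subnet


def irb_egress_scope(group):
    """Where a packet sent out of an IRB interface is delivered."""
    if ipaddress.ip_address(group) in LINK_LOCAL_V4:
        return "all sites"         # normal EVPN forwarding applies
    return "local ACs only"


print(routes_out_of_irb(has_membership=True, is_dr=False))
print(irb_egress_scope("239.1.1.1"))
print(irb_egress_scope("224.0.0.13"))
```

   The unused role arguments are kept in the signatures on purpose, to
   make visible that the DR/querier role no longer gates forwarding or
   joins on the receiver side.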

   In the above example, when NVE1 gets traffic on its IRB1 interface,
   it routes the traffic out of its IRB2 and delivers it to the local
   rcvr3.  It also sends Register messages to the RP, since it is the
   DR on the source network.  Both NVE2 and NVE3 receive the traffic on
   IRB1, but neither sends Register messages to the RP, since they are
   not the DR on the source subnet.  NVE2 routes the traffic out of its
   IRB2 and delivers it to the local rcvr2.  NVE3 also routes the
   traffic out of IRB2 even though there is no receiver at the local
   site, because the IGMP/MLD joins from rcvr2/3 are also received by
   NVE3.
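
   The per-NVE outcome in this example can be tabulated with a short
   sketch.  The data below is transcribed from the example topology;
   the variable and function names are hypothetical and serve only as
   an illustration.

```python
SOURCE_SUBNET_DR = "NVE1"
# Joins from rcvr2 and rcvr3 are flooded on subnet 2, so every gateway
# learns the group membership on its IRB2 interface.
MEMBERSHIP_ON_IRB2 = {"NVE1": True, "NVE2": True, "NVE3": True}
LOCAL_RECEIVERS = {"NVE1": ["rcvr3"], "NVE2": ["rcvr2"], "NVE3": []}


def behavior(nve):
    """Summarize what one gateway does with traffic arriving on IRB1."""
    return {
        "routes_out_irb2": MEMBERSHIP_ON_IRB2[nve],
        "sends_register": nve == SOURCE_SUBNET_DR,
        "delivers_to": LOCAL_RECEIVERS[nve],
    }


for nve in ("NVE1", "NVE2", "NVE3"):
    print(nve, behavior(nve))
```

   Each receiver on subnet 2 is now served by its own NVE, and NVE3
   routes out of IRB2 without having anything to deliver locally.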

2.1.  IGMP/MLD Snooping Consideration

   In the above example, NVE3 receives IGMP/MLD joins from rcvr2/3 and
   will route packets out of IRB2, even though there are no receivers at
   the local site.  IGMP/MLD snooping on NVE3 can prevent the traffic
   from actually being sent out of ACs but at L3 there will still be
   related states and processing/forwarding (e.g., IRB2 will be in the
   downstream interface list for PIM join states and forwarding routes).

   To prevent NVE3 from learning of those remote receivers at all,
   IGMP/MLD snooping on NVE3 could optionally prevent joins received
   from remote sites from being passed to its IRB interface.  With
   that, in the above example NVE3 will not learn of rcvr2/3 on IRB2
   and will not try to route packets out of IRB2 at all.
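
   A minimal sketch of this optional filter follows.  It assumes that
   the snooping function can tell whether a join arrived on a local AC
   or from a remote site; the Join type and the function name are
   hypothetical, and the group address is an arbitrary example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Join:
    group: str
    receiver: str
    from_local_ac: bool   # False when the join came from a remote site


def joins_passed_to_irb(joins, suppress_remote=True):
    """Joins that IGMP/MLD snooping hands to the IRB interface."""
    if not suppress_remote:
        return list(joins)
    return [j for j in joins if j.from_local_ac]


# NVE3 sees the joins of rcvr2 and rcvr3 only via remote sites.
seen_by_nve3 = [Join("239.1.1.1", "rcvr2", False),
                Join("239.1.1.1", "rcvr3", False)]
print(joins_passed_to_irb(seen_by_nve3))
```

   With suppression enabled, nothing reaches IRB2 on NVE3, so no
   membership state is created there for the flow.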

2.2.  Receiver sites not connected to a source subnet

   In the above example, the source subnet is connected to all NVEs
   that have receiver sites, and there are no receivers outside the
   EVPN network.  As a result, PIM is not really needed and each NVE
   can just route multicast traffic locally.  In that case, the
   IGMP/MLD querier is responsible for sending traffic to a subnet.

   If there is a receiver subnet connected to an NVE that is not
   connected to the source subnet, then PIM must be running on the
   source subnet and among the NVEs so that the DR on the source subnet
   will route traffic to the receiver NVE over the tenant's L3 network,
   following the normal PIM procedures.  In this revision, it is assumed
   that the subnets realized by EVPN are stub only and not transit.
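
   The condition under which PIM is needed can be expressed as a simple
   check.  The sketch below is illustrative only; the function name,
   its parameters, and the extra NVE4 are assumptions made for the
   example and do not appear elsewhere in this document.

```python
def pim_required(source_subnet, nve_subnets, receiver_nves,
                 external_receivers=False):
    """Return True when local routing alone cannot reach all receivers.

    nve_subnets maps an NVE name to the set of subnets it connects to.
    """
    if external_receivers:
        return True
    return any(source_subnet not in nve_subnets[nve]
               for nve in receiver_nves)


# Example from the figure: every NVE with receivers is on subnet 1.
example = {"NVE1": {1, 2}, "NVE2": {1, 2}, "NVE3": {1, 2}}
print(pim_required(1, example, ["NVE1", "NVE2"]))

# Hypothetical NVE4 with receivers on subnet 2 only.
example["NVE4"] = {2}
print(pim_required(1, example, ["NVE1", "NVE4"]))
```

   The first call reports that PIM is not needed; the second reports
   that it is, because NVE4 is not connected to the source subnet.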

2.3.  Receiver sites without IRB

   It is possible that a particular NVE does not have an IRB interface
   for a particular L2 domain.  In that case, for traffic from another
   L2 domain, receivers need to receive it from another NVE following
   EVPN procedures.  The obvious choice is to receive it from the DR of
   that subnet.  Because an NVE does not deliver traffic out of IRBs to
   remote sites with IRB, the DR needs to use a separate provider
   tunnel to deliver traffic only to sites that do not have IRB
   interfaces.
   The tunnel can be advertised via a separate Multicast Ethernet Tag
   Route, and only the sites without IRBs will join that tunnel.

   Details for that route and procedure will be provided in future
   revisions.
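
   Pending those details, the intended membership of the separate
   tunnel can be sketched as follows.  This models only which NVEs
   would join the tunnel, not the route encoding or the signaling; all
   names are hypothetical, and the assumption that NVE2 lacks IRB
   interfaces is made purely for illustration.

```python
def tunnel_leaves(subnet, attached, irb_subnets):
    """NVEs attached to `subnet` that lack an IRB interface for it."""
    return sorted(n for n in attached
                  if subnet not in irb_subnets.get(n, set()))


def delivery_source(nve, subnet, irb_subnets, dr):
    """How receivers on `subnet` behind `nve` get routed traffic.

    irb_subnets maps an NVE name to the subnets it has an IRB for;
    dr maps a subnet to the NVE that is its PIM DR.
    """
    if subnet in irb_subnets.get(nve, set()):
        return "local IRB on " + nve
    return "separate tunnel from " + dr[subnet]


irb = {"NVE1": {1, 2}, "NVE3": {1, 2}}      # NVE2 has no IRB (assumed)
print(tunnel_leaves(2, ["NVE1", "NVE2", "NVE3"], irb))
print(delivery_source("NVE2", 2, irb, {2: "NVE3"}))
```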

2.4.  Multi-homing Support

   The solution works equally well in multi-homing situations.

   As shown in the diagram below, both rcvr4 and rcvr5 are multi-homed
   to NVE2 and NVE3.  Receiver 4 is on VLAN 1 and receiver 5 is on
   VLAN 2.  When the IRBs on NVE2 and NVE3 forward multicast traffic to
   their locally attached access interface(s) based on EVPN BUM
   procedures, only the Designated Forwarder (DF) for the Ethernet
   Segment (ES) delivers multicast traffic to its multi-homed receiver.
   Hence, no duplicate multicast traffic is forwarded to receiver 4 or
   receiver 5.

   If NVE2 does not have an IRB interface and becomes the DF on the
   multi-homed segments, then rcvr5 will not be able to receive traffic
   from a different L2 domain (rcvr4 will, because it is in the same L2
   domain as the source, so traffic does not have to go through IRB).
   To make sure that an NVE does not become a DF unless it has IRB
   configured, this document proposes the addition of a new TLV, the
   IRB PIM Capable TLV, to the Ethernet Segment route from NVEs with
   IRB configured for the corresponding segment.  NVEs without an IRB
   interface configured do not add the TLV to their routes for the
   segment.  The standard DF election procedure defined in Section 8.5
   of [RFC7432] is augmented to prefer ES routes with the IRB PIM
   Capable TLV, so that the elected DF always has IRB configured when
   possible.  If the elected DF does not have IRB configured, then it
   must use the procedure defined in Section 2.3 to pull traffic from a
   remote DR and deliver it to the local receivers.  In the multi-homing
   example given below, NVE2 does not have an IRB interface, so it does
   not include the IRB PIM Capable TLV in its ES routes.  NVE3 is
   therefore elected as the DF for both ESs.

                      .
            src       .        +-------- rcvr4-----+
             |        .        |         .         |
         --------------------------------------------  VLAN 1 (EVI1)
             |        .        |         .         |
         IRB1| DR     .        |         .     IRB1|
            NVE1------------NVE2-----------------NVE3---RP
         IRB2|        .        |         .     IRB2| DR
             |        .        |         .         |
         --------------------------------------------  VLAN 2 (EVI2)
             |        .        |         .         |
            rcvr3     .        +-------- rcvr5-----+
                      .
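
   The augmented DF election can be sketched as follows.  The default
   rule of Section 8.5 of [RFC7432] orders the candidate addresses
   numerically and selects the one whose ordinal equals (V mod N).
   How the preference for the IRB PIM Capable TLV is applied is not
   specified in this revision; the sketch assumes that candidates
   without the TLV are excluded whenever at least one candidate carries
   it.  The function name and the addresses are hypothetical.

```python
import ipaddress


def elect_df(candidates, ethernet_tag):
    """Pick the DF for one Ethernet Segment.

    candidates: (originator_ip, has_irb_pim_capable_tlv) pairs, one per
    ES route received for the segment, the local one included.
    """
    if not candidates:
        raise ValueError("no ES routes for this segment")
    capable = [ip for ip, has_tlv in candidates if has_tlv]
    # Assumed reading of "prefer": use only capable NVEs when any exist.
    pool = capable or [ip for ip, _ in candidates]
    # Section 8.5 of RFC 7432: order by IP address, then V mod N.
    ordered = sorted(pool, key=ipaddress.ip_address)
    return ordered[ethernet_tag % len(ordered)]


routes = [("192.0.2.2", False),    # NVE2, no IRB (address is assumed)
          ("192.0.2.3", True)]     # NVE3, IRB configured
for vlan in (1, 2):
    print(vlan, elect_df(routes, vlan))
```

   With these inputs the address standing for NVE3 is selected for both
   VLANs.  If neither route carried the TLV, the unmodified rule would
   select the address standing for NVE2 for VLAN 2.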



3.  Security Considerations

   This document does not introduce new security risks.

4.  Acknowledgements

5.  References

5.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7432]  Sajassi, A., Aggarwal, R., Bitar, N., Isaac, A., Uttaro,
              J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
              VPN", RFC 7432, February 2015.

5.2.  Informative References

   [I-D.ietf-bess-evpn-inter-subnet-forwarding]
              Sajassi, A., Salam, S., Thoria, S., Rekhter, Y., Drake,
              J., Yong, L., and L. Dunbar, "Integrated Routing and
              Bridging in EVPN", draft-ietf-bess-evpn-inter-subnet-
              forwarding-00 (work in progress), November 2014.

Authors' Addresses

   Wen Lin
   Juniper Networks, Inc.

   EMail: wlin@juniper.net


   Zhaohui Zhang
   Juniper Networks, Inc.

   EMail: zzhang@juniper.net


   John Drake
   Juniper Networks, Inc.

   EMail: jdrake@juniper.net