BESS WG                                                          Y. Wang
Internet-Draft                                           ZTE Corporation
Intended status: Standards Track                            May 17, 2020
Expires: November 18, 2020


                    Reduction of EVPN C-MAC Overload
            draft-wang-bess-evpn-cmac-overload-reduction-00

Abstract

   When there are too many customer MACs (C-MACs), the RRs and/or ASBRs
   will be overloaded by the RT-2 routes advertised for these MACs
   according to [I-D.dawra-bess-srv6-services].  This issue can be
   solved simply by learning the remote C-MAC entries via data-plane
   MAC learning (as PBB VPLS has done since [RFC7041]) rather than
   receiving them from RT-2 routes.  This simplified solution works as
   well as PBB VPLS, but it loses many important features that are
   based on the ESI concept, because the ingress ESI cannot be learnt
   via data-plane MAC learning at the egress PE.  When data packets are
   forwarded according to these MAC entries, they cannot benefit from
   the EAD/EVI routes of [RFC7432], so the All-Active Redundancy mode
   for an ES cannot be supported.  As a result, the simplified solution
   does not work as well as PBB EVPN ([RFC7623]).

   This document proposes a new SRv6 function type and an extension to
   [I-D.dawra-bess-srv6-services] to achieve all-active mode ES
   redundancy on TPEs and to reduce the C-MAC load on RRs and ASBRs.
   With the help of these extensions, the new solution is expected to
   work even better than PBB EVPN.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 18, 2020.



Wang                    Expires November 18, 2020               [Page 1]


Internet-Draft            EVPN C-MAC Reduction                  May 2020


Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Control Plane . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Dataplane . . . . . . . . . . . . . . . . . . . . . . . . . .   5
     3.1.  PE1 Forward ARP Request to PE2/PE3  . . . . . . . . . . .   5
     3.2.  PE2/PE3's Dataplane MAC Learning  . . . . . . . . . . . .   5
     3.3.  PE2 Discard ARP Request to CE1  . . . . . . . . . . . . .   6
     3.4.  PE3 Forward ARP Reply to PE1/PE2  . . . . . . . . . . . . .   6
     3.5.  PE1 Forward ARP Reply to CE1  . . . . . . . . . . . . . . .   6
   4.  ESI Indicator Advertisement Optimization  . . . . . . . . . .   6
   5.  C-MAC Flush Notification Procedure  . . . . . . . . . . . . .   7
   6.  E-Tree Support Considerations . . . . . . . . . . . . . . . .   7
   7.  EVPN IRB Support Considerations . . . . . . . . . . . . . . .   7
   8.  Use End.ESI SID in MAC/IP Advertisement Routes  . . . . . . .   7
   9.  Security Considerations . . . . . . . . . . . . . . . . . . .   8
   10. IANA Considerations . . . . . . . . . . . . . . . . . . . . .   8
   11. References  . . . . . . . . . . . . . . . . . . . . . . . . .   8
     11.1.  Normative References . . . . . . . . . . . . . . . . . .   8
     11.2.  Informative References . . . . . . . . . . . . . . . . .   8
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   9

1.  Introduction

   In [I-D.dawra-bess-srv6-services], an extension to [RFC7432] is
   proposed for the SRv6 EVPN control plane.  In this control plane the
   C-MACs are advertised via RT-2 routes.  To solve the C-MAC overload
   problem for RRs and ASBRs, however, we have to return to PBB-like
   data-plane C-MAC learning procedures.

   This document introduces an "ESI Indicator" concept to the EVPN data
   plane.  An ESI can be recognized from its ESI-indicator.  An ESI
   may, however, have several ESI-indicators, one for each TPE,
   especially in the single-active mode of ES redundancy.

   We then introduce an SRv6 function named End.ESI to carry the
   ESI-indicator in the SRv6 data plane.  A SID with the End.ESI
   function is called an "ESI SID" in this document.  The ESI-indicator
   is the locator and function part of its ESI SID.  The argument part
   of the ESI SID is a Global Discriminating Value (GDV) for an EVI.
   The GDV works like the function part of an End.DT2U/DT2M SID.
   However, the GDV has a global meaning, like a global VNI or a PBB
   ISID, whereas the function part of an End.DT2U/DT2M SID is typically
   only a local discriminator on the egress PE.  The argument part of
   the ESI SID is called Arg.GED in this document, where GED
   abbreviates Global EVI Discriminator.

1.1.  Terminology

   Most of the terminology used in this document comes from [RFC7432]
   and [I-D.dawra-bess-srv6-services], except for the following:

   C-MAC: Customer MAC.  It is the same as the C-MAC of PBB EVPN, but
   there is no B-MAC in this document.

   ISID: A broadcast domain identifier in the PBB I-Component.

   LDV: Local Discriminating Value.  It is similar to the Local
   Discriminating Value of a type 3 ESI.

   GDV: Global Discriminating Value.  An identifier with global
   uniqueness.

   GED: Global EVI Discriminator, a GDV for an EVI.

   ESI Indicator: A global ID for an ESI.  Note that different PEs may
   assign different ESI-indicators to the same ESI, especially when the
   ES redundancy mode is single-active.

   GEI: Global ESI Indicator.  It is the same as the "ESI Indicator",
   with emphasis on its global uniqueness.

   EAD/EVI: An Ethernet A-D route per EVI.

   GEI/EVI: An EAD/EVI route with a Global ESI Indicator.

   Arg.GED: The argument part of a SID of the End.ESI function.

   RT-2: MAC/IP Advertisement Route.

   ESI/IP: An RT-2 route whose IP field in the NLRI is an ESI-indicator.

   MAC Entry: An entry in the EVPN MAC table in the data plane.

   ESI IP: An End.ESI SID with its argument part set to zero.

2.  Control Plane

   We assign a GED to an EVI instance EVI_1; the GED is an N-bit
   number.  We assign an ESI-indicator I1 to ESI1 on PE1, and an
   ESI-indicator I2 to ESI1 on PE2.  We refer to the bindings between
   ESI1 and its two ESI-indicators as ESI1_I1 and ESI1_I2 respectively.

                                    +----------+
                      PE1           |          |
                 +-------------+    |          |
                 | ESI1_I1     |    |          |         PE3
                /|             |----|          |   +-------------+
               / |             |    |   IP     |   |             |
          LAG /  +-------------+    | Backbone |   |     ESI2_I3 |---CE2
      CE1=====                      |   with   |   |             |
              \  +-------------+    |   EVPN   |---|             |
               \ |             |    |   RRs    |   +-------------+
                \|             |----|   and    |
                 | ESI1_I2     |    |   ASBRs  |
                 +-------------+    |          |
                      PE2           |          |
                                    +----------+

                   Figure 1: EVPN MAC Reduction Use Case

   We use IMET routes to build a broadcast-list.  The broadcast-list is
   used to forward BUM traffic.  Data-plane MAC learning on BUM traffic
   produces the first batch of C-MAC entries.  Subsequent C-MAC entries
   can be learnt from unicast and/or BUM traffic.  MAC/IP routes are
   deliberately not used for C-MACs here, so that the RRs and/or ASBRs
   are not overloaded by these C-MACs.

   The SRv6 SID in the IMET route is an End.DT2M SID with a zero
   argument length.  I1 and I2 are SRv6 SIDs of the End.ESI function,
   whose format is defined in Figure 2.  We use IGP protocols to
   advertise I1 and I2 to PE3 in the SRv6 underlay, so we do not use
   EAD/ES routes or EAD/EVI routes for SRv6 EVPN in this section.  If
   ESI1 is in single-active mode, I1 is different from I2; if ESI1 is
   in all-active mode, I1 is the same as I2.

       |       ESI-Indicator(128-N bits)     |        N bits           |
       +------------+------------+-----------+-------------------------+
       |    Block   |   Node     | ESI.LDV   |        Arg.GED          |
       +------------+------------+-----------+-------------------------+


                       Figure 2: End.ESI SID Format

   Note that an ESI-indicator is composed of the Locator and Function
   parts, and an ESI IP is a 128-bit SID with a zero argument.  The
   function part is a Local Discriminating Value (LDV) on that PE for
   the ESI.  The argument part is a Global EVI Discriminator (GED) that
   identifies the EVI of the data packet.  The argument part is also
   called Arg.GED in this document.
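
   The SID layout above can be illustrated with a minimal Python
   sketch.  It assumes N = 16 bits for Arg.GED (the document leaves N
   open) and uses a hypothetical ESI IP taken from the documentation
   prefix 2001:db8::/32.  The function names are illustrative only and
   are not defined by this document.

```python
import ipaddress

ARG_BITS = 16  # assumed value of N; the GED length is left flexible


def make_end_esi_sid(esi_ip: str, ged: int,
                     arg_bits: int = ARG_BITS) -> ipaddress.IPv6Address:
    """Place a GED into the (zero) argument part of an ESI IP."""
    base = int(ipaddress.IPv6Address(esi_ip))
    mask = (1 << arg_bits) - 1
    if base & mask:
        raise ValueError("ESI IP must have a zero argument part")
    if not 0 <= ged <= mask:
        raise ValueError("GED does not fit into Arg.GED")
    return ipaddress.IPv6Address(base | ged)


def split_end_esi_sid(sid: str, arg_bits: int = ARG_BITS):
    """Return (ESI IP, Arg.GED) of an End.ESI SID."""
    value = int(ipaddress.IPv6Address(sid))
    mask = (1 << arg_bits) - 1
    return ipaddress.IPv6Address(value & ~mask), value & mask


# Hypothetical ESI IP for I1 and a hypothetical GED of 0x2A for EVI_1.
sid = make_end_esi_sid("2001:db8:1:100::", ged=0x2A)
print(sid)  # 2001:db8:1:100::2a
print(split_end_esi_sid(str(sid)))
```

   Rejecting a GED that is wider than the argument part keeps the GED
   from overwriting the ESI.LDV bits of the ESI-indicator.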

3.  Dataplane

3.1.  PE1 Forward ARP Request to PE2/PE3

   When CE1 sends an ARP Request for CE2, PE1 receives the ARP Request
   from an AC of ESI1.  PE1 forwards the ARP Request according to the
   broadcast-list of the AC's EVI instance.  The broadcast-list is
   constructed from the IMET routes received from PE2/PE3.

   PE1 forwards the ARP Request to PE2/PE3 with the following SRv6 BE
   encapsulation: its underlay source IP (SIP) is the End.ESI SID on
   PE1 for ESI1, and its underlay destination IP (DIP) is the End.DT2M
   SID on PE2/PE3.  The locator and function part of the End.ESI SID is
   I1.  The argument part of the End.ESI SID is 0.  The SMAC of the ARP
   Request is M1, which is CE1's MAC address.

   Note that the underlay SIP will be the End.DT2U SID for single-homed
   ingress ACs.  Multi-homed ingress ACs with single-active behavior
   may not be assigned an ESI-indicator either.  In such situations,
   the underlay SIP will be the End.DT2U SID too.
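
   The choice of the underlay SIP described above can be sketched as
   follows.  This is only an illustration: the IngressAC class, the
   function name and all SID values are hypothetical, and the broadcast-
   list is modelled simply as a list of remote End.DT2M SIDs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IngressAC:
    """An attachment circuit on the ingress PE (hypothetical model)."""
    name: str
    esi_ip: Optional[str] = None  # End.ESI SID with zero argument, if any


def bum_encapsulations(ac: IngressAC, end_dt2u_sid: str,
                       broadcast_list: List[str]) -> List[Tuple[str, str]]:
    """Return one (underlay SIP, underlay DIP) pair per remote PE."""
    # ACs without an ESI-indicator fall back to the End.DT2U SID as SIP.
    sip = ac.esi_ip if ac.esi_ip is not None else end_dt2u_sid
    return [(sip, end_dt2m_sid) for end_dt2m_sid in broadcast_list]


# Hypothetical SIDs: PE1's End.DT2U SID and the End.DT2M SIDs of PE2/PE3.
pe1_dt2u = "2001:db8:1:1::"
remote_dt2m = ["2001:db8:2:2::", "2001:db8:3:2::"]
multi_homed = IngressAC("ac-to-ce1", esi_ip="2001:db8:1:100::")
single_homed = IngressAC("ac-single-homed")

print(bum_encapsulations(multi_homed, pe1_dt2u, remote_dt2m))
print(bum_encapsulations(single_homed, pe1_dt2u, remote_dt2m))
```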

3.2.  PE2/PE3's Dataplane MAC Learning

   When PE2/PE3 receive the ARP Request packet, they perform data-plane
   MAC learning independently.  Each learns that M1 is behind I1, which
   is determined from the underlay SIP of the ARP Request packet.

   Note that when PE2 learns that M1 is behind I1, it will also assume
   that M1 is behind its local AC whose ESI-indicator is also I1.  The
   local AC may have a higher priority than the ESI IP.

   After the data-plane MAC learning, the ARP Request packet is
   broadcast to the local ACs, behind one of which is CE2.
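
   A minimal sketch of this learning step is given below.  The table
   layout, the MacEntry type and the "prefer the local AC" policy are
   assumptions made for illustration; the document only states that the
   local AC may have a higher priority than the ESI IP.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class MacEntry(NamedTuple):
    esi_ip: str              # underlay SIP the C-MAC was learnt from
    local_ac: Optional[str]  # local AC sharing that ESI-indicator, if any


def learn_c_mac(mac_table: Dict[Tuple[str, str], MacEntry],
                evi: str, smac: str, underlay_sip: str,
                local_ac_by_esi_ip: Dict[str, str]) -> MacEntry:
    """Record that `smac` in `evi` is behind the SID found in the SIP."""
    entry = MacEntry(underlay_sip, local_ac_by_esi_ip.get(underlay_sip))
    mac_table[(evi, smac)] = entry
    return entry


def next_hop(entry: MacEntry) -> str:
    """Assumed policy: prefer the local AC over the remote ESI IP."""
    return entry.local_ac if entry.local_ac is not None else entry.esi_ip


I1 = "2001:db8:1:100::"   # hypothetical ESI IP of ESI1 (all-active mode)
M1 = "00:00:5e:00:53:01"  # CE1's MAC, from the documentation range

pe2_table: Dict[Tuple[str, str], MacEntry] = {}
pe3_table: Dict[Tuple[str, str], MacEntry] = {}
pe2_entry = learn_c_mac(pe2_table, "EVI_1", M1, I1, {I1: "ac-to-ce1"})
pe3_entry = learn_c_mac(pe3_table, "EVI_1", M1, I1, {})
print(next_hop(pe2_entry))  # ac-to-ce1
print(next_hop(pe3_entry))  # 2001:db8:1:100::
```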




3.3.  PE2 Discard ARP Request to CE1

   When ESI1 is in all-active mode and PE2 is about to forward the ARP
   Request to CE1, PE2 finds that the ESI indicator of the outgoing AC
   is also I1, so PE2 discards the packet to keep the ES loop-free.

   When ESI1 is in single-active mode, the outgoing AC may be in the
   blocking state; otherwise, its corresponding sub-interface on CE1 is
   responsible for dropping the packet instead.  So although the ESI
   indicator of the outgoing AC is not the same as I1, no loop arises
   in the Ethernet Segment.
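
   The filtering decision can be sketched as below, assuming that each
   local AC is described by its ESI IP (or None when it has no ESI-
   indicator) and that blocked ACs are known to the forwarder.  The
   function and AC names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional


def flood_targets(ingress_esi_ip: str,
                  local_acs: Dict[str, Optional[str]],
                  blocked: FrozenSet[str] = frozenset()) -> List[str]:
    """Return the local ACs that a received BUM packet is copied to.

    `local_acs` maps an AC name to its ESI IP (None if it has none).
    """
    targets = []
    for ac, esi_ip in local_acs.items():
        if ac in blocked:
            continue  # single-active mode: the AC is in blocking state
        if esi_ip is not None and esi_ip == ingress_esi_ip:
            continue  # same ESI indicator as the ingress AC: avoid a loop
        targets.append(ac)
    return targets


I1 = "2001:db8:1:100::"  # hypothetical ESI IP shared by PE1 and PE2
pe2_acs = {"ac-to-ce1": I1, "ac-other": None}
print(flood_targets(I1, pe2_acs))  # ['ac-other']
```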

3.4.  PE3 Forward ARP Reply to PE1/PE2

   When CE2 replies to CE1's ARP Request, PE3 forwards the ARP Reply
   according to the MAC entry for M1 learned previously as described
   above.

   PE3 forwards the ARP Reply to PE1 with the following SRv6 BE
   encapsulation: its underlay source IP is the End.ESI SID on PE3 for
   ESI2, and its underlay destination IP is the End.ESI SID on PE1 for
   ESI1, according to the MAC entry for M1.  The Arg.GED of the End.ESI
   SID in the DIP is the Global EVI Discriminator (GED) configured on
   PE3.  Note that the GED for the same EVI is configured with the same
   value on PE1/PE2/PE3.

   When ESI1 is in all-active mode, I1 is the same as I2, so we call
   both of them I21 instead.  Traffic to M1 is then load-balanced
   between PE1 and PE2 by the underlay network at PE3, because I21 is
   advertised by both PE1 and PE2 in the underlay IGP.
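
   The construction of the unicast encapsulation can be sketched as
   follows.  The MAC table here maps (EVI, C-MAC) to the learnt ESI IP;
   this layout, the function name and all addresses are hypothetical,
   and the GED is assumed to fit into the zero argument part of the
   learnt ESI IP.

```python
import ipaddress
from typing import Dict, Optional, Tuple


def unicast_encapsulation(mac_table: Dict[Tuple[str, str], str],
                          evi: str, dmac: str, ged: int,
                          local_esi_ip: str) -> Optional[Tuple[str, str]]:
    """Return (underlay SIP, underlay DIP) for a known unicast frame."""
    remote_esi_ip = mac_table.get((evi, dmac))
    if remote_esi_ip is None:
        return None  # unknown C-MAC: handled as BUM, not shown here
    # DIP = learnt ESI IP with the GED written into its argument part.
    dip = ipaddress.IPv6Address(
        int(ipaddress.IPv6Address(remote_esi_ip)) | ged)
    return local_esi_ip, str(dip)


M1 = "00:00:5e:00:53:01"
pe3_mac_table = {("EVI_1", M1): "2001:db8:1:100::"}  # M1 is behind I1
I3 = "2001:db8:3:100::"                              # ESI IP of ESI2 on PE3

print(unicast_encapsulation(pe3_mac_table, "EVI_1", M1, 0x2A, I3))
# ('2001:db8:3:100::', '2001:db8:1:100::2a')
```

   Because the DIP is derived only from the learnt ESI IP and the
   locally configured GED, PE3 needs no per-C-MAC route from the
   control plane to build it.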

3.5.  PE1 Forward ARP Reply to CE1

   When PE1 receives the SRv6-encapsulated ARP Reply packet from PE3,
   PE1 first matches the packet to the End.ESI SID of ESI1 by its DIP,
   and then matches the packet to the EVI instance EVI_1 by its
   Arg.GED.  PE1 does not discard the packet, because the egress ESI
   indicator I1 is not the same as the ingress ESI indicator I3 in the
   SIP of the packet.
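
   The two-step lookup on the egress PE can be sketched as below.  The
   sketch assumes a 16-bit Arg.GED and hypothetical lookup tables keyed
   by ESI IP and by GED; it is not a definitive implementation of the
   End.ESI behavior.

```python
import ipaddress
from typing import Dict, Optional, Tuple

ARG_BITS = 16  # assumed Arg.GED length
_MASK = (1 << ARG_BITS) - 1


def _esi_ip(sid: str) -> ipaddress.IPv6Address:
    """Clear the argument part of a SID, leaving the ESI IP."""
    return ipaddress.IPv6Address(int(ipaddress.IPv6Address(sid)) & ~_MASK)


def egress_lookup(sip: str, dip: str,
                  es_by_esi_ip: Dict[ipaddress.IPv6Address, str],
                  evi_by_ged: Dict[int, str]) -> Optional[Tuple[str, str]]:
    """Return (ES, EVI) for a received packet, or None to discard it."""
    egress = _esi_ip(dip)
    ged = int(ipaddress.IPv6Address(dip)) & _MASK
    if egress not in es_by_esi_ip or ged not in evi_by_ged:
        return None  # DIP is not a local End.ESI SID, or unknown GED
    if _esi_ip(sip) == egress:
        return None  # ingress and egress ESI indicators match: loop
    return es_by_esi_ip[egress], evi_by_ged[ged]


# Hypothetical state on PE1.
pe1_es = {ipaddress.IPv6Address("2001:db8:1:100::"): "ESI1"}
pe1_evi = {0x2A: "EVI_1"}

print(egress_lookup("2001:db8:3:100::", "2001:db8:1:100::2a",
                    pe1_es, pe1_evi))  # ('ESI1', 'EVI_1')
```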

4.  ESI Indicator Advertisement Optimization

   Although we can advertise the End.ESI SID in underlay IGP protocols,
   it is better to use the SRv6 SID Structure Sub-Sub-TLV to indicate
   the length of the Arg.GED in the End.ESI SID.

   So we can use EAD/EVI routes to advertise the Global ESI Indicator
   (GEI); these EAD/EVI routes are called GEI/EVI routes in this
   document.  We can also use MAC/IP routes to advertise the GEI, as is
   done by PBB EVPN's B-MAC advertisement procedures as per [RFC7623].
   When a MAC/IP route is used to advertise a GEI, only the IP field in
   its NLRI is used to identify the GEI, so the MAC field in its NLRI
   can be set to zero.  Such a MAC/IP route is called an ESI/IP route
   in this document.  When a GEI/EVI route is used to advertise a GEI,
   the End.ESI SID is carried in its SRv6 L2 Service TLV, not in its
   next hop.

   Whether GEI/EVI routes or ESI/IP routes are used, they are
   advertised and imported for the Global Routing Table (GRT), so their
   Route Targets (RTs) are configured on the GRT.  This is because
   there is no dedicated B-component as in PBB VPLS and PBB EVPN.

   Although GEIs are imported into the GRT, only PE nodes are aware of
   them; the transit nodes in the underlay network are not aware of
   GEIs, which reduces FIB consumption.  We can use the argument length
   in the SRv6 SID Structure Sub-Sub-TLV to check whether the GED is
   too big for the End.ESI SID.  This avoids corrupting the function
   part of the End.ESI SID and allows a flexible GED length.
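
   The length check can be sketched as follows.  The SidStructure class
   is a hypothetical model that keeps only the four length fields of
   the SRv6 SID Structure Sub-Sub-TLV (in bits) and omits its
   transposition fields; the example lengths are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SidStructure:
    """Length fields (in bits) of a received SID structure."""
    block_len: int
    node_len: int
    func_len: int
    arg_len: int


def ged_fits(ged: int, structure: SidStructure) -> bool:
    """Check that a locally configured GED fits into Arg.GED."""
    total = (structure.block_len + structure.node_len
             + structure.func_len + structure.arg_len)
    if total > 128:
        return False  # malformed structure: longer than a 128-bit SID
    return 0 <= ged < (1 << structure.arg_len)


narrow = SidStructure(block_len=32, node_len=16, func_len=16, arg_len=16)
wide = SidStructure(block_len=32, node_len=16, func_len=16, arg_len=24)

print(ged_fits(0x2A, narrow))     # True
print(ged_fits(0x10000, narrow))  # False
print(ged_fits(0x10000, wide))    # True
```

   A PE that finds the GED too big can refrain from using that End.ESI
   SID instead of overwriting its function bits.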

5.  C-MAC Flush Notification Procedure

   The withdrawal of an ESI Indicator advertisement can be used as a
   C-MAC flush notification, as is done in [RFC8317] and
   [I-D.snr-bess-pbb-evpn-isid-cmacflush].
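
   One possible reaction to such a withdrawal is sketched below.  It
   assumes a MAC table that maps (EVI, C-MAC) to the ESI IP the entry
   was learnt from; the function name and table layout are
   hypothetical.

```python
from typing import Dict, Tuple


def flush_c_macs(mac_table: Dict[Tuple[str, str], str],
                 withdrawn_esi_ip: str) -> int:
    """Delete every C-MAC entry learnt behind the withdrawn ESI IP."""
    stale = [key for key, esi_ip in mac_table.items()
             if esi_ip == withdrawn_esi_ip]
    for key in stale:
        del mac_table[key]
    return len(stale)


table = {
    ("EVI_1", "00:00:5e:00:53:01"): "2001:db8:1:100::",
    ("EVI_1", "00:00:5e:00:53:02"): "2001:db8:1:100::",
    ("EVI_1", "00:00:5e:00:53:03"): "2001:db8:3:100::",
}
print(flush_c_macs(table, "2001:db8:1:100::"))  # 2
print(sorted(table))
```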

6.  E-Tree Support Considerations

   The E-Tree support extensions are similar to Section 5 of [RFC8317],
   except for the following notable differences: the B-MAC is replaced
   by GEIs, the PBB encapsulation is replaced by SRv6 encapsulation,
   and the B-component is replaced by the underlay GRT.  The B-MAC
   Advertisement route is replaced by the GEI/EVI route or the ESI/IP
   route.

7.  EVPN IRB Support Considerations

   PBB-VPLS/PBB-EVPN is not friendly to the IRB use case because of its
   complicated protocol stack, so it has so far been used only in pure
   L2VPN use cases in the industry.  The data plane in this document,
   however, is no more complex than that of typical SRv6 EVPN, so it is
   expected to work as efficiently in the SRv6 EVPN IRB use case.

8.  Use End.ESI SID in MAC/IP Advertisement Routes

   In [I-D.dawra-bess-srv6-services], the downstream-assigned ESI label
   is encoded in the Arg.FE2 part of the End.DT2M SID, and the ESI
   label is present as Arg.FE2 only when the egress PE is adjacent to
   the ingress ESI.  So it is difficult (if not impossible) to do
   data-plane C-MAC learning via the End.DT2M SID, because the presence
   of Arg.FE2 is not guaranteed.

   Although an upstream-assigned ESI label could be used to learn the
   ingress ESI-indicator on the egress PE node, other issues would
   arise with it.

   However, the End.ESI SID can also be used in the MAC/IP
   Advertisement route, provided that C-MAC overload is not a real
   threat.  By doing this, the data plane can be unified among these
   use cases.  The details of using the End.ESI SID in the MAC/IP
   Advertisement route will be described in future versions.

9.  Security Considerations

   This document does not introduce any new security considerations
   beyond those already discussed in [RFC7432] and [RFC7623].

10.  IANA Considerations

   This document has no IANA considerations.

11.  References

11.1.  Normative References

   [I-D.dawra-bess-srv6-services]
              Dawra, G., Filsfils, C., Brissette, P., Agrawal, S.,
              Leddy, J., Voyer, D., Bernier, D., Steinberg, D., Raszuk,
              R., Decraene, B., Matsushima, S., Zhuang, S., and J.
              Rabadan, "SRv6 BGP based Overlay services", draft-dawra-
              bess-srv6-services-02 (work in progress), July 2019.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015, <https://www.rfc-editor.org/info/rfc7432>.

   [RFC8317]  Sajassi, A., Ed., Salam, S., Drake, J., Uttaro, J.,
              Boutros, S., and J. Rabadan, "Ethernet-Tree (E-Tree)
              Support in Ethernet VPN (EVPN) and Provider Backbone
              Bridging EVPN (PBB-EVPN)", RFC 8317, DOI 10.17487/RFC8317,
              January 2018, <https://www.rfc-editor.org/info/rfc8317>.

11.2.  Informative References

   [I-D.snr-bess-pbb-evpn-isid-cmacflush]
              Rabadan, J., Sathappan, S., Nagaraj, K., Miyake, M., and
              T. Matsuda, "PBB-EVPN ISID-based CMAC-Flush", draft-snr-
              bess-pbb-evpn-isid-cmacflush-06 (work in progress), July
              2019.

   [RFC7041]  Balus, F., Ed., Sajassi, A., Ed., and N. Bitar, Ed.,
              "Extensions to the Virtual Private LAN Service (VPLS)
              Provider Edge (PE) Model for Provider Backbone Bridging",
              RFC 7041, DOI 10.17487/RFC7041, November 2013,
              <https://www.rfc-editor.org/info/rfc7041>.

   [RFC7623]  Sajassi, A., Ed., Salam, S., Bitar, N., Isaac, A., and W.
              Henderickx, "Provider Backbone Bridging Combined with
              Ethernet VPN (PBB-EVPN)", RFC 7623, DOI 10.17487/RFC7623,
              September 2015, <https://www.rfc-editor.org/info/rfc7623>.

Author's Address

   Yubao(Bob) Wang
   ZTE Corporation
   No. 50 Software Ave, Yuhuatai District
   Nanjing
   China

   Email: yubao.wang2008@hotmail.com