EVPN Fast Reroute
draft-burdet-bess-evpn-fast-reroute-07

Document Type Active Internet-Draft (individual)
Authors Luc André Burdet , Patrice Brissette , Takuya Miyasaka , Jorge Rabadan
Last updated 2024-03-04
BESS Working Group                                       LA. Burdet, Ed.
Internet-Draft                                              P. Brissette
Intended status: Standards Track                                   Cisco
Expires: 5 September 2024                                    T. Miyasaka
                                                        KDDI Corporation
                                                              J. Rabadan
                                                                   Nokia
                                                            4 March 2024

                           EVPN Fast Reroute
                 draft-burdet-bess-evpn-fast-reroute-07

Abstract

   This document summarises EVPN convergence mechanisms and specifies
   procedures for EVPN networks to achieve fast and scale-independent
   convergence.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 September 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Burdet, et al.          Expires 5 September 2024                [Page 1]
Internet-Draft              EVPN Fast Reroute                 March 2024

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Requirements
   4.  Solution
     4.1.  Pre-selection of Backup Path
     4.2.  Failure Detection and Traffic Restoration
       4.2.1.  Simultaneous Failures in ES
       4.2.2.  Successive and Cascading Failures in ES
   5.  Redirect Labels: Forwarding Behaviors
     5.1.  Bypassing DF-Election Behavior
     5.2.  Terminal Disposition Behavior
   6.  Controlled Recovery Sequence
   7.  Transport Underlay
     7.1.  NVO Tunnels
       7.1.1.  Ignoring Local Bias Behavior
     7.2.  Segment Routing v6
       7.2.1.  End.DT2U.Reroute : End.DT2U with Fast Reroute
       7.2.2.  End.DX2.Reroute : End.DX2 with Fast Reroute
       7.2.3.  Conflicting Endpoint Behaviors
     7.3.  Inter-AS Option B
   8.  BGP Extensions
   9.  Security Considerations
   10. Acknowledgements
   11. IANA Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   EVPN convergence and failure recovery methods for different types of
   network failures are described in Section 17 of
   [I-D.ietf-bess-rfc7432bis].  Similarly for EVPN-VPWS, the end of
   Section 5 of [RFC8214] briefly describes an egress link protection
   mechanism.

   The fundamentals of EVPN convergence rely on a mass-withdraw
   technique of the Ethernet A-D per ES route to unresolve all the
   associated forwarding paths (Section 9.2.2 of
   [I-D.ietf-bess-rfc7432bis] 'Route Resolution').  The mass-withdraw
   grouping approach results in suitable EVPN convergence at lower
   scale, but is not sufficient to meet stricter convergence
   requirements, often sub-second.  Other control-plane enhancements
   such as route-prioritisation ([I-D.ietf-bess-rfc7432bis]) help
   further but still provide no guarantees.

   EVPN convergence using only control-plane approaches is constrained
   by BGP route propagation delays, route processing times in software,
   and hardware programming.  These steps are additionally often
   performed sequentially and linearly, given the potentially large
   scale of EVPN routes present in the control plane.

   This document presents a mechanism for fast reroute to minimise
   packet loss in the case of a link failure using EVPN redirect labels
   (ERLs) with special forwarding behaviors.  Multiple failures where
   loops may occur are addressed, as are cascading failures.  A
   mechanism for distributing redirect labels (ERLs) alongside EVPN
   service labels (ESLs) is shown.

   The main objective is to achieve fast convergence in EVPN networks
   without relying on control plane actions.  The procedures in this
   document apply to the following EVPN services: EVPN
   [I-D.ietf-bess-rfc7432bis], EVPN-VPWS [RFC8214], EVPN Inter-Subnet
   Forwarding [RFC9135] and EVPN IP-VRF-to-IP-VRF models as in
   Section 4.4 of [RFC9136].  All the EVPN Multi-Homing modes are
   included.

2.  Terminology

   Some of the terminology in this document is borrowed from [RFC8679]
   for consistency across fast reroute frameworks.
   The term 'label' when used in this document, especially when
   referring to ERL and ESL (below) indicates an MPLS label, a VNI
   (VXLAN Network Identifier) or a Segment Routing IPv6 SID, depending
   on the transport being used.

   CE:  Customer Edge device, e.g., a host, router, or switch.

   PE:  Provider Edge device.

   Ethernet Segment (ES):  A set of ethernet links connected to one or
      more PEs.

   Ethernet Segment Identifier (ESI):  A unique non-zero identifier that
      identifies an Ethernet segment.

   Egress link:  Specific Ethernet link connecting a given PE-CE, which
      forms part of an Ethernet Segment.

   Single-Active Redundancy Mode:  When only a single PE, among all the
      PEs attached to an Ethernet segment, is allowed to forward traffic
      to/from that Ethernet segment for a given VLAN, then the Ethernet
      segment is defined to be operating in Single-Active redundancy
      mode.

   All-Active Redundancy Mode:  When all PEs attached to an Ethernet
      segment are allowed to forward known unicast traffic to/from that
      Ethernet segment for a given VLAN, then the Ethernet segment is
      defined to be operating in All-Active redundancy mode.

   Port-Active Redundancy Mode:  When only a single PE, among all the
      PEs attached to an Ethernet segment, is allowed to forward traffic
      to/from that Ethernet segment for the entire interface (all
      VLANs), then the Ethernet segment is defined to be operating in
      Port-Active redundancy mode.

   Single-Flow-Active Redundancy Mode:  When all PEs attached to an
      Ethernet segment are allowed to forward known unicast traffic to/
      from that Ethernet segment for a given VLAN, but only one does
      based on receiving a traffic flow from the access for that VLAN,
      then the Ethernet segment is defined to be operating in Single-
      Flow-Active redundancy mode.

   DF-Election:  Designated Forwarder election, as in
      [I-D.ietf-bess-rfc7432bis] and [RFC8584].

   DF:  Designated Forwarder.

   Backup-DF (BDF):  Backup-Designated Forwarder.

   Non-DF (NDF):  Non-Designated Forwarder.

   AC:  Attachment Circuit.

   ERL:  EVPN redirect label, as described in this document.

   ESL:  EVPN service label, as in [I-D.ietf-bess-rfc7432bis],
      [RFC8214], [RFC9135] and [RFC9136].

   FRR:  Fast Re-Route.

3.  Requirements

   1.   EVPN multihoming is often described with 2 peering PEs.  The
        solution MUST be generic enough to apply to multiple peering
        PEs, with no artificial limit imposed on the number of peering
        PEs.

   2.   The solution MUST apply to all EVPN load-balancing modes.

   3.   The solution MUST be robust enough to tolerate failures of the
        same ES at multiple PEs.  Simultaneous as well as cascading
        failures on the same ES must be addressed.

   4.   The solution MUST support EVPN [I-D.ietf-bess-rfc7432bis], EVPN-
        VPWS [RFC8214], EVPN Inter-Subnet Forwarding [RFC9135] and EVPN
        IP-VRF-to-IP-VRF models as in Section 4.4 of [RFC9136].

   5.   An implementation of this document SHOULD support one, or many,
        of the above-listed services.

   6.   The solution SHOULD meet stringent requirements for traffic loss
        of EVPN services.

   7.   The solution MUST allow redirected-traffic to bypass port
        blocking states resulting from DF-Election (BDF or NDF).

   8.   The solution MUST be scale-independent and agnostic of EVPN
        route types, scale or choice of underlay.

   9.   The solution MUST address egress link (PE-CE link) failures.

   10.  The solution MUST be loop-free, and once-redirected traffic MUST
        never be repeatedly redirected.

   11.  The solution MUST NOT rely on pushing an additional label onto
        the label stack, or on the definition of a special-purpose label
        (underlay-specific to MPLS).

4.  Solution

   Fast convergence in EVPN networks is achieved using a combined
   approach to minimising traffic loss:

   *  Local failure detection and restoration of traffic flows in
      minimal time using a pre-computed redirect path;

   *  Restoration of optimal traffic paths, and reconvergence of EVPN
      control plane with EVPN mass withdraw.

   The solution presented in this document addresses the local failure
   detection and restoration, without impeding or impacting existing
   EVPN control plane convergence mechanisms.

   Consider the following EVPN topology where PE1 and PE2 are
   multihoming PEs on a shared ES, ESI1.  EVPN (known unicast) or
   EVPN-VPWS traffic from CE1 to CE2 is sent to PE1 and PE2 using EVPN
   service labels ESL1 and/or ESL2 (depending on load-balancing mode of
   the ESI1 interfaces).

                                       +------+
                                       |  PE1 |
                                       |      |
                      +-------+        | ESL1---DF----
                      |       |--------|      |       \
                      |       |        | ERL1--------> \
           +-----+    |       |        +------+         \
           |     |    |IP/MPLS|                          \
    CE1 ---| PE3 |----|Core   |                     ESI1  === CE2
           |     |    |Network|                          /
           +-----+    |       |        +------+         /
                      |       |        | ERL2--------> /
                      |       |--------|      |       /
                      +-------+        | ESL2---BDF--X
                                       |      |
                                       |  PE2 |
                                       +------+

        Figure 1: EVPN Multihoming with service and redirect labels

   Alongside the service labels ESL1 and ESL2, two redirect labels ERL1
   and ERL2 are allocated with special forwarding behaviors, as detailed
   in Section 5.  Fast reroute and the use of the ERLs are shown in
   Section 4.2.

4.1.  Pre-selection of Backup Path

   EVPN DF-Election lends itself well to the selection of a pre-computed
   path amongst any given number of peering PEs by providing a
   DF-Elected and BDF-Elected node at the <EVI, ESI> granularity
   ([RFC8584] and [I-D.ietf-bess-rfc7432bis]).

   In All-active mode, all PEs in the Ethernet Segment are actively
   forwarding known unicast traffic to the CE.  For All-active services
   where DF-Election is not strictly required (EVPN-VPWS) the DF-
   Election algorithm is run to determine BDF-Elected PE for ERL
   selection purposes only, without impacting the service itself.

   In Single-active and Port-Active modes, only a single PE in the
   Ethernet Segment is actively forwarding known unicast traffic to the
   CE: the DF-Elected PE.  The BDF-Elected PE is next to be elected in
   the redundancy group and is already known.  In Single-flow-active
   mode ([I-D.ietf-bess-evpn-l2gw-proto]), only a single PE in the
   Ethernet Segment is actively forwarding known unicast traffic to the
   CE for a given flow: the PE which initially received that flow from
   the Ethernet Segment.  The backup PE is the multihoming peer in the
   redundancy group, referred to as "BDF" for consistency with other
   redundancy modes.

   For consistency across PEs and load-balancing modes, the backup path
   selected should be in order of {DF, BDF, NDF1, NDF2, ...}. The DF-
   Elected PE selects the next-best BDF-Elected as backup and all BDF-
   and NDF-Elected nodes select the best DF-Elected for the protection
   of their egress links.

   *  PE1 (DF) selects PE2 as BDF,

   *  PE1   (DF) uses the ERL2 label signaled by PE2 to redirect the
      traffic of its failed local AC connected to CE2,

   *  PE2   (BDF) uses the ERL1 label signaled by PE1 to redirect the
      traffic of its failed local AC connected to CE2,

   *  PE..n (NDF) use the ERL1 label signaled by PE1 to redirect the
      traffic of their failed local AC connected to CE2.

   The use of PE2's ERL2 as redirect label applies to local failures in
   all load-balancing modes at PE1.
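
   The selection rule above can be illustrated with a short sketch.
   This is a minimal Python model, not part of the specification: the
   function name, the list-based representation of the DF-Election
   result, and the peer name "PE4" are assumptions made for
   illustration only.

```python
from typing import Dict, List


def select_redirect_labels(election_order: List[str],
                           erl_by_pe: Dict[str, str]) -> Dict[str, str]:
    """Map each PE to the peer ERL it installs against its local AC.

    election_order is the DF-Election result for one <EVI, ESI>,
    ordered as [DF, BDF, NDF1, NDF2, ...] (hypothetical representation).
    erl_by_pe holds the redirect label signaled by each PE.
    """
    if len(election_order) < 2:
        return {}  # no peer is available to protect the egress link
    df, bdf = election_order[0], election_order[1]
    backup = {}
    for pe in election_order:
        # The DF is protected by the BDF; every other PE is protected by the DF.
        peer = bdf if pe == df else df
        backup[pe] = erl_by_pe[peer]
    return backup


if __name__ == "__main__":
    order = ["PE1", "PE2", "PE4"]  # DF, BDF, NDF
    erls = {"PE1": "ERL1", "PE2": "ERL2", "PE4": "ERL4"}
    print(select_redirect_labels(order, erls))
```

   In this model PE1 (DF) installs ERL2, while PE2 (BDF) and the NDF
   peer both install ERL1, matching the list above.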

   The number of peering PEs is not limited by existing DF-Election
   algorithms.  A solution based on DF-Election supports subsequent
   redirection upon multiple cascading failures, once a new DF-Election
   has occurred.  Pre-selection of a backup path is supported by all
   current DF-Election algorithms, and more generally by all algorithms
   supporting BDF-Election, as recommended in
   ([I-D.ietf-bess-rfc7432bis]).

4.2.  Failure Detection and Traffic Restoration

                                           +------+
                                           |  PE1 |
                                           |      |
                          +-------+        | ESL1-----XX..
                          |       |--------|      |   *   .
                          |       |        | ERL1 |  *     .
               +-----+    |       |        +------+ *       .
               |     |    |IP/MPLS|                *         .
        CE1 ---| PE3 |----|Core   |               *     ESI1  *** CE2
               |     |    |Network|              *           *
               +-----+    |       |        +----*-+         *
                          |       |        | ERL2* * * * * *
                          |       |--------|      |       /
                          +-------+        | ESL2---BDF--X
                                           |      |
                                           |  PE2 |
                                           +------+

                Figure 2: EVPN Multihoming failure scenario

   The procedures for forwarding known unicast packets received from a
   remote PE on the local redirect label follow Section 13.2.2 of
   [I-D.ietf-bess-rfc7432bis] for known unicast traffic.  Since the CE
   next-hop forwarding information reflects the current BDF state of the
   AC, additional steps to bypass the blocking state and to prevent
   another redirection are applied, as described further in this
   document.

   Consider the EVPN multihoming topology in Figure 1, and a traffic
   flow from CE1 to CE2 which is currently using EVPN service label ESL1
   and forwarded through the core arriving at PE1.  When the local AC
   representing the <EVI,ESI> pair is protected using the fast-reroute
   solution, the pre-computed backup path's redirect label (i.e.  ERL2
   from BDF-Elected PE2) is installed against the AC.

   Under normal conditions, PE1 disposition using ESL1 will result in
   forwarding the packet to the CE by selecting the local AC associated
   with the EVPN service label ([RFC8214], [I-D.ietf-bess-rfc7432bis]).
   When this local AC is in failed state, the fast-reroute solution at
   PE1 will begin rerouting packets using the BDF-Elected peer's nexthop
   and ERL2.  ERL2 is chosen for redirected traffic and not ESL2 to
   prevent loops and overcome DF-Election timing as described in
   Sections 5.2 and 5.1 respectively.

4.2.1.  Simultaneous Failures in ES

   In EVPN multihoming where the CE connects to peering PEs through link
   aggregation (LAG), a single LAG failure at the CE may manifest as
   multiple ES failures at all peering PEs simultaneously.

   As all peering PEs would enable the fast-reroute mechanism
   simultaneously, redirected traffic would bounce between them
   indefinitely (or until the TTL expires), causing a traffic storm.

   Once-redirected traffic may not be redirected again, according to the
   terminal nature of ERLs described in Section 5.2.
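
   The effect of a simultaneous failure can be sketched with a toy
   simulation.  This is an illustrative Python model under simplifying
   assumptions: DF-Election blocking is ignored, each PE has a single
   pre-selected backup peer, and the names (PE, deliver, ttl) are
   hypothetical rather than taken from this document.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class PE:
    name: str
    ac_up: bool  # state of the local egress link (AC) towards the CE
    peer: str    # pre-selected backup peer used for redirection


def deliver(pes: Dict[str, PE], first_pe: str,
            redirect_is_terminal: bool, ttl: int = 8) -> Tuple[str, int]:
    """Follow one packet that first arrives at first_pe on a service label.

    Returns (outcome, number_of_redirections).
    """
    current, redirected, hops = first_pe, False, 0
    while ttl > 0:
        pe = pes[current]
        if pe.ac_up:
            return "delivered", hops
        if redirected and redirect_is_terminal:
            # Terminal redirect label: once-redirected traffic is dropped.
            return "dropped", hops
        current, redirected = pe.peer, True
        hops += 1
        ttl -= 1
    return "ttl-expired", hops


if __name__ == "__main__":
    both_down = {"PE1": PE("PE1", False, "PE2"),
                 "PE2": PE("PE2", False, "PE1")}
    print(deliver(both_down, "PE1", redirect_is_terminal=False))
    print(deliver(both_down, "PE1", redirect_is_terminal=True))
```

   Without terminal behavior the packet bounces between the two PEs
   until the TTL is exhausted; with terminal behavior it is dropped
   after a single redirection.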

4.2.2.  Successive and Cascading Failures in ES

   Trying to support cascading failures by redirecting once-redirected
   traffic is substantially equivalent to simultaneous failures above.

   Once-redirected traffic may not be redirected again, according to the
   terminal nature of ERLs described in Section 5.2 and loss is to be
   expected until EVPN control plane reconverges for double-failure
   scenarios.

   In a scenario with 3 peering PEs (PE1-DF, PE2-BDF, PE3-NDF) where PE1
   fails, followed by a PE2 failure before control-plane reconvergence,
   there is no reroute of traffic towards PE3 because the reroute-label
   is terminal.

   In such rapid-succession failures, it is expected that control plane
   must first correct for the initial failure and DF-Elect PE2 as new-DF
   and PE3 as the new-BDF.  PE2 to PE3 redirection would then begin,
   unless control-plane is rapid enough to correct directly, and elect
   PE3 new-DF.

5.  Redirect Labels: Forwarding Behaviors

   Each EVPN redirect label MUST be downstream assigned and is directly
   associated with the <EVI,ESI> AC being egress protected.  The special
   forwarding characteristics and use of an EVPN redirect label (ERL)
   described below are a matter of local significance only to the
   advertising PE (which is also the disposition PE).

   The special behaviors of the ERLs do not affect any other PEs or
   transit P nodes.  No extra labels are appended to the label stack in
   the IP/MPLS network, and the ERL appears to label-switching transit
   nodes as would any other EVPN service label.  Since they appear as
   EVPN service labels, ERLs do not have any impact on Flow-Label or
   Control-Word procedures in [I-D.ietf-bess-rfc7432bis].

   *  Traffic redirection and use of reroute labels may create routing
      loops upon multiple failures.  Such loops are detrimental to the
      network and may cause congestion between protected PEs.

   *  Local restoration and redirection is meant to occur much faster
      than control-plane operations, meaning redirected packets may
      arrive at the BDF PE long before a DF-Election operation unblocks
      the egress link.

   Two special forwarding characteristics and behaviors of EVPN redirect
   labels are described below to mitigate these issues.

5.1.  Bypassing DF-Election Behavior

   Local detection and restoration at DF-Elected PE1 will begin rapidly
   redirecting traffic onto the backup path selected (PE2).
   Redirected packets will arrive at the Backup-DF port much faster than
   control plane DF-Election at the Backup-DF peer is capable of
   unblocking its local egress link for the shared ES (ESI1).  All
   redirected traffic would drop at Backup-DF and no net reduction in
   traffic loss is achieved.

   Traffic restoration remains dependent upon ES route or Ethernet A-D
   per ES/EVI route withdrawal for a DF-Election operation and for PE1
   to assume the traffic forwarding role.  This is especially important
   in single-active load-balancing mode where known unicast traffic is
   blocked.

   To mitigate this, the redirect labels allocated must carry a special
   attribute in the local forwarding and decapsulation chain: for
   traffic received on the ERL when the AC is up, an override to the
   DF-Election is applied and traffic from the ERL will bypass the local
   Backup-DF blocking state.  Once EVPN control plane reconverges,
   traffic from the ERL will cease and the optimal forwarding path based
   on ESLs will resume.

   The EVPN redirect label MUST carry a context locally, such that from
   disposition to egress, redirected packets are allowed to bypass the
   Backup-DF blocking state that would otherwise drop them.  Similarly,
   this may open the gate to the traffic in the reverse direction.
   In Port-Active mode, the Backup-DF interface may signal
   Out-of-Service (OOS) but remain in Up/Backup state: to support EVPN
   Fast Reroute, the CE must be able to receive traffic from an OOS LAG
   link.

5.2.  Terminal Disposition Behavior

   The reroute scheme is susceptible to loops and persistent redirects
   between peering PEs which have set up FRR redirection.  Consider the
   scenario where both CE-facing interfaces fail simultaneously: fast
   reroute will be activated at both PE1 and PE2, effectively bouncing a
   redirected packet between the two PEs indefinitely (or until the TTL
   expires) and causing a traffic storm.

   To prevent this, a distinction is made between 'regular' EVPN service
   labels for disposition (i.e. known unicast EVI label or EVPN-VPWS
   label) and reroute labels with terminal disposition.

   At the redirecting PE2, we consider the case of ESL2 vs. ERL2, where
   both are locally allocated and provided in EVPN routes (downstream
   allocation) to BGP peers:

   1.  EVPN Service label, ESL2:

       *  Regular MAC-lookup or traffic forwarding occurs towards the
          access AC.

       *  If the AC is up, traffic will exit the interface, subject to
          local blocking state on the AC from DF-Election.

       *  If the AC is down and fast-reroute procedures are enabled,
          traffic may be re-encapsulated using BDF peer's redirect label
          ERL1 (if received).

       *  In most implementations, MACs are flushed on PE2 upon AC
          failure.  When fast-reroute procedures are enabled at PE2, it
          must maintain all MAC-CE2 programmed against the failed access
          AC for some time in order for the MAC-lookup to provide
          traffic continuity to the failed AC and the redirection above.

   2.  EVPN Reroute label, ERL2:

       *  Regular MAC-lookup or traffic forwarding occurs towards the
          access AC.

       *  If the AC is up, traffic will apply an override to DF-Election
          and bypass the local blocking state on the AC.

       *  If the AC is down, traffic is dropped.  Once-rerouted traffic
          must not be rerouted again.  Redirecting towards the peer's
          redirect label ERL1 is explicitly prevented.

   The ERL acts like a local cross-connect by providing a direct channel
   from disposition to the AC.  ERLs are terminal-disposition and
   prevent once-redirected packets from being redirected again.  With
   this forwarding attribute on ERLs, known only locally to the
   downstream-allocating PE, redirection is achieved without growing the
   label stack with another special purpose label.
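
   The two disposition behaviors can be summarised as a small decision
   function.  The following Python sketch is illustrative only: the
   names (Label, dispose) and the string results are assumptions, and
   MAC retention on the failed AC is not modelled.

```python
from enum import Enum


class Label(Enum):
    ESL = "service"   # regular EVPN service label
    ERL = "redirect"  # EVPN redirect label (terminal disposition)


def dispose(label: Label, ac_up: bool, df_blocked: bool,
            frr_enabled: bool = True, peer_erl_known: bool = True) -> str:
    """Return the action taken at the disposition PE for one packet."""
    if label is Label.ERL:
        # Redirect label: bypass DF-Election blocking, never redirect again.
        return "forward-to-ac" if ac_up else "drop"
    # Service label: subject to DF-Election blocking state on the AC.
    if ac_up:
        return "drop" if df_blocked else "forward-to-ac"
    # AC is down: redirect once, using the peer's redirect label.
    if frr_enabled and peer_erl_known:
        return "redirect-to-peer-erl"
    return "drop"


if __name__ == "__main__":
    # Backup-DF receiving redirected traffic before DF-Election converges.
    print(dispose(Label.ERL, ac_up=True, df_blocked=True))
    # The same PE receiving service-label traffic is still blocked.
    print(dispose(Label.ESL, ac_up=True, df_blocked=True))
```

   Keeping the distinction in the label itself means the decision needs
   no extra state in the packet, which is consistent with the
   requirement not to grow the label stack.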

6.  Controlled Recovery Sequence

   Fast reroute mechanisms such as the one described in this document
   generally provide a way to preserve traffic flows at failure time.
   Use of fast reroute in EVPN, however, permits setting up a controlled
   recovery sequence to shorten the period of loss between an interface
   coming up and the EVPN DF-Election procedures and default timers for
   peer discovery.

   The benefit of a controlled recovery sequence is amplified when used
   in conjunction with [I-D.ietf-bess-evpn-fast-df-recovery]
   (synchronised DF-Election).

7.  Transport Underlay

   The solution is agnostic to transport underlays; for instance,
   similar behavior is carried forward for NVO tunnels (VXLAN) and SRv6.

7.1.  NVO Tunnels

   The rerouting procedures and behaviors in this document apply as well
   for [RFC8365] NVO tunnels.

   For MPLS-based NVO tunnels, i.e. MPLSoGRE, MPLSoUDP, etc., no
   additional behaviors are required.

   For non-MPLS NVO tunnels, the labels are 24-bit VNIs, not downstream
   assigned and usually global, i.e. same value for all the PEs attached
   to the BD.  In this case, the rerouting mechanisms described in this
   document would not work without some additional behaviors: the
   rerouting mechanism needs to avoid local-bias split-horizon filtering
   upon reception of the redirected packets.  For non-MPLS NVO tunnels,
   an additional identifier is advertised in Ethernet A-D per EVI routes
   to enable EVPN Fast Reroute.

7.1.1.  Ignoring Local Bias Behavior

   Non-MPLS NVO tunnel encapsulations may use local-bias procedures
   instead of ES label-based split-horizon (for EVPN multihoming).
   This means that, e.g. when PE1 sends redirected traffic to
   multihoming peer PE2 with the ERL VNI, PE2 will drop the packets due
   to the filtering based on the tunnel source IP.  To support non-MPLS
   NVO tunnels such as VXLAN, PE2 in the example above needs to bypass
   the source IP based filtering if the VNI identifies a local
   redirection instance.  The split-horizon filtering would be based on
   source-IP + FRR-VNI, as opposed to source-IP only.
   Since the VNI is global and not e.g. downstream-assigned, a VNI must
   be allocated per ES,EVI for the rerouting mechanisms described in
   this document to apply.
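
   The modified filter can be sketched as follows.  This Python
   fragment is a simplified illustration, not a definitive
   implementation: the function name, the set-based lookups, and the
   example addresses and VNI values are all assumptions.

```python
from typing import Set


def accept_towards_es(src_ip: str, vni: int,
                      es_peer_ips: Set[str], frr_vnis: Set[int]) -> bool:
    """Decide whether a decapsulated packet may be sent to the local ES.

    es_peer_ips: tunnel source IPs of multihoming peers on this ES.
    frr_vnis:    VNIs allocated locally as redirect (FRR) VNIs.
    """
    if src_ip not in es_peer_ips:
        return True  # not from a multihoming peer: local bias does not apply
    # From a multihoming peer: filter on source-IP + FRR-VNI, so only
    # traffic received on a redirect VNI bypasses the filter.
    return vni in frr_vnis


if __name__ == "__main__":
    peers = {"192.0.2.1"}  # hypothetical tunnel source address of PE1
    frr = {20001}          # hypothetical redirect VNI for one <ES, EVI>
    print(accept_towards_es("192.0.2.1", 10001, peers, frr))  # service VNI
    print(accept_towards_es("192.0.2.1", 20001, peers, frr))  # redirect VNI
```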

7.2.  Segment Routing v6

   Ethernet A-D per EVI routes are advertised along with the Service SID
   used for End.DX2 or End.DT2U behaviors (Section 6.1.2 of [RFC9252]).
   These advertisements correspond to the ESL behavior in this document
   (EVPN Service SID).  An additional EVPN Redirect SID is advertised in
   Ethernet A-D per EVI routes to enable EVPN Fast Reroute, with one of
   2 new SRv6 Endpoint Behaviors.  At the redirecting PE1, the
   EVPN Redirect SID is used to implement ERL behaviors described in
   Section 4.2.

7.2.1.  End.DT2U.Reroute : End.DT2U with Fast Reroute

   The "End.DT2U with Fast Reroute" behavior ("End.DT2U.Reroute" for
   short) is a variant of the End.DT2U behavior.

   The End.DT2U.Reroute behavior is defined for the fast-reroute
   application between two EVPN multi-homing peers, and extends the base
   End.DT2U behavior.  This behavior takes an optional Fast Reroute
   argument: "Arg.FR2".  This argument provides a local mapping to
   Attachment Circuit (EVI/ESI) for the received traffic, which also
   implements the forwarding behaviors in Section 5.

   Any SID instance of this behavior may be used in two ways:

   1.  by ingress PEs not performing any reroute (such as PE3 in
       Figure 1), by setting the Arg.FR2 argument to zero so that
       handling at the egress PE is the same as End.DT2U;

   2.  by peering PEs performing redirection (such as PE1 in Figure 2),
       by setting the argument Arg.FR2 to a non-zero value for the
       reroute handling in addition to the End.DT2U functionality.

   Thus, the SID entry for this behavior when instantiated in the FIB
   performs the disposition of both base L2 Table traffic (i.e., the
   base End.DT2U behavior) as well as rerouted traffic (i.e., the
   End.DT2U+Arg.FR2 handling).  End.DT2U processing is as in
   Section 4.11 of [RFC8986].

   When processing the Upper-Layer header of a packet matching a FIB
   entry locally instantiated as an End.DT2U.Reroute SID, N does the
   following:

   S01. If (Upper-Layer header type == 143(Ethernet) ) {
   S02.    Remove the outer IPv6 header with all its extension headers
   S03.    If (Arg.FR2 is 0) {
   S04.       Process as per Section 4.11 of [RFC8986]  (End.DT2U)
   S05.    } Else {
   S06.       Lookup the egress interface L2 OIF I for Arg.FR2
   S07.       If (L2 OIF interface I is down) {
   S08.          Drop the Ethernet frame
   S09.       } Else {
   S10.          Forward the Ethernet frame to the OIF I
                    bypassing any EVPN DF-Election blocking state
   S11.       }
   S12.    }
   S13. } Else {
   S14.    Process as per Section 4.1.1 of [RFC8986]
   S15. }
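
   The processing steps above can be sketched in Python as follows.
   This is only an illustrative model, not an implementation: the
   Interface type, the oif_by_arg mapping, and the returned action
   strings are hypothetical names introduced for this example.  The
   sketch also assumes that an Arg.FR2 value with no local mapping is
   dropped, which this document does not specify.

```python
from dataclasses import dataclass
from typing import Dict

ETHERNET_NEXT_HEADER = 143  # Upper-Layer header type for Ethernet


@dataclass
class Interface:
    """Hypothetical model of a local L2 outgoing interface (OIF)."""
    name: str
    up: bool


def process_reroute_sid(upper_layer_type: int, arg_fr2: int,
                        oif_by_arg: Dict[int, Interface]) -> str:
    """Return the action taken for a packet matching a Reroute SID."""
    if upper_layer_type != ETHERNET_NEXT_HEADER:
        # Not an Ethernet payload: Section 4.1.1 of RFC 8986 applies.
        return "process-per-rfc8986-4.1.1"
    # The outer IPv6 header and its extension headers are removed here.
    if arg_fr2 == 0:
        # No reroute requested: behave as the base End.DT2U/End.DX2.
        return "base-behavior"
    oif = oif_by_arg.get(arg_fr2)
    if oif is None or not oif.up:
        # Assumption: an unmapped Arg.FR2 is treated like a down OIF.
        return "drop"
    # Forward on the OIF, bypassing any DF-Election blocking state.
    return f"forward:{oif.name}"
```

   Dropping the frame when the OIF is down avoids redirecting the
   frame a second time, since no further reroute is attempted.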

   To maintain backwards-compatibility, both End.DT2U.Reroute and
   End.DT2U Behavior SIDs MAY be advertised together whereby legacy
   receivers ignore the SRv6 SID of unknown behavior End.DT2U.Reroute.

   The SRv6 L2 Service TLV in this case will carry two SRv6 SID
   Information sub-TLVs:

   *  the first one with the base End.DT2U behavior and

   *  the second one with the End.DT2U.Reroute behavior variant.
      The second one will have a non-zero Arg length (AL) and convey
      Arg.FR2 embedded in the advertised SID

   When advertised alongside an End.DT2U EVPN Service SID, the
   End.DT2U.Reroute EVPN Reroute SID MUST be identical to the End.DT2U
   except for the inclusion of an Argument Arg.FR2.  Both SRv6 SIDs can
   use transposition since the function MUST be identical between the 2
   SIDs.  A receiver unable to validate the applicability of arguments
   for SRv6 Endpoint Behaviors that are unknown to it MUST ignore the
   End.DT2U.Reroute SID (Section 3.2.1 of [RFC9252]).

   Following is an example representation of the BGP Prefix-SID
   Attribute encoding in this case for a 16-bit argument Arg.FR2
   (0xaaaa):

   BGP Prefix SID Attr:
      SRv6 L2 Service TLV:
         SRv6 SID Information sub-TLV:
            SID: 2001:db8:b:1:fbd1::
               Behavior: End.DT2U
               SRv6 SID Structure sub-sub-TLV:
                  LBL: 48, LNL: 16, FL: 16, AL: 0, TPOS-L: 0, TPOS-O: 0
         SRv6 SID Information sub-TLV:
            SID: 2001:db8:b:1:fbd1:aaaa::
               Behavior: End.DT2U.Reroute
               SRv6 SID Structure sub-sub-TLV:
                  LBL: 48, LNL: 16, FL: 16, AL: 16, TPOS-L: 0, TPOS-O: 0

            Figure 3: EVPN Route Type 1 with dual End.DT2U SIDs
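
   As an illustration of how the two SIDs in Figure 3 relate, the
   following Python sketch composes a SID from its locator, function
   and argument parts using the lengths carried in the SRv6 SID
   Structure sub-sub-TLV.  The build_sid helper and its parameter names
   are hypothetical and introduced for this example only; transposition
   is not modelled (TPOS-L and TPOS-O are zero in the example).

```python
import ipaddress


def build_sid(locator: str, lbl: int, lnl: int,
              function: int, fl: int,
              arg: int = 0, al: int = 0) -> str:
    """Compose an SRv6 SID as locator | function | argument."""
    loc_len = lbl + lnl  # locator = block + node
    if loc_len + fl + al > 128:
        raise ValueError("SID structure exceeds 128 bits")
    loc = int(ipaddress.IPv6Address(locator))
    if loc & ((1 << (128 - loc_len)) - 1):
        raise ValueError("locator has bits set beyond LBL+LNL")
    if function >= (1 << fl) or arg >= (1 << al):
        raise ValueError("function or argument does not fit its length")
    func_shift = 128 - loc_len - fl
    arg_shift = func_shift - al
    sid = loc | (function << func_shift) | (arg << arg_shift)
    return str(ipaddress.IPv6Address(sid))


# Values taken from the example encoding above.
base_sid = build_sid("2001:db8:b:1::", 48, 16, 0xfbd1, 16)
reroute_sid = build_sid("2001:db8:b:1::", 48, 16, 0xfbd1, 16, 0xaaaa, 16)
```

   With these inputs, the two SIDs share the same locator and function
   bits and differ only in the 16 argument bits that carry Arg.FR2.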

   When both End.DT2U.Reroute and End.DT2U are advertised, the ingress
   PE not performing reroute MUST use the End.DT2U as the EVPN Service
   SID.

7.2.2.  End.DX2.Reroute : End.DX2 with Fast Reroute

   The "End.DX2 with Fast Reroute" behavior ("End.DX2.Reroute" for
   short) is a variant of the End.DX2 behavior.

   The text in this section mirrors that of Section 7.2.1
   (End.DT2U.Reroute) and is included for completeness' sake.

   The End.DX2.Reroute behavior is defined for the fast-reroute
   application between two EVPN multi-homing peers, and extends the base
   End.DX2 behavior.  This behavior takes an optional Fast Reroute
   argument: "Arg.FR2".  This argument provides a local mapping to the
   Attachment Circuit (EVI/ESI) for the received traffic, which also
   implements the forwarding behaviors in Section 5.

   Any SID instance of this behavior may be used in two ways:

   1.  by ingress PEs not performing any reroute (such as PE3 in
       Figure 1), by setting the Arg.FR2 argument to zero so that
       handling at the egress PE is the same as End.DX2

   2.  by peering PEs performing redirection (such as PE1 in Figure 2),
       by setting the Arg.FR2 argument to a non-zero value for the
       reroute handling in addition to the End.DX2 functionality

   Thus, the SID entry for this behavior, when instantiated in the FIB,
   performs the disposition of both base L2 Table traffic (i.e., the
   base End.DX2 behavior) and rerouted traffic (i.e., the
   End.DX2+Arg.FR2 handling).  End.DX2 processing is as in Section 4.9
   of [RFC8986].

   When processing the Upper-Layer header of a packet matching a FIB
   entry locally instantiated as an End.DX2.Reroute SID, N does the
   following:

   S01. If (Upper-Layer header type == 143(Ethernet) ) {
   S02.    Remove the outer IPv6 header with all its extension headers
   S03.    If (Arg.FR2 is 0) {
   S04.       Process as per Section 4.9 of [RFC8986]  (End.DX2)
   S05.    } Else {
   S06.       Lookup the egress interface L2 OIF I for Arg.FR2
   S07.       If (L2 OIF interface I is down) {
   S08.          Drop the Ethernet frame
   S09.       } Else {
   S10.          Forward the Ethernet frame to the OIF I
                    bypassing any EVPN DF-Election blocking state
   S11.       }
   S12.    }
   S13. } Else {
   S14.    Process as per Section 4.1.1 of [RFC8986]
   S15. }

   To maintain backwards-compatibility, both End.DX2.Reroute and End.DX2
   Behavior SIDs MAY be advertised together.  Receiving PEs SHOULD use
   the SRv6 SID from the first instance of the Sub-TLV only (Section 3.1
   of [RFC9252]), and ignore the SRv6 SID of unknown behavior
   End.DX2.Reroute (Section 3.2.1 of [RFC9252]).

   The SRv6 L2 Service TLV in this case will carry two SRv6 SID
   Information sub-TLVs:

   *  the first one with the base End.DX2 behavior and

   *  the second one with the End.DX2.Reroute behavior variant.
      The second one will have a non-zero Arg length (AL) and convey
      Arg.FR2 embedded in the advertised SID

   When advertised alongside an End.DX2 EVPN Service SID, the
   End.DX2.Reroute EVPN Reroute SID MUST be identical to the End.DX2
   except for the inclusion of an Argument Arg.FR2.  Both SRv6 SIDs can
   use transposition since the function MUST be identical between the 2
   SIDs.  A receiver unable to validate the applicability of arguments
   for SRv6 Endpoint Behaviors that are unknown to it MUST ignore the
   End.DX2.Reroute SID (Section 3.2.1 of [RFC9252]).

   Following is an example representation of the BGP Prefix-SID
   Attribute encoding in this case for a 16-bit argument Arg.FR2
   (0xaaaa):

   BGP Prefix SID Attr:
      SRv6 L2 Service TLV:
         SRv6 SID Information sub-TLV:
            SID: 2001:db8:b:1:fbd1::
               Behavior: End.DX2
               SRv6 SID Structure sub-sub-TLV:
                  LBL: 48, LNL: 16, FL: 16, AL: 0, TPOS-L: 0, TPOS-O: 0
         SRv6 SID Information sub-TLV:
            SID: 2001:db8:b:1:fbd1:aaaa::
               Behavior: End.DX2.Reroute
               SRv6 SID Structure sub-sub-TLV:
                  LBL: 48, LNL: 16, FL: 16, AL: 16, TPOS-L: 0, TPOS-O: 0

             Figure 4: EVPN Route Type 1 with dual End.DX2 SIDs

   When both End.DX2.Reroute and End.DX2 are advertised, the ingress PE
   not performing reroute MUST use the End.DX2 as the EVPN Service SID.

7.2.3.  Conflicting Endpoint Behaviors

   End.DT2U.Reroute and End.DX2.Reroute are variants of their respective
   base behaviours.  When two SIDs are advertised together in an
   Ethernet A-D per EVI route, the variant advertised MUST correspond
   to the advertised base behaviour.  In other words, advertisement of
   an End.DT2U.Reroute variant alongside an End.DX2 base is unusable
   and SHALL be discarded by receivers, and similarly an
   End.DX2.Reroute variant advertised alongside an End.DT2U base SHALL
   be discarded by receivers.
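
   A receiver-side consistency check for this rule could look like the
   following Python sketch.  The function name and the use of behavior
   names as strings are assumptions made for illustration; an actual
   implementation would compare the Endpoint Behavior codepoints
   carried in the SRv6 SID Information sub-TLVs.

```python
# Each reroute variant and the base behaviour it extends.
BASE_OF_VARIANT = {
    "End.DT2U.Reroute": "End.DT2U",
    "End.DX2.Reroute": "End.DX2",
}


def reroute_pair_is_consistent(base_behavior: str,
                               variant_behavior: str) -> bool:
    """True if the variant extends the advertised base behaviour."""
    return BASE_OF_VARIANT.get(variant_behavior) == base_behavior
```

   A pairing for which this check returns False corresponds to the
   unusable combinations described above.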

7.3.  Inter-AS Option B

   EVPN multi-homing peers in different ASes are rather the exception.
   In Inter-AS Option B or inter-domain scenarios, the ASBR/ABR and BGP
   route-reflectors with nexthop-self procedures are extended:

   *  Prior to this specification, the ABR/ASBR receives the Ethernet
      A-D per EVI route, programs a label swap operation and
      redistributes the route with a newly allocated label in the
      NLRI's label field.

   *  To implement the procedures in this document, the ABR/ASBR needs
      to allocate two downstream labels for each Ethernet A-D per EVI
      route: one for the NLRI's label (ESL) and another one for the ESI
      Label Extended Community label (ERL).  A label swap operation is
      programmed for both the ESL and the ERL.
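
   The extended ABR/ASBR procedure can be sketched in Python as
   follows.  The Asbr class, its label allocator, and the swap table
   layout are hypothetical and exist only to illustrate that two local
   labels are allocated and two swap entries are programmed per
   Ethernet A-D per EVI route; label values and the next-hop address
   are arbitrary examples.

```python
from itertools import count
from typing import Dict, Tuple


class Asbr:
    """Toy model of an ABR/ASBR applying nexthop-self to A-D routes."""

    def __init__(self, first_label: int = 100000) -> None:
        self._labels = count(first_label)
        # local label -> (original next hop, label received from it)
        self.swap: Dict[int, Tuple[str, int]] = {}

    def readvertise(self, next_hop: str, nlri_label: int,
                    esi_ec_label: int) -> Tuple[int, int]:
        """Allocate local labels and program both swap entries."""
        local_nlri = next(self._labels)
        local_esi_ec = next(self._labels)
        self.swap[local_nlri] = (next_hop, nlri_label)
        self.swap[local_esi_ec] = (next_hop, esi_ec_label)
        # The route is redistributed carrying these two local labels.
        return local_nlri, local_esi_ec


asbr = Asbr(first_label=1000)
new_nlri, new_esi_ec = asbr.readvertise("192.0.2.1", 24001, 24002)
```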

8.  BGP Extensions

   While this document describes a new behavior, there are no new BGP
   extensions required to advertise the redirect label(s) used for EVPN
   egress link protection.  The ESI Label Extended Community defined in
   Section 7.5 of [I-D.ietf-bess-rfc7432bis] may be advertised along
   with Ethernet A-D routes:

   *  When advertised with an Ethernet A-D per ES route, it enables
      split-horizon procedures for multihomed sites as described in
      Section 8.3 of [I-D.ietf-bess-rfc7432bis];

   *  When advertised with an Ethernet A-D per EVI route, it enables
      link protection and fast-reroute procedures for multihomed sites
      as described in this document.  The label value represents the
      per-<EVI,ESI> EVPN redirect label (ERL).  The Flags field SHOULD
      NOT be set and MUST be ignored.

   Prior to this document, advertising the ESI Label Extended Community
   along with an Ethernet A-D per EVI route (Ethernet Tag ID other than
   MAX-ET) was undefined, and presumably ignored.
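
   The two interpretations of the ESI Label Extended Community can be
   sketched in Python as follows.  The function name and the returned
   dictionary are assumptions for illustration only.  The sketch
   relies on Ethernet A-D per ES routes carrying the reserved Ethernet
   Tag ID MAX-ET (0xFFFFFFFF), so that any other value identifies a
   per EVI route.

```python
MAX_ET = 0xFFFFFFFF  # reserved Ethernet Tag ID of A-D per ES routes


def interpret_esi_label_ec(ethernet_tag: int, flags: int,
                           label: int) -> dict:
    """Classify the ESI Label Extended Community by A-D route type."""
    if ethernet_tag == MAX_ET:
        # Per ES route: label and flags drive split-horizon procedures.
        return {"use": "split-horizon", "label": label, "flags": flags}
    # Per EVI route: label is the redirect label; flags are ignored.
    return {"use": "reroute", "label": label, "flags": 0}
```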

   Remote PEs SHOULD NOT use ERLs as a substitute for ESLs in route
   resolution.  In particular, the ERL is not to be confused with the
   aliasing and backup path ESL described and used in Section 8.4 of
   [I-D.ietf-bess-rfc7432bis].

9.  Security Considerations

   The mechanisms in this document use the EVPN control plane as defined
   in [I-D.ietf-bess-rfc7432bis] and [RFC8214], and the security
   considerations described therein are equally applicable.  Reroute
   labels redistributed in the EVPN control plane are meant for
   consumption by the peering PE in the same ES.  They are, however,
   visible in the EVPN control plane to remote peers.  Care shall be
   taken when installing reroute labels, since their use may result in
   bypassing DF-Election procedures and lead to duplicate traffic at CEs
   if incorrectly installed.

10.  Acknowledgements

   The authors would like to thank Ketan Talaulikar for his review of
   the SRv6 procedures in this document.

11.  IANA Considerations

   This document introduces two new Endpoint behaviors and requests
   that IANA assign two new values and update the "SRv6 Endpoint
   Behaviors" subregistry under the top-level "Segment Routing" registry
   as follows:

            +-------+-----+-------------------+---------------+
            | Value | Hex | Endpoint Behavior | Reference     |
            +-------+-----+-------------------+---------------+
            | TBD   | TBD | End.DT2U.Reroute  | This document |
            +-------+-----+-------------------+---------------+
            | TBD   | TBD | End.DX2.Reroute   | This document |
            +-------+-----+-------------------+---------------+

                Table 1: SRv6 Endpoint Behaviors Subregistry

12.  References

12.1.  Normative References

   [I-D.ietf-bess-rfc7432bis]
              Sajassi, A., Burdet, L. A., Drake, J., and J. Rabadan,
              "BGP MPLS-Based Ethernet VPN", Work in Progress, Internet-
              Draft, draft-ietf-bess-rfc7432bis-06, 5 January 2023,
              <https://datatracker.ietf.org/doc/html/draft-ietf-bess-
              rfc7432bis-06>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8214]  Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
              Rabadan, "Virtual Private Wire Service Support in Ethernet
              VPN", RFC 8214, DOI 10.17487/RFC8214, August 2017,
              <https://www.rfc-editor.org/info/rfc8214>.

   [RFC8365]  Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
              Uttaro, J., and W. Henderickx, "A Network Virtualization
              Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
              DOI 10.17487/RFC8365, March 2018,
              <https://www.rfc-editor.org/info/rfc8365>.

   [RFC8584]  Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake,
              J., Nagaraj, K., and S. Sathappan, "Framework for Ethernet
              VPN Designated Forwarder Election Extensibility",
              RFC 8584, DOI 10.17487/RFC8584, April 2019,
              <https://www.rfc-editor.org/info/rfc8584>.

   [RFC8986]  Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer,
              D., Matsushima, S., and Z. Li, "Segment Routing over IPv6
              (SRv6) Network Programming", RFC 8986,
              DOI 10.17487/RFC8986, February 2021,
              <https://www.rfc-editor.org/info/rfc8986>.

12.2.  Informative References

   [I-D.ietf-bess-evpn-fast-df-recovery]
              Brissette, P., Sajassi, A., Burdet, L. A., Drake, J., and
              J. Rabadan, "Fast Recovery for EVPN Designated Forwarder
              Election", Work in Progress, Internet-Draft, draft-ietf-
              bess-evpn-fast-df-recovery-06, 24 August 2022,
              <https://datatracker.ietf.org/doc/html/draft-ietf-bess-
              evpn-fast-df-recovery-06>.

   [I-D.ietf-bess-evpn-l2gw-proto]
              Brissette, P., Sajassi, A., Burdet, L. A., and D. Voyer,
              "EVPN Multi-Homing Mechanism for Layer-2 Gateway
              Protocols", Work in Progress, Internet-Draft, draft-ietf-
              bess-evpn-l2gw-proto-02, 24 October 2022,
              <https://datatracker.ietf.org/doc/html/draft-ietf-bess-
              evpn-l2gw-proto-02>.

   [RFC8679]  Shen, Y., Jeganathan, M., Decraene, B., Gredler, H.,
              Michel, C., and H. Chen, "MPLS Egress Protection
              Framework", RFC 8679, DOI 10.17487/RFC8679, December 2019,
              <https://www.rfc-editor.org/info/rfc8679>.

   [RFC9135]  Sajassi, A., Salam, S., Thoria, S., Drake, J., and J.
              Rabadan, "Integrated Routing and Bridging in Ethernet VPN
              (EVPN)", RFC 9135, DOI 10.17487/RFC9135, October 2021,
              <https://www.rfc-editor.org/info/rfc9135>.

   [RFC9136]  Rabadan, J., Ed., Henderickx, W., Drake, J., Lin, W., and
              A. Sajassi, "IP Prefix Advertisement in Ethernet VPN
              (EVPN)", RFC 9136, DOI 10.17487/RFC9136, October 2021,
              <https://www.rfc-editor.org/info/rfc9136>.

   [RFC9252]  Dawra, G., Ed., Talaulikar, K., Ed., Raszuk, R., Decraene,
              B., Zhuang, S., and J. Rabadan, "BGP Overlay Services
              Based on Segment Routing over IPv6 (SRv6)", RFC 9252,
              DOI 10.17487/RFC9252, July 2022,
              <https://www.rfc-editor.org/info/rfc9252>.

Authors' Addresses

   Luc Andre Burdet (editor)
   Cisco
   Email: lburdet@cisco.com

   Patrice Brissette
   Cisco
   Email: pbrisset@cisco.com

   Takuya Miyasaka
   KDDI Corporation
   Email: ta-miyasaka@kddi.com

   Jorge Rabadan
   Nokia
   Email: jorge.rabadan@nokia.com
