Internet-Draft TPE-aided SPE-Protection January 2021
Wang Expires 29 July 2021
Workgroup:
BESS WG
Published:
Intended Status:
Standards Track
Expires:
Author:
Y. Wang
ZTE Corporation

TPE-aided SPE-Protection

Abstract

MPLS EVPN SPEs cannot make use of an anycast MPLS tunnel (whose egress LSRs are two of these SPEs) because the two SPEs will re-assign different EVPN labels for the same EVPN prefix. Statically configuring an EVPN label for each EVPN prefix would be complicated. At the same time, the TPEs should advertise specific signalling to do egress node (TPE) protection. This document specifies an egress node protection signalling from/among TPE nodes. On the basis of that signalling, a TPE (whether it is egress-protected or not) can help the SPEs to do egress protection.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 29 July 2021.

1. Introduction

In Section 2.5 and Section 4.4 of [I-D.wang-bess-evpn-egress-protection], an MPLS egress protection signalling is defined. Section 5.4 of [I-D.wang-bess-evpn-context-label] uses the same signalling to do egress protection for SPEs. This draft puts the two scenarios together and describes the unified signalling for the MPLS SPEs and TPEs.

Note that the "egress" in "egress protection" means the egress LSR of the underlay LSP, not the egress LSR of the overlay LSP. The SPEs are not the egress LSRs of the overlay LSP, but they are the egress LSRs of the underlay LSP. So the anycast tunnel for SPEs is also an egress protection tunnel for SPEs.

1.1. Terminology and Acronyms

This document uses the following acronyms and terms:

  • All-Active Redundancy Mode - When a device is multihomed to a group of two or more PEs and when all PEs in such redundancy group can forward traffic to/from the multihomed device or network for a given VLAN.
  • Backup egress router - Given an egress-protected tunnel and its egress router, this is another router that has connectivity with all or a subset of the destinations of the egress-protected services carried by the egress-protected tunnel.
  • SPE - Stitching PE, a PE that performs the label swapping operation for the EVPN labels. It is similar to the SPE of MS-PWs.
  • TPE - Target PE, a PE that performs EVPN forwarding for the overlay network.
  • BUM - Broadcast, Unknown unicast, and Multicast.
  • CE - Customer Edge equipment.
  • EELP bypass tunnel - Egress ESI Link Protection bypass tunnel - A tunnel used to reroute service packets upon an egress ESI link failure.
  • Egress failure - An egress node failure or an egress link failure.
  • Egress link failure - A failure of the egress link (e.g., PE-CE link, attachment circuit) of a service.
  • Egress loopback - the loopback interface on the Egress router, whose IP address is the destination of the Egress-protected tunnel.
  • Egress node failure - A failure of an egress router.
  • Egress router - A router at the egress endpoint of a tunnel. It hosts service instances for all the services carried by the tunnel and has connectivity with the destinations of the services.
  • ESI - Ethernet Segment Identifier - A unique non-reserved identifier that identifies an Ethernet segment.
  • OPE - Originating PE - the router that originates an EVPN route.
  • PE - Provider Edge equipment. Note that VTEPs/NVEs are also called PEs in this draft.
  • PLR - A router at the point of local repair. In egress node protection, it is the penultimate hop router on an egress-protected tunnel. In egress link protection, it is the egress router of the egress-protected tunnel.
  • Protector - A role acted by a router as an alternate of a protected egress router, to handle service packets in the event of an egress failure. A protector is physically independent of the egress router.
  • Protector loopback - the loopback interface on the Protector, whose IP address is the destination of the Egress-protected tunnel.
  • Single-Active Redundancy Mode - When a device or a network is multihomed to a group of two or more PEs and when only a single PE in such a redundancy group can forward traffic to/from the multihomed device or network for a given VLAN.
  • DF - Designated Forwarder.
  • NDF - non-DF, non Designated-Forwarder.
  • NDF-Bias - An exception for filtering bypassed BUM packets. It says that when an outgoing AC is an NDF on its ES, the bypass-BUM filter rules will not be applied for that AC.

2. Detailed Problem and Solution Requirement

2.1. Scenarios and Basic Settings

                    (PE3)           (PE1)
                ____SPE1____    ____TPE2____
               /            \  /            \
CE1---TPE1---PLR2           PLR1           CE2
               \____SPE2____/  \____TPE3____/
                                    (PE2)
Figure 1: TPE-aided SPE-Protection Scenario

The above figure is a combination of [I-D.wang-bess-evpn-egress-protection]'s Figure 1 and [I-D.wang-bess-evpn-context-label]'s Figure 6. The TPE1/SPE1/SPE2/TPE2 above are the TPE1/SPE1/SPE2/TPE2 of [I-D.wang-bess-evpn-context-label]'s Figure 6, but TPE2 is also the PE1 of [I-D.wang-bess-evpn-egress-protection]'s Figure 1, TPE3 is the PE2, and SPE1 is the PE3.

When TPE2 advertises an EVPN route (say R9), the same R9 will be advertised to both SPEs and to TPE3. When TPE3 receives R9, it will do EVPN egress protection. When SPE1 or SPE2 receives the same R9, it will advertise R9 to TPE1 with the same nexthop (the anycast tunnel address of SPE1 and SPE2) following Section 3.3.

The requirement here is therefore clear: TPE2 should use the same route attributes to advertise R9 to both the SPEs and the TPEs.

In addition, note that when the BUM tunnel (T1) from PE1 (TPE2) to PE2 (TPE3) travels through PLR1, and PLR1 reroutes these packets (destined to PE2) back to PE1 when PE2 fails, PE1 should drop these packets because their EVI labels are mirrored EVI labels (in a context-specific label space) but their ESI labels are not absent.

Note that the Leaf labels (along with mirrored EVI labels) should be distinguished from the ESI labels (along with mirrored EVI labels), because the former should not be dropped but the latter can be dropped. They can be distinguished by installing mirrored Leaf labels, while the mirrored ESI labels need not be installed.

2.1.1. Exception Case for Next Hop Validating

When TPE3 receives an EVPN route R0 whose nexthop matches the prefix LOC1, TPE3 may discard the route R0 because its nexthop is considered to be TPE3's own address. Even if TPE3 does not discard R0, TPE3 cannot use its nexthop to send an EVPN data packet to TPE2.

This is because a destination IP within prefix LOC1 (in the form of LOC1_P) will be considered to be addressed to TPE3 itself. So IP_N1 and IP_N2, instead of LOC1 and LOC2, should be used to establish the bypass path between TPE2 and TPE3.

3. Control Plane Processing

3.1. Downstream-CLS ID Extended Community

The downstream-CLS ID Extended Community is a new Transitive Opaque EC with the following structure (Sub-Type value to be assigned by IANA):

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0x03 or 0x43  |   Sub-Type    |M|    ID-Type                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         ID-Value                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: Downstream-CLS ID Extended Community
  • ID-Type: A 2-octet field that specifies the type of Label Space ID. In this section, the ID-Type is 0. ID-Type 0 indicates that the ID-Value field is an MPLS label in the DCB and that it is globally unique across the EVPN domain.
  • ID-Value: A 4-octet field that specifies the value of the Label Space ID. When it is a label (with ID-Type 0), the most significant 20 bits are set to the label value.
  • M bit: Multi-homing Flag. If the EVPN route is advertised by a TPE of a redundancy group, and the nexthop of that route is the TPE's anycast address, the multi-homing flag should be set to 1.

    If the EVPN route is advertised by an SPE of no redundancy group, and the nexthop of that route is not an anycast address, the multi-homing flag should be kept unchanged.

    If the EVPN route is advertised by an SPE of a redundancy group, and the nexthop of that route is the redundancy group's anycast address, the multi-homing flag should be rewritten to 1.

Note that although the downstream-CLS ID EC is highly similar in encoding to the Context Label Space ID Extended Community (see Section 3.1 of [I-D.ietf-bess-mvpn-evpn-aggregation-label]), the two behave quite differently in the data plane. The CLS-ID EC should be treated as an incoming label in the data plane, but the downstream-CLS ID EC should be treated as an outgoing label. So they cannot share the same code-point in the signalling procedures.
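The encoding in Figure 2 can be illustrated with a minimal Python sketch. It follows the figure's bit layout (the M bit is the most significant bit of the 2 octets it shares with ID-Type), which is an interpretation made for this example. The Sub-Type value 0xFF is a placeholder only, since the real value is to be assigned by IANA, and the function names are hypothetical.

```python
import struct

EC_TYPE_OPAQUE = 0x03         # type octet shown in Figure 2 (0x43 is the other listed value)
SUB_TYPE_PLACEHOLDER = 0xFF   # hypothetical; the real Sub-Type is to be assigned by IANA


def encode_downstream_cls_id(label: int, m_bit: bool, id_type: int = 0) -> bytes:
    """Pack the 8-octet EC. For ID-Type 0 the label fills the top 20 bits of ID-Value."""
    if not 0 <= label < (1 << 20):
        raise ValueError("an MPLS label must fit in 20 bits")
    if not 0 <= id_type < (1 << 15):
        raise ValueError("ID-Type must fit in the 15 bits next to the M bit")
    flags_and_type = (int(m_bit) << 15) | id_type
    id_value = label << 12
    return struct.pack("!BBHI", EC_TYPE_OPAQUE, SUB_TYPE_PLACEHOLDER,
                       flags_and_type, id_value)


def decode_downstream_cls_id(ec: bytes) -> dict:
    """Unpack the 8 octets; 'label' is only meaningful when id_type is 0."""
    ec_type, sub_type, flags_and_type, id_value = struct.unpack("!BBHI", ec)
    return {
        "type": ec_type,
        "sub_type": sub_type,
        "m_bit": bool(flags_and_type >> 15),
        "id_type": flags_and_type & 0x7FFF,
        "label": id_value >> 12,
    }
```

For example, decode_downstream_cls_id(encode_downstream_cls_id(1000, True)) gives back the label 1000 with the M bit set.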

3.2. TPE Procedures

First of all, we reserve a portion of the label space for assignment by a central authority. We refer to this reserved portion as the "Domain-wide Common Block" (DCB) of labels. This is analogous to the DCB that is described in Section 3.1. The DCB is taken from the same label space that is used for downstream-assigned labels, but each PE would know not to allocate local labels from that space. A PE would know by provisioning which label from the DCB corresponds to itself; each of the other labels in the DCB corresponds to another PE of the domain.

Note that the PEs don't have to know exactly which label corresponds to a specific PE. They just need to know which label is their own and that the other labels are not.
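The DCB rules above can be sketched in Python as follows. The class name, method names and label ranges are hypothetical and chosen only for illustration. The sketch shows a PE that never allocates local labels from the DCB and that can tell its own DCB label from the DCB labels of other PEs.

```python
class LabelAllocator:
    """Local label allocation that skips the Domain-wide Common Block (DCB)."""

    def __init__(self, dcb: range, own_dcb_label: int, local_range: range):
        if own_dcb_label not in dcb:
            raise ValueError("the PE's own label must be provisioned from the DCB")
        self.dcb = dcb
        self.own_dcb_label = own_dcb_label
        # Lazily walk the platform label range, leaving out every DCB label.
        self._free = (label for label in local_range if label not in dcb)

    def allocate_local(self) -> int:
        """Return the next free downstream-assigned label outside the DCB."""
        label = next(self._free, None)
        if label is None:
            raise RuntimeError("local label range exhausted")
        return label

    def classify(self, label: int) -> str:
        """A PE only needs to know 'mine' versus 'some other PE' for DCB labels."""
        if label == self.own_dcb_label:
            return "self"
        if label in self.dcb:
            return "other-pe"
        return "local"
```

With a DCB of range(1000, 1100) inside a local range of range(990, 1200), the allocator hands out 990 through 999 and then continues at 1100.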

The MPLS-specific procedures are defined in the following list:

[M1]
In [C3], when TPE2(PE1) advertises R4/R5/R6, a Downstream-CLS ID EC is advertised along with R4/R5/R6. This EC carries the label (in the DCB) that identifies TPE2(PE1) itself.
[M2]
In [C4], when TPE3(PE2) receives R4/R5/R6, it installs the mirrored ILM entry in a context-specific label space (say CLS23). CLS23 is identified by the Downstream-CLS ID EC (say CIL2) of R4/R5/R6. The mirrored ILM entry is called a CLS-specific ILM entry (CLS-ILM).
[M3]
In [C5], when SPE1(PE3) receives R4/R5/R6, it should impose the context-identifying label (CIL) carried in R4/R5/R6's Downstream-CLS ID EC onto the label stack following Section 4.2. That CIL is the outer label of the EVPN label of R4/R5/R6. In addition, SPE1(PE3) will apply the procedures of Section 3.3 too. Although these procedures are not part of the EVPN egress TPE protection scheme, they share the same signalling with EVPN protection. This simplifies the signalling procedures, because there is no longer a requirement to advertise different route attributes to different PEs.
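Steps [M1] to [M3] can be sketched in Python as below. Routes are modelled as plain dictionaries, the label values are invented, and the "action" stored in a mirrored entry is a hypothetical string standing in for whatever forwarding state a real implementation would keep.

```python
from collections import defaultdict


def advertise(route_id: str, evpn_label: int, own_dcb_label: int) -> dict:
    """[M1] The advertising TPE attaches its own DCB label as the CIL."""
    return {"route": route_id, "evpn_label": evpn_label, "cil": own_dcb_label}


class Protector:
    """[M2] A TPE holding mirrored ILM entries per context-specific label space."""

    def __init__(self):
        # CIL -> {EVPN label -> action}; one inner table per context label space
        self.cls_ilm = defaultdict(dict)

    def install_mirror(self, route: dict, action: str) -> None:
        self.cls_ilm[route["cil"]][route["evpn_label"]] = action

    def lookup(self, label_stack: list):
        """The outer label selects the context; the inner label is looked up in it."""
        cil, evpn_label = label_stack[0], label_stack[1]
        return self.cls_ilm.get(cil, {}).get(evpn_label)


def impose(route: dict) -> list:
    """[M3] The CIL is imposed as the outer label of the route's EVPN label."""
    return [route["cil"], route["evpn_label"]]


r4 = advertise("R4", evpn_label=3004, own_dcb_label=1001)  # sent by TPE2
tpe3 = Protector()
tpe3.install_mirror(r4, action="forward-to-CE2")
```

The same advertised route drives both the mirrored entry on TPE3 and the label stack imposed by SPE1, which is the point of sharing one signalling.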

3.3. SPE Procedures

Take the above use case as an example. The two SPEs are the egress nodes of an anycast SR-MPLS tunnel. The anycast SR-MPLS tunnel is used to transport flows from TPE1 to either SPE1 or SPE2 according to load balancing procedures. So SPE1 and SPE2 have to advertise the same EVPN label independently for a given EVPN route.

When TPE2 sends a MAC/IP advertisement route (say R8) to SPE1 and SPE2, a "Downstream Context-specific Label Space (CLS) ID Extended Community" can be included in R8 along with an EVPN label (say EVL4).

3.3.1. Context-specific Label Swapping

When SPE1 and SPE2 receive R8 from TPE2, they should advertise R8 to TPE1 independently, and the next-hop of R8 should be changed to the common anycast node address (say IP_12) of SPE1 and SPE2 before the advertisement. But SPE1 and SPE2 can simply keep R8's EVPN label (the EVL4 from TPE2) unchanged.

The context-VC label (say VCL4) in the "downstream-CLS ID EC" is also kept unchanged.

Note that although EVL4 and VCL4 are unchanged, a CLS-specific ILM whose label operation is "label swapping" should still be installed, because the outgoing PSN tunnel information has to be resolved.

Note that the two outgoing labels of the label swapping have the same values (EVL4 and VCL4) as the two incoming labels.

Note that if there is no TPE3, TPE2 is in no redundancy group, and the SPEs will receive R8 with M bit = 0. In that case, the SPEs will not push VCL4 onto the label stack for TPE2.
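The re-advertisement and swap behaviour of this section can be sketched in Python as follows. The dictionary keys, label values and tunnel name are hypothetical. The M-bit rewrite on re-advertisement follows the rule in Section 3.1 for an SPE redundancy group.

```python
def spe_readvertise(received: dict, anycast_nexthop: str) -> dict:
    """Route as re-advertised to TPE1: new next hop, EVL and VCL untouched."""
    out = dict(received)            # copy, so the received route is not modified
    out["nexthop"] = anycast_nexthop
    out["m_bit"] = True             # nexthop is the SPE group's anycast address
    return out


def build_swap_entry(received: dict, psn_tunnel: str) -> dict:
    """CLS-specific ILM entry: labels keep their values, the tunnel is resolved."""
    out_labels = [received["evpn_label"]]
    if received["m_bit"]:
        # TPE2 is in a redundancy group, so the VCL is kept on the stack toward it.
        out_labels.insert(0, received["vcl"])
    return {
        "context": received["vcl"],
        "in_label": received["evpn_label"],
        "out_labels": out_labels,
        "out_tunnel": psn_tunnel,
    }


# R8 as received from TPE2 (EVL4 = 3004 and VCL4 = 2004 are invented values)
r8 = {"route": "R8", "nexthop": "TPE2", "evpn_label": 3004, "vcl": 2004, "m_bit": True}
from_spe1 = spe_readvertise(r8, "IP_12")
from_spe2 = spe_readvertise(r8, "IP_12")
```

Because neither label is re-assigned, from_spe1 and from_spe2 are identical, which is what lets TPE1 use the anycast tunnel.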

3.3.2. The Generating of Downstream-CLS ID EC on SPE

When TPE2 does not advertise the Downstream-CLS ID EC to SPE1 and SPE2, they have to generate that EC by themselves.

In that case, TPE2 should advertise the OPE TLV for R8, and a context-VC infrastructure should be established beforehand. The context-VC infrastructure should ensure that the context-VCs from TPE2 to any other TPEs/SPEs have the same VCL value.

SPE1 can then set the ID-Value of the Downstream-CLS ID EC to the VCL of the context-VC from TPE2 to itself. The ID-Type of the Downstream-CLS ID EC is set to 0. So the same Downstream-CLS ID EC can be generated by the SPEs independently.

It is feasible for such a context-VC infrastructure to be implemented on the basis of Kompella VPLS signalling or BGP SR signalling. But it would be better for the admin-EVI (as the context-VC infrastructure) and EVPN VPLS to use the same signalling framework.
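The generation step of this section can be sketched in Python as below. The route keys ("ope", "cls_id_ec"), the context-VC table and the label value are hypothetical. Only ID-Type and ID-Value are modelled, since those are the fields this section specifies.

```python
def downstream_cls_id_for(route: dict, context_vc_labels: dict) -> dict:
    """Return the route's Downstream-CLS ID EC, generating it when it is missing."""
    if route.get("cls_id_ec") is not None:
        return route["cls_id_ec"]          # TPE2 advertised the EC itself
    # Otherwise use the VCL of the context-VC from the originating PE (OPE TLV).
    vcl = context_vc_labels[route["ope"]]
    # ID-Type 0: the label occupies the most significant 20 bits of ID-Value.
    return {"id_type": 0, "id_value": vcl << 12}


# Each SPE holds its own table, but the infrastructure guarantees equal VCL values.
spe1_context_vcs = {"TPE2": 2004}
spe2_context_vcs = {"TPE2": 2004}
r8 = {"route": "R8", "ope": "TPE2"}
ec_from_spe1 = downstream_cls_id_for(r8, spe1_context_vcs)
ec_from_spe2 = downstream_cls_id_for(r8, spe2_context_vcs)
```

Both SPEs derive the same EC without exchanging any state, provided the context-VC labels really are equal on both.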

4. Protection Procedures

4.2. SPE Protection Procedures

The label stack on the anycast SR-MPLS tunnel is constructed by TPE1 as follows:

        +---------------------------------+
        |  underlay ethernet header       |
        +---------------------------------+
        |  Anycast SR-TL = SR_LSP_to_SPEs |
        +---------------------------------+
        |  Context-VC Label = VCL4        |
        +---------------------------------+
        |  EVPN label = EVL4              |
        +---------------------------------+
        |  overlay ethernet or IP header  |
        +---------------------------------+

Figure 3: Anycast SPE dataplane

Note that the SR Tunnel Label (TL) in the label stack is the anycast SR-LSP label from TPE1 to SPE1 or SPE2, and that the VCL4 in the label stack is mandatory (from the viewpoint of TPE1).

Note that the context-VC is constructed (on SPE1 and SPE2) in the per-platform label space, and the VC labels from TPE2 to SPE1 and SPE2 have the same value (VCL4). So the label stacks (from the viewpoint of TPE1) are the same for SPE1 and SPE2. That is why TPE1 can use the anycast tunnel from TPE1 to SPE1 and SPE2 for R8.

When SPE1/SPE2 receives that data packet, it performs a CLS-specific ILM lookup for the EVPN label in the "TPE2-specific label space", which is identified by the context-VC label VCL4. The label operation will be "swapping", and the new outgoing EVPN label will have the same value (EVL4).
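The data-plane lookup described above can be sketched in Python as follows. All label values are invented, the class and attribute names are hypothetical, and the sketch assumes the anycast tunnel label is still on the stack when the packet reaches the SPE (that is, no penultimate-hop popping).

```python
from dataclasses import dataclass, field


@dataclass
class Spe:
    name: str
    tunnel_to_tpe2: int                          # hypothetical PSN tunnel label toward TPE2
    cls_ilm: dict = field(default_factory=dict)  # VCL -> {incoming EVL: outgoing labels}

    def install(self, vcl: int, evl: int) -> None:
        """Swap entry whose outgoing labels equal the incoming ones."""
        self.cls_ilm.setdefault(vcl, {})[evl] = [vcl, evl]

    def forward(self, labels: list) -> list:
        """Drop the anycast tunnel label, select the context by VCL, swap by EVL."""
        _tunnel_label, vcl, evl = labels
        out_labels = self.cls_ilm[vcl][evl]
        return [self.tunnel_to_tpe2, *out_labels]


ANYCAST_TL, VCL4, EVL4 = 16012, 2004, 3004
stack_from_tpe1 = [ANYCAST_TL, VCL4, EVL4]       # same stack whichever SPE receives it

spe1 = Spe("SPE1", tunnel_to_tpe2=24001)
spe2 = Spe("SPE2", tunnel_to_tpe2=24002)
for spe in (spe1, spe2):
    spe.install(VCL4, EVL4)
```

Whichever SPE the anycast tunnel delivers the packet to, the labels below the outgoing tunnel label are the same, so TPE2 sees an identical VCL4/EVL4 pair.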

5. IANA Considerations

This document introduces a new Transitive Opaque Extended Community "Downstream CLS ID Extended Community". An IANA request will be submitted later for the code-point in the BGP Transitive Opaque Extended Community Sub-Types registry.

6. Security Considerations

This section will be added in future versions.

8. References

8.1. Normative References

[I-D.heitz-bess-evpn-option-b]
Heitz, J., Sajassi, A., Drake, J., and J. Rabadan, "Multi-homing and E-Tree in EVPN with Inter-AS Option B", Work in Progress, Internet-Draft, draft-heitz-bess-evpn-option-b-01, <https://tools.ietf.org/html/draft-heitz-bess-evpn-option-b-01>.
[I-D.wang-bess-evpn-context-label]
Wang, Y. and B. Song, "Context Label for MPLS EVPN", Work in Progress, Internet-Draft, draft-wang-bess-evpn-context-label-04, <https://tools.ietf.org/html/draft-wang-bess-evpn-context-label-04>.
[I-D.wang-bess-evpn-egress-protection]
Wang, Y. and R. Chen, "EVPN Egress Protection", Work in Progress, Internet-Draft, draft-wang-bess-evpn-egress-protection-04, <https://tools.ietf.org/html/draft-wang-bess-evpn-egress-protection-04>.
[I-D.ietf-bess-mvpn-evpn-aggregation-label]
Zhang, Z., Rosen, E., Lin, W., Li, Z., and I. Wijnands, "MVPN/EVPN Tunnel Aggregation with Common Labels", Work in Progress, Internet-Draft, draft-ietf-bess-mvpn-evpn-aggregation-label-05, <https://tools.ietf.org/html/draft-ietf-bess-mvpn-evpn-aggregation-label-05>.
[RFC8679]
Shen, Y., Jeganathan, M., Decraene, B., Gredler, H., Michel, C., and H. Chen, "MPLS Egress Protection Framework", RFC 8679, DOI 10.17487/RFC8679, <https://www.rfc-editor.org/info/rfc8679>.
[RFC7432]
Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, <https://www.rfc-editor.org/info/rfc7432>.
[C3]
"[C3]", <https://tools.ietf.org/id/draft-wang-bess-evpn-egress-protection-04.html#list-1.3>.
[C4]
"[C4]", <https://tools.ietf.org/id/draft-wang-bess-evpn-egress-protection-04.html#list-1.4>.
[C5]
"[C5]", <https://tools.ietf.org/id/draft-wang-bess-evpn-egress-protection-04.html#list-1.5>.

8.2. Informative References

[I-D.ietf-bess-evpn-prefix-advertisement]
Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A. Sajassi, "IP Prefix Advertisement in EVPN", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-prefix-advertisement-11, <https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement-11>.
[I-D.ietf-bess-evpn-inter-subnet-forwarding]
Sajassi, A., Salam, S., Thoria, S., Drake, J., and J. Rabadan, "Integrated Routing and Bridging in EVPN", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-inter-subnet-forwarding-08, <https://tools.ietf.org/html/draft-ietf-bess-evpn-inter-subnet-forwarding-08>.

Author's Address

Yubao Wang
ZTE Corporation
No.68 of Zijinghua Road, Yuhuatai District
Nanjing
China