Internet-Draft | SR Replication Segment | August 2023 |
Voyer, Ed., et al. | Expires 29 February 2024 |
- Workgroup: Network Working Group
- Internet-Draft: draft-ietf-spring-sr-replication-segment-19
- Published:
- Intended Status: Standards Track
- Expires:
SR Replication segment for Multi-point Service Delivery
Abstract
This document describes the Segment Routing Replication segment for Multi-point service delivery. A Replication segment allows a packet to be replicated from a Replication node to Downstream nodes.¶
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Status of This Memo
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 29 February 2024.¶
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
1. Introduction
A Replication segment is a new type of segment for Segment Routing (SR) [RFC8402], which allows a node (henceforth called a Replication node) to replicate packets to a set of other nodes (called Downstream nodes) in a Segment Routing Domain. A Replication segment can replicate packets to directly connected nodes or to downstream nodes (without the need for state on the transit routers). This document focuses on specifying the behavior of a Replication segment for both Segment Routing with Multiprotocol Label Switching (SR-MPLS) [RFC8660] and Segment Routing with IPv6 (SRv6) [RFC8986]. The examples in the Appendix illustrate the behavior of a Replication segment in an SR domain. The use of two or more Replication segments stitched together to form a tree using a control plane is left to be specified in other documents. The management of IP multicast groups, building IP multicast trees, and performing multicast congestion control are out of scope of this document.
1.1. Terminology
This section defines terms introduced and used frequently in this document. Refer to Terminology sections of [RFC8402], [RFC8754] and [RFC8986] for other terms used in Segment Routing.¶
- Replication segment: A segment in an SR domain that replicates packets. See Section 2 for details.
- Replication node: A node in an SR domain which replicates packets based on a Replication segment.
- Downstream nodes: A Replication segment replicates packets to a set of nodes. These nodes are Downstream nodes.¶
- Replication state: State held for a Replication segment at a Replication node. It is conceptually a list of replication branches to Downstream nodes. The list can be empty.¶
- Replication SID: Data plane identifier of a Replication segment. This is an SR-MPLS label or an SRv6 Segment Identifier (SID).
- SRH: IPv6 Segment Routing Header [RFC8754].¶
- Point-to-Multipoint (P2MP) Service: A service that has one ingress node and one or more egress nodes. A packet is delivered to all the egress nodes.
- Root node: An ingress node of a P2MP service.
- Leaf node: An egress node of a P2MP service.¶
- Bud node: A node that is both a Replication node and a Leaf node.¶
1.2. Use Cases
In the simplest use case, a single Replication segment includes the ingress node of a multi-point service and the egress nodes of the service as all the Downstream nodes. This achieves Ingress Replication [RFC7988] that has been widely used for Multicast VPN (MVPN) [RFC6513] and Ethernet VPN (EVPN) [RFC7432] bridging of Broadcast, Unknown Unicast, and Multicast (BUM) traffic. This Replication segment can be either provisioned locally on ingress and egress nodes, or set up using dynamic auto-discovery procedures for MVPN and EVPN. Note that SRv6 [RFC8986] has the End.DT2M replication behavior for EVPN BUM traffic.
Replication segments can also be used to form trees by stitching Replication segments on a Root node, intermediate Replication nodes and Leaf nodes for efficient delivery of MVPN and EVPN BUM traffic.¶
2. Replication Segment
In a Segment Routing Domain, a Replication segment is a logical construct which connects a Replication node to a set of Downstream nodes. A Replication segment is a local segment instantiated at a Replication node. It can be either provisioned locally on a node or programmed by a control plane.¶
Replication segments can be stitched together to form a tree by either local provisioning on nodes or using a control plane. The procedures for doing this are out of scope of this document. One such control plane using a PCE with SR P2MP policy is specified in [I-D.ietf-pim-sr-p2mp-policy]. However, if local provisioning is used to stitch Replication segments, then a chain of Replication segments SHOULD NOT form a loop. If a control plane is used to stitch Replication segments, the control plane specification MUST either prevent loops or detect and mitigate loops in steady state.
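The loop condition above can be checked by a provisioning tool before segments are installed. The following is a minimal sketch, not part of this specification: it assumes the stitching is available as a mapping from each segment, keyed by a hypothetical (Replication-ID, Node-ID) tuple, to the segments it replicates to, and runs a depth-first search for a cycle. The function name `has_loop` and the data layout are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical key for a segment: (Replication-ID, Node-ID).
SegmentKey = Tuple[int, str]


def has_loop(branches: Dict[SegmentKey, List[SegmentKey]]) -> bool:
    """Return True if following replication branches can revisit a segment."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color: Dict[SegmentKey, int] = {}

    def visit(seg: SegmentKey) -> bool:
        color[seg] = GREY
        for nxt in branches.get(seg, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the path being explored: a loop exists.
                return True
            if state == WHITE and visit(nxt):
                return True
        color[seg] = BLACK
        return False

    return any(color.get(s, WHITE) == WHITE and visit(s) for s in list(branches))


tree = {(1, "R1"): [(1, "R2"), (1, "R3")], (1, "R2"): [(1, "L1")], (1, "R3"): []}
looped = {(1, "R1"): [(1, "R2")], (1, "R2"): [(1, "R1")]}
print(has_loop(tree), has_loop(looped))
```

A check like this only covers the provisioned intent; it does not replace the MPLS TTL or IPv6 Hop Limit protection in the data plane.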
A Replication segment is identified by the tuple <Replication-ID, Node-ID>, where:¶
- Replication-ID: An identifier for a Replication segment that is unique in context of the Replication node.¶
- Node-ID: The address of the Replication node that the Replication segment is for. Note that the Root of a multi-point service is also a Replication node.¶
Replication-ID is a variable length field. In the simplest case, it can be a 32-bit number, but it can be extended or modified as required based on the specific use of a Replication segment. This is out of scope for this document. The length of Replication-ID is specified in the signaling mechanism used for the Replication segment. Examples of such signaling and extensions are described in [I-D.ietf-pim-sr-p2mp-policy]. When the PCE signals a Replication segment to its node, the <Replication-ID, Node-ID> tuple identifies the segment.
A Replication segment includes the following elements:¶
- Replication SID: The Segment Identifier of a Replication segment. This is an SR-MPLS label or an SRv6 SID [RFC8402].
- Downstream nodes: Set of nodes in the Segment Routing domain to which a packet is replicated by the Replication segment.
- Replication state: See below.¶
The Downstream nodes and Replication state of a Replication segment can change over time, depending on the network state and Leaf nodes of a multi-point service that the segment is part of.¶
Replication SID identifies the Replication segment in the forwarding plane. At a Replication node, the Replication SID operates on Replication state of the Replication segment.¶
Replication state is a list of replication branches to the Downstream nodes. In this document, each branch is abstracted to a <Downstream node, Downstream Replication SID> tuple. <Downstream node> represents the reachability from the Replication node to the Downstream node. In its simplest form, this MAY be specified as an interface or next-hop if the Downstream node is adjacent to the Replication node. The reachability may be specified in terms of a Flexible Algorithm path (including the default algorithm) [RFC9350], or specified by an SR explicit path represented either by a SID-list (of one or more SIDs) or by a Segment Routing Policy [RFC9256]. Downstream Replication SID is the Replication SID of the Replication segment at the Downstream node.
A packet is steered into a Replication segment at a Replication node in two ways:¶
- When the Active Segment [RFC8402] is a locally instantiated Replication SID.
- By the Root of a multi-point service based on local configuration outside the scope of this document.¶
In either case, the packet is replicated to each Downstream node in the associated Replication state.¶
If a Downstream node is an egress (Leaf) of the multi-point service, no further replication is needed. The Leaf node's Replication segment has an indicator for Leaf role and it does not have any Replication state i.e. the list of Replication branches is empty. The Replication SID at a Leaf node MAY be used to identify the multi-point service. Notice that the segment on the Leaf node is still referred to as a Replication segment for the purpose of generalization.¶
A node can be a Bud node, i.e. it is a Replication node and a Leaf node of a multi-point service [I-D.ietf-pim-sr-p2mp-policy]. Replication segment of a Bud node has a list of Replication Branches as well as Leaf role indicator.¶
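The elements of a Replication segment described in this section can be pictured as a small data model. The sketch below is illustrative only and assumes hypothetical class and field names (`Branch`, `ReplicationSegment`, `is_leaf`); the document does not mandate any particular representation. It shows how the Leaf role indicator and the (possibly empty) list of replication branches together distinguish Leaf and Bud nodes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Branch:
    # Reachability to the Downstream node (e.g. next-hop or interface name).
    downstream_node: str
    # Replication SID of the Replication segment at the Downstream node.
    downstream_replication_sid: str
    # Optional explicit path toward the Downstream node; may be empty.
    segment_list: List[str] = field(default_factory=list)


@dataclass
class ReplicationSegment:
    replication_id: int
    node_id: str
    replication_sid: str
    is_leaf: bool = False  # Leaf role indicator
    branches: List[Branch] = field(default_factory=list)  # Replication state

    @property
    def key(self) -> Tuple[int, str]:
        """The <Replication-ID, Node-ID> tuple that identifies the segment."""
        return (self.replication_id, self.node_id)

    @property
    def role(self) -> str:
        if self.is_leaf and self.branches:
            return "Bud"
        if self.is_leaf:
            return "Leaf"
        return "Replication node"


leaf = ReplicationSegment(1, "L1", "sid-l1", is_leaf=True)
bud = ReplicationSegment(1, "B1", "sid-b1", is_leaf=True,
                         branches=[Branch("L1", "sid-l1")])
print(leaf.role, bud.role, bud.key)
```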
In principle it is possible for different Replication segments to replicate packets to the same Replication segment on a Downstream node. However, such usage is intentionally left out of scope of this document.¶
2.1. SR-MPLS data plane
When the Active Segment is a Replication SID, the processing results in a POP [RFC8402] operation and lookup of the associated Replication state. For each replication branch in the Replication state, the operation is a PUSH [RFC8402] of the downstream Replication SID and an optional segment list onto the packet to steer the packet to the Downstream node.
The operation performed on incoming Replication SID is NEXT [RFC8402] at Leaf/Bud nodes where delivery of payload off tree is per local configuration. For some usages, this may involve looking at the next SID for example to get the necessary context.¶
When the Root of a multi-point service steers a packet to a Replication segment, it results in a replication to each Downstream node in the associated replication state. The operation is a PUSH of the replication SID and an optional segment list on to the packet which is forwarded to the downstream node.¶
The following applies to Replication SID in MPLS encapsulation:¶
- SIDs MAY be inserted before the downstream SR-MPLS Replication SID in order to guide a packet from a non-adjacent SR node to a Replication node.¶
- A Replication node MAY replicate a packet to a non-adjacent Downstream node using SIDs it inserts in the copy preceding the downstream Replication SID. The Downstream node may be a Leaf node of the Replication segment, or another Replication node, or both in case of Bud node.¶
- A Replication node MAY use an Anycast SID or Border Gateway Protocol (BGP) PeerSet SID in segment list to send a replicated packet to one downstream Replication node in an Anycast set if and only if all nodes in the set have an identical Replication SID and reach the same set of receivers.¶
- For some use cases, there MAY be SIDs after the Replication SID in the segment list of a packet. These SIDs are used only by the Leaf/Bud nodes to forward a packet off the tree independent of the Replication SID. Coordination regarding the absence or presence and value of context information for Leaf/Bud nodes is outside the scope of this document.¶
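The POP and PUSH operations of this section can be sketched on a label stack held as a list. This is a simplified illustration under stated assumptions: labels are plain integers, the first list element is the top of stack, the Replication state is a hypothetical dictionary keyed by the local Replication SID, and TTL handling and Leaf/Bud delivery are omitted. Labels that follow the Replication SID are carried unchanged so that Leaf/Bud nodes can use them as context.

```python
from typing import Dict, List, Tuple

# One branch: (segment list toward the Downstream node, downstream Replication SID).
MplsBranch = Tuple[List[int], int]


def replicate_mpls(stack: List[int], payload: bytes,
                   state: Dict[int, List[MplsBranch]]) -> List[Tuple[List[int], bytes]]:
    """Replicate a packet whose top label (stack[0]) is a local Replication SID."""
    if not stack or stack[0] not in state:
        raise LookupError("top label is not a local Replication SID")
    rest = stack[1:]  # POP the incoming Replication SID
    copies = []
    for segment_list, downstream_sid in state[stack[0]]:
        # PUSH the downstream Replication SID, then any steering SIDs on top of it.
        copies.append((segment_list + [downstream_sid] + rest, payload))
    return copies


# Hypothetical labels: 1000 is local, 2000/3000 are downstream Replication SIDs,
# 16005 steers toward a non-adjacent Downstream node, 99 is a context label.
state = {1000: [([], 2000), ([16005], 3000)]}
for labels, _ in replicate_mpls([1000, 99], b"data", state):
    print(labels)
```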
2.2. SRv6 data plane
For SRv6 [RFC8986], this document specifies “Endpoint with replication” behavior (End.Replicate for short) to replicate a packet and forward the replicas according to a Replication state.¶
When processing a packet destined to a local Replication SID, the packet is replicated according to the associated Replication state to Downstream nodes and/or locally delivered off tree when this is a Leaf/Bud node. For replication, the outer header is re-used, and the Downstream Replication SID, from Replication state, is written into the outer IPv6 header destination address. If required, an optional segment list may be used on some branches using H.Encaps.Red [RFC8986] (while some other branches may not need that). Note that this H.Encaps.Red is independent of the Replication segment; it is just used to steer the replicated packet on a traffic engineered path to a Downstream node. The penultimate segment in the encapsulating IPv6 header will execute the Ultimate Segment Decapsulation (USD) flavor [RFC8986] of End/End.X behavior and forward the inner (replicated) packet to the Downstream node. If H.Encaps.Red is used to steer a replicated packet to a Downstream node, the operator must ensure the MTU on the path to the Downstream node is sufficient to account for the additional SRv6 encapsulation. This also applies when the Replication segment is for the Root node, whose upstream node has placed the Replication SID in the header.
A local application on the Root, for example MVPN [RFC6513] or EVPN [RFC7432], may also apply H.Encaps.Red and then steer the resulting traffic into the Replication segment. Again, note that the H.Encaps.Red is independent of the Replication segment; it is the action of the application (e.g. MVPN/EVPN service). If the service is on a Root node, the two H.Encaps mentioned, one for the service and the other in the previous paragraph for replication to a Downstream node, SHOULD be combined for optimization (to avoid extra IPv6 encapsulation).
When processing a packet destined to a local Replication SID, IPv6 Hop Limit MUST be decremented and MUST be non-zero to replicate the packet. A Root node that encapsulates a payload can set the IPv6 Hop Limit based on a local policy. This local policy SHOULD set the IPv6 Hop Limit so that a replicated packet can reach the furthest Leaf node. A Root node can also have a local policy to set the IPv6 Hop Limit from the payload. In this case, IPv6 Hop Limit may not be sufficient to get the replicated packet to all the Leaf nodes; non-replication nodes i.e. nodes which forward replicated packets based on IPv6 locator unicast prefix can decrement IPv6 Hop Limit to zero and originate ICMPv6 Error packets to the Root node. This can result in a storm of ICMPv6 packets (see Section 2.2.3) to the Root node. To avoid this, a Replication Segment has an optional IPv6 Hop Limit threshold. If this threshold is set, a Replication node MUST discard an incoming packet with local Replication SID if the IPv6 Hop Limit in the packet is less than the threshold and log this in a rate limited manner. The IPv6 Hop Limit Threshold SHOULD be set so that incoming packet can be replicated to furthest Leaf node.¶
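The Hop Limit rules in the paragraph above reduce to two checks followed by a decrement. The sketch below is a minimal illustration; the function name and the use of `None` to signal a discard are assumptions made for the example, and rate-limited logging is left out.

```python
from typing import Optional


def hop_limit_after_check(hop_limit: int, threshold: int = 0) -> Optional[int]:
    """Return the decremented Hop Limit, or None if the packet must be discarded.

    `threshold` models the optional IPv6 Hop Limit Threshold of the
    Replication segment; zero means no threshold is configured.
    """
    if hop_limit <= 1:
        # The Hop Limit would not be non-zero after the decrement.
        return None
    if hop_limit < threshold:
        # Below the configured threshold: discard (and log, rate limited).
        return None
    return hop_limit - 1


print(hop_limit_after_check(64), hop_limit_after_check(5, threshold=10))
```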
For Leaf/Bud nodes local delivery off the tree is per Replication SID or next SID (if present in SRH). For some usages, this may involve getting the necessary context either from the next SID (e.g., MVPN with shared tree) or from the replication SID itself (e.g., MVPN with non-shared tree). In both cases, the context association is achieved with signaling and is out of scope of this document.¶
The following applies to Replication SID in SRv6 encapsulation:¶
- There MAY be SIDs preceding the SRv6 Replication SID in order to guide a packet from a non-adjacent SR node to a Replication node via an explicit path.¶
- A Replication node MAY steer a replicated packet on an explicit path to a non-adjacent Downstream node using SIDs it inserts in the copy preceding the downstream Replication SID. The Downstream node may be a Leaf node of the Replication segment, or another Replication node, or both in case of Bud node.¶
- For SRv6, as described in above paragraphs, the insertion of SIDs prior to Replication SID entails a new IPv6 encapsulation with SRH, but this can be optimized on Root node or for compressed SRv6 SIDs.¶
- The locator of Replication SID is sufficient to guide a packet on shortest path, for default or Flexible algorithm, between non-adjacent nodes.¶
- A Replication node MAY use an Anycast SID or BGP PeerSet SID in segment list to send a replicated packet to one downstream Replication node in an Anycast set if and only if all nodes in the set have an identical Replication SID and reach the same set of receivers.¶
- There MAY be SIDs after the Replication SID in the SRH of a packet. These SIDs are used to provide additional context for processing a packet locally at the node where the Replication SID is the Active Segment. Coordination regarding the absence or presence and value of context information for Leaf/Bud nodes is outside the scope of this document.¶
2.2.1. End.Replicate: Replicate and/or Decapsulate
The "Endpoint with replication and/or decapsulate" behavior (End.Replicate for short) is a variant of the End behavior. The pseudo-code in this section follows the convention introduced in RFC 8986 [RFC8986].
A Replication state conceptually contains the following elements:¶
   Replication state:
   {
     Node-Role: {Head, Transit, Leaf, Bud};
     IPv6 Hop Limit Threshold; # default is zero
     # On Leaf, replication list is zero length
     Replication-List:
     {
       Downstream node: <Node-Identifier>;
       Downstream Replication SID: R-SID;
       # Segment-List may be empty
       Segment-List: [SID-1, .... SID-N];
     }
   }
Below is the Replicate function on a packet for Replication state (RS).¶
   S01. Replicate(RS, packet)
   S02. {
   S03.    For each Replication R in RS.Replication-List {
   S04.       Make a copy of the packet
   S05.       Set IPv6 DA = R.R-SID
   S06.       If R.Segment-List is not empty {
   S07.         # Head node may optimize below encapsulation and
   S08.         # the encapsulation of packet in a single encapsulation
   S09.         Execute H.Encaps or H.Encaps.Red with R.Segment-List
                on packet copy #RFC 8986 Section 5.1, 5.2
   S10.       }
   S11.       Submit the packet to the egress IPv6 FIB lookup and
              transmission to the new destination
   S12.    }
   S13. }
Notes:¶
- The IPv6 destination address in the copy of a packet is set from local state and not from the SRH.
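The Replicate function can be mirrored in a few lines of Python to show that each copy takes its destination address from the branch in local state. This is a sketch under simplifying assumptions: `Packet` and `Branch` are hypothetical types, addresses are plain strings, and the H.Encaps or H.Encaps.Red step is represented only by recording the branch segment list on the copy rather than building a real outer header.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Branch:
    r_sid: str  # Downstream Replication SID
    segment_list: List[str] = field(default_factory=list)  # may be empty


@dataclass
class Packet:
    dst: str
    hop_limit: int
    payload: bytes
    # Stand-in for the extra encapsulation added by H.Encaps or H.Encaps.Red.
    outer_segments: List[str] = field(default_factory=list)


def replicate(branches: List[Branch], packet: Packet) -> List[Packet]:
    copies = []
    for branch in branches:
        # The new destination comes from local Replication state, not from the SRH.
        copy = Packet(dst=branch.r_sid, hop_limit=packet.hop_limit,
                      payload=packet.payload)
        if branch.segment_list:
            copy.outer_segments = list(branch.segment_list)
        copies.append(copy)
    return copies


pkt = Packet(dst="2001:db8:0:1::", hop_limit=63, payload=b"data")
state = [Branch("2001:db8:0:2::"), Branch("2001:db8:0:3::", ["2001:db8:0:9::"])]
for c in replicate(state, pkt):
    print(c.dst, c.outer_segments)
```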
When N receives a packet whose IPv6 DA is S and S is a local End.Replicate SID, N does:¶
   S01.   Lookup FUNCT portion of S to get Replication state RS
   S02.   If (IPv6 Hop Limit <= 1) {
   S03.     Discard the packet
   S04.     # ICMPv6 Time Exceeded is not permitted (ICMPv6 section below)
   S05.   }
   S06.   If RS is not found {
   S07.     Discard the packet
   S08.   }
   S09.   If (IPv6 Hop Limit < RS.IPv6 Hop Limit Threshold) {
   S10.     Discard the packet
   S11.     # Rate-limited logging
   S12.   }
   S13.   Decrement IPv6 Hop Limit by 1
   S14.   If (IPv6 NH == SRH and SRH TLVs present) {
   S15.     Process SRH TLVs if allowed by local configuration
   S16.   }
   S17.   Call Replicate(RS, packet)
   S18.   If (RS.Node-Role == Leaf OR RS.Node-Role == Bud) {
   S19.     If (IPv6 NH == SRH and Segments Left > 0) {
   S20.       Derive packet processing context(PPC) from Segment List
   S21.       If (Segments Left != 0) {
   S22.         Discard the packet
   S23.         # ICMPv6 Parameter Problem with Code 0
   S24.         # (Erroneous header field encountered)
   S25.         # is not permitted (ICMPv6 section below)
   S26.       }
   S27.     } Else {
   S28.       Derive packet processing context(PPC)
              from FUNCT of Replication SID
   S29.     }
   S30.     Process the next header
   S31.   }
The processing of Upper-Layer header of a packet matching End.Replicate SID at Leaf/Bud node is as follows:¶
   S01.   If (Upper-Layer header type == 4(IPv4) OR
              Upper-Layer header type == 41(IPv6) ) {
   S02.     Remove the outer IPv6 header with all its extension headers
   S03.     Process the packet in context of PPC
   S04.   } Else If (Upper-Layer header type == 143(Ethernet) ) {
   S05.     Remove the outer IPv6 header with all its extension headers
   S06.     Process the Ethernet Frame in context of PPC
   S07.   } Else If (Upper-Layer header type is allowed by
                     local configuration) {
   S08.     Proceed to process the Upper-Layer header
   S09.   } Else {
   S10.     Discard the packet
   S11.     # ICMPv6 Parameter Problem with Code 4
   S12.     # (SR Upper-layer Header Error)
   S13.     # is not permitted (ICMPv6 section below)
   S14.   }
Notes:¶
- The behavior above MAY result in a packet with a partially processed segment list in the SRH under some circumstances. For example, a head node may encode a context SID in an SRH. As per the pseudo-code above, a Replication node that receives a packet with a local Replication SID will not process the SRH segment list and will just forward a copy with unmodified SRH to Downstream nodes.
- The packet processing context usually is a FIB table T.
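The Upper-Layer header dispatch at a Leaf/Bud node can be summarized as a small decision function. This sketch only classifies the action; the returned strings and the `locally_allowed` parameter are hypothetical names chosen for the example, while the header type values 4, 41, and 143 are the ones used in the pseudo-code above.

```python
from typing import FrozenSet

IPV4, IPV6, ETHERNET = 4, 41, 143  # Upper-Layer header types from the pseudo-code


def upper_layer_action(next_header: int,
                       locally_allowed: FrozenSet[int] = frozenset()) -> str:
    """Classify what a Leaf/Bud node does with a packet matching End.Replicate."""
    if next_header in (IPV4, IPV6):
        return "decapsulate-ip"        # remove outer header, process in PPC
    if next_header == ETHERNET:
        return "decapsulate-ethernet"  # remove outer header, process frame in PPC
    if next_header in locally_allowed:
        return "process-upper-layer"
    return "discard"                   # no ICMPv6 Parameter Problem is sent


print(upper_layer_action(41), upper_layer_action(58),
      upper_layer_action(58, frozenset({58})))
```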
Processing the Replication SID may modify, if configured to process TLVs, the "variable-length data" of TLV types that change en route. Therefore, TLVs that change en route are mutable. The remainder of the SRH (Segments Left, Flags, Tag, Segment List, and TLVs that do not change en route) are immutable while processing this SID.¶
2.2.1.1. Hashed Message Authentication Code (HMAC) SRH TLV
If a Root node encodes a context SID in an SRH with an optional HMAC SRH TLV [RFC8754], it MUST set the 'D' bit as defined in Section 2.1.2 of [RFC8754] because the Replication SID is not part of the segment list in the SRH.
HMAC generation and verification is as specified in RFC 8754. Verification of HMAC TLV is determined by local configuration. If verification fails, an implementation of Replication SID MUST NOT originate an ICMPv6 error message (parameter problem, code 0). The failure SHOULD be logged (rate limited) and the packet SHOULD be discarded.¶
2.2.2. OAM Operations
RFC 9259 [RFC9259] specifies procedures for OAM operations like ping and traceroute on SRv6 SIDs.¶
It is possible to ping a Replication SID of a Leaf/Bud node, assuming the source node knows the Replication SID a priori, directly by putting it in the IPv6 destination address without an SRH or in an SRH as the last segment. While it is not possible to ping a Replication SID of a transit node because transit nodes do not process upper layer headers, it is still possible to ping a Replication SID of a Leaf/Bud node of a tree via the Replication SID of intermediate transit nodes. The source of the ping MUST compute the ICMPv6 Echo Request checksum using the Replication SID of the Leaf/Bud node as the destination address. The source can then send the Echo Request packet to a transit node's Replication SID. The transit nodes replicate the packet by replacing the IPv6 destination address until the packet reaches the Leaf/Bud node, which responds with an ICMPv6 Echo Reply. Note that a transit Replication node may replicate Echo Request packets to other Leaf/Bud nodes. These nodes will drop the Echo Request due to an incorrect checksum. Procedures to prevent the mis-delivery of Echo Request may be addressed in a future document. Appendix A.2.1 illustrates examples of ping to a Replication SID.
Traceroute to a Leaf/Bud node Replication SID is not possible due to the restriction prohibiting origination of an ICMPv6 Time Exceeded error message for a Replication SID, as described in the section below.
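The checksum rule for ping can be made concrete with the standard ICMPv6 checksum, which covers an IPv6 pseudo-header containing the destination address. The sketch below is illustrative: the function names and the example addresses (taken from the 2001:db8::/32 documentation prefix) are assumptions, and only the Echo Request message is built, not the full IPv6 packet. It shows that a message whose checksum was computed against the Leaf/Bud Replication SID verifies at that SID and fails at any other.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58
ECHO_REQUEST = 128


def icmpv6_checksum(src: str, dst: str, message: bytes) -> int:
    """Internet checksum over the IPv6 pseudo-header plus the ICMPv6 message."""
    pseudo = (ipaddress.IPv6Address(src).packed
              + ipaddress.IPv6Address(dst).packed
              + struct.pack("!I3xB", len(message), ICMPV6_NEXT_HEADER))
    data = pseudo + message
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries (one's complement addition)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_request(src: str, leaf_replication_sid: str,
                 ident: int, seq: int, data: bytes = b"") -> bytes:
    """Build an Echo Request whose checksum uses the Leaf/Bud SID as destination."""
    body = struct.pack("!BBHHH", ECHO_REQUEST, 0, 0, ident, seq) + data
    csum = icmpv6_checksum(src, leaf_replication_sid, body)
    return body[:2] + struct.pack("!H", csum) + body[4:]


src, leaf_sid, other_sid = "2001:db8::1", "2001:db8:0:7::", "2001:db8:0:6::"
msg = echo_request(src, leaf_sid, ident=1, seq=1, data=b"ping")
# Re-running the checksum over a valid message yields zero.
print(icmpv6_checksum(src, leaf_sid, msg) == 0,
      icmpv6_checksum(src, other_sid, msg) == 0)
```

The second comparison corresponds to a copy delivered to a different Leaf/Bud node, which drops the Echo Request because the checksum does not verify against its own Replication SID.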
2.2.3. ICMPv6 Error Messages
Section 2.4 of [RFC4443] states that an ICMPv6 error message MUST NOT be originated as a result of receiving a packet destined to an IPv6 multicast address. This is to prevent a storm of ICMPv6 error messages resulting from replicated IPv6 packets from overwhelming a source node. There are two exceptions: (1) the Packet Too Big message for Path MTU discovery, and (2) the Parameter Problem message, Code 2, reporting an unrecognized IPv6 option. An implementation of Replication segment for SRv6 MUST enforce these same restrictions and exceptions.
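The restriction and its two exceptions amount to a simple filter on the ICMPv6 error a node is about to originate for a packet destined to a Replication SID. The sketch below is illustrative only; the function and constant names are assumptions, while the type and code values are the ones assigned in RFC 4443.

```python
PACKET_TOO_BIG = 2       # ICMPv6 type
TIME_EXCEEDED = 3        # ICMPv6 type
PARAMETER_PROBLEM = 4    # ICMPv6 type
UNRECOGNIZED_OPTION = 2  # Parameter Problem code


def may_originate_icmpv6_error(icmp_type: int, icmp_code: int) -> bool:
    """True if the error may be sent in response to a Replication SID packet."""
    if icmp_type == PACKET_TOO_BIG:
        return True  # exception (1): needed for Path MTU discovery
    # exception (2): Parameter Problem, Code 2 (unrecognized IPv6 option)
    return icmp_type == PARAMETER_PROBLEM and icmp_code == UNRECOGNIZED_OPTION


print(may_originate_icmpv6_error(TIME_EXCEEDED, 0),
      may_originate_icmpv6_error(PARAMETER_PROBLEM, UNRECOGNIZED_OPTION))
```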
3. Implementation Status
Note to the RFC Editor: Please remove this section and reference to RFC 7942 before publication.¶
This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in RFC 7942 [RFC7942]. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs. Please note that the listing of any individual implementation here does not imply endorsement by the IETF. Furthermore, no effort has been spent to verify the information presented here that was supplied by IETF contributors. This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist. According to RFC 7942 [RFC7942], "this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature. It is up to the individual working groups to use this information as they see fit".¶
There are two known implementations of this draft by Cisco and Nokia. Interoperability reports for the implementations are not applicable since this draft does not specify inter-operable elements of Replication segments.¶
3.1. Cisco implementation
Cisco Implementation uses Replication segments defined in this draft as a basis for PCE to compute and establish P2MP trees in SR domain to provide multi-point services. The implementation, based on latest version of this draft, is in production and supports all MUST and SHOULD clauses for SR-MPLS Replication segments. The documentation is available at Cisco documentation and the point of contact is Rishabh Parekh (riparekh@cisco.com).¶
3.2. Nokia implementation
Nokia has implemented the Replication SID as defined in this draft to establish P2MP trees in a segment routing domain. The implementation supports SR-MPLS encapsulation and all the MUST and SHOULD clauses in this draft. The implementation is at general availability maturity and is compliant with the latest version of the draft. The documentation for the implementation can be found at Nokia help and the point of contact is hooman.bidgoli@nokia.com.
4. IANA Considerations
IANA has assigned the following codepoint for End.Replicate behavior in the "SRv6 Endpoint Behaviors" registry in the "Segment Routing" registry group.¶
Value | Hex | Endpoint behavior | Reference |
---|---|---|---|
75 | 0x004B | End.Replicate | [This.ID] |
5. Security Considerations
The SID behaviors defined in this document are deployed within an SR domain [RFC8402]. An SR domain needs protection from outside attackers as described in [RFC8754] and following is a brief reminder of the same:¶
- For SR-MPLS deployments:
  - By disabling MPLS on external interfaces of each edge node or any other technique to filter labeled traffic ingress on these interfaces.
- For SRv6 deployments:
  - Allocate all the SIDs from an IPv6 prefix block S/s and configure each external interface of each edge node of the domain with an inbound infrastructure access list (IACL) that drops any incoming packet with a destination address in S/s.
- Additionally, an IACL may be applied to all nodes (k) provisioning SIDs as defined in this specification:
  - Assign all interface addresses from within IPv6 prefix A/a. At node k, all SIDs local to k are assigned from prefix Sk/sk. Configure each internal interface of each SR node k in the SR domain with an inbound IACL that drops any incoming packet with a destination address in Sk/sk if the source address is not in A/a.
  - Denying traffic with spoofed source addresses by implementing recommendations in BCP 84 [RFC3704].
  - Additionally the block S/s from which SIDs are allocated may be a non-globally-routable address such as ULA or the prefix defined in [I-D.ietf-6man-sids].
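The two IACL rules listed above can be expressed as predicates over source and destination addresses. This is a sketch for illustration, not a router configuration: the function names are hypothetical, and the prefixes standing in for S/s, Sk/sk, and A/a are arbitrary choices from the 2001:db8::/32 documentation range.

```python
import ipaddress


def external_iacl_permits(dst: str, sid_block: str) -> bool:
    """External interface rule: drop any packet destined into S/s."""
    return ipaddress.ip_address(dst) not in ipaddress.ip_network(sid_block)


def internal_iacl_permits(src: str, dst: str,
                          local_sid_prefix: str, interface_block: str) -> bool:
    """Internal interface rule at node k: drop a packet destined into Sk/sk
    unless its source address is within A/a."""
    to_local_sid = ipaddress.ip_address(dst) in ipaddress.ip_network(local_sid_prefix)
    from_domain = ipaddress.ip_address(src) in ipaddress.ip_network(interface_block)
    return (not to_local_sid) or from_domain


S = "2001:db8:ff00::/40"     # assumed S/s
Sk = "2001:db8:ff00:1::/64"  # assumed Sk/sk
A = "2001:db8:a::/48"        # assumed A/a
print(external_iacl_permits("2001:db8:ff00:1::42", S),
      internal_iacl_permits("2001:db8:a::1", "2001:db8:ff00:1::42", Sk, A))
```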
Failure to protect the SR MPLS domain by correctly provisioning MPLS support per interface permits attackers from outside the domain to send packets that use the replication services provisioned within the domain.¶
Failure to protect the SRv6 domain with IACLs on external interfaces, combined with failure to implement BCP 38 [RFC2827] or apply IACLs on nodes provisioning SIDs, permits attackers from outside the SR domain to send packets that use the replication services provisioned within the domain.
Given the definition of the Replication segment in this document, an attacker subverting the ingress filters above cannot take advantage of a stack of Replication segments to perform amplification attacks or link exhaustion attacks. Replication segment trees always terminate at a Leaf or Bud node, resulting in a decapsulation. This however does allow an attacker to inject traffic to the receivers within a P2MP service.
This document introduces an SR segment endpoint behavior that replicates and decapsulates an inner payload for both the MPLS and IPv6 data planes. Similar to any MPLS end of stack label, or SRv6 END.D* behavior, if the protections described above are not implemented, an attacker can perform an attack via the decapsulating segment (including the one described in this document).
Incorrect provisioning of Replication segments can result in a chain of Replication segments forming a loop. This can happen if Replication segments are provisioned on SR nodes without using a control plane. In this case, replicated packets can create a storm until the MPLS TTL (for SR-MPLS) or IPv6 Hop Limit (for SRv6) decrements to zero. A control plane, for example a PCE, can be used to prevent loops. The control plane protocols (like PCEP, BGP, etc.) used to instantiate Replication segments can leverage their own security mechanisms such as encryption, authentication, filtering, etc.
For SRv6, Section 2.2.3 describes an exception for Parameter Problem Message, code 2 ICMPv6 Error messages. If an attacker sends a packet destined to Replication SID with source address of a node and with an extension header using unknown option type marked as mandatory, then a large number of ICMPv6 Parameter Problem messages can cause a denial-of-service attack on the source node. Although this specification does not specify any extension headers, any future extension of this document doing so is susceptible to this security concern.¶
If an attacker can forge an IPv6 packet with source address of a node, Replication SID as destination address and an IPv6 Hop Limit such that nodes which forward replicated packets on IPv6 locator unicast prefix, decrement the Hop Limit to zero, then these nodes can cause a storm of ICMPv6 Error packets to overwhelm the source node under attack. The IPv6 Hop Limit Threshold check described in Section 2.2 can help mitigate such attacks.¶
6. Acknowledgements
The authors would like to acknowledge Siva Sivabalan, Mike Koldychev, Vishnu Pavan Beeram, Alexander Vainshtein, Bruno Decraene, Thierry Couture, Joel Halpern, Ketan Talaulikar, Darren Dukes and Jingrong Xie for their valuable inputs.¶
7. Contributors
Clayton Hassen Bell Canada Vancouver Canada¶
Email: clayton.hassen@bell.ca¶
Kurtis Gillis Bell Canada Halifax Canada¶
Email: kurtis.gillis@bell.ca¶
Arvind Venkateswaran Cisco Systems, Inc. San Jose US¶
Email: arvvenka@cisco.com¶
Zafar Ali Cisco Systems, Inc. US¶
Email: zali@cisco.com¶
Swadesh Agrawal Cisco Systems, Inc. San Jose US¶
Email: swaagraw@cisco.com¶
Jayant Kotalwar Nokia Mountain View US¶
Email: jayant.kotalwar@nokia.com¶
Tanmoy Kundu Nokia Mountain View US¶
Email: tanmoy.kundu@nokia.com¶
Andrew Stone Nokia Ottawa Canada¶
Email: andrew.stone@nokia.com¶
Tarek Saad Cisco Systems Inc. Canada¶
Email: tsaad@cisco.com
Kamran Raza Cisco Systems, Inc. Canada
Email: skraza@cisco.com
Jingrong Xie Huawei Technologies Beijing China
Email: xiejingrong@huawei.com
8. References
8.1. Normative References
- [RFC2119]
- Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
- [RFC4443]
- Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, <https://www.rfc-editor.org/info/rfc4443>.
- [RFC8174]
- Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
- [RFC8402]
- Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, <https://www.rfc-editor.org/info/rfc8402>.
- [RFC8754]
- Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header (SRH)", RFC 8754, DOI 10.17487/RFC8754, <https://www.rfc-editor.org/info/rfc8754>.
- [RFC8986]
- Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, D., Matsushima, S., and Z. Li, "Segment Routing over IPv6 (SRv6) Network Programming", RFC 8986, DOI 10.17487/RFC8986, <https://www.rfc-editor.org/info/rfc8986>.
- [RFC9259]
- Ali, Z., Filsfils, C., Matsushima, S., Voyer, D., and M. Chen, "Operations, Administration, and Maintenance (OAM) in Segment Routing over IPv6 (SRv6)", RFC 9259, DOI 10.17487/RFC9259, <https://www.rfc-editor.org/info/rfc9259>.
8.2. Informative References
- [I-D.filsfils-spring-srv6-net-pgm-illustration]
- Filsfils, C., Camarillo, P., Li, Z., Matsushima, S., Decraene, B., Steinberg, D., Lebrun, D., Raszuk, R., and J. Leddy, "Illustrations for SRv6 Network Programming", Work in Progress, Internet-Draft, draft-filsfils-spring-srv6-net-pgm-illustration-04, <https://datatracker.ietf.org/doc/html/draft-filsfils-spring-srv6-net-pgm-illustration-04>.
- [I-D.ietf-6man-sids]
- Krishnan, S., "Segment Identifiers in SRv6", Work in Progress, Internet-Draft, draft-ietf-6man-sids-03, <https://datatracker.ietf.org/doc/html/draft-ietf-6man-sids-03>.
- [I-D.ietf-pim-sr-p2mp-policy]
- Voyer, D., Filsfils, C., Parekh, R., Bidgoli, H., and Z. J. Zhang, "Segment Routing Point-to-Multipoint Policy", Work in Progress, Internet-Draft, draft-ietf-pim-sr-p2mp-policy-06, <https://datatracker.ietf.org/doc/html/draft-ietf-pim-sr-p2mp-policy-06>.
- [RFC2827]
- Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, <https://www.rfc-editor.org/info/rfc2827>.
- [RFC3704]
- Baker, F. and P. Savola, "Ingress Filtering for Multihomed Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, <https://www.rfc-editor.org/info/rfc3704>.
- [RFC6513]
- Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, <https://www.rfc-editor.org/info/rfc6513>.
- [RFC7432]
- Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, <https://www.rfc-editor.org/info/rfc7432>.
- [RFC7942]
- Sheffer, Y. and A. Farrel, "Improving Awareness of Running Code: The Implementation Status Section", BCP 205, RFC 7942, DOI 10.17487/RFC7942, <https://www.rfc-editor.org/info/rfc7942>.
- [RFC7988]
- Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress Replication Tunnels in Multicast VPN", RFC 7988, DOI 10.17487/RFC7988, <https://www.rfc-editor.org/info/rfc7988>.
- [RFC8660]
- Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing with the MPLS Data Plane", RFC 8660, DOI 10.17487/RFC8660, <https://www.rfc-editor.org/info/rfc8660>.
- [RFC9256]
- Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, A., and P. Mattes, "Segment Routing Policy Architecture", RFC 9256, DOI 10.17487/RFC9256, <https://www.rfc-editor.org/info/rfc9256>.
- [RFC9350]
- Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., and A. Gulko, "IGP Flexible Algorithm", RFC 9350, DOI 10.17487/RFC9350, <https://www.rfc-editor.org/info/rfc9350>.
Appendix A. Illustration of a Replication Segment
This section illustrates an example of a single Replication segment. Examples showing Replication segments stitched together to form a P2MP tree (based on an SR P2MP policy) are in [I-D.ietf-pim-sr-p2mp-policy].¶
Consider the following topology:¶
A.1. SR-MPLS
In this example, the Node-SID of a node Rn is N-SIDn and Adjacency-SID from node Rm to node Rn is A-SIDmn. Interface between Rm and Rn is Lmn. The state representation uses "R-SID->Lmn" to represent a packet replication with outgoing replication SID R-SID sent on interface Lmn.¶
Assume a Replication segment identified with R-ID at Replication node R1 and downstream nodes R2, R6 and R7. The Replication SID at node n is R-SIDn. A packet replicated from R1 to R7 has to traverse R4.¶
The Replication segment state at nodes R1, R2, R6 and R7 is shown below. Note nodes R3, R4 and R5 do not have state for the Replication segment.¶
Replication segment at R1:¶
Replication segment <R-ID,R1>:
  Replication SID: R-SID1
  Replication state:
    R2: <R-SID2->L12>
    R6: <N-SID6, R-SID6>
    R7: <N-SID4, A-SID47, R-SID7>¶
Replication to R2 steers the packet directly to R2 on interface L12. Replication to R6, using N-SID6, steers the packet via shortest path to that node. Replication to R7 is steered via R4, using N-SID4 and then adjacency SID A-SID47 to R7.¶
Replication segment at R2:¶
Replication segment <R-ID,R2>:
  Replication SID: R-SID2
  Replication state:
    R2: <Leaf>¶
Replication segment at R6:¶
Replication segment <R-ID,R6>:
  Replication SID: R-SID6
  Replication state:
    R6: <Leaf>¶
Replication segment at R7:¶
Replication segment <R-ID,R7>:
  Replication SID: R-SID7
  Replication state:
    R7: <Leaf>¶
When a packet is steered into the Replication segment at R1:¶
- Since R1 is directly connected to R2, R1 performs PUSH operation with just <R-SID2> label for the replicated copy and sends it to R2 on interface L12. R2, as Leaf, performs NEXT operation, pops R-SID2 label and delivers the payload.¶
- R1 performs PUSH operation with <N-SID6, R-SID6> label stack for the replicated copy to R6 and sends it to R2, the nexthop on shortest path to R6. R2 performs CONTINUE operation on N-SID6 and forwards it to R3. R3 is the penultimate hop for N-SID6; it performs penultimate hop popping, which corresponds to the NEXT operation and the packet is then sent to R6 with <R-SID6> in the label stack. R6, as Leaf, performs NEXT operation, pops R-SID6 label and delivers the payload.¶
- R1 performs PUSH operation with <N-SID4, A-SID47, R-SID7> label stack for the replicated copy to R7 and sends it to R2, the nexthop on shortest path to R4. R2 is the penultimate hop for N-SID4; it performs penultimate hop popping, which corresponds to the NEXT operation, and the packet is then sent to R4 with <A-SID47, R-SID7> in the label stack. R4 performs NEXT operation, pops A-SID47, and delivers the packet to R7 with <R-SID7> in the label stack. R7, as Leaf, performs NEXT operation, pops R-SID7 label and delivers the payload.¶
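The label operations in the walk-through above can be traced with a short script. The following Python sketch is illustrative only and is not part of the specification: the names R1_STATE, PATHS and walk are hypothetical, and the node sequence followed by each copy is hard-coded from the text rather than computed from an IGP.¶

```python
# Illustrative sketch only: all names below are hypothetical, not from the draft.

# Replication state at R1 from the example (top of the label stack first).
R1_STATE = {
    "R2": ["R-SID2"],
    "R6": ["N-SID6", "R-SID6"],
    "R7": ["N-SID4", "A-SID47", "R-SID7"],
}

# Node sequence each replicated copy follows, as described in the text.
PATHS = {
    "R2": ["R1", "R2"],
    "R6": ["R1", "R2", "R3", "R6"],
    "R7": ["R1", "R2", "R4", "R7"],
}


def walk(downstream):
    """Return [(node, operation, label stack after the operation), ...]."""
    path = PATHS[downstream]
    stack = list(R1_STATE[downstream])
    trace = [(path[0], "PUSH", tuple(stack))]
    for i in range(1, len(path)):
        here = path[i][1:]  # node number, e.g. "4" for R4
        nxt = path[i + 1][1:] if i + 1 < len(path) else None
        top = stack[0]
        if top == f"R-SID{here}":
            # Leaf: pop the Replication SID and deliver the payload.
            stack.pop(0)
            op = "NEXT"
        elif top == f"A-SID{here}{nxt}":
            # Adjacency-SID owned by this node: pop it, forward on that link.
            stack.pop(0)
            op = "NEXT"
        elif nxt is not None and top == f"N-SID{nxt}":
            # Penultimate hop for the Node-SID: penultimate hop popping.
            stack.pop(0)
            op = "NEXT"
        elif top.startswith("N-SID"):
            # Transit node on the shortest path: the label stays on the stack.
            op = "CONTINUE"
        else:
            raise ValueError(f"unexpected label {top} at R{here}")
        trace.append((path[i], op, tuple(stack)))
    return trace


for node in ("R2", "R6", "R7"):
    for hop in walk(node):
        print(node, hop)
```

For the copy to R6, the trace lists PUSH at R1, CONTINUE at R2, and NEXT at both R3 and R6, which matches the second bullet above.¶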
A.2. SRv6
For SRv6, we use the SID allocation scheme, reproduced below, from Illustrations for SRv6 Network Programming [I-D.filsfils-spring-srv6-net-pgm-illustration]:¶
- 2001:db8::/32 is an IPv6 block allocated by a Regional Internet Registry (RIR) to the operator¶
- 2001:db8:0::/48 is dedicated to the internal address space¶
- 2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space¶
- We assume a location expressed in 64 bits and a function expressed in 16 bits¶
- Node k has a classic IPv6 loopback address 2001:db8::k/128 which is advertised in the Interior Gateway Protocol (IGP)¶
- Node k has 2001:db8:cccc:k::/64 for its local SID space. Its SIDs will be explicitly assigned from that block¶
- Node k advertises 2001:db8:cccc:k::/64 in its IGP¶
- Function :1:: (function 1, for short) represents the End function with Penultimate Segment Pop of SRH (PSP) and Ultimate Segment Decapsulation (USD) support [RFC8986]¶
- Function :Cn:: (function Cn, for short) represents the End.X function to Node n with PSP and USD support¶
Each node k has:¶
- An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an End function with additional support for PSP and USD¶
- An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an End.X function to neighbor j with additional support for PSP and USD¶
- An explicit SID instantiation 2001:db8:cccc:k:Fk::/128 bound to an End.Replicate function¶
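The allocation scheme above can be checked mechanically. The following Python sketch derives the example SIDs from a node number using the standard ipaddress module. It is an illustration under stated assumptions, not part of the specification: the helper names are hypothetical, node numbers are assumed to be single hexadecimal digits, and "Cj" and "Fk" are read as the hexadecimal digit C or F followed by the node number.¶

```python
import ipaddress

# Internal SRv6 SID space from the allocation scheme.
SID_BLOCK = ipaddress.IPv6Network("2001:db8:cccc::/48")


def locator(k):
    """64-bit locator of node k: 2001:db8:cccc:k::/64."""
    assert 0 < k < 16, "assumption: node numbers are single hex digits"
    return ipaddress.IPv6Network(f"2001:db8:cccc:{k:x}::/64")


def sid(k, function):
    """Place a 16-bit function value directly after the 64-bit locator."""
    base = int(locator(k).network_address)
    return ipaddress.IPv6Address(base | (function << 48))


def end_sid(k):
    """Function 1: End with PSP and USD support."""
    return sid(k, 0x1)


def end_x_sid(k, j):
    """Function Cj: End.X from node k to neighbor j."""
    assert 0 < j < 16, "assumption: node numbers are single hex digits"
    return sid(k, 0xC0 | j)


def replication_sid(k):
    """Function Fk: End.Replicate at node k."""
    return sid(k, 0xF0 | k)


print(end_sid(1))          # End SID of node 1
print(end_x_sid(4, 7))     # End.X SID at R4 toward R7
print(replication_sid(7))  # Replication SID of R7
print(replication_sid(7) in SID_BLOCK)
```

Note that ipaddress prints addresses in lowercase compressed form, so 2001:db8:cccc:7:F7::0 appears as 2001:db8:cccc:7:f7::.¶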
Assume a Replication segment identified with R-ID at Replication node R1 and downstream nodes R2, R6 and R7. The Replication SID at node k, bound to an End.Replicate function, is 2001:db8:cccc:k:Fk::/128. A packet replicated from R1 to R7 has to traverse R4.¶
The Replication segment state at nodes R1, R2, R6 and R7 is shown below. Note nodes R3, R4 and R5 do not have state for the Replication segment. The state representation uses "R-SID->Lmn" to represent a packet replication with outgoing replication SID R-SID sent on interface Lmn. "SL" represents an optional segment list used to steer a replicated packet on a specific path to a Downstream node.¶
Replication segment at R1:¶
Replication segment <R-ID,R1>:
  Replication SID: 2001:db8:cccc:1:F1::0
  Replication state:
    R2: <2001:db8:cccc:2:F2::0->L12>
    R6: <2001:db8:cccc:6:F6::0>
    R7: <2001:db8:cccc:4:C7::0>, SL: <2001:db8:cccc:7:F7::0>¶
Replication to R2 steers the packet directly to R2 on interface L12. Replication to R6, using 2001:db8:cccc:6:F6::0, steers the packet via shortest path to that node. Replication to R7 is steered via R4, using H.Encaps.Red with End.X SID 2001:db8:cccc:4:C7::0 at R4 to R7.¶
Replication segment at R2:¶
Replication segment <R-ID,R2>:
  Replication SID: 2001:db8:cccc:2:F2::0
  Replication state:
    R2: <Leaf>¶
Replication segment at R6:¶
Replication segment <R-ID,R6>:
  Replication SID: 2001:db8:cccc:6:F6::0
  Replication state:
    R6: <Leaf>¶
Replication segment at R7:¶
Replication segment <R-ID,R7>:
  Replication SID: 2001:db8:cccc:7:F7::0
  Replication state:
    R7: <Leaf>¶
When a packet, (A,B2), is steered into the Replication segment at R1:¶
- Since R1 is directly connected to R2, R1 creates encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:2:F2::0) (A, B2), and sends it to R2 on interface L12. R2, as Leaf, removes outer IPv6 header and delivers the payload.¶
- R1 creates encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:6:F6::0) (A, B2) then forwards the resulting packet on the shortest path to 2001:db8:cccc:6::/64. R2 and R3 forward the packet using 2001:db8:cccc:6::/64. R6, as Leaf, removes outer IPv6 header and delivers the payload.¶
- R1 has to steer packet to Downstream node R7 via node R4. It can do this in one of two ways:¶
- R1 creates encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) and then performs H.Encaps.Red using the SL to create (2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) packet. It sends this packet to R2, the nexthop on shortest path to 2001:db8:cccc:4::/64. R2 forwards packet to R4 using 2001:db8:cccc:4::/64. R4 executes End.X function on 2001:db8:cccc:4:C7::0, performs USD action, removes outer IPv6 encapsulation and sends resulting packet (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as Leaf, removes outer IPv6 header and delivers the payload.¶
- R1 is Root of replication segment. Therefore, it can combine above encapsulations to create encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8:cccc:7:F7::0; SL=1) (A, B2) and sends it to R2, the nexthop on shortest path to 2001:db8:cccc:4::/64. R2 forwards packet to R4 using 2001:db8:cccc:4::/64. R4 executes End.X function on 2001:db8:cccc:4:C7::0, performs PSP action, removes SRH and sends resulting packet (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as Leaf, removes outer IPv6 header and delivers the payload.¶
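Both ways of steering the copy to R7 deliver the same packet to R7. The following Python sketch models the two encapsulations and the End.X processing at R4 to show this. It is a simplified illustration, not an implementation of [RFC8986]: the class and function names are hypothetical, the payload (A, B2) is omitted, and USD is modeled simply as "no SRH present on the outer header".¶

```python
from dataclasses import dataclass
from typing import List, Optional

R1_LOOPBACK = "2001:db8::1"
END_X_R4_TO_R7 = "2001:db8:cccc:4:C7::0"
R_SID_R7 = "2001:db8:cccc:7:F7::0"


@dataclass
class IPv6Header:
    src: str
    dst: str
    srh: Optional[List[str]] = None  # reduced segment list, None if no SRH
    segments_left: int = 0


def encap_two_headers():
    """First way: replication encapsulation, then H.Encaps.Red with the SL."""
    inner = IPv6Header(R1_LOOPBACK, R_SID_R7)
    outer = IPv6Header(R1_LOOPBACK, END_X_R4_TO_R7)
    return [outer, inner]  # outermost header first


def encap_combined():
    """Second way: one IPv6 header whose SRH carries the Replication SID."""
    return [IPv6Header(R1_LOOPBACK, END_X_R4_TO_R7,
                       srh=[R_SID_R7], segments_left=1)]


def end_x_at_r4(headers):
    """Simplified End.X processing at R4 for the SID 2001:db8:cccc:4:C7::0."""
    outer = headers[0]
    assert outer.dst == END_X_R4_TO_R7
    if outer.srh is None:
        # USD: no segments remain, so remove the outer IPv6 header.
        return headers[1:]
    # Decrement Segments Left and copy the next segment into the DA.
    sl = outer.segments_left - 1
    updated = IPv6Header(outer.src, outer.srh[sl], outer.srh, sl)
    if sl == 0:
        # PSP: remove the SRH once Segments Left reaches zero.
        updated.srh = None
    return [updated] + headers[1:]


to_r7_usd = end_x_at_r4(encap_two_headers())
to_r7_psp = end_x_at_r4(encap_combined())
print(to_r7_usd)
print(to_r7_psp)
print(to_r7_usd == to_r7_psp)
```

In both cases R7 receives a single IPv6 header with source 2001:db8::1 and the Replication SID of R7 as destination. The second way avoids carrying a second IPv6 header between R1 and R4.¶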
A.2.1. Pinging Replication SID
This section illustrates ping of a Replication SID.¶
Node R1 pings replication SID of node R6 directly by sending the following packet:¶
- R1 to R6: (2001:db8::1, 2001:db8:cccc:6:F6::0; NH=ICMPv6) (ICMPv6 Echo Request)¶
- Node R6 as a Leaf processes upper layer ICMPv6 Echo Request and responds with ICMPv6 Echo Reply¶
Node R1 pings Replication SID of R7 via R4 by sending the following packet with SRH:¶
- R1 to R4: (2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8:cccc:7:F7::0; SL=1; NH=ICMPv6) (ICMPv6 Echo Request)¶
- R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 Echo Request)¶
- Node R7 as a Leaf processes upper layer ICMPv6 Echo Request and responds with ICMPv6 Echo Reply¶
Assume node R4 is a transit Replication node with Replication SID 2001:db8:cccc:4:F4::0 replicating to R7. Node R1 pings Replication SID of R7 via Replication SID of R4 as follows:¶
- R1 to R4: (2001:db8::1, 2001:db8:cccc:4:F4::0; NH=ICMPv6) (ICMPv6 Echo Request)¶
- R4 replicates to R7 by replacing IPv6 destination address with Replication SID of R7 from its Replication state¶
- R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 Echo Request)¶
- Node R7 as a Leaf processes upper layer ICMPv6 Echo Request and responds with ICMPv6 Echo Reply¶
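The last case, a ping that traverses a transit Replication node, can be sketched as a lookup in the Replication state followed by a destination address rewrite. The Python below is a minimal illustration under stated assumptions: the names REPLICATION_STATE, end_replicate and ping are hypothetical, a node with an empty downstream list is treated as a Leaf, and the Echo Reply is represented only by the address it is sent to.¶

```python
ECHO_REQUEST = "ICMPv6 Echo Request"

# Replication SID -> Replication SIDs of the Downstream nodes (hypothetical
# representation of the Replication state used in this example).
REPLICATION_STATE = {
    "2001:db8:cccc:4:F4::0": ["2001:db8:cccc:7:F7::0"],  # R4 replicates to R7
    "2001:db8:cccc:7:F7::0": [],                         # R7 is a Leaf
}


def end_replicate(packet):
    """Return one copy per Downstream node, with the DA replaced."""
    src, dst, upper = packet
    return [(src, downstream_sid, upper)
            for downstream_sid in REPLICATION_STATE[dst]]


def ping(packet):
    """Follow a packet to the Leaf nodes.

    Returns a list of (leaf Replication SID, address the Echo Reply goes to).
    """
    src, dst, upper = packet
    copies = end_replicate(packet)
    if not copies:
        # Leaf: process the upper-layer ICMPv6 Echo Request and reply.
        return [(dst, src)] if upper == ECHO_REQUEST else []
    replies = []
    for copy in copies:
        replies.extend(ping(copy))
    return replies


# R1 pings the Replication SID of R7 via the Replication SID of R4.
print(ping(("2001:db8::1", "2001:db8:cccc:4:F4::0", ECHO_REQUEST)))
```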