Performance Measurement for Segment Routing Networks with MPLS Data Plane
draft-ietf-mpls-rfc6374-sr-17
Yes
Jim Guichard
No Objection
Deb Cooley
Erik Kline
Mahesh Jethanandani
Orie Steele
Note: This ballot was opened for revision 11 and is now closed.
Jim Guichard
Yes
Deb Cooley
No Objection
Erik Kline
No Objection
Gunter Van de Velde
No Objection
Comment
(2024-10-10 for -12)
Sent
# Gunter Van de Velde, RTG AD, comments for draft-ietf-mpls-rfc6374-sr-12

# line numbers are derived with the idnits tool https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-mpls-rfc6374-sr-12.txt

# Thank you for this document. It reads well and explains the objectives well. It is a useful solution.

# In this review you will find observations I had when reviewing the document. I suspect the majority will be easy to resolve or address and will be clarifications.

#DETAILED COMMENTS
#=================

19 Segment Routing (SR) leverages the source routing paradigm. SR
20 applies to the Multiprotocol Label Switching data plane (SR-MPLS) as
21 specified in RFC 8402. RFC 6374 and RFC 7876 specify protocol
22 mechanisms to enable efficient and accurate measurement of packet
23 loss, one-way and two-way delay, as well as related metrics such as
24 delay variation in MPLS networks. RFC 9341 defines the Alternate-
25 Marking Method using Block Number as a data correlation mechanism for
26 packet loss measurement. This document utilizes mechanisms from RFC
27 6374, RFC 7876, and RFC 9341 for performance delay and loss
28 measurements in SR-MPLS networks, covering both links and end-to-end
29 SR-MPLS paths, including SR Policies.

GV> What about the following proposal for a more compact abstract explaining the objective of the document:

"
This document specifies the application of the MPLS loss and delay measurement techniques, originally defined in RFC 6374, RFC 7876, and RFC 9341 within Segment Routing (SR) networks that utilize the MPLS data plane. Segment Routing enables the forwarding of packets through an ordered list of instructions, known as segments, which are imposed at the ingress node. By applying the mechanisms from RFC 6374, RFC 7876, and RFC 9341 to SR-MPLS networks, this document facilitates accurate measurement of packet loss and delay for Segment Routing paths.
It defines the procedures and extensions necessary to perform performance monitoring and fault management in SR-MPLS environments, ensuring that network operators can effectively measure and maintain the quality of service across their SR-based MPLS networks. This includes coverage of links and end-to-end SR-MPLS paths, as well as SR Policies.
"

104 Segment Routing (SR) leverages the source routing paradigm. SR
105 applies to both Multiprotocol Label Switching (SR-MPLS) and IPv6
106 (SRv6) data planes as specified in [RFC8402]. SR takes advantage of
107 the Equal-Cost Multipaths (ECMPs) between source and transit nodes,
108 between transit nodes and between transit and destination nodes. SR
109 Policies as defined in [RFC9256] are used to steer traffic through
110 specific, user-defined paths using a list of Segments. A
111 comprehensive SR Performance Measurement toolset is one of the
112 essential requirements for measuring network performance to provide
113 Service Level Agreements (SLAs).

GV> This section reads strange. What about the following readability rewrite? I split the requirement for a comprehensive toolset off from the paragraph to add to its importance.

"
Segment Routing (SR), as specified in [RFC8402], leverages the source routing paradigm and applies to both the Multiprotocol Label Switching (SR-MPLS) and IPv6 (SRv6) data planes. SR takes advantage of Equal-Cost Multipaths (ECMPs) between source and transit nodes, between transit nodes, and between transit and destination nodes. SR Policies, defined in [RFC9256], are used to steer traffic through specific, user-defined paths using a list of segments.

A comprehensive SR Performance Measurement toolset is one of the essential requirements for measuring network performance to provide Service Level Agreements (SLAs).
"

121 defined in [RFC6374]. These mechanisms are also well-suited to SR-
122 MPLS networks.
GV> s/are also well-suited to SR-MPLS networks./can be applied to SR-MPLS networks./

132 This document defines Return Path and Block Number TLV extensions for
133 [RFC6374] for performance delay and loss measurement in SR-MPLS

GV> The document talks about "performance delay". Is there a definition of what that exactly means or describes?

132 This document defines Return Path and Block Number TLV extensions for
133 [RFC6374] for performance delay and loss measurement in SR-MPLS
134 networks. These TLV extensions also apply to the MPLS Label Switched
135 Paths (LSPs) [RFC3031]. However, the procedure for performance delay
136 and loss measurement of MPLS LSPs is outside the scope of this
137 document.

GV> Should it be explicitly mentioned here that this document proposes a new registry for "Return Path Sub-TLV Type" and allocates values for Mandatory TLV Types for [RFC6374] from the "MPLS Loss/Delay Measurement TLV Object", to be more accurate?

197 SR is enabled with MPLS data plane on nodes Q1 and R1. The nodes Q1
198 and R1 may be directly connected via a link enabled with MPLS
199 (Section 2.9.1 of [RFC6374]) or a Point-to-Point (P2P) SR-MPLS path
200 [RFC8402]. The link may be a physical interface, a virtual link, or
201 a Link Aggregation Group (LAG) [IEEE802.1AX], or LAG member link.
202 The SR-MPLS path may be an SR-MPLS Policy [RFC9256] on node Q1
203 (called head-end) with destination to node R1 (called tail-end).

GV> This paragraph reads a bit odd. I think it intends to describe all the different manners to connect Q1 with R1. Why is this section not using the terminology from section 2.9.1 that talks about "Types of Channel"? A channel may be a link or some other construct.

220 For delay and loss measurement in SR-MPLS networks, the procedures

GV> Earlier the terminology "performance delay" was used. Is that different from "delay" mentioned here? I suspect it is the same. Maybe define once and inform that performance delay = delay as used in this document.

218 3. Overview

220 For delay and loss measurement in SR-MPLS networks, the procedures
221 defined in [RFC6374], [RFC7876], and [RFC9341] are used in this
222 document. Note that the one-way, two-way, and round-trip delay
223 measurements are defined in Section 2.4 of [RFC6374] and are further
224 described in this document for SR-MPLS networks. Similarly, the
225 packet loss measurement is defined in Section 2.2 of [RFC6374] and is
226 further described in this document for SR-MPLS networks.

228 The packet loss measurement using Alternate-Marking Method defined in
229 [RFC9341] may use Block Number for data correlation. This is
230 achieved by using the Block Number TLV extension defined in this
231 document.

233 In SR-MPLS networks, the query and response messages defined in
234 [RFC6374] are sent as follows:

236 * For delay measurement, the query messages MUST be sent on the same
237 path as data traffic for links and end-to-end SR-MPLS paths to
238 collect both transmit and receive timestamps.

240 * For loss measurement, the query messages MUST be sent on the same
241 path as data traffic for links and end-to-end SR-MPLS paths to
242 collect both transmit and receive traffic counters.

244 If it is desired in SR-MPLS networks that the same path (same set of
245 links and nodes) between the querier and responder be used in both
246 directions of the measurement, it is achieved by using the Return
247 Path TLV extension defined in this document.

249 The performance measurement procedure for links can be used to
250 compute extended Traffic Engineering (TE) metrics for delay and loss
251 as described in this document. The metrics are advertised in the
252 network using the routing protocol extensions defined in [RFC7471],
253 [RFC8570], and [RFC8571].

GV> This section could use some edits to make the text easier to digest when reading.
What about the following:

"
In this document, the procedures defined in [RFC6374], [RFC7876], and [RFC9341] are utilized for delay and loss measurement in SR-MPLS networks. Specifically, the one-way, two-way, and round-trip delay measurements described in Section 2.4 of [RFC6374] are further elaborated for application within SR-MPLS networks. Similarly, the packet loss measurement procedures outlined in Section 2.2 of [RFC6374] are extended for use in SR-MPLS networks.

Packet loss measurement using the Alternate-Marking Method defined in [RFC9341] may employ the Block Number for data correlation. This is achieved by utilizing the Block Number TLV extension defined in this document.

In SR-MPLS networks, the query and response messages defined in [RFC6374] are transmitted as follows:

* For delay measurement, the query messages MUST be sent along the same path as the data traffic for links and end-to-end SR-MPLS paths to collect both transmit and receive timestamps.

* For loss measurement, the query messages MUST be sent along the same path as the data traffic for links and end-to-end SR-MPLS paths to collect both transmit and receive traffic counters.

If it is desired in SR-MPLS networks that the same path (i.e., the same set of links and nodes) between the querier and responder be used in both directions of the measurement, this can be achieved by using the Return Path TLV extension defined in this document.

The performance measurement procedures for links can be used to compute extended Traffic Engineering (TE) metrics for delay and loss, as described herein. These metrics are advertised in the network using the routing protocol extensions defined in [RFC7471], [RFC8570], and [RFC8571].
"

261 The query message as defined in [RFC6374] is sent over the links for
262 both delay and loss measurement. In each Label Stack Entry (LSE)
263 [RFC3032] in the MPLS label stack, the TTL value MUST be set to 255.

GV> What is the motivation to set this to 255?
Would that, in case the routing is bad, not potentially cause looping packets? Would a TTL = 1 not be a protection mechanism against this potential attack vector?

267 An SR-MPLS Policy Candidate-Path may contain a number of Segment
268 Lists (SLs) (i.e., stack of MPLS labels) [RFC9256]. For delay and/or
269 loss measurement for an end-to-end SR-MPLS Policy, the query messages
270 MUST be transmitted for every SL of the SR-MPLS Policy Candidate-
271 Path.

GV> From a clarification perspective: If a single SR Policy has 3 Segment Lists, then it MUST transmit 3 individual queries. How is tracking done to correlate which response aligns with which query? Or how are all queries responded to?

345 The loopback measurement mode defined in Section 2.8 of [RFC6374] is
346 used to measure round-trip delay for a bidirectional circular SR-MPLS
347 path. In this mode for SR-MPLS, the received query messages are not

GV> Not sure what 'circular' accurately describes. Is this when the upstream and downstream paths are exactly the same? Is this when the segment lists are identical but in reverse order? Is this something else?

354 The loopback mode is done by generating "queries" with the Response
355 flag set to 1 and adding the Loopback Request object (Type 3)
356 [RFC6374]. The label stack, as shown in Figure 2, in query messages
357 in this case carries both the forward and reverse paths in the MPLS
358 header. The GAL is still carried at the bottom of the label stack
359 (with S=1) (example as shown in Figure 2).

GV> Maybe a clarification. Is it assumed here that the segments in the stack are node segments? I suspect that adj segments MUST not be used? Maybe this is detailed in other specifications though for this type of measurement?

363 5.1. Delay Measurement Message Format

GV> Earlier 'performance' measurements terminology was used.
I suspect it is all identical, but maybe add text to explicitly point that out.

532 The Return Path TLV is defined in the Mandatory TLV Type registry
533 space [RFC6374]. The querier MUST only insert one Return Path TLV in
534 the query message. The responder that supports this TLV, MUST only
535 process the first Return Path TLV and ignore the other Return Path
536 TLVs if present. The responder that supports this TLV, also MUST
537 send response message back on the return path specified in the Return
538 Path TLV. The responder also MUST NOT add Return Path TLV in the
539 response message. The Reserved field MUST be set to 0 and MUST be
540 ignored on the receive side.

GV> Is this impacted in any way when the query is sent on all SLs within the SR Policy? Will all these queries use the same return path? Would this not be a restriction of using the Return Path TLV, that all return responses are using the same return path?

566 The MPLS Label Stack contains a list of 32-bit LSE that includes a
567 20-bit label value, 8-bit TTL value, 3-bit TC value, and 1-bit EOS
568 (S) field. An MPLS Label Stack Sub-TLV may carry a stack of labels
569 or a Binding SID label [RFC8402] of the Return SR-MPLS Policy.

GV> Is there any dependency on labels corresponding to node or adjacency SIDs?

616 8. ECMP for SR-MPLS Policies

GV> Is this ECMP when, for example, there are multiple paths between two hops in the segment list (for example, multiple parallel links between two nodes)? Or is this ECMP between multiple SLs (Segment Lists) that exist within an SR Policy? I suspect this is about the first, but am not sure.

639 The extended TE metrics for link delay and loss can be computed using
640 the performance measurement procedures described in this document and
641 advertised in the routing domain as follows:

GV> Is this document defining how to compute the measurement metrics? Would it not be the process for measuring performance metrics?
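Editorial illustration (not part of the ballot comment): the LSE layout quoted at idnits lines 566-569, together with the TTL of 255 and the GAL at the bottom of the stack with S=1 quoted earlier, can be sketched as follows. This is a minimal sketch that assumes the standard RFC 3032 bit order (label, TC, S, TTL) and the GAL label value 13 from RFC 5586. The function names and the example label values 16001 and 16002 are hypothetical and do not come from the draft.

```python
import struct

GAL_LABEL = 13  # Generic Associated Channel Label value (RFC 5586)


def pack_lse(label: int, tc: int = 0, s: int = 0, ttl: int = 255) -> bytes:
    """Encode one 32-bit Label Stack Entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or s not in (0, 1) or not 0 <= ttl <= 255:
        raise ValueError("tc, s, or ttl out of range")
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def unpack_lse(data: bytes) -> dict:
    """Decode the first 4 bytes of data as one Label Stack Entry."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


def build_stack(labels: list) -> bytes:
    """Hypothetical helper: segment labels followed by the GAL with S=1."""
    entries = [pack_lse(label, s=0) for label in labels]
    entries.append(pack_lse(GAL_LABEL, s=1))
    return b"".join(entries)


if __name__ == "__main__":
    stack = build_stack([16001, 16002])
    for offset in range(0, len(stack), 4):
        print(unpack_lse(stack[offset:offset + 4]))
```

Only the last entry carries S=1, which is how a receiver finds the bottom of the stack regardless of whether the labels above it are node or adjacency SIDs.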
John Scudder
No Objection
Comment
(2024-10-17 for -16)
Sent
Thanks for this well-written document. I have a few small comments.

### Section 2.3

   The channel may be a directly connected link enabled with MPLS (Section 2.9.1 of [RFC6374]) or a Point-to-Point (P2P) SR-MPLS path [RFC8402]. The link may be a physical interface, a virtual link, or

I don't see either "point-to-point" or "p2p" in RFC 8402. Would the quoted section be just as valid if you removed these adjectives?

### Section 4.1.2

   by the intended responder, the Destination Address TLV (Type 129) [RFC6374] containing the address of the responder can be sent in the

Shouldn't that be "an address of the responder" since in general it will have more than one?

### Section 9

   responder. If the responder does not support the new Mandatory TLV Types defined in this document; it MUST return Error 0x17: Unsupported Mandatory TLV Object as per [RFC6374].

That seems like a misuse of the RFC 2119 keyword; by definition, this spec can't dictate any requirements to a responder that doesn't implement this spec. Probably you mean something like,

NEW:

   responder. If the responder does not support the new Mandatory TLV Types defined in this document it will return Error 0x17: Unsupported Mandatory TLV Object as per [RFC6374].

(I also removed a spurious semicolon.)
Mahesh Jethanandani
No Objection
Murray Kucherawy
No Objection
Comment
(2024-10-16 for -16)
Sent
There are two SHOULDs in this document, and I wonder for each of them what the impact is if an implementer were to deviate from that advice, or under what circumstances one might legitimately do so. There might be more to say in each case.
Orie Steele
No Objection
Paul Wouters
No Objection
Comment
(2024-10-15 for -14)
Sent
   The Length is a one-byte field and is equal to the length of the Return Path Sub-TLV and the Reserved field in bytes. Length MUST NOT be 0.

Since the Reserved field is two octets, doesn't this mean the Length field MUST NOT be 0 or 1? (And likely more, since the Sub-TLV also has a minimal structure length.) This occurs twice in the document.

The Security Considerations contains both "may" and "MAY" in a similar way (as in, I think these should be done similarly, but I don't think it matters which is picked).
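Editorial illustration (not part of the ballot comment): the arithmetic behind this point can be sketched as a receive-side check. This is a minimal sketch under stated assumptions: the two-octet Reserved field comes from the comment above, while the function name and the `min_sub_tlv_len` parameter are hypothetical, since the minimal size of a Return Path Sub-TLV is not given in this ballot text.

```python
RESERVED_LEN = 2  # Reserved field size in octets, as stated in the comment above


def length_is_plausible(length: int, min_sub_tlv_len: int = 0) -> bool:
    """Return True if a one-byte Length value can cover Reserved plus a Sub-TLV.

    min_sub_tlv_len is a hypothetical knob for the minimal Sub-TLV size;
    the default of 0 only enforces room for the Reserved field.
    """
    if not 0 <= length <= 255:
        raise ValueError("Length is a one-byte field")
    return length >= RESERVED_LEN + min_sub_tlv_len


if __name__ == "__main__":
    for value in (0, 1, 2, 6):
        print(value, length_is_plausible(value))
```

With the default argument the check rejects both 0 and 1, which is the stricter bound the comment asks about; a non-zero `min_sub_tlv_len` raises the bound further.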
Roman Danyliw
(was Discuss)
No Objection
Comment
(2024-10-16 for -15)
Sent
Thank you to Roni Even for the GENART review.

Thank you for addressing my previous DISCUSS feedback.

** Section 13. Is the WG confident that such a small range of only 176-239 is best allocated through an FCFS registration policy?
Warren Kumari
No Objection
Comment
(2024-10-09 for -12)
Sent
Thank you for writing this document, I found it interesting and useful. Also, thank you *very* much to Dhruv Dhody for the excellent OpsDir review (https://datatracker.ietf.org/doc/review-ietf-mpls-rfc6374-sr-11-opsdir-lc-dhody-2024-09-04/), and for addressing their comments.
Zaheduzzaman Sarker
No Objection
Comment
(2024-10-17 for -16)
Sent
Éric Vyncke
No Objection
Comment
(2024-10-16 for -15)
Sent
# Éric Vyncke, INT AD, comments for draft-ietf-mpls-rfc6374-sr-14

Thank you for the work put into this document, it is always important to understand the performance of a system.

Please find below some non-blocking COMMENT points (but replies would be appreciated even if only for my own education), and some nits.

Special thanks to Tony Li for the shepherd's write-up including the WG consensus *and* the justification of the intended status.

Other thanks to Brian Haberman, the Internet directorate reviewer (at my request), he found no issue in his int-dir review: https://datatracker.ietf.org/doc/review-ietf-mpls-rfc6374-sr-13-intdir-telechat-haberman-2024-10-10/

I hope that this review helps to improve the document,

Regards,

-éric

# COMMENTS (non-blocking)

## IPPM

Was this document also reviewed by the IPPM WG where there is a lot of knowledge about measurement?

## Section 3

Make the reader's task easy by providing a section reference for `Return Path TLV extension`.

## Section 4.2.3

Where is "circular SR-MPLS path" defined? Suggest adding a figure showing the forward & return paths in the label stack.

## Sections 5.1 and 6.1

Suggest not using sub-sections but simply having the text in sections 5 and 6.

## Section 6.2

This seems a weird location in the flow. Suggest merging 5 & 6 in a single section "Measurements".

## Section 7.1

I was about to DISCUSS this point, but shouldn't it be s/Length MUST NOT be 0/Length MUST NOT be less than 2/ to take into account the length of the Reserved field?

Suggest having the last sentence (about the Reserved field) in its own paragraph.

(Same for section 7.1.1)

# NITS (non-blocking / cosmetic)

## Abstract

The abstract could be shorter, e.g., by not explaining what SR is or not listing twice a set of RFCs.

## Section 2.3

Consider using aasvg for the graphical elements (nicer on HTML rendering).

## Section 3

Consider avoiding text repetition in the delay / loss bullets.

## Section 9 (and possibly others)

s/ISIS/IS-IS/