Performance Measurement for Segment Routing Networks with MPLS Data Plane
draft-ietf-mpls-rfc6374-sr-17

Yes

Jim Guichard

No Objection

Deb Cooley
Erik Kline
Mahesh Jethanandani
Orie Steele

Note: This ballot was opened for revision 11 and is now closed.

Jim Guichard
Yes
Deb Cooley
No Objection
Erik Kline
No Objection
Gunter Van de Velde
No Objection
Comment (2024-10-10 for -12) Sent
# Gunter Van de Velde, RTG AD, comments for draft-ietf-mpls-rfc6374-sr-12

# line numbers are derived with the idnits tool https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-mpls-rfc6374-sr-12.txt
# Thank you for this document. It reads well and explains the objectives well. It is a useful solution.
# In this review you will find observations I had when reviewing the document. I suspect the majority will be easy to resolve or address and will be clarifications.

#DETAILED COMMENTS
#=================

19	   Segment Routing (SR) leverages the source routing paradigm.  SR
20	   applies to the Multiprotocol Label Switching data plane (SR-MPLS) as
21	   specified in RFC 8402.  RFC 6374 and RFC 7876 specify protocol
22	   mechanisms to enable efficient and accurate measurement of packet
23	   loss, one-way and two-way delay, as well as related metrics such as
24	   delay variation in MPLS networks.  RFC 9341 defines the Alternate-
25	   Marking Method using Block Number as a data correlation mechanism for
26	   packet loss measurement.  This document utilizes mechanisms from RFC
27	   6374, RFC 7876, and RFC 9341 for performance delay and loss
28	   measurements in SR-MPLS networks, covering both links and end-to-end
29	   SR-MPLS paths, including SR Policies.

GV> What about the following proposal for a more compact abstract explaining the objective of the document:

"
This document specifies the application of the MPLS loss and delay measurement techniques, originally defined in RFC 6374, RFC 7876, and RFC 9341 within Segment Routing (SR) networks that utilize the MPLS data plane. Segment Routing enables the forwarding of packets through an ordered list of instructions, known as segments, which are imposed at the ingress node. By applying the mechanisms from RFC 6374, RFC 7876, and RFC 9341 to SR-MPLS networks, this document facilitates accurate measurement of packet loss and delay for Segment Routing paths. It defines the procedures and extensions necessary to perform performance monitoring and fault management in SR-MPLS environments, ensuring that network operators can effectively measure and maintain the quality of service across their SR-based MPLS networks. This includes coverage of links and end-to-end SR-MPLS paths, as well as SR Policies.
"

104	   Segment Routing (SR) leverages the source routing paradigm.  SR
105	   applies to both Multiprotocol Label Switching (SR-MPLS) and IPv6
106	   (SRv6) data planes as specified in [RFC8402].  SR takes advantage of
107	   the Equal-Cost Multipaths (ECMPs) between source and transit nodes,
108	   between transit nodes and between transit and destination nodes.  SR
109	   Policies as defined in [RFC9256] are used to steer traffic through
110	   specific, user-defined paths using a list of Segments.  A
111	   comprehensive SR Performance Measurement toolset is one of the
112	   essential requirements for measuring network performance to provide
113	   Service Level Agreements (SLAs).

GV> This section reads strangely. What about the following readability rewrite? I split the requirement for a comprehensive toolset off from the paragraph to emphasize its importance.

"
Segment Routing (SR), as specified in [RFC8402], leverages the source routing paradigm and applies to both the Multiprotocol Label Switching (SR-MPLS) and IPv6 (SRv6) data planes. SR takes advantage of Equal-Cost Multipaths (ECMPs) between source and transit nodes, between transit nodes, and between transit and destination nodes. SR Policies, defined in [RFC9256], are used to steer traffic through specific, user-defined paths using a list of segments. 

A comprehensive SR Performance Measurement toolset is one of the essential requirements for measuring network performance to provide Service Level Agreements (SLAs).
"

121	   defined in [RFC6374].  These mechanisms are also well-suited to SR-
122	   MPLS networks.

GV> s/are also well-suited to SR-MPLS networks./can be applied to SR-MPLS networks./

132	   This document defines Return Path and Block Number TLV extensions for
133	   [RFC6374] for performance delay and loss measurement in SR-MPLS

GV> The document talks about "performance delay". Is there a definition of what exactly that means or describes?

132	   This document defines Return Path and Block Number TLV extensions for
133	   [RFC6374] for performance delay and loss measurement in SR-MPLS
134	   networks.  These TLV extensions also apply to the MPLS Label Switched
135	   Paths (LSPs) [RFC3031].  However, the procedure for performance delay
136	   and loss measurement of MPLS LSPs is outside the scope of this
137	   document.

GV> Should it be explicitly mentioned here that this document proposes a new registry for "Return Path Sub-TLV Type" and allocates values for Mandatory TLV Types for [RFC6374] from the "MPLS Loss/Delay Measurement TLV Object", to be more accurate?

197	   SR is enabled with MPLS data plane on nodes Q1 and R1.  The nodes Q1
198	   and R1 may be directly connected via a link enabled with MPLS
199	   (Section 2.9.1 of [RFC6374]) or a Point-to-Point (P2P) SR-MPLS path
200	   [RFC8402].  The link may be a physical interface, a virtual link, or
201	   a Link Aggregation Group (LAG) [IEEE802.1AX], or LAG member link.
202	   The SR-MPLS path may be an SR-MPLS Policy [RFC9256] on node Q1
203	   (called head-end) with destination to node R1 (called tail-end).

GV> This paragraph reads a bit odd. I think it intends to describe all the different ways to connect Q1 with R1. Why is this section not using the terminology from Section 2.9.1, which talks about "Types of Channel"? A channel may be a link or some other construct.

220	   For delay and loss measurement in SR-MPLS networks, the procedures

GV> Earlier the terminology "performance delay" was used. Is that different from "delay" mentioned here? I suspect it is the same. Maybe define it once and state that performance delay = delay as used in this document.

218	3.  Overview

220	   For delay and loss measurement in SR-MPLS networks, the procedures
221	   defined in [RFC6374], [RFC7876], and [RFC9341] are used in this
222	   document.  Note that the one-way, two-way, and round-trip delay
223	   measurements are defined in Section 2.4 of [RFC6374] and are further
224	   described in this document for SR-MPLS networks.  Similarly, the
225	   packet loss measurement is defined in Section 2.2 of [RFC6374] and is
226	   further described in this document for SR-MPLS networks.

228	   The packet loss measurement using Alternate-Marking Method defined in
229	   [RFC9341] may use Block Number for data correlation.  This is
230	   achieved by using the Block Number TLV extension defined in this
231	   document.

233	   In SR-MPLS networks, the query and response messages defined in
234	   [RFC6374] are sent as follows:

236	   *  For delay measurement, the query messages MUST be sent on the same
237	      path as data traffic for links and end-to-end SR-MPLS paths to
238	      collect both transmit and receive timestamps.

240	   *  For loss measurement, the query messages MUST be sent on the same
241	      path as data traffic for links and end-to-end SR-MPLS paths to
242	      collect both transmit and receive traffic counters.

244	   If it is desired in SR-MPLS networks that the same path (same set of
245	   links and nodes) between the querier and responder be used in both
246	   directions of the measurement, it is achieved by using the Return
247	   Path TLV extension defined in this document.

249	   The performance measurement procedure for links can be used to
250	   compute extended Traffic Engineering (TE) metrics for delay and loss
251	   as described in this document.  The metrics are advertised in the
252	   network using the routing protocol extensions defined in [RFC7471],
253	   [RFC8570], and [RFC8571].

GV> This section could use some edits to make the text easier to digest when reading. What about the following:

"
In this document, the procedures defined in [RFC6374], [RFC7876], and [RFC9341] are utilized for delay and loss measurement in SR-MPLS networks. Specifically, the one-way, two-way, and round-trip delay measurements described in Section 2.4 of [RFC6374] are further elaborated for application within SR-MPLS networks. Similarly, the packet loss measurement procedures outlined in Section 2.2 of [RFC6374] are extended for use in SR-MPLS networks.

Packet loss measurement using the Alternate-Marking Method defined in [RFC9341] may employ the Block Number for data correlation. This is achieved by utilizing the Block Number TLV extension defined in this document.

In SR-MPLS networks, the query and response messages defined in [RFC6374] are transmitted as follows:

* For delay measurement, the query messages MUST be sent along the same path as the data traffic for links and end-to-end SR-MPLS paths to collect both transmit and receive timestamps.

* For loss measurement, the query messages MUST be sent along the same path as the data traffic for links and end-to-end SR-MPLS paths to collect both transmit and receive traffic counters.

If it is desired in SR-MPLS networks that the same path (i.e., the same set of links and nodes) between the querier and responder be used in both directions of the measurement, this can be achieved by using the Return Path TLV extension defined in this document.

The performance measurement procedures for links can be used to compute extended Traffic Engineering (TE) metrics for delay and loss, as described herein. These metrics are advertised in the network using the routing protocol extensions defined in [RFC7471], [RFC8570], and [RFC8571].
"
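As an illustration of the measurements this section refers to, the arithmetic can be sketched as follows. This is a minimal sketch, not text from the draft: the function names are hypothetical, T1..T4 denote the query transmit, query receive, response transmit, and response receive timestamps, all values are assumed to be in the same units, and counter wrap-around is not handled.

```python
def one_way_delay(t1, t2):
    """Forward one-way delay: responder receive time minus querier transmit time.

    Only meaningful when the querier and responder clocks are synchronized.
    """
    return t2 - t1


def two_way_delay(t1, t2, t3, t4):
    """Two-way delay: forward delay plus reverse delay.

    Equivalent to (t4 - t1) - (t3 - t2), so the responder's processing time
    is excluded and clock synchronization is not required.
    """
    return (t2 - t1) + (t4 - t3)


def packet_loss(tx_prev, tx_now, rx_prev, rx_now):
    """Loss over one interval: packets sent minus packets received.

    tx_* are transmit counters at the sender, rx_* are receive counters at
    the far end, sampled by two consecutive measurement messages.
    """
    return (tx_now - tx_prev) - (rx_now - rx_prev)
```

For example, with hypothetical counters `packet_loss(1000, 1500, 990, 1485)` reports 5 packets lost in the interval.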

261	   The query message as defined in [RFC6374] is sent over the links for
262	   both delay and loss measurement.  In each Label Stack Entry (LSE)
263	   [RFC3032] in the MPLS label stack, the TTL value MUST be set to 255.

GV> What is the motivation for setting this to 255? If the routing is bad, could that not potentially cause looping packets? Would TTL = 1 not be a protection mechanism against this potential attack vector?

267	   An SR-MPLS Policy Candidate-Path may contain a number of Segment
268	   Lists (SLs) (i.e., stack of MPLS labels) [RFC9256].  For delay and/or
269	   loss measurement for an end-to-end SR-MPLS Policy, the query messages
270	   MUST be transmitted for every SL of the SR-MPLS Policy Candidate-
271	   Path. 

GV> From a clarification perspective: if a single SR Policy has 3 Segment Lists, then it MUST transmit 3 individual queries. How is tracking done to correlate which response aligns with which query? Or how are all the queries responded to?

345	   The loopback measurement mode defined in Section 2.8 of [RFC6374] is
346	   used to measure round-trip delay for a bidirectional circular SR-MPLS
347	   path.  In this mode for SR-MPLS, the received query messages are not

GV> Not sure what 'circular' accurately describes. Is this when the upstream and downstream paths are exactly the same? Is this when the segment lists are identical but in reverse order? Is this something else?

354	   The loopback mode is done by generating "queries" with the Response
355	   flag set to 1 and adding the Loopback Request object (Type 3)
356	   [RFC6374].  The label stack, as shown in Figure 2, in query messages
357	   in this case carries both the forward and reverse paths in the MPLS
358	   header.  The GAL is still carried at the bottom of the label stack
359	   (with S=1) (example as shown in Figure 2).

GV> Maybe a clarification: is it assumed here that the segments in the stack are node segments? I suspect that adjacency segments MUST NOT be used? Maybe this is detailed in other specifications for this type of measurement, though.
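To make the quoted loopback description concrete, the label stack could be assembled roughly as below. This is a sketch under stated assumptions: the helper names and the example label values are hypothetical, TTL 255 is applied to every entry by analogy with the link text quoted earlier, and the LSE bit layout follows [RFC3032]. The GAL value 13 is the Generic Associated Channel Label.

```python
import struct

GAL = 13  # Generic Associated Channel Label, carried at the bottom of the stack


def encode_lse(label, tc=0, s=0, ttl=255):
    """Pack one 32-bit Label Stack Entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not (0 <= tc < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("TC, S, or TTL out of range")
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def loopback_label_stack(forward_labels, reverse_labels):
    """Forward path labels, then reverse path labels, then GAL with S=1."""
    labels = list(forward_labels) + list(reverse_labels)
    stack = b"".join(encode_lse(label) for label in labels)
    return stack + encode_lse(GAL, s=1)


# Hypothetical labels: 16001 towards the responder, 16002 back to the querier.
stack = loopback_label_stack([16001], [16002])
```

Whether the forward and reverse labels are node or adjacency segments does not change the encoding; it only changes which values are placed in the two lists.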

363	5.1.  Delay Measurement Message Format

GV> Earlier, 'performance' measurement terminology was used. I suspect it is all identical, but maybe add text to explicitly point that out.

532	   The Return Path TLV is defined in the Mandatory TLV Type registry
533	   space [RFC6374].  The querier MUST only insert one Return Path TLV in
534	   the query message.  The responder that supports this TLV, MUST only
535	   process the first Return Path TLV and ignore the other Return Path
536	   TLVs if present.  The responder that supports this TLV, also MUST
537	   send response message back on the return path specified in the Return
538	   Path TLV.  The responder also MUST NOT add Return Path TLV in the
539	   response message.  The Reserved field MUST be set to 0 and MUST be
540	   ignored on the receive side.

GV> Is this impacted in any way when the query is sent on all SLs within the SR Policy?
Will all these queries use the same return path? Would it not be a restriction of the Return Path TLV that all responses use the same return path?

566	   The MPLS Label Stack contains a list of 32-bit LSE that includes a
567	   20-bit label value, 8-bit TTL value, 3-bit TC value, and 1-bit EOS
568	   (S) field.  An MPLS Label Stack Sub-TLV may carry a stack of labels
569	   or a Binding SID label [RFC8402] of the Return SR-MPLS Policy.

GV> Is there any dependency on whether the labels correspond to node or adjacency SIDs?
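For reference, the 32-bit LSE layout described in the quoted text can be decoded as in the following sketch. The names `LSE` and `decode_label_stack` are hypothetical, and the input is assumed to be the raw value of an MPLS Label Stack Sub-TLV with the sub-TLV header already removed.

```python
import struct
from typing import List, NamedTuple


class LSE(NamedTuple):
    label: int  # 20-bit label value
    tc: int     # 3-bit Traffic Class
    s: int      # 1-bit bottom-of-stack (EOS) flag
    ttl: int    # 8-bit TTL


def decode_label_stack(data: bytes) -> List[LSE]:
    """Split a byte string into 32-bit LSEs and extract their four fields."""
    if len(data) % 4 != 0:
        raise ValueError("label stack must be a multiple of 4 bytes")
    entries = []
    for (word,) in struct.iter_unpack("!I", data):
        entries.append(
            LSE(
                label=word >> 12,
                tc=(word >> 9) & 0x7,
                s=(word >> 8) & 0x1,
                ttl=word & 0xFF,
            )
        )
    return entries
```

A single entry such as a Binding SID label and a multi-label stack are handled the same way; the decoder itself cannot tell node SIDs from adjacency SIDs, which is why the question above matters to the processing rules rather than to the encoding.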

616	8.  ECMP for SR-MPLS Policies

GV> Is this ECMP where, for example, there are multiple paths between two hops in the segment list (such as multiple parallel links between two nodes)? Or is this ECMP between multiple SLs (Segment Lists) that exist within an SR Policy? I suspect it is the former, but I am not sure.

639	   The extended TE metrics for link delay and loss can be computed using
640	   the performance measurement procedures described in this document and
641	   advertised in the routing domain as follows:

GV> Is this document defining how to compute the measurement metrics? Would it not rather be the process for measuring performance metrics?
John Scudder
No Objection
Comment (2024-10-17 for -16) Sent
Thanks for this well-written document. I have a few small comments.

### Section 2.3

   The channel may be a directly connected link enabled with MPLS
   (Section 2.9.1 of [RFC6374]) or a Point-to-Point (P2P) SR-MPLS path
   [RFC8402].  The link may be a physical interface, a virtual link, or

I don't see either "point-to-point" or "p2p" in RFC 8402. Would the quoted section be just as valid if you removed these adjectives? 

### Section 4.1.2

   by the intended responder, the Destination Address TLV (Type 129)
   [RFC6374] containing the address of the responder can be sent in the

Shouldn't that be "an address of the responder" since in general it will have more than one?

### Section 9

   responder.  If the responder does not support the new Mandatory TLV
   Types defined in this document; it MUST return Error 0x17:
   Unsupported Mandatory TLV Object as per [RFC6374].

That seems like a misuse of the RFC 2119 keyword; by definition, this spec can't dictate any requirements to a responder that doesn't implement this spec. Probably you mean something like,

NEW:
   responder.  If the responder does not support the new Mandatory TLV
   Types defined in this document it will return Error 0x17:
   Unsupported Mandatory TLV Object as per [RFC6374].

(I also removed a spurious semicolon.)
Mahesh Jethanandani
No Objection
Murray Kucherawy
No Objection
Comment (2024-10-16 for -16) Sent
There are two SHOULDs in this document, and I wonder for each of them what the impact is if an implementer were to deviate from that advice, or under what circumstances one might legitimately do so.  There might be more to say in each case.
Orie Steele
No Objection
Paul Wouters
No Objection
Comment (2024-10-15 for -14) Sent
        The Length is a one-byte field and is equal to the length of the
        Return Path Sub-TLV and the Reserved field in bytes. Length MUST
        NOT be 0.

Since the Reserved field is two octets, doesn't this mean the Length
field MUST NOT be 0 or 1? (And likely more, since the Sub-TLV also has
a minimal structure length.) This occurs twice in the document.
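The stricter check suggested here could look like the sketch below. It is illustrative only: the function name is hypothetical, the TLV is assumed to have the one-byte Type and one-byte Length header used by [RFC6374] TLVs, and the Type value in the usage example is a placeholder rather than a registry assignment. No minimum Sub-TLV size is enforced because the quoted text does not state one.

```python
RESERVED_LEN = 2  # the Reserved field is two octets


def split_return_path_tlv(tlv: bytes):
    """Return (type, sub_tlv_bytes) after validating the Length field."""
    if len(tlv) < 2:
        raise ValueError("truncated TLV header")
    tlv_type, length = tlv[0], tlv[1]
    # Length covers the Reserved field plus the Sub-TLV, so 0 and 1 are
    # both too short to be valid.
    if length < RESERVED_LEN:
        raise ValueError("Length must be at least 2")
    value = tlv[2:2 + length]
    if len(value) != length:
        raise ValueError("TLV shorter than its Length field")
    # The Reserved field is ignored on receipt.
    return tlv_type, value[RESERVED_LEN:]


# Placeholder Type 5, Length 6: two Reserved octets plus a 4-octet Sub-TLV.
tlv_type, sub_tlv = split_return_path_tlv(bytes([5, 6, 0, 0, 1, 2, 3, 4]))
```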

The Security Considerations contains both "may" and "MAY" in a similar
way (as in I think these should be done similarly, but I don't think
it matters which is picked)
Roman Danyliw
(was Discuss) No Objection
Comment (2024-10-16 for -15) Sent
Thank you to Roni Even for the GENART review.

Thank you for addressing my previous DISCUSS feedback.

** Section 13.  Is the WG confident that such a small range of only 176-239 is best allocated through a FCFS registration policy?
Warren Kumari
No Objection
Comment (2024-10-09 for -12) Sent
Thank you for writing this document, I found it interesting and useful.

Also, thank you *very* much to Dhruv Dhody for the excellent OpsDir review (https://datatracker.ietf.org/doc/review-ietf-mpls-rfc6374-sr-11-opsdir-lc-dhody-2024-09-04/), and for addressing their comments.
Zaheduzzaman Sarker
No Objection
Comment (2024-10-17 for -16) Sent
Thanks for working on this specification. Thanks to Markus Ihlar for his TSVART review.

It seems the URO defined in RFC 7876 is operating according to RFC 8085 (which obsoletes RFC 5405), the UDP usage guidelines. No objection from a transport protocol point of view.
Éric Vyncke
No Objection
Comment (2024-10-16 for -15) Sent
# Éric Vyncke, INT AD, comments for draft-ietf-mpls-rfc6374-sr-14

Thank you for the work put into this document, it is always important to understand the performance of a system.

Please find below some non-blocking COMMENT points (but replies would be appreciated even if only for my own education), and some nits.

Special thanks to Tony Li for the shepherd's write-up including the WG consensus *and* the justification of the intended status.

Other thanks to Brian Haberman, the Internet directorate reviewer (at my request), he found no issue in his int-dir review:
https://datatracker.ietf.org/doc/review-ietf-mpls-rfc6374-sr-13-intdir-telechat-haberman-2024-10-10/

I hope that this review helps to improve the document,

Regards,

-éric

# COMMENTS (non-blocking)

## IPPM

Was this document also reviewed by the IPPM WG where there is a lot of knowledge about measurement?

## Section 3

Make the reader's task easy by providing a section reference for `Return Path TLV extension`.

## Section 4.2.3

Where is "circular SR-MPLS path" defined?

Suggest adding a figure showing the forward & return paths in the label stack.

## Sections 5.1 and 6.1

Suggest not using sub-sections but simply having the text in sections 5 and 6

## Section 6.2

This seems a weird location in the flow. Suggest merging 5 & 6 in a single section "Measurements"

## Section 7.1

I was about to DISCUSS this point, but shouldn't it be s/Length MUST NOT be 0/Length MUST NOT be less than 2/ to take into account the length of the Reserved field?

Suggest having the last sentence (about the Reserved field) in its own paragraph. (Same for Section 7.1.1.)


# NITS (non-blocking / cosmetic)

## Abstract

The abstract could be shorter, e.g., by not explaining what SR is or not listing a set of RFCs twice.

## Section 2.3

Consider using aasvg for the graphical elements (nicer on HTML rendering).

## Section 3

Consider avoiding text repetition in the delay / loss bullets.

## Section 9 (and possibly others)

s/ISIS/IS-IS/