Multicast On-path Telemetry using IOAM
draft-ietf-mboned-multicast-telemetry-12
Yes
(Warren Kumari)
No Objection
Jim Guichard
(Erik Kline)
(Francesca Palombini)
(Orie Steele)
(Paul Wouters)
Note: This ballot was opened for revision 09 and is now closed.
Éric Vyncke
(was Discuss)
No Objection
Comment
(2024-06-17 for -10)
Sent
Thanks for addressing my previous DISCUSS points ( https://mailarchive.ietf.org/arch/msg/mboned/_fcjFzCxSMIXlkRkUwVSvSLSEgg/ ). Please note that at least one non-blocking COMMENT has not been addressed; the I-D would probably benefit from addressing them (e.g., s/bier header/BIER header/ in section 2).
Gunter Van de Velde
(was Discuss)
No Objection
Comment
(2024-06-25 for -11)
Sent
# Gunter Van de Velde, RTG AD, comments for draft-ietf-mboned-multicast-telemetry-09.txt

Please find https://www.ietf.org/blog/handling-iesg-ballot-positions/ documenting the handling of ballots.

Thanks for writing up this draft and for starting to introduce telemetry to multicast.

One of the items that confused me when reading through the document was that the I and N bits were new. I have trouble understanding why a single bit is not sufficient. There are not that many flag fields available, hence being conservative is not a bad habit. Also, what is a recipient node to do if it received the node-id/interface-id when it should not, or if it should have received it but it wasn't added?

The section about Applicability left me confused about what it was trying to achieve. If the intent is to say that the introduced technology procedures can be used on these multicast control plane technologies, then why not have a short list without all the details? It makes it hard to read, especially the X-PMSI section (so many acronyms in that section, not all have a reference I think).

Below you find 6 different DISCUSS items to be looked at and to see how to resolve. I think some will be easy to resolve, others may be less trivial. And finally, in the COMMENTS section I have added a series of comments with additional context and classified them into [minor] and [major].

I hope this review and the various observations provide a way to help improve the document.

G/

#DISCUSS items resolved in draft-ietf-mboned-multicast-telemetry-10.txt
#======================================================================

##[resolved] DISCUSS1
Some multicast tree building technologies have been mentioned, while another set was silently ignored (maybe because they are historical or less used?). Can these be taken into the story flow and mentioned as considered, not considered, or deemed irrelevant for telemetry extensions? i.e.
PIM-SM (Protocol Independent Multicast - Sparse Mode), PIM-DM (Protocol Independent Multicast - Dense Mode), CBT (Core-Based Tree), DVMRP (Distance Vector Multicast Routing Protocol), MOSPF (Multicast Extensions to OSPF), Bidir-PIM (Bidirectional PIM), SR Replication Segments (SR-MPLS and SRv6 (work in progress))

##[resolved] DISCUSS2
It is unclear if the 2nd method documented in the section "Modifications to Existing Solutions" needs modification. Maybe the exact nature of the modification can be more explicitly documented?

##[resolved] DISCUSS3
When "Per-hop postcard using IOAM DEX" is used per hop, it seems operationally desirable to base it upon sampled packets within a multicast flow. The sampling requirements for multicast may be different from unicast traffic. This is not discussed or considered. Is there a reason it is not discussed?

##[resolved] DISCUSS4
How is postcard-based telemetry achievable for high-volume mcast flows when each single packet is retained, the telemetry is processed, and the packet is forwarded only once all telemetry is processed? Maybe this solution is intended for low-volume mcast? (assuming there is some identification of what low volume means for the branch node).

##[resolved] DISCUSS5
Éric Vyncke pointed out that IPv6 needs to be considered, or at least not excluded. (I support his DISCUSS)

##[resolved] DISCUSS6
The formal procedures when using BIER are a little light. The applicability section talks about "would be possible" or has handwaving on the different encapsulation types of BIER. The various types should maybe be explicitly mentioned and the associated formal procedures discussed?

##Update after draft-ietf-mboned-multicast-telemetry-11: BIER section removed from the draft to resolve the blocking discuss

#DETAILED COMMENTS
#=================
##classified as [minor] and [major]

91 Multicast has many use cases.
For example, it can be used by
92 residential broadband customers across operator networks, private
93 MPLS customers, and internal customers within corporate intranet.
94 Multicast provides real time interactive online meetings or podcasts,
95 IPTV, and financial markets real-time data, which all have a reliance
96 on UDP's unreliable transport. End-to-end QOS, therefore, should be
97 a critical component of multicast deployment in order to provide a
98 good end user experience. In multicast real-time media streaming,
99 loss of a single packet containing a reference frame can result in
100 the inability of thousands of receivers to decode a whole sequence of
101 packets called Group-of-Picture, introducing black picture for
102 periods of a few seconds. Unexpected long delay in propagation of a
103 packet in such real-time media streaming may equally result in the
104 packet not being received and create the same results. Multicast
105 packet drops and delay can therefore severely affect the application
106 performance and user experience.

[minor]
This section does not flow well when reading, and observations are made with seeming handwaving toward what are believed to be well-known facts. In general, I think this paragraph tries to describe that mcast uses UDP, that it is inherently unreliable, and that a single packet loss may result in amplified impact across many receivers. Not only streaming services should be flagged: loss of a single packet in a financial environment (mentioned in the text, but not among the negative impacts) may cause a wrong tick and inequality between brokers using such data. For these, some informative references would be appreciated.

What about the following initial rewrite, assuming references are added at a later moment:

"Multicast has numerous use case environments, including residential broadband services across operator networks, private MPLS customer networks, and internal corporate intranets.
It enables applications such as real-time interactive online meetings, podcasts, IPTV, and financial market real-time data feeds, all of which rely on the unreliable transport of UDP. To ensure a positive end-user experience, superior end-to-end Quality of Service (QoS) is essential in multicast deployments. In multicast real-time media streaming, the loss of a single packet containing a reference frame can prevent thousands of receivers from decoding an entire sequence of packets, known as a Group-of-Pictures (GoP), resulting in a black screen for several seconds. Similarly, unexpected delays in packet propagation can cause packets to be received late or not at all, leading to the same issues. Therefore, packet drops and delays in multicast streaming can significantly degrade application performance and user experience."

108 It is important to monitor the performance of the multicast traffic.
109 New on-path telemetry techniques such as In-situ OAM (IOAM)
110 [RFC9197], IOAM Direct Export (DEX) [RFC9326] IOAM Marking-based
111 Postcard (PBT-M) [I-D.song-ippm-postcard-based-telemetry], and Hybrid
112 Two-Step (HTS) [I-D.ietf-ippm-hybrid-two-step] are useful and
113 complementary to the existing active OAM performance monitoring
114 methods (e.g., ICMP ping [RFC0792]), provide promising means to
115 directly monitor the network experience of multicast traffic.
116 However, multicast traffic has some unique characteristics which pose
117 some challenges on applying such techniques in an efficient way.

[minor]
Fixed some typos and improved readability of the text block with the following proposal:

"It is essential to monitor the performance of multicast traffic.
New on-path telemetry techniques, such as In-situ OAM (IOAM) [RFC9197], IOAM Direct Export (DEX) [RFC9326], IOAM Marking-based Postcard (PBT-M) [I-D.song-ippm-postcard-based-telemetry], and Hybrid Two-Step (HTS) [I-D.ietf-ippm-hybrid-two-step], complement existing active OAM performance monitoring methods like ICMP ping [RFC0792]. These techniques offer promising means to directly monitor multicast traffic. However, multicast traffic's unique characteristics present challenges in applying these techniques efficiently."

119 The IP multicast packet data for a particular (S, G) state is
120 identical from one branch to another on its way to multiple
121 receivers. When adding IOAM trace data to multicast packets, each
122 replicated packet would keep the telemetry data for its entire
123 forwarding path. Since the replicated packets all share common path
124 segments, redundant data will be collected for the same original
125 multicast packet. Such redundancy consumes extra network bandwidth
126 unnecessarily. For a large multicast tree, such redundancy is
127 considerable. Alternatively, it could be more efficient to collect
128 the telemetry data using solutions such as IOAM DEX to eliminate the
129 data redundancy. However, IOAM DEX lacks a branch identifier, making
130 telemetry data correlation and multicast-tree reconstruction
131 difficult.

[minor]
Fixing some typos and making the text flow easier to read. This could use an example of how such IOAM trace data is redundant for common segments.

"The IP multicast packet data for a particular (S, G) state remains identical across different branches to multiple receivers. When IOAM trace data is added to multicast packets, each replicated packet retains telemetry data for its entire forwarding path. This results in redundant data collection for common path segments, unnecessarily consuming extra network bandwidth. For large multicast trees, this redundancy is substantial.
Using solutions like IOAM DEX could be more efficient by eliminating data redundancy, but IOAM DEX lacks a branch identifier, complicating telemetry data correlation and multicast tree reconstruction."

140 2. Requirements for Multicast Traffic Telemetry
142 Multicast traffic is forwarded through a multicast tree. With PIM
143 and P2MP, the forwarding tree is established and maintained by the
144 multicast routing protocol. With BIER, no state is created in the
145 network to establish a forwarding tree; instead, a bier header
146 provides the necessary information for each packet to know the egress
147 points. Multicast packets are only replicated at each tree branch
148 fork node for efficiency.

[major]
This section discusses various technologies to build mcast trees; however, not all of them are mentioned. Maybe the following can be added in addition to BIER to make the overview more complete.

#PIM-SM (Protocol Independent Multicast - Sparse Mode):
* Builds shared trees rooted at a Rendezvous Point (RP) and can switch to source-based trees for more efficient delivery.

#PIM-DM (Protocol Independent Multicast - Dense Mode):
* Initially floods multicast traffic to all nodes and then prunes back the unwanted branches.

#CBT (Core-Based Tree):
* Constructs a shared tree rooted at a core router, minimizing state information in the network.
* RFC 2189: "Core Based Trees (CBT) Multicast Routing Architecture"
* RFC 2201: "Core Based Trees (CBT) Multicast Routing Protocol Specification"

#DVMRP (Distance Vector Multicast Routing Protocol):
* Uses distance vector algorithms to build source-based trees, suitable for small to medium-sized networks.

#MOSPF (Multicast Extensions to OSPF):
* Extends OSPF to support multicast by building source-based trees.

#Bidir-PIM (Bidirectional PIM):
* Builds bidirectional shared trees to support efficient many-to-many communication.
#SR Replication Segments (SR-MPLS and SRv6 (work in progress))

150 There are several requirements for multicast traffic telemetry, a few
151 of which are:

[minor]
s/a few of which are/a non-exclusive list is/

150 There are several requirements for multicast traffic telemetry, a few
151 of which are:
153 * Reconstruct and visualize the multicast tree through data plane
154 monitoring.
156 * Gather the multicast packet delay and jitter performance on each
157 path.
159 * Find the multicast packet drop location and reason.
161 * Gather the VPN state and tunnel information in case of P2MP
162 multicast.

[major]
This list was created with the proposed solution already in mind, together with what it intends to fulfill for multicast. As a result, it is not a fully objective list. I believe the following are also important for multicast telemetry:

Scalability:
* Handle large-scale networks with numerous multicast groups and receivers.

Minimal Overhead:
* Ensure telemetry collection does not significantly impact network performance or consume excessive bandwidth.

Real-Time Data Collection:
* Provide timely insights for monitoring and troubleshooting.

Accuracy and Precision:
* Capture detailed and accurate network performance metrics.

Compatibility:
* Integrate with existing network protocols and telemetry systems.

Security:
* Protect telemetry data from unauthorized access and tampering.

Support for Various Telemetry Techniques:

164 In order to meet these requirements, we need the ability to directly
165 monitor the multicast traffic and derive data from the multicast
166 packets. The conventional OAM mechanisms, such as multicast ping
167 [RFC6450] and trace [RFC8487], are not sufficient to meet these
168 requirements.

[minor]
I believe there is more to this than what is listed here. Maybe this can be taken as an opportunity to isolate, from the many requirements, those that are addressed by the proposed solution?
This will lead to a more objective document by also showing the requirements that are perhaps not met.

184 If the IOAM trace option is used for on-path data collection, the
185 partial trace data will also be replicated into the packet copy for
186 each branch. The end result is that, at the multicast tree leaves,
187 each copy of the multicast packet has a complete trace. Most of the
188 data (except data from the last leaf branch) appear in multiple
189 copies while only one copy is sufficient. Data redundancy introduces
190 unnecessary header overhead, wastes network bandwidth, and
191 complicates the data processing. The larger the multicast tree, or
192 the longer the multicast path, the more severe the redundancy problem
193 becomes.

[minor]
The following rewrite provides a flow that is easier to read:

"When the IOAM trace option is utilized for on-path data collection, partial trace data is replicated into the packet copy for each branch of the multicast tree. Consequently, at the leaves of the multicast tree, each copy of the multicast packet contains a complete trace. This results in data redundancy, as most of the data (except from the final leaf branch) appears in multiple copies, where only one is sufficient. This redundancy introduces unnecessary header overhead, wastes network bandwidth, and complicates data processing. The larger the multicast tree or the longer the multicast path, the more severe the redundancy problem becomes."

195 The postcard-based solutions (e.g., IOAM DEX), can be used to
196 eliminate such data redundancy, because each node on the tree only
197 sends a postcard covering local data. However, they cannot track and
198 correlate the tree branches properly due to the lack of branching
199 information, so they can bring confusion about the multicast tree
200 topology. For example, in a multicast tree, Node A has two branches,
201 one to Node B and the other to node C; further, Node B leads to Node
202 D and Node C leads to Node E.
When applying postcard-based methods,
203 one cannot tell whether or not Node D(E) is the next hop of Node B(C)
204 from the received postcards alone, unless one correlates the
205 exporting nodes with knowledge about the tree collected by other
206 means (e.g., mtrace). Such correlation is undesirable because it
207 introduces extra work and complexity.

[major]
It is unclear what D(E) and/or B(C) represent. I can guess what they mean, but for a standards track document guessing is discouraged. Would the following description be a correct analysis?

"The postcard-based solutions, such as IOAM Direct Export (DEX), can eliminate data redundancy because each node on the multicast tree sends a postcard with only local data. However, these methods cannot accurately track and correlate tree branches due to the absence of branching information. For instance, in a multicast tree where Node A branches to Node B and Node C, and further, Node B leads to Node D and Node C leads to Node E, it is impossible to determine from postcards alone whether Node D is a continuation of Node B or Node C. This ambiguity necessitates additional correlation using external knowledge about the tree, such as through mtrace, which introduces extra complexity and effort."

213 4. Modifications to Existing Solutions
215 We provide two solutions to address the above issues. One is based
216 on IOAM DEX and requires an extension to the instruction header of
217 the IOAM DEX Option. The second solution combines the IOAM trace
218 option and postcards for redundancy removal.

[major]
Having two solutions for the same problem in a single standards track document makes it non-trivial to fully implement the proposed standard. Would it make sense to flag one proposal as the preferred one and the other as the less preferred one? Or maybe spell out the conditions under which the first proposal is preferred over the second and vice versa? What are the pros and cons of each?

220 4.1.
Per-hop postcard using IOAM DEX
222 One way to mitigate the postcard-based telemetry's tree tracking
223 weakness is to augment it with a branch identifier field. Note that

[major]
Not being overly familiar with IOAM: is this intended for each single packet of the mcast flow, or will this logic happen for a subset of identified packets? Processing each packet seems non-trivial in high-volume mcast flows. This could be a major operational usage issue.

265 Conforming to the node ID specification in IOAM [RFC9197], the node
266 ID is a 3-octet unsigned integer. The interface index is a two-octet
267 unsigned integer. As shown in Figure 2, the branch ID consumes 8
268 octets in total. The three unused octets MUST be set to 0.

[major]
What should the recipient do if it gets these and they are not set to 0? Drop, process, alert, etc.? What if there are so many interfaces that the interface index overflows?

280 Figure 3 shows that the branch ID is carried as an optional field
281 after the flow ID and sequence number optional fields in the IOAM DEX
282 option header. Two bits "N" and "I" (i.e., the third and fourth bits
283 in the Extension-Flags field) are reserved to indicate the presence
284 of the optional branch ID field. "N" stands for the Node ID and "I"
285 stands for the interface index. If "N" and "I" are both set to 1,
286 the optional multicast branch ID field is present; otherwise it is
287 absent.

[major]
It was not entirely clear why exactly these bits were selected. And why are there two bits? Would a single bit not be good enough? With 2 bits there are 4 possible states, and only one of them indicates that the information is present. What about the other three states? What if the info is there but shouldn't be, or what if it should be there but the branch info is not? What happens in those situations?

311 4.2. Per-section postcard for IOAM Trace

[minor]
Maybe a description of the intent of postcard-based IOAM trace would be helpful for a reader of the specification.
What about adding something like the following proposed section?

"The postcard-based method for IOAM trace works by each node in the network independently sending "postcards," which are packets containing telemetry data about the packet processing at that specific node. These postcards are sent directly to a collection system and not carried within the data packet itself. This method eliminates redundancy because each node only reports its own data, but it also introduces challenges in reconstructing the full path and topology of the multicast tree due to the lack of inherent branching information in the individual postcards. This reconstruction often requires additional correlation using external tools or data, adding complexity."

313 The second solution is a combination of the IOAM trace option and the
314 postcard-based telemetry. To avoid data redundancy, at each branch
315 fork node, the trace data accumulated up to this node is exported by
316 a postcard before the packet is replicated. In this solution, each

[major]
How is it achievable for high-volume mcast flows to retain each single packet, process the telemetry, and then forward once all telemetry is processed? Maybe this solution is intended for low-volume mcast? (assuming there is some identification of what low volume means for the branch node).

320 the trace of each branch. This is also necessary because each
321 replicated multicast packet can have different telemetry data
322 pertaining to this particular copy (e.g., node delay, egress
323 timestamp, and egress interface). As a consequence, the local data
324 exported by each branch fork node can only contain partial data
325 (e.g., ingress interface and ingress timestamp).

[major]
This text does not truly compute for me. Postcards are not carried within the packet itself, but sent independently. Hence I am slightly lost as to how this causes different telemetry for each copy. Is that not always the situation, replicated or not?
353 There is no need to modify the IOAM trace option header format as
354 specified in [RFC9197]. We just need to configure the branch fork
355 nodes to export the postcards and refresh the IOAM header and data
356 (e.g., clear the node data list and reset the Remaining Length
357 field).

[minor]
Does this mean that everything required for this to work already exists? If not, then which pieces of encoding and formal procedure are missing?

[major]
What exactly does the formal procedure to clear the node data list and length fields mean?

359 5. Application Considerations for Multicast Protocols

[major]
What about segment routing replication segments? https://www.rfc-editor.org/rfc/rfc9524.html
There seems to be some ongoing work on SRv6 replication segments (currently still work in progress and expired, but nevertheless one can expect this to be developed sooner or later).

From a high-level perspective this section seems slightly overkill, and I am not sure it adds a lot of value. Maybe I am missing an introduction of what this section is all about? If it is saying that telemetry can be used for these types of tunnels, is there then a need for so much text and so many acronyms? Why not simply list all of them and reduce the complete section to some bullet points?

366 diagnostic information. Unlike unicast traceroute, Mtrace2 traces
367 the path that the tree building messages follow from receiver to
368 source. It is usually initiated from an Mtrace2 client by sending an

[minor]
Do these follow the control plane messages that build the tree? For all tree building technologies? How would that work for things like MOSPF, for example? Maybe there is an assumption that some PIM style of messaging is involved?

382 status data through direct measurements. There are various multicast
383 protocols that are used to forward the multicast data. Each will
384 require their own unique on-path telemetry solution. Mtrace2 doesn't

[minor]
I am not sure what this is saying exactly with 'multicast data'.
I assume that this is saying that there are multiple multicast protocols to build forwarding trees? Or is the 'multicast data' referring to something else?

388 5.2. Application in PIM

[major]
What about IPv6? What about PIM-BIDIR? PIM-DM (even though it is a non-optimal technology)? I am not sure what the intent of this section is. Is it only to say that the telemetry can be useful when PIM is used?

405 5.3. Application of MVPN X-PMSI Tunnel Encapsulation Attribute

[major]
What is this section trying to achieve? So many acronyms and very different tunnel types.

433 5.4. Application in BIER

[major]
This section does not provide IOAM procedures, but seems to be saying that there are BIER requirements and that there is a possibility of adding additional metadata in the BIER headers. However, no formal procedures are provided, only indicated. If there are formal procedures to make such a mapping, then the prescriptive text should make it explicitly crystal clear how to achieve it.

454 6. Security Considerations

[minor]
High-volume mcast streams can fill up bandwidth very rapidly. IOAM sampling will be important to protect the infrastructure.
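
Illustrative note on the branch ID questions raised in the comments on draft lines 265-268 and 280-287: the sketch below encodes the quoted sizes (3-octet node ID, two-octet interface index, three unused octets that MUST be 0, 8 octets in total) and the quoted presence rule (branch ID present only when both "N" and "I" are 1). It is a minimal sketch, not the draft's procedure. The octet order inside the 8-octet field, the function names, and the choice to reject non-zero unused octets or out-of-range values are all assumptions made for illustration; what a receiver should actually do in those cases is exactly what the review asks the draft to specify.

```python
# Hypothetical helpers; names and octet layout are assumptions, not from the draft.
NODE_ID_MAX = (1 << 24) - 1    # node ID is a 3-octet unsigned integer
IF_INDEX_MAX = (1 << 16) - 1   # interface index is a two-octet unsigned integer


def branch_id_present(n_bit: int, i_bit: int) -> bool:
    """Quoted rule: the branch ID is present only if N and I are both 1."""
    return n_bit == 1 and i_bit == 1


def pack_branch_id(node_id: int, if_index: int) -> bytes:
    """Build an 8-octet branch ID.

    Assumed layout: node ID (3) | unused (1) | interface index (2) | unused (2).
    """
    if not 0 <= node_id <= NODE_ID_MAX:
        raise ValueError("node ID does not fit in 3 octets")
    if not 0 <= if_index <= IF_INDEX_MAX:
        # The interface index overflow case the review asks about.
        raise ValueError("interface index does not fit in 2 octets")
    return (
        node_id.to_bytes(3, "big")
        + b"\x00"
        + if_index.to_bytes(2, "big")
        + b"\x00\x00"
    )


def parse_branch_id(data: bytes) -> tuple[int, int]:
    """Parse a branch ID; rejecting non-zero unused octets is one possible policy."""
    if len(data) != 8:
        raise ValueError("branch ID must be exactly 8 octets")
    unused = data[3:4] + data[6:8]
    if any(unused):
        raise ValueError("unused octets are not zero")
    node_id = int.from_bytes(data[0:3], "big")
    if_index = int.from_bytes(data[4:6], "big")
    return node_id, if_index
```

Under these assumptions, a sender would call pack_branch_id only after setting both flags, and a receiver would call parse_branch_id only when branch_id_present returns True; the three flag combinations other than N=1, I=1 are simply treated as "absent" here, which mirrors the quoted text but leaves the reviewer's question about their meaning open.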
Jim Guichard
No Objection
Roman Danyliw
No Objection
Comment
(2024-05-28 for -09)
Not sent
Thank you to Roni Even for the GENART review.
Warren Kumari Former IESG member
Yes
Yes
(for -09)
Unknown
Erik Kline Former IESG member
No Objection
No Objection
(for -09)
Not sent
Francesca Palombini Former IESG member
No Objection
No Objection
(for -09)
Not sent
Murray Kucherawy Former IESG member
No Objection
No Objection
(2024-05-29 for -09)
Not sent
Orie Steele Former IESG member
No Objection
No Objection
(for -09)
Not sent
Paul Wouters Former IESG member
No Objection
No Objection
(for -09)
Not sent
Zaheduzzaman Sarker Former IESG member
No Objection
No Objection
(2024-05-30 for -09)
Sent
Thanks for working on this specification. Thanks to Bernard Aboba for his TSVART review. I can see resolutions have been reached to improve the document, but they are not present in the current version of this document. I am relying on the responsible AD to make sure the resolutions are reflected in future versions of the document before it gets approved. Hence I am not holding a Discuss on those points.

I have an additional comment:

- The first paragraph of the introduction appears to describe multicast video scenarios which can be realized over the Internet. The lack of relevance to the IOAM context gives a feeling that this specification is addressing something that is out of scope. I would suggest that we explicitly relate the scenario description to the scope of IOAM operations. If this is intended to be used over the Internet, then we have an issue here.