Requirements and Problem Statement for Monitoring Packet Loss Caused by Network Congestion
draft-he-ippm-congestion-loss-monitoring-problem-00

Document Type Active Internet-Draft (individual)
Authors hexiaoming , Xiao Min
Last updated 2026-06-05
IPPM Working Group                                                 X. He
Internet-Draft                                             China Telecom
Intended status: Informational                                    X. Min
Expires: 7 December 2026                                       ZTE Corp.
                                                             5 June 2026

Requirements and Problem Statement for Monitoring Packet Loss Caused by
                           Network Congestion
          draft-he-ippm-congestion-loss-monitoring-problem-00

Abstract

   Emerging services, including enhanced Mobile Broadband (eMBB) and
   Ultra-Reliable Low Latency Communication (uRLLC), as well as
   Artificial Intelligence (AI) training and inference, impose
   stringent requirements of "high throughput, low latency, and minimal
   packet loss" on IP bearer network performance.  Network congestion
   degrades performance and increases uncertainty in service delivery,
   so real-time congestion monitoring is necessary.  This document
   discusses the requirements for real-time monitoring of packet loss
   caused by congestion and presents the problems and challenges that
   existing measurement techniques face in monitoring
   congestion-induced packet loss.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 7 December 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

He & Min                 Expires 7 December 2026                [Page 1]
Internet-Draft  Requirements and Problem Statement for M       June 2026

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  Requirements for Real-time Monitoring of Packet Loss Caused by
           Congestion  . . . . . . . . . . . . . . . . . . . . . . .   4
   4.  Problem Statement for Monitoring Packet Loss Caused by
           Congestion  . . . . . . . . . . . . . . . . . . . . . . .   7
   5.  Challenges for Real-time Monitoring of Packet Loss Caused by
           Congestion  . . . . . . . . . . . . . . . . . . . . . . .   9
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  11
   7.  Security Considerations . . . . . . . . . . . . . . . . . . .  11
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  11
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .  11
     8.2.  Informative References  . . . . . . . . . . . . . . . . .  12
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  15

1.  Introduction

   With the large-scale deployment of 5G networks, emerging services,
   including enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low
   Latency Communication (uRLLC), demand significantly reduced latency,
   minimized jitter, and near-zero packet loss rates [_GPP_TS_22.261].
   At the same time, the development of Big Data and Artificial
   Intelligence (AI) calls for intelligent computing network
   infrastructure whose goal is a lossless network characterized by
   "high throughput, low latency, and zero packet loss"
   [Adithya_Gangidi24][Kun_Qian24].  However, the statistical
   multiplexing inherent in IP networks results in bursty traffic
   patterns, which makes network congestion inevitable.  Congestion
   degrades network performance and introduces uncertainty in service
   delivery: for example, packet loss triggers retransmission, and
   increased delay reduces throughput.  Numerous studies have long
   concentrated on congestion control mechanisms and related algorithms
   [RFC9293][RFC9743] to improve network performance.

   Network congestion can be roughly divided into two classes:
   long-lived congestion and short-lived congestion.  Long-lived
   congestion is generally caused by persistent traffic growth and
   lasts from hours to days, so it is easy to observe through a Network
   Management System/Element Management System (NMS/EMS).  Short-lived
   congestion, in contrast, is mostly caused by traffic bursts, among
   which microbursts are one of the major contributors.  A microburst
   is a phenomenon in which a device port receives a considerable
   amount of burst data in a very short time (typically hundreds of
   microseconds to tens of milliseconds), resulting in an instantaneous
   burst rate much higher than the average rate, even exceeding the
   port bandwidth [Microburst][Shuhei_Yoshida21].  A microburst is
   prone to cause packet loss but is difficult to detect in time.  Many
   investigations indicate that microbursts are the main culprit
   affecting latency-sensitive and loss-sensitive services.  When a
   microburst occurs, the queuing time increases rapidly and, in severe
   cases, packets are dropped; both effects are intolerable for
   applications such as Virtual Reality (VR).

   To reduce the uncertainty in service delivery caused by network
   congestion, it is essential to monitor congestion-induced packet
   loss in real time.  Network operators can then quickly locate the
   congested nodes and links, optimize paths for the affected traffic
   flows to avoid congestion, and evaluate the congestion level to
   guide network planning, capacity expansion, and optimization.

   This document discusses the requirements for real-time monitoring of
   packet loss caused by congestion and presents the problems and
   challenges that existing monitoring and measurement techniques face
   in real-time monitoring of congestion-induced packet loss.

2.  Terminology

   Abbreviations used in this document:

   AI: Artificial Intelligence

   AltMark: Alternate-Marking

   DEX: Direct Exporting

   IOAM: In Situ Operations, Administration, and Maintenance

   MPLS: Multiprotocol Label Switching

   OAM: Operations, Administration, and Maintenance

   SRv6: Segment Routing over IPv6

   VPN: Virtual Private Network

3.  Requirements for Real-time Monitoring of Packet Loss Caused by
    Congestion

   Real-time monitoring of packet loss caused by congestion is valuable
   to network operators in many respects, including quickly pinpointing
   the congestion location and rapidly adjusting paths for
   loss-sensitive traffic flows once congestion loss is detected.  More
   importantly, the nature of the congestion can be determined from
   real-time congestion loss statistics.  That is, if the number of
   lost packets keeps increasing over a long period (e.g., hours), a
   long-lived congestion event is likely occurring; if packet loss is
   confined to a very short time (e.g., milliseconds to seconds), a
   short-lived congestion event is likely occurring.  Moreover, the
   characteristics of microbursts, such as their frequency and
   duration, can be obtained from real-time congestion loss statistics.
   In addition, accurate measurement of congestion-induced loss is
   important for evaluating the congestion level, which guides
   subsequent network expansion and optimization and provides reliable
   verification of a user's Service Level Objective (SLO) for packet
   loss.
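
   As a rough illustration of how such statistics could be used, the
   following Python sketch groups timestamped loss samples into
   congestion episodes and labels each episode as long-lived or
   short-lived.  The names (Episode, group_into_episodes, classify),
   the gap used to merge samples, and the threshold separating the two
   classes are assumptions made for this example only; this document
   does not define them.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """A run of loss samples close enough in time to count as one event."""
    start_ms: int
    end_ms: int
    lost_packets: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def group_into_episodes(samples, max_gap_ms):
    """Merge (timestamp_ms, lost_packets) samples, sorted by time, into
    episodes; a gap longer than max_gap_ms starts a new episode."""
    episodes = []
    for ts, lost in samples:
        if lost <= 0:
            continue  # no loss reported in this sample
        if episodes and ts - episodes[-1].end_ms <= max_gap_ms:
            episodes[-1].end_ms = ts
            episodes[-1].lost_packets += lost
        else:
            episodes.append(Episode(ts, ts, lost))
    return episodes


def classify(episode, long_lived_threshold_ms):
    """Label an episode by comparing its duration with a threshold."""
    if episode.duration_ms >= long_lived_threshold_ms:
        return "long-lived"
    return "short-lived"


# Hypothetical per-millisecond loss samples: two brief bursts of loss.
samples = [(1000, 5), (1001, 7), (1002, 3), (5000, 4), (5001, 6)]
episodes = group_into_episodes(samples, max_gap_ms=10)
labels = [classify(e, long_lived_threshold_ms=3_600_000) for e in episodes]
```

   In this sketch, the number of episodes per unit of time gives the
   frequency of short-lived congestion, and duration_ms gives the
   duration of each occurrence.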

   Generally speaking, if the monitored long-lived congestion events
   are local, for example, if congestion occurs only on a small number
   of nodes or links in the network, load balancing (e.g., moving part
   of the traffic to links with low utilization) is the most effective
   approach to eliminating congestion.  But for short-lived congestion
   events, especially microbursts, which typically occur randomly and
   across the network, expanding overall network capacity may be
   necessary if they occur frequently.  At present, network operators
   usually regard network utilization as the only indicator for network
   expansion, and they collect minute-level link utilization data using
   SNMP.  Such averaged traffic statistics hide many short-lived
   congestion events that actually occur, as shown in Figure 1, and
   these events are precisely the critical causes of poor quality of
   experience (QoE) for latency-sensitive and loss-sensitive services.
   The frequency and duration of short-lived congestion events are
   therefore also important inputs for capacity expansion, and it is
   essential to determine proper thresholds for them.  Both link
   utilization and short-lived congestion events can be taken into
   consideration when launching a capacity expansion plan.
   Specifically, when the monitored average link utilization reaches
   the level that calls for capacity expansion, say 50%, but the
   frequency and duration detected for short-lived congestion events
   are far below the preset thresholds, expansion can be deferred;
   otherwise, action should be taken immediately.
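
   The decision rule described above can be sketched as follows.  This
   is only an illustration: the function name, the default thresholds,
   the factor used to interpret "far below", and the "no-action" result
   for utilization under the expansion level are all assumptions of
   this example, not values specified by this document.

```python
def expansion_decision(avg_utilization, events_per_hour, mean_duration_ms,
                       *, util_threshold=0.50, max_events_per_hour=10.0,
                       max_duration_ms=5.0, far_below=0.5):
    """Combine average link utilization with short-lived congestion
    statistics.  All default values are hypothetical placeholders."""
    if avg_utilization < util_threshold:
        # Assumption: below the expansion level nothing is triggered.
        return "no-action"
    # "Far below" is modeled as under a fraction of each preset threshold.
    quiet = (events_per_hour < far_below * max_events_per_hour
             and mean_duration_ms < far_below * max_duration_ms)
    return "defer" if quiet else "expand-now"


# Utilization has reached 50%, but short-lived congestion is rare and brief.
rare = expansion_decision(0.55, events_per_hour=1, mean_duration_ms=0.5)
# Same utilization, but short-lived congestion is frequent.
frequent = expansion_decision(0.55, events_per_hour=8, mean_duration_ms=0.5)
```

   With these hypothetical inputs, the first call defers expansion and
   the second calls for immediate action.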

           Top: 5-minute sampling cycle
   utilization (%)
       95% |
       50% |---------------------------------------
        0% |
          ---------------- time (t)---------------->

   Bottom: millisecond sampling cycle (reveals microbursts)
   utilization (%)
       95% |       ^            ^             ^
       50% |       |            |             |
        0% |       |            |             |
             [microburst]  [microburst]  [microburst]
           ---------------- time (t)---------------->

     Figure 1: Utilization Comparison between Different Sampling Cycles
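
   The effect shown in Figure 1 can be reproduced with a small worked
   example.  The trace below is synthetic and purely illustrative: it
   assumes a 5-minute window sampled every millisecond, a 50% baseline,
   and three 2 ms microbursts at 95% utilization.  The helper names and
   the 90% burst threshold are likewise assumptions.

```python
def mean(values):
    """Average utilization over a sampling cycle."""
    return sum(values) / len(values)


def count_bursts(trace, threshold):
    """Count contiguous runs of slots at or above the threshold."""
    bursts = 0
    in_burst = False
    for u in trace:
        if u >= threshold and not in_burst:
            bursts += 1
        in_burst = u >= threshold
    return bursts


# Synthetic trace: 300 s at 1 ms resolution, 50% baseline utilization.
trace = [0.50] * 300_000
for start in (60_000, 150_000, 240_000):
    trace[start:start + 2] = [0.95, 0.95]  # a 2 ms microburst

five_minute_average = mean(trace)      # what a 5-minute cycle reports
peak = max(trace)                      # what millisecond sampling reveals
bursts = count_bursts(trace, threshold=0.90)
```

   The 5-minute average stays at about 50.0009%, practically
   indistinguishable from the baseline, while millisecond sampling
   exposes three bursts peaking at 95%.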

   Another requirement is real-time localization of congestion.  It
   helps operators troubleshoot rapidly, improving the efficiency of
   fault diagnosis and root cause analysis.  In addition, to analyze
   the cause of congestion, it is necessary to determine which traffic
   flows the discarded packets belong to and to identify which traffic
   flows caused the congestion, so that action can be taken against the
   culprit flows.

   From the perspective of adaptability, today's networks need to
   provide different transport modes for different services, so a
   packet loss monitoring method must also adapt to various transport
   modes.  For instance, L2/L3 VPNs are widely used by enterprises, and
   Multiprotocol Label Switching (MPLS) has been deployed by network
   operators to deliver MPLS VPN services.  As networks evolve and new
   services emerge, more transport protocols will appear to support the
   delivery of new services.  An ideal packet loss monitoring scheme
   should be protocol-independent, that is, applicable to all current
   and future transport protocols, including native IPv4/IPv6, SR-MPLS
   [RFC8402][RFC8660], SRv6 [RFC8986], Virtual eXtensible Local Area
   Network (VXLAN) [RFC7348], and Generic Routing Encapsulation (GRE)
   [RFC8086], so that the data plane (e.g., the forwarding chip) does
   not need to be upgraded and the packet header encapsulation does not
   need to be modified to accommodate the monitoring and measurement
   method.

   Another desirable feature of a loss monitoring scheme is that it
   causes little or no interference to the network, so that the network
   load is barely affected and the packet loss monitoring results
   reflect the actual congestion state.

   In addition, in large-scale operator networks, tens of thousands of
   user flows typically need to be measured simultaneously, so
   scalability is also a vital factor in judging a monitoring method;
   that is, the number of concurrent measurements should not be tightly
   limited by network resources (e.g., computing, storage, or
   bandwidth).

   In summary, an ideal packet loss monitoring solution should satisfy
   five requirements:

   *  Real-time: The monitoring system is required to collect and
      analyze congestion-induced packet loss in real time to quickly
      pinpoint the congestion location and identify the cause of
      congestion.

   *  Accuracy: The monitoring system is required to provide accurate
      measurement results for packet loss caused by congestion so that
      operators can accurately assess the status and trend of network
      congestion and make appropriate decisions.

   *  Protocol-independent: The monitoring system is required to be
      independent of network transport protocols so that the data plane
      does not need to be upgraded and packet header encapsulation does
      not need to be modified.

   *  Little or no interference to the network: The monitoring system
      is required to cause little or no interference to the network in
      order to reduce the impact on network traffic forwarding behavior.

   *  Scalability: The monitoring system is required to accommodate
      tens of thousands of measurement sessions simultaneously, so that
      the number of concurrent measurements is not tightly limited by
      network resources.

4.  Problem Statement for Monitoring Packet Loss Caused by Congestion

   Packet loss measurement is an important means of evaluating network
   performance and user quality of service (QoS).  According to
   [RFC7799], measurement methods can be classified as active, passive,
   and hybrid.

   Existing active measurement methods include IP Ping [RFC2151], the
   One-way Active Measurement Protocol (OWAMP) [RFC4656], the Two-Way
   Active Measurement Protocol (TWAMP) [RFC5357], and the Simple
   Two-Way Active Measurement Protocol (STAMP) [RFC8762], which have
   been widely adopted by network operators.  An active measurement
   method can obtain one-way or two-way packet loss by means of
   external probes or a measurement module built into the device, but
   it measures the monitored traffic only indirectly, by sending probe
   packets that simulate real traffic, so its results deviate from the
   actual loss.  Also, in some circumstances (e.g., when excessive test
   traffic is sent), it may interfere with network traffic forwarding
   behavior.
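
   As a minimal sketch of the active approach, the following Python
   fragment estimates loss from the sequence numbers of probe packets.
   It is not the procedure of any particular protocol named above; the
   function name and the sample data are assumptions for illustration.

```python
def probe_loss(sent_seq, received_seq):
    """Return the lost probe sequence numbers and the probe loss ratio.

    The ratio describes the probes only; it is an estimate of, not a
    direct measurement of, the loss experienced by user traffic.
    """
    sent = set(sent_seq)
    received = set(received_seq) & sent  # ignore unexpected numbers
    lost = sorted(sent - received)
    return lost, len(lost) / len(sent)


# Ten hypothetical probes were sent; probes 3 and 8 never arrived.
lost, ratio = probe_loss(range(10), [0, 1, 2, 4, 5, 6, 7, 9])
```

   Here two of ten probes are lost, giving a probe loss ratio of 0.2.
   Whether user traffic saw the same ratio is not known from the probes
   alone.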

   Compared to active methods, passive methods rely only on the
   presence of the measured traffic flows at one or more observation
   points.  They do not generate additional traffic that disturbs the
   network.  For example, packet loss between two nodes traversed by a
   designated traffic flow can be monitored by configuring packet
   counters on the sending node and the receiving node, respectively.
   More recent studies have concentrated on sampled traffic data
   provided by NetFlow [RFC3954][Yuliang_Li16][Hua_Hu23], where the
   accuracy of the loss detection result depends on the traffic
   sampling scheme, and the real-time performance depends on the
   collection interval.

   The emerging on-path detection techniques, including the
   Alternate-Marking method (AltMark) [RFC9341], In Situ Operations,
   Administration, and Maintenance (IOAM) [RFC9197], and In-Band
   Network Telemetry (INT) [INT_Spec], have aroused great interest in
   both industry and academia.  These measurement and monitoring
   techniques are classified as hybrid methods.  The greatest advantage
   of the AltMark method lies in its efficiency and credibility, as it
   directly measures the real traffic using a single marking bit.
   However, it has some deficiencies.  First, two counters are needed
   for each measured flow at every measurement point, so many counters
   are consumed when the numbers of concurrently measured flows and
   measurement points are large.  For example, in hop-by-hop mode, m
   monitored flows that traverse n nodes along the path require n*m*2
   packet counters.  Second, different packet header encapsulations
   must be defined for the different transport protocols of different
   data planes, and the forwarding chip of the data plane needs to be
   upgraded accordingly.  For instance, the encapsulation for MPLS
   performance measurement with AltMark is defined in [RFC9714]; the
   IPv6 extension header option that encodes Alternate-Marking
   information in both the Hop-by-Hop Options Header and the
   Destination Options Header is defined in [RFC9343]; and the
   application of the Alternate-Marking Method to the Segment Routing
   Header is defined in [RFC9947].
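
   The counter arithmetic above can be made concrete with a short
   sketch.  It assumes, as stated in the text, two packet counters per
   flow at each measurement point, and it computes loss by comparing
   the counts that two measurement points report for the same marking
   block.  The function names and the numbers are illustrative
   assumptions.

```python
def altmark_counter_count(flows, nodes):
    """Counters needed in hop-by-hop mode: n * m * 2."""
    return nodes * flows * 2


def loss_per_block(upstream_counts, downstream_counts):
    """Loss in each marking block is the difference between the packet
    counts of the same block at two measurement points."""
    return [tx - rx for tx, rx in zip(upstream_counts, downstream_counts)]


# 10,000 monitored flows over an 8-node path (hypothetical figures).
counters = altmark_counter_count(flows=10_000, nodes=8)

# Per-block packet counts of one flow at two adjacent measurement points.
losses = loss_per_block([1000, 1000, 1000], [1000, 988, 1000])
```

   Under these assumptions, 160,000 counters are required, and the
   second marking block shows 12 lost packets between the two
   measurement points.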

   IOAM and INT methods can directly collect Operations,
   Administration, and Maintenance (OAM) information by embedding
   monitoring instructions in the monitored packet and carrying the OAM
   data of the forwarding path in it.  However, the added telemetry
   data may consume considerable bandwidth and may even cause path
   Maximum Transmission Unit (MTU) issues.  Like the AltMark method,
   these methods are protocol-dependent and need different packet
   header encapsulations for different transport protocols, as defined
   in [RFC9486] for IPv6 and [I-D.ietf-mpls-mna-ioam] for MPLS.  As a
   result, they increase the complexity of forwarding chips.  In
   addition, a significant deficiency is that both IOAM and INT lack
   the ability to detect packet loss [Lizhuang_Tan21], so another IOAM
   option type, Direct Exporting (DEX) [RFC9326], i.e., postcard-based
   IOAM, has been proposed; it is used as a trigger for IOAM data to be
   directly exported without being pushed into data packets.  However,
   an IOAM encapsulating node that supports the DEX Option-Type must
   support the capability to select a subset of the forwarded traffic.
   If an IOAM encapsulating node incorporates the DEX Option-Type into
   all the traffic it forwards, it may produce an excessive amount of
   exported data, which may overload the network and the receiving
   entity.  Theoretically, if an IOAM encapsulating node incorporates
   the DEX Option-Type into all monitored packets of the traffic it
   forwards, the precision of the loss result can be ensured.  But if
   the selected subset of traffic is too small or the sampling rate is
   too low, the loss results may lack fidelity, since the interval
   between selected packets may be longer than the duration of
   instantaneous congestion such as a microburst.  To augment IOAM
   capabilities in performance measurement, such as packet loss, delay,
   and jitter, two Internet-Drafts that integrate the Alternate-Marking
   method into IOAM have been proposed:
   [I-D.he-ippm-ioam-dex-extensions-incorporating-am] and
   [I-D.he-ippm-ioam-extensions-incorporating-am].
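
   The fidelity concern with sampled export can be quantified with
   simple arithmetic.  The sketch below assumes uniform 1-in-N packet
   selection on a flow with a constant packet rate; both the model and
   the numbers are illustrative assumptions rather than properties of
   any specific DEX implementation.

```python
def selected_packet_interval_s(packet_rate_pps, one_in_n):
    """Average time between two selected packets of a flow."""
    return one_in_n / packet_rate_pps


def expected_selected_in_burst(packet_rate_pps, one_in_n, burst_s):
    """Expected number of selected packets that fall inside a burst."""
    return burst_s / selected_packet_interval_s(packet_rate_pps, one_in_n)


# Hypothetical flow: 100,000 packets/s, 1-in-1000 selection, 1 ms burst.
interval = selected_packet_interval_s(100_000, 1000)
expected = expected_selected_in_burst(100_000, 1000, burst_s=0.001)
```

   In this example, the selected packets are 10 ms apart on average,
   ten times the burst duration, so only about 0.1 selected packet is
   expected within the microburst and most such bursts would leave no
   exported record.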

5.  Challenges for Real-time Monitoring of Packet Loss Caused by
    Congestion

   Loss events may occur for various reasons, including link quality
   degradation, CRC errors, illegal packets, malformed frames, access
   control list (ACL) mismatches, forwarding information base (FIB)
   mismatches, etc.  Loss caused by congestion is only one of many
   kinds of loss events.  The above-mentioned monitoring and
   measurement methods were not specifically designed to monitor packet
   loss caused by congestion.  They face challenges when used to
   monitor congestion-induced packet loss, especially for short-lived
   congestion caused by frequent microbursts, whose duration is
   typically hundreds of microseconds to tens of milliseconds.  The
   reasons are as follows:

   *  First, existing monitoring and measurement methods cannot
      distinguish congestion-induced packet loss from other loss
      events; they can only report the aggregate loss resulting from
      all loss events.

   *  Second, it is difficult for them to capture a loss event that is
      caused only by instantaneous congestion such as a microburst.
      Consider active methods: to capture a microburst, probe packets
      must be sent at intervals of milliseconds or even microseconds,
      which generates excessive test traffic and may overwhelm the
      forwarding plane.  For example, when a microburst lasting 1 ms
      occurs and causes a considerable number of dropped packets, the
      microburst may go undetected if the probe packet interval is
      greater than 1 ms.  As a result, the discarded packets are not
      counted.

   *  Third, it is also difficult for existing monitoring and
      measurement methods to accurately measure the total number of
      packets discarded because of congestion.  When a congestion loss
      event causes a considerable amount of packet loss, the AltMark
      method can obtain an accurate count of discarded packets only for
      the monitored user flows that encounter the congestion; packets
      discarded from unmonitored traffic flows that encounter the same
      congestion are not counted.  Therefore, the total number of
      packets discarded because of congestion remains unknown.

   *  Fourth, existing monitoring and measurement methods cannot
      determine which traffic flows the discarded packets belong to or
      identify which traffic flows caused the congestion, because
      network devices do not retain the discarded packets.

   *  Fifth, existing monitoring and measurement methods have
      difficulty distinguishing long-lived congestion from short-lived
      congestion, let alone detecting the frequency and duration of
      microbursts.  Although active methods and on-path methods such as
      AltMark can measure the number of discarded packets in every
      fixed measurement period (typically tens of seconds), they cannot
      detect how often microbursts occur within a measurement period or
      how long they last.

   *  Finally, the ability to accurately localize packet loss caused by
      congestion in real time (e.g., the device ID, port ID, and queue
      ID at which a loss event happens) helps network operators quickly
      pinpoint congested nodes and troubleshoot rapidly.  Existing
      monitoring techniques are also insufficient in this respect.  By
      sending probe packets, traditional active measurement methods can
      only obtain packet loss results from the ingress node to the
      egress node; they cannot accurately indicate which node on the
      forwarding path discarded packets.  As for hybrid measurement
      methods such as AltMark, practical deployments for loss
      measurement adopt the end-to-end option mode, in which only the
      ingress node and the egress node send packet counter values to
      the monitoring system for analysis, and intermediate nodes do not
      process the measured packets, in order to reduce the load on the
      network and the receiving entity.  When packet loss is detected,
      the measurement switches to the hop-by-hop option mode to
      localize the loss.  Still, it is difficult to pinpoint an
      instantaneously congested node (e.g., one congested by a
      microburst), because by the time the hop-by-hop option mode is
      enabled after packet loss has been detected, the congestion
      caused by the microburst may already have subsided.
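
   The second challenge above, the trade-off between probe interval and
   probe load, can be illustrated with a short sketch.  It assumes a
   strictly periodic probe stream starting at time zero and a single
   microburst; the function names, the 100-byte probe size, and the
   timings are hypothetical.

```python
def probe_hits_burst(interval_us, burst_start_us, burst_len_us):
    """True if a probe sent every interval_us (from time 0) falls inside
    the burst window [burst_start_us, burst_start_us + burst_len_us)."""
    # Send time of the first probe at or after the start of the burst.
    first = -(-burst_start_us // interval_us) * interval_us
    return first < burst_start_us + burst_len_us


def probe_load_bps(probe_bytes, interval_us):
    """Traffic generated by one probe session, in bits per second."""
    return probe_bytes * 8 * 1_000_000 / interval_us


# A 1 ms microburst starting at t = 12.3 ms.
missed = probe_hits_burst(10_000, 12_300, 1_000)  # probes every 10 ms
caught = probe_hits_burst(500, 12_300, 1_000)     # probes every 0.5 ms

# Cost of the shorter interval for 100-byte probes, per session.
load = probe_load_bps(100, 500)
```

   With a 10 ms interval the burst falls between two probes and is
   missed.  A 0.5 ms interval catches it but costs 1.6 Mbit/s per
   session, which would amount to 16 Gbit/s of probe traffic across
   10,000 concurrent sessions.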

6.  IANA Considerations

   This document has no IANA actions.

7.  Security Considerations

   A congestion-induced loss monitoring system introduces additional
   traffic into the network.  During network congestion, the monitoring
   system itself must not exacerbate the situation.  Mechanisms such as
   rate limiting and traffic prioritization for congestion-related
   monitoring data should be considered.  In addition, appropriate
   defenses against Distributed Denial of Service (DDoS) attacks are
   necessary to protect the data plane and control plane.

   This document does not specify security mechanisms but highlights
   that any solution must consider trust boundaries regarding telemetry
   data subscriptions, telemetry data reporting, and protection of
   potentially sensitive operational data.  These aspects are expected
   to be addressed by solution proposals based on deployment
   requirements and threat models.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017,
              <https://www.rfc-editor.org/info/rfc8126>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

8.2.  Informative References

   [Adithya_Gangidi24]
              Gangidi, A., Miao, R., and S. Zheng, "RDMA over Ethernet
              for Distributed AI Training at Meta Scale", In ACM SIGCOMM
              2024 Conference, 2024,
              <https://doi.org/10.1145/3651890.3672233>.

   [Hua_Hu23] Hu, H., Liu, Y., and S. Ni, "LossDetection: Real-time
              Packet Loss Monitoring System for Sampled Traffic Data",
              IEEE Transactions on Network and Service Management, March
              2023, <http://DOI.org/10.1109/TNSM.2022.3203389>.

   [I-D.he-ippm-ioam-dex-extensions-incorporating-am]
              hexiaoming, X., Brockners, F., Song, H., Fioccola, G., and
              A. Wang, "IOAM Direct Exporting (DEX) Option Extensions
              for Incorporating the Alternate-Marking Method", Work in
              Progress, Internet-Draft, draft-he-ippm-ioam-dex-
              extensions-incorporating-am-04, 27 February 2026,
              <https://datatracker.ietf.org/doc/html/draft-he-ippm-ioam-
              dex-extensions-incorporating-am-04>.

   [I-D.he-ippm-ioam-extensions-incorporating-am]
              hexiaoming, X., Min, X., Brockners, F., Fioccola, G., and
              C. Xie, "IOAM Trace Option Extensions for Incorporating
              the Alternate-Marking Method", Work in Progress, Internet-
              Draft, draft-he-ippm-ioam-extensions-incorporating-am-06,
              27 February 2026, <https://datatracker.ietf.org/doc/html/
              draft-he-ippm-ioam-extensions-incorporating-am-06>.

   [I-D.ietf-mpls-mna-ioam]
              Gandhi, R., Mirsky, G., Li, T., Song, H., and B. Wen,
              "Supporting In Situ Operations, Administration and
              Maintenance Using MPLS Network Actions", Work in Progress,
              Internet-Draft, draft-ietf-mpls-mna-ioam-05, 19 May 2026,
              <https://datatracker.ietf.org/doc/html/draft-ietf-mpls-
              mna-ioam-05>.

   [INT_Spec] The P4.org Applications Working Group, "In-Band Network
              Telemetry (INT) Data plane Specification V2.1", 2020,
              <https://p4.org/p4-spec/docs/INT_v2_1.pdf>.

   [Kun_Qian24]
              Qian, K., Xi, Q., and J. Cao, "Alibaba HPN: A Data Center
              Network for Large Language Model Training", In ACM SIGCOMM
              2024 Conference, 2024,
              <https://doi.org/10.1145/3651890.3672265>.

   [Lizhuang_Tan21]
              Tan, L., Su, W., and W. Zhang, "A Packet Loss Monitoring
              System for In-Band Network Telemetry: Detection,
              Localization, Diagnosis and Recovery", IEEE Transactions
              on Network and Service Management, December 2021,
              <http://DOI.org/10.1109/TNSM.2021.3125012>.

   [Microburst]
              Huawei Technologies Co., Ltd, "What is a Microburst? How
              to Detect a Microburst,(Nov. 2020)", 2020,
              <https://support.huawei.com/ enterprise/en/doc/>.

   [RFC2151]  Kessler, G. and S. Shepard, "A Primer On Internet and TCP/
              IP Tools and Utilities", FYI 30, RFC 2151,
              DOI 10.17487/RFC2151, June 1997,
              <https://www.rfc-editor.org/info/rfc2151>.

   [RFC3954]  Claise, B., Ed., "Cisco Systems NetFlow Services Export
              Version 9", RFC 3954, DOI 10.17487/RFC3954, October 2004,
              <https://www.rfc-editor.org/info/rfc3954>.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
              Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, DOI 10.17487/RFC4656, September 2006,
              <https://www.rfc-editor.org/info/rfc4656>.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, DOI 10.17487/RFC5357, October 2008,
              <https://www.rfc-editor.org/info/rfc5357>.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014,
              <https://www.rfc-editor.org/info/rfc7348>.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016, <https://www.rfc-editor.org/info/rfc7799>.

   [RFC8086]  Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE-
              in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086,
              March 2017, <https://www.rfc-editor.org/info/rfc8086>.

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
              July 2018, <https://www.rfc-editor.org/info/rfc8402>.

   [RFC8660]  Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing with the MPLS Data Plane", RFC 8660,
              DOI 10.17487/RFC8660, December 2019,
              <https://www.rfc-editor.org/info/rfc8660>.

   [RFC8762]  Mirsky, G., Jun, G., Nydell, H., and R. Foote, "Simple
              Two-Way Active Measurement Protocol", RFC 8762,
              DOI 10.17487/RFC8762, March 2020,
              <https://www.rfc-editor.org/info/rfc8762>.

   [RFC8986]  Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer,
              D., Matsushima, S., and Z. Li, "Segment Routing over IPv6
              (SRv6) Network Programming", RFC 8986,
              DOI 10.17487/RFC8986, February 2021,
              <https://www.rfc-editor.org/info/rfc8986>.

   [RFC9197]  Brockners, F., Ed., Bhandari, S., Ed., and T. Mizrahi,
              Ed., "Data Fields for In Situ Operations, Administration,
              and Maintenance (IOAM)", RFC 9197, DOI 10.17487/RFC9197,
              May 2022, <https://www.rfc-editor.org/info/rfc9197>.

   [RFC9293]  Eddy, W., Ed., "Transmission Control Protocol (TCP)",
              STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
              <https://www.rfc-editor.org/info/rfc9293>.

   [RFC9326]  Song, H., Gafni, B., Brockners, F., Bhandari, S., and T.
              Mizrahi, "In Situ Operations, Administration, and
              Maintenance (IOAM) Direct Exporting", RFC 9326,
              DOI 10.17487/RFC9326, November 2022,
              <https://www.rfc-editor.org/info/rfc9326>.

   [RFC9341]  Fioccola, G., Ed., Cociglio, M., Mirsky, G., Mizrahi, T.,
              and T. Zhou, "Alternate-Marking Method", RFC 9341,
              DOI 10.17487/RFC9341, December 2022,
              <https://www.rfc-editor.org/info/rfc9341>.

   [RFC9343]  Fioccola, G., Zhou, T., Cociglio, M., Qin, F., and R.
              Pang, "IPv6 Application of the Alternate-Marking Method",
              RFC 9343, DOI 10.17487/RFC9343, December 2022,
              <https://www.rfc-editor.org/info/rfc9343>.

   [RFC9486]  Bhandari, S., Ed. and F. Brockners, Ed., "IPv6 Options for
              In Situ Operations, Administration, and Maintenance
              (IOAM)", RFC 9486, DOI 10.17487/RFC9486, September 2023,
              <https://www.rfc-editor.org/info/rfc9486>.

   [RFC9714]  Cheng, W., Ed., Min, X., Ed., Zhou, T., Dai, J., and Y.
              Peleg, "Encapsulation for MPLS Performance Measurement
              with the Alternate-Marking Method", RFC 9714,
              DOI 10.17487/RFC9714, February 2025,
              <https://www.rfc-editor.org/info/rfc9714>.

   [RFC9743]  Duke, M., Ed. and G. Fairhurst, Ed., "Specifying New
              Congestion Control Algorithms", BCP 133, RFC 9743,
              DOI 10.17487/RFC9743, March 2025,
              <https://www.rfc-editor.org/info/rfc9743>.

   [RFC9947]  Fioccola, G., Zhou, T., Mishra, G., Wang, X., Zhang, G.,
              and M. Cociglio, "Application of the Alternate-Marking
              Method to the Segment Routing Header", RFC 9947,
              DOI 10.17487/RFC9947, March 2026,
              <https://www.rfc-editor.org/info/rfc9947>.

   [Shuhei_Yoshida21]
              Yoshida, S., Ukon, Y., and S. Ohteru, "FPGA-based network
              microburst analysis system with efficient packet
              capturing", Journal of Optical Communications and
              Networking, October 2021,
              <https://doi.org/10.1364/JOCN.422859>.

   [Yuliang_Li16]
              Li, Y., Miao, R., and C. Kim, "LossRadar: Fast Detection
              of Lost Packets in Data Center Networks", Proceedings of
              the 12th International Conference on emerging Networking
              Experiments and Technologies, 2016,
              <https://doi.org/10.1145/2999572.2999609>.

   [_GPP_TS_22.261]
              3GPP, "Service requirements for the 5G system; Stage 1
              (Release 18)", 2024,
              <https://www.3gpp.org/ftp/specs/archive/22_series/22.261>.

Authors' Addresses

   Xiaoming He
   China Telecom
   Email: hexm4@chinatelecom.cn

   Xiao Min
   ZTE Corp.
   Email: xiao.min2@zte.com.cn
