cats                                                             N. Tran
Internet-Draft                                           K. Nguyen-Trung
Intended status: Informational                                    Y. Kim
Expires: 27 August 2026                              Soongsil University
                                                        23 February 2026

  Additional CATS requirements consideration for Service Segmentation-
                           related use cases
               draft-dcn-cats-req-service-segmentation-03

Abstract

   This document discusses possible additional CATS requirements that
   arise when service segmentation is considered in related CATS use
   cases such as AR-VR and Distributed AI Inference.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 August 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Tran, et al.             Expires 27 August 2026                 [Page 1]
Internet-Draft        cats-req-service-segmentation        February 2026

Table of Contents

   1.  Introduction
   2.  Terminology used in this draft
   3.  Example 1: AR-VR (XR) Rendering Sequential Subtask Segmentation
     3.1.  Expected CATS system flow
     3.2.  Impacts on CATS system design
   4.  Example 2: ML Model Vertical Partitioning Inference Parallel
       Subtask Segmentation
     4.1.  Expected CATS system flow
     4.2.  Impacts on CATS system design
   5.  Comparison between Normal and Service Segmentation CATS scenarios
   6.  CATS system design considerations to support Service Segmentation
   7.  Normative References
   Authors' Addresses

1.  Introduction

   Service segmentation is a service deployment option that splits a
   service into smaller subtasks that can be executed in parallel or in
   sequence before the subtasks' execution results are aggregated to
   serve the service request
   [draft-li-cats-task-segmentation-framework].  It is a widely
   considered deployment option for improving the performance of
   services such as AR-VR and Distributed AI Inference, which are also
   key CATS use cases [draft-ietf-cats-usecases-requirements].

   For example, a recent 3GPP Technical Report on 6G use cases and
   services [TR-22870-3GPP] describes an XR rendering service that can
   be implemented as a sequential pipeline of subtasks, including a
   render engine, engine adaptation, and rendering acceleration.  In
   contrast, an example of parallel service segmentation is parallel
   Machine Learning (ML) model partitioning for inference [SplitPlace],
   [Gillis].  Specifically, each ML model layer can be divided into
   multiple smaller partitions, which are executed in parallel.  In
   both sequential and parallel segmentation cases, a subtask may have
   multiple instances deployed across different computing sites.

   This document analyzes these CATS service segmentation use case
   examples to discuss the impact of the service segmentation
   deployment method on CATS system design.

2.  Terminology used in this draft

   This document re-uses the CATS component terminology defined in
   [draft-ietf-cats-framework].  Additional definitions related to
   service segmentation are:

   Service subtask: An offering that performs only a partial
   functionality of the original service.  The complete functionality
   of the original service is achieved by aggregating the results of
   all its divided service subtasks.  Subtask result aggregation may be
   performed either in parallel or sequentially.

   Service subtask instance: When a service is segmented into multiple
   service subtasks, each service subtask might have multiple instances
   that perform the same partial functionality of the original service.

3.  Example 1: AR-VR (XR) Rendering Sequential Subtask Segmentation

                      XR Rendering request
                          +--------+
                          | Client |
                          +---|----+
                              |
                      +-------|-------+
                      |   AR-VR(XR)   |
                      |  App Platform |
                      +-------|-------+
                              |  Supposed Optimal combination:
                              |  RE Site 1, EA Site 3, RA site 4
                              |
                              |  Forwards packet in ORDER:
                              |  Site 1 -> 3 -> 4
                        +-----|-----+------+
+-----------------------|   CATS**  |C-PS  |---------------------+
|       Underlay        | Forwarder |------+          +-------+  |
|    Infrastructure     +-----|-----+                 |C-NMA  |  |
|                             |                       +-------+  |
|       +---------------+-----+---------+---------------+        |
|        Various network latency between different links         |
|       |               |               |               |        |
|       | /-----------\ | /-----------\ | /-----------\ |        |
+-+-----|/----+---+----\|/----+---+----\|/----+---+----\|-----+--+
  |   CATS    |   |  CATS     |   |   CATS    |   |   CATS    |
  | Forwarder |   | Forwarder |   | Forwarder |   | Forwarder |
  +-----|-----+   +-----|-----+   +-----|-----+   +-----|-----+
        |               |               |               |
  +-----|------+   +----|------+   +----|-------+   +---|--------+
  |+----------+|   |+---------+|   |+----------+|   |+----------+|
  ||  Render  ||   || Render  ||   || Engine   ||   ||  Render  ||
  ||  Engine  ||   || Engine  ||   ||Adaptation||   ||Accelerate||
  |+----------+|   |+---------+|   |+----------+|   |+----------+|  +---+---+
  |  Optimal   |   |           |   |  Optimal   |   |  Optimal   |  |C-SMA* |
  |            |   |           |   |            |   |            |  +---+---+
  |+----------+|   |           |   |+----------+|   |            |      |
  || Engine   ||   |           |   || Render   ||   |            |      |
  ||Adaptation||   |           |   ||Accelerate||   |            |      |
  |+----------+|   |           |   |+----------+|   |            |      |
  |            |   |           |   |            |   |            |      |
  +-----|------+   +-----|-----+   +-----|------+   +-----|------+      |
        +----------------+---------------+----------------+-------------+
     Service         Service         Service        Service
      Site 1          Site 2          Site3          Site 4

       Figure 1: Example of a CATS system in Sequential Service
                          Segmentation case

   Figure 1 illustrates how a CATS system should perform optimal traffic
   steering for an XR rendering service deployed as a sequential
   pipeline of subtasks, including the render engine, engine adaptation,
   and rendering acceleration.  This example is derived from the
   corresponding use case in [TR-22870-3GPP].  To return the rendered XR
   object to the client, the XR rendering request must be processed
   sequentially in the specified order by the three rendering subtasks.

3.1.  Expected CATS system flow

   *  The client sends an XR rendering request via its connected network
      to the XR application platform.

   *  The XR application platform determines that the request should be
      processed by the XR rendering pipeline and forwards the packet via
      its attached CATS Forwarder.

   *  The CATS Path Selector (CATS-PS) determines the optimal subtask
      composition of XR rendering pipeline and selects the most suitable
      instance for each subtask to steer the request.  This selection is
      based on the current status of computing and network resources at
      the sites hosting the XR rendering subtask instances.  For
      example, in Figure 1, the sequential pipeline consists of three
      subtasks: the optimal Render Engine instance is located at Site 1,
      the optimal Engine Adaptation instance at Site 3, and the optimal
      Rendering Acceleration instance at Site 4.

   *  The CATS-PS configures the CATS Forwarder with routing information
      that specifies the required processing order: from Site 1 to Site
      3 to Site 4.

   *  The packet is steered through the CATS underlay infrastructure
      following the specified routing order and is sequentially
      processed at the designated service sites.

   *  The XR application platform returns the final processed XR
      rendering result to the client.
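
   The per-subtask selection step above can be sketched as follows.
   This is a minimal illustrative sketch, not a specified CATS
   algorithm: the site names, numeric cost values, and the single
   combined cost metric are all assumptions made for this example.

```python
# Hypothetical sketch of the C-PS decision: pick the best-scoring
# instance of each subtask, then emit the resulting steering order.
# Site names, cost values, and the cost model are illustrative only.

# Candidate instances per subtask: {subtask: {site: combined cost}}
# (lower is better, e.g. an aggregate of compute load and latency).
candidates = {
    "render-engine":       {"site1": 0.2, "site2": 0.7},
    "engine-adaptation":   {"site1": 0.9, "site3": 0.3},
    "render-acceleration": {"site3": 0.8, "site4": 0.1},
}

# The subtask order is fixed by the service definition.
pipeline = ["render-engine", "engine-adaptation", "render-acceleration"]

def select_path(pipeline, candidates):
    """Greedily choose the lowest-cost instance for each subtask."""
    return [min(candidates[s], key=candidates[s].get) for s in pipeline]

steering_order = select_path(pipeline, candidates)
print(steering_order)  # ['site1', 'site3', 'site4']
```

   Note that this greedy per-subtask choice ignores the inter-site link
   latency shown in Figure 1; a C-PS that accounts for it would have to
   score whole instance combinations jointly rather than each subtask
   independently.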

3.2.  Impacts on CATS system design

   *  A CATS system should provide a method to distinguish different
      CATS candidate paths corresponding to different service subtask
      instance combinations (either a different subtask composition, or
      the same composition with different subtask instance locations).

   *  A CATS system should provide a method to deliver the service
      request to the determined optimal service subtask instance
      combination in the correct order and composition.

4.  Example 2: ML Model Vertical Partitioning Inference Parallel Subtask
    Segmentation

                +-----+             +----------+
    +-----+     |     |  +-------+  |          |
    |Input|---> |Layer|--| Layer |--|   Layer  |   Orignal ML Model
    |     |     |  1  |  | 2 (L2)|  |   3 (L3) |
    +-----+     |(L1) |  +-------+  +----------+
                +-----+

                +-----+             +----------+
    +-----+     |Split|  +-------+  |          |
    |Input|---> |  L1 |--|SplitL2|--| Split L3 |   ML Model Slice 1
    |     |\    +-----+  +-------+  +----------+
    +-----+ \
     Split   \  +-----+
              > |Split|  +-------+  +----------+
                |  L1 |--|SplitL2|--| Split L3 |   ML Model Slice 2
                +-----+  +-------+  +----------+

           Figure 2: ML model Vertical Partitioning Illustration

                      ML Inference request
                          +--------+
                          | Client |
                          +---|----+
                              |
                      +-------|-------+
   *Merges output from|       ML      | *Divides input corresponding
    Slice 1 and 2     |  App Platform |  to Slice 1, 2 input sizes
    before responding +-------|-------+
    to Client                 |
                              |
                              |  Supposed Optimal combination:
                              |  Slice 1 Site 1, Slice 2 Site 3
                              |
                              |  Forwards packet in PARALLEL:
                              |  Site 1 & 3
                        +-----|-----+------+
+-----------------------|   CATS**  |C-PS  |---------------------+
|       Underlay        | Forwarder |------+          +-------+  |
|    Infrastructure     +-----|-----+                 |C-NMA  |  |
|                             |                       +-------+  |
|       +---------------+-----+---------+---------------+        |
|        Various network latency between different links         |
|       |               |               |               |        |
|       | /-----------\ | /-----------\ | /-----------\ |        |
+-+-----|/----+---+----\|/----+---+----\|/----+---+----\|-----+--+
  |   CATS    |   |  CATS     |   |   CATS    |   |   CATS    |
  | Forwarder |   | Forwarder |   | Forwarder |   | Forwarder |
  +-----|-----+   +-----|-----+   +-----|-----+   +-----|-----+
        |               |               |               |
  +-----|------+   +----|------+   +----|-------+   +---|--------+
  |+----------+|   |+---------+|   |+----------+|   |+----------+|
  ||  Model   ||   || Model   ||   ||  Model   ||   ||  Model   ||
  ||  Slice1  ||   || Slice 1 ||   ||  Slice 2 ||   ||  Slice 2 ||
  |+----------+|   |+---------+|   |+----------+|   |+----------+|  +---+---+
  |  Optimal   |   |           |   |  Optimal   |   |            |  |C-SMA* |
  |            |   |           |   |            |   |            |  +---+---+
  |            |   |           |   |            |   |            |      |
  +-----|------+   +-----|-----+   +-----|------+   +-----|------+      |
        +----------------+---------------+----------------+-------------+
     Service         Service         Service        Service
      Site 1          Site 2          Site3          Site 4

        Figure 3: Example of a CATS system in Parallel Service
                          Segmentation case

   Figure 3 illustrates how a CATS system can perform optimal traffic
   steering for a machine learning (ML) inference service deployed as a
   parallel pipeline of subtasks, where each subtask corresponds to a
   vertically partitioned slice of the original ML model.  Based on the
   ML model splitting use cases described in [SplitPlace] and [Gillis],
   Figure 2 shows how an ML model can be vertically partitioned
   into slices that are executed in parallel to reduce inference
   response time.  The input inference data from the client should be
   partitioned according to the input dimensions expected by each model
   slice.  These slices then process their respective inputs in
   parallel, and the resulting outputs are merged to produce the final
   inference result, which is returned to the client.

4.1.  Expected CATS system flow

   *  The client sends an ML inference request via its connected network
      to the ML application platform.

   *  The ML application platform determines that the request should be
      processed by the vertical ML model partitioning pipeline.

   *  The CATS-PS determines the optimal subtask composition for a
      vertically partitioned machine learning (ML) model pipeline and
      selects the most suitable instance for each subtask to steer the
      request.  For example, in Figure 3, the ML model is partitioned
      into two vertical slices: Model Slice 1 is deployed at Service
      Site 1, and Model Slice 2 is deployed at Service Site 3.

   *  The CATS Path Selector (CATS-PS) communicates its pipeline
      decision to the ML application platform.  The platform then pre-
      processes the client’s inference input into two smaller input
      slices, based on the input dimensions required by each model
      slice.

   *  Each input slice is forwarded in parallel to its corresponding
      model slice instance at the designated locations (Site 1 and Site
      3).

   *  Once the processed outputs are returned from each model slice, the
      ML application platform merges them to produce the final ML
      inference result, which is then returned to the client.
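
   The split/dispatch/merge steps above can be sketched as follows.
   This is an illustrative sketch only: the input sizes, the stand-in
   slice functions, and the concatenation-based merge are assumptions,
   not behavior defined by [SplitPlace] or [Gillis].

```python
# Hypothetical sketch of the parallel flow: the application platform
# splits the inference input per slice, dispatches the parts in
# parallel, and merges the outputs.  Slice functions are stand-ins.
from concurrent.futures import ThreadPoolExecutor

def split_input(data, sizes):
    """Partition the flat input by each slice's expected input size."""
    parts, start = [], 0
    for n in sizes:
        parts.append(data[start:start + n])
        start += n
    return parts

def run_parallel(parts, slice_fns):
    """Send each input part to its model-slice instance concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda pf: pf[1](pf[0]),
                             zip(parts, slice_fns)))

def merge(outputs):
    """Merge the per-slice outputs into the final inference result."""
    return [y for out in outputs for y in out]

# Two stand-in slices, as if hosted at Site 1 and Site 3.
slice1 = lambda xs: [x * 2 for x in xs]   # "Model Slice 1"
slice2 = lambda xs: [x + 1 for x in xs]   # "Model Slice 2"

parts = split_input([1, 2, 3, 4], sizes=[2, 2])
result = merge(run_parallel(parts, [slice1, slice2]))
print(result)  # [2, 4, 4, 5]
```

   Unlike the sequential case, the forwarding order here does not
   matter; what the CATS system must preserve is the association
   between each input part and its designated slice instance.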

4.2.  Impacts on CATS system design

   *  A CATS system should provide a method to distinguish different
      CATS candidate paths corresponding to different service subtask
      instance combinations (either a different subtask composition, or
      the same composition with different subtask instance locations).

   *  A CATS system should coordinate with the segmented service
      platform entity to pre-process the original request data into the
      appropriate input formats required by the determined parallel
      subtasks.

5.  Comparison between Normal and Service Segmentation CATS scenarios

   In the normal CATS scenario:

   *  The CATS system objective is to select an optimal service
      instance to serve a service request.

   *  Different candidate CATS paths arise from differences in the
      service instances' computing and network resource status.

   *  The CATS system delivers the service request to the determined
      optimal service instance

   In the Service Segmentation CATS scenario:

   *  The CATS system objective is selecting an optimal service subtask
      combination.  An optimal combination is composed of the optimal
      instances of each service subtask.

   *  Different candidate CATS paths arise from the service subtask
      instances' computing and network resource status, as well as from
      possible service segmentation variations (e.g., a service
      segmented into different numbers of subtasks).

   *  The CATS system delivers the service request to the determined
      optimal combination of service subtask instances in the correct
      order (sequential/parallel) and subtask composition.

6.  CATS system design considerations to support Service Segmentation

   As AR/VR and Distributed AI Inference are among the CATS-supported
   use cases listed in [draft-ietf-cats-usecases-requirements], the CATS
   system should also fully support scenarios where service segmentation
   is applied to these use cases.

   This section outlines three CATS system design considerations that
   are not yet addressed in existing CATS WG documents, including the
   Problem Statement and Requirements document
   ([draft-ietf-cats-usecases-requirements]), the Framework document
   ([draft-ietf-cats-framework]), and the Metric Definition document
   ([draft-ietf-cats-metric-definition]):

   - Traffic Steering Objective:

   *  The optimal service instance can be a sequential/parallel
      pipeline that consists of the optimal instances of each subtask
      composing the service, instead of a single entity providing the
      complete service functionality, as assumed in the conventional
      CATS scenario.

   - Traffic Steering Mechanism:

   *  The CATS system may be required to provide a mechanism to steer
      service requests in a predetermined sequence, as in the case of
      sequential service segmentation.
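
   As one illustration of such a mechanism, an SR-based data plane
   (explored for CATS in [draft-lbdd-cats-dp-sr] and
   [draft-ietf-spring-sr-service-programming]) can encode the required
   sequence as an ordered segment list.  The sketch below is purely
   illustrative; the SID strings and packet representation are invented
   for the example.

```python
# Illustrative only: a predetermined subtask sequence encoded as an
# ordered segment list, in the spirit of SR-based service steering.
# The SID values and the packet representation are invented.
segment_list = [
    "sid:site1:render-engine",
    "sid:site3:engine-adaptation",
    "sid:site4:render-accelerate",
]

def steer(packet, segments):
    """Visit each service segment in order, recording the processing."""
    for sid in segments:
        packet["processed_by"].append(sid)  # stand-in for the subtask
    return packet

pkt = steer({"payload": "xr-frame", "processed_by": []}, segment_list)
print(pkt["processed_by"])
```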

   - CATS Metrics Aggregation:

   *  CATS metrics can be aggregated not only by metric category (e.g.,
      computing, networking) but also by individual service subtask.
      For instance, the CATS metric representing a candidate
      combination of subtasks may be derived by aggregating the metrics
      of its component subtasks.

   *  One possible realization of such metric aggregation is "Service
      Pipeline Metrics".

                              Service Pipeline Metrics
                                      +------+
                                      |  M3  |
                                      +------+
                                          |
                     -----------------------------------------
                     |                                       |
              +-------------+                         +-------------+
L2:           |    M2-A     |                         |    M2-X     |
              | (Subtask A) |         (  ...  )       | (Subtask X) |
              +-------------+                         +-------------+
                     |                                       |
            -------------------                       -------------------
            |                 |                       |                 |
         +------+         +------+                 +------+         +------+
L1:      | M1-A |  (...)  | M1-A |                 | M1-X |  (...)  | M1-X |
         +------+         +------+                 +------+         +------+
            |                 |                       |                 |
       -----------       -----------             -----------       -----------
       |    |    |       |    |    |             |    |    |       |    |    |
L0:   M0  (...)  M0     M0  (...)  M0           M0  (...)  M0     M0  (...)  M0

           Figure 4: New CATS Metric Aggregation Level

   The model organizes CATS-related measurements into multiple levels
   of abstraction.  Level 0 (L0) metrics capture primitive
   measurements (e.g., resource, traffic, or system observations).
   Level 1 (L1) metrics are derived from L0 metrics to represent
   service-relevant aspects at a finer granularity.  Level 2 (L2)
   metrics summarize the overall capability or suitability of each
   subtask (or subtask instance) by aggregating its L1 metrics.
   Finally, a pipeline-level metric (denoted as M3) is computed by
   aggregating the L2 metrics across all subtasks in a candidate
   pipeline, yielding a single score that can be used to compare
   candidate pipelines.
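
   The level-by-level aggregation described above can be made concrete
   with a small numeric sketch.  The raw measurements and the
   aggregation operators chosen here (mean for L0 to L1, best instance
   for L1 to L2, bottleneck for M3) are illustrative assumptions, not
   operators defined by the CATS WG documents.

```python
# Hypothetical sketch of the L0 -> L1 -> L2 -> M3 rollup in Figure 4.
# Values and aggregation operators are illustrative only.

def mean(xs):
    return sum(xs) / len(xs)

# L0: primitive measurements per subtask instance (e.g. load samples).
l0 = {
    "A": {"inst1": [0.25, 0.75], "inst2": [0.5, 1.0]},
    "X": {"inst1": [0.25, 0.25]},
}

# L1: one derived metric per instance, from its L0 samples.
l1 = {s: {i: mean(v) for i, v in insts.items()}
      for s, insts in l0.items()}

# L2: one metric per subtask, summarizing its instances (best one).
l2 = {s: min(insts.values()) for s, insts in l1.items()}

# M3: the pipeline-level score, here the bottleneck across subtasks.
m3 = max(l2.values())
print(l2, m3)  # {'A': 0.5, 'X': 0.25} 0.5
```

   A max (bottleneck) M3 suits load-like metrics where the worst
   subtask dominates; additive aggregation would instead suit
   per-subtask latency along a sequential pipeline.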

   *  Aggregating metrics at the pipeline level can also reduce
      control-plane signaling overhead by avoiding the need to
      disseminate fine-grained metrics for each individual service
      replica.  This consideration becomes increasingly important in
      large-scale 6G deployments, where the number of service replicas
      within a metro-area cell may grow to the order of hundreds.

7.  Normative References

   [draft-ietf-cats-framework]
              Li, C., et al., "A Framework for Computing-Aware Traffic
              Steering (CATS)",  draft-ietf-cats-framework, February
              2026.

   [draft-ietf-cats-metric-definition]
              Yao, K., et al., "CATS Metrics Definition",  draft-ietf-
              cats-metric-definition, February 2026.

   [draft-ietf-cats-usecases-requirements]
              Yao, K., et al., "Computing-Aware Traffic Steering (CATS)
              Problem Statement, Use Cases, and Requirements",  draft-
              ietf-cats-usecases-requirements, February 2026.

   [draft-ietf-spring-sr-service-programming]
              Abdelsalam, A., et al., "Service Programming with Segment
              Routing",  draft-ietf-spring-sr-service-programming,
              February 2026.

   [draft-lbdd-cats-dp-sr]
              Li, C., et al., "Computing-Aware Traffic Steering (CATS)
              Using Segment Routing",  draft-lbdd-cats-dp-sr, October
              2025.

   [draft-li-cats-task-segmentation-framework]
              Li, C., et al., "A Task Segmentation Framework for
              Computing-Aware Traffic Steering",  draft-li-cats-task-
              segmentation-framework, December 2024.

   [Gillis]   Yu, M., Jiang, Z., Chun Ng, H., Wang, W., Chen, R., and B.
              Li, "Gillis: Serving Large Neural Networks in Serverless
              Functions with Automatic Model Partitioning", October
              2021, <https://doi.org/10.1109/ICDCS51616.2021.00022>.

   [SplitPlace]
              Tuli, S., Casale, G., and N. Jennings, "SplitPlace: AI
              Augmented Splitting and Placement of Large-Scale Neural
              Networks in Mobile Edge Environments", May 2022,
              <https://doi.org/10.1109/TMC.2022.3177569>.

   [TR-22870-3GPP]
              "Study on 6G Use Cases and Service Requirements", June
              2025,
              <https://portal.3gpp.org/desktopmodules/Specifications/
              SpecificationDetails.aspx?specificationId=4374>.

Authors' Addresses

   Minh-Ngoc Tran
   Soongsil University
   369, Sangdo-ro, Dongjak-gu
   Seoul
   06978
   Republic of Korea
   Email: mipearlska1307@dcn.ssu.ac.kr

   Kiem Nguyen Trung
   Soongsil University
   369, Sangdo-ro, Dongjak-gu
   Seoul
   06978
   Republic of Korea
   Email: kiemnt@dcn.ssu.ac.kr

   Younghan Kim
   Soongsil University
   369, Sangdo-ro, Dongjak-gu
   Seoul
   06978
   Republic of Korea

   Phone: +82 10 2691 0904
   Email: younghak@ssu.ac.kr
