Additional CATS requirements consideration for Service Segmentation-related use cases
draft-dcn-cats-req-service-segmentation-00
Authors: Trần Minh Ngọc, Younghan Kim
Last updated: 2025-03-03
cats N. Tran
Internet-Draft Y. Kim
Intended status: Informational Soongsil University
Expires: 4 September 2025 3 March 2025
Additional CATS requirements consideration for Service Segmentation-
related use cases
draft-dcn-cats-req-service-segmentation-00
Abstract
This document discusses possible additional Computing-Aware Traffic
Steering (CATS) requirements that arise when service segmentation is
considered in related CATS use cases such as AR-VR and Distributed AI
Training.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 4 September 2025.
Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Tran & Kim Expires 4 September 2025 [Page 1]
Internet-Draft cats-req-service-segmentation March 2025
Table of Contents
1. Introduction
2. Terminology used in this draft
3. Differences compared with normal CATS scenario
4. Possible Additional CATS Requirements
5. Example 1: AR-VR Hologram Sequence Subtask Segmentation
6. Example 2: Federated Learning model training Parallel Subtask
Segmentation
7. Normative References
Authors' Addresses
1. Introduction
Service segmentation is a service deployment option that splits a
service into smaller subtasks, which can be executed in parallel or in
sequence before the subtask execution results are aggregated to serve
the service request [draft-li-cats-task-segmentation-framework]. It
is widely considered as a way to improve the performance of services
such as AR-VR and Distributed AI Training, which are also key CATS use
cases [draft-ietf-cats-usecases-requirements]. For example, according
to [Ericssion-holographic-5g], an AR holographic communication service
can be implemented as a pipeline of pre-processing, encoding/decoding,
and rendering subtasks. These subtasks can have multiple instances
running at several edge computing sites. Similarly, a federated
learning model training service can be implemented in a hierarchical
manner according to [hierfedml-ieee-parallel-distributed-system]. In
this case, the federated learning global model aggregator service
combines the local model training results from multiple worker model
aggregators and computing devices. Different combinations of worker
model aggregators and devices can affect the global model training
performance. Hence, a CATS system should take these different
subtask combinations into account in its design.
This document discusses how applying CATS in this service
segmentation scenario differs from the normal CATS scenario, where a
service instance is not segmented. Based on these differences,
possible additional CATS requirements are proposed and analyzed via
examples from the AR-VR and Distributed AI Training CATS use cases.
2. Terminology used in this draft
This document reuses the CATS component terminology defined in
[draft-ietf-cats-framework].
3. Differences compared with normal CATS scenario
Compared with the normal CATS scenario, where a service instance is a
single entity, applying CATS in the service segmentation scenario
introduces key differences that might affect the CATS system design.
The differences that need to be considered are as follows:
* Each subtask can have multiple instances running at different
computing sites or devices, whose computing and network resource
capabilities vary over time.
* A service might have multiple parallel or sequential subtask
combination options. Different subtask combinations might have
different numbers of subtasks and be composed of different subtask
instances.
* A different number of subtasks results in a different CATS cost
for each subtask combination.
* Different subtask instances result in a different CATS cost for
each subtask combination.
* Instead of selecting an optimal service instance among the
candidate instances, the CATS objective becomes selecting an
optimal combination of subtask instances among the candidate
combinations.
4. Possible Additional CATS Requirements
To handle the differences mentioned above, this document proposes the
following additional CATS requirements:
* R1: CATS metrics and CATS metric aggregation should consider each
subtask instance's computing and network resource conditions and
distinguish the capabilities of the different candidate subtask
combinations that can serve a CATS service request.
* R2: A CATS system should provide a mechanism to notify, guide, or
request the computing entities that host the services and service
subtasks to implement the determined optimal subtask combination.
* R3: A CATS system should provide a mechanism to map the service
request to the corresponding segmented subtasks if the original
service does not exist and only subtask instance endpoints are
available.
5. Example 1: AR-VR Hologram Sequence Subtask Segmentation
                     Request AR hologram
                          +--------+
                          | Client |
                          +---|----+
                              |
                      +-------|-------+
                      |  Service***   |  ***R3: Map request
                      |    Request    |         to decode + render
                      | Segmentation  |         subtasks
                      |   Component   |
                      +-------|-------+
      **R2: Route request to  |  *R1: Different subtask combination
            the determined    |       CATS cost (Decode + Render)
            subtask sequence  |               - Decode Site 1/3/4 &
                        +-----|-----+------+  - Render Site 1/2/3
+-----------------------|  CATS**   |C-PS* |---------------------+
|  Underlay**           | Forwarder |------+     +-------+       |
|  Infrastructure       +-----|-----+            |C-NMA* |       |
|                             |                  +-------+       |
|       +---------------+-----+---------+---------------+        |
|      3ms             4ms             3ms             2ms       |
|    nw delay        nw delay        nw delay        nw delay    |
|       |               |               |               |        |
|       |               |               |               |        |
|       |      2ms      |      2ms      |      3ms      |        |
|       |   nw delay    |   nw delay    |   nw delay    |        |
|       | /-----------\ | /-----------\ | /-----------\ |        |
+-+-----|/----+---+----\|/----+---+----\|/----+---+----\|-----+--+
  |  CATS**   |   |  CATS**   |   |  CATS**   |   |  CATS**   |
  | Forwarder |   | Forwarder |   | Forwarder |   | Forwarder |
  +-----|-----+   +-----|-----+   +-----|-----+   +-----|-----+
        |               |               |               |
  +-----|-----+   +-----|-----+   +-----|-----+   +-----|-----+
  |+---------+|   |+---------+|   |+---------+|   |+---------+|
  || Decode  ||   || Render  ||   || Decode  ||   || Decode  ||
  |+---------+|   |+---------+|   |+---------+|   |+---------+|   +---+---+
  | 3ms delay |   | 3ms delay |   | 5ms delay |   | 8ms delay |   |C-SMA* |
  |           |   |           |   |           |   |           |   +---+---+
  |+---------+|   |           |   |+---------+|   |           |       |
  || Render  ||   |           |   || Render  ||   |           |       |
  |+---------+|   |           |   |+---------+|   |           |       |
  | 9ms delay |   |           |   | 7ms delay |   |           |       |
  +-----|-----+   +-----|-----+   +-----|-----+   +-----|-----+       |
        +---------------+---------------+---------------+-------------+
     Service         Service         Service         Service
     Site 1          Site 2          Site 3          Site 4
Figure 1: Example of additional CATS requirement in an AR use
case example
Figure 1 illustrates the additional CATS requirements in an AR
hologram service use case referenced from [Ericssion-holographic-5g].
This example service is responsible for returning a processed 3D
hologram upon receiving a request from an AR client (e.g., AR
glasses). The original full service is not available in the network.
Instead, the service is segmented into 2 subtasks: decoding and
rendering. These subtasks have multiple instances running at
different service sites. The current computing resource status of
each service site and the current number of requests served by each
service instance result in different decoding and rendering computing
delays, as shown in the figure. In addition, the network delay
between the AR client and each service site differs.
When CATS is applied to this example scenario, the additional CATS
requirements can be explained as follows:
R1: CATS metrics and CATS metric aggregation should consider each
subtask instance's computing and network resource conditions and
distinguish the capabilities of the different candidate subtask
combinations that can serve a CATS service request.
* In this case, each candidate CATS path is represented by a
combination of one Decode service instance and one Render service
instance chosen from the instances available at the 4 service
sites. There are multiple combination options, such as the Decode
instance at Service Site 1 with the Render instance at Service
Site 2, the Decode instance at Service Site 4 with the Render
instance at Service Site 3, or both the Decode and Render
instances at the same site (Service Site 1 or 3). For each subtask
combination, the computing CATS metrics of the Decode and Render
instances, along with the network CATS metrics of the
corresponding service sites (between the client and the sites and
between the sites), should be aggregated. For example, in
Figure 1, the combination of the Decode instance at Service Site 1
and the Render instance at Service Site 2 has a total expected
CATS delay of 15 ms: 6 ms of computing delay (3 ms at each
instance) and 9 ms of network delay (3 ms from the client to
Service Site 1, 2 ms from Service Site 1 to Service Site 2, and
4 ms from Service Site 2 back to the client).
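The aggregation described for R1 can be sketched as follows. This is only an illustrative Python sketch, not a mechanism defined by this document: it uses the delay values shown in Figure 1 and a plain sum as the aggregation function. The names (CLIENT_TO_SITE, total_delay, and so on) are hypothetical, and the delay between non-adjacent sites is an assumption (the sum of the adjacent-site delays along the chain), since the figure only labels adjacent sites.

```python
from itertools import product

# Delay values read from Figure 1, in milliseconds.
CLIENT_TO_SITE = {1: 3, 2: 4, 3: 3, 4: 2}        # ingress CATS Forwarder <-> site
ADJACENT_SITE_DELAY = {(1, 2): 2, (2, 3): 2, (3, 4): 3}
DECODE_DELAY = {1: 3, 3: 5, 4: 8}                # sites hosting a Decode instance
RENDER_DELAY = {1: 9, 2: 3, 3: 7}                # sites hosting a Render instance


def inter_site_delay(a: int, b: int) -> int:
    """Assumption: non-adjacent sites are reached hop by hop along the chain."""
    lo, hi = sorted((a, b))
    return sum(ADJACENT_SITE_DELAY[(s, s + 1)] for s in range(lo, hi))


def total_delay(decode_site: int, render_site: int) -> int:
    """Sum network and computing delay for client -> Decode -> Render -> client."""
    return (CLIENT_TO_SITE[decode_site] + DECODE_DELAY[decode_site]
            + inter_site_delay(decode_site, render_site)
            + RENDER_DELAY[render_site] + CLIENT_TO_SITE[render_site])


# Every (Decode site, Render site) pair is one candidate CATS path.
candidates = {(d, r): total_delay(d, r)
              for d, r in product(DECODE_DELAY, RENDER_DELAY)}
best = min(candidates, key=candidates.get)
print(best, candidates[best])  # (1, 2) 15
```

Under these assumptions there are 9 candidate combinations, and the Decode instance at Service Site 1 with the Render instance at Service Site 2 is the lowest-delay one, matching the 15 ms example in the text.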
R2: A CATS system should provide a mechanism to notify, guide, or
request the computing entities that host the services and service
subtasks to implement the determined optimal subtask combination.
* In this case, the CATS Forwarders and the underlay infrastructure
should provide a mechanism to route the client's AR hologram
service request along the optimal subtask sequence determined by
the CATS system. For example, if the combination of the Decode
instance at Service Site 1 and the Render instance at Service
Site 2 is selected, the request should be routed in order via the
CATS Forwarders at the client side, Service Site 1, and then
Service Site 2, before the final response is returned to the
client. Segment Routing is an example method to meet this
requirement by routing the request via a list of routing segments
([draft-ietf-spring-sr-service-programming],
[draft-lbdd-cats-dp-sr]).
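As a rough illustration of R2, the ordered list of hops for a selected combination could be built as in the Python sketch below. The function name and the forwarder identifiers (cats-fwd-ingress, cats-fwd-site1, ...) are hypothetical placeholders. They are not real segment identifiers, and the sketch does not model the encoding used by any Segment Routing data plane.

```python
def build_segment_list(ingress: str,
                       selected: list[tuple[str, str]],
                       client: str) -> list[str]:
    """Return the ordered hops: ingress, one hop per subtask, then the client."""
    segments = [ingress]
    for subtask, forwarder in selected:
        # One segment per subtask: steer to this forwarder, then run the subtask.
        segments.append(f"{forwarder}/{subtask}")
    # The final response goes back to the client.
    segments.append(client)
    return segments


# Combination from the text: Decode at Service Site 1, Render at Service Site 2.
selected = [("decode", "cats-fwd-site1"), ("render", "cats-fwd-site2")]
path = build_segment_list("cats-fwd-ingress", selected, "client")
print(" -> ".join(path))
```

The point of the sketch is only that the order of the list carries the subtask sequence, so the forwarders do not need to re-run the path selection at each hop.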
R3: A CATS system should provide a mechanism to map the service
request to the corresponding segmented subtasks if the original
service does not exist and only subtask instance endpoints are
available.
* In this case, because there is no full AR hologram service, the
service can only be realized by chaining its subtasks. Hence, the
CATS system should provide a component that can segment the
service request into the corresponding subtasks and return the
response from these subtasks to the client. The Task Segmentation
Module discussed in [draft-li-cats-task-segmentation-framework] is
an example.
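A minimal sketch of the mapping that R3 asks for is shown below, assuming a static lookup table. The table names, the service identifier "ar-hologram", and the function segment_request are hypothetical. The Task Segmentation Module in [draft-li-cats-task-segmentation-framework] is not specified at this level of detail here.

```python
# Hypothetical registry: service identifier -> ordered subtasks.
SERVICE_TO_SUBTASKS = {"ar-hologram": ["decode", "render"]}

# Service sites hosting an instance of each subtask, as drawn in Figure 1.
SUBTASK_INSTANCES = {"decode": [1, 3, 4], "render": [1, 2, 3]}


def segment_request(service_id: str) -> list[tuple[str, list[int]]]:
    """Map a service request to its ordered subtasks and their candidate sites."""
    subtasks = SERVICE_TO_SUBTASKS.get(service_id)
    if subtasks is None:
        raise LookupError(f"no segmentation known for service {service_id!r}")
    return [(name, SUBTASK_INSTANCES[name]) for name in subtasks]


print(segment_request("ar-hologram"))
```

The result is the input that the path selection for R1 would then work on: an ordered subtask list with the candidate sites for each subtask.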
6. Example 2: Federated Learning model training Parallel Subtask
Segmentation
                      Request FL model
                          +--------+
                          | Client |
                          +---|----+
                              |  **R1: Different subtask combination
 **R2: Ask Global Aggregator  |        CATS cost (Global + Worker + Device)
       to use the determined  |               - Worker 1/2/1+2/3+4/3+4+5...
       combination      +-----|-----+------+  - Device 1/2/1+2+3/4+5+...
+-----------------------|   CATS    |C-PS**|---------------------+
|                       | Forwarder |------+     +-------+       |
|  Underlay             +-----|-----+            |C-NMA**|       |
|  Infrastructure             |                  +-------+       |
|              +--------------+-----------------+                |
|             3ms                              4ms               |
|          nw delay                         nw delay             |
|              |                                |                |
+--------+-----|-----+--------------------+-----|-----+----------+
         |   CATS    |                    |   CATS    |
         | Forwarder |                    | Forwarder |
         +-----|-----+                    +-----|-----+
         +-----|-----+                    +-----|-----+
         |  Global   |     +-------+      |  Global   |
         | Aggregator|     |C-SMA**|      | Aggregator|
         | Instance 1|     +-------+      | Instance 2|
         +-|------|--+                    +-/----|----\
           |      |                        /     |     \
Different network delay between different Worker and Global Aggregators
          /        \                      /      |             \
+--------/-+  +-----\----+     +---------/+  +---|------+  +----\-----+
|  Worker  |  |  Worker  |     |  Worker  |  |  Worker  |  |  Worker  |
|Aggregator|  |Aggregator|     |Aggregator|  |Aggregator|  |Aggregator|
|Instance 1|  |Instance 2|     |Instance 3|  |Instance 4|  |Instance 5|
|          |  |          |     |          |  |          |  |          |
|now serve:|  |now serve:|     |now serve:|  |now serve:|  |now serve:|
|-3 models |  |-2 models |     |-3 models |  |-1 model  |  |-2 models |
|-5 devices|  |-7 devices|     |-4 devices|  |-6 devices|  |-8 devices|
+-----|----+  +----|-----+     +----|-----+  +----|-----+  +----|-----+
      |            |                |             |             |
Different network delay between different devices and Worker Aggregators
      |            |                |             |             |
+-----|------------|----------------|-------------|-------------|-----+
|                       Local Training Devices                        |
|               (Device 1, Device 2, ......., Device N)               |
|                 (Different computing capabilities)                  |
+---------------------------------------------------------------------+
Figure 2: Example of additional CATS requirement in a
Hierarchical Federated Learning use case example
Figure 2 illustrates the additional CATS requirements in a Federated
Learning Model Training service use case referenced from
[hierfedml-ieee-parallel-distributed-system]. This example service
is responsible for returning a trained federated learning model upon
receiving a request from a client. The federated learning service is
implemented in a hierarchical manner. The service endpoint that
receives client requests is a Global federated learning Aggregator,
which can have multiple service instances. Upon receiving a request
for a trained model, one or more Worker Aggregators and Local
Training Devices are assigned to locally train the model for the
Global Aggregator. The number of Training Devices assigned to each
Worker Aggregator also varies. Each Worker Aggregator aggregates the
local model parameters from its assigned devices, and the Global
Aggregator aggregates the parameters from the Workers to create the
global model used to reply to the client request.
When CATS is applied to this example scenario, the additional CATS
requirements can be explained as follows:
R1: CATS metrics and CATS metric aggregation should consider each
subtask instance's computing and network resource conditions and
distinguish the capabilities of the different candidate subtask
combinations that can serve a CATS service request.
* In this case, there are multiple combinations of Worker
Aggregators and Local Training Devices that can be assigned to a
single Global Aggregator instance. Hence, selecting only a Global
Aggregator service instance is not enough. Different numbers of
Worker Aggregators per Global Aggregator and different numbers of
Training Devices per Worker Aggregator can result in different
Global Aggregator model training performance. Besides, the
computing resources (CPU/GPU/memory/etc.) differ between Devices
and between Worker Aggregators. For a Worker Aggregator, apart
from the computing resources, the number of models and devices it
is currently serving can also affect the model aggregation
performance, for example through congestion. Network conditions
between Devices and Aggregators also vary. Hence, CATS metrics
should reflect the computing and network resource status of each
Device and Aggregator. Each CATS candidate path should be
represented by a metric aggregation over a Global Aggregator
instance, one or more Worker Aggregator instances, and their
associated Local Training Devices.
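One possible way to aggregate metrics for such a parallel combination is sketched below in Python. This document does not define the aggregation function. The sketch assumes that the duration of one training round is bounded by the slowest parallel branch (a max over devices and over workers), in contrast to the sequential sum used in Example 1. All class names, field names, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    train_ms: float       # local training time on the device
    to_worker_ms: float   # network delay, device -> Worker Aggregator


@dataclass
class Worker:
    aggregate_ms: float   # aggregation time; depends on current models/devices served
    to_global_ms: float   # network delay, Worker -> Global Aggregator
    devices: list[Device]


def worker_round_ms(worker: Worker) -> float:
    """A worker can only aggregate after its slowest device has reported."""
    slowest = max(d.train_ms + d.to_worker_ms for d in worker.devices)
    return slowest + worker.aggregate_ms + worker.to_global_ms


def combination_round_ms(client_to_global_ms: float,
                         global_aggregate_ms: float,
                         workers: list[Worker]) -> float:
    """Round time of one Global + Workers + Devices combination (assumed model)."""
    slowest = max(worker_round_ms(w) for w in workers)
    return client_to_global_ms + slowest + global_aggregate_ms


# Two hypothetical candidate combinations for the same Global Aggregator.
option_a = [Worker(4, 2, [Device(20, 3), Device(35, 1)])]
option_b = [Worker(6, 2, [Device(20, 3)]), Worker(3, 5, [Device(25, 2)])]
print(combination_round_ms(3, 5, option_a))  # 50
print(combination_round_ms(3, 5, option_b))  # 43
```

The sketch only covers delay. A real metric would likely also need to reflect how the number of participating devices affects the quality of the trained model, which is outside what this example models.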
R2: A CATS system should provide a mechanism to notify, guide, or
request the computing entities that host the services and service
subtasks to implement the determined optimal subtask combination.
* In this case, the CATS Path Selector should inform the Global
Aggregator instance selected by CATS, or the hierarchical
federated learning orchestration entity, to use the chosen
combination of Global Aggregator instance, Worker Aggregator
instances, and Local Training Devices to train the federated
learning model.
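For illustration only, the information that the CATS Path Selector passes to the Global Aggregator could look like the JSON message built in the Python sketch below. The message format, field names, and identifiers are hypothetical. This document does not define a notification protocol or encoding.

```python
import json


def build_combination_notice(request_id: str,
                             global_instance: str,
                             assignment: dict[str, list[str]]) -> str:
    """Serialize the selected combination: one Global, its Workers, their Devices."""
    notice = {
        "request_id": request_id,
        "global_aggregator": global_instance,
        "workers": [
            {"worker": worker, "devices": sorted(devices)}
            for worker, devices in sorted(assignment.items())
        ],
    }
    return json.dumps(notice, sort_keys=True)


# Hypothetical selection: Global Aggregator 1 with Workers 1 and 2.
notice = build_combination_notice(
    "req-42",
    "global-1",
    {"worker-2": ["device-3"], "worker-1": ["device-2", "device-1"]},
)
print(notice)
```

Whether such a notice is sent to the Global Aggregator instance itself or to a federated learning orchestration entity is left open by the text above, and the sketch does not depend on that choice.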
R3: A CATS system should provide a mechanism to map the service
request to the corresponding segmented subtasks if the original
service does not exist and only subtask instance endpoints are
available.
* In this case, this requirement is not necessary because the full
original service (the Global Aggregator) exists and serves the
request. The CATS system only handles routing between the client
and the Global Aggregator instances.
7. Normative References
[draft-ietf-cats-framework]
Li, C., et al., "A Framework for Computing-Aware Traffic
Steering (CATS)", draft-ietf-cats-framework, February
2025.
[draft-ietf-cats-usecases-requirements]
Yao, K., et al., "Computing-Aware Traffic Steering (CATS)
Problem Statement, Use Cases, and Requirements", draft-
ietf-cats-usecases-requirements, February 2025.
[draft-ietf-spring-sr-service-programming]
Clad, F., Ed., et al., "Service Programming with Segment
Routing", draft-ietf-spring-sr-service-programming,
February 2025.
[draft-lbdd-cats-dp-sr]
Li, C., et al., "Computing-Aware Traffic Steering (CATS)
Using Segment Routing", draft-lbdd-cats-dp-sr, January
2025.
[draft-li-cats-task-segmentation-framework]
Li, C., et al., "A Task Segmentation Framework for
Computing-Aware Traffic Steering", draft-li-cats-task-
segmentation-framework, December 2024.
[Ericssion-holographic-5g]
"HOLOGRAPHIC COMMUNICATION IN 5G NETWORKS", May 2022,
<https://www.ericsson.com/49a8b1/assets/local/reports-
papers/ericsson-technology-review/docs/2022/holographic-
communication-in-5g-networks.pdf>.
[hierfedml-ieee-parallel-distributed-system]
Xu, Z., Zhao, D., Liang, W., Rana, O., Zhou, P., and M.
Li, "HierFedML: Aggregator Placement and UE Assignment for
Hierarchical Federated Learning in Mobile Edge Computing",
January 2023, <https://doi.org/10.1109/TPDS.2022.3218807>.
Authors' Addresses
Minh-Ngoc Tran
Soongsil University
369, Sangdo-ro, Dongjak-gu
Seoul
06978
Republic of Korea
Email: mipearlska1307@dcn.ssu.ac.kr
Younghan Kim
Soongsil University
369, Sangdo-ro, Dongjak-gu
Seoul
06978
Republic of Korea
Phone: +82 10 2691 0904
Email: younghak@ssu.ac.kr