Internet-Draft Slice-Media-Service October 2023
Jiang & Wang Expires 25 April 2024
TSV Working Group
Intended Status:
T. Jiang
China Mobile
D. Wang
China Mobile

Encoding 3GPP Slices for Interactive Media Services


Extended Reality & multi-modality communication, or XRM, is a type of advanced service that has been studied and standardized in the 3GPP SA2 working group. It targets high data rates, ultra-low latency, and high reliability. The streams of an XRM service may comprise data from multiple modalities, such as video, audio, ambient-sensor and haptic detection. The XRM service faces challenges in several areas, e.g., accurate multi-modality data synchronization, QoS differentiation, and large packet volumes. While a new 3GPP network slice type, HDLLC, has recently been introduced to handle the QoS requirements of XRM streams, the pervasive encryption of packet headers and/or payloads poses additional challenges to the transport of encoded video packets via the 5GS. This document discusses two candidate IETF schemes, IP-DSCP mapping and a UDP-option extension, that could be applied to 'expose' XRM QoS 'metadata' to the 5GS.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 25 April 2024.

1. Introduction

Extended Reality & multi-modality communication, or XRM, is a type of advanced service that has been studied and standardized in the 3GPP SA2 working group. With the objective of achieving economical communication overhead, ultra-low latency, high reliability and strong security, it features multi-modal interactions among a group of service entities that are (geographically) distributed at the mobile network edges. Because it seamlessly integrates multiple types of streams sourced from multiple inputs, it is widely applicable in fields such as AR/VR, telepresence, gaming, and education.

The XRM service can consolidate the inputs from more than one source and disseminate information to multiple destinations. That is, the input data from different kinds of devices/sensors, or the output data to different types of destinations, are required for the same task or application. This type of communication scheme has the intrinsic advantage of providing services that complement each other, or that even bring progressive add-on gains, so that redundant delivery and information accuracy can be achieved effectively. In general, the streams of an XRM service may comprise data from four modalities, namely video, audio, ambient-sensor and haptic detection.

Owing to the distinct nature and requirements of each modality, and especially the need to meet the 5G service requirements of different types of media streams with coordinated throughput, latency and reliability, the XRM service faces challenges in several areas, e.g., the characteristics of the data generated by each modality, accurate multi-modality data synchronization, QoS differentiation, a large volume of small packets, and packet-size variation [_5G.TACMM].

XRM services use (multiple) IP sessions to carry and transport data frames. With each session corresponding to one modality stream, coordinated transmission among multiple modality flows (or streams) needs to be guaranteed. The application clients for the different types of data of one application may be located at one destination (e.g., a UE) or at multiple destinations (e.g., VR glasses, gloves, and more).

In the current 5GS, a QoS Flow is the finest granularity of QoS differentiation in a PDU Session, which means that all PDUs in a QoS Flow are treated according to the same QoS requirements. For the XRM service specifically, a group of packets carries the payload of a PDU Set (e.g., a frame or a video slice/tile). Packets in a media stream that belong to the same PDU Set are decoded/handled as a whole. For example, a frame/video slice may only be decodable at the receiver if all, or a certain number, of the packets carrying it are successfully delivered. Likewise, a frame within a GOP (Group of Pictures) can only be decoded by the client if all frames on which it depends are successfully received. Hence the packets within a PDU Set have an inherent dependency on each other at the media layer. If the 5GS does not consider this inter-dependency, it may schedule inefficiently. For example, the 5GS may randomly drop some packets of a PDU Set but still try to deliver the others, which are then useless to the client and waste radio resources [TS.23.501].

A similar impact applies to audio samples, haptics applications and remote-control operations. Taking into account the inter-dependency among the packets of a PDU Set (e.g., a frame/video slice) can help improve efficiency and user experience.
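
The waste caused by PDU-Set-unaware dropping can be illustrated with a minimal Python sketch. It assumes the strict case in which a PDU Set is decodable only if every one of its packets arrives. The Packet type, its fields and both helper functions are hypothetical names introduced here for illustration; they are not defined by 3GPP or by this draft.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    pdu_set_id: int  # hypothetical identifier of the PDU Set (e.g. one video frame)
    seq: int         # position of the packet inside its PDU Set


def useless_deliveries(packets, dropped):
    """Count delivered packets whose PDU Set can no longer be decoded.

    Assumes a PDU Set is decodable only if all of its packets arrive.
    """
    broken_sets = {p.pdu_set_id for p in dropped}
    return sum(1 for p in packets
               if p not in dropped and p.pdu_set_id in broken_sets)


def pdu_set_aware_drop(packets, n_to_drop):
    """Drop whole PDU Sets (highest id first) until at least n_to_drop packets are freed."""
    by_set = defaultdict(list)
    for p in packets:
        by_set[p.pdu_set_id].append(p)
    dropped = set()
    for set_id in sorted(by_set, reverse=True):
        if len(dropped) >= n_to_drop:
            break
        dropped.update(by_set[set_id])
    return dropped


# Three PDU Sets of four packets each.
packets = [Packet(s, i) for s in range(3) for i in range(4)]

# Unaware dropping: one packet lost from every PDU Set.
unaware = {Packet(0, 0), Packet(1, 0), Packet(2, 0)}

# Aware dropping: relieve the same congestion by discarding one whole PDU Set.
aware = pdu_set_aware_drop(packets, 3)
```

In this toy case the unaware policy still transmits nine packets that the client cannot use, whereas the aware policy discards one complete PDU Set and every delivered packet remains decodable.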

2. 3GPP Slices Mapping for Media Services

2.1. 3GPP Network Slicing

5G network slicing is a network architecture that enables the multiplexing of virtualized and independent logical networks on the same physical network infrastructure. Each network slice is an isolated end-to-end network tailored to fulfil diverse requirements requested by a particular application.

Network slices may differ in their supported features and network function optimisations. This technology plays a central role in supporting 5G mobile networks, which are designed to efficiently embrace a plethora of services with very different service level requirements (SLR). The realization of this service-oriented view of the network leverages the concepts of software-defined networking (SDN) and network function virtualization (NFV), which allow the implementation of flexible and scalable network slices on top of a common network infrastructure.

Under the business model commonly adopted in the telco domain, a network slice is administered by a mobile virtual network operator (MVNO). The infrastructure provider (the owner of the telco infrastructure) leases its physical resources to the MVNOs that share the underlying physical network. Depending on the availability of its assigned resources, an MVNO can autonomously deploy multiple network slices that are customized to the various applications provided to its own users.

2.2. 3GPP Standardized SSTs

As shown in Figure 1, the 3GPP 5GS has defined 6 standard SSTs, covering eMBB, URLLC, MIoT and more. Standardized SST values provide a way of establishing global interoperability for slicing, so that Public Land Mobile Networks (PLMNs) can support the roaming use case more efficiently for the most commonly used Slice/Service Types (SSTs).

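  +-------+-------+------------------------------------------------+
  | Type  | Value | Characteristics                                |
  +-------+-------+------------------------------------------------+
  | eMBB  |   1   | Slice for 5G enhanced mobile broadband         |
  | URLLC |   2   | Slice for ultra-reliable low-latency comm.     |
  | MIoT  |   3   | Slice for Massive IoT                          |
  | V2X   |   4   | Slice for V2X Services                         |
  | HMTC  |   5   | Slice for High-performance Machine-type comm.  |
  | HDLLC |   6   | Slice for High data-rate & low-latency comm.   |
  +-------+-------+------------------------------------------------+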

Figure 1: 3GPP Standardized SSTs

Specifically, the sixth SST in the table, HDLLC (the slice for High Data-rate & Low-Latency Communication), was introduced recently as a new SST to handle the XR media service, or XRM, which is characterized by high data rate and low latency. As per [TS.23.501], the 5GS QoS framework has been enhanced to support differentiated QoS handling per PDU Set. It supports differentiated QoS provisioning that considers the relative importance of PDU Sets, e.g., by treating packets that belong to less important PDU Sets differently in order to reduce resource consumption. An added benefit is reduced complexity of roaming configuration between networks.
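
For readers who want to handle these values in software, the table in Figure 1 can be captured as a small lookup. The following Python sketch only restates the values listed above; the class name SST and the helper is_standardized are hypothetical names chosen for illustration.

```python
from enum import IntEnum


class SST(IntEnum):
    """Standardized Slice/Service Type values as listed in Figure 1."""
    EMBB = 1    # enhanced mobile broadband
    URLLC = 2   # ultra-reliable low-latency communication
    MIOT = 3    # massive IoT
    V2X = 4     # V2X services
    HMTC = 5    # high-performance machine-type communication
    HDLLC = 6   # high data-rate and low-latency communication


def is_standardized(sst_value: int) -> bool:
    """Return True if the value is one of the SSTs listed in Figure 1."""
    return sst_value in {member.value for member in SST}
```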

Similarly, another SDO, GSMA ENSWI, also regards Extended Reality and Media Services as promising, and considers that a new standardized SST can bring a consistent user experience for XRM services, especially in the roaming case. GSMA ENSWI is currently working on defining a new NEST (NEtwork Slice Type) for Extended Reality and Media Services [GSMAnewSSTforXRM].

2.3. Mapping Standardized SST for Encrypted XRM Service

Different SSTs map to different QoS requirements, e.g., guaranteed bandwidth, maximum allowed bandwidth, maximum data burst, minimum latency, maximum latency, jitter, etc. For the 5G XRM service in particular, the diverse settings for framing, slicing and encoding of video images introduce additional parameters, such as the relative importance of data packets (or PDUs, in 3GPP SA2 terms) generated from different types of frames, e.g., I, P and B frames, which carry tiered priorities.

Another critical factor affecting the transport of encoded video packets is the pervasive encryption of the packet payload (and header). For example, suppose RTP is used to transport video data. If the contents of a packet are encrypted at the video source (i.e., the UDP source), then the UDP header that is added afterwards cannot expose the relevant QoS parameters to the routing entities in the underlay transport network; those parameters remain hidden until the packet reaches the UDP destination. This poses a significant challenge to the 5GS-based XRM service.

     :               /-----------\             :
      :             |   5GS(SBA) |             :
     :              /\-----------/\            :
     :             /       |       \           :
     :      +-----+     +------+    +-----+    :
     :      | AMF |-----|  SMF |----| PCF |    :
     :      +-----+     +------+    +-----+    :
     :      /    |           |                 :
     :     N1   N2           N4                :
     :    /      |           |         <= Downlink direction
     :   /       |        +-------+            :
     +----+  +-------+ N3 |       | N6         :   +-------------+
     | UE |--|  gNB  |----|  UPF  |------------:---|  IP-domain  |
     +----+  +-------+    |       |            :   | Network(DNN)|
     :                    +-------+            :   +-------------+
     :                     |    |        Uplink direction =>
     :                     +-N9-+              :

Figure 2: A 5GS 'composite' Node

For example, as shown in Figure 2, the UPF network function (NF) is similar to an IP-domain router: it sends and receives IP packets over the N6 interface to and from the IP domain network (DNN). In the XRM scenario, when a downlink packet (from the DNN) with encrypted video contents arrives at the UPF on the N6 interface, the UPF would generally use only the IP 5-tuple for packet classification and prioritization (i.e., using PDRs, in 3GPP 5G terms [TS.23.501]). Unfortunately, the 5-tuple cannot expose all of the QoS-related information, termed 'metadata' for XRM in this draft, that is required for effective processing of an XRM stream. Encryption prevents a UPF from inspecting the (video) data contents any further.
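
The limitation can be made concrete with a small Python sketch of 5-tuple-only classification. The rule table, the flow labels and the addresses (taken from documentation ranges) are hypothetical; a real UPF applies PDRs installed by the SMF, which this dictionary merely stands in for.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # 17 = UDP


# Hypothetical rule table: 5-tuple -> QoS flow label.
RULES = {
    FiveTuple("198.51.100.10", "203.0.113.7", 5004, 6000, 17): "xrm-video-flow",
}


def classify(five_tuple: FiveTuple) -> str:
    """Return the QoS flow for a packet, using nothing but its 5-tuple."""
    return RULES.get(five_tuple, "default-flow")


# Two packets of the same encrypted stream: one carries part of an I-frame,
# the other part of a B-frame. The frame type sits inside the encrypted
# payload, so it is not available as an input to classify().
i_frame_pkt = FiveTuple("198.51.100.10", "203.0.113.7", 5004, 6000, 17)
b_frame_pkt = FiveTuple("198.51.100.10", "203.0.113.7", 5004, 6000, 17)
```

Both packets fall into the same flow and receive the same treatment, even though losing the I-frame packet is far more harmful to the client than losing the B-frame packet.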

3. Possible IETF Schemes for Encrypted XRM Streams

The challenges around the encrypted XRM service described in Section 2.3 motivate the exploration of novel IETF mechanisms to convey this critical 'metadata' to a UPF (i.e., to the 5GS for advanced QoS handling).

3.1. Using IP-DSCP to Map Encrypted XRM Streams

One scheme is to use the 6-bit DS field in the IP header [RFC2474]. Its fundamental advantage is that the IP-DSCP bits are not normally subject to the encryption hurdle. However, the 6-bit DS field offers only 64 DSCP values, which cannot provide the finer granularity that is equally important for the further evolution of 5G advanced services. Another disadvantage is that the 6 DSCP bits have no good hierarchical structure. For example, while HDLLC (SST=6) was initially promoted with the XRM service as its target, it could later be customized and applied to the Metaverse service, another type of HDLLC service that is now being discussed in the 3GPP SA2 WG. Therefore, in the HDLLC scenario, any candidate scheme should be hierarchical, so that it can be extended to accommodate mappings for both the top-level SST (e.g., HDLLC itself) and the more granular sub-types (e.g., XRM, Metaverse).
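
The space constraint can be seen in a short Python sketch. The first two helpers read and write the DSCP as the upper six bits of the IPv4 TOS / IPv6 Traffic Class octet. The third, pack_slice_dscp, is a purely hypothetical 3-bit SST plus 3-bit sub-type split that this draft does not propose; it is shown only to illustrate how little room six bits leave for a hierarchy.

```python
DSCP_BITS = 6
DSCP_CODEPOINTS = 1 << DSCP_BITS  # 64 possible values in total


def get_dscp(tos_octet: int) -> int:
    """Extract the 6-bit DSCP from the TOS / Traffic Class octet."""
    return (tos_octet >> 2) & 0x3F


def set_dscp(tos_octet: int, dscp: int) -> int:
    """Return the octet with its DSCP replaced, keeping the low two bits."""
    if not 0 <= dscp < DSCP_CODEPOINTS:
        raise ValueError("DSCP must fit in 6 bits")
    return (dscp << 2) | (tos_octet & 0x03)


def pack_slice_dscp(sst: int, subtype: int) -> int:
    """Hypothetical split: upper 3 bits carry the SST, lower 3 bits a sub-type."""
    if not (0 <= sst < 8 and 0 <= subtype < 8):
        raise ValueError("SST and sub-type must each fit in 3 bits")
    return (sst << 3) | subtype
```

Under this hypothetical split, at most eight SSTs with eight sub-types each could ever be expressed, and HDLLC with sub-type 0 would map to DSCP 48, a value already in use as the CS6 class selector code point. Any such mapping would therefore also have to coexist with established DSCP usage.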

3.2. Using UDP-option to Map Encrypted XRM Streams

Another, more promising, scheme is to use the UDP-option that is currently being standardized [transportUDPoption]. Considering the requirements of payload and/or header encryption handling, the finer granularity and the extensibility that the new UDP-option structure offers lead us to believe that adopting the UDP-option is a better alternative than using the IP-DSCP. For example, according to [transportUDPoption], a code from the 'Kind' range [10...126] could be obtained to identify the top-level type '3GPP network slice'. A sub-structure could then be defined for the concrete SSTs listed in Figure 1.

Another advantage is that UDP is a layer-4 protocol, so its header is not normally processed by IP routers. This not only relieves IP transport devices of a processing burden, but also keeps a clear demarcation in the TCP/IP layer structure.
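
The hierarchical structure outlined in this section could, for instance, be encoded as a four-octet option. The Python sketch below is only an illustration under stated assumptions: it assumes a Kind/Length/value option layout in which Length counts the whole option, and the Kind value 100 and the sub-type codes are hypothetical placeholders that have not been assigned by IANA or defined in any draft.

```python
import struct

# Hypothetical values, for illustration only.
KIND_3GPP_SLICE = 100   # placeholder picked from the [10...126] range
SUBTYPE_XRM = 1
SUBTYPE_METAVERSE = 2
OPTION_LEN = 4          # Kind + Length + SST + sub-type, one octet each


def encode_slice_option(sst: int, subtype: int) -> bytes:
    """Encode Kind | Length | SST | sub-type as four octets."""
    return struct.pack("!BBBB", KIND_3GPP_SLICE, OPTION_LEN, sst, subtype)


def decode_slice_option(option: bytes):
    """Return (sst, subtype), or None if this is not the slice option."""
    if len(option) < OPTION_LEN:
        return None
    kind, length, sst, subtype = struct.unpack("!BBBB", option[:OPTION_LEN])
    if kind != KIND_3GPP_SLICE or length != OPTION_LEN:
        return None
    return sst, subtype
```

With this layout, encode_slice_option(6, SUBTYPE_XRM) would mark a datagram as HDLLC/XRM, and a new sub-type such as Metaverse could be added later without consuming another Kind value.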

3.2.1. UDP-option for 5GS Does NOT Violate the UDP End-to-End Rule

Of course, some concerns currently surround the extension of the UDP-options: the argument is that UDP is a layer-4 transport protocol and its datagrams should be processed end to end, i.e., encapsulated at UDP sources and decapsulated at UDP destinations. In Figure 2, downlink IP packets enter the 5GS from the IP domain (DNN, on the right side) via the UPF N6 interface. The UPF switches IP packets toward the UE (on the left side). The UE is clearly the genuine end receiver of a UDP datagram, while the UPF is only an intermediary node taking on IP functions, no different from a regular IP-domain node. Therefore, applying the UDP extension option and having an intermediary (IP) node, e.g., a 5GS UPF, process UDP datagrams does raise a concern about violating the end-to-end layer structure.

Fortunately, there are good arguments for the 5GS-based XRM service to adopt the UDP extension option. A 5GS is unique in that it is a composite system, as shown in Figure 2. It can be considered holistically as a 'blackbox' attached to the external IP domain. The IP DNN neither knows nor cares that a UE and its anchor UPF are two separate entities. It only forwards IP packets downstream to the 5GS (in practice, toward a UPF via the N6 interface). How the 5GS (i.e., the UPF) processes the packets is out of scope for the IP domain. Because of the 5GS's 'composite & transparent' characteristics, we argue that a 5GS (UPF) can be granted the capability to 'intelligently' break the IP-UDP demarcation rule by peeking at the (encrypted) XRM 'metadata' in the UDP extension option field. From the viewpoint of the external IP domain, the 'end-to-end' rule is still observed.

In fact, there is already an Internet-Draft that discusses how end points can explicitly distribute encrypted metadata to an intermediary network node [EncryptedMetaDataToNetworkNode]. In Figure 2, the UPF would be the node that uses the metadata to assist in decrypting the media contents (and/or headers). Once the UPF has all of the detailed information, it can provision and enforce the QoS settings for the XRM streams [TS.23.501].

Further, the draft [transportUDPoption] also makes clear that the UDP-options are just a framework. Options may be defined even when their details are not yet complete, and the use of such options can be described in separate documents. This bodes well for the 5GS XRM service, because this draft conforms to that tenet of the UDP-option framework.

4. Security Considerations

Generally, this function will not incur additional security issues.

5. IANA Considerations

A new authentication option or other signaling message option may be used based on the specific implementation.

6. References

6.1. Normative References

Reddy, T., "An Approach for Encrypted Transport Protocol Path Explicit Signals", draft-reddy-tsvwg-explcit-signal-01.
Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, DOI 10.17487/RFC2474.
Touch, J., "Transport Options for UDP", draft-ietf-tsvwg-udp-options-22.
"3GPP TS 23.501 (V17.0.0): System Architecture for 5G System; Stage 2", 3GPP TS 23.501.
"3GPP TS 24.501: Non-Access-Stratum (NAS) protocol for 5G System (5GS)", 3GPP TS 24.501.
Jiang, T., Shi, X., Gao, J., and P. Liu, "On the 5G Edge Network Challenges of Providing Tactile and Multi-modality Communication Services", International Conference on Edge Computing - EDGE'2021.

6.2. Informative References

GSMA, "LS reply on a new SST value for Extended Reality and Media Services".
3GPP, "Introduction of a new standard SST for Extended Reality and Media Services".

Authors' Addresses

Tianji Jiang
China Mobile
San Jose, CA
Dan Wang
China Mobile