Low Overhead Media Container
draft-mzanaty-moq-loc-04

Document Type Active Internet-Draft (individual)
Authors Mo Zanaty, Suhas Nandakumar, Peter Thatcher
Last updated 2024-11-03
Network Working Group                                          M. Zanaty
Internet-Draft                                             S. Nandakumar
Intended status: Informational                                     Cisco
Expires: 8 May 2025                                          P. Thatcher
                                                               Microsoft
                                                         4 November 2024

                      Low Overhead Media Container
                        draft-mzanaty-moq-loc-04

Abstract

   This specification describes a media container format for encoded and
   encrypted audio and video media data to be used primarily for
   interactive Media over QUIC Transport (MOQT) [MoQTransport], with the
   goal of it being a low-overhead format.  It further defines the LOC
   Streaming Format for the MOQ Common Catalog format [MoQCatalog] for
   publishers to announce and describe their LOC tracks and for
   subscribers to consume them.  The specification also provides
   examples to aid application developers who are building media
   applications over MOQT and intend to use LOC as the streaming
   format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 May 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Zanaty, et al.             Expires 8 May 2025                   [Page 1]
Internet-Draft               media container               November 2024

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Notation and Conventions
     1.2.  Terminology
   2.  Payload Format
     2.1.  MOQ Object Mapping
     2.2.  LOC Header Extensions
       2.2.1.  Common Header Data
       2.2.2.  Video Header Data
       2.2.3.  Audio Header Data
   3.  Catalog
     3.1.  Catalog Fields
       3.1.1.  Optional Extensions for Video
       3.1.2.  Selection Parameters for Video
       3.1.3.  Optional Extensions for Audio
       3.1.4.  Selection Parameters for Audio
     3.2.  Catalog Examples
   4.  Payload Encryption
   5.  Examples
     5.1.  Application with one audio track
     5.2.  Application with one single quality video track
     5.3.  Application with single video track with temporal layers
     5.4.  Application with multiple dependent video tracks
     5.5.  Application with multiple dependent video tracks with dyadic
           framerate levels
     5.6.  Application with multiple simulcast qualities video tracks
   6.  Security and Privacy Considerations
   7.  IANA Considerations
   8.  Normative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   This specification describes a low-overhead media container format
   for encoded and encrypted audio and video media data, as well as a
   MOQ Common Catalog streaming format called LOC to describe such
   tracks.

   "Low-overhead" refers to minimal extra encapsulation as well as
   minimal application overhead when interfacing with WebCodecs
   [WebCodecs].

   The container format description is specified for all audio and video
   codecs defined in the WebCodecs Codec Registry
   [WEBCODECS-CODEC-REGISTRY].  The audio and video payload bitstream is
   identical to the "internal data" inside an EncodedAudioChunk and
   EncodedVideoChunk, respectively, specified in the registry.

   (Note: Do we need to support timed text tracks such as Web Video Text
   Tracks (WebVTT)?)

   In addition to the media payloads, critical metadata is also
   specified for audio and video payloads.  (Note: Align with MOQT
   terminology of either "metadata" or "header".)

   A primary motivation is to align with media formats used in WebCodecs
   to minimize extra encapsulation and application overhead when
   interfacing with WebCodecs.  Other container formats like CMAF or RTP
   would require more extensive application overhead in format
   conversions, as well as larger encapsulation overhead, which may
   burden some use cases like low bitrate audio scenarios.

   This specification can also be used by applications outside the
   context of WebCodecs or a web browser.  While the media payloads are
   defined by referring to the "internal data" of an EncodedAudioChunk
   or EncodedVideoChunk in the WebCodecs Codec Registry, this "internal
   data" is the elementary bitstream format of codecs without any
   encapsulation.  Referring to the WebCodecs Codec Registry avoids
   duplicating it in an identical IANA registry.

   *  Section 2 defines the core media payload formats.

   *  Section 2.2 defines the metadata associated with audio and video
      payloads.

   *  Section 3 describes the LOC Streaming Format bindings to the MoQ
      Common Catalog format including examples.

1.1.  Requirements Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

1.2.  Terminology

   Track, Group, Subgroup, Object, and their corresponding identifiers
   (ID or alias) are defined in [MoQTransport] and used here to refer to
   those aspects of the MOQT Object Model.

2.  Payload Format

   The WebCodecs Codec Registry defines the contents of an
   EncodedAudioChunk and EncodedVideoChunk for the audio and video codec
   formats in the registry.  The "internal data" in these chunks is used
   directly in this specification as the "LOC Payload" bitstream.  This
   "internal data" is the elementary bitstream format of each codec
   without any encapsulation.

   For video formats with multiple bitstream formats in the WebCodecs
   Registry, such as H.264/AVC or H.265/HEVC, the LOC Payload uses the
   "canonical" format ("avc" or "hevc", not "annexB") with the following
   additions:

   *  Parameter sets can be sent in the bitstream before key frames,
      similar to "annexB" formats.  (Note that newer "canonical" formats
      such as "avc3" and "hev1" codec strings support parameter sets in
      the bitstream or outside it.)

   *  Parameter sets can be provided by means outside this
      specification, such as "extradata" in "canonical" ("avc" or
      "hevc") formats.

   *  4 byte length codes can be sent before each NAL Unit, similar to
      "canonical" ("avc" or "hevc") formats.  (Note that a length of 1
      should be interpreted as a start code rather than a length.)

   *  4 byte or longer start codes can be sent before each NAL Unit,
      similar to "annexB" formats.

   *  Note that if length codes or start codes are less than 4 bytes,
      which is uncommon, it may not be possible to disambiguate them
      without information outside this specification.
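
   The framing rules above can be illustrated with a short,
   non-normative Python sketch.  The function name split_nal_units is
   hypothetical.  The sketch assumes 4-byte framing throughout a
   payload, and it additionally assumes that a first 4-byte word of 0
   marks the leading zeros of a start code longer than 4 bytes, which
   this specification does not state.

```python
import re

# Annex B style start code: two or more zero bytes followed by 0x01.
_START_CODE = re.compile(rb"\x00{2,}\x01")


def split_nal_units(payload: bytes) -> list:
    """Split a LOC video payload into NAL units (illustrative only)."""
    if len(payload) < 4:
        raise ValueError("payload too short for 4-byte framing")
    first_word = int.from_bytes(payload[:4], "big")
    if first_word <= 1:
        # A "length" of 1 is read as a start code.  Treating 0 as the
        # leading zeros of a longer start code is an assumption.
        # Zero bytes directly before a start code are absorbed into it.
        return [nal for nal in _START_CODE.split(payload) if nal]
    nal_units = []
    pos = 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated length code")
        length = int.from_bytes(payload[pos:pos + 4], "big")
        pos += 4
        if pos + length > len(payload):
            raise ValueError("truncated NAL unit")
        nal_units.append(payload[pos:pos + length])
        pos += length
    return nal_units
```

   A receiver that knows the framing in advance (for example, from the
   codec string) would not need the detection step and could call only
   the relevant branch.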

2.1.  MOQ Object Mapping

   An application object when transported as a [MoQTransport] object is
   composed of a MOQ Object Header, with optional Extensions, and a
   Payload.  Media objects encoded using the container format defined in
   this specification populate the MOQ Object Payload with the LOC
   Payload, and the MOQ Object Header Extensions with the LOC Header
   Extensions, as shown below.

   The LOC Payload is the "internal data" of an EncodedAudioChunk or
   EncodedVideoChunk.

   The LOC Header Extensions carry optional metadata related to the
   Payload.

   <-----------  MOQ Object  ------------>
   +----------+--------------+-----------+
   |   MOQ    |  MOQ Header  |    MOQ    |
   |  Header  |  Extensions  |  Payload  |
   +----------+--------------+-----------+
                     |             |
                     |             |
              +--------------+-----------+
              |  LOC Header  |    LOC    |
              |  Extensions  |  Payload  |
              +--------------+-----------+

   LOC Header Extensions = some MOQ Object Header Extensions
   LOC Payload = all MOQ Object Payload
   LOC Payload = "internal data" of EncodedAudio/VideoChunk

2.2.  LOC Header Extensions

   The LOC Header Extensions carry optional metadata for the
   corresponding LOC Payload.  The LOC Header Extensions are contained
   within the MOQ Object Header Extensions.  This metadata provides
   necessary information for end subscribers, relays and other
   intermediaries to perform their operations without accessing the
   media payload.  For example, media switches can use this metadata to
   perform their media switching decisions without accessing the payload
   which may be encrypted end-to-end (from original publisher to end
   subscribers).

   The following sections define specific metadata as LOC Header
   Extensions and register them in the IANA registry for MOQ Object
   Header Extensions.

   Other specifications can define other metadata as LOC Header
   Extensions and register them in the same registry.  Each extension
   must specify the following information in the IANA registry.

   *  Name: Short name for the metadata.  (Not sent on the wire.)

   *  Description: Detailed description for the metadata.  (Not sent on
      the wire.)

   *  ID: Identifier assigned by the registry. (varint)

   *  Length: Length of metadata Value in bytes. (varint)

   *  Value: Value of metadata.  (Length bytes)
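
   As a non-normative illustration, the ID, Length, and Value fields
   could be serialized as sketched below.  The sketch assumes that
   "varint" means the QUIC variable-length integer encoding used by
   MOQT; the function names and the extension ID 0x3F are hypothetical
   placeholders, since the real IDs are still to be assigned.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x80000000).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | 0xC000000000000000).to_bytes(8, "big")
    raise ValueError("varint out of range")


def encode_extension(ext_id: int, value: bytes) -> bytes:
    """Serialize one extension as ID, Length, Value."""
    return encode_varint(ext_id) + encode_varint(len(value)) + value


# Hypothetical ID; the registry has not assigned real values yet.
HYPOTHETICAL_EXTENSION_ID = 0x3F
wire = encode_extension(HYPOTHETICAL_EXTENSION_ID, b"\xaa\xbb")
```

   Name and Description are registry-only fields, so they do not
   appear in the serialized bytes.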

2.2.1.  Common Header Data

2.2.1.1.  Capture Timestamp

   *  Name: Capture Timestamp

   *  Description: Wall-clock time in microseconds since the Unix epoch
      when the encoded media frame was captured, encoded as a 64 bit
      unsigned integer in network byte order (big endian).

   *  ID: TBA (IANA, please assign from the MOQ Header Extensions
      Registry)

   *  Length: 8 (bytes)

   *  Value: Varies
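
   A minimal, non-normative sketch of producing and reading this
   8-byte Value follows.  The function names are hypothetical, and
   using the local system clock as the capture clock is an assumption.

```python
import struct
import time


def now_micros() -> int:
    """Current wall-clock time in microseconds since the Unix epoch."""
    return time.time_ns() // 1_000


def encode_capture_timestamp(micros_since_epoch: int) -> bytes:
    """Pack the timestamp as a 64-bit unsigned big-endian integer."""
    return struct.pack(">Q", micros_since_epoch)


def decode_capture_timestamp(value: bytes) -> int:
    """Unpack an 8-byte Capture Timestamp Value."""
    if len(value) != 8:
        raise ValueError("Capture Timestamp must be exactly 8 bytes")
    return struct.unpack(">Q", value)[0]
```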

2.2.2.  Video Header Data

2.2.2.1.  Video Frame Marking

   *  Name: Video Frame Marking

   *  Description: Flags for video frames which are independent,
      discardable, or base layer sync points, as well as temporal and
      spatial layer identification, as defined in [Framemarking].

   *  ID: TBA (IANA, please assign from the MOQ Header Extensions
      Registry)

   *  Length: Varies (1-3 bytes)

   *  Value: Varies
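
   The sketch below shows one way a relay might read this Value.  It
   is non-normative and assumes that the Value carries the same bit
   layout as the RTP header extension data in [Framemarking]: one byte
   of S, E, I, D, B flags plus a 3-bit temporal ID, optionally followed
   by a layer ID byte and a TL0PICIDX byte.  The class and function
   names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameMarking:
    start_of_frame: bool
    end_of_frame: bool
    independent: bool
    discardable: bool
    base_layer_sync: bool
    temporal_id: int
    layer_id: Optional[int] = None      # present in 2- and 3-byte forms
    tl0_pic_idx: Optional[int] = None   # present in the 3-byte form


def parse_frame_marking(value: bytes) -> FrameMarking:
    """Parse a 1 to 3 byte Video Frame Marking Value (assumed layout)."""
    if not 1 <= len(value) <= 3:
        raise ValueError("Video Frame Marking must be 1 to 3 bytes")
    flags = value[0]
    return FrameMarking(
        start_of_frame=bool(flags & 0x80),
        end_of_frame=bool(flags & 0x40),
        independent=bool(flags & 0x20),
        discardable=bool(flags & 0x10),
        base_layer_sync=bool(flags & 0x08),
        temporal_id=flags & 0x07,
        layer_id=value[1] if len(value) >= 2 else None,
        tl0_pic_idx=value[2] if len(value) == 3 else None,
    )
```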

2.2.3.  Audio Header Data

2.2.3.1.  Audio Level

   *  Name: Audio Level

   *  Description: The magnitude of the audio level of the corresponding
      audio frame encoded in 7 bits as defined in section 3 of
      [RFC6464].

   *  ID: TBA (IANA, please assign from the MOQ Header Extensions
      Registry)

   *  Length: 1 (byte)

   *  Value: Varies
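
   As a non-normative illustration, a publisher could derive the 7-bit
   level from PCM samples roughly as follows.  [RFC6464] expresses the
   level in -dBov, from 0 (loudest) to 127 (quietest).  The sketch
   assumes 16-bit PCM with an overload point of 32768 and leaves the
   most significant bit of the byte at zero, since this specification
   does not say how that bit is used.  The function names are
   hypothetical.

```python
import math


def audio_level(samples, full_scale: float = 32768.0) -> int:
    """Return the audio level in -dBov, clamped to 0..127."""
    if not samples:
        return 127
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return 127  # digital silence
    dbov = 20 * math.log10(rms / full_scale)
    return min(127, max(0, round(-dbov)))


def encode_audio_level(level: int) -> bytes:
    """Pack the level into the low 7 bits of one byte (top bit zero)."""
    if not 0 <= level <= 127:
        raise ValueError("audio level must fit in 7 bits")
    return bytes([level])
```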

3.  Catalog

   A catalog track provides information about tracks from a given
   publisher.  A catalog is used by subscribers for consuming tracks and
   by publishers to advertise and describe the tracks.  The content of a
   catalog is opaque to the relays and may be end to end encrypted.  A
   catalog describes the details of tracks such as Track IDs and
   corresponding media configuration details, for example, audio/video
   codec details.

   The LOC Streaming Format uses the MoQ Common Catalog Format
   [MoQCatalog] to describe the content being produced by a publisher.

   Per Sect 5.1 of [MoQCatalog], this document registers an entry in the
   "MoQ Streaming Format Type" table, with the type value 2, the name
   "LOC Streaming Format", and the RFC XXX.

   Every LOC catalog track MUST declare a streaming format type (See
   Sect 3.2.1 of [MoQCatalog]) value of 2.

   Every LOC catalog track MUST declare a streaming format version (See
   Sect 3.2.1 of [MoQCatalog]) value of 1, which is the version
   described in this document.

   Every LOC catalog track MUST declare a packaging type (See Sect 3.2.9
   of [MoQCatalog]) of "loc".
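
   The three requirements above could be checked by a subscriber as
   sketched below.  This is non-normative: the JSON field names
   "streamingFormat", "streamingFormatVersion", "commonTrackFields",
   "tracks", "name", and "packaging" are assumed from [MoQCatalog], and
   the function name is hypothetical.  Values are compared as strings
   because the sketch does not assume whether they are carried as
   numbers or strings.

```python
def check_loc_catalog(catalog: dict) -> list:
    """Return a list of problems found; an empty list means none."""
    problems = []
    if str(catalog.get("streamingFormat")) != "2":
        problems.append("streaming format type must be 2")
    if str(catalog.get("streamingFormatVersion")) != "1":
        problems.append("streaming format version must be 1")
    # Packaging may be declared once for all tracks or per track.
    common = catalog.get("commonTrackFields", {})
    for track in catalog.get("tracks", []):
        name = track.get("name")
        packaging = track.get("packaging", common.get("packaging"))
        if packaging != "loc":
            problems.append(f"track {name!r}: packaging must be 'loc'")
    return problems
```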

   The catalog track MUST have a track name of "catalog".  A catalog
   object MAY be independent of other catalog objects or it MAY
   represent a delta update of a prior catalog object.  The first
   catalog object published within a new group MUST be independent.  A
   catalog object SHOULD be published only when the availability of
   tracks changes.

   Each catalog update MUST be mapped to a discrete moq-transport
   object.

3.1.  Catalog Fields

   The MOQ Common Catalog defines the required base fields and optional
   extensions.

3.1.1.  Optional Extensions for Video

   The LOC Streaming Format allows the following optional extensions for
   video media.

   *  temporalId: Identifies the temporal layer/sub-layer encoded,
      starting with 0 for the base layer, and increasing with higher
      temporal fidelity.

   *  spatialId: Identifies the spatial and quality layer encoded,
      starting with 0 for the base layer, and increasing with higher
      fidelity.

   *  depends: Identifies track dependencies for a given track, usually
      for video media with scalable layers in separate tracks.

   *  renderGroup: Identifies a group of time-aligned tracks which
      should be rendered simultaneously.

   *  selectionParams: Selection parameters for media quality, fidelity,
      etc.; see next section.

3.1.2.  Selection Parameters for Video

   Each video track can have the following associated Selection
   Parameters.

   *  codec: Codec information (including profile, level, tier, etc.),
      as defined by the codec registrations listed in
      [WEBCODECS-CODEC-REGISTRY].

   *  framerate: As defined in section 7.8 of
      [WEBCODECS-CODEC-REGISTRY].

   *  bitrate: As defined in sections 7.7 and 7.8 of
      [WEBCODECS-CODEC-REGISTRY].

   *  width, height: As defined in section 7.8 of
      [WEBCODECS-CODEC-REGISTRY].

   *  displayWidth, displayHeight: As defined in section 7.7 of
      [WEBCODECS-CODEC-REGISTRY].

3.1.3.  Optional Extensions for Audio

   The LOC Streaming Format allows the following optional extensions for
   audio media.

   *  renderGroup: Identifies a group of time-aligned tracks which
      should be rendered simultaneously.

   *  selectionParams: Selection parameters for media quality, fidelity,
      etc.; see next section.

3.1.4.  Selection Parameters for Audio

   Each audio track can have the following associated Selection
   Parameters.

   *  codec: Codec information as defined by the codec registrations
      listed in [WEBCODECS-CODEC-REGISTRY].

   *  bitrate: As defined in sections 7.7 and 7.8 of
      [WEBCODECS-CODEC-REGISTRY].

   *  samplerate: As defined in section 7.7 of
      [WEBCODECS-CODEC-REGISTRY].

   *  channelConfig: As defined in section 7.7 of
      [WEBCODECS-CODEC-REGISTRY].

   *  lang: The primary language of the track, using standard tags from
      [RFC5646].

3.2.  Catalog Examples

   See section 3.4 of the MOQ Common Catalog [MoQCatalog].

4.  Payload Encryption

   When end to end encryption is supported, the encoded payload is
   encrypted with symmetric keys derived from key establishment
   mechanisms, such as [MOQ-MLS], and the payload itself is protected
   using mechanisms defined in [SecureObjects].

5.  Examples

   This section provides examples with details for building audio and
   video applications using MOQ and LOC; more specifically, it provides
   information on:

   *  Using a catalog to describe track information,

   *  Packaging media into LOC streaming format, and

   *  Mapping application media objects to the MOQT object model and
      transport.

   The figure below shows the conceptual model for mapping media
   application data to the MOQT object model and underlying QUIC
   transport.

   +------------------------------+
   |     Media Application        |
   |    Audio, Video Frames       |
   +---------------+--------------+
                   |
                   |
   +---------------v--------------------+
   |        MOQT Object Model           |
   | Tracks, Groups, Subgroups, Objects |
   +---------------+--------------------+
                   |
                   |
   +---------------v--------------+
   |             QUIC             |
   |        Streams, Datagrams    |
   +------------------------------+

5.1.  Application with one audio track

   An example is shown below for an Opus mono channel audio track at
   48 kHz.

   codec: "opus"
   bitrate: 24000
   samplerate: 48000
   channelConfig: "mono"
   lang: "en"

   When ready for publishing, each encoded audio chunk (for example,
   10 ms of audio) is carried as one MOQT Object.  In this setup, there
   is one MOQT Object per MOQT Group: the GroupID in the object header
   is incremented by one for each encoded audio chunk, and the ObjectID
   is always 0.

   These objects can be sent as QUIC streams or datagrams.  When mapped
   to QUIC datagrams, each object must fit entirely within a QUIC
   datagram.  When mapped to QUIC streams, each such single-object
   group is sent over its own unidirectional QUIC stream, since there
   is just one Subgroup per MOQT Group.
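
   The audio mapping described in this section can be sketched as
   follows.  The sketch is non-normative; the MoqtObject class and the
   function name are hypothetical, and the Subgroup ID of 0 is an
   assumption that follows from having a single Subgroup per Group.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class MoqtObject:
    group_id: int
    subgroup_id: int
    object_id: int
    payload: bytes


def audio_chunks_to_objects(
    chunks: Iterable[bytes], first_group_id: int = 0
) -> Iterator[MoqtObject]:
    """Map each encoded audio chunk to its own single-object group."""
    for offset, chunk in enumerate(chunks):
        # One object per group: the GroupID advances and the ObjectID stays 0.
        yield MoqtObject(first_group_id + offset, 0, 0, chunk)
```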

5.2.  Application with one single quality video track

   An example is shown below for an H.264 video track with 1280x720p
   resolution and 30 fps frame rate at 1 Mbps bitrate.

   codec: "avc3.42E01E"
   bitrate: 1000000
   framerate: 30
   width: 1280
   height: 720

   When ready for publishing, each encoded video chunk is used as the
   input to the MOQT Object payload.  If encrypted, the output of
   encryption serves as the object's payload.  The GroupID is
   incremented by 1 at IDR frame boundaries.  The ObjectID is
   incremented by 1 for each encoded video frame, starting at 0 and
   resetting to 0 at the start of a new group.  The first encoded video
   frame, the MOQT Object with ObjectID 0, shall be the independent
   (IDR) frame, and the rest of the encoded video frames correspond to
   dependent (delta) frames, organized in decode order.

   When mapping to QUIC for sending, one unidirectional QUIC stream is
   set up to deliver all the encoded video chunks within a MOQT group.

   When decoding at the 'End Consumer', the objects from each of the
   QUIC streams are fed in the GroupID then ObjectID order to the
   decoder for the track.
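
   The GroupID and ObjectID assignment described in this section can
   be sketched as follows.  This is a non-normative illustration; the
   function name is hypothetical, and each frame is represented only
   by a flag saying whether it is an independent (IDR) frame.

```python
def assign_video_ids(key_frame_flags):
    """Return (group_id, object_id) for each frame in decode order."""
    ids = []
    group_id = -1
    object_id = 0
    for is_key in key_frame_flags:
        if is_key:
            # A new group starts at every IDR frame; ObjectID resets to 0.
            group_id += 1
            object_id = 0
        elif group_id < 0:
            raise ValueError("stream must start with an IDR frame")
        ids.append((group_id, object_id))
        object_id += 1
    return ids
```

   For example, the frame sequence IDR, delta, delta, IDR, delta maps
   to (0,0), (0,1), (0,2), (1,0), (1,1).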

5.3.  Application with single video track with temporal layers

   An example is shown below for an H.264 video track with 1280x720p
   resolution and 2 temporal layers at 30 fps and 60 fps frame rate.

   codec: "avc3.42E01F"
   bitrate: 1500000
   framerate: 60
   width: 1280
   height: 720

   When ready for publishing, each encoded video chunk is considered as
   input to MOQT Object payload.  If encrypted, the output of encryption
   will serve as the object's payload.  The GroupID is incremented by 1
   at Independent (IDR) frame boundaries.  Each MOQT group shall contain
   2 SubGroups corresponding to the 2 temporal layers as shown below:

   Layer:0/30fps Subgroup: 0 ObjectID: even
   Layer:1/60fps Subgroup: 1 ObjectID: odd

   Within the MOQT group, the ObjectID is incremented by 1 for each
   encoded video frame, starting at 0 and resetting to 0 at the start
   of a new group.  The first encoded video frame, the MOQT Object with
   ObjectID 0, shall be the independent (IDR) frame, and the rest of
   the encoded video frames correspond to dependent (delta) frames,
   organized in decode order.  When mapping to QUIC for sending, one
   unidirectional QUIC stream is used per SubGroup, thus resulting in 2
   QUIC streams per MOQT group.

   When decoding at the 'End Consumer' for a given MOQT group, the
   objects must be fed in the GroupID then ObjectID order.  This implies
   that the consumer media application needs to order objects across the
   SubGroup QUIC streams.
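
   A non-normative sketch of both sides of this mapping follows.  The
   function names are hypothetical, objects are modeled as
   (GroupID, ObjectID, payload) tuples, and each SubGroup stream is
   assumed to deliver its own objects in order.

```python
import heapq


def subgroup_for(object_id: int) -> int:
    """Even ObjectIDs go to Subgroup 0 (30 fps), odd ones to Subgroup 1."""
    return object_id % 2


def merge_subgroup_streams(*streams):
    """Merge per-Subgroup object lists into GroupID, ObjectID order."""
    return list(heapq.merge(*streams, key=lambda obj: (obj[0], obj[1])))
```

   A consumer that only wants the 30 fps layer can skip the merge and
   read the Subgroup 0 stream alone.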

5.4.  Application with multiple dependent video tracks

   An example is shown below for an H.264 video track with 2 spatial
   qualities at 360p and 720p, each at 30 fps.

   Video Track 1
   codec: "avc3.42E01E"
   bitrate: 500000
   framerate: 30
   width: 640
   height: 360

   Video Track 2
   codec: "svc1.56401F"
   bitrate: 1000000
   framerate: 30
   width: 1280
   height: 720

   When ready for publishing, the mapping to the MOQT object model and
   to underlying QUIC, follows the same procedures as described in
   Section 5.2 for each video track.

   When decoding at the 'End Consumer' for a given MOQT group, the
   objects must be fed in the GroupID then ObjectID order in the
   ascending quality track order.

   For the example in this section, this implies the following pattern
   when decoding group 5.

   Track 1 Group 5 Object 0
   Track 2 Group 5 Object 0
   Track 1 Group 5 Object 1
   Track 2 Group 5 Object 1
   ....
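
   The ordering rule for dependent tracks can be sketched as a sort
   key.  This is a non-normative illustration; the function name is
   hypothetical, and objects are modeled as (track, GroupID, ObjectID)
   tuples in which a lower track number means a lower quality layer.

```python
def decode_feed_order(objects):
    """Order objects by GroupID, then ObjectID, then ascending track."""
    return sorted(objects, key=lambda obj: (obj[1], obj[2], obj[0]))


received = [(2, 5, 1), (1, 5, 0), (2, 5, 0), (1, 5, 1)]
ordered = decode_feed_order(received)
```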

5.5.  Application with multiple dependent video tracks with dyadic
      framerate levels

   An example is shown below for an H.264 video track with 2 spatial
   qualities at 360p and 720p, where the frame rate varies dyadically
   between tracks.

   Video Track 1
   codec: "avc3.42E01E"
   bitrate: 500000
   framerate: 30
   width: 640
   height: 360

   Video Track 2
   codec: "svc1.56E01F"
   bitrate: 1000000
   framerate: 60
   width: 1280
   height: 720

   When ready for publishing, the mapping to the MOQT object model and
   to underlying QUIC, follows the same procedures as described in
   Section 5.2 for each video track.

   When decoding at the 'End Consumer' for a given MOQT group, the
   objects from across the tracks must be fed in the timestamp order to
   the decoder, if no frame reordering is present in the encoding.

   If the encoding uses frame reordering, or if the timestamp cannot be
   obtained, the next object to decode shall be chosen using the
   formula below.

   Object Decode Order = ObjectID * multiplier + offset

   multiplier = 2^(maxlayer-max(0,layer-1))
   offset = 2^(maxlayer-layer) MOD multiplier
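
   The formula can be transcribed directly into code, as in the
   non-normative sketch below.  The function name is hypothetical, and
   layers are assumed to be numbered from 0 (lowest frame rate) to
   maxlayer.

```python
def decode_order(object_id: int, layer: int, max_layer: int) -> int:
    """Object Decode Order = ObjectID * multiplier + offset."""
    multiplier = 2 ** (max_layer - max(0, layer - 1))
    offset = (2 ** (max_layer - layer)) % multiplier
    return object_id * multiplier + offset


# With two layers (maxlayer = 1), the layers interleave frame by frame.
layer0_positions = [decode_order(i, 0, 1) for i in range(3)]
layer1_positions = [decode_order(i, 1, 1) for i in range(3)]
```

   With maxlayer = 1, layer 0 objects land on positions 0, 2, 4 and
   layer 1 objects on positions 1, 3, 5.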

5.6.  Application with multiple simulcast qualities video tracks

   An example is shown below for an H.264 video track with 2 simulcast
   spatial qualities at 360p and 720p each at 30 fps.

   Video Track 1
   codec: "avc3.42E01E"
   bitrate: 500000
   framerate: 30
   width: 640
   height: 360

   Video Track 2
   codec: "avc3.42E01F"
   bitrate: 1000000
   framerate: 30
   width: 1280
   height: 720

   When ready for publishing, the mapping to the MOQT object model and
   to underlying QUIC, follows the same procedures as described in
   Section 5.2 for each video track.

   When decoding at the 'End Consumer', the objects from the QUIC stream
   are fed in the GroupID then ObjectID order to the decoders set up for
   the corresponding video tracks.

6.  Security and Privacy Considerations

   The metadata in LOC Header Extensions is visible to relays, since the
   MOQ Object Header Extensions are often not encrypted end-to-end (from
   original publisher to end subscribers) in common schemes.  In some
   cases, this may be an intentional design choice for proper relay
   operation.  In other cases, this may be an unintentional or
   undesirable leak of the metadata to relays.  Each metadata
   definition should consider the security and privacy aspects of
   granting relays visibility to the metadata.  End-to-end encryption
   schemes should support end-to-end encryption of sensitive metadata.

   The metadata defined and registered in this specification (Capture
   Timestamp, Video Frame Marking, and Audio Level) may be sensitive
   metadata that should be encrypted end-to-end.  They are used by media
   switches, which are not merely relays, and likely have access to some
   media keys.  This may require end-to-end encryption schemes with
   multiple different security key contexts for payload versus metadata.

7.  IANA Considerations

   The IANA registry for MOQ Object Header Extensions is populated with
   the entries specified in Section 2.2, referencing this
   specification.

   This document creates a new entry in the "MoQ Streaming Format"
   Registry (see [MoQTransport] Sect 8).  The type value is 0x002, the
   name is "LOC Streaming Format" and the RFC is XXX.

8.  Normative References

   [MoQTransport]
              Curley, L., Pugin, K., Nandakumar, S., Vasiliev, V., and
              I. Swett, "Media over QUIC Transport", Work in Progress,
              Internet-Draft, draft-ietf-moq-transport-07, 21 October
              2024, <https://datatracker.ietf.org/doc/html/draft-ietf-
              moq-transport-07>.

   [MoQCatalog]
              Nandakumar, S., Law, W., and M. Zanaty, "Common Catalog
              Format for moq-transport", Work in Progress, Internet-
              Draft, draft-wilaw-moq-catalogformat-02, 30 November 2023,
              <https://datatracker.ietf.org/doc/html/draft-wilaw-moq-
              catalogformat-02>.

   [Framemarking]
              Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame
              Marking RTP Header Extension", Work in Progress, Internet-
              Draft, draft-ietf-avtext-framemarking-16, 4 March 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-avtext-
              framemarking-16>.

   [SecureObjects]
              "Secure Objects for Media over QUIC", n.d.,
              <https://suhashere.github.io/moq-secure-objects/#go.draft-
              jennings-moq-secure-objects.html>.

   [MOQ-MLS]  "Secure Group Key Agreement with MLS over MoQ", n.d.,
              <https://suhashere.github.io/moq-e2ee-mls/draft-jennings-
              moq-e2ee-mls.html>.

   [WebCodecs]
              "WebCodecs", July 2023,
              <https://www.w3.org/TR/webcodecs/>.

   [WEBCODECS-CODEC-REGISTRY]
              "WebCodecs Codec Registry", July 2023,
              <https://www.w3.org/TR/webcodecs-codec-registry/>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC6464]  Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
              Transport Protocol (RTP) Header Extension for Client-to-
              Mixer Audio Level Indication", RFC 6464,
              DOI 10.17487/RFC6464, December 2011,
              <https://www.rfc-editor.org/rfc/rfc6464>.

   [RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
              Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
              September 2009, <https://www.rfc-editor.org/rfc/rfc5646>.

Appendix A.  Acknowledgements

   Thanks to Cullen Jennings for suggestions and review.

Authors' Addresses

   Mo Zanaty
   Cisco
   Email: mzanaty@cisco.com

   Suhas Nandakumar
   Cisco
   Email: snandaku@cisco.com

   Peter Thatcher
   Microsoft
   Email: pthatcher@microsoft.com
