Low Overhead Media Container

Document Type Active Internet-Draft (individual)
Authors Mo Zanaty, Suhas Nandakumar, Peter Thatcher
Last updated 2023-03-13
RFC stream (None)
Intended RFC status (None)
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
Network Working Group                                          M. Zanaty
Internet-Draft                                             S. Nandakumar
Intended status: Informational                                     Cisco
Expires: 14 September 2023                                   P. Thatcher
                                                           13 March 2023

                      Low Overhead Media Container

Abstract


   This specification describes a low-overhead media container format
   for encoded and encrypted audio and video media data, intended for
   interactive media use cases.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 14 September 2023.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Notation and Conventions
     1.2.  Terminology
   2.  Payload Format
   3.  Payload Header Data
     3.1.  Common Header Data
     3.2.  Video Header Data
     3.3.  Audio Header Data
   4.  Header Data Registration
   5.  Payload Encryption
   6.  Container Serialization
   7.  MOQ Transport Mapping
   8.  Security Considerations
   9.  IANA Considerations
   10. Normative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   This specification describes a low-overhead media container format
   for encoded and encrypted audio and video media data.  "Low-overhead"
   refers to minimal extra encapsulation as well as minimal application
   overhead when interfacing with WebCodecs.

   The container format description is specified for all audio and video
   codecs defined in the WebCodecs Codec Registry.  The audio and video
   payload bitstream is identical to the internal data inside an
   EncodedAudioChunk and EncodedVideoChunk, respectively, specified in
   the registry.

   In addition to the media payloads, critical metadata is also
   specified for audio and video payloads.

   A primary motivation is to align with media formats used in WebCodecs
   to minimize application overhead when interfacing with WebCodecs.
   Other container formats like CMAF or RTP would require more extensive
   application overhead in format conversions, as well as larger
   encapsulation overhead, which may burden some use cases such as
   low-bitrate audio scenarios.

1.1.  Requirements Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   and "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

1.2.  Terminology


2.  Payload Format

   The WebCodecs Codec Registry defines the contents of an
   EncodedAudioChunk and EncodedVideoChunk for the audio and video codec
   formats in the registry.  The "internal data" in these chunks is used
   directly in this specification as the payload bitstream.

3.  Payload Header Data

   This section specifies the metadata that needs to be carried as
   payload header data.  Payload header data provides the information
   that intermediaries need to make switching decisions when the payload
   is inaccessible due to encryption.

   Section 4 provides a framework for registering new payload header
   fields that aren't defined by this specification.

3.1.  Common Header Data

   The following metadata MUST be captured for each media frame:

   Sequence Number: A sequentially increasing variable-length integer
   that is incremented per encoded media frame.

   Capture Timestamp in Microseconds: The wall-clock time, in
   microseconds, at which the media frame was captured.
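
   As a non-normative sketch, the two common fields could be serialized
   as shown here.  This specification does not define the
   variable-length integer encoding; the sketch assumes the QUIC
   encoding from RFC 9000, Section 16, and also assumes that the
   timestamp is carried as a varint.  The names encode_varint,
   decode_varint, and encode_common_header are hypothetical.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC-style varint (an assumed encoding)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    # The two most significant bits of the first byte select the length.
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            return ((prefix << usable_bits) | value).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, next_offset) for the varint starting at offset."""
    if offset >= len(buf):
        raise ValueError("no data to decode")
    length = 1 << (buf[offset] >> 6)
    if offset + length > len(buf):
        raise ValueError("truncated varint")
    value = int.from_bytes(buf[offset:offset + length], "big")
    # Clear the two length bits to recover the value.
    value &= (1 << (8 * length - 2)) - 1
    return value, offset + length


def encode_common_header(sequence_number: int,
                         capture_timestamp_us: int) -> bytes:
    """Hypothetical layout: sequence number, then capture timestamp."""
    return encode_varint(sequence_number) + encode_varint(capture_timestamp_us)


# Example values only: frame 7 captured at a fixed microsecond timestamp.
header = encode_common_header(7, 1_678_665_600_000_000)
sequence_number, pos = decode_varint(header)
capture_timestamp_us, pos = decode_varint(header, pos)
```

   With this assumed encoding, small sequence numbers cost a single
   byte, while a microsecond timestamp since the Unix epoch needs the
   8-byte form.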

3.2.  Video Header Data

   Flags for frames that are independent, discardable, or base layer
   sync points, as well as temporal and spatial layer identification,
   as described in [I-D.ietf-avtext-framemarking].
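
   To illustrate how an intermediary might use these fields without
   access to the payload, the following non-normative sketch models the
   video header data as a record and applies a simple forwarding
   policy.  The VideoHeader type, its field names, and the
   should_forward policy are hypothetical; this specification defines
   neither a wire layout for these flags nor a forwarding algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoHeader:
    """Hypothetical in-memory view of the video header data."""
    independent: bool      # frame is marked independent
    discardable: bool      # frame is marked discardable
    base_layer_sync: bool  # frame is marked as a base layer sync point
    temporal_id: int       # temporal layer identifier
    spatial_id: int        # spatial layer identifier


def should_forward(header: VideoHeader, max_temporal: int, max_spatial: int,
                   congested: bool = False) -> bool:
    """Illustrative relay policy that only inspects header data."""
    # Drop layers above what this subscriber asked for.
    if header.temporal_id > max_temporal or header.spatial_id > max_spatial:
        return False
    # Under congestion, shed frames that are marked discardable.
    if congested and header.discardable:
        return False
    return True


key_frame = VideoHeader(True, False, False, temporal_id=0, spatial_id=0)
enhancement = VideoHeader(False, True, False, temporal_id=2, spatial_id=0)
```

   Here a subscriber limited to temporal layer 1 would receive
   key_frame but not enhancement.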

3.3.  Audio Header Data

   Audio Level: Captures the magnitude of the audio level of the
   corresponding audio frame.  The value is encoded in 7 bits as defined
   in Section 3 of [RFC6464].
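
   [RFC6464] expresses the level in -dBov, with values from 0 (loudest)
   to 127 (silence).  The following non-normative sketch computes such a
   value from one frame of samples.  It assumes floating-point samples
   normalized so that a magnitude of 1.0 is the overload point; the
   function name audio_level is hypothetical.

```python
import math
from typing import Sequence


def audio_level(samples: Sequence[float]) -> int:
    """Return the frame level in -dBov, clamped to the 7-bit range 0..127."""
    if not samples:
        return 127
    # Root mean square of the frame, relative to an overload point of 1.0.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms <= 0.0:
        return 127  # digital silence
    level = round(-20.0 * math.log10(rms))
    return max(0, min(127, level))
```

   A real implementation would typically work on the integer PCM
   samples of its capture pipeline and scale accordingly.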

4.  Header Data Registration

   This section details the procedures to register header data fields
   that might be useful for a particular class of media applications.

   Registering a given metadata field requires the following attributes
   to be specified.

   Shortname: Short name for the metadata.  (Not sent on the wire.)

   Description: Detailed description for the metadata.  (Not sent on the
   wire.)

   ID: Identifier assigned by the registry. (varint)

   Length: Length of metadata value in bytes. (varint)

   Value: Value of metadata. (length bytes)

   New header data fields are registered under the "Specification
   Required" registration policy.
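
   As a non-normative sketch, a registered field could be modeled and
   serialized as ID, Length, and Value in that order.  The varint
   encoding is not defined by this specification; the sketch assumes the
   QUIC encoding from RFC 9000, Section 16.  The RegisteredField type
   and the example field with ID 3 are hypothetical and do not
   correspond to any actual registry entry.

```python
from dataclasses import dataclass


def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC-style varint (an assumed encoding)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            return ((prefix << usable_bits) | value).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


@dataclass(frozen=True)
class RegisteredField:
    shortname: str    # not sent on the wire
    description: str  # not sent on the wire
    field_id: int     # identifier assigned by the registry

    def serialize(self, value: bytes) -> bytes:
        """Emit ID (varint), Length (varint), then Value (Length bytes)."""
        return encode_varint(self.field_id) + encode_varint(len(value)) + value


# Made-up registration used only for illustration.
example_field = RegisteredField("example", "Illustrative one-byte field", 3)
encoded = example_field.serialize(bytes([42]))
```

   Only the ID, Length, and Value reach the wire; the Shortname and
   Description stay in the registry.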

5.  Payload Encryption

   When end-to-end encryption is supported, the encoded payload is
   encrypted with keys from a symmetric keying mechanism, such as MLS,
   and the payload itself is protected using SFrame or an equivalent.

6.  Container Serialization

   The wire encoding of the payload conforming to this specification is
   a set of length delimited values as shown below.

   The Bytes value is the output of the AEAD operation that encrypts the
   Payload, with the header data as the additional data input.

   | Payload | Bytes | Payload | Bytes |
   | Len (0) |  (0)  | Len (1) |  (1)  | ...
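
   The length-delimited layout can be sketched as follows
   (non-normative).  Each Bytes value is treated as an opaque AEAD
   output; the AEAD operation itself is not shown.  The sketch assumes
   that Payload Len is a QUIC-style varint (RFC 9000, Section 16)
   counting the octets of the Bytes value that follows, which this
   specification does not state.  The function names are hypothetical.

```python
from typing import List, Tuple


def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC-style varint (an assumed encoding)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            return ((prefix << usable_bits) | value).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(buf: bytes, offset: int) -> Tuple[int, int]:
    """Return (value, next_offset) for the varint starting at offset."""
    if offset >= len(buf):
        raise ValueError("no data to decode")
    length = 1 << (buf[offset] >> 6)
    if offset + length > len(buf):
        raise ValueError("truncated varint")
    value = int.from_bytes(buf[offset:offset + length], "big")
    return value & ((1 << (8 * length - 2)) - 1), offset + length


def serialize_container(chunks: List[bytes]) -> bytes:
    """Write each opaque Bytes value prefixed by its length."""
    out = bytearray()
    for chunk in chunks:
        out += encode_varint(len(chunk))
        out += chunk
    return bytes(out)


def parse_container(buf: bytes) -> List[bytes]:
    """Split a buffer back into its length-delimited Bytes values."""
    chunks = []
    offset = 0
    while offset < len(buf):
        length, offset = decode_varint(buf, offset)
        if offset + length > len(buf):
            raise ValueError("truncated Bytes value")
        chunks.append(buf[offset:offset + length])
        offset += length
    return chunks
```

   A receiver can skip over values it does not need by reading only the
   length prefixes, without touching the encrypted contents.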

7.  MOQ Transport Mapping


8.  Security Considerations


9.  IANA Considerations

   TODO on specification required for metadata registration.

10.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [I-D.ietf-avtext-framemarking]
              Zanaty, M., Berger, E., and S. Nandakumar, "Frame Marking
              RTP Header Extension", Work in Progress, Internet-Draft,
              draft-ietf-avtext-framemarking-13, 11 November 2021.

   [RFC6464]  Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
              Transport Protocol (RTP) Header Extension for Client-to-
              Mixer Audio Level Indication", RFC 6464,
              DOI 10.17487/RFC6464, December 2011.

Appendix A.  Acknowledgements

   Thanks to Cullen Jennings for suggestions and review.

Authors' Addresses

   Mo Zanaty

   Suhas Nandakumar

   Peter Thatcher