MoQ Media Interop
draft-cenzano-moq-media-interop-01
This document is an Internet-Draft (I-D).
Anyone may submit an I-D to the IETF.
This I-D is not endorsed by the IETF and has no formal standing in the
IETF standards process.
The information below is for an old version of the document.
| Field | Value |
|---|---|
| Document type | Older version of an Internet-Draft whose latest revision state is "Active" |
| Authors | Jorge Cenzano Ferret, Alan Frindell |
| Last updated | 2024-12-27 (Latest revision 2024-10-21) |
| RFC stream | (None) |
| Stream state | (No stream defined) |
| Consensus boilerplate | Unknown |
| RFC Editor Note | (None) |
| IESG state | I-D Exists |
| Telechat date | (None) |
| Responsible AD | (None) |
| Send notices to | (None) |
Media Over QUIC J. Cenzano-Ferret
Internet-Draft A. Frindell
Intended status: Informational Meta
Expires: 1 July 2025 28 December 2024
MoQ Media Interop
draft-cenzano-moq-media-interop-01
Abstract
This protocol can be used to send and receive video and audio over
Media over QUIC Transport [MOQT].
About This Document
This note is to be removed before publishing as an RFC.
The latest revision of this draft can be found at
https://afrind.github.io/draft-cenzano-media-interop/draft-cenzano-
moq-media-interop.html. Status information for this document may be
found at https://datatracker.ietf.org/doc/draft-cenzano-moq-media-
interop/.
Discussion of this document takes place on the Media Over QUIC
Working Group mailing list (mailto:moq@ietf.org), which is archived
at https://mailarchive.ietf.org/arch/browse/moq/. Subscribe at
https://www.ietf.org/mailman/listinfo/moq/.
Source for this draft and an issue tracker can be found at
https://github.com/afrind/draft-cenzano-media-interop.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 1 July 2025.
Cenzano-Ferret & Frindell Expires 1 July 2025 [Page 1]
Internet-Draft moq-mi December 2024
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Protocol Operation
  2.1. Track Names
  2.2. Mapping Tracks to MoQT Object Model
  2.3. Timestamps
  2.4. Object Format
    2.4.1. Media Type
    2.4.2. Media payload
3. References
4. Conventions and Definitions
5. Security Considerations
6. IANA Considerations
7. Normative References
Acknowledgments
Authors' Addresses
1. Introduction
This protocol specifies a simple mechanism for sending media (video
and audio) over MOQT for both live-streaming and video conferencing
(VC) style use cases.
The protocol is flexible in order to support this range of use cases.
Parameters such as frame rate, resolution, and codec can be updated
in the middle of a track.
The protocol defines a low-overhead packager (not LoC [loc]) and is
extensible to other formats such as FMP4.
2. Protocol Operation
2.1. Track Names
The publisher selects a namespace of their choosing, and sends an
ANNOUNCE message for this namespace.
Within this namespace, the publisher offers media tracks named videoX
and audioX, where X is an integer starting at 0.
For example, if the publisher offers 2 audio tracks and 1 video
track, the available track names are video0, audio0, and audio1.
The subscriber treats all tracks that belong to the same namespace as
part of the same synchronization group (timestamps aligned to the
same timeline).
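The naming rule above can be illustrated with a short sketch. The helper name track_names is hypothetical and not part of this protocol; it only shows how the videoX and audioX names are derived from the number of tracks offered.

```python
def track_names(num_video: int, num_audio: int) -> list[str]:
    """Return the track names a publisher would offer (X starts at 0)."""
    if num_video < 0 or num_audio < 0:
        raise ValueError("track counts must not be negative")
    video = [f"video{i}" for i in range(num_video)]
    audio = [f"audio{i}" for i in range(num_audio)]
    return video + audio


# 1 video track and 2 audio tracks, as in the example in the text.
print(track_names(num_video=1, num_audio=2))
```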
2.2. Mapping Tracks to MoQT Object Model
For the video track, the publisher begins a new group at the start of
each IDR (so object 0 is always an IDR keyframe), and each group
contains a single subgroup. Each object has the format described in
Section 2.4.
For the audio track, the publisher begins a new group with each audio
object, and each group contains a single subgroup. Each object has
the format described in Section 2.4.
TODO: Datagram forwarding preference could be used, but has problems
if an audio frame does not fit in a single UDP payload.
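The mapping rules for video and audio can be sketched as follows. This is an illustration only: the names Location, video_locations, and audio_locations are hypothetical, and the sketch assumes group IDs start at 0 and increase by 1, and that a video track starts with an IDR. This document does not state either assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Location:
    group_id: int
    object_id: int


def video_locations(is_idr_flags: Iterable[bool]) -> Iterator[Location]:
    """Start a new group at every IDR, so object 0 of a group is an IDR."""
    group_id = -1  # assumption: the first group gets ID 0
    object_id = 0
    for is_idr in is_idr_flags:
        if is_idr:
            group_id += 1
            object_id = 0
        elif group_id < 0:
            # assumption: frames before the first IDR cannot be placed
            raise ValueError("video track must start with an IDR frame")
        yield Location(group_id, object_id)
        object_id += 1


def audio_locations(num_frames: int) -> Iterator[Location]:
    """Every audio object opens its own group, so its object ID is 0."""
    for group_id in range(num_frames):
        yield Location(group_id, 0)


# IDR, P, P, IDR, P
for loc in video_locations([True, False, False, True, False]):
    print(loc)
```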
2.3. Timestamps
To avoid using fractional numbers and having to deal with rounding
errors, timestamps are expressed with two integers:
- timestamp numerator (ex: PTS, DTS, duration)
- timebase
To convert a timestamp into seconds, divide the numerator by the
timebase: timestamp(s) = timestamp numerator / timebase
Example:
PTS = 11, timebase = 30
PTS(s) = 11/30 ≈ 0.366667
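The conversion can be kept exact by using rational arithmetic. The following is a minimal sketch; the function names to_seconds and rescale are hypothetical, and rescale (moving a numerator to another timebase) is an extra illustration that this document does not define.

```python
from fractions import Fraction


def to_seconds(numerator: int, timebase: int) -> Fraction:
    """Return numerator / timebase as an exact rational number of seconds."""
    if timebase <= 0:
        raise ValueError("timebase must be a positive integer")
    return Fraction(numerator, timebase)


def rescale(numerator: int, from_timebase: int, to_timebase: int) -> Fraction:
    """Express the same instant as a numerator in another timebase."""
    return to_seconds(numerator, from_timebase) * to_timebase


pts = to_seconds(11, 30)
print(pts)                     # exact value, no rounding
print(round(float(pts), 6))    # decimal approximation for display
print(rescale(11, 30, 90000))  # same instant in a 90000 timebase
```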
2.4. Object Format
{
Media Type (i)
Media payload (..)
}
Figure 1: MOQT Media object
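The layout in Figure 1 can be sketched as shown below. This sketch assumes that "(i)" denotes a QUIC variable-length integer (RFC 9000, Section 16), as in the MOQT notation; this document does not state that explicitly. All function names are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer (1, 2, 4 or 8 bytes)."""
    if value < 0:
        raise ValueError("varints cannot represent negative values")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x8000_0000).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | 0xC000_0000_0000_0000).to_bytes(8, "big")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one varint at offset; return (value, offset after the varint)."""
    if offset >= len(data):
        raise ValueError("no data left to decode")
    first = data[offset]
    end = offset + (1 << (first >> 6))  # the two top bits select the length
    if end > len(data):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for byte in data[offset + 1:end]:
        value = (value << 8) | byte
    return value, end


def encode_media_object(media_type: int, media_payload: bytes) -> bytes:
    """Media Type (i) followed by the media payload."""
    return encode_varint(media_type) + media_payload


def decode_media_object(obj: bytes) -> tuple[int, bytes]:
    media_type, offset = decode_varint(obj)
    return media_type, obj[offset:]


obj = encode_media_object(0x1, b"opus-bytes")
print(decode_media_object(obj))
```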
2.4.1. Media Type
This value indicates what kind of media payload will follow.
+======+======================================+
| Code | Value                                |
+======+======================================+
| 0x0  | Video H264 in AVCC with LOC packager |
+------+--------------------------------------+
| 0x1  | Audio Opus bitstream                 |
+------+--------------------------------------+
| 0x2  | UTF-8 text                           |
+------+--------------------------------------+
| 0x3  | Audio AAC-LC in MPEG4                |
+------+--------------------------------------+
Table 1
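A receiver could represent the codes in Table 1 with an enumeration and reject unknown values. The enumeration and member names below are hypothetical, and rejecting unknown codes with an error is an assumption; this document does not say how unknown Media Type values are handled.

```python
from enum import IntEnum


class MediaType(IntEnum):
    VIDEO_H264_AVCC = 0x0     # Video H264 in AVCC with LOC packager
    AUDIO_OPUS = 0x1          # Audio Opus bitstream
    TEXT_UTF8 = 0x2           # UTF-8 text
    AUDIO_AAC_LC_MPEG4 = 0x3  # Audio AAC-LC in MPEG4


def parse_media_type(code: int) -> MediaType:
    """Map a decoded Media Type code to its enumeration member."""
    try:
        return MediaType(code)
    except ValueError:
        # assumption: unknown codes are treated as an error
        raise ValueError(f"unknown Media Type code: {code:#x}") from None


print(parse_media_type(0x3).name)
```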
2.4.2. Media payload
Carries the media-related information. Its format is specified by
the Media Type.
2.4.2.1. Video H264 in AVCC with LOC packager format
{
Seq ID (i)
PTS Timestamp (i)
DTS Timestamp (i)
Timebase (i)
Duration (i)
Wall Clock (i)
Metadata Size (i)
Metadata (..)
Payload (..)
}
Figure 2: MOQT Media video h264 loc
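The field order in Figure 2 can be sketched as a serializer. This is a sketch under assumptions: "(i)" is taken to be a QUIC variable-length integer (RFC 9000, Section 16), and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


def encode_varint(value: int) -> bytes:
    """QUIC varint: a 2-bit length prefix followed by 6, 14, 30 or 62 value bits."""
    for prefix, size in ((0, 1), (1, 2), (2, 4), (3, 8)):
        bits = 8 * size - 2
        if 0 <= value < 1 << bits:
            return (value | (prefix << bits)).to_bytes(size, "big")
    raise ValueError("value out of range for a 62-bit varint")


@dataclass
class VideoH264Payload:
    seq_id: int
    pts: int
    dts: int
    timebase: int
    duration: int = 0      # 0 means "not set"
    wallclock_ms: int = 0  # 0 means "not set"
    metadata: bytes = b""  # AVCDecoderConfigurationRecord, may be empty
    payload: bytes = b""   # length-prefixed NAL units

    def encode(self) -> bytes:
        header = b"".join(
            encode_varint(v)
            for v in (self.seq_id, self.pts, self.dts, self.timebase,
                      self.duration, self.wallclock_ms, len(self.metadata))
        )
        return header + self.metadata + self.payload


frame = VideoH264Payload(seq_id=0, pts=11, dts=11, timebase=30,
                         metadata=b"\x01", payload=b"\xaa\xbb")
print(frame.encode().hex())
```

The Metadata Size field is derived from the metadata length rather than stored separately, so the two cannot disagree.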
2.4.2.1.1. Seq ID
Monotonically increasing counter for this media track.
2.4.2.1.2. PTS Timestamp
Presentation timestamp (PTS), expressed in timebase units.
TODO: Varints cannot easily represent negative values, so negative
timestamps at the start of a stream (priming) could be challenging to
encode.
2.4.2.1.3. DTS Timestamp
Decoding timestamp (DTS), expressed in timebase units. It is not
needed if B-frames are not used; in that case it should have the same
value as PTS.
TODO: Varints cannot easily represent negative values, so negative
timestamps at the start of a stream (priming) could be challenging to
encode.
2.4.2.1.4. Timebase
Units used in PTS, DTS, and duration.
2.4.2.1.5. Duration
Duration in timebase units. It will be 0 if not set.
2.4.2.1.6. Wall Clock
Epoch time in milliseconds when this frame started being captured.
It will be 0 if not set.
2.4.2.1.7. Metadata Size
Size in bytes of the metadata section. It can be 0 if no metadata is
sent.
2.4.2.1.8. Metadata
Extradata needed to decode this stream. This will be an
AVCDecoderConfigurationRecord as described in [ISO14496-15:2019]
section 5.3.3.1, with field lengthSizeMinusOne = 3 (so the length
field is 4 bytes).
If any other length size is indicated (in
AVCDecoderConfigurationRecord), the receiver should error with
"Protocol violation".
Any change in encoding parameters MUST be signaled by sending a new
AVCDecoderConfigurationRecord.
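The length size check can be sketched as follows. In AVCDecoderConfigurationRecord, lengthSizeMinusOne occupies the two low-order bits of the fifth byte (after configurationVersion, AVCProfileIndication, profile_compatibility, and AVCLevelIndication). The function name is hypothetical, and raising ValueError is only a stand-in for however an implementation reports "Protocol violation".

```python
def check_avcc_length_size(record: bytes) -> None:
    """Raise if the record does not signal 4-byte NAL unit length fields."""
    if len(record) < 5:
        raise ValueError("Protocol violation: record too short")
    # Byte 4 is '111111' reserved bits followed by lengthSizeMinusOne (2 bits).
    length_size = (record[4] & 0x03) + 1
    if length_size != 4:
        raise ValueError(
            f"Protocol violation: NAL length size is {length_size}, expected 4"
        )


# First five bytes of an illustrative record: version 1, High profile,
# no compatibility flags, level 3.1, lengthSizeMinusOne = 3.
check_avcc_length_size(bytes([0x01, 0x64, 0x00, 0x1F, 0xFF]))
print("length size accepted")
```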
2.4.2.1.9. Payload
H264 bitstream in AVC1 format, as described in [ISO14496-15:2019]
section 5.3, using a 4-byte length field.
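With a 4-byte length field, the payload is a sequence of NAL units, each preceded by its size as a 32-bit big-endian integer. A minimal sketch of packing and splitting such a payload follows; the function names are hypothetical.

```python
import struct


def pack_nal_units(nal_units: list[bytes]) -> bytes:
    """Prefix every NAL unit with its length as a 4-byte big-endian integer."""
    return b"".join(struct.pack(">I", len(nal)) + nal for nal in nal_units)


def split_nal_units(payload: bytes) -> list[bytes]:
    """Split a length-prefixed payload back into individual NAL units."""
    units = []
    offset = 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            raise ValueError("truncated NAL length field")
        (size,) = struct.unpack_from(">I", payload, offset)
        offset += 4
        if offset + size > len(payload):
            raise ValueError("truncated NAL unit")
        units.append(payload[offset:offset + size])
        offset += size
    return units


# Two placeholder NAL units (the bytes are illustrative, not a valid stream).
payload = pack_nal_units([b"\x65\x88\x84", b"\x41\x9a"])
print(payload.hex())
print(split_nal_units(payload))
```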
2.4.2.2. Audio Opus bitstream
{
Seq ID (i)
PTS Timestamp (i)
Timebase (i)
Sample Freq (i)
Num Channels (i)
Duration (i)
Wall Clock (i)
Payload (..)
}
Figure 3: MOQT Media audio Opus LOC
2.4.2.2.1. Seq ID
Monotonically increasing counter for this media track.
2.4.2.2.2. PTS Timestamp
Presentation timestamp (PTS), expressed in timebase units.
TODO: Varints cannot easily represent negative values, so negative
timestamps at the start of a stream (priming) could be challenging to
encode.
2.4.2.2.3. Timebase
Units used in PTS and duration.
2.4.2.2.4. Sample Freq
Sample frequency used in the original signal (before encoding).
2.4.2.2.5. Num Channels
Number of channels in the original signal (before encoding).
2.4.2.2.6. Duration
Duration in timebase units. It will be 0 if not set.
2.4.2.2.7. Wall Clock
Epoch time in milliseconds when this frame started being captured.
It will be 0 if not set.
2.4.2.2.8. Payload
Opus packets, as described in Section 3 of [RFC6716].
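Parsing the Opus payload in Figure 3 amounts to reading seven integers and treating the rest as the Opus packet. The sketch below assumes "(i)" is a QUIC variable-length integer (RFC 9000, Section 16); the class, attribute, and function names are hypothetical.

```python
from dataclasses import dataclass


def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one QUIC varint; return (value, offset after the varint)."""
    if offset >= len(data):
        raise ValueError("no data left to decode")
    first = data[offset]
    end = offset + (1 << (first >> 6))  # the two top bits select the length
    if end > len(data):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for byte in data[offset + 1:end]:
        value = (value << 8) | byte
    return value, end


@dataclass
class AudioPayload:
    seq_id: int
    pts: int
    timebase: int
    sample_freq: int
    num_channels: int
    duration: int
    wallclock_ms: int
    payload: bytes


def decode_audio_payload(data: bytes) -> AudioPayload:
    """Read the seven integer fields in Figure 3 order, then the packet."""
    fields = []
    offset = 0
    for _ in range(7):
        value, offset = decode_varint(data, offset)
        fields.append(value)
    return AudioPayload(*fields, payload=data[offset:])


# seq 5, PTS 960, timebase 48000, 48000 Hz, 2 channels, duration 960,
# wall clock not set, then three placeholder payload bytes.
raw = bytes.fromhex("05" "43c0" "8000bb80" "8000bb80" "02" "43c0" "00" "fcfffe")
print(decode_audio_payload(raw))
```

The same field order is used by the AAC-LC payload, so the parser applies to it unchanged apart from how the trailing bytes are interpreted.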
2.4.2.3. UTF-8 Text
{
Seq ID (i)
Payload (..)
}
Figure 4: MOQT UTF-8 Text
2.4.2.3.1. Seq ID
Monotonically increasing counter for this media track.
2.4.2.3.2. Payload
Text packets in UTF-8, as described in [RFC3629].
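A round trip of the text payload in Figure 4 can be sketched as follows, again assuming "(i)" is a QUIC variable-length integer (RFC 9000, Section 16). The function names are hypothetical, and for brevity the sketch only supports Seq ID values below 2^14.

```python
def encode_text_payload(seq_id: int, text: str) -> bytes:
    """Seq ID (i) followed by the UTF-8 encoded text."""
    if 0 <= seq_id < 1 << 6:
        head = seq_id.to_bytes(1, "big")             # 1-byte varint
    elif seq_id < 1 << 14 and seq_id >= 0:
        head = (seq_id | 0x4000).to_bytes(2, "big")  # 2-byte varint
    else:
        raise ValueError("sketch only supports Seq ID values below 2**14")
    return head + text.encode("utf-8")


def decode_text_payload(data: bytes) -> tuple[int, str]:
    """Return (Seq ID, text); raises UnicodeDecodeError on invalid UTF-8."""
    if not data:
        raise ValueError("empty payload")
    length = 1 << (data[0] >> 6)  # varint length from the two top bits
    if length > 2:
        raise ValueError("sketch only supports 1- and 2-byte varints")
    if len(data) < length:
        raise ValueError("truncated varint")
    seq_id = int.from_bytes(data[:length], "big") & ((1 << (8 * length - 2)) - 1)
    return seq_id, data[length:].decode("utf-8")


encoded = encode_text_payload(7, "héllo")
print(encoded.hex())
print(decode_text_payload(encoded))
```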
2.4.2.4. Audio AAC-LC in MPEG4 bitstream
{
Seq ID (i)
PTS Timestamp (i)
Timebase (i)
Sample Freq (i)
Num Channels (i)
Duration (i)
Wall Clock (i)
Payload (..)
}
Figure 5: MOQT Media audio AAC-LC MPEG4 LOC
2.4.2.4.1. Seq ID
Monotonically increasing counter for this media track.
2.4.2.4.2. PTS Timestamp
Presentation timestamp (PTS), expressed in timebase units.
TODO: Varints cannot easily represent negative values, so negative
timestamps at the start of a stream (priming) could be challenging to
encode.
2.4.2.4.3. Timebase
Units used in PTS and duration.
2.4.2.4.4. Sample Freq
Sample frequency used in the original signal (before encoding).
2.4.2.4.5. Num Channels
Number of channels in the original signal (before encoding).
2.4.2.4.6. Duration
Duration in timebase units. It will be 0 if not set.
2.4.2.4.7. Wall Clock
Epoch time in milliseconds when this frame started being captured.
It will be 0 if not set.
2.4.2.4.8. Payload
AAC frame (syntax element raw_data_block()), as described in section
4.4.2.1 of [ISO14496-3:2009].
3. References
[ISO14496-15:2019] "Carriage of network abstraction layer (NAL) unit
structured video in the ISO base media file format",
ISO 14496-15:2019, International Organization for Standardization,
October 2022.
[ISO14496-3:2009] "Information technology — Coding of audio-visual
objects", ISO 14496-3:2009, International Organization for
Standardization, September 2009.
4. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
5. Security Considerations
TODO Security
6. IANA Considerations
This document has no IANA actions.
7. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
2003, <https://www.rfc-editor.org/rfc/rfc3629>.
[RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
September 2012, <https://www.rfc-editor.org/rfc/rfc6716>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
Acknowledgments
TODO acknowledge.
Authors' Addresses
Jordi Cenzano-Ferret
Meta
Email: jcenzano@meta.com
Alan Frindell
Meta
Email: afrind@meta.com