Low Overhead CMAF for Media over QUIC (LOCMAF)
draft-einarsson-moq-locmaf-00
This document is an Internet-Draft (I-D).
Anyone may submit an I-D to the IETF.
This I-D is not endorsed by the IETF and has no formal standing in the
IETF standards process.
Media Over QUIC                                             T. Einarsson
Internet-Draft                                        Eyevinn Technology
Intended status: Informational                                  H. Björs
Expires: 4 December 2026                                             KTH
                                                             2 June 2026
Low Overhead CMAF for Media over QUIC (LOCMAF)
draft-einarsson-moq-locmaf-00
Abstract
This document specifies LOCMAF (Low Overhead CMAF for Media over
QUIC), a compact wire format for streaming low-latency CMAF media
over the MoQ Transport protocol (MOQT) with per-object overhead
comparable to the Low Overhead Container (LOC). LOCMAF carries the
CMAF chunk head metadata from a single moof (movie fragment) as a
small set of tagged fields inside one of two LOCMAF object kinds,
while leaving the sample data (mdat) untouched. In addition, it can
carry the optional styp (segment type) and prft (producer reference
time) boxes and any number of emsg (event message) boxes. The first
object of each MOQT group carries a full reference; subsequent
objects in the same group carry only the differences. The receiver
reconstructs CMAF chunks that are semantically equivalent to the
sender input, including encryption metadata required by CMAF DRM
(Common Encryption) pipelines.
About This Document
This note is to be removed before publishing as an RFC.
The latest revision of this draft can be found at
https://Eyevinn.github.io/locmaf-id/draft-einarsson-moq-locmaf.html.
Status information for this document may be found at
https://datatracker.ietf.org/doc/draft-einarsson-moq-locmaf/.
Discussion of this document takes place on the Media Over QUIC
Working Group mailing list (mailto:moq@ietf.org), which is archived
at https://mailarchive.ietf.org/arch/browse/moq/. Subscribe at
https://www.ietf.org/mailman/listinfo/moq/.
Source for this draft and an issue tracker can be found at
https://github.com/Eyevinn/locmaf-id.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Einarsson & Björs Expires 4 December 2026 [Page 1]
Internet-Draft LOCMAF June 2026
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 4 December 2026.
Copyright Notice
Copyright (c) 2026 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Relationship to prior work . . . . . . . . . . . . . . . 4
2. Conventions and Definitions . . . . . . . . . . . . . . . . . 5
3. MOQT Group / Object Mapping . . . . . . . . . . . . . . . . . 6
4. CMSF Catalog Signalling . . . . . . . . . . . . . . . . . . . 6
5. CMAF Header Delivery . . . . . . . . . . . . . . . . . . . . 7
6. Scope and Publisher Requirements . . . . . . . . . . . . . . 8
6.1. Mandatory preconditions . . . . . . . . . . . . . . . . . 8
6.2. Recommended source properties . . . . . . . . . . . . . . 9
6.3. tfdt.baseMediaDecodeTime contiguity . . . . . . . . . . . 9
6.4. Optional encoder modes . . . . . . . . . . . . . . . . . 9
7. Object Framing . . . . . . . . . . . . . . . . . . . . . . . 9
7.1. Top-level header IDs . . . . . . . . . . . . . . . . . . 9
7.2. Object layout . . . . . . . . . . . . . . . . . . . . . . 10
7.3. Property encoding (parity rule) . . . . . . . . . . . . . 10
7.4. Zigzag varint encoding . . . . . . . . . . . . . . . . . 11
7.5. Full vs delta dispatch . . . . . . . . . . . . . . . . . 12
8. Field Reference . . . . . . . . . . . . . . . . . . . . . . . 12
8.1. Fields from moof child boxes . . . . . . . . . . . . . . 13
8.2. prft fields . . . . . . . . . . . . . . . . . . . . . . . 15
8.2.1. prftNtpTimestamp . . . . . . . . . . . . . . . . . . 16
8.2.2. prftMediaTime . . . . . . . . . . . . . . . . . . . . 17
8.2.3. prftVersion . . . . . . . . . . . . . . . . . . . . . 17
8.2.4. prftFlags . . . . . . . . . . . . . . . . . . . . . . 17
8.2.5. prft.reference_track_ID . . . . . . . . . . . . . . . 17
8.3. styp fields . . . . . . . . . . . . . . . . . . . . . . . 17
8.4. emsg field . . . . . . . . . . . . . . . . . . . . . . . 18
8.5. Delta deletion marker . . . . . . . . . . . . . . . . . . 19
9. Full LOCMAF Chunk Encoding . . . . . . . . . . . . . . . . . 19
9.1. Emission rules for moof child-box fields . . . . . . . . 19
9.1.1. Sample-size derivation . . . . . . . . . . . . . . . 20
9.2. Emission rules for prft / styp / emsg fields . . . . . . 21
10. Delta LOCMAF Chunk Encoding . . . . . . . . . . . . . . . . . 21
10.1. Field value encoding . . . . . . . . . . . . . . . . . . 22
10.1.1. List length changes . . . . . . . . . . . . . . . . 22
10.2. tfdtBaseMediaDecodeTime is normally derived . . . . . . 23
10.3. Deletions . . . . . . . . . . . . . . . . . . . . . . . 23
10.4. Empty delta . . . . . . . . . . . . . . . . . . . . . . 24
10.5. prft and emsg in delta chunks . . . . . . . . . . . . . 24
11. Compact sample_flags Encoding . . . . . . . . . . . . . . . . 24
11.1. Wire encoding . . . . . . . . . . . . . . . . . . . . . 24
11.2. Reconstruction . . . . . . . . . . . . . . . . . . . . . 25
11.3. Encoder constraint . . . . . . . . . . . . . . . . . . . 25
12. emsg Round-Trip . . . . . . . . . . . . . . . . . . . . . . . 25
12.1. Relationship to MSF eventtimeline . . . . . . . . . . . 25
12.2. Record format . . . . . . . . . . . . . . . . . . . . . 25
12.2.1. timescale default . . . . . . . . . . . . . . . . . 26
12.2.2. presentation_time encoding . . . . . . . . . . . . . 26
13. DRM Box Round-Trip . . . . . . . . . . . . . . . . . . . . . 26
13.1. Supported schemes . . . . . . . . . . . . . . . . . . . 26
13.2. Supported boxes . . . . . . . . . . . . . . . . . . . . 27
13.3. Unsupported boxes . . . . . . . . . . . . . . . . . . . 27
13.4. CENC IV counter prediction (optional) . . . . . . . . . 28
14. Event-Only Tracks and CMAF Ingest Compatibility . . . . . . . 29
15. Receiver Reconstruction . . . . . . . . . . . . . . . . . . . 29
16. Security Considerations . . . . . . . . . . . . . . . . . . . 31
17. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32
17.1. Catalog packaging value and locmafVersion . . . . . . . 32
17.2. LOCMAF Top-Level Header IDs . . . . . . . . . . . . . . 32
18. References . . . . . . . . . . . . . . . . . . . . . . . . . 33
18.1. Normative References . . . . . . . . . . . . . . . . . . 33
18.2. Informative References . . . . . . . . . . . . . . . . . 33
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 34
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 34
1. Introduction
CMAF [CMAF] chunk headers are at least 100 bytes in size, while the
codec frames they describe may be only a few hundred bytes in
low-latency, low-bitrate tracks such as audio. Streaming CMAF
directly over MoQ Transport [MOQT] therefore incurs a per-object
overhead that the Low Overhead Container (LOC) [LOC] avoids by
carrying raw codec frames with a minimal set of metadata. LOC,
however, cannot transport the per-sample CENC [CENC] metadata needed
for browser EME / CDM decryption of DRM-protected live streams, nor
the prft (Producer Reference Time) and emsg (DASH Event Message)
boxes that CMAF may carry alongside the moof.
LOCMAF closes this gap. It exploits the observation that consecutive
CMAF chunk heads within a single CMAF segment are nearly identical:
the first chunk of a MOQT group is sent in full, subsequent chunks
are sent as compact deltas against the previous chunk in the same
group, and mdat payloads are passed through unchanged. The receiver
reconstructs full CMAF chunks that are functionally equivalent to the
source and can be fed to unmodified MSE / EME pipelines.
This document specifies the LOCMAF object framing, the full and delta
chunk encodings, the CMSF [CMSF] catalog signalling, the receiver
reconstruction algorithm, and the DRM box round-trip.
1.1. Relationship to prior work
One reason CMAF headers are large is their origin in the multi-sample
MP4 file format. Furthermore, each individual box starts with an
8-byte header consisting of a fixed 4-byte size and a 4-byte
identifier. This contrasts with MOQT and QUIC, which use
variable-length integers (varints).
A general MP4/CMAF box can be compressed by using varints and shorter
IDs to reduce the header size, as proposed in the Compressed MP4
draft [COMPRESSED-MP4]. LOCMAF takes a more specific approach:
* LOCMAF does *not* compress the CMAF Header (initialisation
segment). The CMAF Header is carried verbatim in the catalog (see
Section 5). Init compression is a one-time-per-track cost;
LOCMAF's wire-byte target is the per-chunk overhead, which an init
codec cannot reduce. Carrying the CMAF Header verbatim has a
second, deployment-driven benefit: a locmaf packaging track uses
the *same MSF initialisation-data mechanism* as a cmaf packaging
track, and when both wrap the same source they MAY refer to the
same init entry (see Section 5), as is being proposed for the next
draft of [CMSF]. Publishers can therefore introduce LOCMAF as a
more efficient wire format for clients that support it without
duplicating the initialisation bytes for legacy CMAF clients —
both audiences consume the same init from the same catalog entry,
and the publisher only adds the LOCMAF-encoded media track
alongside the CMAF one.
* LOCMAF's goal for the per-chunk path is *functionally equivalent
reconstruction*, not byte-exact reconstruction: the reconstructed
CMAF chunk carries the same samples, sample metadata, and CENC
metadata as the source, but byte-level details that do not affect
a CMAF reader are not preserved (Section 15 lists what may
differ). Given that the typical target is to feed an MSE/EME
player instance, this is not a disadvantage.
A reference implementation is available [MOQLIVEMOCK]. Worked
examples and diagrams are published at [LOCMAF-SITE].
2. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
The following terms are used throughout this document:
CMAF chunk: One moof + one mdat pair, optionally preceded by at most
one styp, at most one prft, and zero or more emsg boxes, as
defined in [CMAF] §7.3.3.2. The smallest CMAF addressable media
object.
CMAF fragment: One or more CMAF chunks whose first chunk starts at a
Stream Access Point ([CMAF] §7.3.2.2). A fragment is logically a
single MovieFragmentBox worth of samples; in "chunked" CMAF the
samples are split across multiple smaller moof + mdat pairs.
CMAF segment: One or more CMAF fragments in decode order ([CMAF]
§7.3.2.4). The segment is the typical unit of HTTP delivery in
DASH and HLS-fMP4; in LOCMAF the segment corresponds to one MOQT
group.
CMAF Header: The ftyp + moov pair that initialises a CMAF track.
Also called an _initialisation segment_ in DASH parlance and
carried as initData in MSF [MSF] / CMSF [CMSF] catalogs.
MOQT group, MOQT object: As defined in [MOQT].
LOCMAF object: A MOQT object whose payload begins with one of the
top-level header IDs defined in Section 7.
Full LOCMAF chunk: A LOCMAF object whose top-level header ID is
LocmafFullHeader. It carries an absolute encoding of the CMAF
chunk head and serves as the in-group reference for subsequent
delta objects. See Section 9.
Delta LOCMAF chunk: A LOCMAF object whose top-level header ID is
LocmafDeltaHeader. It encodes differences against the most
recently received full LOCMAF chunk in the same MOQT group. See
Section 10.
BMDT: Abbreviation for tfdt.baseMediaDecodeTime ([ISOBMFF]).
3. MOQT Group / Object Mapping
LOCMAF assumes the following mapping from CMAF to MOQT:
* One MOQT group per CMAF segment. Group boundaries align with
random access points.
* One MOQT object per CMAF chunk. Each MOQT object is a LOCMAF
object carrying the (full or delta) chunk head followed by the
unmodified mdat payload.
* Audio MOQT groups typically have the same duration as the video
MOQT groups with which they will be muxed, to enable joint tune-
in.
* Sparse tracks, such as subtitle or event/metadata tracks, are
more likely to have groups that are not aligned with video.
Per [MOQT], objects within a MOQT group are delivered in order and
groups are independently decodable. LOCMAF relies on both
properties: delta chunks reference the preceding chunk in the same
group (see Section 10), and each group MUST begin with a
LocmafFullHeader so a subscriber tuning in at a group boundary has a
complete reference (see Section 7.5).
4. CMSF Catalog Signalling
A track that carries LOCMAF-encoded chunks MUST advertise:
* packaging equal to "locmaf".
* locmafVersion equal to "0.2" for this version of the
specification.
This document extends the allowed packaging values defined in [MSF]
to include one new entry, in the same manner that [CMSF] adds "cmaf":
+========+========+===============+
| Name | Value | Reference |
+========+========+===============+
| LOCMAF | locmaf | this document |
+--------+--------+---------------+
Table 1
locmafVersion is a track-level catalog field added by this document,
analogous to the track-level fields [CMSF] adds for CMAF. It is
present only when packaging == "locmaf" and is omitted otherwise.
Its value is a string identifying the LOCMAF wire-format version;
"0.2" denotes the format specified here. Receivers MUST compare it
against their highest supported version and SHOULD refuse the track
when the encoder advertises a version they do not implement. The set
of valid locmafVersion values is governed by this document and its
successors, not by an IANA registry (see Section 17).
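The version check described above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name accepts_track, the SUPPORTED_LOCMAF_VERSIONS set, and the representation of a catalog track entry as a Python dict are assumptions made for the example; only the packaging and locmafVersion field names come from this document.

```python
# Versions this hypothetical receiver implements (assumption for the sketch).
SUPPORTED_LOCMAF_VERSIONS = {"0.2"}


def accepts_track(track: dict) -> bool:
    """Return True if this receiver can consume the track as LOCMAF."""
    if track.get("packaging") != "locmaf":
        # Not a LOCMAF track; locmafVersion is omitted for other packagings.
        return False
    # Refuse the track when the advertised version is not implemented.
    return track.get("locmafVersion") in SUPPORTED_LOCMAF_VERSIONS
```

A receiver that also supports plain cmaf packaging would handle that case through its existing CMAF path rather than through this check.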
The CMAF Header for a LOCMAF track is carried by the same MSF [MSF]
mechanism that carries the CMAF Header for a plain cmaf packaging
track. LOCMAF does not define its own init-carriage shape, nor add
locmaf-specific catalog fields beyond locmafVersion. Whatever
mechanism MSF specifies for cmaf init data — inline initData today,
or whichever indirection form MSF adopts (e.g. a root-level init-data
list referenced from the track entry) — applies unchanged to locmaf
tracks.
A consequence: when a cmaf packaging track and a locmaf packaging
track wrap the same source, they MAY refer to the same MSF init-data
entry. The wrapped media is identical at the sample level and the
CMAF Header bytes are identical at the byte level (see Section 5);
only the per-chunk wire encoding differs.
Note that cmaf and locmaf tracks may be mixed in the same catalog,
e.g. video tracks using cmaf packaging while audio tracks use locmaf
packaging.
5. CMAF Header Delivery
The CMAF Header for a LOCMAF track is byte-identical to the CMAF
Header a plain cmaf packaging track of the same source would carry —
ftyp followed by moov, followed by any optional supplemental boxes
(pssh, mvex with trex, etc.). It is delivered uncompressed via the
catalog using the same MSF mechanism as for cmaf packaging. There is
no LOCMAF-specific CMAF Header carrier.
The moov in the CMAF Header MUST contain exactly one trak box (see
Section 6).
A LOCMAF receiver:
1. Resolves the track's MSF init-data reference (whichever form the
catalog uses) to the base64-encoded CMAF Header bytes.
2. Base64-decodes the CMAF Header bytes.
3. Feeds the bytes to its MSE / decoder pipeline exactly as it would
for a plain CMAF track.
4. Begins receiving LOCMAF-encoded media objects on the subscribed
track and reconstructs each CMAF chunk from the LOCMAF payload.
5. Extracts the parameters required to regenerate CMAF chunks —
track_ID, media timescale, trex defaults, and any track-
encryption information (tenc defaults, default KID, default IV,
scheme type, pattern parameters) — from the decoded CMAF Header.
These values seed the reconstruction state used in step 4.
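Steps 1 and 2 above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the init data is assumed to be already resolved to a base64 string, and the box walker handles only 32-bit box sizes. Extracting track_ID, timescale, trex, and tenc values (step 5) needs a full ISO BMFF parser and is not shown.

```python
import base64
import struct
from typing import List, Tuple


def top_level_boxes(data: bytes) -> List[Tuple[str, int]]:
    """List (type, size) for each top-level ISO BMFF box in data."""
    boxes = []
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        if size < 8 or pos + size > len(data):
            # size 0/1 (to-end / 64-bit) and truncation are out of scope here.
            raise ValueError("unsupported or truncated box at offset %d" % pos)
        boxes.append((box_type.decode("ascii"), size))
        pos += size
    return boxes


def decode_cmaf_header(init_data_b64: str) -> bytes:
    """Base64-decode the CMAF Header and check it starts with ftyp + moov."""
    header = base64.b64decode(init_data_b64)
    types = [box_type for box_type, _ in top_level_boxes(header)]
    if types[:2] != ["ftyp", "moov"]:
        raise ValueError("CMAF Header must start with ftyp followed by moov")
    return header
```

The returned bytes are what the receiver would append to its MSE / decoder pipeline, exactly as for a plain CMAF track.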
Compression of the catalog itself is out of scope for LOCMAF and
handled at the MOQT / MSF [MSF] layer.
6. Scope and Publisher Requirements
LOCMAF v0.2 targets the low-latency CMAF case: short CMAF fragments
composed of small CMAF chunks (often one sample per chunk),
optionally carrying CENC encryption metadata. To keep the wire
format minimal, the following constraints apply.
6.1. Mandatory preconditions
A LOCMAF publisher MUST ensure that:
1. *Single trak per moov.* The CMAF Header contains exactly one trak
box. Multi-track ISO BMFF files MUST be demuxed before LOCMAF
encoding.
2. *No key ID (KID) change within a CMAF chunk.* Key-identifier
transitions MUST align with fragment (and therefore chunk)
boundaries. This removes the need for sgpd / sbgp boxes in the
wire format.
3. *Restricted sample_flags populations.* Per-sample, default, and
first-sample sample_flags MUST populate only is_leading,
sample_depends_on, sample_is_depended_on, and
sample_is_non_sync_sample; the fields sample_has_redundancy,
sample_padding_value, and sample_degradation_priority MUST be
zero in the source. See Section 11.
4. *emsg version 1 only.* Any emsg boxes in the source MUST be
version 1 per CMAF §7.4.5. See Section 12.
If a source violates any of these, the publisher MUST NOT use LOCMAF
packaging for that track and MUST instead use plain CMAF or an
alternative packaging (e.g. an MSF eventtimeline companion track for
events). LOCMAF and plain CMAF tracks MAY coexist in the same
catalog under the same namespace.
6.2. Recommended source properties
The following are recommendations whose violation costs wire bytes
but does not break LOCMAF:
1. *Commensurate media timescales.* Choose a timescale so every
frame has an exact integer duration (e.g. 48 000 for 48 kHz AAC,
60 000 for 60000/1001 fps video).
2. *Stable trex defaults.* Keeping trex consistent across the stream
maximises what can be omitted from each chunk header.
6.3. tfdt.baseMediaDecodeTime contiguity
[CMAF] (§7.5.18) requires that each fragment's BMDT equal the previous
fragment's BMDT plus the sum of the previous fragment's sample
durations. The delta-chunk BMDT derivation defined in Section 10
relies on this property.
Re-anchoring is signalled in-band by emitting an absolute BMDT
override (see Section 10).
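The contiguity property can be illustrated with a short calculation. The function names next_bmdt and is_contiguous are hypothetical, and the numbers in the usage note are example values, not taken from a real stream.

```python
from typing import List


def next_bmdt(prev_bmdt: int, prev_sample_durations: List[int]) -> int:
    """BMDT expected for the fragment that follows the previous one."""
    return prev_bmdt + sum(prev_sample_durations)


def is_contiguous(prev_bmdt: int, prev_sample_durations: List[int],
                  bmdt: int) -> bool:
    """True when bmdt continues the timeline without a gap or overlap."""
    return bmdt == next_bmdt(prev_bmdt, prev_sample_durations)
```

For example, with a 48 000 timescale and two 1024-sample AAC frames in the previous fragment starting at BMDT 96 000, the next fragment is expected at 96 000 + 2 * 1024 = 98 048. A stream that jumps to any other value is not contiguous and would need the absolute BMDT override.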
6.4. Optional encoder modes
A LOCMAF encoder MAY operate in *strict cmf2 mode*, in which it
always emits the four tfhd defaults (sample duration, size, flags,
sample-description index) in the full chunk header even when they
match trex. This costs about 6 bytes per group but produces
reconstructed CMAF chunks that satisfy [CMAF] §7.7.3 fragment
self-decodability (each chunk is a single-chunk fragment in the
LOCMAF mapping). This mode does not need to be signalled, since it
does not affect wire compatibility between encoders and decoders.
7. Object Framing
7.1. Top-level header IDs
LOCMAF defines two top-level header IDs:
+====+===================+=====================================+
| ID | Symbol | Object kind |
+====+===================+=====================================+
| 23 | LocmafFullHeader | Full LOCMAF chunk (see Section 9) |
+----+-------------------+-------------------------------------+
| 25 | LocmafDeltaHeader | Delta LOCMAF chunk (see Section 10) |
+----+-------------------+-------------------------------------+
Table 2
Receivers MUST skip (and SHOULD log) objects with unrecognised
header_id values rather than abort. The MOQT object length terminates
the unknown object cleanly.
Future extensions adding new top-level object kinds use any
unassigned ID, allocated via the IANA registry defined in Section 17.
7.2. Object layout
+-----------------------------+
| header_id (varint) | top-level object kind
+-----------------------------+
| properties_length (varint) | byte length of the properties block
+-----------------------------+
| properties (variable) | sequence of (field_id, value) tuples
+-----------------------------+
| mdat raw payload (rest) | length = MOQT-object-len - (above)
+-----------------------------+
header_id and properties_length are variable-length integers using
the encoding defined by MoQ Transport [MOQT] for the session's MOQT
version.
The mdat payload is the contents of the CMAF mdat box — the sample
data, without the surrounding 8-byte size + 'mdat' box header. The
receiver reconstructs a standard mdat box by wrapping these bytes in
an 8-byte ISO BMFF header.
For event-only tracks (see Section 14) the mdat payload MAY be zero
bytes; the receiver reconstructs an empty mdat box (8-byte header
only).
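The layout above can be sketched as a small splitter. This is an illustration under one loud assumption: varints are decoded with the QUIC (RFC 9000) variable-length integer encoding, whereas this document defers to whatever encoding the session's MOQT version defines. The function names are hypothetical.

```python
import struct
from typing import Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint at pos; return (value, next_pos).

    ASSUMPTION: RFC 9000 encoding (top two bits select 1, 2, 4 or 8 bytes).
    """
    if pos >= len(buf):
        raise ValueError("truncated varint")
    first = buf[pos]
    length = 1 << (first >> 6)
    if pos + length > len(buf):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def split_object(payload: bytes) -> Tuple[int, bytes, bytes]:
    """Split a LOCMAF object into (header_id, properties, mdat payload)."""
    header_id, pos = read_varint(payload, 0)
    properties_length, pos = read_varint(payload, pos)
    end = pos + properties_length
    if end > len(payload):
        raise ValueError("properties block exceeds object length")
    # Everything after the properties block is the raw mdat payload.
    return header_id, payload[pos:end], payload[end:]


def wrap_mdat(sample_data: bytes) -> bytes:
    """Rebuild a standard mdat box: 4-byte size, 'mdat', then the data."""
    return struct.pack(">I4s", 8 + len(sample_data), b"mdat") + sample_data
```

Calling wrap_mdat(b"") yields the 8-byte empty mdat box used for event-only tracks. Payloads of 4 GiB or more would need the 64-bit box size form, which this sketch does not handle.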
7.3. Property encoding (parity rule)
The properties block is a flat sequence of (field_id, value) tuples.
Field IDs are MOQT varints. The value encoding is determined by the
parity of the ID:
* *Even ID:* scalar varint. The value is a single MOQT varint. No
length prefix. In delta chunks, the encoded value is a zigzag
varint (see Section 7.4) of the signed delta against the in-group
reference; in full chunks it is an absolute unsigned MOQT varint.
* *Odd ID:* length-prefixed bytes. The tuple is field_id |
value_length | value_bytes. The interpretation of the bytes is
per-field; varint-list fields concatenate elements (each element
is a zigzag varint (see Section 7.4) in delta context, an absolute
MOQT varint in full context), raw-bytes fields carry opaque
content. The one exception is the _signed list_
trunSampleCompositionTimeOffsets (ID 5): its elements are zigzag
varints (see Section 7.4) in BOTH full and delta context, because
composition time offsets are signed in trun version 1 (see
Section 8.1).
Field IDs MAY appear in any order; receivers MUST tolerate any
ordering. Encoders SHOULD emit IDs in ascending order to produce
deterministic wire bytes.
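The parity rule can be sketched as a parser for the properties block. As before, the RFC 9000 varint encoding is an assumption standing in for the session's MOQT varint, and the function names are hypothetical. The parser returns raw values only: even-ID scalars come back as unsigned integers and odd-ID fields as bytes, leaving zigzag decoding (in delta context) and per-field interpretation to the caller.

```python
from typing import List, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint (ASSUMPTION: RFC 9000 encoding) at pos."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    first = buf[pos]
    length = 1 << (first >> 6)
    if pos + length > len(buf):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def parse_properties(block: bytes) -> List[Tuple[int, Union[int, bytes]]]:
    """Parse a properties block into (field_id, value) tuples."""
    props = []
    pos = 0
    while pos < len(block):
        field_id, pos = read_varint(block, pos)
        if field_id % 2 == 0:
            # Even ID: a single varint, no length prefix.
            value, pos = read_varint(block, pos)
        else:
            # Odd ID: value_length followed by value_bytes.
            length, pos = read_varint(block, pos)
            if pos + length > len(block):
                raise ValueError("field %d overruns the block" % field_id)
            value = block[pos:pos + length]
            pos += length
        props.append((field_id, value))
    return props
```

The parser keeps fields in wire order and does not require ascending IDs, matching the tolerance required of receivers.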
7.4. Zigzag varint encoding
A _zigzag varint_ is a signed integer encoded as an unsigned MOQT
varint by interleaving non-negative and negative values so that
small-magnitude values of either sign occupy small unsigned values,
and thus the shortest varint forms.
For a signed 64-bit integer n, the mapping to its unsigned zigzag
representation z is:
encode:  z = (n << 1) ^ (n >> 63)    ; arithmetic right shift
                                     ; equivalently:
                                     ;   n >= 0:  z =  2 * n
                                     ;   n <  0:  z = -2 * n - 1

decode:  n = (z >> 1) ^ -(z & 1)     ; equivalently:
                                     ;   z even:  n =  z / 2
                                     ;   z odd:   n = -(z + 1) / 2
The first few mappings: 0↔0, -1↔1, 1↔2, -2↔3, 2↔4, -3↔5, 3↔6, ….
The encoded z is then serialised as an unsigned MOQT varint ([MOQT]);
decoders read the MOQT varint and apply the decode rule above.
This zigzag mapping is widely used in compact binary serialisation
formats; the description is included here for self-containment of the
LOCMAF wire format.
LOCMAF uses zigzag varints wherever a signed delta against the in-
group reference is written, namely in scalar even-ID fields (see
above) and per-element in varint-list odd-ID fields. Absolute values
in LocmafFullHeader are encoded as plain unsigned MOQT varints, not
zigzag — with one exception: the signed list
trunSampleCompositionTimeOffsets (ID 5) carries zigzag varints even
in a full chunk, because composition time offsets are signed in trun
version 1 (the common CMAF case: B-frames make the composition/
decode-time relation non-monotonic).
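The zigzag mapping can be sketched in a few lines. The function names are hypothetical; the arithmetic follows the encode and decode rules of this section. Python integers are unbounded, so the encoder masks the result to 64 bits to mirror the fixed-width definition.

```python
MASK64 = (1 << 64) - 1


def zigzag_encode(n: int) -> int:
    """Map a signed 64-bit integer to its unsigned zigzag form."""
    if not -(1 << 63) <= n < (1 << 63):
        raise ValueError("value does not fit in a signed 64-bit integer")
    # Python's >> on a negative int is an arithmetic shift, as required.
    return ((n << 1) ^ (n >> 63)) & MASK64


def zigzag_decode(z: int) -> int:
    """Map an unsigned zigzag value back to the signed integer."""
    return (z >> 1) ^ -(z & 1)
```

The encoded value is then serialised as an unsigned MOQT varint, which is not shown here.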
7.5. Full vs delta dispatch
The full-vs-delta distinction is signalled exclusively by the top-
level header_id, never by the MOQT object position within a group.
1. The first object of every MOQT group MUST be a LocmafFullHeader.
2. The encoder MAY emit a LocmafFullHeader at any object position
within a group, not only at object index 0. A mid-group full
chunk re-anchors the in-group reference for subsequent delta
chunks.
3. After receiving a LocmafFullHeader, the decoder MUST discard its
in-group delta state and treat the new full chunk as the
reference for any following LocmafDeltaHeader objects in the
group.
4. The receiver MUST dispatch on header_id alone. It MUST NOT infer
"full" from object index 0 or "delta" from object index > 0.
8. Field Reference
Field IDs are organised in blocks by source box:
+================+==============================+
| Range | Block |
+================+==============================+
| 1–16 | fields from moof child boxes |
+----------------+------------------------------+
| 18, 20, 22, 24 | prft fields |
+----------------+------------------------------+
| 23 | styp field |
+----------------+------------------------------+
| 25 | emsg list |
+----------------+------------------------------+
| 27 | delta deletion marker |
+----------------+------------------------------+
Table 3
8.1. Fields from moof child boxes
The fields in this block are drawn from boxes inside moof.traf (i.e.
from trun, tfhd, tfdt, or senc). The symbol prefix names the
containing box, and the field IDs are the same in both full and delta
chunks.
+====+==================================+===============+
| ID | Symbol | Kind |
+====+==================================+===============+
| 1 | trunSampleSizes | list |
+----+----------------------------------+---------------+
| 2 | tfhdSampleDescriptionIndex | scalar |
+----+----------------------------------+---------------+
| 3 | trunSampleDurations | list |
+----+----------------------------------+---------------+
| 4 | tfhdDefaultSampleDuration | scalar |
+----+----------------------------------+---------------+
| 5 | trunSampleCompositionTimeOffsets | signed list ‡ |
+----+----------------------------------+---------------+
| 6 | tfhdDefaultSampleSize | scalar |
+----+----------------------------------+---------------+
| 7 | trunSampleFlags | list † |
+----+----------------------------------+---------------+
| 8 | tfhdDefaultSampleFlags | scalar † |
+----+----------------------------------+---------------+
| 9 | sencInitializationVector | raw bytes |
+----+----------------------------------+---------------+
| 10 | tfdtBaseMediaDecodeTime | scalar |
+----+----------------------------------+---------------+
| 11 | sencSubsampleCount | list |
+----+----------------------------------+---------------+
| 12 | trunFirstSampleFlags | scalar † |
+----+----------------------------------+---------------+
| 13 | sencBytesOfClearData | list |
+----+----------------------------------+---------------+
| 14 | trunSampleCount | scalar |
+----+----------------------------------+---------------+
| 15 | sencBytesOfProtectedData | list |
+----+----------------------------------+---------------+
| 16 | sencPerSampleIVSize | scalar |
+----+----------------------------------+---------------+
Table 4
† Sample-flag fields (IDs 7, 8, 12) carry the 5-bit packed encoding
defined in Section 11.
‡ The _signed list_ (ID 5) carries zigzag varints (see Section 7.4)
per element in BOTH full and delta chunks, because composition time
offsets are signed in trun version 1. This is the sole odd-ID list
whose full-chunk elements are not plain unsigned varints.
The remaining name components map field-for-field onto the source box
(e.g. tfhdDefaultSampleDuration ↔ tfhd.default_sample_duration,
trunFirstSampleFlags ↔ trun.first_sample_flags). Indexing rules:
per-sample lists (IDs 1, 3, 5, 7, 11) carry the samples[i].* values
from their box; the per-subsample lists (IDs 13, 15) carry
senc.samples[i].subsamples[j].* flattened in chunk order; and
sencInitializationVector (9) is the concatenation of per-sample IVs,
each sencPerSampleIVSize bytes long.
The ID space is structurally aligned with the parity rule: every
default/scalar field has an even ID and every per-sample list field
has an odd ID, with sencInitializationVector (9) as the documented
exception (raw bytes rather than a list).
8.2. prft fields
The ProducerReferenceTimeBox (prft, [CMAF] §6.6.8, §7.3.2.4) carries
an NTP-style wall-clock anchor tied to a media time. In CMAF it MAY
precede any moof inside a CMAF chunk and applies to the addressable
media object whose moof it precedes. LOCMAF carries it per-chunk
through the following fields:
+====+==================+====================+==================+
| ID | Symbol | Source field | Kind |
+====+==================+====================+==================+
| 18 | prftNtpTimestamp | prft.ntp_timestamp | scalar (absolute |
| | | (NTP64) | in full; zigzag |
| | | | delta in delta) |
+----+------------------+--------------------+------------------+
| 20 | prftMediaTime | prft.media_time | scalar (absolute |
| | | | in full; zigzag |
| | | | delta in delta) |
+----+------------------+--------------------+------------------+
| 22 | prftVersion | prft.version | scalar (default |
| | | | 1) |
+----+------------------+--------------------+------------------+
| 24 | prftFlags | prft.flags (24-bit | scalar (default |
| | | FullBox flags) | 0) |
+----+------------------+--------------------+------------------+
Table 5
prftNtpTimestamp (ID 18) and prftMediaTime (ID 20) have no default
value, so a full LOCMAF chunk that carries prft MUST include both;
prftVersion (ID 22) and prftFlags (ID 24) default as given above and
MAY be omitted when they match their defaults. The receiver
reconstructs a prft box in the output chunk if and only if IDs 18 and
20 are present.
In a delta chunk, the scalar fields are zigzag-encoded deltas against
the most recent full chunk in the same group that itself carried prft
fields; a field left absent is unchanged from that reference. Deltas
are signed because both quantities can decrease in valid CMAF
streams: producer NTP clocks can be corrected backward, and
composition-time reordering with B-frames can place a chunk's
presentation anchor before the previous chunk's. If the previous
full chunk had no prft, an encoder that begins emitting prft mid-
group MUST use absolute encodings (i.e. re-anchor).
This presence signalling accommodates three producer patterns without
any further wire-format additions:
1. *None:* no prft field is ever emitted.
2. *Per-group:* absolute prft fields on the LocmafFullHeader only,
absent from subsequent LocmafDeltaHeader objects in the group.
3. *Per-chunk:* absolute prft fields on the LocmafFullHeader, delta
prft fields on subsequent LocmafDeltaHeader objects.
8.2.1. prftNtpTimestamp
The value is the full 64-bit NTP timestamp defined by ISO BMFF
(32-bit seconds since 1900-01-01 + 32-bit fraction) carried as a
varint scalar — absolute in a full chunk, zigzag delta in a delta
chunk. Full source precision is preserved so that downstream
consumers can measure producer-vs-receiver clock drift from the sub-
millisecond jitter around the mean inter-chunk period (a coarser
representation would round away the drift signal).
NTP64 is layout-equivalent to a Q32.32 fixed-point seconds value
(integer seconds in the upper 32 bits, binary fraction of a second in
the lower 32 bits). Encoders and receivers MUST treat the field as a
single 64-bit unsigned integer for the purposes of delta computation:
encoder: delta_i64 = (int64)(current_ntp64 - previous_ntp64)
wire = MOQT-varint(zigzag(delta_i64))
receiver: delta_i64 = unzigzag(MOQT-varint-decode(wire))
current_ntp64 = previous_ntp64 + (uint64)delta_i64
The carry from fraction into seconds at a second boundary is absorbed
by the 64-bit add naturally; there is no separate handling. The
receiver splits the resulting 64-bit value back into the
prft.ntp_timestamp seconds (upper 32) and fraction (lower 32) fields.
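
The delta arithmetic above can be sketched as follows. This is a
minimal illustration, not a normative implementation: the helper
names (zigzag, unzigzag, encode_ntp_delta, decode_ntp_delta,
split_ntp64) are hypothetical, and the MOQT varint serialization of
the zigzag value is assumed to happen elsewhere and is left out.

```python
MASK64 = (1 << 64) - 1


def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to unsigned (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return ((n << 1) ^ (n >> 63)) & MASK64


def unzigzag(z: int) -> int:
    """Inverse of zigzag()."""
    return (z >> 1) ^ -(z & 1)


def to_int64(u: int) -> int:
    """Reinterpret the low 64 bits of u as a two's-complement signed value."""
    u &= MASK64
    return u - (1 << 64) if u >= (1 << 63) else u


def encode_ntp_delta(current_ntp64: int, previous_ntp64: int) -> int:
    """Encoder side: zigzag value that would be handed to the varint writer."""
    return zigzag(to_int64(current_ntp64 - previous_ntp64))


def decode_ntp_delta(wire: int, previous_ntp64: int) -> int:
    """Receiver side: rebuild the 64-bit timestamp with a wrapping add."""
    return (previous_ntp64 + unzigzag(wire)) & MASK64


def split_ntp64(ntp64: int) -> tuple:
    """Split into (seconds, fraction) as stored in prft.ntp_timestamp."""
    return ntp64 >> 32, ntp64 & 0xFFFFFFFF
```

Because the timestamp is handled as one 64-bit integer, a delta that
crosses a second boundary needs no special case: the carry shows up
in the upper 32 bits after the add.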
The steady-state delta for common frame periods lands in the 4-byte
varint band: ~85.9 M units for a 20 ms (50 fps) gap, ~71.6 M for 60
fps, ~171.8 M for 25 fps. The 4-byte cost is the dominant per-chunk
overhead of the prft path; encoders that do not need drift-detection
precision MAY choose the per-group emission pattern instead of per-
chunk (see the producer patterns above).
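
The figures above can be reproduced with a short calculation. The
sketch below assumes that the varint size classes are those of the
RFC 9000 variable-length integer (6, 14, 30, or 62 usable bits in 1,
2, 4, or 8 bytes); that is an assumption about the MOQT varint, and
the function names are hypothetical.

```python
from fractions import Fraction

# (encoded size in bytes, usable payload bits); assumed RFC 9000 classes.
VARINT_CLASSES = ((1, 6), (2, 14), (4, 30), (8, 62))


def ntp64_units(period: Fraction) -> int:
    """Number of 2**-32 second units in one inter-chunk period."""
    return round(period * (1 << 32))


def varint_size(value: int) -> int:
    """Smallest size class that can hold the unsigned value."""
    for size, bits in VARINT_CLASSES:
        if value < (1 << bits):
            return size
    raise ValueError("value does not fit in 62 bits")


def steady_state_cost(fps: int) -> tuple:
    """Return (delta in NTP64 units, bytes on the wire) for one frame period."""
    delta = ntp64_units(Fraction(1, fps))
    # The zigzag encoding of a non-negative n is 2n.
    return delta, varint_size(delta << 1)
```

For 50, 60, and 25 fps this gives deltas of 85899346, 71582788, and
171798692 units, each of which fits the 4-byte class after zigzag
encoding.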
8.2.2. prftMediaTime
prftMediaTime carries the v1 prft.media_time field (an integer in the
track's mdhd.timescale ticks) directly. It is not re-scaled.
Steady-state deltas at the track timescale are small (e.g. 1024 ticks
for an AAC frame at 48 kHz, 1001 ticks for a 60000/1001 fps video
frame at timescale 60000) and fit in 1–2 varint bytes.
8.2.3. prftVersion
Defaults to 1 (the v1 prft form with a 64-bit media_time). Encoders
SHOULD omit the field; receivers that find the field absent
reconstruct version 1.
8.2.4. prftFlags
Carries the 24-bit FullBox flags field defined in ISO/IEC 14496-12
for ProducerReferenceTimeBox. Known values include 0 (wall-clock
anchor at the encoder, the common case), 1, 2, 4, 8, and 24
(combinations of the lower bits identifying inband-event semantics
and producer scope). Encoders MAY omit the field when its value is
0; receivers that find the field absent reconstruct flags = 0. The
on-wire encoding is a varint scalar; LOCMAF does not constrain the
value beyond what [ISOBMFF] defines.
8.2.5. prft.reference_track_ID
Not carried on the wire. The receiver reconstructs it as the
track_ID of the single trak in the CMAF Header's moov (see
Section 6).
8.3. styp fields
Fields whose source is the FileTypeBox payload of a styp box
([ISOBMFF], §4.3) at the top of a CMAF chunk:
+====+===============+==================+=========================+
| ID | Symbol | Kind | Notes |
+====+===============+==================+=========================+
| 23 | stypBrandList | raw bytes (odd, | Concatenation of 4-byte |
| | | length-prefixed) | FourCC codes: |
| | | | major_brand followed by |
| | | | each compatible_brand. |
| | | | Length MUST be a |
| | | | positive multiple of 4. |
+----+---------------+------------------+-------------------------+
Table 6
LOCMAF does not carry styp.minor_version on the wire. Per [CMAF]
§7.2, minor_version is 0 for any structural CMAF brand used as
major_brand, so the reconstructed styp.minor_version is always 0.
styp fields MAY appear only in LocmafFullHeader; encoders MUST NOT
emit them in LocmafDeltaHeader. Per [CMAF] §7.3.3.1, a styp inside
an addressable media object is ignored by players, so the delta path
has no use for it.
If a LocmafFullHeader carries no stypBrandList, the receiver emits no
styp box in the reconstructed CMAF chunk. CMAF ([CMAF] §7.3.3.1)
does not require a styp for decoding or playback; players that need
brand information consult the CMAF Header's ftyp.
8.4. emsg field
+====+==========+==================+===============================+
| ID | Symbol | Kind | Notes |
+====+==========+==================+===============================+
| 25 | emsgList | raw bytes (odd, | A self-delimited |
| | | length-prefixed) | concatenation of v1 emsg |
| | | | records in CMAF order. See |
| | | | Section 12 for record format. |
+----+----------+------------------+-------------------------------+
Table 7
The presence of emsgList is independent of LocmafFullHeader vs
LocmafDeltaHeader. Both kinds carry the full list when present;
there is no delta encoding for emsg.
8.5. Delta deletion marker
+==+=======================+==================+====================+
|ID| Symbol | Kind | Notes |
+==+=======================+==================+====================+
|27| deltaDeletedLocmafIDs | list (odd, | List of field IDs |
| | | length-prefixed) | removed since the |
| | | | previous moof in |
| | | | the same group. |
| | | | Used only in |
| | | | LocmafDeltaHeader. |
+--+-----------------------+------------------+--------------------+
Table 8
9. Full LOCMAF Chunk Encoding
A LocmafFullHeader carries an absolute encoding of one CMAF chunk's
head: an optional styp, an optional prft, zero or more emsg boxes
that preceded the moof in the source, and the moof itself. The mdat
payload follows the property block, unchanged.
9.1. Emission rules for moof child-box fields
The encoder walks the source moof (paired with the catalog's moov)
and emits each moof field only when the value cannot be derived from
the moov's trex defaults:
tfhdSampleDescriptionIndex
tfhd.HasSampleDescriptionIndex() AND value ≠
trex.default_sample_description_index
tfhdDefaultSampleDuration
tfhd.HasDefaultSampleDuration() AND value ≠
trex.default_sample_duration
tfhdDefaultSampleSize
all samples in the chunk have the same size AND that size ≠
trex.default_sample_size AND sample_count > 1
tfhdDefaultSampleFlags
tfhd.HasDefaultSampleFlags() AND value ≠ trex.default_sample_flags
tfdtBaseMediaDecodeTime
always
trunSampleCount
always
trunFirstSampleFlags
trun.HasFirstSampleFlags()
trunSampleSizes
trun.HasSampleSize() AND sample sizes are not all equal AND
sample_count > 1; the list carries sample_count − 1 values (first
n−1 in chunk order)
trunSampleDurations
trun.HasSampleDuration()
trunSampleCompositionTimeOffsets
trun.HasSampleCompositionTimeOffset()
trunSampleFlags
trun.HasSampleFlags()
sencPerSampleIVSize
senc present AND per_sample_iv_size ≠
tenc.default_per_sample_iv_size
sencInitializationVector
senc present AND per_sample_iv_size > 0 AND samples carry IVs (see
also Section 13.4)
sencSubsampleCount
senc present AND samples carry subsample maps
sencBytesOfClearData
same as sencSubsampleCount
sencBytesOfProtectedData
same as sencSubsampleCount
In strict cmf2 mode (see Section 6), tfhdDefaultSampleDuration,
tfhdDefaultSampleSize, tfhdDefaultSampleFlags, and
tfhdSampleDescriptionIndex are emitted unconditionally on the full
chunk, even when they match trex.
9.1.1. Sample-size derivation
Let n = trunSampleCount and let P be the chunk's mdat-payload length
(MOQT object length minus the framing already consumed). The
receiver MUST derive sample sizes as follows:
* If trunSampleSizes (ID 1) is present, it carries exactly n − 1
values. sample_size[i] = listed[i] for i in [0, n−1), and
sample_size[n−1] = P − sum(listed). The receiver MUST NOT consult
tfhdDefaultSampleSize, trex.default_sample_size, or any other
source for sample sizes in this chunk.
* Else if tfhdDefaultSampleSize (ID 6) is present, all n samples
have that size.
* Else if trex.default_sample_size is non-zero, all n samples have
that size.
* Else, when n == 1, the lone sample's size is P. When n > 1 and no
size information is available the chunk is malformed and the
receiver MUST reject it.
Correspondingly, when sample_count == 1 both trunSampleSizes and
tfhdDefaultSampleSize MUST be omitted — the single sample's size is
always P. When sample_count > 1 with uniform sizes the encoder MUST
emit tfhdDefaultSampleSize (subject to the trex.default_sample_size
equality rule) and MUST NOT emit trunSampleSizes; when sizes vary the
encoder MUST emit trunSampleSizes with exactly n − 1 entries and MUST
NOT emit tfhdDefaultSampleSize. Omitting the last sample size shaves
one varint per chunk; using the default for uniform-size tracks
(common for fixed-bitrate audio, e.g., AC-3) collapses n − 1 varints
to one.
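
The receiver-side derivation in this section can be sketched as a
single function. This is an illustration under stated assumptions:
the function and parameter names are hypothetical, None stands for
"field absent", and the n == 0 branch and the negative-remainder
check are additions that this section does not spell out.

```python
from typing import List, Optional, Sequence


def derive_sample_sizes(
    n: int,
    payload_len: int,
    trun_sample_sizes: Optional[Sequence[int]] = None,
    tfhd_default_sample_size: Optional[int] = None,
    trex_default_sample_size: int = 0,
) -> List[int]:
    """Derive all n sample sizes for one chunk with mdat payload length P."""
    if n == 0:
        return []  # assumption: an event-only chunk carries no samples
    if trun_sample_sizes is not None:
        if len(trun_sample_sizes) != n - 1:
            raise ValueError("trunSampleSizes must carry exactly n - 1 values")
        last = payload_len - sum(trun_sample_sizes)
        if last < 0:
            raise ValueError("listed sizes exceed the mdat payload length")
        return list(trun_sample_sizes) + [last]
    if tfhd_default_sample_size is not None:
        return [tfhd_default_sample_size] * n
    if trex_default_sample_size != 0:
        return [trex_default_sample_size] * n
    if n == 1:
        return [payload_len]
    raise ValueError("malformed chunk: no size information for n > 1")
```

For example, with n = 3, P = 1000, and trunSampleSizes = [300, 500],
the last sample size is derived as 200.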
9.2. Emission rules for prft / styp / emsg fields
stypBrandList is emitted iff the source CMAF chunk preceded its moof
with a styp box. When omitted, the receiver produces no styp in the
reconstructed chunk.
prft fields (18, 20, 22, 24) are emitted iff the source CMAF chunk
preceded its moof with a prft box, subject to the per-field defaults
described in Section 8.2 (encoders MAY omit prftVersion when it is 1
and prftFlags when it is 0). All emitted values are absolute.
emsgList is emitted iff the source CMAF chunk preceded its moof with
one or more v1 emsg boxes. See Section 12 for the record format.
10. Delta LOCMAF Chunk Encoding
A LocmafDeltaHeader carries only the differences between the current
CMAF chunk's head and the previous chunk's head in the same MOQT
group, where the chain of previous values starts at the most recent
full chunk (see Section 10.1).
10.1. Field value encoding
Each emitted field's value is interpreted relative to its kind:
+===========+====================================+==================+
| Kind | Wire encoding | Reconstruction |
+===========+====================================+==================+
| scalar | zigzag varint (see Section 7.4) of | current = |
| (even ID) | current_value − previous_value | previous + |
| | | delta |
+-----------+------------------------------------+------------------+
| list (odd | zigzag varint (see Section 7.4) | element-wise |
| ID) | per element, concatenated; element | sum with the |
| | delta = current[i] − previous[i] | previous list |
+-----------+------------------------------------+------------------+
| raw bytes | full new bytes verbatim | overwrite |
| (odd ID) | | previous bytes |
+-----------+------------------------------------+------------------+
Table 9
The "previous value" for each field is the effective value used in
the reconstruction of the previous LOCMAF chunk in the same group
(or, after a mid-group LocmafFullHeader, the previous chunk starting
from that re-anchor).
10.1.1. List length changes
The length of a per-sample list field in the current chunk is
trunSampleCount, which is always emitted (see Section 9). This
covers trunSampleDurations (ID 3), trunSampleCompositionTimeOffsets
(ID 5), trunSampleFlags (ID 7), and sencSubsampleCount (ID 11).
trunSampleSizes (ID 1) is the documented exception: it carries
trunSampleCount − 1 entries (the last sample size is derived from the
mdat-payload length per Section 9.1.1). The per-subsample list
fields (IDs 13, 15) have total length equal to
sum(sencSubsampleCount[i]) over the new sample count. Consequently
the receiver knows len(current) for every list field before parsing
the field's payload bytes.
When len(current) ≠ len(previous) the delta rule extends as follows:
* For indices i in [0, min(len(current), len(previous))): the wire
carries zigzag(current[i] − previous[i]) and the receiver
reconstructs current[i] = previous[i] + delta[i].
* For indices i in [len(previous), len(current)) (current longer
than previous): the wire carries zigzag(current[i]), i.e. the
absolute value, equivalent to treating the missing previous entry
as 0. The receiver reconstructs current[i] = delta[i].
* For indices i in [len(current), len(previous)) (current shorter
than previous): no bytes are emitted for these positions. The
receiver simply truncates to len(current).
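
The three cases above collapse into one rule if a missing previous
entry is treated as 0. A minimal sketch, with hypothetical function
names and with the zigzag and varint steps left out so that the
values shown are the signed deltas themselves:

```python
from typing import List, Sequence


def encode_list_delta(current: Sequence[int], previous: Sequence[int]) -> List[int]:
    """Signed deltas for a list field; one entry per element of current."""
    deltas = []
    for i, value in enumerate(current):
        base = previous[i] if i < len(previous) else 0  # missing entry counts as 0
        deltas.append(value - base)
    return deltas


def decode_list_delta(
    deltas: Sequence[int], previous: Sequence[int], current_len: int
) -> List[int]:
    """Rebuild the current list; current_len is known before parsing."""
    if len(deltas) != current_len:
        raise ValueError("delta count does not match the expected list length")
    return [
        (previous[i] if i < len(previous) else 0) + deltas[i]
        for i in range(current_len)
    ]
```

When the current list is shorter than the previous one, the loop
simply never visits the trailing previous entries, which matches the
truncation rule.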
The deletion list deltaDeletedLocmafIDs (ID 27) is an exception: it
carries the set of field IDs deleted in _this_ chunk, encoded as
plain unsigned varints (not zigzag, not deltas against a "previous
deletion list"). Its length is determined by the field's byte-length
prefix; the receiver reads unsigned varints until the prefix is
exhausted.
10.2. tfdtBaseMediaDecodeTime is normally derived
The receiver derives the new BMDT as previous_bmdt +
sum(previous_sample_durations). This is safe because CMAF requires
the decode timeline to be contiguous — each fragment's
baseMediaDecodeTime equals the previous fragment's plus the sum of
its sample durations (see Section 6.3, which carries the CMAF
reference and its normative keyword). When the source BMDT
nevertheless diverges from this derivation (audio pre-roll, splicing,
stream re-anchor), the encoder MUST emit tfdtBaseMediaDecodeTime (ID
10) in the delta chunk as an absolute unsigned varint (i.e. the same
encoding as in a full chunk, not a zigzag delta). The receiver
checks for the field first and uses its value when present.
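
The receiver rule in this section amounts to a single branch. The
sketch below uses hypothetical names and represents an absent
tfdtBaseMediaDecodeTime field as None:

```python
from typing import Optional, Sequence


def next_bmdt(
    previous_bmdt: int,
    previous_sample_durations: Sequence[int],
    explicit_bmdt: Optional[int] = None,
) -> int:
    """Return the baseMediaDecodeTime of the current chunk."""
    if explicit_bmdt is not None:
        # ID 10 present in the delta chunk: absolute value, used as-is.
        return explicit_bmdt
    # Contiguous timeline: previous BMDT plus the previous chunk's durations.
    return previous_bmdt + sum(previous_sample_durations)
```

For a previous chunk at BMDT 90000 holding three samples of 3000
ticks each, the derived BMDT is 99000.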
10.3. Deletions
Delta encoding in LOCMAF is additive: a field absent from a
LocmafDeltaHeader is treated as unchanged from the previous chunk.
This compresses the common case (most moof fields stay stable from
chunk to chunk) but on its own gives an encoder no way to signal that
a field which was present in the previous chunk is genuinely gone in
the current one. The deletion marker provides that signal.
The motivating case is trunFirstSampleFlags (ID 12). A SAP-1 random-
access chunk emits this field to flag its first sample as a sync
sample; the immediately following non-sync chunk must say "this
override no longer applies" so the receiver falls back to
trex.default_sample_flags for the first sample.
The deltaDeletedLocmafIDs field (ID 27) carries a varint list of
field IDs that were present in the previous chunk but are no longer
present. The decoder applies deletions before applying deltas.
Example: when the first chunk of a group carried trunFirstSampleFlags
(a SAP-1 sync sample) and the second chunk does not, the second chunk
emits ID 27 with a one-element list containing ID 12. The typical
cost is two bytes — one length-prefixed list with one field ID —
versus the tens to hundreds of bytes of re-anchoring.
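
The order of operations (deletions first, then deltas) can be
sketched for scalar fields as follows. The state is modelled as a
hypothetical dictionary from field ID to effective value; treating a
scalar that was absent from the previous state as having base 0 is
an assumption of this sketch, not a rule stated here.

```python
from typing import Dict, Iterable


def apply_delta(
    previous: Dict[int, int],
    deleted_ids: Iterable[int],
    scalar_deltas: Dict[int, int],
) -> Dict[int, int]:
    """Return the effective scalar state after one LocmafDeltaHeader."""
    current = dict(previous)  # absent fields are unchanged
    for field_id in deleted_ids:  # deletions are applied first
        current.pop(field_id, None)
    for field_id, delta in scalar_deltas.items():
        current[field_id] = current.get(field_id, 0) + delta
    return current


# First chunk: tfhdDefaultSampleFlags (ID 8) = 3, trunFirstSampleFlags (ID 12) = 4.
state = {8: 3, 12: 4}
# Second chunk: ID 27 lists ID 12; no other field changes.
state = apply_delta(state, deleted_ids=[12], scalar_deltas={})
```

After the second chunk the state holds only ID 8, so the first
sample no longer has a flags override.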
10.4. Empty delta
An empty delta payload (properties_length == 0) is valid and means
"no field changed since the previous chunk." This is the steady-
state case for sample-level fragmented streams. The on-wire LOCMAF
object reduces to LocmafDeltaHeader | properties_length=0 | mdat,
which is two bytes plus the mdat.
10.5. prft and emsg in delta chunks
prft fields use the encoding above (scalar values become zigzag
deltas against the in-group reference). emsgList carries the full new
event list, with no delta encoding — see Section 12.
stypBrandList (ID 23) MUST NOT appear in a LocmafDeltaHeader.
11. Compact sample_flags Encoding
ISO BMFF sample_flags ([ISOBMFF] §8.8.3.1) is a 32-bit bit-packed
field, but the bits that vary in CMAF content occupy only five of the
32. LOCMAF encodes the five varying bits in a 5-bit transport value
to fit them in a single 6-bit-payload MOQT varint (leaving room for
the zigzag sign bit in delta context).
11.1. Wire encoding
The 5-bit packed value (LSB first):
+=====+===========================+
| bit | source field |
+=====+===========================+
| 0 | sample_is_non_sync_sample |
+-----+---------------------------+
| 1–2 | sample_depends_on |
+-----+---------------------------+
| 3–4 | sample_is_depended_on |
+-----+---------------------------+
Table 10
trunSampleFlags (ID 7) carries this 5-bit value (range 0–31) per
sample. tfhdDefaultSampleFlags (ID 8) and trunFirstSampleFlags (ID
12) carry the same 5-bit transport.
In a full chunk the field is an unsigned varint scalar (or list). In
a delta chunk it is a signed zigzag varint (or list of zigzag
deltas).
11.2. Reconstruction
The receiver expands the 5-bit transport into a 32-bit sample_flags:
sample_flags = (is_depended_on << 22)
| (depends_on << 24)
| (non_sync << 16)
is_leading, sample_has_redundancy, sample_padding_value, and
sample_degradation_priority are reconstructed as zero.
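
The packing of Section 11.1 and the expansion above can be sketched
as a pair of functions. The names are hypothetical; the bit
positions are those given in Table 10 and in the expression above.

```python
def pack_sample_flags(sample_flags: int) -> int:
    """Pack a 32-bit sample_flags value into the 5-bit transport value."""
    non_sync = (sample_flags >> 16) & 0x1
    depends_on = (sample_flags >> 24) & 0x3
    is_depended_on = (sample_flags >> 22) & 0x3
    carried = (depends_on << 24) | (is_depended_on << 22) | (non_sync << 16)
    if sample_flags != carried:
        # Some bit outside the five carried bits is set.
        raise ValueError("sample_flags uses bits that LOCMAF does not carry")
    return non_sync | (depends_on << 1) | (is_depended_on << 3)


def unpack_sample_flags(packed: int) -> int:
    """Expand the 5-bit transport value back into 32-bit sample_flags."""
    if not 0 <= packed <= 31:
        raise ValueError("packed value out of the 5-bit range")
    non_sync = packed & 0x1
    depends_on = (packed >> 1) & 0x3
    is_depended_on = (packed >> 3) & 0x3
    return (is_depended_on << 22) | (depends_on << 24) | (non_sync << 16)
```

As a check, the widely used sync-sample value 0x02000000 packs to 4,
and the non-sync value 0x01010000 packs to 3.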
11.3. Encoder constraint
The encoder MUST validate that the source's sample_flags populates
only the five bits listed above (see Section 6). Source content that
uses other bits MUST be carried via plain CMAF packaging instead.
12. emsg Round-Trip
The DASH event message box ([DASH] §5.10.3) carries
application-defined timed metadata. CMAF ([CMAF] §7.4.5) mandates
version 1 emsg boxes for in-band CMAF event messages.
LOCMAF carries emsg as a length-prefixed list of records inside the
chunk header at field ID 25. Encoders MUST emit only v1 records.
12.1. Relationship to MSF eventtimeline
New MOQT deployments SHOULD use a companion MSF eventtimeline track
[MSF] for event metadata rather than inline emsg. LOCMAF's emsg
support exists primarily to preserve inline events in sources
transcoded from DASH or CMAF Ingest [DASH-IF-INGEST].
12.2. Record format
Each record inside emsgList is a compact encoding of a v1 emsg
payload:
record = scheme_id_uri (varint length + UTF-8 bytes)
| value (varint length + UTF-8 bytes)
| timescale (varint; 0 = "use track mdhd.timescale")
| presentation_time (encoding depends on timescale; below)
| event_duration (varint, `timescale` ticks; 0 = unknown)
| id (varint)
| message_data (varint length + opaque bytes)
reference_track_id and version are implicit (the track this chunk
belongs to, and 1 respectively).
12.2.1. timescale default
timescale == 0 means "use the track's mdhd.timescale." Receivers
write the track's mdhd.timescale into the reconstructed
emsg.timescale field when the record carried 0. Encoders MAY emit a
non-zero timescale only when the source's emsg.timescale actually
differs from the track timescale.
12.2.2. presentation_time encoding
The encoding of presentation_time depends on the record's timescale
field:
* *timescale == 0 (track-timescale, delta encoding):* the field is
encoded as a signed zigzag varint delta against the chunk's BMDT
(in track-timescale ticks). The reconstructed value is chunk_bmdt
+ delta.
* *timescale != 0 (foreign-timescale, absolute encoding):* the field
is encoded as an unsigned varint carrying the absolute
emsg.presentation_time directly. No delta is applied because the
BMDT and the event time would live on different axes.
The receiver discriminates by inspecting the record's timescale
field, which is encoded earlier in the record.
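
The two branches can be sketched as follows. The function names are
hypothetical, and the returned integers are the values that would be
handed to (or obtained from) the varint layer, which is not shown.

```python
def zigzag(n: int) -> int:
    """Signed to unsigned mapping: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def unzigzag(z: int) -> int:
    """Inverse of zigzag()."""
    return (z >> 1) ^ -(z & 1)


def encode_presentation_time(
    presentation_time: int, record_timescale: int, chunk_bmdt: int
) -> int:
    """Value to write for presentation_time in one emsg record."""
    if record_timescale == 0:
        # Track timescale: signed delta against the chunk's BMDT.
        return zigzag(presentation_time - chunk_bmdt)
    # Foreign timescale: absolute value, no delta.
    return presentation_time


def decode_presentation_time(wire: int, record_timescale: int, chunk_bmdt: int) -> int:
    """Recover emsg.presentation_time from the value read off the wire."""
    if record_timescale == 0:
        return chunk_bmdt + unzigzag(wire)
    return wire
```

An event 1000 ticks after a chunk whose BMDT is 479000 is thus
carried as zigzag(1000) = 2000 rather than as the absolute value
480000.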
13. DRM Box Round-Trip
LOCMAF preserves the per-sample CENC [CENC] metadata needed for EME-
based decryption.
13.1. Supported schemes
LOCMAF v0.2 supports the following CENC [CENC] protection schemes,
identified by the tenc.default_isProtected = 1 track defaults and the
four-character scheme_type in the surrounding schm box:
* cenc: AES-128-CTR full-sample encryption. Per-sample
initialization vectors are big-endian counters advanced by the
number of 16-byte AES blocks covering the per-sample
encrypted-byte total ([CENC], §10.1); LOCMAF carries these IVs via
sencInitializationVector and permits omission under the counter
rule of Section 13.4.
* cbcs: AES-128-CBC subsample pattern encryption with a constant
initialization vector taken from tenc.default_constant_iv ([CENC],
§10.4); no per-sample IV appears in senc, and the pattern
(default_crypt_byte_block / default_skip_byte_block) is carried
verbatim with the CMAF Header.
The cbc1 and cens schemes are out of scope; sources using them MUST
fall back to plain CMAF packaging.
13.2. Supported boxes
+======+=============+==============================================+
| Box | Where | LOCMAF treatment |
| | in CMAF | |
+======+=============+==============================================+
| senc | inside | per-sample IVs and subsample maps carried |
| | traf | via moof field IDs 9, 11, 13, 15, 16. |
+------+-------------+----------------------------------------------+
| saio | inside | not carried on the wire; recomputed by the |
| | traf | receiver to point at the reconstructed senc. |
+------+-------------+----------------------------------------------+
| saiz | inside | not carried on the wire; reconstructed from |
| | traf | per-sample IV size and subsample counts. |
+------+-------------+----------------------------------------------+
| tenc | inside | carried verbatim inside the CMAF Header. |
| | sinf in | |
| | moov | |
+------+-------------+----------------------------------------------+
Table 11
13.3. Unsupported boxes
The following CMAF DRM boxes are not supported by LOCMAF v0.2.
Sources that require them MUST use plain CMAF packaging instead:
+================+================================================+
| Box | Reason for exclusion |
+================+================================================+
| sgpd/sbgp | Mid-fragment key rotation via seig sample |
| | groups is out of scope; KID changes MUST align |
| | with fragment boundaries (see Section 6). |
+----------------+------------------------------------------------+
| pssh (per- | License-acquisition information is signalled |
| fragment) | via the CMSF contentProtections mechanism, per |
| | [CMAF] §7.4.3. |
+----------------+------------------------------------------------+
| subs | Sub-sample information for image subtitle |
| | profiles (e.g. im1i) is out of scope. |
+----------------+------------------------------------------------+
Table 12
13.4. CENC IV counter prediction (optional)
For the cenc scheme, ISO/IEC 23001-7 §9.6 specifies that the per-
sample initialization vector is a big-endian counter advanced sample-
by-sample by exactly ceil(total_encrypted_bytes_in_sample / 16) AES
blocks. Both endpoints already see the per-sample encrypted-byte
totals (sencBytesOfProtectedData) and the IV anchor (carried on the
first full chunk of the track), so the receiver can derive every
subsequent per-sample IV deterministically.
LOCMAF v0.2 permits encoders to omit sencInitializationVector (ID 9)
when the source follows this counter rule, and requires receivers to
support derivation:
* An encoder MAY omit sencInitializationVector on full and delta
chunks when every per-sample IV in the chunk matches the value
derived by the CENC counter rule from the previous chunk's IVs and
sencBytesOfProtectedData totals.
* A receiver MUST be able to derive per-sample IVs from the counter
rule. When sencInitializationVector is absent and the scheme is
cenc, the receiver advances the running IV counter and uses the
derived value.
* When the source diverges from the counter rule (random IVs, mid-
track counter restart, or any non-conformant strategy), the
encoder MUST emit sencInitializationVector absolutely on every
affected sample.
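
The counter rule described above can be sketched as follows. This is
an illustration only: the function names are hypothetical, the IV is
treated as one big-endian integer of per_sample_iv_size bytes (an
assumption about how the rule is applied), and overflow is rejected
rather than wrapped, in line with the check required in Section 16.

```python
from typing import List, Sequence


def next_iv(iv: bytes, encrypted_bytes: int) -> bytes:
    """Advance the IV by ceil(encrypted_bytes / 16) AES blocks."""
    blocks = (encrypted_bytes + 15) // 16
    counter = int.from_bytes(iv, "big") + blocks
    if counter >= 1 << (8 * len(iv)):
        raise ValueError("IV counter advanced past the per-sample IV size")
    return counter.to_bytes(len(iv), "big")


def derive_ivs(anchor_iv: bytes, encrypted_totals: Sequence[int]) -> List[bytes]:
    """IVs for consecutive samples, given the first IV and per-sample totals."""
    if not encrypted_totals:
        return []
    ivs = [anchor_iv]
    for total in encrypted_totals[:-1]:
        ivs.append(next_iv(ivs[-1], total))
    return ivs
```

With an anchor IV ending in 0x01 and encrypted-byte totals of 33,
16, and 5, the derived IVs end in 0x01, 0x04, and 0x05.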
The cbcs scheme uses a constant IV from tenc.default_constant_iv
carried once via the CMAF Header. There is no per-sample IV in the
moof in the first place, so counter prediction does not apply to
cbcs.
14. Event-Only Tracks and CMAF Ingest Compatibility
DASH-IF Ingest [DASH-IF-INGEST] defines a CMAF-based push interface
for live encoders. One of its track shapes is the sparse event-only
track: a CMAF track that carries no media samples (or samples of zero
size) and exists purely to deliver timed events via emsg boxes
attached to its chunks.
LOCMAF supports event-only tracks without any wire-format extension.
A LocmafFullHeader for an event-only group sets trunSampleCount = 0,
carries tfdtBaseMediaDecodeTime and emsgList, and is followed by an
empty mdat payload. Subsequent chunks in the same group use
LocmafDeltaHeader with the absolute-BMDT override pattern (see
Section 10) because the zero sample-count produces no derivation
increment.
Two encoder strategies are valid:
1. *Absolute BMDT per chunk.* The delta chunk emits
tfdtBaseMediaDecodeTime explicitly. Costs an extra varint per
chunk; recommended for sparse event-only tracks.
2. *Synthetic per-chunk sample.* The encoder sets sample_count = 1
with a default_sample_duration equal to the intended per-chunk
advancement and a zero-size sample. BMDT derivation works
without override. Matches how DASH-IF Ingest commonly shapes
sparse metadata tracks (urim, stpp).
For new MOQT deployments, MSF eventtimeline [MSF] is the preferred
mechanism for event metadata. LOCMAF event-only tracks are intended
for gateways that transit-relay CMAF Ingest content unchanged across
MOQT.
15. Receiver Reconstruction
A receiver maintains, per subscribed track:
1. The track's CMAF Header (from the catalog), parsed for the single
trak's track_ID, trex defaults, tenc defaults, and
mdhd.timescale.
2. An in-group "previous chunk" state, populated from the most
recent LocmafFullHeader and updated by each subsequent
LocmafDeltaHeader. Discarded on group boundaries and on mid-
group LocmafFullHeader re-anchors (see Section 7.5).
For each LOCMAF object, the receiver:
1. Reads header_id and dispatches per Section 7.5.
2. Reads properties_length and the property block.
3. Decodes property tuples per the parity rule and the per-field
rules above.
4. Applies the decoded fields to the previous-chunk state to produce
the absolute field values for the current chunk.
5. Reconstructs the CMAF chunk:
1. Synthesises a styp box from stypBrandList when present; omits
it otherwise.
2. Synthesises a prft box from any prft fields present; omits it
when no prft field is present.
3. Synthesises one or more v1 emsg boxes from emsgList when
present; omits them otherwise.
4. Synthesises the moof box (mfhd, traf with tfhd, tfdt, trun,
optionally senc/saio/saiz) from the moof fields and the CMAF
Header's trex / tenc defaults. The LOCMAF wire format
carries neither the track_ID nor the tfhd flags, so the
receiver supplies them from the CMAF Header and from the CMAF
rules:
* The synthesised tfhd.track_ID MUST be set to the track_ID
of the single trak in the CMAF Header's moov (item 1).
* The synthesised tfhd MUST set the default-base-is-moof
flag (tf_flags 0x020000) and MUST NOT set base-data-
offset-present, as required by [CMAF]; sample data offsets
are therefore relative to the start of the containing moof
([ISOBMFF]).
5. Wraps the mdat payload bytes in an 8-byte mdat box header.
6. Feeds the reconstructed chunk to the local CMAF reader / MSE
pipeline.
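
The mdat wrapping step of the reconstruction above can be sketched
as follows. The function name is hypothetical, and the sketch
assumes the payload fits the 32-bit box size field, since the 8-byte
header has no room for a 64-bit largesize.

```python
import struct


def wrap_mdat(payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte ISO BMFF box header of type mdat."""
    size = 8 + len(payload)  # the box size includes the header itself
    if size > 0xFFFFFFFF:
        raise ValueError("payload too large for a 32-bit box size")
    return struct.pack(">I4s", size, b"mdat") + payload
```

An empty payload, as in the event-only case of Section 14, yields
the 8-byte box 00 00 00 08 'm' 'd' 'a' 't'.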
The reconstructed CMAF chunk is *functionally equivalent* to the
source chunk: every sample has the same size, decode time,
presentation time, flags, and CENC metadata, and the chunk feeds an
MSE / EME pipeline identically to the source. Byte-level identity
with the source moof is not preserved. Implementations MAY differ
in:
* The exact ordering of saio / saiz / senc and other generated boxes
inside the reconstructed traf, provided the ordering is legal
CMAF.
* The trun.tr_flags packing chosen on reconstruction.
* Whether tfhd defaults that match trex appear in the reconstructed
tfhd (they SHOULD when the encoder ran in strict cmf2 mode; they
MAY otherwise).
A receiver MUST NOT depend on byte-level identity with the source
CMAF stream. A downstream consumer that needs a specific CMAF byte
layout MUST repackage the output of the LOCMAF receiver to produce
the desired form.
16. Security Considerations
LOCMAF is a compression layer over CMAF media and does not introduce
new authentication or confidentiality mechanisms. It is intended to
be used over MOQT [MOQT], which inherits QUIC's transport security.
Per-sample encryption metadata defined by [CENC] is preserved through
the LOCMAF round-trip; LOCMAF neither weakens nor strengthens the
underlying DRM scheme.
A receiver MUST validate that reconstructed moof, prft, and emsg
boxes are well-formed before passing them to a media pipeline.
Malformed deltas could otherwise be used to construct ISO BMFF
[ISOBMFF] boxes with inconsistent field lengths. Specifically:
* The receiver MUST bound the size of any reconstructed per-sample
list against trunSampleCount.
* The receiver MUST verify that the sum of reconstructed
sencBytesOfClearData and sencBytesOfProtectedData for each sample
equals the sample's size.
* The receiver MUST verify that the CENC-IV counter, if derived,
does not advance past the per-sample-IV-size range.
The CENC IV counter-prediction optimisation (Section 13.4) does not
disclose key material and produces the same IV stream a conformant
encoder would have transmitted; it does not weaken CENC.
Replay considerations within a MOQT group are inherited from MOQT —
LOCMAF adds no new replay attack surface.
17. IANA Considerations
This document does not register the "locmaf" packaging value or the
locmafVersion catalog field with IANA (see Section 17.1); it does
request a registry for the LOCMAF top-level header IDs (see
Section 17.2).
17.1. Catalog packaging value and locmafVersion
Following the precedent of [CMSF] — whose IANA Considerations record
no IANA actions for its "cmaf" packaging value or its added catalog
fields — this document registers neither of the following with IANA:
* packaging: "locmaf" extends the [MSF] packaging-values table (see
Section 4). [MSF] defines no IANA registry for packaging values;
new values are introduced by the documents that define them.
* locmafVersion is a track-level catalog field defined by this
document (see Section 4). [MSF] permits documents and producers
to define additional catalog fields and maintains no IANA registry
of them.
The set of valid locmafVersion values — "0.2" for this document — is
governed by this specification and its successors; no IANA action is
required. Should a future [MSF] revision introduce an IANA
packaging-value registry, the "locmaf" value SHOULD be registered
there, while locmafVersion and its values remain document-governed.
17.2. LOCMAF Top-Level Header IDs
This document defines a new registry for LOCMAF top-level header IDs.
+====+===================+===============+
| ID | Symbol | Reference |
+====+===================+===============+
| 23 | LocmafFullHeader | this document |
+----+-------------------+---------------+
| 25 | LocmafDeltaHeader | this document |
+----+-------------------+---------------+
Table 13
All other IDs in the unsigned varint range are available for
assignment via Specification Required ([RFC8126]).
LOCMAF property field IDs are part of this document's wire format and
are not registered with IANA. New field IDs are introduced through
revisions of this specification, signalled by a bump of the
locmafVersion catalog value (see Section 4). The full field-ID
assignment for this version is given in Section 8.
18. References
18.1. Normative References
[CMSF] Law, W., "CMSF- a CMAF compliant implementation of MOQT
Streaming Format", Work in Progress, Internet-Draft,
draft-ietf-moq-cmsf-00, 1 December 2025,
<https://datatracker.ietf.org/doc/html/draft-ietf-moq-
cmsf-00>.
[MOQT] Nandakumar, S., Vasiliev, V., Swett, I., and A. Frindell,
"Media over QUIC Transport", Work in Progress, Internet-
Draft, draft-ietf-moq-transport-18, 12 May 2026,
<https://datatracker.ietf.org/doc/html/draft-ietf-moq-
transport-18>.
[MSF] Law, W. and S. Nandakumar, "MOQT Streaming Format", Work
in Progress, Internet-Draft, draft-ietf-moq-msf-01, 2 June
2026, <https://datatracker.ietf.org/doc/html/draft-ietf-
moq-msf-01>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
18.2. Informative References
[CENC] "Information technology — MPEG systems technologies — Part
7: Common encryption in ISO base media file format files",
ISO/IEC 23001-7:2023, 2023.
[CMAF] "Information technology — Multimedia application format
(MPEG-A) — Part 19: Common media application format (CMAF)
for segmented media", ISO/IEC 23000-19:2024, 2024.
[COMPRESSED-MP4]
Curley, L., "Compressed MP4", Work in Progress, Internet-
Draft, draft-lcurley-compressed-mp4-00, 16 March 2026,
<https://datatracker.ietf.org/doc/html/draft-lcurley-
compressed-mp4-00>.
[DASH] "Information technology — Dynamic adaptive streaming over
HTTP (DASH) — Part 1: Media presentation description and
segment formats", ISO/IEC 23009-1:2026, 2026.
[DASH-IF-INGEST]
"DASH-IF Live Media Ingest Protocol", n.d.,
<https://dashif.org/Ingest/>.
[ISOBMFF] "Information technology — Coding of audio-visual objects —
Part 12: ISO base media file format", ISO/
IEC 14496-12:2026, 2026.
[LOC] Zanaty, M., Nandakumar, S., and P. Thatcher, "Low Overhead
Media Container", Work in Progress, Internet-Draft, draft-
ietf-moq-loc-02, 15 March 2026,
<https://datatracker.ietf.org/doc/html/draft-ietf-moq-loc-
02>.
[LOCMAF-SITE]
"LOCMAF — Low Overhead CMAF for MoQ", n.d.,
<https://locmaf.dev>.
[MOQLIVEMOCK]
"moqlivemock — Reference LOCMAF server and tooling", n.d.,
<https://github.com/Eyevinn/moqlivemock>.
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/rfc/rfc8126>.
Acknowledgments
The initial version of LOCMAF was developed as part of the master's
thesis work of Hugo Björs at Eyevinn Technology, supervised by
Torbjörn Einarsson. The authors thank the Media over QUIC working
group, in particular the authors and contributors to [MOQT] and
[CMSF], for the prior art this work builds on.
Authors' Addresses
Torbjörn Einarsson
Eyevinn Technology
Email: torbjorn.einarsson@eyevinn.se
Hugo Björs
KTH
Email: hugobjoers@gmail.com