Scalable Quality Extension for the Opus Codec
draft-valin-opus-scalable-quality-extension-00
This document is an Internet-Draft (I-D).
Anyone may submit an I-D to the IETF.
This I-D is not endorsed by the IETF and has no formal standing in the
IETF standards process.
Field | Value
---|---
Type | Active Internet-Draft (individual)
Author | Jean-Marc Valin
Last updated | 2024-11-25
RFC stream | (None)
Intended RFC status | (None)
Stream state | (No stream defined)
Consensus boilerplate | Unknown
RFC Editor Note | (None)
IESG state | I-D Exists
Telechat date | (None)
Responsible AD | (None)
Send notices to | (None)
mlcodec                                                        JM. Valin
Internet-Draft                                                    Google
Updates: 6716 (if approved)                             25 November 2024
Intended status: Standards Track
Expires: 29 May 2025


             Scalable Quality Extension for the Opus Codec
             draft-valin-opus-scalable-quality-extension-00

Abstract

   This document updates RFC6716 to add support for a scalable quality
   layer.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 29 May 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Valin                     Expires 29 May 2025                   [Page 1]

Internet-Draft         Scalable Quality Extension          November 2024

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scalable Quality Extension
     2.1.  Extended resolution
     2.2.  Extended frequency range
     2.3.  Time-domain processing at 96 kHz
   3.  IANA Considerations
   4.  Security Considerations
   5.  References
     5.1.  Normative References
   Author's Address

1.  Introduction

   This document updates RFC6716 to add support for a scalable quality
   extension layer.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Scalable Quality Extension

   The Opus codec was designed to operate at sampling frequencies up to
   48 kHz, with an audio bandwidth up to 20 kHz.  The CELT mode that is
   used for high bitrate coding uses vector quantization with a mostly
   implicit bit allocation system that is dictated by the bitstream
   definition.  Opus can allocate up to 8 bits per MDCT bin in some of
   the bands.

   While the Opus capabilities listed above are sufficient to achieve
   perceptually transparent audio coding, there is a use for codecs that
   scale beyond those specs.  That includes the current market for
   24-bit/96 kHz codecs, but also any application where the intended
   recipient is not (only) a human being, e.g., ultrasonic applications.

   This document proposes a scalable quality extension layer that both
   increases the resolution of existing Opus quantizers below 20 kHz,
   and defines a way of coding audio above 20 kHz, with a sampling rate
   of 96 kHz.  The extension is designed to be forward and backward
   compatible with [RFC6716].
   All extra bits use the Opus extension mechanism defined in
   [opus-extension] and a 96 kHz decoder is designed to be able to
   decode a regular 48 kHz RFC 6716 stream and vice versa.

   The code corresponding to this draft (work in progress) is available
   on the exp_qext17 branch of the Opus repository at
   https://gitlab.xiph.org/xiph/opus/ .

2.1.  Extended resolution

   To reduce the coding error, we need to increase the resolution for
   three different quantizers: the fine energy quantizer (scalar), the
   band pyramid vector quantizer (PVQ), and the band splitting angle
   quantizer.

   More on extra resolution here

2.2.  Extended frequency range

   To extend the audio bandwidth, we need to define more frequency
   bands.  Because psychoacoustics is no longer involved past 20 kHz,
   all new bands are defined to have the same width.

   More on band definitions here

2.3.  Time-domain processing at 96 kHz

   CELT includes two time-domain filter pairs that require updating for
   96 kHz: the preemphasis/deemphasis filters, as well as the pitch
   prefilter/postfilter.

   The CELT deemphasis filter is currently defined as

      D(z) = 1/(1 - a1*z^-1)

   for a 48 kHz signal, where a1=27853/32768.  To obtain approximately
   the same response in the 0-20 kHz range using a sampling rate of
   96 kHz, we instead use

      D(z) = g*(1 - b1*z^-1)/(1 - a1*z^-1)

   where g=5415/8192, b1=7209/32768, a1=30245/32768.

   For the pitch prefilter/postfilter, we use zero-insertion upsampling
   of the 48 kHz filters, which results in the same frequency response
   below 24 kHz and a "folded" image above 24 kHz.  For example, if for
   a pitch period T (in 48 kHz units) the postfilter was

      P(z) = 1/(1 - a0*z^(-T+1) - a1*z^(-T) - a2*z^(-T-1))

   then for the same pitch, the 96 kHz filter becomes

      P(z) = 1/(1 - a0*z^(-2T+2) - a1*z^(-2T) - a2*z^(-2T-2))

3.  IANA Considerations

   [Note: Until the IANA performs the actions described below,
   implementers should use 124 instead of 33 as the extension number.]

   This document assigns ID 33 in the "Opus Extension IDs" registry
   created in [opus-extension] to the proposed scalable quality
   extension.

4.  Security Considerations

   This document does not add security considerations beyond those
   already documented in [RFC6716].

5.  References

5.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012, <https://www.rfc-editor.org/info/rfc6716>.

   [opus-extension]
              Terriberry, T.B. and J.-M. Valin, "Extension Formatting
              for the Opus Codec (draft-ietf-mlcodec-opus-extension)",
              October 2023.

Author's Address

   Jean-Marc Valin
   Google
   Canada
   Email: jeanmarcv@google.com
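
   As a non-normative illustration of the deemphasis filters given in
   Section 2.3, the sketch below evaluates both transfer functions on
   the unit circle and reports how far the 96 kHz magnitude response
   departs from the 48 kHz one between 0 and 20 kHz.  The coefficient
   values are the ones quoted in Section 2.3; the function names, the
   100 Hz frequency grid, and the choice to compare magnitude only
   (phase is ignored) are assumptions made for this example and are not
   taken from the reference code.

```python
import cmath
import math

# Coefficients quoted in Section 2.3 of this document.
A1_48K = 27853 / 32768   # pole of the 48 kHz deemphasis filter
G_96K = 5415 / 8192      # gain of the 96 kHz deemphasis filter
B1_96K = 7209 / 32768    # zero of the 96 kHz deemphasis filter
A1_96K = 30245 / 32768   # pole of the 96 kHz deemphasis filter


def deemph_48k(freq_hz):
    """D(z) = 1/(1 - a1*z^-1) evaluated at freq_hz for fs = 48 kHz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / 48000.0)
    return 1.0 / (1.0 - A1_48K * z_inv)


def deemph_96k(freq_hz):
    """D(z) = g*(1 - b1*z^-1)/(1 - a1*z^-1) at freq_hz for fs = 96 kHz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / 96000.0)
    return G_96K * (1.0 - B1_96K * z_inv) / (1.0 - A1_96K * z_inv)


def gain_mismatch_db(freq_hz):
    """Magnitude of the 96 kHz filter relative to the 48 kHz one, in dB."""
    ratio = abs(deemph_96k(freq_hz)) / abs(deemph_48k(freq_hz))
    return 20.0 * math.log10(ratio)


def max_mismatch_db(f_max_hz=20000, step_hz=100):
    """Largest absolute mismatch on a grid from 0 Hz to f_max_hz."""
    return max(abs(gain_mismatch_db(f))
               for f in range(0, f_max_hz + 1, step_hz))


if __name__ == "__main__":
    # Print the mismatch at a few spot frequencies (hypothetical choice).
    for f in (0, 1000, 5000, 10000, 15000, 20000):
        print(f"{f:6d} Hz: {gain_mismatch_db(f):+.3f} dB")
    print(f"max |mismatch| over 0-20 kHz: {max_mismatch_db():.3f} dB")
```

   A check of this kind only confirms that the two magnitude responses
   stay close in the band shared by both sampling rates; it says
   nothing about the behavior above 20 kHz or about fixed-point
   rounding in an actual implementation.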