Machine Learning for Audio Coding
|Document||Proposed charter||Machine Learning for Audio Coding WG (mlcodec)|
|Title||Machine Learning for Audio Coding|
|State||Start Chartering/Rechartering (Internal Steering Group/IAB Review) Initial chartering|
|IESG||Responsible AD||Murray Kucherawy|
|Charter edit AD||Murray Kucherawy|
On agenda of 2023-06-08 IESG telechat
Has enough positions to pass.
|Send notices email@example.com|
Problem Statement

The Opus codec (RFC 6716) was adopted by the IETF in 2012. Since then, speech and audio processing technology has made significant progress, thanks in large part to deep learning techniques. It is desirable to update the existing Opus codec to benefit from recent advances without breaking compatibility with RFC 6716. Opus has achieved a wide degree of interoperability by using in-band signaling to avoid negotiation failure. Implementing new coding technology within Opus would allow incremental compatible deployment of the updated specification, while preserving interoperability with the existing billions of Opus-enabled devices.

In doing so, we wish to retain the original qualities that drove the original Opus development to develop codecs that (quoting from codec WG charter):

1. Are optimized for use in interactive Internet applications.
2. Are published by a recognized standards development organization (SDO) and therefore subject to clear change control.
3. Can be widely implemented and easily distributed among application developers, service operators, and end users.

Objectives

The goals of this working group are:

1) Improving the robustness to packet loss of Opus through efficient redundancy transmission
2) Improving the speech coding quality at low bitrates
3) Improving the music coding quality at low bitrates

The working group may also consider other improvements to Opus. The group will only consider solutions that result in bitstreams that are forwards and backwards compatible with RFC 6716, and thus decodable by any decoder. Although it is likely that machine learning will be required to meet the objectives above, classical solutions will also be considered if they can achieve similar performance.

As was the case with the original codec WG, this work will primarily focus on interactive, real-time applications over the Internet and will ensure interoperability with existing IETF real-time protocols, including RTP, SIP/SDP, and WebRTC. Given the widespread deployment of WebRTC, ensuring that the work improves WebRTC experience is of particular importance. Other applications, such as non-real-time streaming, will be considered too, but only so long as their requirements do not interfere with those of real-time applications.

The working group cannot explicitly rule out the possibility of adopting encumbered technologies; however, consistent with BCP 78 and BCP 79, the working group will try to avoid encumbered technologies that require royalties or other encumbrances that would prevent such technologies from being easy to redistribute and use.

Deliverables

1. A specification for a generic Opus extension mechanism that can be used not only for the other proposed deliverables, but can also sustain further extensions to Opus in the future. This document shall be a Proposed Standard document.
2. A specification for coding large amounts of very low bitrate redundancy information for the purpose of significantly improving the robustness of Opus to bursts of packet loss. This document shall be a Proposed Standard document.
3. A specification for improving the quality of SILK- and hybrid-coded speech through decoder changes, with and without side information provided by the encoder. This will be done in a way that does not affect interoperability between original and extended implementations. This document shall be a Proposed Standard document.
4. A specification for improving the quality of CELT-coded audio (both speech and music) through decoder changes, with and without side information provided by the encoder. This will be done in a way that does not affect interoperability between original and extended implementations. This document shall be a Proposed Standard document.
|Sep 2024||Submit specification for improving the quality of CELT-coded audio to the IESG as Proposed Standard.|
|Jun 2024||Submit specification for improving the quality of SILK- and hybrid-coded speech to the IESG as Proposed Standard.|
|Mar 2024||Submit specification for Opus resiliency against packet loss to the IESG as Proposed Standard.|
|Dec 2023||Submit generic Opus extension mechanism to the IESG as Proposed Standard.|