RTP Payload Format for AC-3 Audio
Network Working Group                                            B. Link
Request for Comments: 4184                                      T. Hager
Category: Standards Track                             Dolby Laboratories
                                                            October 2005
Status of This Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Copyright (C) The Internet Society (2005).
Abstract
This document describes an RTP payload format for transporting audio
data using the AC-3 audio compression standard. AC-3 is a
high-quality, multichannel audio coding system that is used for United
States HDTV, DVD, cable television, satellite television and other
media. The RTP payload format presented in this document includes
support for data fragmentation.
1. Introduction
AC-3 [ATSC] is a high-quality audio codec (audio coding format)
designed to encode multiple channels of audio into a low bit-rate
format. AC-3 achieves its large compression ratios via encoding a
multiplicity of channels as a single entity. Dolby Digital, which is
a branded version of AC-3, encodes up to 5.1 channels of audio.
AC-3 has been adopted as an audio compression scheme for many
consumer and professional applications. It is a mandatory audio
codec for DVD-video, Advanced Television Systems Committee (ATSC)
digital terrestrial television, and Digital Living Network Alliance
(DLNA) home networking, as well as an optional multichannel audio
format for DVD-audio.
There is a need to stream AC-3 data over IP networks. The Real-time
Transport Protocol (RTP) provides a mechanism for stream
synchronization and hence serves as the best transport solution for
AC-3, which is primarily used in audio-for-video applications.
Applications for streaming AC-3 include streaming movies from a home
media server to a display, video on demand, and multichannel Internet
radio.
Section 2 gives a brief overview of the AC-3 algorithm. Section 3
specifies values for fields in the RTP header, while Section 4
specifies the AC-3 payload format. Section 5 discusses media types
and SDP usage. Security considerations are covered in Section 6,
congestion control in Section 7, and IANA considerations in Section
8. References are given in Sections 9 and 10.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Overview of AC-3
AC-3 can deliver up to 5.1 channels of audio at data rates
approximately equal to half of one PCM channel [ATSC], [1994AC3],
[1996AC3]. The ".1" refers to a band-limited, optional, low-
frequency effects (LFE) channel. AC-3 was designed for signals
sampled at rates of 32, 44.1, or 48 kHz. Data rates can vary between
32 kbps and 640 kbps, depending on the number of channels and the
desired quality.
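The comparison with "half of one PCM channel" can be made concrete with
a short calculation. The sketch below is illustrative only: the 48 kHz
sample rate is one of the rates listed above, but the 16-bit sample
size is an assumption (this section does not state a PCM word length),
and the helper name pcm_kbps is hypothetical.

```python
def pcm_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed linear PCM, in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumption: 16-bit samples at 48 kHz (the word length is not given in the text).
one_pcm_channel = pcm_kbps(48_000, 16)
# "Approximately equal to half of one PCM channel".
ac3_estimate = one_pcm_channel / 2
# Five full-band channels plus the LFE channel, all stored uncompressed.
# Treating the band-limited LFE channel as a full channel is a simplification.
six_pcm_channels = pcm_kbps(48_000, 16, 6)

print(one_pcm_channel)                  # 768.0
print(ac3_estimate)                     # 384.0
print(six_pcm_channels / ac3_estimate)  # 12.0
```

Under these assumptions the estimate falls inside the 32 kbps to
640 kbps range quoted above and corresponds to a reduction of roughly
12:1 relative to six uncompressed channels.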
AC-3 exploits psycho-acoustic phenomena that cause a significant
fraction of the information contained in a typical audio signal to be
inaudible. Substantial data reduction occurs via the removal of
inaudible information contained in an audio stream. Source coding
techniques are further used to reduce the data rate.
Like most perceptual coders, AC-3 operates in the frequency domain.
A 512-point time-domain aliasing cancellation (TDAC) transform is
taken with 50% overlap, providing 256 new frequency samples for each
transform. Frequency samples are then converted to
exponents and mantissas. Exponents are differentially encoded.
Mantissas are allocated a varying number of bits depending on the
audibility of the associated spectral components. Audibility is
determined via a masking curve. Bits for mantissas are allocated
from a global bit pool.
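The block structure described above can be sketched in a few lines of
code. This is a simplified illustration, not the AC-3 algorithm as
specified in [ATSC]: it uses a plain MDCT as the TDAC transform, omits
the analysis window, represents each exponent as a count of leading
binary zeros capped at an assumed maximum of 24, and emits unrestricted
exponent differences. Mantissa bit allocation and the masking curve are
not modeled. All function names are hypothetical.

```python
import math

BLOCK = 512        # transform input length (from the text)
HOP = BLOCK // 2   # 50% overlap, so each block adds 256 new samples


def overlapped_blocks(samples):
    """Yield 512-sample blocks whose start advances by 256 samples."""
    for start in range(0, len(samples) - BLOCK + 1, HOP):
        yield samples[start:start + BLOCK]


def mdct(block):
    """Direct O(N^2) MDCT: 2M input samples give M coefficients."""
    m = len(block) // 2
    n0 = 0.5 + m / 2.0
    return [
        sum(x * math.cos(math.pi / m * (n + n0) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(m)
    ]


def split_exponent_mantissa(coef, max_exp=24):
    """Return (exponent, mantissa) with coef == mantissa * 2**-exponent.

    Assumes |coef| < 1; max_exp=24 is an assumption of this sketch.
    """
    if coef == 0.0:
        return max_exp, 0.0
    exponent = min(max_exp, max(0, -math.frexp(coef)[1]))
    return exponent, coef * (2 ** exponent)


def differential(exponents):
    """Keep the first exponent, then store each change from its predecessor."""
    return exponents[:1] + [b - a for a, b in zip(exponents, exponents[1:])]


# Demo: a 1 kHz tone sampled at 48 kHz, 1024 samples long.
signal = [0.5 * math.sin(2 * math.pi * 1000 * n / 48_000) for n in range(1024)]
first_block = next(overlapped_blocks(signal))
coefs = [c / BLOCK for c in mdct(first_block)]   # scale so that |coef| < 1
exps = [split_exponent_mantissa(c)[0] for c in coefs[:8]]
print(len(coefs))          # 256 coefficients from one 512-sample block
print(differential(exps))  # exponent deltas for the first eight coefficients
```

Because consecutive blocks share half of their input samples, each
transform contributes only 256 new samples while still producing 256
coefficients, so the representation remains critically sampled.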
2.1. AC-3 Bit Stream
AC-3 bit streams are organized into synchronization frames. Each
AC-3 frame contains a Synchronization Information (SI) field, a Bit