Internet Engineering Task Force Tim Fingscheidt, Siemens AG
Audio Video Transport WG Bernhard Wimmer, Siemens AG
INTERNET-DRAFT Germany
July 14, 2000
Expires: January 14, 2001
RTP Payload Format for AMR
<draft-fingscheidt-avt-rtp-amr-00.txt>
Status of this memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or cite them other than as "work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This document is an individual submission to the IETF. Comments
should be directed to the authors.
Abstract
This document proposes a real-time transport protocol (RTP) [1]
payload format for AMR-encoded speech signals [2]. It supports all
8 modes of the AMR speech codec and is also prepared for future
extensions, such as AMR wideband. Mode adaptation and discontinuous
transmission (DTX) are supported as well.
The proposed payload format offers great flexibility with a minimum
of bit rate overhead. One or multiple speech frames can be
transmitted in a single packet. Redundant transmission of previously
transmitted frames (or parts thereof) is possible, as is parity
code transmission. With one speech frame per packet, the additional
parity code transmission allows reconstruction of N previously lost
speech frames once N consecutive correct packets are buffered in the
receiver. This yields very high robustness, while the receiver
buffer size can be chosen according to the application.
For implementation of this draft, please consider also the
requirements of [12].
Fingscheidt & Wimmer [Page 1]
INTERNET-DRAFT RTP Payload Format for AMR July 14, 2000
1. Conventions used
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [11].
2. Introduction
The European Telecommunications Standards Institute (ETSI) and the
Third Generation Partnership Project (3GPP) have standardized the
adaptive multi-rate (AMR) speech codec. In third generation systems
the AMR codec will be mandatory. Three of the AMR modes correspond
to earlier standards: the 6.7 kbps mode (PDC-EFR [3]), the 7.4 kbps
mode (IS-641 codec in TDMA [4]), and the 12.2 kbps mode (GSM-EFR [5]).
The AMR codec comprises 8 modes with bit rates ranging from 4.75 to
12.2 kbps. In systems with a fixed gross bit rate, such as GSM, this
allows assigning different amounts of error protection in order to
preserve high speech quality over a wide range of channel
qualities. The sampling frequency is 8 kHz, and speech is processed
in frames of 20 ms. The AMR modes are closely related to each
other and use the same coding framework.
AMR implementations must support all 8 speech coding modes, and mode
switching can occur to any mode at any speech frame boundary. The
mode information must therefore be transmitted together with the
speech encoded bits to indicate the mode. Furthermore, the decoder
may give an indication to the encoder of what mode it prefers to
receive. This is called a codec mode request (CMR) and is useful to
adjust the ratio of speech coder bits to error protection bits in
order to ensure a certain speech quality.
Along with the AMR codec, voice activity detection (VAD) and
comfort noise generation (CNG) have been standardized. This allows a
reduction of the number of transmitted bits in silence periods.
The three earlier codec standards [3-5], however, have different
DTX/VAD/CNG schemes when they are not used in the AMR framework. For
interoperability reasons the proposed payload format also supports
these CNG formats.
To address transmission over networks with high packet loss rates,
extra redundancy is built into the RTP payload format for AMR.
This is done in a very flexible manner by the optional transmission
of parity bit blocks generated from previously transmitted AMR
encoded frames. Depending on how many previous frames are covered
by this parity bit computation, a certain number of consecutive
lost past frames can be reconstructed at the receiver. Since this
may require buffering, the AMR payload format allows a flexible
tradeoff between robustness, bit rate, and receiver delay.
The speech encoded bits have different perceptual sensitivity to bit
errors. Accordingly, unequal error protection (UEP) is employed in
cellular systems. A frame is considered as lost or damaged if
errors are detected in the most sensitive bits. Unequal error
detection (UED) can also be employed on RTP if e.g. UDP lite is used
as transport layer protocol (UDP lite [6] is work in progress). The
payload then has to be ordered in sensitivity order. The sensitivity
order for the AMR encoded bits is defined in [7]. The different
sensitivity can also be exploited by a parity check covering only
the most sensitive bits, as is proposed as an option for the AMR
payload format.
To improve quality in circuit-switched GSM networks connected to
IP networks, frames disturbed on the wireless GSM link should also
be transmitted to the decoder in the IP network. Consequently, such
frames must be accompanied by frame quality information in the
IP network.
This proposal of an RTP payload format for AMR is the third in a
series of internet drafts (works in progress) related to this topic.
In [8] the transmission of multiple speech frames in a single RTP
packet is supported. The advantage of [9] as compared to [8] is
mainly the possibility to transmit redundant speech frames (or
parts thereof).
The present proposal incorporates the abilities of [8,9] with the
addition that there is an option for reconstruction of a larger
number of past lost frames. For the purpose of clarity and simpler
comparison, in the sequel we will follow the structure and the
notation of [9] as far as possible.
3. Requirements
The AMR payload format for RTP was designed to meet the following
requirements:
o Different levels of robustness must be supported:
- no redundancy at all
- past frames (partly) repeated
- parity bits generated over several past frames to yield extreme
robustness capable of handling very high packet loss rates with
no or small speech quality degradation.
o Fast, frame-wise AMR mode adaptation must be supported. This
means that it must be possible to send codec mode requests (CMRs)
back from the receiving side to the transmitting side with
information on the preferred mode. Slower AMR mode adaptation may
also be accomplished with external signaling.
o Discontinuous transmission (DTX) and comfort noise generation
(CNG) as specified in AMR must be supported.
4. RTP Payload Format Specification
This RTP payload format is designed to be flexible, ranging from
very low overhead (minimal) to an extended format with room for
future AMR extensions, e.g. wide band modes, and the possibility
to send extra redundancy information and several speech frames in
one RTP payload packet.
Each RTP payload consists of an
- RTP payload header followed by the
- RTP payload data.
The RTP payload data is generated by the interleaving of one or
several RTP payload frames, see section 4.4. An RTP payload frame
may be generated from
- AMR frames or
- redundancy frames.
An individual RTP payload frame need not be octet-aligned; the
complete RTP payload, however, shall be octet-aligned. If the last
octet of an RTP payload contains unused bits, these bits shall be
set to zero.
4.1. The RTP Payload Header
The payload header has dynamic length, 3 or 8 bits. The bits in the
header are specified as follows:
Q (1 bit): The payload quality bit indicates, if not set, that the
payload is severely damaged and the receiver should set the RX_TYPE,
see [10], to SPEECH_BAD or SID_BAD depending on the frame type (FT).
I (1 bit): If I=1, the LEN/DEPTH indicator bit (L) exists in each
RTP payload frame. If I=0, the LEN/DEPTH indicator bit does not
exist.
R (1 bit): Indicates if the codec mode request (CMR) is sent or not.
CMR (5 bits): OPTIONAL field, depending on the R bit. Requested
codec mode for the other communication direction. The interpretation
is equal to the FT field, see Table 1.
 0
 0 1 2
+-+-+-+
|Q|I|R|
+-+-+-+
Figure 1: RTP payload header, R=0
 0
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|Q|I|R|   CMR   |
+-+-+-+-+-+-+-+-+
Figure 2: RTP payload header, R=1
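The header layout of Figures 1 and 2 can be read with a small bit
reader. The following is a minimal sketch, not part of the format
definition. It assumes that bit 0 in the figures is the most
significant bit of the first octet and that the caller has already
checked the buffer length. The names bit_reader, read_bits,
amr_payload_header and parse_payload_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader over an octet buffer (assumption:
   bit 0 of the figures is the most significant bit of octet 0). */
typedef struct {
    const uint8_t *buf;
    size_t pos;                 /* index of the next bit to read */
} bit_reader;

static unsigned read_bits(bit_reader *br, unsigned count)
{
    unsigned v = 0;
    assert(count <= 16);
    while (count-- > 0) {
        unsigned bit = (br->buf[br->pos / 8] >> (7 - br->pos % 8)) & 1u;
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}

typedef struct {
    unsigned q, i, r;
    int cmr;                    /* requested mode, or -1 when R=0 */
} amr_payload_header;

/* Returns the header length in bits: 3 when R=0, 8 when R=1.
   No bounds checking is done; the caller must supply enough octets. */
static size_t parse_payload_header(const uint8_t *payload,
                                   amr_payload_header *h)
{
    bit_reader br = { payload, 0 };
    h->q = read_bits(&br, 1);
    h->i = read_bits(&br, 1);
    h->r = read_bits(&br, 1);
    h->cmr = h->r ? (int)read_bits(&br, 5) : -1;
    return br.pos;
}
```

With the first octet 0xE6 (binary 11100110) this yields Q=1, I=1,
R=1 and CMR=6, the header used in the example of section 6.2.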
4.2. RTP Payload AMR Frame
The RTP payload AMR frame is designed for covering AMR encoded
speech data and is generated by
- AMR frame header that is followed by the
- AMR frame payload.
The AMR frame need not be octet-aligned.
4.2.1. AMR Frame Header Format
Each AMR frame header includes several specified fields as follows:
F (1 bit): Indicates if this frame is followed by further frames.
F=1 further frames follow, F=0 last frame.
L (1 bit): OPTIONAL field, exists only if the RTP payload header bit
I=1. If L=1, the AMR frame header includes the LEN field. If L=0, no
LEN field exists in this AMR frame header.
FT (5 bits): Frame type indicator, indicating the AMR speech coding
mode or comfort noise (CN) mode. The mapping of existing AMR modes
is given in Table 1. This implies that the number of bits of the AMR
frame payload can be derived from Table 1. If FT=15 (No
transmission) L for both AMR and redundancy frames SHOULD be set
to 0.
LEN (7 bits): OPTIONAL field, exists if the AMR header bit L is set,
L=1. LEN specifies the number of octets in the current AMR frame
payload. The following situations may occur and shall be treated as
follows:
- If LEN*8 <= number of speech bits indicated by FT, as shown in
Table 1, the number of bits of the AMR frame payload shall be
derived from 8*LEN and not from the FT field. This implies that the
encoded AMR data was shortened to 8*LEN bits.
- Otherwise the LEN field SHOULD be ignored.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|L|   FT    |     LEN     |                                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                                   +
|                                                               |
+                                                               +
/                       AMR frame payload                       /
/                                                               /
+                                                 +-+-+-+-+-+-+-+
|                                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: AMR frame format, I=1 and L=1
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|L|   FT    |                                                 |
+-+-+-+-+-+-+-+                                                 +
|                                                               |
+                                                               +
/                       AMR frame payload                       /
/                                                               /
+                                                 +-+-+-+-+-+-+-+
|                                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: AMR frame format, I=1 and L=0
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   FT    |                                                   |
+-+-+-+-+-+-+                                                   +
|                                                               |
+                                                               +
/                       AMR frame payload                       /
+                                             +-+-+-+-+-+-+-+-+-+
|                                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: AMR frame format, I=0
4.2.2. AMR Frame Payload Format
The AMR speech encoder produces AMR speech frames, as defined by [2].
The currently defined AMR speech frame types can be found in Table 1.
                           Speech
Index     Mode             bits
----------------------------------
  0       AMR 4.75          95
  1       AMR 5.15         103
  2       AMR 5.9          118
  3       AMR 6.7          134
  4       AMR 7.4          148
  5       AMR 7.95         159
  6       AMR 10.2         204
  7       AMR 12.2         244
  8       AMR CNG           39
  9       GSM EFR CNG       43
 10       IS-641 CNG        38
 11       PDC-EFR CNG       37
12 - 14   For future use     -
 15       No transmission    0
16 - 31   For future use     -
Table 1: AMR speech frame types (taken from [9])
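The LEN rule of section 4.2.1 combined with Table 1 can be sketched
as a small lookup function. This is an illustrative sketch only; the
names amr_ft_bits and amr_payload_bits are hypothetical, and the
return value -1 for reserved frame types is an assumption of this
sketch, not something the format defines.

```c
#include <assert.h>

/* Speech bits per frame type, copied from Table 1.
   -1 marks frame types reserved for future use (sketch convention). */
static const int amr_ft_bits[32] = {
    95, 103, 118, 134, 148, 159, 204, 244,   /* FT 0-7: speech modes   */
    39, 43, 38, 37,                          /* FT 8-11: comfort noise */
    -1, -1, -1,                              /* FT 12-14: future use   */
    0,                                       /* FT 15: no transmission */
    -1, -1, -1, -1, -1, -1, -1, -1,          /* FT 16-23: future use   */
    -1, -1, -1, -1, -1, -1, -1, -1           /* FT 24-31: future use   */
};

/* Number of bits in the AMR frame payload.
   has_len is nonzero when the LEN field is present (I=1 and L=1);
   len is the LEN field value in octets. */
static int amr_payload_bits(unsigned ft, int has_len, unsigned len)
{
    int full;
    assert(ft < 32 && len < 128);
    full = amr_ft_bits[ft];
    if (full < 0)
        return -1;                 /* reserved frame type */
    if (has_len && (int)(8 * len) <= full)
        return (int)(8 * len);     /* payload shortened to 8*LEN bits */
    return full;                   /* LEN absent or ignored */
}
```

For example, FT=7 with LEN=10 gives an 80 bit payload, whereas FT=0
with LEN=12 (96 bits, more than the 95 bits of the mode) falls back
to the length given by FT.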
The bit order of frame types 0 - 11 is given in [7]. Frame type 15,
no transmission, is needed to indicate frames that were not
transmitted or were lost, e.g. when multiple frames are sent in each
payload and comfort noise starts. A frame type sequence in a payload
with 8 frames, using AMR mode 7 with CNG starting in the fifth
frame, could look like: {7,7,7,7,8,15,15,8}. The AMR DTX (also
called "source controlled rate operation", SCR) is described in
[10]. Another reason for the no transmission frame type is a
possible need to send an urgent codec mode request in a silence
period with comfort noise.
Before the AMR encoded speech frames are copied to the AMR frame
payload, the speech bits shall be sorted in order of descending
bit-error sensitivity. This re-ordering process is defined in [7].
After this re-ordering process the AMR encoded speech frame is
copied to the AMR frame payload, according to the particular
setting of the AMR frame header, e.g. copying of the first 8*LEN
bits, see section 4.2.1.
4.3. RTP Payload - Redundancy Frame
The RTP payload redundancy frame is designed for covering redundancy
data for error-correction of lost AMR frames. The redundancy frame
is generated by
- redundancy frame header that is followed by the
- redundancy frame payload.
The redundancy frame need not be octet-aligned.
4.3.1. Redundancy Frame Header Format
Each redundancy frame header includes several specified fields as
follows:
F (1 bit): Indicates if this frame is followed by further frames.
F=1 further frames follow, F=0 last frame.
L (1 bit): OPTIONAL field, exists only if the RTP payload header bit
I=1. If L=1, the redundancy frame header includes the R_LEN and
DEPTH fields. If L=0, neither field exists in this redundancy frame
header.
R_FT (5 bits): This field indicates the FT-fields of the past DEPTH
AMR frame headers by the following coding rule.
R_FT(n) = FT(n-1) EXOR ... EXOR FT(n-DEPTH(n)) (Eq. 1)
whereby
n is set to the current AMR frame number.
FT(n) is defined as the AMR frame header field FT of
frame n.
R_FT(n) denotes the redundancy frame header field R_FT of
frame n.
EXOR is defined as the bit-wise exclusive OR operation.
DEPTH(n) denotes the redundancy frame header field DEPTH of
frame n.
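Eq. 1 amounts to a running exclusive OR over the FT fields in the
window. A minimal sketch follows; the function name compute_r_ft and
the array layout (newest frame first) are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* R_FT(n) according to Eq. 1.
   ft_history[0] is FT(n-1), ft_history[1] is FT(n-2), and so on;
   the array must hold at least depth entries. */
static unsigned compute_r_ft(const unsigned *ft_history, size_t depth)
{
    unsigned r_ft = 0;
    size_t k;
    assert(depth >= 1 && depth <= 15);   /* DEPTH is a 4 bit field */
    for (k = 0; k < depth; k++)
        r_ft ^= ft_history[k] & 0x1Fu;   /* FT is a 5 bit field */
    return r_ft;
}
```

Because EXOR is its own inverse, a receiver that misses exactly one
frame of the window can recover the missing FT value by combining
R_FT(n) with the FT values of the other frames in the same way.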
R_LEN (7 bits): OPTIONAL field, exists if the redundancy header
bit L is set, L=1. R_LEN specifies the number of octets in the
current redundancy frame payload. Depending on R_LEN, several
different operational modes are used, which are described in
section 4.3.2. R_LEN may change from one redundancy frame to the
next. If L=0 or I=0, R_LEN(n) is set to FT(n), where n denotes the
current AMR frame number.
DEPTH (4 bits): OPTIONAL field, exists if the redundancy header
bit L is set, L=1. DEPTH specifies the number of previous AMR frame
payload packets that are used for the generation of the redundancy
frame payload. A detailed description can be found in section
4.3.2. DEPTH = 0 is currently unused and may be used for future
extensions. If L=0 or I=0, DEPTH is set to the default value 15.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|L|  R_FT   |    R_LEN    | DEPTH |                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                           +
|                                                               |
+                                                               +
/                    redundancy frame payload                   /
/                                                               /
+                                                 +-+-+-+-+-+-+-+
|                                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6: Redundancy frame format, I=1 and L=1
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|L|  R_FT   |                                                 |
+-+-+-+-+-+-+-+                                                 +
|                                                               |
+                                                               +
/                    redundancy frame payload                   /
/                                                               /
+                                                 +-+-+-+-+-+-+-+
|                                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: Redundancy frame format, I=1 and L=0
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|  R_FT   |                                                   |
+-+-+-+-+-+-+                                                   +
|                                                               |
+                                                               +
/                    redundancy frame payload                   /
+                                             +-+-+-+-+-+-+-+-+-+
|                                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8: Redundancy frame format, I=0
4.3.2. Redundancy Frame Payload Format
The generation of the redundancy payload is based on parity bit
calculation over one or several previous AMR frame payload packets.
The number of AMR frames covered is determined by the redundancy
frame header field DEPTH.
The general rules for generating the parity bits can be found
in section 4.3.3.
The value of R_LEN can in principle be changed during transmission.
Assume that R_LEN changes from R_LEN1 to R_LEN2, with DEPTH being
constant. In that case, for a number of DEPTH AMR frame packets only
min(R_LEN1,R_LEN2) AMR frame payload bits can be reconstructed.
Although adaptation of R_LEN for redundancy frames works seamlessly,
it is RECOMMENDED not to perform such an adaptation on a
frame-by-frame basis.
The value of DEPTH can also be adapted during transmission. Assume
that DEPTH changes from DEPTH1 to DEPTH2. Reconstruction
capabilities are then reduced in the transition region for a number
of min(DEPTH1,DEPTH2) AMR frames. It is therefore RECOMMENDED to
choose a maximum value of DEPTH dependent on the application
(e.g. streaming services: large DEPTH, VoIP: low DEPTH) and to adapt
it only on a long term basis.
4.3.3. Encoding Rules for the Parity Bits
This section describes the encoding rules for the parity bits.
Notation:
n : number of the current AMR frame; n is increased for each
sent AMR frame packet. n denotes also the current
redundancy frame number.
o : number of an AMR frame whose payload covers fewer bits
than required by the current redundancy frame header
field R_LEN(n), i.e. R_LEN(n) > LEN(o).
g(n,m) : bit m in the AMR frame payload of frame n
p(n,m) : bit m in the redundancy frame payload of frame n
EXOR, XOR : bit-wise exclusive OR operation
R_LEN(n) : denotes the R_LEN field of the redundancy frame header of
frame n
The parity bits SHALL be calculated by the following equation:
p(n,m) = g(n-1,m) EXOR ... EXOR g(n-DEPTH+1,m) EXOR g(n-DEPTH,m)
(Eq. 2)
for m = 0 ... R_LEN(n)-1.
In Eq. 2 and in the rule below, LEN(.) and R_LEN(n) denote the
respective payload lengths in bits, i.e. R_LEN(n) is 8 times the
value of the R_LEN header field (compare Figure 9).
Eq. 2 requires that the payloads of all AMR frames n-i with
i = 1, ..., DEPTH are at least as long as the redundancy frame
payload. If this is not the case, the missing AMR frame payload
bits SHALL be virtually generated by the following rule.
if (o == n-DEPTH)
    g(o, LEN(o)+i) = 0, for i = 0 ... (R_LEN(n)-LEN(o)-1);
else
    if (R_LEN(n)-LEN(o) <= LEN(o-1))
        g(o, LEN(o)+i) = g(o-1, i), for i = 0 ... (R_LEN(n)-LEN(o)-1);
    else {
        g(o, LEN(o)+i) = g(o-1, i), for i = 0 ... (LEN(o-1)-1);
        g(o, LEN(o)+LEN(o-1)+i) = 0,
            for i = 0 ... (R_LEN(n)-LEN(o)-LEN(o-1)-1);
    }
This rule implies that virtual data SHALL be copied from the most
sensitive bits of the AMR frame payload preceding AMR frame o.
However, if the previous AMR frame (o-1) is outside the window
defined by the DEPTH parameter of the current redundancy frame, the
virtual data is set to 0. If the AMR frame payload (o-1) contains
fewer bits than required to fill all virtual bits of AMR frame
payload (o), then first all AMR frame payload bits of (o-1) SHALL
be taken and then the remaining virtual bits of AMR frame payload
(o) SHALL be set to 0.
Example:
In this example, see Figure 9, the AMR frame payload (n-2) does not
contain enough bits. Therefore the most sensitive bits of AMR frame
payload (n-3) are virtually appended to AMR frame payload (n-2)
until the desired length is reached.
time:     n-3                 n-2                n-1            n
      +----------+       +-----------+       +----------+   +--------+
      |          |- XOR -| g(n-2,m), |- XOR -|          | = |        |
      | g(n-3,m) |- XOR -| fill with |- XOR -| g(n-1,m) | = | p(n,m) |
      |          |- XOR -| g(n-3,m)  |- XOR -|          | = |        |
      +----------+       +-----------+       +----------+   +--------+
Figure 9: Example of parity bit generation for p(n,m) with DEPTH=3
and the number of AMR frame payload bits in frame n-2 being smaller
than 8*R_LEN(n).
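The parity rule of Eq. 2 together with the virtual bit generation
can be sketched as follows. This is an illustration under stated
assumptions, not a normative implementation: bits are stored one per
array element for readability, all lengths are counted in bits, and
the names frame_bits, window_bit and compute_parity are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const unsigned char *bits;  /* one element per bit, each 0 or 1 */
    size_t len;                 /* number of real payload bits */
} frame_bits;

/* Bit m of window entry k, extended with virtual bits as described in
   section 4.3.3. window[0] is frame n-1, window[depth-1] is frame
   n-DEPTH, so window[k+1] is the frame preceding window[k]. */
static unsigned char window_bit(const frame_bits *window, size_t depth,
                                size_t k, size_t m)
{
    const frame_bits *cur = &window[k];
    size_t i;
    if (m < cur->len)
        return cur->bits[m];    /* real payload bit */
    if (k + 1 == depth)
        return 0;               /* oldest frame: virtual bits are 0 */
    i = m - cur->len;           /* index into the preceding frame o-1 */
    return i < window[k + 1].len ? window[k + 1].bits[i] : 0;
}

/* Writes the parity bits p(n,0) ... p(n,parity_len-1) of Eq. 2. */
static void compute_parity(const frame_bits *window, size_t depth,
                           unsigned char *parity, size_t parity_len)
{
    size_t k, m;
    assert(depth >= 1 && depth <= 15);
    for (m = 0; m < parity_len; m++) {
        unsigned char p = 0;
        for (k = 0; k < depth; k++)
            p ^= window_bit(window, depth, k, m);
        parity[m] = p;
    }
}
```

In the situation of Figure 9 (DEPTH=3, frame n-2 too short), the
call compute_parity(window, 3, parity, parity_len) fills the missing
bits of frame n-2 from the start of frame n-3 before forming the
exclusive OR.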
4.3.4. Decoding of Redundancy Frame Payload
Decoding of these parity codes is intended to work as follows.
Imagine one frame of AMR encoded bits and one parity bit block per
frame. Every value of DEPTH >= 1 allows the reconstruction of a
single lost frame among the last DEPTH frames. DEPTH = 2 allows the
reconstruction of two consecutive lost frames, once two good frames
are received. In general, a number of DEPTH buffered packets allows
for the reconstruction of a number of DEPTH lost frames preceding
them. The set of equations given by the XOR operations is solved at
first for the last (!) lost frame (unknowns), using the DEPTH
buffered frames as knowns. Then the equations are solved for the
second-to-last lost frame, taking into account the already
reconstructed bits of the last lost frame, and so forth.
Here the tremendous strength of using parity codes instead of frame
repetition becomes obvious: especially for streaming applications, a
large value of DEPTH allows the reconstruction of error bursts of
up to DEPTH consecutive frames.
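The simplest decoding case, one lost frame inside a window that is
covered by a received parity block, can be sketched as below. The
sketch assumes that all frames in the window carry at least nbits
real payload bits, so that no virtual bits are involved, and it
stores one bit per array element. The function name
recover_single_loss is hypothetical. Burst recovery as described
above would apply the same step repeatedly, starting with the last
lost frame.

```c
#include <assert.h>
#include <stddef.h>

/* Reconstructs the single lost frame of a parity window.
   parity holds p(n,m); received[k] points to the bits of the k-th
   correctly received frame among frames n-1 ... n-DEPTH. */
static void recover_single_loss(const unsigned char *parity,
                                const unsigned char *const *received,
                                size_t received_count,
                                unsigned char *lost, size_t nbits)
{
    size_t k, m;
    assert(received_count <= 14);   /* DEPTH <= 15, one frame missing */
    for (m = 0; m < nbits; m++) {
        unsigned char b = parity[m];
        for (k = 0; k < received_count; k++)
            b ^= received[k][m];    /* remove the known frames */
        lost[m] = b;                /* what remains is the lost bit */
    }
}
```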
4.3.5. Implications for DTX and the choice of DEPTH
For delay reasons it is not advisable to buffer a large number
(DEPTH) of CNG frames in the receiver before previously lost
CNG frames or AMR frame payload packets containing speech data
can be reconstructed.
Thus the following rules SHALL apply:
o Starting with the second AMR frame containing one or several CNG
frames, DEPTH SHALL be set to at most 1 for all consecutive
redundancy frames containing CNG AMR frames.
o In the first and the second AMR frame containing no CNG after a
speech pause, DEPTH SHALL be set to at most 1.
These rules allow optimal recovery of lost AMR frames in DTX
operation, while keeping delay at a minimum.
4.4. Payload Block Sorting
In general a bit error in a more sensitive bit is subjectively more
annoying than in a less sensitive bit. To be able to protect the
most sensitive bits in AMR and redundancy frames with a forward
error detection code, e.g. a CRC outside RTP, the full RTP payload
data MUST be sorted in sensitivity order. The protection MAY then
cover an appropriate number of octets from the beginning of the AMR
and/or redundancy frames. How many octets are covered depends on
the channel and application. This can for example be accomplished
by UDP lite [6] (work in progress). To maintain sensitivity
ordering inside the AMR payload when more than one speech frame is
transmitted in one packet, reordering of the data is needed.
The reordering to maintain the sensitivity ordered AMR payload SHALL
be performed on bit level. The AMR payload header SHALL still be
placed unchanged at the beginning of the payload. Thereafter, the
payload frames are interleaved by taking one bit from each payload
frame in turn.
+-------------+
| h(0)-h(H-1) |
+-------------+----------+
| f(0,0) ... f(0,F(0)-1) |
+------------------------+---+
| f(1,0) ... f(1,F(1)-1)     |
+------------------------+---+
| f(2,0) ... f(2,F(2)-1) |
+------------------------+
\                        \
+------------------------+-----+
| f(N-1,0) ... f(N-1,F(N-1)-1) |
+------------------------------+
Figure 10: The payload header and N AMR/redundancy frames before
sorting.
The sorting algorithm is described using the following notation:
b(m) : bit m of RTP final payload
f(n,m) : bit m in AMR/redundancy frame payload of frame n
F(n) : number of bits in AMR/redundancy frame n, defined by FT
or by LEN/R_LEN
h(m) : bit m of RTP payload header
H : number of RTP payload header bits, 3 or 8 bits
N : number of AMR/redundancy frames in the RTP payload
S : number of unused bits
Payload frames f(n,m) are ordered consecutively, i.e. frame n=0
precedes frame n=1.
The sorting algorithm is defined in C-style as:
for (i = 0; i < H; i++)
    b(i) = h(i);
max = max(F(0),..,F(N-1));
k = H;
for (i = 0; i < max; i++){
    for (j = 0; j < N; j++){
        if (i < F(j)){
            b(k++) = f(j,i);
        }
    }
}
S = 8 - k%8;
if (S < 8){
    for (i = 0; i < S; i++)
        b(k++) = 0;
}
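The C-style description above can be turned into compilable C along
the following lines. This is a sketch under assumptions: input bits
are stored one per array element, output bits are packed with bit 0
as the most significant bit of each octet, and each frame's bit array
starts with its frame header bits, as the example in section 6.2
suggests. The names payload_frame, put_bit and sort_payload are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const unsigned char *bits;  /* one element per bit, each 0 or 1 */
    size_t len;                 /* F(n): number of bits in this frame */
} payload_frame;

/* Sets bit k (MSB first) in an octet buffer that was zeroed before. */
static void put_bit(uint8_t *out, size_t k, unsigned bit)
{
    if (bit)
        out[k / 8] |= (uint8_t)(0x80u >> (k % 8));
}

/* Writes the header bits followed by the bit-interleaved frames.
   Returns the number of octets used; out must hold out_cap octets. */
static size_t sort_payload(const unsigned char *hdr, size_t hdr_len,
                           const payload_frame *frames, size_t nframes,
                           uint8_t *out, size_t out_cap)
{
    size_t i, j, k = 0, max_len = 0, total = hdr_len;

    for (j = 0; j < nframes; j++) {
        total += frames[j].len;
        if (frames[j].len > max_len)
            max_len = frames[j].len;
    }
    assert((total + 7) / 8 <= out_cap);
    memset(out, 0, out_cap);    /* also sets the unused bits to zero */

    for (i = 0; i < hdr_len; i++)
        put_bit(out, k++, hdr[i]);
    for (i = 0; i < max_len; i++)           /* bit position in frame */
        for (j = 0; j < nframes; j++)       /* one bit from each frame */
            if (i < frames[j].len)
                put_bit(out, k++, frames[j].bits[i]);
    return (k + 7) / 8;
}
```

Zeroing the buffer first replaces the explicit padding loop of the
C-style description; the S unused bits in the last octet remain 0.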
5. RTP header usage
The RTP header marker bit (M) is set (M=1) in packets containing
the first speech frame after CN. In all other packets the marker
bit is set to 0 (M=0).
The timestamp corresponds to the sampling time of the first sample
encoded for the first encoded speech frame in the RTP payload. The
timestamp unit is in samples: one AMR speech frame covers 20 ms at
a sampling frequency of 8 kHz, which corresponds to 160 speech
samples per frame, i.e. the timestamp is increased by 160 for each
consecutive AMR speech frame.
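The timestamp arithmetic can be written out as a short sketch. The
function names samples_per_frame and amr_frame_timestamp are
hypothetical; the values 8000 Hz and 20 ms are those given in the
text.

```c
#include <assert.h>
#include <stdint.h>

/* Samples covered by one frame, e.g. 8000 Hz * 20 ms / 1000 = 160. */
static uint32_t samples_per_frame(uint32_t rate_hz, uint32_t frame_ms)
{
    assert((rate_hz * frame_ms) % 1000u == 0);
    return rate_hz * frame_ms / 1000u;
}

/* RTP timestamp of the frame that lies frame_offset frames after the
   frame stamped first_ts. Unsigned arithmetic wraps modulo 2^32, which
   matches the 32 bit RTP timestamp field. */
static uint32_t amr_frame_timestamp(uint32_t first_ts,
                                    uint32_t frame_offset)
{
    return first_ts + frame_offset * samples_per_frame(8000u, 20u);
}
```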
Due to DTX functionality each RTP packet SHALL carry the timestamp
of the first AMR frame covered by the RTP payload. Each AMR frame
containing CNG data, and the first AMR frame containing speech data
after CNG, SHALL start a new RTP packet. This is required to convey
the correct timing information.
Please consider also [12] for setting of particular parameters.
6. Examples
6.1. Simple example
In the simple example we just send one full (I=0) frame in each RTP
packet, no codec mode request CMR is sent (R=0), the payload was not
damaged at IP origin (Q=1). In this example we transmit one frame
encoded with the 5.9 kbps mode (FT=2). The speech encoded bits are
put into f(0) to f(117) in descending sensitivity order according to
[7].
    |                            Bit no.                            |
Oct.|   0       1       2       3       4       5       6       7   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  0 |  Q=1  |  I=0  |  R=0  |  F=0  |   0   |   0   |   0   |   1   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  1 |   0   | f(0)  | f(1)  | f(2)  |  ...  |  ...  |  ...  |  ...  |
----+-------+-------+-------+-------+-------+-------+-------+-------+
 .. |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |
----+-------+-------+-------+-------+-------+-------+-------+-------+
 15 |  ...  |  ...  |  ...  |  ...  | f(115)| f(116)| f(117)|   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
Figure 11: One frame per packet example.
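The packing of this simple example can be sketched in C as follows.
The sketch assumes that bit 0 of each octet is its most significant
bit and that the 118 speech bits are supplied one per array element,
already in sensitivity order. The names bit_writer, write_bits and
pack_simple_amr59 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical MSB-first bit writer; buf must be zeroed before use. */
typedef struct {
    uint8_t *buf;
    size_t pos;                 /* index of the next bit to write */
} bit_writer;

static void write_bits(bit_writer *bw, unsigned value, unsigned count)
{
    assert(count <= 16);
    while (count-- > 0) {
        if ((value >> count) & 1u)
            bw->buf[bw->pos / 8] |= (uint8_t)(0x80u >> (bw->pos % 8));
        bw->pos++;
    }
}

/* Packs one AMR 5.9 frame (FT=2, 118 bits) with Q=1, I=0, R=0, F=0.
   Returns the number of octets in the RTP payload. */
static size_t pack_simple_amr59(const unsigned char speech[118],
                                uint8_t out[16])
{
    bit_writer bw = { out, 0 };
    size_t m;
    memset(out, 0, 16);
    write_bits(&bw, 1, 1);      /* Q: payload not damaged */
    write_bits(&bw, 0, 1);      /* I: no L bit in the frame headers */
    write_bits(&bw, 0, 1);      /* R: no codec mode request */
    write_bits(&bw, 0, 1);      /* F: last frame in this payload */
    write_bits(&bw, 2, 5);      /* FT: 5.9 kbps mode */
    for (m = 0; m < 118; m++)
        write_bits(&bw, speech[m], 1);
    return (bw.pos + 7) / 8;    /* 127 bits plus one zero padding bit */
}
```

The header and speech bits add up to 3 + 6 + 118 = 127 bits, so the
payload occupies 16 octets and the first octet is always 0x81.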
6.2. Example with parity bits
In this example an AMR frame in 6.7 kbps mode (FT=3) is sent
together with one redundancy frame.
- The RTP payload header is set to Q=1, I=1, R=1 and CMR = 6. A mode
request is sent (R=1), requesting the 10.2 kbps mode for the other
link (CMR=6).
- The AMR frame header uses F=1, L=0 (this implies NO LEN field) and
FT = 3. The AMR frame header is followed by the AMR frame payload,
denoted by f(0) to f(133).
- The redundancy frame header is set to
- F = 0 (no following frames),
- L = 1 (R_LEN and DEPTH exist)
- R_FT = 3 (the 3 previous AMR frame header fields FT were 3),
- R_LEN = 2 (number of redundancy frame payload bits = 2*8 = 16)
- DEPTH = 3 (the 3 previous AMR frame payload packets are taken
for redundancy frame payload calculation)
The redundancy frame payload covers 16 bits, denoted r(0) to r(15).
    |                            Bit no.                            |
Oct.|   0       1       2       3       4       5       6       7   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  0 |  Q=1  |  I=1  |  R=1  |   0   |   0   |   1   |   1   |   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  1 |  F=1  |  F=0  |  L=0  |  L=1  |   0   |   0   |   0   |   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  2 |   0   |   0   |   1   |   1   |   1   |   1   | f(0)  |   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  3 | f(1)  |   0   | f(2)  |   0   | f(3)  |   0   | f(4)  |   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  4 | f(5)  |   1   | f(6)  |   0   | f(7)  |   0   | f(8)  |   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  5 | f(9)  |   1   | f(10) |   1   | f(11) | r(0)  | f(12) | r(1)  |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  6 | f(13) | r(2)  | f(14) | r(3)  |  ...  |  ...  |  ...  |  ...  |
----+-------+-------+-------+-------+-------+-------+-------+-------+
 .. |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |
----+-------+-------+-------+-------+-------+-------+-------+-------+
  9 | f(25) | r(14) | f(26) | r(15) | f(27) | f(28) | f(29) | f(30) |
----+-------+-------+-------+-------+-------+-------+-------+-------+
 .. |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |
----+-------+-------+-------+-------+-------+-------+-------+-------+
 22 | f(127)| f(128)| f(129)| f(130)| f(131)| f(132)| f(133)|   0   |
----+-------+-------+-------+-------+-------+-------+-------+-------+
Figure 12: Example with 1 AMR frame and 1 redundancy frame
7. References
[1] IETF RFC1889, "RTP: A Transport Protocol for Real-Time
Applications"
[2] GSM 06.90, "Adaptive Multi-Rate (AMR) speech transcoding"
[3] ARIB, RCR STD-27H, Section 5.4, "ACELP Speech CODEC"
[4] TIA/EIA IS-641-A, "TDMA Cellular/PCS - Radio interface, Enhanced
Full-Rate Voice Codec"
[5] GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding"
[6] IETF draft-larzon-udplite-02.txt, "The UDP Lite Protocol"
[7] 3G TS 26.101, "AMR Speech Codec Frame Structure"
[8] IETF draft-lakaniemi-avt-rtp-amr-00.txt, "RTP Payload Format
for AMR"
[9] IETF draft-sjoberg-avt-rtp-amr-00.txt, "RTP payload format
for AMR"
[10] 3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
Operation"
[11] RFC 2119, "Key words for use in RFCs to Indicate Requirement
Levels"
[12] IETF draft-wimmer-amr-01.txt, "MIME Type Registration for AMR
Speech Codec"
8. Authors' addresses
Tim Fingscheidt
Siemens AG, ICP CD
Grillparzerstrasse 10-18
D - 81675 Munich
Germany
Phone: ++49 89 722 57658
Fax: ++49 89 722 46489
E-mail: Tim.Fingscheidt@mch.siemens.de
Bernhard Wimmer (contact person)
Siemens AG, ICP CD
Grillparzerstrasse 10-18
D - 81675 Munich
Germany
Phone: ++49 89 722 23247
Fax: ++49 89 722 46489
E-mail: Bernhard.Wimmer@mch.siemens.de
This Internet-Draft expires January 14, 2001.
Full Copyright Statement
"Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF INFORMATION HEREIN
WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.