Internet Engineering Task Force Johan Sjoberg, Ericsson
Audio Video Transport WG Magnus Westerlund, Ericsson
INTERNET-DRAFT Ari Lakaniemi, Nokia
August 14, 2000 Petri Koskelainen, Nokia
Expires: February 14, 2001 Bernhard Wimmer, Siemens
Tim Fingscheidt, Siemens
RTP payload format for AMR
<draft-ietf-avt-rtp-amr-00.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or cite them other than as "work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This document is an individual submission to the IETF. Comments
should be directed to the authors.
Abstract
This document describes a proposed real-time transport protocol (RTP)
[8] payload format for AMR encoded speech signals [1]. The AMR
payload format is designed to be able to interoperate with existing
AMR transport formats. This document also includes a MIME type
registration for AMR. The MIME type is specified for both real-time
transport and storage.
Sjoberg/Westerlund/Lakaniemi/Koskelainen/Wimmer/Fingscheidt [Page 1]
INTERNET-DRAFT RTP Payload Format for AMR August 14, 2000
1. Introduction
The adaptive multi-rate (AMR) speech codec was developed by the
European Telecommunications Standards Institute (ETSI). The AMR codec
is standardized for GSM, and has also been chosen by 3GPP as the
mandatory codec for third generation systems. It is currently under
standardization for TDMA. The AMR codec will therefore be widely used
in cellular systems. The AMR codec was developed to preserve high
speech quality under a wide range of transmission conditions.
The AMR codec is a multi-mode codec with 8 narrow band modes with bit
rates between 4.75 and 12.2 kbps. The sampling frequency is 8000 Hz
and processing is done on 20 ms frames, i.e. 160 samples per frame.
The AMR modes are closely related to each other and use the same
coding framework. Three of the AMR modes have already been adopted as
standards of their own: the 6.7 kbps mode as PDC-EFR [7], the 7.4
kbps mode as the IS-641 codec in TDMA [6], and the 12.2 kbps mode as
GSM-EFR [5].
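As a rough illustration of the figures above, the number of encoded bits in one 20 ms frame follows directly from the mode bit rate. The sketch below is not part of the payload format. The function names are hypothetical, and the rate table is an assumption taken from the mode table in [2].

```c
/* Bit rates of the eight AMR speech modes in bit/s, indexed by mode 0..7.
 * Assumption: values as listed in the mode table of [2]. */
static const unsigned amr_rate_bps[8] = {
    4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200
};

/* Samples per frame: 8000 Hz sampling frequency, 20 ms frames. */
static unsigned amr_samples_per_frame(void)
{
    return 8000u * 20u / 1000u;
}

/* Encoded bits in one 20 ms frame for speech mode 0..7.
 * Returns 0 if the mode is out of range. */
static unsigned amr_bits_per_frame(unsigned mode)
{
    if (mode > 7)
        return 0;
    return amr_rate_bps[mode] * 20u / 1000u;
}
```

For example, the 5.9 kbps mode (mode 2) gives 118 bits per frame and the 6.7 kbps mode (mode 3) gives 134 bits, which are the frame sizes used in the examples of section 6.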
The AMR codec is designed with a voice activity detector (VAD) and
generation of comfort noise (CN) parameters during silence periods.
Hence, the AMR codec can reduce the number of transmitted bits and
packets during silence periods to a minimum. The operation to send CN
parameters at regular intervals during silence periods is usually
called discontinuous transmission (DTX) or source controlled rate
(SCR) operation.
AMR implementations must support all 8 speech coding modes, and mode
switching can occur to any mode at any time. The mode information
must therefore be transmitted together with the speech encoded bits,
to indicate the mode. The AMR speech codec is designed with modes
producing different bit rates to be able to adapt the source bit rate
according to the radio link quality in mobile phone systems. The
objective was to give highest possible speech quality under a variety
of radio channel conditions. To realize rate adaptation the decoder
needs to signal the mode it prefers to receive to the encoder.
Due to the flexibility and robustness of AMR, it is suitable also for
other purposes than circuit switched cellular systems. Other suitable
applications are real-time services over packet switched networks,
e.g. over RTP. To be optimized for transmission over networks with
high packet loss rates, the possibility to use extra redundancy is
built into the RTP payload format for AMR. The speech encoded bits
have different perceptual sensitivity to bit errors and cellular
systems exploit this by using unequal error protection and detection
(UEP and UED). This mechanism concentrates the correction and
detection of corrupted bits to the perceptually most sensitive bits.
A frame is only regarded as lost or damaged if errors are detected in
the most sensitive bits. The UED can also be employed on RTP if UDP
lite is used as transport layer protocol (UDP lite [10] is work in
progress). To enable this, the bits in the payload have to be ordered
in sensitivity order. The AMR encoded bits are defined in sensitivity
order in [2]. If the receiver supports the option to retransmit redundant
frames, the different sensitivity could also be used for transmitting
only the most sensitive bits of a redundant frame. The special
problems with IP real-time traffic over cellular access networks are
further discussed in [9].
Other AMR scenarios are possible, e.g. one end is a circuit switched
GSM terminal, connected through a gateway to an IP network with an IP
terminal at the other end. To improve quality, frames damaged on the
GSM radio link should also be transmitted to the decoder in the IP
network.
To make this possible, frame quality information has to be
transmitted over the IP network. The quality bit is also needed for
the AMR RTP payload format to interwork with for example the ATM AAL2
AMR profile.
2. Requirements
The AMR payload format for RTP was designed to meet the following
requirements:
o Different levels of robustness must be supported, from no
redundant data to extreme robustness capable of handling very
high packet loss rates with no or small speech quality
degradation.
o Fast, bandwidth efficient, frame-wise AMR mode adaptation must
be supported. This means that it must be possible to send Codec
Mode Requests back from the receiving side to the transmitting
side with information on the preferred mode.
o Source controlled rate operation (SCR) (also called DTX) and
comfort noise parameter (CN) transmission defined in AMR must be
supported.
3. Payload format
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [3].
The AMR payload format is designed to be flexible, ranging from very
low overhead to an extended format with the possibility to send
redundancy information and several speech frames in one packet.
The payload format consists of a payload header and one or more
payload frames. Neither the payload header nor the payload frames are
octet aligned on their own, but the full payload is. If the option to
transmit robust sorted payload is enabled and employed, the full
payload SHALL finally be ordered in descending bit error sensitivity
order to be prepared for unequal error protection or unequal error
detection schemes, e.g. UDP lite [10]. The AMR encoded bit streams
are defined in sensitivity order in Annex B of [2], the original
order as delivered from the speech encoder is defined in [1].
The last octet of an AMR payload packet is padded with zeroes at the
end if not all bits are used.
The AMR frame types, or modes, are defined in [2]. Frame type 15, no
transmission, is needed to indicate not transmitted frames or lost
frames. Not transmitted could mean both no data produced by the
speech encoder for this frame or no data transmitted in this payload,
i.e. valid data for this frame could be sent in another payload. This
occurs, for example, when multiple frames are sent in each payload
and comfort noise starts. A frame type sequence in a payload with 8
frames, where speech frames with AMR mode 7 are interrupted by CN in
the fifth frame, could look like: {7,7,7,7,8,15,15,8}. The AMR SCR is
described in [4].
The AMR payload format supports robust transmission, multiple frames
in one payload packet, and the use of fast codec mode adaptation.
The robust behavior is accomplished by using the optional possibility
to retransmit previously transmitted frames together with the current
frame or frames. The redundant frames could be transmitted in their
entirety or only partly. If only a part of the redundant frame is
transmitted, the least sensitive bits are omitted. A partially
transmitted redundant frame SHALL fill the number of used octets for
that frame. The bits in the payload are sorted in descending
sensitivity order to support UED, like in UDP lite [10], if partial
redundancy is used. Each full AMR speech frame SHALL be transmitted
at least once.
The bits in redundant frames that are not transmitted MUST be
reconstructed on the receiver side when the partial redundant frame
is used for speech decoding. It is RECOMMENDED to produce the
non-received bits with state of the art error concealment unit (ECU)
actions. Methods that result in worse quality than randomly generated
bits SHOULD NOT be used. The use of a fixed pattern SHOULD be avoided
for speech quality reasons.
A frame quality indicator is included for interoperability with the
ATM payload format described in ITU-T I.366.2, the UMTS Iu interface
[13] and other transport formats. The speech quality is significantly
increased if damaged frames are forwarded to the speech decoder error
concealment unit and not dropped. In many communication scenarios the
AMR encoded bits will be transmitted from one IP/UDP/RTP terminal to
a terminal in a system with another transport format and/or vice
versa. The transport format transcoding will be done in a gateway. A
second likely scenario is that IP/UDP/RTP is used as transport
between other systems, i.e. IP is originated and terminated in
gateways on both sides of the IP transport.
AMR over
I.366.{2,3} or +------+                        +----------+
3G Iu or       |      |     IP/UDP/RTP/AMR     |          |
-------------->|  GW  |----------------------->| TERMINAL |
GSM Abis       |      |                        |          |
etc.           +------+                        +----------+
Figure 1: GW to VoIP terminal scenario
AMR over                                              AMR over
I.366.{2,3} or +------+                     +------+  I.366.{2,3} or
3G Iu or       |      |   IP/UDP/RTP/AMR    |      |  3G Iu or
-------------->|  GW  |-------------------->|  GW  |--------------->
GSM Abis       |      |                     |      |  GSM Abis
etc.           +------+                     +------+  etc.
Figure 2: GW to GW scenario
3.1. The payload header
The payload header has dynamic length, 3 or 6 bits. The bits in the
header are specified as follows:
S (1 bit): Indicates, if set, that the payload is robust sorted;
otherwise simple payload sorting is employed. Note that this bit can
be set only if the receiver has signaled support for the robust
payload sorting option.
L (1 bit): Indicates the existence of LEN fields in the payload
frames. Note that this bit can be set only if the receiver has
signaled support for the option to transmit redundant data.
R (1 bit): Indicates, if set, that the Codec Mode Request (CMR) is
sent.
CMR (3 bits): OPTIONAL field, depending on the R bit. Requested codec
mode for the other communication direction. The mapping of existing
AMR modes to CMR is given by the three least significant bits in
Table 1a in [2].
 0
 0 1 2
+-+-+-+
|S|L|R|
+-+-+-+
Figure 3: AMR payload header, when R=0
 0
 0 1 2 3 4 5
+-+-+-+-+-+-+
|S|L|R| CMR |
+-+-+-+-+-+-+
Figure 4: AMR payload header, when R=1
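The header layout above can be read with a few shifts and masks. The following is a minimal sketch, not a normative parser. The type and function names are hypothetical, and it assumes that bit 0 in the figures is the most significant bit of the first payload octet.

```c
#include <stdint.h>

/* Parsed AMR payload header (hypothetical helper type). */
struct amr_hdr {
    unsigned s;      /* 1 = robust sorted payload */
    unsigned l;      /* 1 = LEN fields present in payload frames */
    unsigned r;      /* 1 = CMR field present */
    unsigned cmr;    /* requested codec mode, meaningful only if r == 1 */
    unsigned nbits;  /* header length H in bits: 3 or 6 */
};

/* Assumption: bit 0 in the figures is the MSB of the first octet. */
static struct amr_hdr amr_parse_hdr(uint8_t first_octet)
{
    struct amr_hdr h;
    h.s = (first_octet >> 7) & 1u;
    h.l = (first_octet >> 6) & 1u;
    h.r = (first_octet >> 5) & 1u;
    h.cmr = h.r ? (first_octet >> 2) & 7u : 0u;
    h.nbits = h.r ? 6u : 3u;
    return h;
}
```

Because the payload header is placed unchanged at the beginning of the payload for both sorting methods, the same routine would apply whether or not S is set.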
3.2. AMR payload frame
An AMR payload frame represents one encoded speech frame. Each payload
frame includes several specified fields as follows:
F (1 bit): Indicates if this frame is followed by further frames. F=1
further frames follow, F=0 last frame.
Q (1 bit): The payload quality bit indicates, if not set, that the
payload is severely damaged and the receiver should set the RX_TYPE,
see [4], to SPEECH_BAD or SID_BAD depending on the frame type (FT).
FT (4 bits): Frame type indicator, indicating the AMR speech coding
mode or comfort noise (CN) mode. The mapping of existing AMR modes to
FT is given in Table 1a in [2]. If FT=15 (No transmission) no LEN or
AMR encoded bits follow.
LEN (5 bits): OPTIONAL field, exists if the payload header bit L is
set, L=1. LEN specifies the number of octets used for the AMR encoded
bits field in this frame. If LEN indicates more bits than the frame
size implied by the AMR mode in the FT field, the frame size implied
by FT is the valid number of AMR encoded bits. If LEN indicates fewer
bits than the frame size implied by FT, LEN gives the number of
encoded bits, in octets. If a frame is transmitted only partially,
the least sensitive bits at the end of the frame are omitted. This
use is intended for partial redundant data.
AMR encoded bits: This is the speech codec encoded data field. The
length of this field is either defined implicitly by the AMR mode in
the FT field, or by the LEN field. The last payload frame SHALL
always contain a full AMR frame, i.e. no LEN field is needed or used.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|Q|  FT   |   LEN   |                                         |
+-+-+-+-+-+-+-+-+-+-+-+                                         +
|                                                               |
+                                                               +
/                       AMR encoded bits                        /
+                                                 +-+-+-+-+-+-+-+
|                                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: Payload frame format, F=1 and L=1
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|Q|  FT   |                                                   |
+-+-+-+-+-+-+                                                   +
|                                                               |
+                                                               +
/                       AMR encoded bits                        /
+                                             +-+-+-+-+-+-+-+-+-+
|                                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6: Payload frame format, F=0 or L=0
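The interaction between FT and LEN described above amounts to taking the smaller of two bit counts. The sketch below illustrates this under one assumption: it follows the LEN definition literally and counts only the AMR encoded bits field. The function name is hypothetical.

```c
/* Number of AMR encoded bits carried in one payload frame.
 * ft_bits: full frame size in bits implied by FT (0 for FT=15).
 * has_len: nonzero if a LEN field is present for this frame.
 * len:     value of the 5-bit LEN field, in octets. */
static unsigned amr_payload_bits(unsigned ft_bits, int has_len, unsigned len)
{
    unsigned len_bits;

    if (!has_len)
        return ft_bits;          /* length is implicit from FT */
    len_bits = len * 8u;
    /* LEN can shorten a frame but never extend it beyond the FT size. */
    return len_bits < ft_bits ? len_bits : ft_bits;
}
```

With a 118 bit frame, LEN=10 would yield 80 encoded bits (a partial frame), while LEN=15 would yield the full 118 bits since 120 exceeds the size implied by FT.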
3.3. Compound AMR payload
The compound AMR payload consists of one AMR payload header and one
or more AMR payload frames, see section 3.1. and 3.2. These can be
put together with robust or simple payload sorting. The payload
header bit S indicates the method used.
Definitions for describing the compound AMR payload:
b(m) - bit m of the compound AMR payload
f(n,m) - bit m in payload frame n
F(n) - number of bits in payload frame n, defined by FT or by LEN
h(m) - bit m of payload header
H - number of payload header bits, 3 or 6 bits
N - number of payload frames in the payload
S - number of unused bits
Payload frames f(n,m) are ordered in consecutive order, where frame
n=1 is preceding frame n=2. Within one payload all frames between the
oldest and most recent must be present. If speech data is missing for
one frame, due to e.g. DTX, send the NO_TRANSMISSION frame type.
Before sorting, the payload consists of data ordered as described in
Figure 7.
+-------------------------------+
| h(0) ... h(H-1)               |
+-------------------------------+
| f(0,0) ... f(0,F(0)-1)        |
+-------------------------------+
| f(1,0) ... f(1,F(1)-1)        |
+-------------------------------+
| f(2,0) ... f(2,F(2)-1)        |
+-------------------------------+
\                               \
+-------------------------------+
| f(N-1,0) ... f(N-1,F(N-1)-1)  |
+-------------------------------+
Figure 7: The payload header and N payload frames before sorting.
3.3.1. Robust payload sorting
A bit error in a more sensitive bit is subjectively more annoying
than in a less sensitive bit. Therefore, to be able to protect the
most sensitive bits in a payload packet with a forward error
detection code, e.g. a CRC outside RTP, the bits inside a frame are
ordered into sensitivity order. If the option to transmit redundant
data is employed, the full RTP payload MUST be further sorted into
sensitivity order. The protection SHOULD then cover an appropriate
number of octets from the beginning of the payload, covering at least
the AMR payload header, F, Q, FT, LEN bits and class A bits (see
[2]). Exactly how many octets need protection depends on the
channel and application. To maintain sensitivity ordering inside the
AMR payload, when more than one speech frame is transmitted in one
payload, reordering of the data is needed.
The reordering to maintain the sensitivity ordered AMR payload SHALL
be performed on bit level. The AMR payload header SHALL still be
placed unchanged in the beginning of the payload. Thereafter, the
payload frames are sorted with one bit alternating from each payload
frame.
The robust payload sorting algorithm is defined in C-style as:
for (i = 0; i < H; i++){
   b(i) = h(i);
}
max = max(F(0),..,F(N-1));
k = H;
for (i = 0; i < max; i++){
   for (j = 0; j < N; j++){
      if (i < F(j)){
         b(k++) = f(j,i);
      }
   }
}
S = 8 - k%8;
if (S < 8){
   for (i = 0; i < S; i++){
      b(k++) = 0;
   }
}
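For readers who want to experiment with the interleaving, the algorithm above can be written as compilable C. This is only a sketch: the function name and the representation of bits as one array element per bit are assumptions made for readability, not part of the format.

```c
#include <stddef.h>

/* Robust payload sorting over bit arrays (one bit per array element).
 * hdr/H:        payload header bits and their count (3 or 6).
 * frames[j]/F[j]: bits and bit count of payload frame j.
 * N:            number of payload frames.
 * out:          buffer with room for H + sum(F) + 7 elements.
 * Returns the padded payload length in bits (a multiple of 8). */
static size_t amr_robust_sort(const unsigned char *hdr, size_t H,
                              const unsigned char *const *frames,
                              const size_t *F, size_t N,
                              unsigned char *out)
{
    size_t i, j, k = 0, longest = 0;

    /* The header is copied unchanged to the start of the payload. */
    for (i = 0; i < H; i++)
        out[k++] = hdr[i];

    for (j = 0; j < N; j++)
        if (F[j] > longest)
            longest = F[j];

    /* Take bit i of every frame that still has a bit i, frame by frame. */
    for (i = 0; i < longest; i++)
        for (j = 0; j < N; j++)
            if (i < F[j])
                out[k++] = frames[j][i];

    /* Zero padding up to the next octet boundary. */
    while (k % 8 != 0)
        out[k++] = 0;
    return k;
}
```

With a 3 bit header and two frames of 3 and 5 bits, the output holds the header, then alternating bits from both frames while both have bits left, then the remaining bits of the longer frame, padded to 16 bits.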
3.3.2. Simple payload sorting
If multiple new frames are encapsulated into the payload and robust
payload sorting is not used, the payload is formed by concatenating
the payload header and the bits from each AMR frame in the payload.
However, the bits inside a frame are ordered into sensitivity order
as defined in [2].
The simple payload sorting algorithm is defined in C-style as:
for (i = 0; i < H; i++){
   b(i) = h(i);
}
k = H;
for (j = 0; j < N; j++){
   for (i = 0; i < F(j); i++){
      b(k++) = f(j,i);
   }
}
S = 8 - k%8;
if (S < 8){
   for (i = 0; i < S; i++){
      b(k++) = 0;
   }
}
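The concatenation can also be shown together with the packing of bits into octets. The sketch below is illustrative only. The function names are hypothetical, input bits are held one per array element, and it assumes that bit 0 of each octet in the figures is the most significant bit.

```c
#include <stddef.h>
#include <string.h>

/* Set bit number k of an octet buffer, most significant bit first.
 * Assumption: bit 0 of each octet in the figures is the MSB. */
static void put_bit(unsigned char *buf, size_t k, unsigned bit)
{
    if (bit)
        buf[k / 8] |= (unsigned char)(0x80u >> (k % 8));
}

/* Simple payload sorting: header bits, then each frame's bits in order.
 * out must hold at least (H + sum(F) + 7) / 8 octets.
 * Returns the payload length in octets. */
static size_t amr_simple_pack(const unsigned char *hdr, size_t H,
                              const unsigned char *const *frames,
                              const size_t *F, size_t N,
                              unsigned char *out)
{
    size_t i, j, k = 0, total = H, octets;

    for (j = 0; j < N; j++)
        total += F[j];
    octets = (total + 7) / 8;
    memset(out, 0, octets);      /* unused trailing bits stay zero */

    for (i = 0; i < H; i++)
        put_bit(out, k++, hdr[i]);
    for (j = 0; j < N; j++)
        for (i = 0; i < F[j]; i++)
            put_bit(out, k++, frames[j][i]);
    return octets;
}
```

Clearing the buffer first makes the final padding implicit, so no separate padding loop is needed in this variant.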
3.4. Decoding security consideration
If the payload length calculated from the F, FT and LEN fields does
not match the actually received payload size, the payload MUST be
dropped. Decoding a packet that has errors in length indicator bits
could severely degrade the speech quality.
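A receiver-side check along these lines might look as follows. This is a sketch under the assumption that the caller has already walked the F, FT and LEN fields and derived the bit count of every payload frame. The function name is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if a payload of recv_octets octets is consistent with the
 * bit count derived from the header and frame fields, 0 otherwise.
 * hdr_bits:      payload header length, 3 or 6.
 * frame_bits[n]: total bits of payload frame n, i.e. F, Q, FT, the
 *                optional LEN field and the AMR encoded bits. */
static int amr_length_ok(size_t hdr_bits, const size_t *frame_bits,
                         size_t nframes, size_t recv_octets)
{
    size_t n, total = hdr_bits;

    for (n = 0; n < nframes; n++)
        total += frame_bits[n];
    /* The payload is padded with zero bits up to a whole octet. */
    return (total + 7) / 8 == recv_octets;
}
```

A payload for which this returns 0 would be dropped rather than passed to the decoder.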
4. RTP header usage
The RTP header marker bit (M) is used to mark (M=1) the packets
containing the first speech frame after CN. For all other packets
the marker bit is set to 0 (M=0).
The timestamp corresponds to the sampling time of the first sample
encoded for the first encoded speech frame in the packet. The
timestamp unit is in samples. The duration of one AMR speech frame is
20 ms and the sampling frequency is 8 kHz, corresponding to 160
encoded speech samples per frame. Thus, the timestamp is increased by
160 for each consecutive frame. All frames in a packet MUST be
successive 20 ms frames.
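Since every frame covers 160 samples, the sampling time of any frame in a packet follows from the packet timestamp. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

#define AMR_SAMPLES_PER_FRAME 160u   /* 20 ms at 8 kHz */

/* Timestamp of frame number idx (0 = first frame) in a packet whose RTP
 * timestamp is pkt_ts. The uint32_t result wraps modulo 2^32, as RTP
 * timestamps do. */
static uint32_t amr_frame_ts(uint32_t pkt_ts, uint32_t idx)
{
    return (uint32_t)(pkt_ts + idx * AMR_SAMPLES_PER_FRAME);
}
```

For a packet carrying N new frames, amr_frame_ts(pkt_ts, N) is then the timestamp the next packet would carry if transmission continues without a gap.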
5. Congestion Control
The need for congestion control for data transported with RTP is
addressed in [14]. AMR speech data have some elastic properties due
to the different bandwidth demand for each mode. Another parameter
that can reduce the bandwidth demand for AMR is the number of frames
of speech data encapsulated in each payload. Sending more frames per
payload reduces the number of packets and the overhead from
IP/UDP/RTP headers. If FEC is used, its amount also needs to be
regulated so that the FEC itself does not worsen the problem.
Therefore, it is RECOMMENDED that applications using this payload
format implement congestion control. The actual mechanism for
congestion control is not specified but should be suitable for
real-time flows, e.g. "Equation-Based Congestion Control for Unicast
Applications" [15].
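The effect of the number of frames per packet on the total bit rate can be estimated with simple arithmetic. The sketch below is illustrative only. It assumes 40 octets of header per packet (IPv4 20, UDP 8, RTP 12) with no IP options, no CSRC list and no header compression, and the function name is hypothetical.

```c
/* Gross bit rate in bit/s when every packet carries `frames` 20 ms
 * frames in `payload_octets` octets of AMR payload.
 * Assumption: 40 octets of IPv4 + UDP + RTP headers per packet.
 * Returns 0 if frames is 0. */
static unsigned long amr_gross_bps(unsigned payload_octets, unsigned frames)
{
    const unsigned long hdr_octets = 40;
    unsigned long bits_per_packet;

    if (frames == 0)
        return 0;
    bits_per_packet = (hdr_octets + payload_octets) * 8ul;
    /* One packet is sent every frames * 20 ms. */
    return bits_per_packet * 1000ul / (frames * 20ul);
}
```

Under these assumptions, one 5.9 kbps frame per packet (16 octets of payload) costs 22400 bit/s on the wire, while two such frames per packet (32 octets of payload) cost 14400 bit/s.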
6. Examples
6.1. Simple example
In the simple example we just send one full (L=0) frame in each RTP
packet, no Codec Mode Request CMR is sent (R=0), the payload was not
damaged at IP origin (Q=1). In this example we transmit one frame
encoded with the 5.9 kbps mode (FT=2). The speech encoded bits are
put into f(0) to f(117) in descending sensitivity order according to
[2]. Simple payload sorting is used, S=0.
| Bit no. |
Oct| 0 1 2 3 4 5 6 7 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
0 | S=0 | L=0 | R=0 | F=0 | Q=1 | 0 | 0 | 1 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
1 | 0 | f(0) | f(1) | f(2) | ... | ... | ... | ... |
---+-------+-------+-------+-------+-------+-------+-------+-------+
15 | ... | ... | ... | ... | f(115)| f(116)| f(117)| 0 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
Figure 8: One frame per packet example.
6.2. Example with partial redundancy
In this example the 6.7 kbps mode (FT=3) is sent with one redundant
frame, also FT=3. Only a part of the redundant frame is sent, in this
example 12 octets, (L=1, LEN=12). A mode request is sent (R=1),
requesting the 10.2 kbps mode for the other link (CMR=6). The
redundant frame (12 octets) including FT is r(0) to r(91) and the
current frame (134 bits) is f(0) to f(133).
| Bit no. |
Oct| 0 1 2 3 4 5 6 7 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
0 | S=1 | L=1 | R=1 | 1 | 1 | 0 | F=1 | F=0 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
1 | Q=1 | Q=1 | 0 | 0 | 1 | 0 | 1 | 1 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
2 | 0 | 1 | 0 | f(0) | 0 | f(1) | 0 | f(2) |
---+-------+-------+-------+-------+-------+-------+-------+-------+
3 | 1 | f(3) | 1 | f(4) | r(0) | f(5) | r(1) | f(6) |
---+-------+-------+-------+-------+-------+-------+-------+-------+
4 | r(2) | f(7) | r(3) | f(8) | ... | ... | ... | ... |
---+-------+-------+-------+-------+-------+-------+-------+-------+
26 | r(90) | f(95) | r(91) | f(96) | f(97) | f(98) | ... | ... |
---+-------+-------+-------+-------+-------+-------+-------+-------+
30 | ... | ... | ... | ... | ... | ... | f(131)| f(132)|
---+-------+-------+-------+-------+-------+-------+-------+-------+
31 | f(133)| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
Figure 9: Example with partial redundancy.
6.3. Example with multiple frames per payload
In this example two 5.9 kbps mode (FT=2) frames are sent in one
packet. No partial redundancy is used (L=0). A mode request is
sent (R=1), requesting the 7.95 kbps mode for the other link (CMR=5).
The first frame is represented by the 118 bits f(0) to f(117) and the
subsequent frame by g(0) to g(117). Robust sorting is not used.
| Bit no. |
Oct| 0 1 2 3 4 5 6 7 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
0 | S=0 | L=0 | R=1 | 1 | 0 | 1 | F=1 | Q=1 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
1 | 0 | 0 | 1 | 0 | f(0) | f(1) | ... | ... |
---+-------+-------+-------+-------+-------+-------+-------+-------+
15 | ... | ... | ... | ... | ... | ... | ... | f(115)|
---+-------+-------+-------+-------+-------+-------+-------+-------+
16 | f(116)| f(117)| F=0 | Q=1 | 0 | 0 | 1 | 0 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
17 | g(0) | g(1) | g(2) | ... | ... | ... | ... | ... |
---+-------+-------+-------+-------+-------+-------+-------+-------+
31 | ... | ... | ... | ... | g(116)| g(117)| 0 | 0 |
---+-------+-------+-------+-------+-------+-------+-------+-------+
Figure 10: Example with two frames per payload.
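The octet and bit positions in the figure can be cross-checked with a small index calculation. The sketch below applies only to simple payload sorting without LEN fields, where each frame carries the same number of encoded bits after its 6 bits of F, Q and FT. The type and function names are hypothetical.

```c
#include <stddef.h>

/* Position of a payload bit in the octet / bit number grid of the
 * example figures. */
struct bit_pos { size_t octet; unsigned bit; };

static struct bit_pos amr_bit_pos(size_t k)
{
    struct bit_pos p;
    p.octet = k / 8;
    p.bit = (unsigned)(k % 8);
    return p;
}

/* Payload bit index of encoded bit m of frame n, for a header of H
 * bits and frames of `bits` encoded bits each (simple sorting, L=0). */
static size_t amr_simple_index(size_t H, size_t bits, size_t n, size_t m)
{
    return H + n * (6 + bits) + 6 + m;
}
```

With H=6 and 118 bits per frame, the last bit of the first frame falls in octet 16, bit 1, and the first bit of the second frame in octet 17, bit 0, in agreement with the figure.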
7. The AMR MIME type registration
This chapter defines the MIME type for the Adaptive Multi-Rate (AMR)
speech codec [1]. The data format and parameters are specified for
both real-time transport and for storage type applications (e.g.
e-mail attachment, multimedia messaging). The former is referred to
as RTP mode and the latter as storage mode.
AMR implementations according to [1] MUST support all eight coding
modes. The mode change can occur at any time during operation and
therefore the mode information is transmitted in-band together with
speech bits to allow mode change without any additional signaling.
In addition to the speech codec, AMR specifications also include
Discontinuous Transmission / comfort noise (DTX/CN) functionality
[11]. The DTX/CN switches the transmission off during silent parts of
the speech and only CN parameter updates are sent in regular
intervals.
7.1. RTP mode
It is possible that the decoder may want to receive a certain AMR
mode or a subset of AMR modes, due to link limitations in some
cellular systems, e.g. the GSM radio link can only use a subset of
at most four modes. Therefore, it is possible to request a specific
set of AMR modes in the capability description, and the encoder MUST
abide by this request. If no mode set is requested, any mode may be
used or requested.
Although in principle the AMR codec can perform a mode change at any
time between any two modes, it is possible to set limitations for
mode changes. The decoder has the possibility to define the minimum
number of frames between mode changes and to restrict mode changes
to neighboring modes only. This is also motivated by limitations on
the GSM radio link.
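The neighboring-mode restriction can be illustrated with a check over a mode set held as a bitmask. This is a sketch of one possible interpretation: two modes are treated as neighbors when no other member of the active mode set lies between them. The function name and the bitmask representation are assumptions.

```c
/* Returns 1 if switching from mode `from` to mode `to` stays within the
 * active mode set and moves at most one step within it, 0 otherwise.
 * mode_set: bitmask, bit i set if mode i (0..7) is in the set. */
static int amr_neighbor_ok(unsigned mode_set, unsigned from, unsigned to)
{
    unsigned lo, hi, m;

    if (from > 7 || to > 7)
        return 0;
    if (!((mode_set >> from) & 1u) || !((mode_set >> to) & 1u))
        return 0;
    lo = from < to ? from : to;
    hi = from < to ? to : from;
    /* A set member strictly between the two means they are not neighbors. */
    for (m = lo + 1; m < hi; m++)
        if ((mode_set >> m) & 1u)
            return 0;
    return 1;
}
```

For the mode set {0,2,5,7} (bitmask 0xA5), a change from mode 2 to mode 5 would be accepted, while a change from mode 0 to mode 5 would not, because mode 2 lies between them.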
It is also possible to limit the number of AMR frames encapsulated
into one RTP packet. This is an optional feature and if no parameter
is given in capability description, the transmitter can encapsulate
any number of AMR speech frames into one RTP packet.
There is also an option to retransmit one or more previously
transmitted frames together with a new frame to help the receiver to
recover from packet losses in difficult transmission conditions. It
is also possible to transmit these frames only partially in such a
way that only the most sensitive bits are retransmitted. Since the
transmission of partly redundant frames is an optional property, it
can be used only if the receiver has signaled support for this
functionality in capability description. The partial redundancy is
RECOMMENDED to be implemented and turned on at least for
conversational services.
To support unequal error protection and/or detection the payload
format supports robust payload sorting. The robust payload sorting is
an optional feature and can only be used if the receiver has signaled
support for this functionality in capability description.
7.2. Storage mode
For storing AMR frames, e.g. as a file or e-mail attachment, the AMR
frames must be encapsulated in consecutive compound AMR payloads, see
chapter 3. Some limitations of the storage format are needed, since
no particular coding considerations can be signaled before
downloading or receiving stored AMR data and no timestamp information
is available in the file. The receiving entity (AMR decoder) MUST be
able to decode all eight coding modes as well as the AMR DTX/CN [11].
The compound AMR payload SHALL be stored without partial redundancy
and with simple payload sorting, see section 3.3. Frames that are not
transmitted, for example during DTX, MUST be stored as
NO_TRANSMISSION frames to keep synchronization with the original
media.
7.3. MIME Registration
The MIME name for the AMR codec is allocated from the IETF tree since
AMR is expected to be a widely used speech codec in VoIP
applications. Some parts of this chapter will distinguish between RTP
and storage modes.
Media Type name: audio
Media subtype name: AMR
Required parameters: none
Optional parameters for RTP mode:
ptime: Definition as usual in RTP audio.
mode-set: Requested AMR mode set. Restricts the active codec mode
      set to a subset of all modes. Possible values are a comma
      separated list of modes: 0,...,7 (see Table 1a in [2]; an
      example is given in section 7.4). If not present, all
      speech modes are available.
mode-change-period: Defines a number N which restricts the mode
      changes in such a way that mode changes are only allowed
      on multiples of N; the initial state of the phase is
      arbitrary. If this parameter is not present, mode change
      can happen at any time.
mode-change-neighbor: If present, mode changes SHALL only be made to
      neighboring modes in the active codec mode set. If not
      present, change between any two modes is allowed.
maxframes: Maximum number of AMR speech frames in one RTP packet.
      The receiver may set this parameter in order to limit
      the buffering requirements or delay.
redundancy: If present, transmission of partly redundant frames is
      supported, otherwise not supported.
robust-sorting: If present, robust payload sorting is supported,
      otherwise not supported and simple payload sorting SHALL
      be used.
Optional parameters for storage mode: none
Encoding considerations for RTP mode: See section 3 in this document.
Encoding considerations for storage mode: The AMR speech frames are
packed into consecutive compound AMR payloads, see section 3. The
compound AMR payloads must be stored in sequential order. This
implies that the first octet after payload n must be the first octet
of payload (n+1). Furthermore, missing frames and non-received frames
between CN updates during non-speech period must be encapsulated into
a compound AMR payload as NO_TRANSMISSION frames (frame type 15, see
definition in [2]). Each receiving entity that accepts this MIME type
must be able to decode all eight AMR coding modes [1] and the AMR
DTX/CN [11].
Security considerations: none
Public specification: please refer to chapter 8 "References".
Additional information for storage mode:
Magic number: none
File extensions: amr, AMR
Macintosh file type code: none
Object identifier or OID: none
Person & email address to contact for further information:
johan.sjoberg@ericsson.com
ari.lakaniemi@nokia.com
Bernhard.Wimmer@mch.siemens.de
Intended usage: COMMON. It is expected that many VoIP applications
(as well as mobile applications) will use this type.
Author/Change controller:
johan.sjoberg@ericsson.com
ari.lakaniemi@nokia.com
Bernhard.Wimmer@mch.siemens.de
7.4. Mapping to SDP Parameters
Please note that this chapter applies to the RTP mode only.
Parameters are mapped to SDP [12] as usual.
Example usage in SDP:
m=audio 49120 RTP/AVP 97
a=rtpmap:97 AMR/8000
a=fmtp:97 mode-set=0,2,5,7; maxframes=2
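A receiver of such a description needs to turn the mode-set value into something it can test against. The following is a minimal sketch of parsing the comma separated list into a bitmask. The function name and the bitmask representation are assumptions, and extracting the value from the a=fmtp line is left out.

```c
#include <stdlib.h>

/* Parses a comma separated mode list such as "0,2,5,7" into a bitmask
 * (bit i set for mode i). Returns -1 on malformed input or on a mode
 * outside 0..7. */
static int amr_parse_mode_set(const char *s)
{
    int mask = 0;

    for (;;) {
        char *end;
        long v;

        if (*s < '0' || *s > '9')
            return -1;           /* every list item must start with a digit */
        v = strtol(s, &end, 10);
        if (v > 7)
            return -1;
        mask |= 1 << (int)v;
        if (*end == '\0')
            return mask;         /* end of list */
        if (*end != ',')
            return -1;
        s = end + 1;
    }
}
```

For the example above, the value "0,2,5,7" yields the bitmask 0xA5.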
8. References
[1] 3G TS 26.090, "Adaptive Multi-Rate (AMR) speech transcoding".
[2] 3G TS 26.101, "AMR Speech Codec Frame Structure".
[3] IETF RFC 2119, "Key words for use in RFCs to Indicate
Requirement Levels".
[4] 3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
operation".
[5] GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".
[6] TIA/EIA -136-Rev.A, part 410 - "TDMA Cellular/PCS - Radio
Interface, Enhanced Full Rate Voice Codec (ACELP). Formerly IS-
641. TIA published standard, 1998".
[7] ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
System RCR Standard".
[8] IETF RFC1889, "RTP: A Transport Protocol for Real-Time
Applications".
[9] IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
over Cellular Access Networks".
[10] IETF draft-larzon-udplite-03.txt, "The UDP Lite Protocol".
[11] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
speech traffic channels".
[12] M. Handley and V. Jacobson, "SDP: Session Description
Protocol", RFC 2327, April 1998.
[13] 3G TS 25.415, "UTRAN Iu Interface User Plane Protocols".
[14] IETF draft-ietf-rtp-new-08.txt, Chapter 10, "RTP: A Transport
Protocol for Real-Time Applications".
[15] S. Floyd, M. Handley, J. Padhye, J. Widmer, "Equation-Based
Congestion Control for Unicast Applications", ACM SIGCOMM 2000,
Stockholm, Sweden.
9. Authors' addresses
Johan Sjoberg
Ericsson Research
Ericsson Radio Systems AB
Torshamnsgatan 23
SE-164 80 Stockholm
SWEDEN
E-mail: Johan.Sjoberg@ericsson.com
Magnus Westerlund
Ericsson Research
Ericsson Radio Systems AB
Torshamnsgatan 23
SE-164 80 Stockholm
SWEDEN
E-mail: Magnus.Westerlund@ericsson.com
Ari Lakaniemi
Nokia Research Center
P.O.Box 407
FIN-00045 Nokia Group
Finland
E-mail: ari.lakaniemi@nokia.com
Petri Koskelainen
Nokia Research Center
P.O.Box 100
FIN-33721 Tampere
Finland
E-mail: petri.koskelainen@nokia.com
Tim Fingscheidt
Siemens AG, ICP CD
Grillparzerstrasse 10-18
D - 81675 Munich
Germany
Phone: +49 89 722 57658
Fax: +49 89 722 46489
E-mail: Tim.Fingscheidt@mch.siemens.de
Bernhard Wimmer
Siemens AG, ICP CD
Grillparzerstrasse 10-18
D - 81675 Munich
Germany
Phone: +49 89 722 23247
Fax: +49 89 722 46489
E-mail: Bernhard.Wimmer@mch.siemens.de
This Internet-Draft expires February 14, 2001.