Internet Engineering Task Force                  Johan Sjoberg, Ericsson
Audio Video Transport WG                     Magnus Westerlund, Ericsson
INTERNET-DRAFT                                      Ari Lakaniemi, Nokia
August 14, 2000                                 Petri Koskelainen, Nokia
Expires: February 14, 2001                      Bernhard Wimmer, Siemens
                                                Tim Fingscheidt, Siemens



                       RTP payload format for AMR
                    <draft-ietf-avt-rtp-amr-00.txt>


Status of this Memo


   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is an individual submission to the IETF. Comments
   should be directed to the authors.


Abstract

   This document describes a proposed real-time transport protocol (RTP)
   [8] payload format for AMR speech encoded [1] signals. The AMR
   payload format is designed to be able to interoperate with existing
   AMR transport formats. This document also includes a MIME type
   registration for AMR. The MIME type is specified for both real-time
   transport and storage.







Sjoberg/Westerlund/Lakaniemi/Koskelainen/Wimmer/Fingscheidt     [Page 1]


INTERNET-DRAFT         RTP Payload Format for AMR        August 14, 2000



1. Introduction

   The adaptive multi-rate (AMR) speech codec was developed by the
   European Telecommunications Standards Institute (ETSI). The AMR codec
   is standardized for GSM and has also been chosen by 3GPP as the
   mandatory codec for third generation systems. It is currently under
   standardization for TDMA. The AMR codec will therefore be widely used
   in cellular systems. It was developed to preserve high speech
   quality under a wide range of transmission conditions.

   The AMR codec is a multi-mode codec with 8 narrow band modes with bit
   rates between 4.75 and 12.2 kbps. The sampling frequency is 8000 Hz
   and processing is done on 20 ms frames, i.e. 160 samples per frame.
   The AMR modes are closely related to each other and use the same
   coding framework. Three of the AMR modes have already been adopted
   as standards of their own: the 6.7 kbps mode as PDC-EFR [7], the 7.4
   kbps mode as the IS-641 codec in TDMA [6], and the 12.2 kbps mode as
   GSM-EFR [5].

   The AMR codec is designed with a voice activity detector (VAD) and
   generation of comfort noise (CN) parameters during silence periods.
   Hence, the AMR codec can reduce the number of transmitted bits and
   packets during silence periods to a minimum. The operation to send CN
   parameters at regular intervals during silence periods is usually
   called discontinuous transmission (DTX) or source controlled rate
   (SCR) operation.

   AMR implementations must support all 8 speech coding modes, and mode
   switching can occur to any mode at any time. The mode information
   must therefore be transmitted together with the speech encoded bits.
   The AMR speech codec is designed with modes producing different bit
   rates so that the source bit rate can be adapted to the radio link
   quality in mobile phone systems. The objective was to give the
   highest possible speech quality under a variety of radio channel
   conditions. To realize rate adaptation, the decoder needs to signal
   the mode it prefers to receive to the encoder.

   Due to its flexibility and robustness, AMR is also suitable for
   purposes other than circuit switched cellular systems, for example
   real-time services over packet switched networks using RTP. To
   optimize transmission over networks with high packet loss rates,
   the possibility to use extra redundancy is built into the RTP
   payload format for AMR. The speech encoded bits
   have different perceptual sensitivity to bit errors and cellular
   systems exploit this by using unequal error protection and detection
   (UEP and UED). This mechanism concentrates the correction and
   detection of corrupted bits to the perceptually most sensitive bits.
   A frame is only regarded as lost or damaged if errors are detected in
   the most sensitive bits. The UED can also be employed on RTP if UDP
   lite is used as transport layer protocol (UDP lite [10] is work in
   progress). To enable this, the bits in the payload have to be ordered
   in sensitivity order. The AMR encoded bits are defined in sensitivity
   order in [2]. If the receiver supports the option to transmit
   redundant frames, the different sensitivity can also be exploited by
   transmitting only the most sensitive bits of a redundant frame. The
   special problems with IP real-time traffic over cellular access
   networks are further discussed in [9].

   Other AMR scenarios are possible, e.g. one end is a circuit switched
   GSM terminal, connected through a gateway to an IP network with an
   IP terminal at the other end. To improve quality, frames damaged on
   the GSM radio link should also be transmitted to the decoder in the
   IP network. To make this possible, frame quality information has to
   be transmitted over the IP network. The quality bit is also needed
   for the AMR RTP payload format to interwork with, for example, the
   ATM AAL2 AMR profile.


2.  Requirements

   The AMR payload format for RTP was designed to meet the following
   requirements:

     o Different levels of robustness must be supported, from no
      redundant data to extreme robustness capable of handling very
      high packet loss rates with no or small speech quality
      degradation.

     o Fast, bandwidth efficient, frame-wise AMR mode adaptation must
      be supported. This means that it must be possible to send Codec
      Mode Requests back from the receiving side to the transmitting
      side with information on the preferred mode.

     o Source controlled rate operation (SCR) (also called DTX) and
      comfort noise parameter (CN) transmission defined in AMR must be
      supported.


3. Payload format

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [3].

   The AMR payload format is designed to be flexible, ranging from very
   low overhead to an extended format with the possibility to send
   redundancy information and several speech frames in one packet.

   The payload format consists of payload header and one or more payload
   frames. Neither the payload header nor the payload frames are octet
   aligned on their own but the full payload is. If the option to
   transmit robust sorted payload is enabled and employed, the full
   payload SHALL finally be ordered in descending bit error sensitivity
   order to be prepared for unequal error protection or unequal error
   detection schemes, e.g. UDP lite [10]. The AMR encoded bit streams
   are defined in sensitivity order in Annex B of [2]; the original
   order as delivered from the speech encoder is defined in [1].

   The last octet of an AMR payload packet is padded with zeroes at the
   end if not all bits are used.

   The AMR frame types, or modes, are defined in [2]. Frame type 15, no
   transmission, is needed to indicate frames that are not transmitted
   or lost. "Not transmitted" can mean either that the speech encoder
   produced no data for this frame or that no data for this frame is
   transmitted in this payload, i.e. valid data for this frame could be
   sent in another payload. This occurs, for example, when multiple
   frames are sent in each payload and comfort noise starts. A frame
   type sequence for a payload with 8 frames, in which speech frames
   in AMR mode 7 are interrupted by CN in the fifth frame, could look
   like: {7,7,7,7,8,15,15,8}. The AMR SCR is described in [4].

   The AMR payload format supports robust transmission, multiple frames
   in one payload packet, and the use of fast codec mode adaptation.

   Robustness is accomplished with the option to retransmit previously
   transmitted frames together with the current frame or frames. The
   redundant frames can be transmitted in their entirety or only
   partly. If only a part of a redundant frame is transmitted, the
   least sensitive bits are omitted. A partially transmitted redundant
   frame SHALL fill the number of used octets for that frame. If
   partial redundancy is used, the bits in the payload are sorted in
   descending sensitivity order to support UED, as in UDP lite [10].
   Each full AMR speech frame SHALL be transmitted at least once.

   The bits of a redundant frame that are not transmitted MUST be
   reconstructed on the receiver side when the partial redundant frame
   is used for speech decoding. It is RECOMMENDED to produce the
   non-received bits with state-of-the-art error concealment unit (ECU)
   actions. A method resulting in worse quality than randomly generated
   bits SHOULD NOT be used. The use of a fixed pattern SHOULD be
   avoided for speech quality reasons.

   A frame quality indicator is included for interoperability with the
   ATM payload format described in ITU-T I.366.2, the UMTS Iu interface
   [13] and other transport formats. The speech quality is significantly
   increased if damaged frames are forwarded to the speech decoder error
   concealment unit and not dropped. In many communication scenarios the
   AMR encoded bits will be transmitted from one IP/UDP/RTP terminal to
   a terminal in a system with another transport format and/or vice
   versa. The transport format transcoding will be done in a gateway. A
   second likely scenario is that IP/UDP/RTP is used as transport
   between other systems, i.e. IP is originated and terminated in
   gateways on both sides of the IP transport.


    AMR over
    I.366.{2,3} or +------+                        +----------+
    3G Iu or       |      |     IP/UDP/RTP/AMR     |          |
    -------------->|  GW  |----------------------->| TERMINAL |
    GSM Abis       |      |                        |          |
    etc.           +------+                        +----------+

   Figure 1: GW to VoIP terminal scenario


    AMR over                                             AMR over
    I.366.{2,3} or +------+                     +------+ I.366.{2,3} or
    3G Iu or       |      |   IP/UDP/RTP/AMR    |      | 3G Iu or
    -------------->|  GW  |-------------------->|  GW  |--------------->
    GSM Abis       |      |                     |      | GSM Abis
    etc.           +------+                     +------+ etc.

   Figure 2: GW to GW scenario


3.1.  The payload header

   The payload header has dynamic length, 3 or 6 bits. The bits in the
   header are specified as follows:

   S (1 bit): If set, indicates that the payload is robust sorted;
   otherwise simple payload sorting is employed. Note that this bit can
   be set only if the receiver has signaled support for the robust
   payload sorting option.

   L (1 bit): Indicates the existence of LEN fields in the payload
   frames. Note that this bit can be set only if the receiver has
   signaled support for the option to transmit redundant data.

   R (1 bit): Indicates, if set, that the Codec Mode Request (CMR) is
   sent.

   CMR (3 bits): OPTIONAL field, present only if the R bit is set.
   Requested codec mode for the other communication direction. The
   mapping of existing AMR modes to CMR is given by the three least
   significant bits in Table 1a in [2].

    0
    0 1 2
   +-+-+-+
   |S|L|R|
   +-+-+-+

   Figure 3: AMR payload header, when R=0

    0
    0 1 2 3 4 5
   +-+-+-+-+-+-+
   |S|L|R| CMR |
   +-+-+-+-+-+-+

   Figure 4: AMR payload header, when R=1


3.2. AMR payload frame

   An AMR payload frame represents one encoded speech frame. Each
   payload frame consists of the following fields:

   F (1 bit): Indicates whether this frame is followed by further
   frames. F=1: further frames follow; F=0: last frame.

   Q (1 bit): Payload quality bit. If not set, the payload is severely
   damaged and the receiver should set the RX_TYPE, see [4], to
   SPEECH_BAD or SID_BAD depending on the frame type (FT).

   FT (4 bits): Frame type indicator, indicating the AMR speech coding
   mode or comfort noise (CN) mode. The mapping of existing AMR modes to
   FT is given in Table 1a in [2]. If FT=15 (No transmission) no LEN or
   AMR encoded bits follow.

   LEN (5 bits): OPTIONAL field, present if the payload header bit L is
   set (L=1). LEN specifies the number of octets used for the AMR
   encoded bits field in this frame. If LEN indicates more bits than
   the AMR mode in the FT field implies, the number of bits implied by
   FT is the valid number of AMR encoded bits. If LEN indicates fewer
   bits than the mode in the FT field implies, LEN gives the number of
   encoded bits. If a frame is transmitted only partially, the least
   sensitive bits at the end of the frame are omitted. This use is
   intended for partial redundant data.

   AMR encoded bits: This is the speech codec encoded data field. The
   length of this field is either defined implicitly by the AMR mode in
   the FT field, or by the LEN field. The last payload frame SHALL
   always contain a full AMR frame, i.e. no LEN field is needed or used.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Q|  FT   |   LEN   |                                         |
   +-+-+-+-+-+-+-+-+-+-+-+                                         +
   |                                                               |
   +                                                               +
   /                    AMR encoded bits                           /
   +                                                 +-+-+-+-+-+-+-+
   |                                                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 5: Payload frame format, F=1 and L=1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Q|  FT   |                                                   |
   +-+-+-+-+-+-+                                                   +
   |                                                               |
   +                                                               +
   /                    AMR encoded bits                           /
   +                                             +-+-+-+-+-+-+-+-+-+
   |                                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 6: Payload frame format, F=0 or L=0
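
   The size of a payload frame without a LEN field (Figure 6) follows
   from FT alone. The C sketch below is illustrative only. The table
   amr_ft_bits and the function amr_full_frame_bits are hypothetical
   names, and the bit counts are assumed from the AMR frame structure
   in [2]; only 118 bits for FT=2 and 134 bits for FT=3 are stated in
   this document. Frames carrying a LEN field and frame types 9 to 14
   are not covered.

```c
#include <assert.h>

/* AMR encoded bits per frame for FT 0..7 (4.75 ... 12.2 kbps) and for
 * the AMR comfort noise frame (FT 8). Values assumed from [2]. */
static const int amr_ft_bits[9] = {
    95, 103, 118, 134, 148, 159, 204, 244, 39
};

/* Size in bits of a payload frame without a LEN field, i.e. the
 * F, Q and FT fields plus the AMR encoded bits.
 * Returns -1 for frame types this sketch does not cover. */
static int amr_full_frame_bits(int ft)
{
    assert(ft >= 0 && ft <= 15);
    if (ft == 15)
        return 6;          /* No transmission: no encoded bits follow */
    if (ft > 8)
        return -1;
    return 6 + amr_ft_bits[ft];
}
```

   For example, a full 5.9 kbps frame (FT=2) occupies 6 + 118 = 124
   bits and a full 6.7 kbps frame (FT=3) occupies 6 + 134 = 140 bits.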


3.3. Compound AMR payload

   The compound AMR payload consists of one AMR payload header and one
   or more AMR payload frames, see sections 3.1 and 3.2. These can be
   put together with robust or simple payload sorting. The payload
   header bit S indicates the method used.

   Definitions for describing the compound AMR payload:

   b(m)    - bit m of the compound AMR payload
   f(n,m)  - bit m in payload frame n
   F(n)    - number of bits in payload frame n, defined by FT or by LEN
   h(m)    - bit m of payload header
   H       - number of payload header bits, 3 or 6 bits
   N       - number of payload frames in the payload
   S       - number of unused bits

   Payload frames f(n,m) are placed in consecutive order, where frame
   n precedes frame n+1. Within one payload all frames between the
   oldest and the most recent must be present. If speech data is
   missing for a frame, e.g. due to DTX, the NO_TRANSMISSION frame type
   is sent for that frame.

   Before sorting, the payload consists of data ordered as described in
   Figure 7.

   +-------------+
   | h(0)-h(H-1) |
   +--------------------------+
   | f(0,0) ... f(0,F(0)-1)   |
   +------------------------------+
   | f(1,0) ... f(1,F(1)-1)       |
   +------------------------------+
   | f(2,0) ... f(2,F(2)-1) |
   +------------------------+
   \                          \
   +-----------------------------------+
   | f(N-1,0) ... f(N-1,F(N-1)-1)      |
   +-----------------------------------+

   Figure 7: The payload header and N payload frames before sorting.


3.3.1. Robust payload sorting

   A bit error in a more sensitive bit is subjectively more annoying
   than in a less sensitive bit. Therefore, to be able to protect the
   most sensitive bits in a payload packet with a forward error
   detection code, e.g. a CRC outside RTP, the bits inside a frame are
   ordered into sensitivity order. If the option to transmit redundant
   data is employed, the full RTP payload MUST be further sorted into
   sensitivity order. The protection SHOULD then cover an appropriate
   number of octets from the beginning of the payload, covering at least
   the AMR payload header, F, Q, FT, LEN bits and class A bits (see
   [2]). Exactly how many octets need protection depends on the
   channel and application. To maintain sensitivity ordering inside the
   AMR payload when more than one speech frame is transmitted in one
   payload, the data needs to be reordered.

   The reordering to maintain the sensitivity ordered AMR payload SHALL
   be performed at the bit level. The AMR payload header SHALL still be
   placed unchanged at the beginning of the payload. Thereafter, the
   payload frames are interleaved bit by bit, taking one bit from each
   payload frame in turn.

   The robust payload sorting algorithm is defined in C-style as:

   /* Copy the payload header unchanged. */
   for (i = 0; i < H; i++){
     b(i) = h(i);
   }
   /* Interleave the frames, one bit from each frame in turn. */
   max = max(F(0),..,F(N-1));
   k = H;
   for (i = 0; i < max; i++){
     for (j = 0; j < N; j++){
       if (i < F(j)){
         b(k++) = f(j,i);
       }
     }
   }
   /* Pad the last octet with zeroes. */
   S = 8 - k%8;
   if (S < 8){
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }
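
   A compilable C sketch of the same interleaving is given below for
   illustration. It is not a reference implementation. The type
   bit_frame and the functions put_bit and amr_sort_robust are
   hypothetical names. Each payload frame is assumed to be supplied as
   an unpacked array with one bit value (0 or 1) per element, and bit 0
   is assumed to be the most significant bit of each octet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One payload frame as an unpacked bit array (one 0/1 per element). */
struct bit_frame {
    const uint8_t *bits;
    size_t nbits;
};

/* Set bit number pos in out if v is non-zero; bit 0 is the MSB. */
static void put_bit(uint8_t *out, size_t pos, int v)
{
    if (v)
        out[pos / 8] |= (uint8_t)(0x80u >> (pos % 8));
}

/* Robust payload sorting: the header first, then one bit from each
 * frame in turn. Returns the payload size in octets, or 0 if the
 * output buffer is too small. */
static size_t amr_sort_robust(const uint8_t *hdr, size_t hbits,
                              const struct bit_frame *fr, size_t n,
                              uint8_t *out, size_t outlen)
{
    size_t total = hbits, max = 0, i, j, k = 0, octets;

    assert(hdr != NULL && fr != NULL && out != NULL);
    for (j = 0; j < n; j++) {
        total += fr[j].nbits;
        if (fr[j].nbits > max)
            max = fr[j].nbits;
    }
    octets = (total + 7) / 8;
    if (octets > outlen)
        return 0;
    memset(out, 0, octets);      /* also provides the zero padding */
    for (i = 0; i < hbits; i++)
        put_bit(out, k++, hdr[i]);
    for (i = 0; i < max; i++)
        for (j = 0; j < n; j++)
            if (i < fr[j].nbits)
                put_bit(out, k++, fr[j].bits[i]);
    return octets;
}
```

   With a 3-bit header {1,0,0} and two toy frames {1,1,1,1} and {0,0},
   the sketch produces the bit sequence 1,0,0,1,0,1,0,1,1 followed by
   seven padding bits, i.e. the two octets 0x95 and 0x80.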


3.3.2. Simple payload sorting

   If multiple new frames are encapsulated into the payload and robust
   payload sorting is not used, the payload is formed by concatenating
   the payload header and the bits from each AMR frame in the payload.
   However, the bits inside a frame are still ordered into sensitivity
   order as defined in [2].

   The simple payload sorting algorithm is defined in C-style as:

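   /* Copy the payload header unchanged. */
   for (i = 0; i < H; i++){
     b(i) = h(i);
   }
   /* Append the frames one after another. */
   k = H;
   for (j = 0; j < N; j++){
     for (i = 0; i < F(j); i++){
       b(k++) = f(j,i);
     }
   }
   /* Pad the last octet with zeroes. */
   S = 8 - k%8;
   if (S < 8){
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }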
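
   For illustration, the concatenation can be written as compilable C
   along the following lines. The names bit_frame, put_bit and
   amr_sort_simple are hypothetical. Frames are assumed to be supplied
   as unpacked arrays with one bit value per element, including their
   F, Q and FT fields, and bit 0 is assumed to be the most significant
   bit of each octet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One payload frame as an unpacked bit array (one 0/1 per element). */
struct bit_frame {
    const uint8_t *bits;
    size_t nbits;
};

/* Set bit number pos in out if v is non-zero; bit 0 is the MSB. */
static void put_bit(uint8_t *out, size_t pos, int v)
{
    if (v)
        out[pos / 8] |= (uint8_t)(0x80u >> (pos % 8));
}

/* Simple payload sorting: the header followed by each frame in order.
 * Returns the payload size in octets, or 0 if the output buffer is
 * too small. */
static size_t amr_sort_simple(const uint8_t *hdr, size_t hbits,
                              const struct bit_frame *fr, size_t n,
                              uint8_t *out, size_t outlen)
{
    size_t total = hbits, i, j, k = 0, octets;

    assert(hdr != NULL && fr != NULL && out != NULL);
    for (j = 0; j < n; j++)
        total += fr[j].nbits;
    octets = (total + 7) / 8;
    if (octets > outlen)
        return 0;
    memset(out, 0, octets);      /* also provides the zero padding */
    for (i = 0; i < hbits; i++)
        put_bit(out, k++, hdr[i]);
    for (j = 0; j < n; j++)
        for (i = 0; i < fr[j].nbits; i++)
            put_bit(out, k++, fr[j].bits[i]);
    return octets;
}
```

   Applied to the single-frame case of section 6.1 (header S=0, L=0,
   R=0; frame F=0, Q=1, FT=2 followed by 118 speech bits), the sketch
   yields 16 octets. If all speech bits are set to 1, the first octet
   is 0x09, the second 0x7F and the last 0xFE.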


3.4. Decoding security consideration

   If the payload length calculated from the F, FT and LEN fields does
   not match the size of the payload actually received, the payload
   MUST be dropped. Decoding a packet that has errors in the length
   indicator bits could severely degrade the speech quality.
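
   A receiver-side length check could look like the following C sketch.
   It is deliberately limited to payloads with simple sorting and no
   LEN fields (S=0, L=0) and reports anything else as not covered. The
   names amr_ft_bits, get_bit and amr_check_length are hypothetical,
   the bit counts per frame type are assumed from [2], and bit 0 is
   assumed to be the most significant bit of each octet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AMR encoded bits for FT 0..8. Values assumed from [2]. */
static const int amr_ft_bits[9] = {
    95, 103, 118, 134, 148, 159, 204, 244, 39
};

/* Read bit number pos, assuming bit 0 is the MSB of octet 0. */
static int get_bit(const uint8_t *buf, size_t pos)
{
    return (buf[pos / 8] >> (7 - pos % 8)) & 1;
}

/* Returns 1 if the length implied by the F and FT fields matches len,
 * 0 if it does not (the payload is to be dropped), and -1 if the
 * payload uses features outside this sketch. */
static int amr_check_length(const uint8_t *buf, size_t len)
{
    size_t total = len * 8, pos;
    int more = 1;

    assert(buf != NULL);
    if (len == 0)
        return 0;
    if (get_bit(buf, 0) || get_bit(buf, 1))
        return -1;                    /* S=1 or L=1: not covered */
    pos = get_bit(buf, 2) ? 6 : 3;    /* skip S, L, R and optional CMR */
    while (more) {
        int ft = 0, i;

        if (pos + 6 > total)
            return 0;                 /* F, Q and FT do not fit */
        more = get_bit(buf, pos);     /* F: further frames follow */
        for (i = 0; i < 4; i++)
            ft = (ft << 1) | get_bit(buf, pos + 2 + (size_t)i);
        pos += 6;
        if (ft > 8 && ft != 15)
            return -1;                /* frame type not covered */
        if (ft != 15)
            pos += (size_t)amr_ft_bits[ft];
        if (pos > total)
            return 0;                 /* encoded bits do not fit */
    }
    return (pos + 7) / 8 == len;      /* allow only final padding */
}
```

   The sketch does not verify that the padding bits are zero. Whether
   such a check is desirable is left to the implementation.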


4. RTP header usage

   The RTP header marker bit (M) is set (M=1) in packets containing the
   first speech frame after CN. For all other packets the marker bit is
   set to 0 (M=0).

   The timestamp corresponds to the sampling time of the first sample
   encoded for the first encoded speech frame in the packet. The
   timestamp unit is in samples. The duration of one AMR speech frame is
   20 ms and the sampling frequency is 8 kHz, corresponding to 160
   encoded speech samples per frame. Thus, the timestamp is increased by
   160 for each consecutive frame. All frames in a packet MUST be
   successive 20 ms frames.
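
   The timestamp arithmetic can be illustrated with a short C sketch.
   The names AMR_SAMPLES_PER_FRAME, amr_frame_timestamp and
   amr_timestamp_selfcheck are hypothetical. The sketch relies on
   unsigned 32-bit arithmetic so that the result wraps around in the
   same way as the 32-bit RTP timestamp field.

```c
#include <assert.h>
#include <stdint.h>

/* 20 ms of speech sampled at 8000 Hz. */
#define AMR_SAMPLES_PER_FRAME 160u

/* Timestamp of the frame at position index (0 = first frame) in a
 * packet whose RTP timestamp is packet_ts. Wraps modulo 2^32. */
static uint32_t amr_frame_timestamp(uint32_t packet_ts, uint32_t index)
{
    return (uint32_t)(packet_ts + AMR_SAMPLES_PER_FRAME * index);
}

/* Minimal self-check of the arithmetic above. */
static void amr_timestamp_selfcheck(void)
{
    assert(amr_frame_timestamp(0u, 1u) == AMR_SAMPLES_PER_FRAME);
}
```

   For a packet with timestamp 1000 the third frame (index 2) has
   timestamp 1000 + 2 * 160 = 1320.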

5. Congestion Control

   The need for congestion control for data transported with RTP is
   addressed in [14]. AMR speech data has some elastic properties due
   to the different bandwidth demands of the modes. Another parameter
   that can reduce the bandwidth demand for AMR is the number of frames
   of speech data encapsulated in each payload. Encapsulating more
   frames per payload reduces the number of packets and the overhead
   from IP/UDP/RTP headers. If FEC is used, its amount also needs to be
   regulated so that the FEC itself does not worsen the problem.
   Therefore, it is RECOMMENDED that applications using this payload
   format implement congestion control. The actual mechanism for
   congestion control is not specified but should be suitable for
   real-time flows, e.g. "Equation-Based Congestion Control for Unicast
   Applications" [15].

6. Examples

6.1. Simple example

   In this simple example one full frame (L=0) is sent in each RTP
   packet, no Codec Mode Request (CMR) is sent (R=0), and the payload
   was not damaged at the IP origin (Q=1). The frame is encoded with
   the 5.9 kbps mode (FT=2). The speech encoded bits are put into f(0)
   to f(117) in descending sensitivity order according to [2]. Simple
   payload sorting is used (S=0).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=0  |  L=0  |  R=0  |  F=0  |  Q=1  |   0   |   0   |   1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   | f(0)  | f(1)  | f(2)  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  |  ...  | f(115)| f(116)| f(117)|   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 8: One frame per packet example.


6.2. Example with partial redundancy

   In this example a 6.7 kbps mode frame (FT=3) is sent with one
   redundant frame, also FT=3. Only a part of the redundant frame is
   sent, in this example 12 octets (L=1, LEN=12). A mode request is
   sent (R=1), requesting the 10.2 kbps mode for the other link
   (CMR=6). Robust payload sorting is used (S=1). The redundant frame
   (12 octets) including FT is r(0) to r(91) and the current frame
   (134 bits) is f(0) to f(133).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=1  |  L=1  |  R=1  |   1   |   1   |   0   |  F=1  |  F=0  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |  Q=1  |  Q=1  |   0   |   0   |   1   |   0   |   1   |   1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    2 |   0   |   1   |   0   | f(0)  |   0   | f(1)  |   0   | f(2)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    3 |   1   | f(3)  |   1   | f(4)  | r(0)  | f(5)  | r(1)  | f(6)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    4 | r(2)  | f(7)  | r(3)  | f(8)  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   26 | r(90) | f(95) | r(91) | f(96) | f(97) | f(98) |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   30 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  | f(131)| f(132)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 | f(133)|   0   |   0   |   0   |   0   |   0   |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 9: Example with partial redundancy.


6.3. Example with multiple frames per payload

   In this example two 5.9 kbps mode (FT=2) frames are sent in one
   packet. No partial redundancy is used (L=0). A mode request is sent
   (R=1), requesting the 7.95 kbps mode for the other link (CMR=5).
   The first frame is represented by the 118 bits f(0) to f(117) and the
   subsequent frame by g(0) to g(117). Robust sorting is not used (S=0).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=0  |  L=0  |  R=1  |   1   |   0   |   1   |  F=1  |  Q=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   0   | f(0)  | f(1)  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  | f(115)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   16 | f(116)| f(117)|  F=0  |  Q=1  |   0   |   0   |   1   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   17 | g(0)  | g(1)  | g(2)  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 |  ...  |  ...  |  ...  |  ...  | g(116)| g(117)|   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 10: Example with two frames per payload.


7. The AMR MIME type registration

   This section defines the MIME type for the Adaptive Multi-Rate (AMR)
   speech codec [1]. The data format and parameters are specified for
   both real-time transport and for storage type applications (e.g.
   e-mail attachment, multimedia messaging). The former is referred to
   as RTP mode and the latter as storage mode.

   AMR implementations according to [1] MUST support all eight coding
   modes. The mode change can occur at any time during operation and
   therefore the mode information is transmitted in-band together with
   speech bits to allow mode change without any additional signaling.

   In addition to the speech codec, the AMR specifications also include
   Discontinuous Transmission / comfort noise (DTX/CN) functionality
   [11]. DTX/CN switches the transmission off during silent parts of
   the speech, and only CN parameter updates are sent at regular
   intervals.


7.1. RTP mode

   The decoder may want to receive a certain AMR mode or a subset of
   AMR modes, due to link limitations in some cellular systems, e.g.
   the GSM radio link can only use a subset of at most four modes.
   Therefore, it is possible to request a specific set of AMR modes in
   the capability description, and the encoder MUST abide by this
   request. If no mode set is requested, any mode may be used or
   requested.

   Although in principle the AMR codec can perform a mode change at any
   time between any two modes, it is possible to set limitations for
   mode changes. The decoder can define the minimum number of frames
   between mode changes and can limit mode changes to neighboring modes
   only. This is also motivated by limitations on the GSM radio link.

   It is also possible to limit the number of AMR frames encapsulated
   into one RTP packet. This is an optional feature, and if no parameter
   is given in the capability description, the transmitter can
   encapsulate any number of AMR speech frames into one RTP packet.

   There is also an option to retransmit one or more previously
   transmitted frames together with a new frame to help the receiver to
   recover from packet losses in difficult transmission conditions. It
   is also possible to transmit these frames only partially in such a
   way that only the most sensitive bits are retransmitted. Since the
   transmission of partly redundant frames is an optional property, it
   can be used only if the receiver has signaled support for this
   functionality in the capability description. It is RECOMMENDED that
   partial redundancy be implemented and turned on at least for
   conversational services.

   To support unequal error protection and/or detection, the payload
   format supports robust payload sorting. Robust payload sorting is an
   optional feature and can only be used if the receiver has signaled
   support for this functionality in the capability description.


7.2. Storage mode

   For storing AMR frames, e.g. as a file or e-mail attachment, the AMR
   frames must be encapsulated in consecutive compound AMR payloads,
   see section 3. Some limitations of the storage format are needed,
   since no particular coding considerations can be signaled before
   downloading or receiving stored AMR data and no timestamp
   information is available in the file. The receiving entity (AMR
   decoder) MUST be able to decode all eight coding modes as well as
   the AMR DTX/CN [11]. The compound AMR payload SHALL be stored
   without partial redundancy and with simple payload sorting, see
   section 3.3. Frames that are not transmitted, for example during
   DTX, MUST be stored as NO_TRANSMISSION frames to keep
   synchronization with the original media.


7.3. MIME Registration

   The MIME name for the AMR codec is allocated from the IETF tree since
   AMR is expected to be a widely used speech codec in VoIP
   applications. Some parts of this section distinguish between RTP and
   storage modes.

   Media Type name:     audio

   Media subtype name:  AMR

   Required parameters: none

   Optional parameters for RTP mode:
    ptime:     Definition as usual in RTP audio.
    mode-set:  Requested AMR mode set. Restricts the active codec mode
               set to a subset of all modes. Possible values are a comma
               separated list of modes: 0,...,7 (see Table 1a in [2]; an
               example is given in section 7.4). If not present, all
               speech modes are available.
    mode-change-period: Defines a number N which restricts the mode
               changes in such a way that mode changes are only allowed
               on multiples of N, initial state of the phase is
               arbitrary. If this parameter is not present, mode change
               can happen at any time.
    mode-change-neighbor: If present, mode changes SHALL only be made to
               neighboring modes in the active codec mode set. If not
               present, change between any two modes is allowed.
    maxframes: Maximum number of AMR speech frames in one RTP packet.
               The receiver may set this parameter in order to limit
               the buffering requirements or delay.
    redundancy: If present, transmission of partly redundant frames is
               supported, otherwise not supported.
    robust-sorting: If present, robust payload sorting is supported,
               otherwise not supported and simple payload sorting SHALL
               be used.
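
   As an illustration only, the mode-set, mode-change-period and
   mode-change-neighbor restrictions can be sketched as a check on a
   sequence of codec modes. The function name and arguments are
   hypothetical and not part of this specification. The sketch assumes
   one mode per consecutive frame, reads "multiples of N" as "all mode
   changes fall on frame positions with the same remainder modulo N",
   and reads "neighboring" as adjacent in the sorted active mode set.

```python
def mode_changes_allowed(modes, mode_set, period=None, neighbor_only=False):
    """Check a per-frame mode sequence against the signaled restrictions.

    modes:         AMR mode index (0..7) of each consecutive frame
    mode_set:      modes permitted by the mode-set parameter
    period:        N from mode-change-period, or None if not present
    neighbor_only: True if mode-change-neighbor is present
    (All names are hypothetical; this is an interpretation, not the spec.)
    """
    active = sorted(set(mode_set))
    if any(m not in active for m in modes):
        return False

    # Frame positions at which the mode differs from the previous frame.
    changes = [i for i in range(1, len(modes)) if modes[i] != modes[i - 1]]

    if neighbor_only:
        for i in changes:
            step = abs(active.index(modes[i]) - active.index(modes[i - 1]))
            if step != 1:
                return False

    if period is not None and changes:
        # The initial phase is arbitrary, so only require a common phase.
        phase = changes[0] % period
        if any(i % period != phase for i in changes):
            return False

    return True
```

   For example, with mode-set=0,2,5,7, a period of 2 and neighbor
   changes only, the sequence 0,0,2,2,5,5 passes this check, while
   0,0,5,5 does not because modes 0 and 5 are not adjacent in the set.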

   Optional parameters for storage mode:     none

   Encoding considerations for RTP mode: See section 3 in this document.

   Encoding considerations for storage mode: The AMR speech frames are
   packed into consecutive compound AMR payloads, see section 3. The
   compound AMR payloads must be stored in sequential order. This
   implies that the first octet after payload n must be the first octet
   of payload (n+1). Furthermore, missing frames and frames not
   received between CN updates during non-speech periods must be
   encapsulated into a compound AMR payload as NO_TRANSMISSION frames
   (frame type 15, see definition in [2]). Each receiving entity that
   accepts this MIME type must be able to decode all eight AMR coding
   modes [1] and the AMR DTX/CN [11].
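
   The gap-filling rule above can be sketched as follows. This is a
   minimal illustration, not a definition of the storage format: the
   function name, the dictionary of received frames and the use of an
   empty byte string as the body of a NO_TRANSMISSION frame are all
   assumptions, and the packing into compound AMR payloads (section 3)
   is deliberately left out.

```python
NO_TRANSMISSION = 15  # frame type named in the text above


def fill_untransmitted(received, total_frames):
    """Return one (frame_type, data) entry per frame position.

    received:     dict mapping a 0-based frame position to a
                  (frame_type, data) tuple for frames that exist
    total_frames: number of frame positions covered by the recording
    Positions with no frame are filled with NO_TRANSMISSION entries so
    that the stored sequence stays synchronized with the media.
    (Hypothetical helper; the data layout is an assumption.)
    """
    filler = (NO_TRANSMISSION, b"")
    return [received.get(pos, filler) for pos in range(total_frames)]
```

   The result is a gap-free, ordered frame sequence that a writer
   could then pack into consecutive compound AMR payloads.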

   Security considerations: none

   Public specification: please refer to chapter 8 "References".

   Additional information for storage mode:
     Magic number: none
     File extensions: amr, AMR
     Macintosh file type code: none
     Object identifier or OID: none

   Person & email address to contact for further information:
     johan.sjoberg@ericsson.com
     ari.lakaniemi@nokia.com
     Bernhard.Wimmer@mch.siemens.de
   Intended usage: COMMON. It is expected that many VoIP applications
   (as well as mobile applications) will use this type.

   Author/Change controller:
     johan.sjoberg@ericsson.com
     ari.lakaniemi@nokia.com
     Bernhard.Wimmer@mch.siemens.de

7.4. Mapping to SDP Parameters

   Please note that this section applies to the RTP mode only.

   The parameters are mapped to SDP [12] in the usual way.
   Example usage in SDP:
    m=audio 49120 RTP/AVP 97
    a=rtpmap:97 AMR/8000
    a=fmtp:97 mode-set=0,2,5,7; maxframes=2
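
   A receiver could read the a=fmtp line of the example roughly as
   sketched below. The function name is hypothetical, and two points
   are assumptions rather than statements of this document: that
   parameters are separated by semicolons as in the example, and that
   presence-only parameters (such as redundancy or robust-sorting)
   appear as bare names without a value.

```python
def parse_amr_fmtp(line):
    """Parse 'a=fmtp:<pt> name=value; name; ...' into (pt, params).

    mode-set becomes a list of ints, other numeric values become ints,
    and a bare name (assumed form of a presence-only flag) maps to True.
    (Hypothetical helper, shown for illustration only.)
    """
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp attribute line")

    pt_text, _, rest = line[len(prefix):].strip().partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        name, value = name.strip(), value.strip()
        if name == "mode-set":
            params[name] = [int(m) for m in value.split(",")]
        elif sep and value.isdigit():
            params[name] = int(value)
        elif sep:
            params[name] = value
        else:
            params[name] = True
    return int(pt_text), params
```

   Applied to the example line, this yields payload type 97 with
   mode-set [0, 2, 5, 7] and maxframes 2.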


8. References

   [1]  3G TS 26.090, "Adaptive Multi-Rate (AMR) speech transcoding".

   [2]  3G TS 26.101, "AMR Speech Codec Frame Structure".

   [3]  IETF RFC 2119, "Key words for use in RFCs to Indicate
        Requirement Levels".

   [4]  3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
        operation".

   [5]  GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".

   [6]  TIA/EIA-136-Rev.A, part 410, "TDMA Cellular/PCS - Radio
        Interface, Enhanced Full Rate Voice Codec (ACELP)", formerly
        IS-641, TIA published standard, 1998.

   [7]  ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
        System RCR Standard".

   [8]  IETF RFC1889, "RTP: A Transport Protocol for Real-Time
        Applications".

   [9]  IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
        over Cellular Access Networks".

   [10] IETF draft-larzon-udplite-03.txt, "The UDP Lite Protocol".

   [11] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
        speech traffic channels".

   [12] M. Handley and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [13] 3G TS 25.415, "UTRAN Iu Interface User Plane Protocols".

   [14] IETF draft-ietf-rtp-new-08.txt, Chapter 10, "RTP: A Transport
        Protocol for Real-Time Applications".

   [15] S. Floyd, M. Handley, J. Padhye, J. Widmer, "Equation-Based
        Congestion Control for Unicast Applications", ACM SIGCOMM 2000,
        Stockholm, Sweden.


9. Authors' addresses

   Johan Sjoberg
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Johan.Sjoberg@ericsson.com

   Magnus Westerlund
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Magnus.Westerlund@ericsson.com

   Ari Lakaniemi
   Nokia Research Center
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland
   E-mail: ari.lakaniemi@nokia.com

   Petri Koskelainen
   Nokia Research Center
   P.O.Box 100
   FIN-33721 Tampere
   Finland
   E-mail: petri.koskelainen@nokia.com

   Tim Fingscheidt
   Siemens AG, ICP CD
   Grillparzerstrasse 10-18
   D - 81675 Munich
   Germany
   Phone: +49 89 722 57658
   Fax:   +49 89 722 46489
   E-mail: Tim.Fingscheidt@mch.siemens.de

   Bernhard Wimmer
   Siemens AG, ICP CD
   Grillparzerstrasse 10-18
   D - 81675 Munich
   Germany
   Phone: +49 89 722 23247
   Fax:   +49 89 722 46489
   E-mail: Bernhard.Wimmer@mch.siemens.de



   This Internet-Draft expires February 14, 2001.