Internet Engineering Task Force                  Johan Sjoberg, Ericsson
Audio Video Transport WG                     Magnus Westerlund, Ericsson
INTERNET-DRAFT                                      Ari Lakaniemi, Nokia
July 14, 2000                                   Petri Koskelainen, Nokia
Expires: January 14, 2001



                       RTP payload format for AMR
                   <draft-sjoberg-avt-rtp-amr-01.txt>


Status of this Memo


   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is an individual submission to the IETF. Comments
   should be directed to the authors.


Abstract

   This document describes a proposed real-time transport protocol (RTP)
   [8] payload format for AMR speech encoded [1] signals. The AMR
   payload format is designed to be able to interoperate with existing
   AMR transport formats. This document also includes a MIME type
   registration for AMR. The MIME type is specified for both real-time
   transport and storage.








Sjoberg                                                         [Page 1]


INTERNET-DRAFT         RTP Payload Format for AMR          July 14, 2000


1. Introduction

   The adaptive multi-rate (AMR) speech codec was developed by the
   European Telecommunications Standards Institute (ETSI). The AMR codec
   is standardized for GSM and has also been chosen by 3GPP as the
   mandatory codec for third generation systems. It is currently under
   standardization for TDMA. The AMR codec will thus be widely used in
   cellular systems. The codec was developed to preserve high speech
   quality under a wide range of transmission conditions.

   The AMR codec is a multi-mode codec with 8 narrow band modes with bit
   rates between 4.75 and 12.2 kbps. The sampling frequency is 8000 Hz
   and processing is done on 20 ms frames, i.e. 160 samples per frame.
   The AMR modes are closely related to each other and use the same
   coding framework. Three of the AMR modes have already been adopted as
   standards of their own: the 6.7 kbps mode as PDC-EFR [7], the 7.4
   kbps mode as the IS-641 codec in TDMA [6], and the 12.2 kbps mode as
   GSM-EFR [5].

   AMR implementations must support all 8 speech coding modes, and mode
   switching can occur to any mode at any time. The mode information
   must therefore be transmitted together with the speech encoded bits,
   to indicate the mode.

   It is possible for the decoder to signal to the encoder the mode it
   prefers to receive. The reason can be e.g. transmission bandwidth or
   quality.

   The AMR codec is designed with a voice activity detector (VAD) and
   generation of comfort noise (CN) parameters during silence periods.
   Hence, the AMR codec can reduce the number of transmitted bits and
   packets during silence periods to a minimum. The operation to send CN
   parameters at regular intervals during silence periods is usually
   called discontinuous transmission (DTX) or source controlled rate
   (SCR) operation. The three codec standards that are part of AMR
   [5][6][7] also have SCR/CN functionality specified. To enable
   interoperability with terminals supporting these standards, AMR can
   optionally be extended to support these CN schemes as well; see [2].

   Due to the flexibility and robustness of AMR, it is suitable also for
   other purposes than circuit switched cellular systems. Other suitable
   applications are real-time services over packet switched networks,
   e.g. over RTP. To be optimized for transmission over networks with
   high packet loss rates, the possibility to use extra redundancy is
   built into the RTP payload format for AMR. The speech encoded bits
   have different perceptual sensitivity to bit errors and cellular
   systems exploit this by using unequal error protection and detection
   (UEP and UED). This mechanism concentrates the correction and
   detection of corrupted bits to the perceptually most sensitive bits.
   A frame is only regarded as lost or damaged if errors are detected in
   the most sensitive bits. The UED can also be employed on RTP if UDP
   lite is used as the transport layer protocol (UDP lite [10] is work
   in progress). To enable this, the bits in the payload have to be
   ordered in sensitivity order. The AMR encoded bits are defined in
   sensitivity order in [2]. If the receiver supports the option to
   retransmit redundant frames, the different sensitivity could also be
   used for transmitting only the most sensitive bits of a redundant
   frame. The special problems with IP real-time traffic over cellular
   access networks are further discussed in [9].

   Other AMR scenarios are possible, e.g. one end is a circuit switched
   GSM terminal connected through a gateway to an IP network, with an IP
   terminal at the other end. To improve quality, frames damaged on the
   GSM radio link should also be transmitted to the decoder in the IP
   network. To make this possible, frame quality information has to be
   transmitted over the IP network. The quality bit is also needed for
   the AMR RTP payload format to interwork with, for example, the ATM
   AAL2 AMR profile.


2.  Requirements

   The AMR payload format for RTP was designed to meet the following
   requirements:

    o Different levels of robustness must be supported, from no
      redundant data to extreme robustness capable of handling very
      high packet loss rates with no or small speech quality
      degradation.

    o Fast, frame-wise AMR mode adaptation must be supported. This
      means that it must be possible to send Codec Mode Requests back
      from the receiving side to the transmitting side with information
      on the preferred mode. Slower AMR mode adaptation may also be
      accomplished with external signaling.

    o Source controlled rate operation (SCR) and comfort noise
      parameter (CN) transmission defined in AMR must be supported.


3. Payload format

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [3].

   The AMR payload format is designed to be flexible, ranging from very
   low overhead to an extended format with the possibility to send
   redundancy information and several speech frames in one packet.

   The payload format consists of a payload header and zero or more
   payload frames. Neither the payload header nor the payload frames are
   octet aligned on their own but the full payload is. If the option to
   transmit redundant information is enabled and employed, the full
   payload SHALL finally be ordered in descending bit error sensitivity
   order to be prepared for unequal error protection or unequal error
   detection schemes, e.g. UDP lite. The AMR encoded bit streams are
   defined in sensitivity order in Annex B of [2]; the original order as
   delivered from the speech encoder is defined in [1].

   The last octet of an AMR payload packet is padded with zeroes at the
   end if not all bits are used.

   The AMR frame types, or modes, are defined in [2]. Frame type 15, no
   transmission, is needed to indicate frames that are not transmitted
   or are lost. Not transmitted can mean either that the speech encoder
   produced no data for this frame or that no data for this frame is
   transmitted in this payload, i.e. valid data for this frame could be
   sent in another payload. An example is when multiple frames are sent
   in each payload and comfort noise starts: in a payload with 8 frames
   where speech frames in AMR mode 7 are interrupted by CN in the fifth
   frame, the frame type sequence could look like: {7,7,7,7,8,15,15,8}.
   The AMR SCR is described in [4].

   The AMR payload format supports robust transmission, multiple frames
   in one payload packet, and the use of fast codec mode adaptation.

   The robust behavior is accomplished by using the optional possibility
   to retransmit previously transmitted frames together with the current
   frame or frames. The redundant frames could be transmitted in their
   entirety or only partly. If only a part of the redundant frame is
   transmitted, the least sensitive bits are omitted. A partially
   transmitted redundant frame SHALL fill the number of used octets for
   that frame. The bits in the payload are sorted in descending
   sensitivity order to support UED, like in UDP lite [10], if partial
   redundancy is used.

   When bits in redundant frames are not transmitted, the missing bits
   MUST be reconstructed on the receiver side. It is RECOMMENDED to
   produce the missing bits with state of the art error concealment
   unit (ECU) actions. A method giving worse quality than randomly
   generated bits SHOULD NOT be used. Fixed patterns SHOULD be avoided
   for speech quality reasons.

   Note that the possibility to transmit partial redundant frames can be
   employed only if the receiver has signaled support for this in the
   capability description.

   A frame quality indicator is included for interoperability with the
   ATM payload format described in ITU-T I.366.2 and the UTRAN Iu
   interface [14]. The speech quality is significantly increased if
   damaged frames are forwarded to the speech decoder error concealment
   unit and not dropped.


3.1.  The payload header

   The payload header has a dynamic length of 3 or 7 bits. The bits in
   the header are specified as follows:

   Q (1 bit): The payload quality bit indicates, if not set, that the
   payload is severely damaged and the receiver should set the RX_TYPE,
   see [4], to SPEECH_BAD or SID_BAD depending on the frame type (FT).

   L (1 bit): Indicates the existence of LEN fields in the payload
   frames and that sensitivity sorting is used. Note that this bit can
   be set only if the receiver has signaled support for the option to
   transmit redundant data.

   R (1 bit): Indicates, if set, that the Codec Mode Request (CMR) is
   sent.

   CMR (4 bits): OPTIONAL field, present only if the R bit is set.
   Requested codec mode for the other communication direction. The
   mapping of existing AMR modes is given in Table 1a in [2].

    0
    0 1 2
   +-+-+-+
   |Q|L|R|
   +-+-+-+

   Figure 1: AMR payload header, R=0

    0
    0 1 2 3 4 5 6
   +-+-+-+-+-+-+-+
   |Q|L|R|  CMR  |
   +-+-+-+-+-+-+-+

   Figure 2: AMR payload header, R=1
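
   As an illustration only, the header layout above could be read with
   a small bit reader such as the following C sketch. The type and
   function names (amr_bitreader, amr_payload_header, amr_read_bits,
   amr_parse_header) are hypothetical and not defined by this document.
   The sketch assumes that bit 0 in the figures is the most significant
   bit of the first payload octet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader; bit 0 is assumed to be the MSB of octet 0. */
typedef struct {
    const uint8_t *data;   /* payload octets */
    size_t         nbits;  /* number of bits available in data */
    size_t         pos;    /* index of the next bit to read */
} amr_bitreader;

/* Decoded payload header fields (names are illustrative). */
typedef struct {
    int q;    /* payload quality bit */
    int l;    /* LEN fields present and sensitivity sorting used */
    int r;    /* CMR field present */
    int cmr;  /* requested codec mode, -1 when r == 0 */
} amr_payload_header;

/* Read n bits (n <= 8), most significant first; -1 if data runs out. */
static int amr_read_bits(amr_bitreader *br, unsigned n)
{
    int v = 0;
    if (br->pos + n > br->nbits)
        return -1;
    while (n--) {
        int bit = (br->data[br->pos / 8] >> (7 - br->pos % 8)) & 1;
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}

/* Parse Q, L, R and the optional CMR. Returns the header length in
 * bits (3 or 7), or -1 if the payload is too short. */
static int amr_parse_header(amr_bitreader *br, amr_payload_header *h)
{
    size_t start = br->pos;
    int qlr = amr_read_bits(br, 3);
    if (qlr < 0)
        return -1;
    h->q = (qlr >> 2) & 1;
    h->l = (qlr >> 1) & 1;
    h->r = qlr & 1;
    h->cmr = -1;
    if (h->r) {
        h->cmr = amr_read_bits(br, 4);
        if (h->cmr < 0)
            return -1;
    }
    return (int)(br->pos - start);
}
```

   Under these assumptions, a first octet of 0xED (binary 11101101)
   parses as Q=1, L=1, R=1, CMR=6 with a header length of 7 bits.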


3.2.  AMR payload frame

   An AMR payload frame represents one encoded speech frame. Each
   payload frame includes several specified fields as follows:

   F (1 bit): Indicates if this frame is followed by further frames. F=1
   further frames follow, F=0 last frame.

   LEN (5 bits): OPTIONAL field, present if the payload header bit L is
   set (L=1) and this is not the last payload frame. LEN specifies the
   number of octets occupied by the FT field and the AMR encoded bits
   field in this frame. If LEN indicates more bits than the AMR mode
   given in the FT field requires, the number of bits implied by FT is
   the valid number of AMR encoded bits. If LEN indicates fewer bits
   than implied by the mode in the FT field, LEN gives the number of
   encoded bits. If a frame is transmitted only partially, the least
   sensitive bits at the end of the frame are omitted. This use is
   intended for partial redundant data.

   FT (4 bits): Frame type indicator, indicating the AMR speech coding
   mode or comfort noise (CN) mode. The mapping of existing AMR modes
   is given in Table 1a in [2]. If FT=15 (No transmission) no LEN or
   AMR encoded bits follow.

   AMR encoded bits: This is the speech codec encoded data field. The
   length of this field is either defined implicitly by the AMR mode in
   the FT field, or by the LEN field. The last payload frame SHALL
   always contain a full AMR frame, i.e. no LEN field is needed.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   LEN   |  FT   |                                           |
   +-+-+-+-+-+-+-+-+-+-+                                           +
   |                                                               |
   +                                                               +
   /                    AMR encoded bits                           /
   +                                                 +-+-+-+-+-+-+-+
   |                                                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 3: Payload frame format, F=1 and L=1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|  FT   |                                                     |
   +-+-+-+-+-+                                                     +
   |                                                               |
   +                                                               +
   /                    AMR encoded bits                           /
   +                                             +-+-+-+-+-+-+-+-+-+
   |                                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 4: Payload frame format, F=0 or L=0
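
   The interaction between FT and LEN described above can be sketched
   in C as follows. This is only an illustration: the function name
   amr_encoded_bits and the table name are hypothetical, and the bit
   counts per frame type are taken from the frame structure
   specification [2] rather than from this document. The sketch covers
   only frame types 0 to 8 and 15.

```c
#include <stddef.h>

/* Encoded bits per frame type: FT 0..7 are the speech modes
 * 4.75 .. 12.2 kbps, FT 8 is AMR comfort noise. Values assumed
 * from the frame structure specification [2]. */
static const int amr_bits_per_ft[9] = {
    95, 103, 118, 134, 148, 159, 204, 244, 39
};

/* Number of AMR encoded bits that follow the FT field.
 * len_octets is the LEN value, or a negative number when the frame
 * carries no LEN field. Returns -1 for frame types or LEN values
 * this sketch does not handle. */
static int amr_encoded_bits(int ft, int len_octets)
{
    int full, partial;

    if (ft == 15)
        return 0;                    /* no transmission */
    if (ft < 0 || ft > 8)
        return -1;
    full = amr_bits_per_ft[ft];
    if (len_octets < 0)
        return full;                 /* length implied by FT alone */
    partial = len_octets * 8 - 4;    /* LEN also counts the 4 FT bits */
    if (partial < 0)
        return -1;
    /* LEN larger than the mode requires: FT decides. Otherwise LEN. */
    return partial < full ? partial : full;
}
```

   For instance, under these assumptions a frame with FT=3 and LEN=12
   carries 12*8 - 4 = 92 encoded bits, while the same frame without a
   LEN field carries the full 134 bits.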


3.3. Payload block sorting

   A bit error in a more sensitive bit is subjectively more annoying
   than in a less sensitive bit. Therefore, to be able to protect the
   most sensitive bits in a payload packet with a forward error
   detection code, e.g. a CRC outside RTP, the bits inside a frame are
   ordered into sensitivity order. If the option to transmit redundant
   data is employed, the full RTP payload MUST be further sorted into
   sensitivity order. The protection MAY then cover an appropriate
   number of octets from the beginning of the payload. How many octets
   depends on the channel and application. This can for example be
   accomplished by UDP lite [10] (work in progress). To maintain
   sensitivity ordering inside the AMR payload when more than one speech
   frame is transmitted in one packet, reordering of the data is needed.
   The reordering is only performed if partial redundancy is used, i.e.
   L=1.

   The reordering to maintain the sensitivity ordered AMR payload SHALL
   be performed on bit level. The AMR payload header SHALL still be
   placed unchanged in the beginning of the payload. Thereafter, the
   payload frames are sorted with one bit alternating from each payload
   frame.

   +-------------+
   | h(0)-h(H-1) |
   +------------------------+
   | f(0,0) .. f(0,F(0)-1)  |
   +----------------------------+
   | f(1,0) .. f(1,F(1)-1)      |
   +----------------------------+
   | f(2,0) .. f(2,F(2)-1) |
   +-----------------------+
   \                          \
   +-------------------------------+
   | f(N-1,0) .. f(N-1,F(N-1)-1)   |
   +-------------------------------+

   Figure 5: The payload header and N payload frames before sorting.

   The sorting algorithm is described in C-style pseudocode using the
   following symbols:

   b(m)    - bit m of RTP final payload
   f(n,m)  - bit m in payload frame n
   F(n)    - number of bits in payload frame n, defined by FT or by LEN
   h(m)    - bit m of payload header
   H       - number of payload header bits, 3 or 7 bits
   N       - number of payload frames in the payload
   S       - number of unused bits

   Payload frames are numbered in consecutive order, i.e. frame n
   precedes frame n+1.

   The sorting algorithm is defined in C-style as:

   for (i = 0; i < H; i++){
     b(i) = h(i);
   }
   max = max(F(0),..,F(N-1));
   k = H;
   for (i = 0; i < max; i++){
     for (j = 0; j < N; j++){
       if (i < F(j)){
         b(k++) = f(j,i);
       }
     }
   }
   S = 8 - k%8;
   if (S < 8){
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }
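
   The pseudocode above can be turned into compilable C along the
   following lines. This is a sketch under stated assumptions, not a
   normative implementation: it stores one bit per array element to
   stay close to the b(), h() and f() notation, and the names
   amr_frame_bits and amr_sort_payload are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One payload frame as an array with one bit (0 or 1) per element.
 * The array holds f(n,0) .. f(n,F(n)-1), including F, LEN and FT. */
typedef struct {
    const uint8_t *bits;
    size_t         nbits;   /* F(n) */
} amr_frame_bits;

/* Bit-wise sorting: header first, then one bit from each frame in
 * turn. Output is one bit per element, zero padded to a whole octet.
 * Returns the number of bits written, or 0 if out is too small. */
static size_t amr_sort_payload(const uint8_t *hdr, size_t H,
                               const amr_frame_bits *fr, size_t N,
                               uint8_t *out, size_t out_cap)
{
    size_t i, j, k = 0, max = 0, total = H;

    for (j = 0; j < N; j++) {
        total += fr[j].nbits;
        if (fr[j].nbits > max)
            max = fr[j].nbits;           /* max(F(0),..,F(N-1)) */
    }
    total = (total + 7) / 8 * 8;         /* round up to octets */
    if (total > out_cap)
        return 0;

    for (i = 0; i < H; i++)
        out[k++] = hdr[i];               /* header stays in front */
    for (i = 0; i < max; i++)
        for (j = 0; j < N; j++)
            if (i < fr[j].nbits)
                out[k++] = fr[j].bits[i];
    while (k < total)
        out[k++] = 0;                    /* S unused bits set to zero */
    return k;
}
```

   Packing the resulting bit array into octets is left out here; it is
   a separate step that depends on the bit order chosen by the
   implementation.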

   Note that if multiple new frames are encapsulated into the payload
   and partial redundant data is not transmitted, payload bit-sorting
   SHALL NOT be performed but the payload is formed by concatenating the
   payload header and the bits from each AMR frame in the payload.
   However, the bits inside a frame are ordered into sensitivity order
   as defined in [2]. In this case the bits are stored into the payload
   according to the C-style algorithm below (see the definition of
   symbols above).

   for (i = 0; i < H; i++){
     b(i) = h(i);
   }
   k = H;
   for (j = 0; j < N; j++){
     for (i = 0; i < F(j); i++){
       b(k++) = f(j,i);
     }
   }
   S = 8 - k%8;
   if (S < 8){
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }
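
   A compilable variant of the concatenation case, writing directly
   into packed octets, might look as follows. The names amr_put_bit and
   amr_concat_payload are hypothetical, and the sketch assumes that bit
   0 of the payload is the most significant bit of the first octet.
   Input bits are given one per array element.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store one bit at position *k; bit 0 is assumed to be the MSB of
 * octet 0. The buffer must have been zeroed beforehand. */
static void amr_put_bit(uint8_t *payload, size_t *k, int bit)
{
    if (bit)
        payload[*k / 8] |= (uint8_t)(0x80u >> (*k % 8));
    (*k)++;
}

/* Concatenate H header bits and N frames of F[j] bits each.
 * Returns the payload length in octets, or 0 if cap is too small. */
static size_t amr_concat_payload(const uint8_t *hdr, size_t H,
                                 const uint8_t *const *frames,
                                 const size_t *F, size_t N,
                                 uint8_t *payload, size_t cap)
{
    size_t i, j, k = 0, total = H;

    for (j = 0; j < N; j++)
        total += F[j];
    if ((total + 7) / 8 > cap)
        return 0;
    memset(payload, 0, (total + 7) / 8);  /* also gives zero padding */

    for (i = 0; i < H; i++)
        amr_put_bit(payload, &k, hdr[i]);
    for (j = 0; j < N; j++)
        for (i = 0; i < F[j]; i++)
            amr_put_bit(payload, &k, frames[j][i]);
    return (k + 7) / 8;
}
```

   With a 3-bit header Q=1, L=0, R=0 and a single frame consisting of
   F=0, FT=2 and 118 encoded bits, this yields 16 octets whose first
   octet is 0x82.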


4. RTP header usage

   The RTP header marker bit (M) is used to mark (M=1) the packets
   containing the first speech frame after CN. In all other packets the
   marker bit is set to 0 (M=0).

   The timestamp corresponds to the sampling time of the first sample
   encoded for the first encoded speech frame in the packet. The
   timestamp unit is in samples. The duration of one AMR speech frame is
   20 ms and the sampling frequency is 8 kHz, corresponding to 160
   encoded speech samples per frame. Thus, the timestamp is increased by
   160 for each consecutive frame. All frames in a packet MUST be
   successive 20 ms frames.
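
   As a small worked example of the timestamp rule, the sampling time
   of frame number i inside a packet can be derived from the packet
   timestamp as sketched below. The names AMR_SAMPLES_PER_FRAME and
   amr_frame_timestamp are hypothetical. The sketch relies on unsigned
   32-bit arithmetic so that the result wraps in the same way as the
   RTP timestamp field.

```c
#include <stdint.h>

/* 20 ms at 8 kHz sampling frequency. */
#define AMR_SAMPLES_PER_FRAME 160u

/* Timestamp of the frame with the given index (0 = first frame in
 * the packet), assuming all frames are successive 20 ms frames. */
static uint32_t amr_frame_timestamp(uint32_t packet_ts, uint32_t index)
{
    return packet_ts + AMR_SAMPLES_PER_FRAME * index;
}
```

   For a packet with timestamp 1000, the third frame (index 2)
   corresponds to timestamp 1000 + 2*160 = 1320.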


5. Examples

5.1. Simple example

   In the simple example we send just one full frame (L=0) in each RTP
   packet, no Codec Mode Request (CMR) is sent (R=0), and the payload
   was not damaged at the IP origin (Q=1). In this example we transmit
   one frame encoded with the 5.9 kbps mode (FT=2). The 118 speech
   encoded bits are put into f(0) to f(117) in descending sensitivity
   order according to [2].

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  Q=1  |  L=0  |  R=0  |  F=0  |   0   |   0   |   1   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 | f(0)  | f(1)  | f(2)  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  | f(115)| f(116)| f(117)|   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 6: One frame per packet example.


5.2. Example with partial redundancy

   In this example the 6.7 kbps mode (FT=3) is sent with one redundant
   frame, also FT=3. Only a part of the redundant frame is sent, in this
   example 12 octets (L=1, LEN=12). A mode request is sent (R=1),
   requesting the 10.2 kbps mode for the other link (CMR=6). Since LEN
   counts the 4-bit FT field, the redundant frame (12 octets) carries
   the 92 encoded bits r(0) to r(91), and the current frame (134 bits)
   is f(0) to f(133).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  Q=1  |  L=1  |  R=1  |   0   |   1   |   1   |   0   |  F=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |  F=0  |   0   |   0   |   1   |   0   |   1   |   1   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    2 |   1   |   0   | f(0)  |   0   | f(1)  |   0   | f(2)  |   1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    3 | f(3)  |   1   | f(4)  | r(0)  | f(5)  | r(1)  | f(6)  | r(2)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    4 | f(7)  | r(3)  | f(8)  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   26 | f(95) | r(91) | f(96) | f(97) | f(98) |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   30 |  ...  |  ...  |  ...  |  ...  |  ...  | f(131)| f(132)| f(133)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 7: Example with partial redundancy.
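
   The size of the payload in Figure 7 can be checked with a little
   arithmetic, sketched here in C. The function names are hypothetical.
   Each payload frame is counted as the F bit, an optional 5-bit LEN
   field, the 4-bit FT field and the encoded bits, as described in
   section 3.2.

```c
#include <stddef.h>

/* Total bits of one payload frame: F, optional LEN, FT, encoded bits. */
static size_t amr_frame_total_bits(int has_len, size_t encoded_bits)
{
    return 1 + (has_len ? 5 : 0) + 4 + encoded_bits;
}

/* Payload length in octets: header bits plus all frame bits, rounded
 * up to a whole octet (the remainder is zero padding). */
static size_t amr_payload_octets(size_t header_bits,
                                 const size_t *frame_bits, size_t n)
{
    size_t total = header_bits, i;

    for (i = 0; i < n; i++)
        total += frame_bits[i];
    return (total + 7) / 8;
}
```

   For this example the redundant frame contributes 1 + 5 + 4 + 92 =
   102 bits and the current frame 1 + 4 + 134 = 139 bits. Together with
   the 7-bit header this gives 248 bits, i.e. exactly 31 octets
   (octets 0 to 30) with no padding.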


5.3. Example with multiple frames per payload

   In this example two 5.9 kbps mode (FT=2) frames are sent in one
   packet. No partial redundancy is used (L=0). A mode request is sent
   (R=1), requesting the 7.95 kbps mode for the other link (CMR=5).
   The first frame is represented by the 118 bits f(0) to f(117) and the
   subsequent frame by g(0) to g(117).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  Q=1  |  L=0  |  R=1  |   0   |   1   |   0   |   1   |  F=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   0   | f(0)  | f(1)  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  | f(115)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   16 | f(116)| f(117)|  F=0  |   0   |   0   |   1   |   0   | g(0)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   17 | g(1)  | g(2)  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 |  ...  |  ...  |  ...  | g(116)| g(117)|   0   |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 8: Example with two frames per payload.


6. The AMR MIME type registration

   This chapter defines the MIME type for the Adaptive Multi-Rate (AMR)
   speech codec [1]. The data format and parameters are specified for
   both real-time transport and for storage type applications (e.g.
   e-mail attachment, multimedia messaging). The former is referred to
   as RTP mode and the latter as storage mode.

   AMR implementations according to [1] MUST support all eight coding
   modes. The mode change can occur at any time during operation and
   therefore the mode information is transmitted in-band together with
   speech bits to allow mode change without any additional signaling.

   In addition to the speech codec, AMR specifications also include
   Discontinuous Transmission / comfort noise (DTX/CN) functionality
   [11]. The DTX/CN switches the transmission off during silent parts of
   the speech and only CN parameter updates are sent at regular
   intervals.


6.1  RTP mode

   It is possible that the decoder may want to receive a certain AMR
   mode or a subset of AMR modes. In end-to-end transmission, parts of
   the chain may have limitations in the number of modes in the active
   codec set, e.g. the GSM radio link can only use a subset of at most
   four modes. Therefore, it is possible to request a specific set of
   AMR modes in the capability description, and it is mandatory for the
   encoder to abide by this request. If no mode set is requested, the
   encoder can freely decide which AMR mode to use.

   Although in principle the AMR codec can perform a mode change at any
   time between any two modes, it is possible to set limitations for
   mode changes. The decoder can define the minimum number of frames
   between mode changes and can limit mode changes to neighboring modes
   only.

   In addition to the AMR DTX/CN scheme, the three codec standards that
   are part of AMR also have their own DTX/CN schemes ([6][7][12]). To
   enable interoperability with terminals supporting these standards,
   AMR can optionally be extended to support these CN schemes as well.
   The CN capabilities are signaled in the capability description. If no
   CN capabilities are reported, it is assumed that AMR CN is supported.
   If CN capabilities are reported, all supported CN types (including
   AMR CN) must be signaled.

   It is also possible to limit the number of AMR frames encapsulated
   into one RTP packet. This is an optional feature and if no parameter
   is given in the capability description, the transmitter can
   encapsulate any number of AMR speech frames into one RTP packet.

   There is also an option to retransmit one or more previously
   transmitted frames to help the receiver to recover from packet losses
   in difficult transmission conditions. It is also possible to transmit
   these frames only partially in such a way that only the most
   sensitive bits are transmitted. Since the transmission of partly
   redundant frames is an optional property, it can be used only if the
   receiver has signaled support for this functionality in the
   capability description. It is RECOMMENDED that partial redundancy be
   implemented and turned on at least for conversational services.


6.2 Storage mode

   For storing AMR frames, e.g. as a file or e-mail attachment, the AMR
   frames must be formatted according to Annex A of [2]. Because no
   exchange of particular coding parameters, e.g. specific DTX/CN mode,
   can be signaled before downloading or receiving stored AMR data, the
   receiving entity (AMR decoder) MUST be able to decode all eight
   coding modes as well as the AMR DTX/CN [11].


6.3 MIME Registration

   The MIME name for the AMR codec is allocated from the IETF tree since
   AMR is expected to be a widely used speech codec in VoIP
   applications. Some parts of this chapter distinguish between RTP and
   storage modes.

   Media Type name:     audio

   Media subtype name:  AMR

   Required parameters: none

   Optional parameters for RTP mode:
    ptime:     Definition as usual in RTP audio.
    mode-set:  Requested AMR mode set. Restricts the active codec mode
               set to a subset of all modes. Possible values are:
               0,...,7 (see Table 1a [2]). If not present, all speech
               modes are available.
    mode-change-period: Defines a number N which restricts the mode
               changes in such a way that mode changes are only allowed
               on multiples of N frames; the initial state of the phase
               is arbitrary. If this parameter is not present, mode
               changes can happen at any time.
    mode-change-neighbor: If present, mode changes SHALL be made to
               neighboring modes only. If not present, change between
               any two modes is allowed.
    amr-cn:    If present, GSM AMR DTX/CN is supported. Note that if no
               CN capabilities are reported, AMR DTX/CN is assumed to
               be supported, i.e. this parameter is only sent together
               with one of the following CN parameters.
    pdc-efr-cn:If present, PDC-EFR DTX/CN is supported, otherwise not
               supported.
    is-641-cn: If present, IS-641 DTX/CN is supported, otherwise not
               supported.
    gsm-efr-cn:If present, GSM EFR DTX/CN is supported, otherwise not
               supported.
    maxframes: Maximum number of AMR speech frames in one RTP packet.
               The receiver may set this parameter in order to limit
               the buffering requirements or delay.
    redundancy:If present, transmission of partly redundant frames is
               supported, otherwise not supported.

   Optional parameters for storage mode:     none

   Encoding considerations for RTP mode: See section 3 in this document.

   Encoding considerations for storage mode: Each audio frame must be
   formatted in octet format according to AMR Interface Format 2 (AMR
   IF2) specified in Annex A of [2]. The audio frames must be stored in
   sequential order. This implies that the first octet after frame n
   must be the first octet of frame (n+1). Furthermore, missing frames
   and non-received frames between CN updates during non-speech period
   must be stored as NO_TRANSMISSION frames (frame type 15, see
   definition in [2]). Each receiving entity that accepts this MIME type
   must be able to decode all eight AMR coding modes [1] and the AMR
   DTX/CN [11].

   Security considerations: none

   Interoperability considerations for RTP mode: If CN capabilities are
   not signaled in the capability description, only AMR CN is supported.

   Public specification: please refer to chapter 7 "References".

   Additional information for storage mode:
     Magic number: none
     File extensions: amr, AMR
     Macintosh file type code: none
     Object identifier or OID: none

   Person & email address to contact for further information:
     johan.sjoberg@ericsson.com
     ari.lakaniemi@nokia.com

   Intended usage: COMMON. It is expected that many VoIP applications
   (as well as mobile applications) will use this type.

   Author/Change controller:
     johan.sjoberg@ericsson.com
     ari.lakaniemi@nokia.com


6.4 Mapping to SDP Parameters

   Please note that this chapter applies to the RTP mode only.

   Parameters are mapped to SDP [13] as usual.
   Example usage in SDP:
    m=audio 49120 RTP/AVP 97
    a=rtpmap:97 AMR/8000
    a=fmtp:97 mode-set=0,2,5,7; maxframes=2
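
   A receiver of such a description needs to extract the requested
   mode set from the fmtp parameters. The following C sketch shows one
   possible way; the function name amr_parse_mode_set is hypothetical,
   and the exact parameter syntax is assumed from the example above
   rather than defined by this document. The simple substring search
   would also match a longer parameter name ending in "mode-set=",
   which a complete parser would have to rule out.

```c
#include <string.h>

/* Extract "mode-set=a,b,.." from an fmtp parameter string.
 * Returns a bit mask with bit m set for each requested mode m (0..7),
 * 0xFF if no mode-set is present (all speech modes available),
 * or -1 if the value is malformed. */
static int amr_parse_mode_set(const char *fmtp)
{
    const char *p = strstr(fmtp, "mode-set=");
    int mask = 0;

    if (p == NULL)
        return 0xFF;
    p += strlen("mode-set=");
    for (;;) {
        if (*p < '0' || *p > '7')
            return -1;               /* only modes 0..7 are valid */
        mask |= 1 << (*p - '0');
        p++;
        if (*p != ',')
            break;
        p++;                         /* skip the comma */
    }
    /* The list must end at a parameter separator or end of string. */
    if (*p != '\0' && *p != ';' && *p != ' ')
        return -1;
    return mask;
}
```

   For the fmtp line in the example, the modes 0, 2, 5 and 7 give the
   mask 0xA5.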




7.   References

   [1]  GSM 06.90, "Adaptive Multi-Rate (AMR) speech transcoding".

   [2]  3G TS 26.101, "AMR Speech Codec Frame Structure".

   [3]  IETF RFC 2119, "Key words for use in RFCs to Indicate
        Requirement Levels".

   [4]  3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
        operation".

   [5]  GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".

   [6]  TIA/EIA -136-Rev.A, part 410 - "TDMA Cellular/PCS - Radio
        Interface, Enhanced Full Rate Voice Codec (ACELP). Formerly IS-
        641. TIA published standard, 1998".

   [7]  ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
        System RCR Standard".

   [8]  IETF RFC1889, "RTP: A Transport Protocol for Real-Time
        Applications".

   [9]  IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
        over Cellular Access Networks".

   [10] IETF draft-larzon-udplite-02.txt, "The UDP Lite Protocol".

   [11] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
        speech traffic channels".

   [12] GSM 06.62, "Comfort noise aspects for Enhanced Full Rate (EFR)
        speech traffic channels".

   [13] M. Handley and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [14] 3G TS 25.415 "UTRAN Iu Interface User Plane Protocols"


8. Authors' addresses

   Johan Sjoberg
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Johan.Sjoberg@ericsson.com

   Magnus Westerlund
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Magnus.Westerlund@ericsson.com

   Ari Lakaniemi
   Nokia Research Center
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland
   E-mail: ari.lakaniemi@nokia.com

   Petri Koskelainen
   Nokia Research Center
   P.O.Box 100
   FIN-33721 Tampere
   Finland
   E-mail: petri.koskelainen@nokia.com


   This Internet-Draft expires January 14, 2001.