Internet Engineering Task Force                  Johan Sjoberg, Ericsson
Audio Video Transport WG                     Magnus Westerlund, Ericsson
INTERNET-DRAFT                                      Ari Lakaniemi, Nokia
December 22, 2000                               Petri Koskelainen, Nokia
Expires: June 22, 2001                          Bernhard Wimmer, Siemens
                                                Tim Fingscheidt, Siemens
                                                  Qiaobing Xie, Motorola
                                                  Sanjay Gupta, Motorola


                       RTP payload format for AMR
                    <draft-ietf-avt-rtp-amr-02.txt>


Status of this Memo


   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is an individual submission to the IETF. Comments
   should be directed to the authors.


Abstract

   This document describes a proposed real-time transport protocol (RTP)
   [8] payload format for AMR encoded speech signals [1]. The AMR
   payload format is designed to interoperate with existing AMR
   transport formats. This document also includes a MIME type
   registration for AMR. The MIME type is specified for both real-time
   transport and storage.






Sjoberg et al.                                                  [Page 1]


INTERNET-DRAFT         RTP Payload Format for AMR      December 22, 2000




1. Introduction

   The adaptive multi-rate (AMR) speech codec [1] was developed by the
   European Telecommunications Standards Institute (ETSI). The AMR codec
   is standardized for GSM, and has also been chosen by 3GPP as the
   mandatory codec for third generation systems. It is currently under
   standardization for TDMA. The AMR codec will therefore be widely used
   in cellular systems. It was developed to preserve high speech quality
   under a wide range of transmission conditions.

   The AMR codec is a multi-mode codec with 8 narrow band speech modes
   with bit rates between 4.75 and 12.2 kbps. The sampling frequency is
   8000 Hz and processing is done on 20 ms frames, i.e. 160 samples per
   frame. The AMR modes are closely related to each other and use the
   same coding framework. Three of the AMR modes have already been
   adopted as standards of their own: the 6.7 kbps mode as PDC-EFR [7],
   the 7.4 kbps mode as the IS-641 codec in TDMA [6], and the 12.2 kbps
   mode as GSM-EFR [5].

   The AMR codec is designed with a voice activity detector (VAD) and
   generation of comfort noise (CN) parameters during silence periods.
   Hence, the AMR codec can reduce the number of transmitted bits and
   packets during silence periods to a minimum. The operation to send CN
   parameters at regular intervals during silence periods is usually
   called discontinuous transmission (DTX) or source controlled rate
   (SCR) operation.

   AMR implementations must support all 8 speech coding modes, and mode
   switching can occur to any mode at any time. The mode information
   must therefore be transmitted together with the speech encoded bits,
   to indicate the mode. The AMR speech codec is designed with modes
   producing different bit rates to be able to adapt the source bit rate
   according to the radio link quality in mobile phone systems. The
   objective was to give highest possible speech quality under a variety
   of radio channel conditions. To realize rate adaptation the decoder
   needs to signal the mode it prefers to receive to the encoder.

   Due to its flexibility and robustness, AMR is also suitable for
   purposes other than circuit switched cellular systems, such as
   real-time services over packet switched networks. The payload format
   should be designed to support methods for increasing robustness to
   both bit errors and packet loss. The speech encoded bits have
   different perceptual sensitivity to bit errors, and cellular systems
   exploit this by using unequal error protection and detection (UEP
   and UED). This could also be done in IP networks. The UEP/UED
   mechanism concentrates the correction and detection of corrupted
   bits on the perceptually most sensitive bits. A frame is only
   regarded as lost or damaged if errors are detected in the most
   sensitive bits. UED can also be employed for RTP if, for example,
   UDP-lite is used as the transport layer protocol (UDP-lite [10] is
   work in progress). To facilitate this, the most error sensitive bits
   have to be transmitted first. The special problems with IP real-time
   traffic over cellular access networks are further discussed in [9].

   Other AMR scenarios are possible, e.g. one end is circuit switched
   GSM, connected through a gateway to an IP network, with an IP
   terminal at the other end. To improve quality, frames damaged by the
   GSM radio link should also be transmitted to the decoder in the IP
   network. To make this possible, frame quality information has to be
   transmitted over the IP network. The quality bit is also needed for
   the AMR RTP payload format to interwork with, for example, the ATM
   AAL2 AMR profile.


2.  Requirements

   The AMR payload format for RTP was designed to meet the following
   requirements:

     o Different levels of robustness must be supported, from no
      redundant data to extreme robustness capable of handling very
      high packet loss rates with no or small speech quality
      degradation.

     o Fast, bandwidth efficient, frame-wise AMR mode adaptation must
      be supported. This means that it must be possible to send Codec
      Mode Requests back from the receiving side to the transmitting
      side with information on the preferred mode.

     o Source controlled rate operation (SCR) (also called DTX) and
      comfort noise parameter (CN) transmission defined in AMR must be
      supported.


3. Payload format

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [3].

   The AMR payload format is designed to be flexible, ranging from very
   low overhead to an extended format with the possibility to increase
   bit error robustness and pack several speech frames in one packet.

   The payload format consists of one payload header, a table of
   content, optionally one CRC per payload frame and zero or more
   payload frames. The payload format is bandwidth efficient: the
   payload header, the table of content and the payload frames are not
   individually octet aligned; only the full payload is. If the option
   to transmit a robust sorted payload is enabled and employed, the full
   payload SHALL finally be ordered in descending bit error sensitivity
   order to be prepared for unequal error protection or unequal error
   detection schemes, e.g. UDP-lite [10]. The AMR encoded bit streams
   are defined in sensitivity order in Annex B of [2]; the original
   order as delivered from the speech encoder is defined in [1].

   The last octet of an AMR payload packet is padded with zeroes at the
   end if not all bits are used.

   The AMR frame types, or modes, are defined in [2]. Frame type 15, no
   transmission, is needed to indicate frames that were not transmitted
   or were lost. Not transmitted can mean either that the speech encoder
   produced no data for this frame or that no data for it is carried in
   this payload, i.e. valid data for this frame could be sent in another
   payload. This occurs, for example, when multiple frames are sent in
   each payload and comfort noise starts. In a payload with 8 frames
   where speech frames in AMR mode 7 are interrupted by CN in the fifth
   frame, the frame type sequence could look like: {7,7,7,7,8,15,15,8}.
   The AMR SCR/DTX is described in [4].

   The AMR payload format supports robust transmission, multiple frames
   in one payload packet, and the use of fast codec mode adaptation.

   Robustness against packet loss can be accomplished by using the
   possibility to retransmit previously transmitted frames together with
   the current frame or frames. This is only possible if the number of
   frames per payload is not restricted by the MIME parameters.

   The AMR performance over error tolerant links can be improved by
   also delivering speech frames with bit errors. Unequal error
   detection is needed since bit errors SHOULD only be allowed in the
   least error sensitive bits. This payload format provides two
   alternative methods to implement unequal error detection:

   A. CRC calculation over the class A speech bits

      If several consecutive speech frames are packed into each
      payload, the optional CRC may be used to protect the class A
      speech bits, see table 1. The number of class A bits is specified
      as informative in [2] and therefore copied into table 1 as
      normative for this payload format. Speech frames with errors in
      class A bits MUST be marked with SPEECH_BAD for corrupted speech
      frames (FT=0..7) or SID_BAD for corrupted SID frames (FT=8) and
      be sent to the speech decoder, see [4]. In this case the RTP
      header, payload header, table of content and CRC should be
      covered by a transport layer CRC, e.g. UDP-lite [10]. Packets
      should be discarded if the transport layer CRC detects errors.

   B. Robust sorting of payload bits

      Robust behavior can also be accomplished by robust sorting of the
      payload. This enables the use of UED (e.g. UDP-lite) and UEP
      (e.g. ULP [15]). The UED and/or UEP is recommended to cover at
      least the RTP header, payload header, table of content and the
      most bit error sensitive bits.

                     Class A   total speech
   Index   Mode       bits       bits
   ----------------------------------------
     0     AMR 4.75   42         95
     1     AMR 5.15   49        103
     2     AMR 5.9    55        118
     3     AMR 6.7    58        134
     4     AMR 7.4    61        148
     5     AMR 7.95   75        159
     6     AMR 10.2   65        204
     7     AMR 12.2   81        244
     8     AMR CNG    39         39

   Table 1. Specification of the number of class A bits.
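
   Table 1 can be carried in an implementation as a small lookup table.
   The following is a minimal sketch in C, not part of the payload
   format itself. The names amr_frame_info, amr_speech_bits and
   amr_class_a_bits are hypothetical. FT=15 is treated as carrying no
   bits, and frame types that Table 1 does not list are reported as -1.

```c
/* Bit counts per frame type, copied from Table 1.
 * The struct and function names are illustrative only. */
struct amr_frame_info {
    int class_a_bits;  /* perceptually most sensitive bits */
    int total_bits;    /* all speech (or CN) bits in the frame */
};

static const struct amr_frame_info amr_table1[9] = {
    {42,  95},  /* 0: AMR 4.75 */
    {49, 103},  /* 1: AMR 5.15 */
    {55, 118},  /* 2: AMR 5.9  */
    {58, 134},  /* 3: AMR 6.7  */
    {61, 148},  /* 4: AMR 7.4  */
    {75, 159},  /* 5: AMR 7.95 */
    {65, 204},  /* 6: AMR 10.2 */
    {81, 244},  /* 7: AMR 12.2 */
    {39,  39},  /* 8: AMR CNG  */
};

/* Number of payload frame bits for frame type ft: 0 for FT=15
 * (no transmission), -1 for frame types not listed in Table 1. */
int amr_speech_bits(int ft)
{
    if (ft >= 0 && ft <= 8)
        return amr_table1[ft].total_bits;
    if (ft == 15)
        return 0;
    return -1;
}

/* Number of class A bits for frame type ft, with the same
 * conventions as amr_speech_bits(). */
int amr_class_a_bits(int ft)
{
    if (ft >= 0 && ft <= 8)
        return amr_table1[ft].class_a_bits;
    if (ft == 15)
        return 0;
    return -1;
}
```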

   A frame quality indicator is included for interoperability with the
   ATM payload format described in ITU-T I.366.2, the UMTS Iu interface
   [13] and other transport formats. The speech quality is increased if
   damaged frames are forwarded to the speech decoder error concealment
   unit and not dropped. In many communication scenarios the AMR encoded
   bits will be transmitted from one IP/UDP/RTP terminal to a terminal
   in a system with another transport format and/or vice versa. The
   transport format transcoding will be done in a gateway. A second
   likely scenario is that IP/UDP/RTP is used as transport between other
   systems, i.e. IP is originated and terminated in gateways on both
   sides of the IP transport.

    AMR over
    I.366.{2,3} or +------+                        +----------+
    3G Iu or       |      |     IP/UDP/RTP/AMR     |          |
    -------------->|  GW  |----------------------->| TERMINAL |
    GSM Abis       |      |                        |          |
    etc.           +------+                        +----------+

   Figure 1: GW to VoIP terminal scenario


    AMR over                                             AMR over
    I.366.{2,3} or +------+                     +------+ I.366.{2,3} or
    3G Iu or       |      |   IP/UDP/RTP/AMR    |      | 3G Iu or
    -------------->|  GW  |-------------------->|  GW  |--------------->
    GSM Abis       |      |                     |      | GSM Abis
    etc.           +------+                     +------+ etc.

   Figure 2: GW to GW scenario


3.1. The payload header

   The length of the payload header is 6 bits. The bits in the header
   are specified as follows:

   S (1 bit): Indicates, if set, that the payload is robust sorted;
   otherwise simple payload sorting is employed. Note that this bit can
   be set only if the receiver has signaled support for the optional
   robust payload sorting.

   C (1 bit): Indicates the existence of optional CRC fields in the
   payload table of content. Note that this bit can be set only if the
   receiver has signaled support for the optional CRC.

   R (1 bit): Indicates, if set, that the Codec Mode Request (CMR) is
   valid.

   CMR (3 bits): Requested codec mode for the other communication
   direction. This field is only valid if the R bit is set (R=1). The
   mapping of existing AMR modes to CMR is given by the three least
   significant bits in Table 1a in [2]. If R=0 the CMR bits shall be set
   to zero; other values are reserved for future use.



    0
    0 1 2 3 4 5
   +-+-+-+-+-+-+
   |S|C|R| CMR |
   +-+-+-+-+-+-+

   Figure 3: AMR payload header
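
   The header layout above can be packed and unpacked with a few shifts.
   The following is a minimal sketch in C under two assumptions: bit 0
   of the figure is the most significant bit of the first payload
   octet (as in the examples in section 7), and the struct and function
   names are hypothetical. The two low-order bits of the first octet
   belong to the first table of content entry and are left at zero by
   the packing function.

```c
#include <stdint.h>

/* Illustrative container for the 6-bit payload header. */
struct amr_payload_header {
    unsigned s;    /* 1 = robust sorted payload */
    unsigned c;    /* 1 = CRC fields present */
    unsigned r;    /* 1 = CMR field is valid */
    unsigned cmr;  /* requested mode, 3 bits; sent as zero when r == 0 */
};

/* Packs the header into the six most significant bits of an octet.
 * The two least significant bits are left zero for the caller to
 * fill with the F and Q bits of the first ToC entry. */
uint8_t amr_pack_header(const struct amr_payload_header *h)
{
    unsigned cmr = h->r ? (h->cmr & 0x7u) : 0u;

    return (uint8_t)(((h->s & 1u) << 7) | ((h->c & 1u) << 6) |
                     ((h->r & 1u) << 5) | (cmr << 2));
}

/* Extracts the header fields from the first payload octet. */
void amr_unpack_header(uint8_t first_octet, struct amr_payload_header *h)
{
    h->s   = (first_octet >> 7) & 1u;
    h->c   = (first_octet >> 6) & 1u;
    h->r   = (first_octet >> 5) & 1u;
    h->cmr = (first_octet >> 2) & 0x7u;
}
```

   With S=0, C=1, R=1 and CMR=6, as in the example in section 7.2, the
   packed header bits are 0 1 1 1 1 0.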


3.2. The payload table of content and CRCs

   The table of content (ToC) consists of one table of content entry for
   each speech frame in the payload. A table of content entry includes
   several specified fields as follows:

   F (1 bit): Indicates if this frame is followed by further frames. F=1
   further frames follow, F=0 last frame.

   Q (1 bit): The payload quality bit indicates, if not set, that the
   payload is severely damaged and the receiver should set the RX_TYPE,
   see [4], to SPEECH_BAD or SID_BAD depending on the frame type (FT).

   FT (4 bits): Frame type indicator, indicating the AMR speech coding
   mode or comfort noise (CN) mode. The mapping of existing AMR modes to
   FT is given in Table 1a in [2]. If FT=15 (No transmission) no CRC or
   payload frame is present.

    0
    0 1 2 3 4 5
   +-+-+-+-+-+-+
   |F|Q|  FT   |
   +-+-+-+-+-+-+

   Figure 4: Table of content entry field

   CRC (8 bits): OPTIONAL field, exists if the payload header bit C is
   set (C=1). The 8 bit CRC is used for error detection. These 8 parity
   bits are generated according to section 4.1.4 in [2].

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |      CRC      |
   +-+-+-+-+-+-+-+-+

   Figure 5: CRC field

   The ToC and CRCs are arranged with all table of content entry
   fields first, followed by all CRC fields. The ToC starts with the
   entry belonging to the oldest speech frame.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Q|  FT   |F|Q|  FT   |F|Q|  FT   |      CRC      |      CRC  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |      CRC      |
   +-+-+-+-+-+-+-+-+-+-+

   Figure 6: The ToC and CRCs for a payload with three speech frames
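
   Because neither the header nor the ToC entries are octet aligned, a
   receiver has to walk the ToC bit by bit until it finds an entry with
   F=0. The following is a minimal sketch in C of such a walk. The
   helper get_bit, the struct amr_toc_entry and the function
   amr_parse_toc are hypothetical names, and bit 0 is assumed to be the
   most significant bit of octet 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads bit number pos of buf, counting bit 0 as the most
 * significant bit of octet 0 (assumed bit order). */
static unsigned get_bit(const uint8_t *buf, size_t pos)
{
    return (buf[pos / 8] >> (7 - pos % 8)) & 1u;
}

struct amr_toc_entry {
    unsigned f;   /* 1 = further frames follow */
    unsigned q;   /* 0 = frame severely damaged */
    unsigned ft;  /* frame type, 0..15 */
};

/* Parses the ToC entries that follow the 6-bit payload header.
 * Returns the number of entries, or -1 if the ToC runs past the end
 * of the payload or needs more than max_entries entries. */
int amr_parse_toc(const uint8_t *payload, size_t len,
                  struct amr_toc_entry *toc, int max_entries)
{
    size_t pos = 6;  /* first bit after the payload header */
    int n = 0;

    for (;;) {
        if (n >= max_entries || pos + 6 > len * 8)
            return -1;
        toc[n].f  = get_bit(payload, pos);
        toc[n].q  = get_bit(payload, pos + 1);
        toc[n].ft = (get_bit(payload, pos + 2) << 3) |
                    (get_bit(payload, pos + 3) << 2) |
                    (get_bit(payload, pos + 4) << 1) |
                     get_bit(payload, pos + 5);
        pos += 6;
        n++;
        if (toc[n - 1].f == 0)
            return n;  /* last entry reached */
    }
}
```

   The CRC fields, when present, start at bit 6 + 6*N, where N is the
   value returned by the function.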


3.3. AMR speech frame

   An AMR speech frame represents one encoded speech frame, encoded
   with the mode given by the ToC field FT. The length of this field is
   implicitly defined by the AMR mode in the FT field. The bits SHALL be
   sorted according to Annex B of [2].


3.4. Compound AMR payload

   The compound AMR payload consists of one AMR payload header, the
   table of content and one or more AMR payload frames, see sections
   3.1, 3.2 and 3.3. These can be put together with robust or simple
   payload sorting. The payload header bit S indicates the method used.

   Definitions for describing the compound AMR payload:

   b(m)    - bit m of the compound AMR payload
   t(n,m)  - bit m in the table of content entry for speech frame n
   p(n,m)  - bit m in the CRC for speech frame n
   f(n,m)  - bit m in speech frame n
   F(n)    - number of bits in speech frame n, defined by FT
   h(m)    - bit m of payload header
   C       - number of CRC bits per frame, 0 or 8
   N       - number of payload frames in the payload
   S       - number of unused bits

   Payload frames f(n,m) are placed in consecutive order, so that
   frame n precedes frame n+1 in time. Within one payload all frames
   between the oldest and the most recent must be present. If speech
   data is missing for a frame, e.g. due to DTX, the NO_TRANSMISSION
   frame type (FT=15) is sent for that frame.


3.4.1. Robust payload sorting

   A bit error in a more sensitive bit is subjectively more annoying
   than in a less sensitive bit. Therefore, to be able to protect only
   the most sensitive bits in a payload packet with a forward error
   detection code, e.g. a CRC outside RTP, the bits inside a frame are
   ordered into sensitivity order. The protection SHOULD cover an
   appropriate number of octets from the beginning of the payload,
   covering at least the AMR payload header, ToC and class A bits (see
   [2]). Exactly how many octets need protection depends on the
   network and application. To maintain sensitivity ordering inside the
   AMR payload, when more than one speech frame is transmitted in one
   payload, reordering of the data is needed.

   The reordering to maintain the sensitivity ordered AMR payload SHALL
   be performed on bit level. The AMR payload header, ToC and CRCs SHALL
   still be placed unchanged in the beginning of the payload.
   Thereafter, the payload frames are sorted with one bit alternating
   from each payload frame.

   The robust payload sorting algorithm is defined in C-style as:

   /* payload header */
   k=0;
   for (i = 0; i < 6; i++){
     b(k++) = h(i);
   }
   /* table of content */
   for (j = 0; j < N; j++){
     for (i = 0; i < 6; i++){
       b(k++) = t(j,i);
     }
   }
   /* CRCs */
   for (j = 0; j < N; j++){
     for (i = 0; i < C; i++){
       b(k++) = p(j,i);
     }
   }
   /* payload frames */
   max = max(F(0),..,F(N-1));
   for (i = 0; i < max; i++){
     for (j = 0; j < N; j++){
       if (i < F(j)){
         b(k++) = f(j,i);
       }
     }
   }
   /* padding */
   S = 8 - k%8;
   if (S < 8){
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }
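
   The pseudocode above can be turned into compilable code once a bit
   representation is chosen. The following is a minimal sketch in C,
   not a reference implementation. It assumes that the input bits are
   supplied one bit per array element, that bit 0 of the payload is the
   most significant bit of octet 0, and that CRC bits are present for
   every frame when C is 8, as in the pseudocode. The names put_bit and
   amr_robust_sort are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sets bit pos of buf if bit is non-zero; bit 0 is the MSB of octet 0. */
static void put_bit(uint8_t *buf, size_t pos, unsigned bit)
{
    if (bit)
        buf[pos / 8] |= (uint8_t)(0x80u >> (pos % 8));
}

/* Robust payload sorting for N frames.
 *   hdr    : the 6 header bits h(0..5), one bit per element
 *   toc    : 6*N ToC bits, entry j at toc[6*j .. 6*j+5]
 *   crc    : C*N CRC bits, frame by frame (unused when C == 0)
 *   frames : frames[j] points to the F[j] bits of frame j
 *   out    : receives the payload; must hold the returned size
 * Returns the payload length in octets. */
size_t amr_robust_sort(const uint8_t hdr[6], const uint8_t *toc,
                       const uint8_t *crc, int C,
                       const uint8_t *const *frames, const int *F,
                       int N, uint8_t *out)
{
    size_t k = 0;
    size_t total = 6 + (size_t)N * (6 + (size_t)C);
    int i, j, max = 0;

    for (j = 0; j < N; j++) {
        total += (size_t)F[j];
        if (F[j] > max)
            max = F[j];
    }
    size_t octets = (total + 7) / 8;
    memset(out, 0, octets);        /* padding bits stay zero */

    for (i = 0; i < 6; i++)        /* payload header */
        put_bit(out, k++, hdr[i]);
    for (i = 0; i < 6 * N; i++)    /* table of content */
        put_bit(out, k++, toc[i]);
    for (i = 0; i < C * N; i++)    /* CRCs */
        put_bit(out, k++, crc[i]);
    for (i = 0; i < max; i++)      /* one bit from each frame in turn */
        for (j = 0; j < N; j++)
            if (i < F[j])
                put_bit(out, k++, frames[j][i]);
    return octets;
}
```

   Clearing the output buffer up front replaces the explicit padding
   loop of the pseudocode; the resulting payload bits are the same.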


3.4.2. Simple payload sorting

   If multiple new frames are encapsulated into the payload and robust
   payload sorting is not used, the payload is formed by concatenating
   the payload header, the ToC, optional CRC fields and the speech
   frames in the payload. However, the bits inside a frame are ordered
   into sensitivity order as defined in [2].

   The simple payload sorting algorithm is defined in C-style as:

   /* payload header */
   k=0;
   for (i = 0; i < 6; i++){
     b(k++) = h(i);
   }
   /* table of content */
   for (j = 0; j < N; j++){
     for (i = 0; i < 6; i++){
       b(k++) = t(j,i);
     }
   }
   /* CRCs */
   for (j = 0; j < N; j++){
     for (i = 0; i < C; i++){
       b(k++) = p(j,i);
     }
   }
   /* payload frames */
   for (j = 0; j < N; j++){
     for (i = 0; i < F(j); i++){
       b(k++) = f(j,i);
     }
   }
   /* padding */
   S = 8 - k%8;
   if (S < 8){
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }
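
   With simple payload sorting each frame occupies a contiguous run of
   bits, so a receiver can locate frame n directly from the frame
   lengths. The following is a minimal sketch in C that mirrors the bit
   counting of the pseudocode above (C bits of CRC for every ToC
   entry). The function name amr_simple_frame_offset is hypothetical.

```c
#include <stddef.h>

/* Bit offset of speech frame n (0-based) in a simply sorted payload.
 *   F : number of bits in each of the N frames
 *   C : number of CRC bits per frame, 0 or 8
 * The offset counts from bit 0 of the payload. */
size_t amr_simple_frame_offset(const int *F, int N, int C, int n)
{
    /* 6 header bits, then N ToC entries and N CRC fields */
    size_t pos = 6 + (size_t)N * (6 + (size_t)C);

    for (int j = 0; j < n; j++)
        pos += (size_t)F[j];       /* skip the frames before frame n */
    return pos;
}
```

   The octet index of a frame start is the offset divided by 8, and the
   bit position within that octet is the remainder.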


3.5. Decoding security consideration

   If the payload length calculated from the C, F and FT fields does
   not match the size of the payload actually received, the payload
   should be dropped. Decoding a packet that has errors in the length
   indicator bits could severely degrade the speech quality.
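
   The length check can be sketched as follows in C. This is an
   illustration only. The names amr_bits, amr_expected_octets and
   amr_length_ok are hypothetical. The sketch assumes that an FT=15
   entry carries neither CRC nor payload frame, as stated in section
   3.2 (the sorting pseudocode makes no such exception), and it rejects
   frame types that Table 1 does not list.

```c
#include <stddef.h>

/* Speech bits per frame type, from Table 1. FT=15 carries no bits;
 * frame types not listed in Table 1 are marked -1. */
static const int amr_bits[16] = {
    95, 103, 118, 134, 148, 159, 204, 244, 39,
    -1, -1, -1, -1, -1, -1, 0
};

/* Payload size in octets implied by the header and ToC, or 0 if a
 * frame type is not listed in Table 1.
 *   ft       : FT values of the n_frames ToC entries
 *   has_crc  : value of the payload header C bit */
size_t amr_expected_octets(const unsigned *ft, int n_frames, int has_crc)
{
    size_t bits = 6 + 6 * (size_t)n_frames;  /* header + ToC */

    for (int j = 0; j < n_frames; j++) {
        if (ft[j] > 15 || amr_bits[ft[j]] < 0)
            return 0;
        if (has_crc && ft[j] != 15)
            bits += 8;                       /* one CRC per frame */
        bits += (size_t)amr_bits[ft[j]];
    }
    return (bits + 7) / 8;                   /* round up to octets */
}

/* Non-zero if a payload of `received` octets matches its ToC. */
int amr_length_ok(const unsigned *ft, int n_frames, int has_crc,
                  size_t received)
{
    size_t expected = amr_expected_octets(ft, n_frames, has_crc);

    return expected != 0 && expected == received;
}
```

   A receiver following this section would drop any payload for which
   amr_length_ok returns zero.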


4. RTP header usage

   The RTP header marker bit (M) is used to mark (M=1) the packets
   containing the first speech frame after CN. For all other packets
   the marker bit is set to 0 (M=0).

   The timestamp corresponds to the sampling time of the first sample
   encoded for the first encoded speech frame in the packet. The
   timestamp unit is in samples. The duration of one AMR speech frame is
   20 ms and the sampling frequency is 8 kHz, corresponding to 160
   encoded speech samples per frame. Thus, the timestamp is increased by
   160 for each consecutive frame. All frames in a packet MUST be
   successive 20 ms frames.
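
   Since every frame covers 160 samples, the sampling time of any frame
   in a packet follows from the packet timestamp and the position of
   the frame in the payload. The following is a minimal sketch in C;
   the macro and function names are hypothetical.

```c
#include <stdint.h>

#define AMR_SAMPLES_PER_FRAME 160u  /* 20 ms at 8 kHz */

/* Timestamp of the frame at position index (0 = oldest frame) in a
 * packet whose RTP timestamp is packet_ts. The 32-bit RTP timestamp
 * wraps around, so the addition is done modulo 2^32. */
uint32_t amr_frame_timestamp(uint32_t packet_ts, unsigned index)
{
    uint32_t offset = (uint32_t)(AMR_SAMPLES_PER_FRAME * index);

    return (uint32_t)(packet_ts + offset);
}
```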


5. Congestion Control

   The need for congestion control for data transported with RTP has to
   be considered. AMR speech data have some elastic properties due to
   the different bandwidth demand of each mode. Another parameter that
   can reduce the bandwidth demand for AMR is the number of frames of
   speech data encapsulated in each payload. Packing more frames per
   payload reduces the number of packets and the overhead from
   IP/UDP/RTP headers. If forward error correction (FEC) is used, the
   amount of FEC also needs to be regulated, so that the FEC itself does
   not worsen the problem. Therefore, it is RECOMMENDED that
   applications using this payload format implement congestion control.
   The actual mechanism for congestion control is not specified but
   should be suitable for real-time flows, e.g. "Equation-Based
   Congestion Control for Unicast Applications" [14].


6. Security

   As this format transports encoded speech, the main security issues
   are confidentiality and authentication of the speech itself. Some
   other smaller issues also exist. The payload format itself does not
   have any support for security. These issues have to be solved by a
   payload external mechanism.

6.1. Confidentiality

   To achieve confidentiality of the encoded speech all speech data bits
   must be encrypted. There is less need to encrypt the payload header
   or the frame header as they only carry information about the
   requested AMR mode, AMR frame type and frame quality. This
   information could be useful to some third party, e.g. for quality
   monitoring. The type of encryption used can have an impact not only
   on confidentiality but also on error robustness. There will be no
   robustness against bit errors unless an encryption method without
   error propagation is used, e.g. a stream cipher. This is only an
   issue when using UEP/UED, where bit errors can be accepted in some
   part of the payload.

6.2. Authentication

   To authenticate the sender of the speech an external mechanism has
   to be added. It is recommended that such a mechanism protects all the
   speech data bits. To prevent a man in the middle from tampering with
   the packetization of the speech data, some extra data could be
   protected: the RTP timestamp, the RTP sequence number and the RTP
   marker bit. Tampering could result in erroneous
   depacketization/decoding that could lower speech quality. Tampering
   with the AMR mode request field can cause the requesting side to
   receive speech at a different quality than desired.


7. Examples

7.1. Simple example

   In the simple example we just send one frame in each RTP packet, no
   valid Codec Mode Request CMR is sent (R=0), the payload was not
   damaged at IP origin (Q=1) and no CRC is used. The AMR mode is the
   5.9 kbps mode (FT=2). The speech encoded bits are put into f(0) to
   f(117) in descending sensitivity order according to [2]. Simple
   payload sorting is used, S=0.

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=0  |  C=0  |  R=0  |   0   |   0   |   0   |  F=0  |  Q=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   0   | f(0)  | f(1)  | f(2)  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   16 | f(116)| f(117)|   0   |   0   |   0   |   0   |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 8: One frame per packet example.

7.2. Example with CRCs

   In this example two frames in the 6.7 kbps mode (FT=3) are sent in
   the payload. A mode request is sent (R=1), requesting the 10.2 kbps
   mode for the other link (CMR=6). CRC is used (C=1). Frame 1 (134
   bits) is f1(0..133) and frame 2 is f2(0..133). For each payload frame
   a CRC is calculated: p1(0..7) for frame 1 and p2(0..7) for frame 2.
   Simple payload sorting is used, S=0.

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=0  |  C=1  |  R=1  |   1   |   1   |   0   |  F=1  |  Q=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   1   |  F=0  |  Q=1  |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    2 |   1   |   1   | p1(0) | p1(1) | p1(2) | p1(3) | p1(4) | p1(5) |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    3 | p1(6) | p1(7) | p2(0) | p2(1) | p2(2) | p2(3) | p2(4) | p2(5) |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    4 | p2(6) | p2(7) | f1(0) | f1(1) |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   20 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |f1(132)|f1(133)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   21 | f2(0) | f2(1) |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   37 |  ...  |  ...  |  ...  |f2(131)|f2(132)|f2(133)|   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 9: Example with CRCs.


7.3. Example with multiple frames per payload and robust sorting

   In this example two 5.9 kbps mode (FT=2) frames are sent in one
   payload. No CRC is used (C=0). A mode request is sent (R=1),
   requesting the 7.95 kbps mode for the other link (CMR=5). The first
   frame is represented by the 118 bits f(0) to f(117) and the
   subsequent frame by g(0) to g(117). Robust sorting is used.

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=1  |  C=0  |  R=1  |   1   |   0   |   1   |  F=1  |  Q=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   0   |  F=0  |  Q=1  |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    2 |   1   |   0   | f(0)  | g(0)  | f(1)  | g(1)  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 |  ...  |  ...  | f(116)| g(116)| f(117)| g(117)|   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 10: Example two frames per payload and robust sorting.


8. The AMR MIME type registration

   This section defines the MIME type for the Adaptive Multi-Rate (AMR)
   speech codec [1]. The data format and parameters are specified for
   both real-time transport and for storage type applications (e.g.
   e-mail attachment, multimedia messaging). The former is referred to
   as RTP mode and the latter as storage mode.

   AMR implementations according to [1] MUST support all eight coding
   modes. The mode change can occur at any time during operation and
   therefore the mode information is transmitted in-band together with
   speech bits to allow mode change without any additional signaling.

   In addition to the speech codec, AMR specifications also include
   Discontinuous Transmission / comfort noise (DTX/CN) functionality
   [11]. The DTX/CN switches the transmission off during silent parts of
   the speech and only CN parameter updates are sent at regular
   intervals.


8.1. RTP mode

   The decoder may want to receive a certain AMR mode or a subset of
   AMR modes, due to link limitations in some cellular systems, e.g.
   the GSM radio link can only use a subset of at most four modes.
   Therefore, it is possible to request a specific set of AMR modes in
   the capability description and the encoder MUST abide by this
   request. If no mode set is requested, any mode may be used or
   requested.

   The AMR codec can in principle perform a mode change at any time
   between any two modes. To support interoperability with GSM through a
   gateway it is possible to set limitations for mode changes. The
   decoder can define the minimum number of frames between mode changes
   and can limit mode changes to neighboring modes only.

   It is also possible to limit the number of AMR frames encapsulated
   into one RTP packet. This is an optional feature and if no parameter
   is given in capability description, the transmitter can encapsulate
   any number of AMR speech frames into one RTP packet.

   The payload CRC UED can only be used if the receiver has signaled
   support for this functionality in the capability description.

   To support unequal error protection and/or detection the payload
   format supports robust payload sorting. The robust payload sorting is
   an optional feature and can only be used if the receiver has signaled
   support for this functionality in the capability description.


8.2. Storage mode

   The AMR storage mode is used for storing AMR frames, e.g. as a file
   or e-mail attachment. Frames are stored in consecutive order in an
   octet aligned manner. This implies that the first octet after the
   last octet of frame n must be the first octet of frame n+1. Each
   stored AMR frame consists of a Q bit and the 4-bit FT field (see
   definition in section 3.2), followed by the AMR encoded speech bits
   (see section 3.3). The last octet of each frame is padded with
   zeroes, if needed, to achieve octet alignment. An example is given in
   Figure 11.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Q|  FT   |                                                     |
   +-+-+-+-+-+                                                     +
   |                                                               |
   +                AMR speech bits for frame n                    +
   |                                                               |
   +                                                     +-+-+-+-+-+
   |                                                     | Padding |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Q|  FT   |                                                     |
   +-+-+-+-+-+                                                     +
   |                                                               |
   +                AMR speech bits for frame n+1                  +
   |                                                               |
   +                                                     +-+-+-+-+-+
   |                                                     | Padding |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 11: An example of storage format with two AMR 5.9 kbit/s
   frames (118 speech bits). Note that bits marked as 'padding' must be
   set to zero.

   Frames lost in transmission and frames not received between SID
   updates during a non-speech period must be stored as NO_TRANSMISSION
   frames (frame type 15, see definition in [2]) to keep synchronization
   with the original media.
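
   Because every frame position is present in the stored data, a reader
   can walk the octet stream frame by frame and recover timing from the
   frame count alone. The sketch below assumes the same bit order as
   figure 11 (Q in the most significant bit of the first octet) and a
   table of speech bits per frame type recalled from [2]; the function
   name is hypothetical, and frame types not listed in the table are
   simply rejected.

```python
# Speech bits per frame type, assumed from [2]: modes 0..7, AMR SID
# (type 8) and NO_TRANSMISSION (type 15).
SPEECH_BITS = {0: 95, 1: 103, 2: 118, 3: 134, 4: 148,
               5: 159, 6: 204, 7: 244, 8: 39, 15: 0}


def split_stored_frames(data):
    """Split a storage-mode octet string into (q, ft, octets) tuples."""
    frames = []
    pos = 0
    while pos < len(data):
        q = data[pos] >> 7             # first bit of the frame
        ft = (data[pos] >> 3) & 0x0F   # next four bits
        if ft not in SPEECH_BITS:
            raise ValueError("unsupported frame type %d" % ft)
        # Q + FT + speech bits, rounded up to whole octets.
        size = (5 + SPEECH_BITS[ft] + 7) // 8
        if pos + size > len(data):
            raise ValueError("truncated frame at octet %d" % pos)
        frames.append((q, ft, data[pos:pos + size]))
        pos += size
    return frames
```

   Since each AMR frame covers 20 ms of speech, the frame at index n in
   the returned list corresponds to media time n * 20 ms, including any
   NO_TRANSMISSION placeholders.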

   The receiving entity (AMR decoder) MUST be able to decode all eight
   coding modes as well as the AMR DTX/CN [6]. Since no exchange of
   particular coding considerations can be signaled before downloading
   or receiving stored AMR data, the optional features (robust sorting,
   CRC) specified for RTP mode MUST NOT be used with storage mode.


8.3. MIME Registration

   The MIME name for the AMR codec is allocated from the IETF tree,
   since AMR is expected to be a widely used speech codec in VoIP
   applications. Some parts of this chapter distinguish between RTP and
   storage modes.

   Media Type name:     audio

   Media subtype name:  AMR

   Required parameters: none

   Optional parameters for RTP mode:
    mode-set:  Requested AMR mode set. Restricts the active codec mode
               set to a subset of all modes. Possible values are a
               comma-separated list of modes from the set 0,...,7 (see
               Table 1a in [2]; an example is given in section 8.4). If
               not present, all speech modes are available.
    mode-change-period: Defines a number N which restricts mode changes
               so that they are only allowed at multiples of N frames.
               The initial state of the phase is arbitrary. If this
               parameter is not present, a mode change can happen at any
               time.
    mode-change-neighbor: If present, mode changes SHALL only be made to
               neighboring modes in the active codec mode set. If not
               present, change between any two modes in the active codec
               mode set is allowed.
    maxframes: Maximum number of AMR speech frames in one RTP packet.
               The receiver may set this parameter in order to limit
               the buffering requirements or delay.
    crc:       If present, transmission of CRCs in the payload is
               supported, otherwise not supported.
    robust-sorting: If present, robust payload sorting is supported,
               otherwise not supported and simple payload sorting SHALL
               be used.

   Optional parameters for storage mode:     none

   Encoding considerations for RTP mode: See section 3 in this document.

   Encoding considerations for storage mode: See section 8.2 in this
   document.

   Security considerations: see chapter 6 "Security".

   Public specification: please refer to chapter 9 "References".

   Additional information for storage mode:
     Magic number: none
     File extensions: amr, AMR
     Macintosh file type code: none
     Object identifier or OID: none

   Person & email address to contact for further information:
     johan.sjoberg@ericsson.com
     ari.lakaniemi@nokia.com
     Bernhard.Wimmer@mch.siemens.de

   Intended usage: COMMON. It is expected that many VoIP applications
   (as well as mobile applications) will use this type.



   Author/Change controller:
     johan.sjoberg@ericsson.com
     ari.lakaniemi@nokia.com
     Bernhard.Wimmer@mch.siemens.de


8.4. Mapping to SDP Parameters

   Please note that this chapter applies to the RTP mode only.

   The optional MIME parameters are mapped to SDP [12] in the usual
   way, carried in the a=fmtp attribute for the payload type.
   Example usage in SDP:
    m=audio 49120 RTP/AVP 97
    a=rtpmap:97 AMR/8000
    a=fmtp:97 mode-set=0,2,5,7; maxframes=1
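
   A receiver of such a description might extract the parameters as in
   the sketch below. The function name is hypothetical. How the
   valueless parameters (mode-change-neighbor, crc, robust-sorting)
   appear in an a=fmtp line is not shown in the example above, so
   treating a bare name as "present" is an assumption of this sketch.

```python
def parse_amr_fmtp(line):
    """Parse an 'a=fmtp:<pt> <params>' line into (payload_type, dict)."""
    prefix, _, rest = line.strip().partition(":")
    if prefix != "a=fmtp":
        raise ValueError("not an a=fmtp line")
    pt_str, _, params_str = rest.strip().partition(" ")
    params = {}
    for item in params_str.split(";"):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        name, value = name.strip(), value.strip()
        if name == "mode-set":
            # Comma-separated list of mode indices 0..7.
            params[name] = [int(m) for m in value.split(",")]
        elif name in ("mode-change-period", "maxframes"):
            params[name] = int(value)
        else:
            # Assumed: a bare name signals that the feature is present.
            params[name] = value if sep else True
    return int(pt_str), params
```

   Applied to the a=fmtp line of the example, this yields payload type
   97 with the mode set 0, 2, 5, 7 and maxframes equal to 1.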


9.   References

   [1]  3G TS 26.090, "Adaptive Multi-Rate (AMR) speech transcoding".

   [2]  3G TS 26.101, "AMR Speech Codec Frame Structure".

   [3]  IETF RFC 2119, "Key words for use in RFCs to Indicate
        Requirement Levels".

   [4]  3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
        operation".

   [5]  GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".

   [6]  TIA/EIA-136-Rev.A, part 410, "TDMA Cellular/PCS - Radio
        Interface, Enhanced Full Rate Voice Codec (ACELP)", formerly
        IS-641, TIA published standard, 1998.

   [7]  ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
        System RCR Standard".

   [8]  IETF RFC1889, "RTP: A Transport Protocol for Real-Time
        Applications".

   [9]  IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
        over Cellular Access Networks".

   [10] IETF draft-larzon-udplite-03.txt, "The UDP Lite Protocol".

   [11] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
        speech traffic channels".

   [12] M. Handley and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [13] 3G TS 25.415, "UTRAN Iu Interface User Plane Protocols".

   [14] S. Floyd, M. Handley, J. Padhye, J. Widmer, "Equation-Based
        Congestion Control for Unicast Applications", ACM SIGCOMM 2000,
        Stockholm, Sweden.

   [15] IETF draft-ietf-avt-ulp-00.txt, "An RTP Payload Format for
        Generic FEC with Uneven Level Protection".


10. Authors' addresses

   Johan Sjoberg                  Tel:   +46 8 50878230
   Ericsson Research              EMail: Johan.Sjoberg@ericsson.com
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN

   Magnus Westerlund              Tel:   +46 8 4048287
   Ericsson Research              EMail: Magnus.Westerlund@ericsson.com
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN

   Ari Lakaniemi                  Tel:   +358 40 5276440
   Nokia Research Center          EMail: ari.lakaniemi@nokia.com
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland

   Petri Koskelainen
   Nokia Research Center          EMail: petri.koskelainen@nokia.com
   P.O.Box 100
   FIN-33721 Tampere
   Finland


   Tim Fingscheidt                Tel:   +49 89 722 57658
   Siemens AG, ICP CD             Fax:   +49 89 722 46489
   Grillparzerstrasse 10-18       EMail: Tim.Fingscheidt@mch.siemens.de
   D - 81675 Munich
   Germany

   Bernhard Wimmer                Tel:   +49 89 722 23247
   Siemens AG, ICP CD             Fax:   +49 89 722 46489
   Grillparzerstrasse 10-18       EMail: Bernhard.Wimmer@mch.siemens.de
   D - 81675 Munich
   Germany

   Qiaobing Xie                   Tel:   +1-847-632-3028
   Motorola, Inc.                 EMail: qxie1@email.mot.com
   1501 W. Shure Drive, #2309
   Arlington Heights, IL 60004
   USA

   Sanjay Gupta                   Tel:   +1-847-435-0306
   Motorola, Inc.                 EMail: QA4496@email.mot.com
   1501 W. Shure Drive, #3205
   Arlington Heights, IL 60004
   USA



   This Internet-Draft expires June 22, 2001.