Internet Draft                                            Ari Lakaniemi
Document: draft-lakaniemi-avt-rtp-amr-00.txt          Petri Koskelainen
March 10, 2000                                                    Nokia
Expires September 2000



                       RTP Payload Format for AMR


Status of this Memo


   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts. Internet-Drafts are draft documents valid for a maximum of
   six months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


1. Abstract

   This document specifies the encapsulation of AMR (Adaptive
   Multi-Rate) speech codec frames into the payload of the Real-time
   Transport Protocol (RTP). The format enables encapsulation of one
   or several AMR speech frames into one RTP packet. Mode adaptation
   and discontinuous transmission (DTX) are also supported.


2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119 [1].


3. Introduction

   This document specifies the encapsulation of Adaptive Multi-Rate
   (AMR) speech codec [3] frames into the payload of the Real-Time
   Transport Protocol (RTP) [2]. The AMR codec is a speech codec
   developed by the European Telecommunications Standards Institute

Lakaniemi/Koskelainen                                         [page 1]


RTP Payload Format for AMR                              March 10, 2000


   (ETSI). The AMR codec is standardized for GSM and has also been
   chosen by the Third Generation Partnership Project (3GPP) as the
   mandatory speech codec for third generation systems. AMR provides
   high speech quality under a wide range of transmission conditions
   and is also well suited to applications other than mobile ones.

   AMR includes eight different speech coding modes, whose bit-rates
   range from 4.75 to 12.2 kbit/s. The sampling rate is 8000 Hz and
   processing is performed on 20 ms frames. Some of the AMR speech
   coding modes correspond to speech codecs specified in other
   standards: the 6.7 kbit/s mode to the ACELP codec specified in
   section 5.4 of [4] (PDC-EFR), the 7.4 kbit/s mode to the IS-641
   codec in TDMA [5], and the 12.2 kbit/s mode to GSM EFR [6].

   An AMR implementation conforming to [3] must support all eight
   coding modes. The mode can change at any time during operation;
   therefore the mode information is transmitted as part of each AMR
   frame, which allows mode changes without any additional signaling.

   The decoder may want to receive a particular AMR mode, e.g. for
   capacity or quality reasons. This can be signaled to the other
   end-point by including a mode request in a transmitted packet.

   In addition to the speech codec, the AMR specifications also include
   Discontinuous Transmission / Comfort Noise (DTX/CN) functionality
   [7]. DTX/CN switches the transmission off during silent parts of
   the speech, and only CN parameter updates are sent at regular
   intervals. The three codec standards that are part of AMR also
   have their own DTX/CN schemes ([4][5][8]). To enable
   interoperability with terminals supporting these standards, AMR can
   optionally support these additional CN schemes.


4. Payload format

   The RTP payload format for the AMR codec consists of a
   variable-length payload header followed by one or more AMR payload
   frames. In most cases the payload data does not end on an octet
   boundary. In these cases the unused bits in the last octet of the
   payload are set to 0.

4.1.    AMR payload header

   The length of the AMR payload header is either 1 or 5 bits. The
   header bits are defined as follows:

   R (1 bit): Indicates the presence of the Mode Request field.

   MR (4 bits): Optional field that is present only if R=1. If
   present, this field is used to request a specific AMR mode from the
   other end-point of the communication. The frame type indices from 0
   to 7 (see Table 1) can be used in a mode request.



   +-+
   |R|
   +-+
   Figure 1: Payload header with R=0

   +-+-+-+-+-+
   |R|   MR  |
   +-+-+-+-+-+
   Figure 2: Payload header with R=1. Bits are stored into MR field
   from LSB to MSB.

   Note that in multicast it is possible to receive conflicting mode
   requests from different receivers. Therefore, in multicast the
   sender can choose to ignore mode requests.
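
   As a non-normative illustration of Figures 1 and 2, the Python
   sketch below lists the header bits in storage order, i.e. R first
   and then the MR bits from LSB to MSB. The function name
   payload_header_bits is hypothetical and not part of this
   specification.

```python
def payload_header_bits(mode_request=None):
    """Return the AMR payload header as a list of bits in storage order.

    mode_request is None (R=0, 1-bit header) or a frame type index
    0..7 from Table 1 (R=1, 5-bit header).
    """
    if mode_request is None:
        return [0]                      # R=0, no MR field
    if not 0 <= mode_request <= 7:
        raise ValueError("mode request must be a frame type index 0..7")
    # R=1 followed by the four MR bits, LSB first.
    return [1] + [(mode_request >> i) & 1 for i in range(4)]
```

   For example, payload_header_bits(5) gives [1, 1, 0, 1, 0]: R=1
   followed by MR=5 (binary 0101) stored LSB first.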

4.2.    AMR payload frame

   An AMR payload frame has a variable size and consists of a 4-bit
   frame type field followed by the AMR speech or CN bits. Note that
   the AMR payload frame format is exactly the same as the AMR
   Interface Format 2 (AMR IF2) defined in Annex A of [9]. The AMR
   payload frame is defined as follows:

   FT (4 bits): Indicates the mode of the AMR payload frame. The
   mapping of FT values to AMR frame types is shown in Table 1.

   SP (N bits): The speech/CN bits. The number of speech/CN bits
   depends on the frame type and is shown for each AMR frame type in
   Table 1. The bit order for all frame types is defined in [9].

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- /// -+-+
   |   FT  |   Speech/CN bits            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- /// -+-+
   Figure 3: AMR payload frame. Bits are stored into FT field from LSB
   to MSB.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |R|   MR  |  FT1  |                                             |
   +-+-+-+-+-+-+-+-+-+                                             +
   |                     SP1 (103 bits)                            |
   +                                                               +
   |                                                               |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |  FT2  |                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                       +
   |                                                               |
   +                     SP2 (95 bits)                             +
   |                                                               |
   +                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Lakaniemi/Koskelainen                                         [page 3]


RTP Payload Format for AMR                              March 10, 2000


   Figure 4: An RTP payload for AMR with mode request (R=1) and two AMR
   payload frames (a 5.15 kbit/s frame followed by a 4.75 kbit/s frame).

   frame type                   speech
   index         mode            bits
   -------------------------------------
     0        AMR 4.75           95
     1        AMR 5.15          103
     2        AMR 5.9           118
     3        AMR 6.7           134
     4        AMR 7.4           148
     5        AMR 7.95          159
     6        AMR 10.2          204
     7        AMR 12.2          244
     8        AMR CN             39
     9        GSM EFR CN         43
    10        IS-641 CN          38
    11        PDC-EFR CN         37
    12 - 14   For future use      -
    15        No transmission     0

    Table 1: Definition of AMR frame types
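
   The following Python sketch is a non-normative illustration of how
   Table 1 determines the size of a payload. The bit counts are copied
   from Table 1; the names SPEECH_BITS, MODE_BIT_RATES and payload_size
   are hypothetical and not part of this specification.

```python
# Speech/CN bits per frame type index, copied from Table 1.
# Indices 12..14 are reserved for future use and deliberately absent.
SPEECH_BITS = {0: 95, 1: 103, 2: 118, 3: 134, 4: 148, 5: 159,
               6: 204, 7: 244, 8: 39, 9: 43, 10: 38, 11: 37, 15: 0}

# Bit-rates of the eight speech modes in bit/s (frame types 0..7).
MODE_BIT_RATES = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]

FT_BITS = 4          # width of the frame type field
FRAME_MS = 20        # duration of one AMR frame


def payload_size(frame_types, mode_request=False):
    """Return (bits, octets) for a payload carrying the given frames."""
    bits = 5 if mode_request else 1          # payload header: R or R+MR
    for ft in frame_types:
        if ft not in SPEECH_BITS:
            raise ValueError("frame type %d is reserved for future use" % ft)
        bits += FT_BITS + SPEECH_BITS[ft]
    octets = (bits + 7) // 8                 # round up to whole octets
    return bits, octets
```

   For the payload of Figure 4, payload_size([1, 0], mode_request=True)
   returns (211, 27). For frame types 0 to 7 the speech bit count
   equals the mode bit-rate multiplied by the 20 ms frame duration,
   e.g. 4750 bit/s * 0.020 s = 95 bits.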


5. Payload octet structure

   The AMR payload is stored into octets starting from the LSB of the
   first octet and filling each octet from LSB to MSB. Any unused bits
   at the MSB end of the last octet of the payload are set to 0. The
   octet structure is constructed as defined by the C-like pseudocode
   below. Note that in the pseudocode the LSB is bit 0 and the MSB is
   bit 7.

   Nh      - number of header bits in the payload
   Nf      - number of payload frames
   N(j)    - number of payload frame bits in frame j
   h(j)    - bit j of the payload header
   f(n,k)  - bit k of the payload frame n
   b(n,k)  - bit k in payload octet n
   UB      - denotes unused bit (set to value 0)

   /* header bits always fit into the first octet (Nh is 1 or 5) */
   for (j = 0; j < Nh; j++)
     b(0,j) = h(j);

   /* c counts the payload bits stored so far */
   c = Nh;
   for (j = 0; j < Nf; j++)
   {
     for (i = 0; i < N(j); i++)
     {
       n = c / 8;
       k = c % 8;
       b(n,k) = f(j,i);
       c++;
     }
   }

   /* pad the last octet, if it is only partly filled */
   while (c % 8 != 0)
   {
     n = c / 8;
     k = c % 8;
     b(n,k) = UB;
     c++;
   }
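
   The pseudocode above can be sketched in Python as follows. This is
   a non-normative illustration; the function name pack_payload and the
   representation of bits as lists of 0/1 values are assumptions of
   this sketch.

```python
def pack_payload(header_bits, frames):
    """Pack header bits and payload frames into octets, LSB first.

    header_bits: list of 0/1 values, h(0) first.
    frames: list of frames, each a list of 0/1 values holding the FT
            bits followed by the speech/CN bits, f(j,0) first.
    """
    bits = list(header_bits)
    for frame in frames:
        bits.extend(frame)
    # Allocate whole octets; bits never written stay 0 (the UB padding).
    octets = bytearray((len(bits) + 7) // 8)
    for c, bit in enumerate(bits):
        if bit:
            octets[c // 8] |= 1 << (c % 8)   # b(n,k), n = c / 8, k = c % 8
    return bytes(octets)
```

   With the header bits 1,1,0,1,0 (R=1, MR=5) followed by a frame whose
   FT field is 1, the first octet becomes 0x2B (binary 00101011).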

   Example: The payload illustrated in Figure 4 is encapsulated into
   octets as follows. A mode request is used (R=1) to request the
   7.95 kbit/s mode (MR=5). It is followed by an AMR 5.15 kbit/s
   payload frame (FT=1 + 103 speech bits, denoted as c(n)) and an AMR
   4.75 kbit/s payload frame (FT=0 + 95 speech bits, denoted as d(n)).

   Oct.|  MSB  |                Octet structure                |   LSB
   ----+-------+-----------------------------------------------+-------
     0 |   0       0       1       0       1       0       1       1
   ----+-------+-----------------------------------------------+-------
     1 |   ...     ...     ...     ...    c(2)    c(1)    c(0)     0
   ----+-------+-----------------------------------------------+-------
    13 | c(102)| c(101)  c(100)    ...     ...     ...     ...    ...
   ----+-------+-----------------------------------------------+-------
    14 |  d(3)    d(2)    d(1)    d(0)      0       0       0       0
   ----+-------+-----------------------------------------------+-------
    15 |   ...     ...     ...     ...     ...    d(6)    d(5)    d(4)
   ----+-------+-----------------------------------------------+-------
    25 | d(91)   d(90)   d(89)    ...     ...     ...     ...     ...
   ----+-------+-----------------------------------------------+-------
    26 |   UB      UB      UB      UB      UB     d(94)   d(93)   d(92)
   ----+-------+-----------------------------------------------+-------
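
   The reverse operation can be sketched in Python as shown below.
   This document does not describe receiver processing, so the sketch
   rests on assumptions: the function name unpack_payload is
   hypothetical, and a trailing all-zero FT field that cannot be
   followed by a complete frame is assumed to be padding. The bit
   counts are copied from Table 1.

```python
SPEECH_BITS = {0: 95, 1: 103, 2: 118, 3: 134, 4: 148, 5: 159,
               6: 204, 7: 244, 8: 39, 9: 43, 10: 38, 11: 37, 15: 0}


def unpack_payload(octets):
    """Return (mode_request, frames) parsed from an AMR RTP payload.

    mode_request is None when R=0. frames is a list of
    (frame_type, speech_bits) tuples, speech_bits being a list of 0/1.
    """
    if not octets:
        raise ValueError("empty payload")
    # Flatten the octets into a bit list, LSB of each octet first.
    bits = [(octet >> k) & 1 for octet in octets for k in range(8)]
    pos = 0

    def read_uint(width):
        # Read a field whose bits are stored from LSB to MSB.
        nonlocal pos
        value = sum(bits[pos + i] << i for i in range(width))
        pos += width
        return value

    r = read_uint(1)
    mode_request = read_uint(4) if r else None
    frames = []
    while len(bits) - pos >= 4:
        ft = read_uint(4)
        size = SPEECH_BITS.get(ft)
        if size is None:
            raise ValueError("reserved frame type %d" % ft)
        if size > len(bits) - pos:
            # Assumption: only zero padding may remain at this point.
            if ft != 0 or any(bits[pos:]):
                raise ValueError("truncated frame")
            break
        frames.append((ft, bits[pos:pos + size]))
        pos += size
    return mode_request, frames
```

   Applied to the 27 octets of the example above with all speech bits
   set to 0 (octet 0 equal to 0x2B, all other octets 0), the sketch
   returns mode request 5 and two frames with frame types 1 and 0.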


6. RTP header usage

   The timestamp of the RTP header must indicate the sampling time of
   the first sample of the first frame in the packet. The time is
   expressed in samples: with a frame length of 20 ms and a sampling
   rate of 8 kHz, the timestamp advances by 160 (samples) for each
   frame. All frames in a packet must be successive 20 ms frames,
   stored in the order they are generated by the encoder.

   The encoder shall set the marker bit (M) of the RTP header to value
   1 for packets containing the first active speech frame after a
   non-speech period. For all other packets the marker bit is set to
   0.
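
   The timestamp and marker bit rules can be sketched in Python as
   follows. This is a non-normative illustration under stated
   assumptions: the function name rtp_header_fields is hypothetical,
   every 20 ms frame is listed in the input and carried in a packet,
   and the start of the stream is treated as if it were preceded by a
   non-speech period.

```python
SAMPLES_PER_FRAME = 160     # 20 ms at 8000 Hz


def rtp_header_fields(first_timestamp, frames_per_packet, speech_flags):
    """Yield (timestamp, marker) for each packet of a frame sequence.

    speech_flags[i] is True when frame i is an active speech frame and
    False when it is a non-speech frame.
    """
    previous_was_speech = False   # assumption: stream starts after silence
    for start in range(0, len(speech_flags), frames_per_packet):
        marker = 0
        for is_speech in speech_flags[start:start + frames_per_packet]:
            if is_speech and not previous_was_speech:
                marker = 1        # first active speech frame after silence
            previous_was_speech = is_speech
        # The RTP timestamp is a 32-bit field, so it wraps around.
        timestamp = (first_timestamp + start * SAMPLES_PER_FRAME) % 2**32
        yield timestamp, marker
```

   With two frames per packet, the timestamp advances by 320 from one
   packet to the next.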


7. References

   [1] S. Bradner: "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119

   [2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson: "RTP:
       A transport protocol for real-time applications", IETF
       Audio/Video Transport Working Group, RFC 1889

   [3] GSM 06.90: Adaptive Multi-Rate (AMR) speech transcoding

   [4] RCR STD-27H, Personal Digital Cellular Telecommunication System
       RCR Standard

   [5] TIA/EIA-136-Rev.A, part 410 - TDMA Cellular/PCS - Radio
       Interface, Enhanced Full Rate Voice Codec (ACELP). Formerly
       IS-641. TIA published standard, 1998

   [6] GSM 06.60: Enhanced Full Rate (EFR) speech transcoding

   [7] GSM 06.92: Comfort noise aspects for Adaptive Multi-Rate (AMR)
       speech traffic channels

   [8] GSM 06.62: Comfort noise aspects for Enhanced Full Rate (EFR)
       speech traffic channels

   [9] 3G TS 26.101: AMR Speech Codec Frame Structure, General
       description


8. Authors' Addresses

   Ari Lakaniemi
   Nokia Research Center
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland
   Email: ari.lakaniemi@nokia.com

   Petri Koskelainen
   Nokia Research Center
   P.O.Box 100
   FIN-33721 Tampere
   Finland
   Email: petri.koskelainen@nokia.com


   This internet-draft expires in September 2000.