Internet Engineering Task Force                  Johan Sjoberg, Ericsson
Audio Video Transport WG                      Magnus Westerlund, Ericsson
INTERNET-DRAFT                                      Ari Lakaniemi, Nokia
July 14, 2000                                   Petri Koskelainen, Nokia
Expires: January 14, 2001


                       RTP payload format for AMR
                   <draft-sjoberg-avt-rtp-amr-01.txt>


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/lid-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is an individual submission to the IETF. Comments
   should be directed to the authors.

Abstract

   This document describes a proposed Real-time Transport Protocol
   (RTP) [8] payload format for signals encoded with the AMR speech
   codec [1]. The AMR payload format is designed to interoperate with
   existing AMR transport formats. This document also includes a MIME
   type registration for AMR. The MIME type is specified for both
   real-time transport and storage.

INTERNET-DRAFT         RTP Payload Format for AMR          July 14, 2000

1. Introduction

   The adaptive multi-rate (AMR) speech codec was developed by the
   European Telecommunications Standards Institute (ETSI). The AMR
   codec is standardized for GSM and has also been chosen by 3GPP as
   the mandatory codec for third generation systems. It is currently
   under standardization for TDMA. The AMR codec will therefore be
   widely used in cellular systems.

   The AMR codec was developed to preserve high speech quality under a
   wide range of transmission conditions. It is a multi-mode codec with
   8 narrow band modes with bit rates between 4.75 and 12.2 kbps. The
   sampling frequency is 8000 Hz and processing is done on 20 ms
   frames, i.e. 160 samples per frame. The AMR modes are closely
   related to each other and use the same coding framework. Three of
   the AMR modes have already been adopted as standards of their own:
   the 6.7 kbps mode as PDC-EFR [7], the 7.4 kbps mode as the IS-641
   codec in TDMA [6], and the 12.2 kbps mode as GSM-EFR [5].

   AMR implementations must support all 8 speech coding modes, and mode
   switching can occur to any mode at any time. The mode information
   must therefore be transmitted together with the speech encoded bits.
   It is possible for the decoder to signal to the encoder the mode it
   prefers to receive, for reasons such as transmission bandwidth or
   quality.

   The AMR codec is designed with a voice activity detector (VAD) and
   generation of comfort noise (CN) parameters during silence periods.
   Hence, the AMR codec can reduce the number of transmitted bits and
   packets during silence periods to a minimum. The operation of
   sending CN parameters at regular intervals during silence periods is
   usually called discontinuous transmission (DTX) or source controlled
   rate (SCR) operation. The three codec standards that are part of AMR
   [5][6][7] also have SCR/CN functionality specified.
   To enable interoperability with terminals supporting these
   standards, AMR can optionally be extended to support these CN
   schemes as well, see [2].

   Due to the flexibility and robustness of AMR, it is also suitable
   for purposes other than circuit switched cellular systems. Other
   suitable applications are real-time services over packet switched
   networks, e.g. over RTP. To be optimized for transmission over
   networks with high packet loss rates, the possibility to use extra
   redundancy is built into the RTP payload format for AMR.

   The speech encoded bits have different perceptual sensitivity to bit
   errors, and cellular systems exploit this by using unequal error
   protection and detection (UEP and UED). This mechanism concentrates
   the correction and detection of corrupted bits on the perceptually
   most sensitive bits. A frame is only regarded as lost or damaged if
   errors are detected in the most sensitive bits. UED can also be
   employed on RTP if UDP lite is used as the transport layer protocol
   (UDP lite [10] is work in progress). To enable this, the bits in the
   payload have to be ordered in sensitivity order. The AMR encoded
   bits are defined in sensitivity order in [2]. If the receiver
   supports the optional retransmission of redundant frames, the
   different sensitivity could also be used for transmitting only the
   most sensitive bits of a redundant frame. The special problems with
   IP real-time traffic over cellular access networks are further
   discussed in [9].

   Other AMR scenarios are possible, e.g. one end is a circuit switched
   GSM terminal that is connected through a gateway to an IP network,
   with an IP terminal at the other end. To improve quality, frames
   damaged by the GSM radio link should also be transmitted to the
   decoder in the IP network. To make this possible, frame quality
   information has to be transmitted over the IP network. The quality
   bit is also needed for the AMR RTP payload format to interwork with,
   for example, the ATM AAL2 AMR profile.

2. Requirements

   The AMR payload format for RTP was designed to meet the following
   requirements:

   o  Different levels of robustness must be supported, from no
      redundant data to extreme robustness capable of handling very
      high packet loss rates with no or small speech quality
      degradation.

   o  Fast, frame-wise AMR mode adaptation must be supported. This
      means that it must be possible to send Codec Mode Requests back
      from the receiving side to the transmitting side with information
      on the preferred mode. Slower AMR mode adaptation may also be
      accomplished with external signaling.

   o  Source controlled rate operation (SCR) and comfort noise
      parameter (CN) transmission defined in AMR must be supported.

3. Payload format

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [3].

   The AMR payload format is designed to be flexible, ranging from very
   low overhead to an extended format with the possibility to send
   redundancy information and several speech frames in one packet.

   The payload format consists of a payload header and zero or more
   payload frames. Neither the payload header nor the payload frames
   are octet aligned on their own, but the full payload is. If the
   option to transmit redundant information is enabled and employed,
   the full payload SHALL finally be ordered in descending bit error
   sensitivity order to be prepared for unequal error protection or
   unequal error detection schemes, e.g. UDP lite. The AMR encoded bit
   streams are defined in sensitivity order in Annex B of [2]; the
   original order as delivered from the speech encoder is defined in
   [1]. The last octet of an AMR payload packet is padded with zeroes
   at the end if not all bits are used.

   The AMR frame types, or modes, are defined in [2]. Frame type 15, no
   transmission, is needed to indicate frames that were not transmitted
   or were lost. "Not transmitted" can mean either that the speech
   encoder produced no data for this frame or that no data for this
   frame is transmitted in this payload, i.e. valid data for this frame
   could be sent in another payload. This occurs, for example, when
   multiple frames are sent in each payload and comfort noise starts. A
   frame type sequence in a payload with 8 frames, where speech frames
   in AMR mode 7 are interrupted by CN in the fifth frame, could look
   like: {7,7,7,7,8,15,15,8}. The AMR SCR is described in [4].

   The AMR payload format supports robust transmission, multiple frames
   in one payload packet, and the use of fast codec mode adaptation.
   The robust behavior is accomplished by using the optional
   possibility to retransmit previously transmitted frames together
   with the current frame or frames. The redundant frames could be
   transmitted in their entirety or only partly. If only a part of the
   redundant frame is transmitted, the least sensitive bits are
   omitted. A partially transmitted redundant frame SHALL fill the
   number of octets used for that frame. If partial redundancy is used,
   the bits in the payload are sorted in descending sensitivity order
   to support UED, as in UDP lite [10].
   When bits in redundant frames are not transmitted, the bits that
   were not transmitted or received MUST be reconstructed on the
   receiver side. It is RECOMMENDED to produce the non-received bits
   with state of the art error concealment unit (ECU) actions. No
   method that gives worse quality than randomly generated bits SHOULD
   be used. A fixed bit pattern SHOULD be avoided for speech quality
   reasons. Note that the possibility to transmit partial redundant
   frames can be employed only if the receiver has signaled support for
   this in the capability description.

   A frame quality indicator is included for interoperability with the
   ATM payload format described in ITU-T I.366.2 and the UTRAN Iu
   interface [14]. The speech quality is significantly increased if
   damaged frames are forwarded to the speech decoder error concealment
   unit and not dropped.
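
   The reconstruction requirement above can be illustrated with a
   minimal sketch that fills the omitted tail of a partially received
   redundant frame with pseudo-random bits. This is only the baseline
   the text sets as a lower bound, not a state of the art ECU. The
   function names, the one-bit-per-array-element representation, and
   the xorshift32 generator are assumptions made for this illustration
   and are not defined by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: small xorshift32 pseudo-random generator, used
 * so that the sketch has no dependency on the C library's rand(). */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* bits[]   : one AMR frame in sensitivity order, one bit (0 or 1) per
 *            array element; the first `received` entries are valid.
 * total    : number of bits required by the mode indicated by FT.
 * Fills bits[received .. total-1] with pseudo-random bits and returns
 * the number of bits that were generated. */
static size_t fill_missing_bits(uint8_t *bits, size_t received,
                                size_t total, uint32_t *rng_state)
{
    assert(bits != NULL && rng_state != NULL);
    assert(*rng_state != 0);   /* xorshift32 must not be seeded with 0 */
    assert(received <= total);

    for (size_t i = received; i < total; i++) {
        /* Use the top bit of each generated word as the filler bit. */
        bits[i] = (uint8_t)(xorshift32(rng_state) >> 31);
    }
    return total - received;
}
```

   For a 6.7 kbps frame (134 bits) of which only the 92 most sensitive
   bits were received, a call such as fill_missing_bits(bits, 92, 134,
   &seed) would generate the remaining 42 bits before the frame is
   handed to the decoder.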

3.1. The payload header

   The payload header has dynamic length, 3 or 7 bits. The bits in the
   header are specified as follows:

   Q (1 bit): The payload quality bit indicates, if not set, that the
      payload is severely damaged and the receiver should set the
      RX_TYPE, see [4], to SPEECH_BAD or SID_BAD depending on the frame
      type (FT).

   L (1 bit): Indicates the existence of LEN fields in the payload
      frames and that sensitivity sorting is used. Note that this bit
      can be set only if the receiver has signaled support for the
      option to transmit redundant data.

   R (1 bit): Indicates, if set, that the Codec Mode Request (CMR) is
      sent.

   CMR (4 bits): OPTIONAL field, depending on the R bit. Requested
      codec mode for the other communication direction. The mapping of
      existing AMR modes is given in Table 1a in [2].

       0
       0 1 2
      +-+-+-+
      |Q|L|R|
      +-+-+-+

      Figure 1: AMR payload header, R=0

       0
       0 1 2 3 4 5 6
      +-+-+-+-+-+-+-+
      |Q|L|R|  CMR  |
      +-+-+-+-+-+-+-+

      Figure 2: AMR payload header, R=1

3.2. AMR payload frame

   An AMR payload frame represents one encoded speech frame. Each
   payload frame includes several specified fields as follows:

   F (1 bit): Indicates if this frame is followed by further frames.
      F=1 further frames follow, F=0 last frame.

   LEN (5 bits): OPTIONAL field, exists if the payload header bit L is
      set, L=1. LEN specifies the number of octets in the FT field and
      AMR encoded bits field in this frame. If LEN indicates more bits
      than the AMR mode information in the FT field, the implicit
      knowledge of the number of bits for the AMR mode indicated by FT
      is the valid number
      of AMR encoded bits. If LEN indicates fewer bits than given by
      the mode information in the FT field, LEN gives the number of
      encoded bits. If a frame is transmitted only partially, the least
      sensitive bits at the end of the frame are omitted. This use is
      intended for partial redundant data.

   FT (4 bits): Frame type indicator, indicating the AMR speech coding
      mode or comfort noise (CN) mode. The mapping of existing AMR
      modes is given in Table 1a in [2]. If FT=15 (No transmission), no
      LEN or AMR encoded bits follow.

   AMR encoded bits: This is the speech codec encoded data field. The
      length of this field is defined either implicitly by the AMR mode
      in the FT field, or by the LEN field.

   The last payload frame SHALL always contain a full AMR frame, i.e.
   no LEN field is needed.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   LEN   |  FT   |                                           |
   +-+-+-+-+-+-+-+-+-+-+                                           +
   |                                                               |
   +                                                               +
   /                       AMR encoded bits                        /
   +                                                 +-+-+-+-+-+-+-+
   |                                                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 3: Payload frame format, F=1 and L=1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|  FT   |                                                     |
   +-+-+-+-+-+                                                     +
   |                                                               |
   +                                                               +
   /                       AMR encoded bits                        /
   +                                             +-+-+-+-+-+-+-+-+-+
   |                                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 4: Payload frame format, F=0 or L=0

3.3. Payload block sorting

   A bit error in a more sensitive bit is subjectively more annoying
   than in a less sensitive bit. Therefore, to be able to protect the
   most sensitive bits in a payload packet with a forward error
   detection code, e.g. a CRC outside RTP, the bits inside a frame are
   ordered into sensitivity order. If the option to transmit redundant
   data is employed, the full RTP payload MUST be further sorted into
   sensitivity order. The protection MAY then cover an appropriate
   number of octets from the beginning of the payload. How many octets
   depends on the channel and application. This can for example be
   accomplished by UDP lite [10] (work in progress).

   To maintain sensitivity ordering inside the AMR payload when more
   than one speech frame is transmitted in one packet, reordering of
   the data is needed. The reordering is only performed if partial
   redundancy is used, i.e. L=1.

   The reordering to maintain the sensitivity ordered AMR payload SHALL
   be performed on bit level. The AMR payload header SHALL still be
   placed unchanged in the beginning of the payload. Thereafter, the
   payload frames are sorted with one bit alternating from each payload
   frame.

   +-------------+
   | h(0)-h(H-1) |
   +------------------------+
   | f(0,0) - f(0,F(0)-1)   |
   +----------------------------+
   | f(1,0) - f(1,F(1)-1)       |
   +----------------------------+
   | f(2,0) - f(2,F(2)-1) |
   +----------------------+
   \                      \
   +-------------------------------+
   | f(N-1,0) - f(N-1,F(N-1)-1)    |
   +-------------------------------+

   Figure 5: The payload header and N payload frames before sorting.

   The sorting algorithm is described below in C-style code, using the
   following notation:

      b(m)   - bit m of the final RTP payload
      f(n,m) - bit m in payload frame n
      F(n)   - number of bits in payload frame n, defined by FT or by
               LEN
      h(m)   - bit m of the payload header
      H      - number of payload header bits, 3 or 7 bits
      N      - number of payload frames in the payload
      S      - number of unused (padding) bits in the last octet

   Payload frames f(n,m) are ordered in consecutive order, where frame
   n precedes frame n+1. The sorting algorithm is defined in C-style
   as:

      /* Copy the payload header unchanged. */
      for (i = 0; i < H; i++){
          b(i) = h(i);
      }

      /* Take one bit at a time from each payload frame in turn;
         frames that have run out of bits are skipped. */
      Fmax = max(F(0),..,F(N-1));
      k = H;
      for (i = 0; i < Fmax; i++){
          for (j = 0; j < N; j++){
              if (i < F(j)){
                  b(k++) = f(j,i);
              }
          }
      }

      /* Pad the last octet with zeroes if it is not full. */
      S = 8 - k%8;
      if (S < 8){
          for (i = 0; i < S; i++){
              b(k++) = 0;
          }
      }

   Note that if multiple new frames are encapsulated into the payload
   and partial redundant data is not transmitted, payload bit-sorting
   SHALL NOT be performed. Instead, the payload is formed by
   concatenating the payload header and the bits from each AMR frame in
   the payload. The bits inside each frame are still ordered into
   sensitivity order as defined in [2]. In this case the bits are
   stored into the payload according to the C-style algorithm below
   (see the definition of symbols above).

      /* Copy the payload header unchanged. */
      for (i = 0; i < H; i++){
          b(i) = h(i);
      }

      /* Append the payload frames one after another. */
      k = H;
      for (j = 0; j < N; j++){
          for (i = 0; i < F(j); i++){
              b(k++) = f(j,i);
          }
      }

      /* Pad the last octet with zeroes if it is not full. */
      S = 8 - k%8;
      if (S < 8){
          for (i = 0; i < S; i++){
              b(k++) = 0;
          }
      }

4. RTP header usage

   The RTP header marker bit (M) is used to mark (M=1) the packets
   containing the first speech frame after CN. In all other packets the
   marker bit is set to 0 (M=0).

   The timestamp corresponds to the sampling time of the first sample
   encoded for the first encoded speech frame in the packet. The
   timestamp unit is in samples. The duration of one AMR speech frame
   is 20 ms and the sampling frequency is 8 kHz, corresponding to 160
   encoded speech samples per frame. Thus, the timestamp is increased
   by 160 for each consecutive frame. All frames in a packet MUST be
   successive 20 ms frames.

5. Examples

5.1. Simple example

   In the simple example we send just one full frame (L=0) in each RTP
   packet, no Codec Mode Request (CMR) is sent (R=0), and the payload
   was not damaged at IP origin (Q=1). In this example we transmit one
   frame encoded with the 5.9 kbps mode (FT=2). The speech encoded bits
   are put into f(0) to f(117) in descending sensitivity order
   according to [2].

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 | Q=1   | L=0   | R=0   | F=0   |   0   |   0   |   1   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 | f(0)  | f(1)  | f(2)  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  | f(115)| f(116)| f(117)|   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 6: One frame per packet example.

5.2. Example with partial redundancy

   In this example the 6.7 kbps mode (FT=3) is sent with one redundant
   frame, also FT=3. Only a part of the redundant frame is sent, in
   this example 12 octets (L=1, LEN=12). A mode request is sent (R=1),
   requesting the 10.2 kbps mode for the other link (CMR=6). Because
   the 12 octets counted by LEN include the 4-bit FT field, the
   redundant frame carries 92 encoded bits, r(0) to r(91), as shown in
   the figure. The current frame (134 bits) is f(0) to f(133).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 | Q=1   | L=1   | R=1   |   0   |   1   |   1   |   0   | F=1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 | F=0   |   0   |   0   |   1   |   0   |   1   |   1   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    2 |   1   |   0   | f(0)  |   0   | f(1)  |   0   | f(2)  |   1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    3 | f(3)  |   1   | f(4)  | r(0)  | f(5)  | r(1)  | f(6)  | r(2)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    4 | f(7)  | r(3)  | f(8)  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   26 | f(95) | r(91) | f(96) | f(97) | f(98) |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   30 |  ...  |  ...  |  ...  |  ...  |  ...  | f(131)| f(132)| f(133)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 7: Example with partial redundancy.

5.3. Example with multiple frames per payload

   In this example two 5.9 kbps mode (FT=2) frames are sent in one
   packet. No partial redundancy is used (L=0). A mode request is sent
   (R=1), requesting the 7.95 kbps mode for the other link (CMR=5). The
   first frame is represented by the 118 bits f(0) to f(117) and the
   subsequent frame by g(0) to g(117).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 | Q=1   | L=0   | R=1   |   0   |   1   |   0   |   1   | F=1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   0   | f(0)  | f(1)  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  | f(115)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   16 | f(116)| f(117)| F=0   |   0   |   0   |   1   |   0   | g(0)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   17 | g(1)  | g(2)  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 |  ...  |  ...  |  ...  | g(116)| g(117)|   0   |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 8: Example with two frames per payload.

6. The AMR MIME type registration

   This chapter defines the MIME type for the Adaptive Multi-Rate (AMR)
   speech codec [1]. The data format and parameters are specified for
   both real-time transport and for storage type applications (e.g.
   e-mail attachment, multimedia messaging).
   The former is referred to as RTP mode and the latter as storage
   mode.

   AMR implementations according to [1] MUST support all eight coding
   modes. The mode change can occur at any time during operation and
   therefore the mode information is transmitted in-band together with
   the speech bits to allow mode change without any additional
   signaling.

   In addition to the speech codec, the AMR specifications also include
   Discontinuous Transmission / comfort noise (DTX/CN) functionality
   [11]. DTX/CN switches the transmission off during silent parts of
   the speech, and only CN parameter updates are sent at regular
   intervals.

6.1. RTP mode

   The decoder may want to receive a certain AMR mode or a subset of
   AMR modes. Parts of the end-to-end transmission chain may limit the
   number of modes in the active codec set, e.g. the GSM radio link can
   only use a subset of at most four modes. Therefore, it is possible
   to request a specific set of AMR modes in the capability
   description, and it is mandatory for the encoder to abide by this
   request. If no request for a mode set is given, the encoder can
   freely decide which AMR mode to use.

   Although in principle the AMR codec can perform a mode change at any
   time between any two modes, it is possible to set limitations for
   mode changes. The decoder can define the minimum number of frames
   between mode changes and can limit mode changes to neighboring modes
   only.

   In addition to the AMR DTX/CN scheme, the three codec standards that
   are part of AMR also have their own DTX/CN schemes ([6][7][12]). To
   enable interoperability with terminals supporting these standards,
   AMR can optionally be extended to support these CN schemes as well.
   The CN capabilities are signaled in the capability description. If
   no CN capabilities are reported, it is assumed that AMR CN is
   supported. If CN capabilities are reported, all supported CN types
   (including AMR CN) must be signaled.

   It is also possible to limit the number of AMR frames encapsulated
   into one RTP packet. This is an optional feature, and if no
   parameter is given in the capability description, the transmitter
   can encapsulate any number of AMR speech frames into one RTP packet.

   There is also an option to retransmit one or more previously
   transmitted frames to help the receiver recover from packet losses
   in difficult transmission conditions. It is also possible to
   transmit these frames only partially, in such a way that only the
   most sensitive bits are transmitted. Since the transmission of
   partly redundant frames is an optional property, it can be used only
   if the receiver has signaled support for this functionality in the
   capability description. It is RECOMMENDED that partial redundancy be
   implemented and turned on at least for conversational services.

6.2. Storage mode

   For storing AMR frames, e.g. as a file or e-mail attachment, the AMR
   frames must be formatted according to Annex A of [2]. Because no
   exchange of particular coding parameters, e.g. a specific DTX/CN
   mode, can be signaled before downloading or receiving stored AMR
   data, the receiving entity (AMR decoder) MUST be able to decode all
   eight coding modes as well as the AMR DTX/CN [11].

6.3. MIME Registration

   The MIME name for the AMR codec is allocated from the IETF tree
   since AMR is expected to be a widely used speech codec in VoIP
   applications. Some parts of this chapter distinguish between RTP and
   storage modes.

   Media Type name: audio

   Media subtype name: AMR

   Required parameters: none

   Optional parameters for RTP mode:

      ptime: Definition as usual in RTP audio.

      mode-set: Requested AMR mode set. Restricts the active codec mode
         set to a subset of all modes. Possible values are: 0,...,7
         (see Table 1a [2]). If not present, all speech modes are
         available.

      mode-change-period: Defines a number N which restricts the mode
         changes in such a way that mode changes are only allowed on
         multiples of N; the initial state of the phase is arbitrary.
         If this parameter is not present, mode change can happen at
         any time.

      mode-change-neighbor: If present, mode changes SHALL be made to
         neighboring modes only. If not present, change between any two
         modes is allowed.

      amr-cn: If present, GSM AMR DTX/CN is supported. Note that if no
         CN capabilities are reported, AMR DTX/CN is assumed to be
         supported, i.e. this parameter is only sent together with one
         of the following CN parameters.

      pdc-efr-cn: If present, PDC-EFR DTX/CN is supported, otherwise
         not supported.

      is-641-cn: If present, IS-641 DTX/CN is supported, otherwise not
         supported.

      gsm-efr-cn: If present, GSM EFR DTX/CN is supported, otherwise
         not supported.

      maxframes: Maximum number of AMR speech frames in one RTP packet.
         The receiver may set this parameter in order to limit the
         buffering requirements or delay.

      redundancy: If present, transmission of partly redundant frames
         is supported, otherwise not supported.

   Optional parameters for storage mode: none

   Encoding considerations for RTP mode: See section 3 in this
      document.

   Encoding considerations for storage mode: Each audio frame must be
      formatted in octet format according to AMR Interface Format 2
      (AMR IF2) specified in Annex A of [2]. The audio frames must be
      stored in sequential order. This implies that the first octet
      after frame n must be the first octet of frame (n+1).
      Furthermore, missing frames and non-received frames between CN
      updates during non-speech periods must be stored as
      NO_TRANSMISSION frames (frame type 15, see definition in [2]).
      Each receiving entity that accepts this MIME type must be able to
      decode all eight AMR coding modes [1] and the AMR DTX/CN [11].

   Security considerations: none

   Interoperability considerations for RTP mode: If CN capabilities are
      not signaled in the capability description, only AMR CN is
      supported.

   Public specification: please refer to chapter 7 "References".

   Additional information for storage mode:

      Magic number: none
      File extensions: amr, AMR
      Macintosh file type code: none
      Object identifier or OID: none

   Person & email address to contact for further information:

      johan.sjoberg@ericsson.com
      ari.lakaniemi@nokia.com

   Intended usage: COMMON. It is expected that many VoIP applications
      (as well as mobile applications) will use this type.

   Author/Change controller:

      johan.sjoberg@ericsson.com
      ari.lakaniemi@nokia.com

6.4. Mapping to SDP Parameters

   Please note that this chapter applies to the RTP mode only.

   Parameters are mapped to SDP [13] as usual. The rtpmap attribute
   carries the encoding name followed by the clock rate, which for AMR
   is the 8000 Hz sampling frequency. Example usage in SDP:

      m=audio 49120 RTP/AVP 97
      a=rtpmap:97 AMR/8000
      a=fmtp:97 mode-set=0,2,5,7; maxframes=2
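
   As an illustration of how a receiver of such a description might
   interpret the mode-set parameter, the sketch below extracts it from
   the parameter part of an a=fmtp line into a bit mask. The function
   name and return codes are assumptions made for this example. The
   parsing is deliberately simplified: it locates the parameter with a
   plain substring search and does not handle unusual white space.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Parses "mode-set=<m>,<m>,..." out of an fmtp parameter string.
 * On success stores a mask with bit m set for every listed mode.
 * Returns  0 if mode-set was found and parsed,
 *          1 if mode-set is absent (all eight speech modes available),
 *         -1 if the value is malformed (*mask is left untouched). */
static int amr_parse_mode_set(const char *fmtp, uint8_t *mask)
{
    static const char key[] = "mode-set=";
    const char *p;
    uint8_t m = 0;

    assert(fmtp != NULL && mask != NULL);

    p = strstr(fmtp, key);
    if (p == NULL) {
        *mask = 0xFF;           /* parameter absent: no restriction */
        return 1;
    }
    p += sizeof key - 1;        /* skip past "mode-set=" */

    for (;;) {
        if (*p < '0' || *p > '7')
            return -1;          /* each value is a single digit 0..7 */
        m |= (uint8_t)(1u << (*p - '0'));
        p++;
        if (*p != ',')
            break;
        p++;                    /* skip the comma, expect another mode */
    }
    if (*p != '\0' && *p != ';' && *p != ' ')
        return -1;              /* unexpected character after the list */

    *mask = m;
    return 0;
}
```

   For the example above, passing the string "mode-set=0,2,5,7;
   maxframes=2" yields the mask 0xA5, i.e. bits 0, 2, 5 and 7 set.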

7. References

   [1]  GSM 06.90, "Adaptive Multi-Rate (AMR) speech transcoding".

   [2]  3G TS 26.101, "AMR Speech Codec Frame Structure".

   [3]  IETF RFC 2119, "Key words for use in RFCs to Indicate
        Requirement Levels".

   [4]  3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
        operation".

   [5]  GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".

   [6]  TIA/EIA-136-Rev.A, part 410, "TDMA Cellular/PCS - Radio
        Interface, Enhanced Full Rate Voice Codec (ACELP)". Formerly
        IS-641. TIA published standard, 1998.

   [7]  ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
        System RCR Standard".

   [8]  IETF RFC 1889, "RTP: A Transport Protocol for Real-Time
        Applications".

   [9]  IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
        over Cellular Access Networks".

   [10] IETF draft-larzon-udplite-02.txt, "The UDP Lite Protocol".

   [11] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
        speech traffic channels".

   [12] GSM 06.62, "Comfort noise aspects for Enhanced Full Rate (EFR)
        speech traffic channels".

   [13] M. Handley and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [14] 3G TS 25.415, "UTRAN Iu Interface User Plane Protocols".

8. Authors' addresses

   Johan Sjoberg
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Johan.Sjoberg@ericsson.com

   Magnus Westerlund
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Magnus.Westerlund@ericsson.com

   Ari Lakaniemi
   Nokia Research Center
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland
   E-mail: ari.lakaniemi@nokia.com

   Petri Koskelainen
   Nokia Research Center
   P.O.Box 100
   FIN-33721 Tampere
   Finland
   E-mail: petri.koskelainen@nokia.com

   This Internet-Draft expires January 14, 2001.