AVT WG                                                          R. Zopf
Internet Draft                                      Lucent Technologies
Document: draft-ietf-avt-rtp-cn-00.txt                       March 2000
Category: Informational


                     RTP Payload for Comfort Noise


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 [1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts. Internet-Drafts are draft documents valid for a maximum of
   six months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


1. Abstract

   This document describes an RTP [2] payload format for transporting
   comfort noise (CN).  The CN payload type is primarily for use with
   audio codecs that do not support comfort noise as part of the codec
   itself such as ITU-T Recommendations G.711 [3], G.726 [4], G.727
   [5], G.728 [6], and G.722 [7].


2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119 [8].


3. Introduction

   This document describes an RTP payload format for transporting
   comfort noise.  The payload format is based on Appendix II of ITU-T
   Recommendation G.711 [9] which defines a comfort noise payload
   format (or bit-stream) for ITU-T G.711 use in packet-based
   multimedia communication systems.  The payload format is generic and
   may also be used with other audio codecs without built-in

Zopf               Informational - September, 2000                  1

                    RTP Payload for Comfort Noise          March 2000


   Discontinuous Transmission (DTX) capability such as ITU-T
   Recommendations G.726 [4], G.727 [5], G.728 [6], and G.722 [7].  The
   payload format provides a minimum interoperability specification for
   communication of comfort noise parameters.  The comfort noise
   analysis and synthesis as well as the Voice Activity Detection (VAD)
   and DTX algorithms are unspecified and left implementation-specific.
   However, an example solution for G.711 has been tested and is
   described in the Appendix [9].  It uses the VAD and DTX of G.729
   Annex B [10] and a comfort noise generation algorithm (CNG) which is
   provided in the Appendix for information.

   The comfort noise payload consists of a single octet description of
   the noise level and MAY contain spectral information in subsequent
   octets.  An earlier version of the CN payload format consisting only
   of the noise level byte was defined in draft revisions of RFC 1890.
   The extended payload format defined in this document should
   be backward compatible with implementations of the earlier version
   assuming that only the first byte is interpreted and any additional
   spectral information bytes are ignored.


4. CN Payload Definition

   The comfort noise payload consists of a description of the noise
   level and spectral information in the form of reflection
   coefficients. The use of spectral information is optional and the
   all-pole model order is left unspecified.   The encoder can
   determine the appropriate model order based on such considerations
   as quality, complexity, expected environmental noise, and signal
   bandwidth.  The model order is not explicitly transmitted since it
   can be derived from the length of the payload at the receiver. For
   complexity or other reasons, the decoder may reduce the model order
   by setting higher order reflection coefficients to zero.

4.1 Noise Level

   The magnitude of the noise level is packed into the least
   significant bits of the noise-level byte with the most significant
   bit unused and always set to 0 as shown below in Figure 1.  The
   least significant bit of the noise level magnitude is packed into
   the least significant bit of the byte.

   The noise level is expressed in -dBov, with values from 0 to 127
   representing 0 to -127 dBov.  dBov is the level relative to the
   overload of the system. (Note:  Representation relative to the
   overload point of a system is particularly useful for digital
   implementations, since one does not need to know the relative
   calibration of the analog circuitry.) For example, in a 16-bit
   linear PCM system (L16), a signal with 0 dBov represents a square
   wave with the maximum possible amplitude (+/-32767), and -63 dBov
   corresponds to -58 dBm0 in a standard telephone system. (dBm is the
   power level in decibels relative to 1 mW, with an impedance of 600
   Ohms.)


                        0 1 2 3 4 5 6 7
                       +-+-+-+-+-+-+-+-+
                       |0|   level     |
                       +-+-+-+-+-+-+-+-+

                 Figure 1: Noise Level Packing
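
   The level computation and byte packing described above can be
   sketched as follows.  This is a minimal illustration only, assuming
   16-bit linear PCM input and a mean-power measurement; the function
   names and the clamping of digital silence to -127 dBov are
   assumptions of this sketch and are not specified by this document.

```python
import math

FULL_SCALE = 32767  # peak amplitude of a 16-bit linear PCM (L16) sample


def noise_level_dbov(samples):
    """Return the mean power of 16-bit PCM samples in dBov (0 or below)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_power = sum(s * s for s in samples) / len(samples)
    if mean_power == 0:
        # Assumption: digital silence is clamped to the lowest codable level.
        return -127.0
    # A full-scale square wave has mean power FULL_SCALE**2, i.e. 0 dBov.
    return 10.0 * math.log10(mean_power / (FULL_SCALE ** 2))


def pack_noise_level(dbov):
    """Encode a level in dBov as the noise-level byte (MSB always 0)."""
    level = int(round(-dbov))          # the byte carries the level in -dBov
    level = max(0, min(127, level))    # clamp to the 7-bit range 0..127
    return level & 0x7F


def unpack_noise_level(byte):
    """Decode the noise-level byte to dBov, ignoring the unused MSB."""
    return -(byte & 0x7F)
```

   For example, pack_noise_level(-63.4) yields the byte value 63, and
   unpack_noise_level(63) returns -63.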


4.2 Spectral Information

   The spectral information is transmitted using reflection
   coefficients [9]. Each reflection coefficient can have values
   between -1 and 1 and is quantized uniformly using 8 bits. The
   quantized value is represented by the 8-bit index N, where
   N = 0...254; index N = 255 is reserved for future use. Each index N
   is packed into a separate byte with the MSB first. The quantized
   value of each reflection coefficient, k_i, can be obtained from its
   corresponding index using:

        k_i(N_i) = 258*(N_i-127)     for N_i = 0...254; -1 < k_i < 1
                   -------------
                       32768
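
   The mapping between an index and its reflection coefficient can be
   sketched as follows.  The decoding direction follows the formula
   above directly.  The encoding direction is not specified by this
   document; nearest-index rounding with clamping to 0...254 is an
   assumption of this sketch, as are the function names.

```python
def dequantize_reflection(index):
    """Map an index N in 0..254 to k = 258 * (N - 127) / 32768."""
    if not 0 <= index <= 254:
        raise ValueError("index must be in 0..254 (255 is reserved)")
    return 258 * (index - 127) / 32768


def quantize_reflection(k):
    """Map a reflection coefficient to the nearest index in 0..254.

    Assumption: nearest-index rounding; the encoder-side rule is left
    to the implementation.
    """
    index = int(round(k * 32768 / 258)) + 127
    return max(0, min(254, index))
```

   Index 127 decodes to exactly 0, while indices 0 and 254 decode to
   -32766/32768 and +32766/32768, so every decoded value lies strictly
   inside the open interval (-1, 1).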

4.3 Payload Packing

   The first byte of the payload MUST contain the noise level as shown
   in Figure 1.  Quantized reflection coefficients are packed in
   subsequent bytes in ascending order as in Figure 2 where M is the
   model order.  The total length of the payload is M+1 bytes.  Note
   that a 0th order model (i.e. no spectral envelope information)
   reduces to transmitting only the energy level.

              Byte        1     2     3    ...   M+1
                       +-----+-----+-----+-----+-----+
                       |level|  N1 |  N2 | ... |  NM |
                       +-----+-----+-----+-----+-----+

                Figure 2: CN Payload Packing Format
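
   The packing in Figure 2 can be sketched as follows.  This is an
   illustrative sketch, not a normative implementation: the function
   names and the max_order parameter are hypothetical.  The parser
   derives the model order M from the payload length and can shorten
   the coefficient list, which is equivalent to treating the higher
   order reflection coefficients as zero.  Handling of the reserved
   index 255 is left to the caller.

```python
def build_cn_payload(level, indices=()):
    """Return the M+1 byte CN payload: noise level, then N1..NM."""
    if not 0 <= level <= 127:
        raise ValueError("noise level must be in 0..127")
    for n in indices:
        if not 0 <= n <= 254:
            raise ValueError("coefficient index must be in 0..254")
    return bytes([level]) + bytes(indices)


def parse_cn_payload(payload, max_order=None):
    """Return (level, order, indices) for a received CN payload.

    order is the transmitted model order M, derived from the length.
    If max_order is given, only the first max_order indices are kept.
    """
    if len(payload) < 1:
        raise ValueError("CN payload needs at least the noise-level byte")
    level = payload[0] & 0x7F      # most significant bit is unused
    indices = list(payload[1:])
    order = len(indices)           # M = payload length - 1
    if max_order is not None:
        indices = indices[:max_order]
    return level, order, indices
```

   A receiver that understands only the noise level can call
   parse_cn_payload(payload, max_order=0) and ignore the spectral
   bytes.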

5. Usage of RTP

   The RTP header for the comfort noise packet SHOULD be constructed as
   if the comfort noise were an independent codec. Thus, the RTP
   timestamp designates the beginning of the silence period. A static
   payload type of 13 is assigned for a sampling rate of 8,000 Hz; if
   other sampling rates are needed, they MUST be defined through
   dynamic payload types. The RTP packet SHOULD NOT have the marker bit
   set.

   Each RTP packet containing comfort noise MUST contain exactly one CN
   payload per channel.  This is required since the CN payload has a
   variable length.  If multiple audio channels are used, each channel
   MUST use the same spectral model order 'M'.
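
   A single-channel comfort noise packet following these rules might be
   assembled as sketched below.  The sketch assumes the 12-octet fixed
   RTP header of RFC 1889 with no CSRC entries, padding, or header
   extension; the function and constant names are hypothetical.

```python
import struct

CN_PAYLOAD_TYPE = 13  # static payload type for an 8,000 Hz sampling rate


def build_cn_rtp_packet(seq, timestamp, ssrc, cn_payload,
                        payload_type=CN_PAYLOAD_TYPE):
    """Return a fixed RTP header followed by one CN payload."""
    byte0 = 2 << 6                 # version 2, P=0, X=0, CSRC count 0
    byte1 = payload_type & 0x7F    # marker bit (MSB) is left clear
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + bytes(cn_payload)
```

   Here timestamp would be set to the start of the silence period, and
   a dynamic payload type would be passed as payload_type for sampling
   rates other than 8,000 Hz.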

6. Guidelines for Use

   An audio codec with DTX capabilities generally includes VAD, DTX, and
   CNG algorithms.  The job of the VAD is to discriminate between
   active and inactive voice segments in the input signal.  During
   inactive voice segments, the role of the CNG is to sufficiently
   describe the ambient noise while minimising the transmission rate.
   A Silence Insertion Descriptor (SID) frame containing a description
   of the noise is packed into the CN payload and sent to the receiver.
   The DTX algorithm determines when a SID frame is transmitted.  The
   SID frame is sent once at the beginning of a silence period, but the
   update rate is left implementation-specific.  For example, the SID
   frame may be sent periodically or only when there is a significant
   change in the background noise characteristics.  The CNG algorithm
   at the receiver uses the information in the SID to update its noise
   generation model and then produce an appropriate amount of comfort
   noise.
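
   One possible DTX update policy is sketched below.  It is purely
   illustrative: the class name, the 2 dB change threshold, and the
   50-frame refresh interval are assumptions of this sketch, since the
   update rate is left to the implementation.  The sketch sends a SID
   at the start of each silence period, then again when the noise level
   changes noticeably or a refresh interval has elapsed.

```python
class SidScheduler:
    """Decide, frame by frame, whether a SID frame should be sent."""

    def __init__(self, level_threshold_db=2, max_interval_frames=50):
        self.level_threshold_db = level_threshold_db
        self.max_interval_frames = max_interval_frames
        self._last_level = None      # level carried by the last SID sent
        self._frames_since_sid = 0

    def should_send(self, voice_active, level):
        """Return True if a SID with this noise level should be sent."""
        if voice_active:
            # Speech frames reset the state; the next silence starts fresh.
            self._last_level = None
            self._frames_since_sid = 0
            return False
        if self._last_level is None:
            send = True              # first inactive frame of the period
        else:
            self._frames_since_sid += 1
            changed = (abs(level - self._last_level)
                       >= self.level_threshold_db)
            expired = self._frames_since_sid >= self.max_interval_frames
            send = changed or expired
        if send:
            self._last_level = level
            self._frames_since_sid = 0
        return send
```

   A fuller policy might also compare the spectral parameters, not only
   the noise level, before deciding that the background noise has
   changed.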

   The CN payload format provides a minimum interoperability
   specification for communication of comfort noise parameters.  The
   comfort noise analysis and synthesis as well as the VAD and DTX
   algorithms are unspecified and left implementation-specific.
   However, an example solution for G.711 has been tested and is
   described in Appendix II of ITU-T Recommendation G.711 [9].  It uses
   the VAD and DTX of G.729 Annex B [10] and a comfort noise generation
   algorithm (CNG), which is provided in the Appendix for information.
   Additional guidelines for use such as the factors affecting system
   performance in the design of the VAD/DTX/CNG algorithms are
   described in the Appendix.

7. MIME Media Type Registrations

   This section defines a new RTP payload name and associated MIME
   type, CN (audio/CN).

7.1 Registration of MIME media type audio/CN

   MIME media type name: audio

   MIME subtype name: CN

   Required parameters: None

   Optional parameters: rate

   Encoding considerations:
   This type is only defined for transfer via RTP [RFC XXXX, draft-
   ietf-avt-rtp-new].

   Security considerations: none

   Interoperability considerations: none

   Published specification:
   This document and Appendix II of ITU-T Recommendation G.711

   Applications which use this media type:
   Audio and video streaming and conferencing tools.

   Additional information: none

   Person & email address to contact for further information:
   Robert Zopf
   zopf@lucent.com

   Intended usage: COMMON

   Author/Change controller:
   Author: Robert Zopf
   Change controller: IETF AVT Working Group


8. References


   1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
      9, RFC 2026, October 1996.

   2  Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
      "RTP: A Transport Protocol for Real-Time Applications", RFC 1889,
      January 1996.

   3  ITU Recommendation G.711 (11/88) - Pulse code modulation (PCM) of
      voice frequencies.

   4  ITU Recommendation G.726 (12/90) - 40, 32, 24, 16 kbit/s Adaptive
      Differential Pulse Code Modulation (ADPCM).

   5  ITU Recommendation G.727 (12/90) - 5-, 4-, 3- and 2-bit/sample
      embedded adaptive differential pulse code modulation (ADPCM).

   6  ITU Recommendation G.728 (09/92) - Coding of speech at 16 kbit/s
      using low-delay code excited linear prediction.

   7  ITU Recommendation G.722 (11/88) - 7 kHz audio-coding within 64
      kbit/s.

   8  Bradner, S., "Key words for use in RFCs to Indicate Requirement
      Levels", BCP 14, RFC 2119, March 1997.

   9  Appendix II to Recommendation G.711 (to be published) - A comfort
      noise payload definition for ITU-T G.711 use in packet-based
      multimedia communication systems.


   10 Annex B (08/97) to Recommendation G.729 - C source code and test
      vectors for implementation verification of the algorithm of the
      G.729 silence compression scheme.



9. Author's Address

   Robert Zopf
   Lucent Technologies
   INS Access VoIP Networks
   200 Schulz Drive
   Red Bank, NJ 07701
   USA

   e-mail: zopf@lucent.com
   Tel:    1-732-578-3207
   Fax:    1-732-578-3213


Full Copyright Statement

   "Copyright (C) The Internet Society (date). All Rights Reserved.
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into