Audio/Video Transport WG                                  Ari Lakaniemi
Internet Draft                                              Ye-Kui Wang
Intended status: Standards track                                  Nokia
Expires: December 2008                                     July 7, 2008



        RTP payload format for the ITU-T Embedded Variable Bit-Rate
                            speech/audio codec
                    draft-lakaniemi-avt-rtp-evbr-01.txt


Status of this Memo

   By submitting this Internet-Draft, each author represents that
   any applicable patent or other IPR claims of which he or she is
   aware have been or will be disclosed, and any of which he or she
   becomes aware will be disclosed, in accordance with Section 6 of
   BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 7, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   This document specifies the Real-Time Transport Protocol (RTP)
   payload format for the Embedded Variable Bit-Rate (EV-VBR)
   speech/audio codec. A media type registration for this RTP payload
   format is also included.



Lakaniemi, Wang        Expires December 7, 2008                [Page 1]


Internet-Draft          RTP payload for EV-VBR                June 2008


Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents


   1. Introduction
   2. Background
      2.1. EV-VBR codec
      2.2. Benefits of layered design
      2.3. Transmitting layered data
      2.4. Scaling scenarios & rate control
   3. EV-VBR RTP payload format
      3.1. Payload Structure
         3.1.1. Payload Header
         3.1.2. EV-VBR transport blocks
      3.2. Handling the Encoded data
      3.3. EV-VBR scaling
      3.4. CRC verification
      3.5. EV-VBR session
      3.6. Cross-stream/cross-layer timing synchronization
      3.7. RTP Header usage
   4. Codec bit-rate and layer configuration control
   5. Payload Format Parameters
      5.1. Media Type Registration
      5.2. Mapping to SDP Parameters
      5.3. Offer/answer considerations
      5.4. Declarative usage of SDP
      5.5. SDP examples
   6. Security Considerations
   7. Congestion control
   8. IANA Considerations
   APPENDIX A: Payload examples
      A.1. Simple payload examples
         A.1.1. All the layers in the same payload
         A.1.2. Layers in separate RTP streams
      A.2. Advanced examples
         A.2.1. Different update rate for subset of layers
         A.2.2. Redundant frames with limited set of layers
   9. References
      9.1. Normative References
      9.2. Informative References
   Author's Addresses
   Intellectual Property Statement
   Disclaimer of Validity

1. Introduction

   The International Telecommunication Union (ITU-T) recommendation
   G.xxx [ev-vbr] specifies the Embedded Variable Bit Rate (EV-VBR)
   speech/audio codec. This document specifies the Real-time Transport
   Protocol (RTP) [RFC3550] payload format for this codec.

2. Background

2.1. EV-VBR codec

   EV-VBR is an embedded variable rate speech codec having a layered
   design. The bitstream of the EV-VBR core codec consists of a core
   layer, denoted as L1, and four enhancement layers, denoted as L2-L5.
   The bit-rates of the EV-VBR core codec range from 8 kbit/s (core
   layer only) to 32 kbit/s (with all layers up to L5). Furthermore, the
   EV-VBR codec also supports discontinuous transmission (DTX) and
   comfort noise generation (CNG) by sending Silence Descriptor (SID)
   frames during periods of non-active input signal. The sampling
   frequency of the core codec is 16 kHz and the codec operates on 20 ms
   frames. Note that the EV-VBR codec is also capable of narrowband
   operation with audio input and/or output at 8 kHz sampling frequency.

   While transmitting/receiving the core layer L1 is enough for
   successful decoding of audio content, each of the enhancement layers
   Ln (n being 2 to 5, inclusive) provides an improvement to
   reconstructed audio quality. Thus, the core layer ensures the basic
   communication while the enhancement layers can be used to improve the
   perceptual quality. Furthermore, enhancement layers are dependent on
   all the lower layers in the sense that successful decoding of layer
   Ln also requires all the layers Lm with m<n to be available. The
   sizes of the EV-VBR core codec layers L1-L5 are summarized in Table 1
   below.

                          Table 1: EV-VBR layers

        Layer    Bytes    Cumulative source bit-rate
      --------------------------------------------
         L1        20             8 kbit/s
         L2        10            12 kbit/s
         L3        10            16 kbit/s
         L4        20            24 kbit/s
         L5        20            32 kbit/s

   The EV-VBR codec also includes an encoding mode that is compatible
   with the Adaptive Multi-Rate Wideband (AMR-WB) codec, for which the
   RTP payload format is specified in [RFC4867]. In this AMR-WB
   interoperable mode, layers L1 and L2 are replaced by L1', which
   consists of AMR-WB encoded data. Furthermore, a modified layer L3' is
   used together with L1' instead of L3. The usage of layers L4 and L5
   is not affected by transmitting AMR-WB data in the lower layers.
   Table 2 summarizes the AMR-WB interoperable mode.

          Table 2: EV-VBR layers in the AMR-WB interoperable mode

        Layer    Bytes    Cumulative source bit-rate
      --------------------------------------------
         L1'       32            12.8 kbit/s
         L3'        8            16 kbit/s
         L4        20            24 kbit/s
         L5        20            32 kbit/s
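
   The cumulative bit-rates in Tables 1 and 2 follow from the layer
   sizes and the 20 ms frame length. The following Python sketch
   reproduces them; the function and variable names are illustrative
   only and are not part of this specification.

```python
FRAME_MS = 20  # frame length stated in section 2.1

# Layer sizes in bytes, copied from Table 1 and Table 2.
CORE_LAYERS = [("L1", 20), ("L2", 10), ("L3", 10), ("L4", 20), ("L5", 20)]
AMRWB_LAYERS = [("L1'", 32), ("L3'", 8), ("L4", 20), ("L5", 20)]


def cumulative_kbps(layers):
    """Return [(layer, cumulative kbit/s)] for a list of (layer, bytes)."""
    total_bytes = 0
    rates = []
    for name, size in layers:
        total_bytes += size
        # Bits per frame divided by milliseconds per frame gives kbit/s.
        rates.append((name, total_bytes * 8 / FRAME_MS))
    return rates
```

   For example, cumulative_kbps(AMRWB_LAYERS) gives 12.8 kbit/s for L1'
   alone, since 32 bytes are sent every 20 ms.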


   Note that backward compatibility with existing AMR-WB end-points can
   be reached by using the AMR-WB RTP payload format [RFC4867].

   ITU-T SG16 is currently working on a set of extension layers that
   provide super-wideband (SWB) audio and stereophonic encoding
   extensions on top of the EV-VBR core codec. Further details and the
   usage of these layers are TBD.

   The main application of the EV-VBR codec is telephony. Other expected
   applications include audio/video conferencing and audio streaming.

2.2. Benefits of layered design

   The layered design enables simple scalability of the transmitted
   stream by conveying a suitable number of layers. The number of
   layers used in a session may be selected, for example, based on the
   capacity of the transmission channel, current transmission
   conditions, characteristics of the source signal, or available
   processing capacity.

   Another benefit of the layered codec design is the possibility to
   exploit the scalability to support congestion control by
   transmitting or dropping some of the (higher) enhancement layers in
   order to alleviate congestion in the network. See section 7 for a
   more detailed discussion of congestion control.

   Furthermore, the layered design also implicitly provides the
   possibility for unequal error detection/protection by employing
   different levels of protection on the core layer and the enhancement
   layers.

2.3. Transmitting layered data

   In principle there are two basic approaches to carry the data from a
   layered encoder:

   1. All the layers are carried within a single RTP session.

   2. The encoded data is divided over multiple RTP sessions, each
      session carrying a subset of layers. This is also referred to as
      session multiplexing.

   The first choice is the most efficient in terms of exploitation of
   transmission bandwidth. Furthermore, using only one packet to carry
   all encoded data layers of a frame also requires fewer resources from
   the end-systems (and intermediate systems), since the number of
   packets is kept at a minimum and only a single RTP packet stream
   needs to be handled. However, this option requires any intermediate
   network element performing the scaling operation to be fully
   media-aware, since removing encoded layers requires modification of
   the payload. Furthermore, if secure transport is employed, the
   intermediate network element needs to be within the security context
   to be able to manipulate the payload in a meaningful way. This might
   not be feasible in all systems/scenarios, but some special-purpose
   devices, such as media gateways in cellular telephone systems, may be
   able to implement this kind of media-aware functionality.

   The second alternative, transmitting selected subsets of layers in
   separate RTP sessions, facilitates simple scalability in intermediate
   network elements without the requirement of being fully media-aware.
   One use case of this alternative is layered multicast [Add ref]. On
   the other hand, this approach introduces separate packet header
   overhead for each subset of layers. When the size of the encoded data
   block per single layer is in the range of 10 to 20 bytes, the
   packetisation may result in a relatively high amount of protocol
   overhead, which might be an expensive solution on bandwidth-limited
   links. Another drawback of this approach is the somewhat more complex
   session setup and the additional complexity associated with handling
   several concurrent RTP sessions. However, this is a trade-off that
   enables simple scalability also by intermediate network elements that
   are not aware of the details of the transmitted media.

2.4. Scaling scenarios & rate control

   In principle there are three different ways to make use of the
   layered design to control the bandwidth usage:

   1. A sender decides to change the number of layers it is transmitting
      (for example due to congestion control constraints).

   2. A receiver or an intermediate network element instructs a sender
      to change the number of layers it is transmitting.

   3. An intermediate network element passes forward only a subset of
      the layers it receives.

   The most appropriate mechanism depends on the application and the
   employed network topology. For example, a point-to-point
   conversational audio connection can easily introduce rate control by
   changing the number of transmitted layers, while in a centralized
   audio/video conferencing scenario the conference server is a more
   appropriate point to implement the rate control than the transmitting
   end-point. Please refer to [topo] for an extensive discussion of the
   different topologies and their implications for the transmission.

   However, the fundamental difference between these choices is that
   method 1 does not necessarily need any feedback from the receiver(s),
   while methods 2 and 3 require a signaling mechanism to support rate
   control.

3. EV-VBR RTP payload format

   The basic EV-VBR source data unit is one layer of an encoded frame.
   Since the term layer generally refers to a time series of data
   representing a certain encoding layer, this specification uses the
   term Encoded Data Unit (EDU) to refer to a single layer of data from
   a single encoded frame. Thus, each EDU has a (conceptual) frame
   number indicating its location in encoding/decoding order and a layer
   number indicating the encoding layer the EDU represents.

3.1. Payload Structure

   The EV-VBR payload format consists of a payload header, followed by
   one or more transport blocks (TB) forming the actual payload data.

    +-----------------+----------+----------+- /// -+----------+
    | Payload header  |  TB(1)   |  TB(2)   |          TB(n)   |
    +-----------------+----------+----------+- /// -+----------+

3.1.1. Payload Header

   The payload header consists of an 8-bit payload CRC checksum:

    +-+-+-+-+-+-+-+-+
    |     CRC       |
    +-+-+-+-+-+-+-+-+


   At the transmitting end, the payload checksum is computed over the
   primary transport block (defined in section 3.1.2) of the payload
   using the generator polynomial

      C(z) = z^8 + z^4 + z^3 + z^2 + 1.

   Subsequent transport blocks are prepared in such a way that the
   payload checksum is valid for any integer number of contiguous
   transport blocks starting from the beginning of the primary transport
   block.

   At the receiving end, the payload CRC checksum can be used to verify
   the correct reception of any contiguous subset of transport blocks
   starting from the beginning of the primary transport block (see
   section 3.4 for a detailed description).

3.1.2. EV-VBR transport blocks

   The basic building block of the EV-VBR RTP payload data is an EV-VBR
   transport block (TB). There are two types of transport blocks:
   primary transport block and secondary transport block.

   The structure of the primary transport block is depicted below.


     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+----------------------------+
    |   L-ID    |NF | Encoded data               |
    +-+-+-+-+-+-+-+-+----------------------------+

   The structure of the secondary transport block is depicted below.

     0 1 2 3 4 5 6 7                              0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+
    |   L-ID    |NF | Encoded data               |     Tail      |
    +-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+

   The layer ID (L-ID) and the NF fields form the transport block
   header. The L-ID field is used to identify the layer structure of the
   encoded data carried in this EV-VBR transport block, and the NF field
   indicates the number of encoded frames with this layer structure
   carried in the Encoded data part following the transport block
   header. The Tail field of the secondary transport block carries a
   modified 8-bit CRC checksum computed over the transport block, as
   specified below.
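
   As an illustration, the transport block header octet can be packed
   and parsed as in the following Python sketch. It assumes that bit 0
   in the diagrams above is the most significant bit of the octet, as is
   customary in RTP payload format diagrams, and that NF carries the
   number of frames minus one as described for the NF field. The
   function names are hypothetical.

```python
def pack_tb_header(l_id, num_frames):
    """Build the one-octet TB header: L-ID in the 6 MSBs, NF in the 2 LSBs."""
    if not 0 <= l_id <= 63:
        raise ValueError("L-ID must fit in 6 bits")
    if not 1 <= num_frames <= 4:
        raise ValueError("a transport block carries 1 to 4 frames")
    # NF is transmitted as the number of frames decreased by one.
    return bytes([(l_id << 2) | (num_frames - 1)])


def parse_tb_header(octet):
    """Return (L-ID, number of frames) for a TB header octet (an int)."""
    return octet >> 2, (octet & 0x03) + 1
```

   With these definitions, a transport block carrying two frames with
   L-ID=3 starts with the octet 0x0D.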

   An EV-VBR RTP payload SHALL include exactly one primary transport
   block and it MAY be followed by one or more secondary transport
   blocks. The data fields of both transport block types are described
   below.

   L-ID Identification (6 bits) of the encoded data carried in this
        transport block. Table 3 below specifies the mapping between
        L-ID and the encoded data.

                Table 3: Layer identification (L-ID) values

          L-ID    Encoded data
        --------------------------------------
            0     Empty frame
            1     L1
            2     L1-L2
            3     L1-L3
            4     L1-L4
            5     L1-L5
            6     L2
            7     L2-L3
            8     L2-L4
            9     L2-L5
           10     L3
           11     L3-L4
           12     L3-L5
           13     L4
           14     L4-L5
           15     L5
           16     L1'
           17     L1', L3'
           18     L1', L3', L4
           19     L1', L3', L4-L5
           20     EV-VBR SID
           21     AMR-WB SID
           22-62  Reserved for stereo and SWB layers
           63     Time synchronization element (see section 3.6)

             Author's note: The current approach provides maximum
             flexibility in terms of layer configuration. However,
             limiting choices would be one way to leave more bits for
             stereo & SWB layer configurations.

             Author's note: One suggested way to make sure we do not
             run out of L-ID values with the extension modes has been
             to make the mapping between L-ID and layer configuration
             it indicates dynamic (to be specified using SDP in session
             set-up). While this would provide effective usage of L-ID
             bits, it would require all elements processing the payload
             to be signaling-aware. A compromise solution would be to
             provide static mapping for selected layer configurations
             and leave 'more exotic' cases to be dynamically mapped on
             session basis. The usage of this type of approach is FFS.

   NF   Number of frames in this transport block decreased by one (2
        bits). The number of frames is equal to the value of NF
        incremented by one. For example, NF=0 indicates that the
        transport block carries one frame, and NF=3 indicates that the
        transport block carries four frames. If the sender wants to
        encapsulate more than four frames per payload, several
        transport blocks need to be used.

   Encoded data

        Encoded data consists of EDUs as specified by the values of the
        L-ID and NF fields, arranged according to the rules given in
        section 3.2.

   Tail The 8-bit tail field of the secondary transport block carries a
        bit field that is needed to modify the partial CRC checksum
        over the payload data up to the end of this TB to match the
        payload CRC field value carried in the payload header.

        In the transmitter the Tail bits for a secondary TB(n) are
        computed by first computing the CRC checksum CRC(n) over the
        payload data from the beginning of the primary TB up to the end
        of TB(n) using the generator polynomial C(z) given above. The
        bits of the Tail field of TB(n) are set to zero value for the
        CRC computation. The transmitted value of the Tail field in
        TB(n) is obtained by bitwise XOR operation between the payload
        CRC field value carried in the payload header and the CRC(n)
        computed for TB(n).
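
   The Tail computation can be sketched in Python as shown below. This
   document gives only the generator polynomial, so the remaining CRC
   details in the sketch are assumptions: the checksum is taken to be
   the plain remainder of the bit string divided by C(z), processed MSB
   first, with a zero initial state and no appended zero bits. Under
   that reading, the checksum computed up to the end of TB(n), including
   the transmitted Tail, equals the payload CRC. The function names are
   hypothetical.

```python
POLY = 0x11D  # C(z) = z^8 + z^4 + z^3 + z^2 + 1, including the z^8 term


def crc8_remainder(data, reg=0):
    """Remainder of the bit string `data` (MSB first) divided by C(z).

    Assumption: zero initial state and no appended zero bits.
    """
    for byte in data:
        for shift in range(7, -1, -1):
            reg = (reg << 1) | ((byte >> shift) & 1)
            if reg & 0x100:
                reg ^= POLY
    return reg


def compute_tail(payload_crc, data_up_to_tail):
    """Tail octet for a secondary TB(n).

    `data_up_to_tail` holds all octets from the start of the primary TB
    up to, but not including, the Tail field of TB(n).
    """
    # CRC(n) is computed with the Tail bits of TB(n) set to zero.
    crc_n = crc8_remainder(data_up_to_tail + b"\x00")
    return payload_crc ^ crc_n
```

   The payload CRC itself would be crc8_remainder(primary_tb) under the
   same assumptions.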

3.2. Handling the Encoded data

   In order to provide unique mapping of EDUs to encoded frames, the
   following rules on sequence of frames and sequence of layers need to
   be followed when creating a payload:

   o  The frames within a payload MUST form a set of contiguous frames
      in decoding order, i.e. if a payload carries frames n and n+N, all
      frames between n and n+N in decoding order MUST also be present in
      the payload.

   o  The layers within a frame MUST form a contiguous set of layers,
      i.e. if layers Lx and Ly of a frame are included in the payload,
      all layers between Lx and Ly layers MUST also be present.

   The EDUs within a transport block are arranged according to the
   following rules:

   o  The EDUs within a transport block MUST be arranged in increasing
      order of layer number

   o  The EDUs with the same layer number within a transport block MUST
      be arranged in the decoding order
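
   The arrangement rules above can be illustrated with the following
   Python sketch, which lists the EDUs of one transport block in
   transmission order and computes the size of its Encoded data part.
   Only the core codec L-ID values 1 to 15 of Table 3 and the layer
   sizes of Table 1 are covered, and the names are illustrative only.

```python
# Layer sizes in bytes, from Table 1.
LAYER_BYTES = {1: 20, 2: 10, 3: 10, 4: 20, 5: 20}

# L-ID values 1..15 of Table 3: every contiguous range of layers L1..L5,
# enumerated by lowest layer first and then by highest layer.
LID_LAYERS = {}
_l_id = 1
for _lowest in range(1, 6):
    for _highest in range(_lowest, 6):
        LID_LAYERS[_l_id] = list(range(_lowest, _highest + 1))
        _l_id += 1


def edu_order(l_id, num_frames):
    """Return (frame, layer) pairs in the order the EDUs appear in a TB.

    EDUs are sorted by increasing layer number; EDUs of the same layer
    follow the decoding order of their frames (numbered from 1 here).
    """
    return [(frame, layer)
            for layer in LID_LAYERS[l_id]
            for frame in range(1, num_frames + 1)]


def encoded_data_size(l_id, num_frames):
    """Size in bytes of the Encoded data part of a transport block."""
    return num_frames * sum(LAYER_BYTES[layer] for layer in LID_LAYERS[l_id])
```

   For L-ID=3 and two frames, edu_order() yields F1-L1, F2-L1, F1-L2,
   F2-L2, F1-L3, F2-L3, and the Encoded data part occupies 80 bytes.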

   Explicit timing information for the transport blocks is not needed,
   since the ordering of EDUs in the payload and their mapping to
   transport blocks can be used to implicitly carry this information.
   The following rules apply:

   o  If the highest layer carried in transport block k is n, and the
      lowest layer carried by transport block k+1 is n+1, then the EDUs
      of transport blocks k and k+1 belong to the same encoded frame.
      Furthermore, if transport blocks k and k+1 carry EDUs belonging to
      the same encoded frame(s), these transport blocks MUST include the
      same number of EDUs.

   o  If the highest layer carried in transport block k is n, and the
      lowest layer carried by transport block k+1 is smaller than or
      equal to n, the EDUs of transport blocks k and k+1 belong to two
      separate encoded frames, which are contiguous in decoding order.

   o  Multiple copies of an EDU MUST NOT be included in the payload.
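
   One possible way for a receiver to apply the implicit timing rules
   above is sketched below in Python. Each transport block is described
   by its lowest layer, highest layer and number of frames, and the
   function returns, for each transport block, the index of its first
   frame relative to the first frame of the payload. Treating a gap in
   layer numbers as an error is an assumption of this sketch, derived
   from the requirement that the layers of a frame be contiguous; the
   names are hypothetical.

```python
def assign_frames(blocks):
    """Map each TB to the 0-based index of the first frame it carries.

    `blocks` is a list of (lowest_layer, highest_layer, num_frames)
    tuples in payload order.
    """
    first_frames = []
    start = 0
    prev_high = None
    prev_frames = 0
    for low, high, num_frames in blocks:
        if prev_high is not None:
            if low == prev_high + 1:
                # Next layer of the same frame(s): the frame count must match.
                if num_frames != prev_frames:
                    raise ValueError("frame count differs between TBs")
            elif low <= prev_high:
                # Layer number restarts: new frame(s), contiguous in order.
                start += prev_frames
            else:
                raise ValueError("gap in layer numbers between TBs")
        first_frames.append(start)
        prev_high, prev_frames = high, num_frames
    return first_frames
```

   For instance, three transport blocks carrying L1, L2 and L3 of the
   same two frames all map to frame index 0, whereas two consecutive
   transport blocks that each carry L1-L3 of one frame map to indices 0
   and 1.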


   A set of EDUs can be allocated to transport blocks in several ways.
   For example, each EDU can be encapsulated in its own transport block,
   all EDUs can be carried in a single transport block, EDUs belonging
   to the same encoded frame can be encapsulated in a dedicated
   transport block, or EDUs representing the same layer can be carried
   in their own transport blocks. Three examples of this, each with two
   frames carrying layers L1-L3, are given below. The first example
   illustrates the case of using a single transport block for the whole
   payload, while the second example introduces separate transport
   blocks for each of the EDUs. The third example shows an approach
   where each layer is carried in a dedicated transport block. The
   notation Fx-Ly is used to denote layer y of frame x.

   Example 1: All EDUs in a single transport block

     +---------+-----+-------+-------+-------+-------+-------+--------+
     | L-ID=3  |NF=1 | F1-L1 | F2-L1 | F1-L2 | F2-L2 | F1-L3 | F2-L3  |
     +---------+-----+-------+-------+-------+-------+-------+--------+

   Example 2: All EDUs in separate transport blocks

     +---------+-----+-------+---------+-----+-------+
     | L-ID=1  |NF=0 | F1-L1 | L-ID=1  |NF=0 | F2-L1 |
     +---------+-----+-------+---------+-----+-------+
     | L-ID=6  |NF=0 | F1-L2 | L-ID=6  |NF=0 | F2-L2 |
     +---------+-----+-------+---------+-----+-------+
     | L-ID=10 |NF=0 | F1-L3 | L-ID=10 |NF=0 | F2-L3 |
     +---------+-----+-------+---------+-----+-------+

   Example 3: Dedicated transport block for the EDUs of each layer

     +---------+-----+-------+-------+---------+-----+-------+-------+
     | L-ID=1  |NF=1 | F1-L1 | F2-L1 | L-ID=6  |NF=1 | F1-L2 | F2-L2 |
     +---------+-----+-------+-------+---------+-----+-------+-------+
     | L-ID=10 |NF=1 | F1-L3 | F2-L3 |
     +---------+-----+-------+-------+

   While the first example, carrying data from all layers in the same
   transport block, obviously consumes less bandwidth, the second
   example, using a separate transport block for each EDU, and the third
   example, using dedicated transport blocks for each layer, provide a
   simple scaling possibility. In the first case, removing e.g. layer L3
   (from each frame in the payload) would require changing the value of
   the L-ID in addition to removing the corresponding EDU(s), whereas in
   the second and third cases it is enough to remove all transport
   blocks carrying L3 data, and the remaining part of the payload can be
   left untouched.

3.3. EV-VBR scaling

   Any media-aware network element can modify the EV-VBR bitstream by
   dropping some of the layers if, for example, congestion control or
   access link bandwidth requires such scaling.

   A payload can be either completely dropped or some of the transport
   blocks it carries can be discarded. In case full payloads are dropped
   to implement scaling, a packet containing the core layer L1 SHOULD
   NOT be discarded, since the decoding of higher layers of the same
   encoded frame is not possible without the core layer data being
   available. This means that payloads with L-ID values equal to 1 to 5,
   inclusive and 16 to 19, inclusive, SHOULD NOT be completely
   discarded.

   In case the payload is forwarded with modified content, at least the
   primary transport block MUST be preserved in the payload, while some
   of the secondary transport blocks at the end of the payload MAY be
   discarded.

3.4. CRC verification

   At the receiving end, the CRC computation is started from the
   beginning of the primary TB, i.e. from the MSB of the first octet of
   TB(1), and is continued until the end of the payload data or until
   an erroneous TB is encountered. At the end of each TB a check MAY be
   performed: if the CRC value at the end of TB(n) matches the payload
   CRC value received in the payload header, the verification is
   successful and the data up to TB(n) is valid. If the CRC value at the
   end of TB(n) does not match the payload CRC value received in the
   payload header, there is an error in the TB(n) and it MUST be
   discarded as corrupted. Furthermore, if the verification indicates
   corrupted TB(n), all subsequent transport blocks TB(m) with m>n MUST
   also be discarded.
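
   The verification procedure can be sketched in Python as follows. The
   sketch assumes the transport blocks have already been delimited, and
   it relies on assumptions about CRC details that this document does
   not state: a plain polynomial remainder, processed MSB first, with a
   zero initial state and no appended zero bits. The function names are
   hypothetical.

```python
POLY = 0x11D  # C(z) = z^8 + z^4 + z^3 + z^2 + 1, including the z^8 term


def crc8_remainder(data, reg=0):
    """Continue the remainder computation over `data`, MSB first.

    Assumption: zero initial state and no appended zero bits.
    """
    for byte in data:
        for shift in range(7, -1, -1):
            reg = (reg << 1) | ((byte >> shift) & 1)
            if reg & 0x100:
                reg ^= POLY
    return reg


def count_valid_blocks(payload_crc, transport_blocks):
    """Return the number of leading TBs that pass the running CRC check.

    `transport_blocks` is a list of byte strings, TB(1) first. The first
    TB that fails the check and all TBs after it are to be discarded.
    """
    reg = 0
    valid = 0
    for tb in transport_blocks:
        reg = crc8_remainder(tb, reg)
        if reg != payload_crc:
            break
        valid += 1
    return valid
```

   Because the register is carried over from one transport block to the
   next, the check after TB(n) covers all data from the beginning of the
   primary TB up to the end of TB(n).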

3.5. EV-VBR session

   An EV-VBR session consists of one or several RTP sessions carrying
   encoded EV-VBR data according to the payload format specified in
   section 3.2.

3.6. Cross-stream/cross-layer timing synchronization

   In case an EV-VBR session consists of multiple RTP sessions, the RTP
   packets transmitted on separate RTP sessions need to be synchronized
   in order to enable reconstruction of the frames at the receiving end.

   Since each of the RTP sessions uses its own random initial value for
   the RTP timestamp, there is also a random offset between the RTP
   timestamp values of the packets carrying the EDUs belonging to the
   same encoded frame in different RTP sessions.

   The receiver MAY re-use the traditional RTCP-based mechanism to
   synchronize streams by using the RTP and NTP timestamps of the RTCP
   Sender Reports (SR) it receives. The drawback of this approach is
   that cross-session synchronization is not possible until the first
   RTCP SRs are received in each session. This implies that only a
   subset of layers may be decodable until RTCP SRs have been received
   in all sessions.

   To overcome the drawback of delayed cross-session synchronization,
   this document specifies a payload-specific timing synchronization
   element that can be used to establish the cross-session
   synchronization immediately at the beginning of the session:

                         1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  L-ID=63  |NF |   Reference TS                                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Ref. TS cont. |
    +-+-+-+-+-+-+-+-+

   NF MUST be set to value 0.

   The Reference TS field (32 bits) carries the RTP timestamp value
   that, in the RTP session carrying the core layer L1 data, corresponds
   to the RTP timestamp value of the packet carrying this time
   synchronization element.

   A time synchronization element MUST NOT be included in the RTP
   session carrying the core layer L1 EDUs. A time synchronization
   element MAY be included in any packet of the RTP session not carrying
   core layer L1 data. A sender SHOULD include a time synchronization
   element in the first N (t.b.d.) payloads of the session.

   As an example, let's assume a two-RTP-session scenario, where the
   first RTP session (RTP1) carries layers L1 and L2 and the second RTP
   session (RTP2) carries layers L3, L4 and L5. The RTP timestamp value
   indicating the sampling time of frame n in RTP1 is TS1. The
   synchronization can be established by including a synchronization
   element with reference TS value TS1 in the RTP packet of RTP2 having
   the RTP timestamp value indicating the sampling time of frame n.
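
   The following Python sketch shows how the time synchronization
   element could be built and used in the scenario above. It assumes
   that the Reference TS field is carried in network byte order, which
   this document does not state explicitly, and the function names are
   hypothetical.

```python
import struct

SYNC_L_ID = 63  # L-ID value of the time synchronization element (Table 3)


def pack_sync_element(reference_ts):
    """Build the 5-octet element: L-ID=63, NF=0, 32-bit Reference TS."""
    return struct.pack("!BI", SYNC_L_ID << 2, reference_ts & 0xFFFFFFFF)


def parse_sync_element(data):
    """Return the Reference TS carried in a time synchronization element."""
    header, reference_ts = struct.unpack("!BI", data[:5])
    if header >> 2 != SYNC_L_ID or header & 0x03 != 0:
        raise ValueError("not a time synchronization element")
    return reference_ts


def to_core_timestamp(ts, packet_ts, reference_ts):
    """Map a timestamp of an enhancement session to the core session.

    `packet_ts` is the RTP timestamp of the packet that carried the
    element, and `reference_ts` is the value it carried. Arithmetic is
    modulo 2^32, as for RTP timestamps.
    """
    return (ts - packet_ts + reference_ts) & 0xFFFFFFFF
```

   In the example above, the element sent in RTP2 would be built with
   pack_sync_element(TS1), and a receiver would then translate any later
   RTP2 timestamp to the RTP1 timeline with to_core_timestamp().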



3.7. RTP Header usage

   This section specifies the usage of some fields of the RTP header
   (specified in section 5 of [RFC3550]) with the EV-VBR RTP payload
   format.

   In case the EV-VBR session consists of multiple RTP sessions, the RTP
   sessions are further separated by using different payload type (PT)
   values for each of the RTP streams. In case all layers are carried
   within a single RTP session, only one PT is needed. Note that the
   assignment of the PT number(s) for this payload format is outside the
   scope of this document. It is expected that the RTP profile under
   which this payload is used will either assign PT number(s) for this
   encoding or specify the PT number(s) to be dynamically assigned.

   The RTP timestamp corresponds to the sampling instant of the first
   encoded sample of the earliest frame in the payload. The timestamp
   clock frequency is 32 kHz.
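
   With a 32 kHz timestamp clock and 20 ms frames, each frame advances
   the RTP timestamp by 640 units. The Python sketch below illustrates
   this; it assumes that consecutive payloads carry consecutive frames
   with no gap between them, and the names are illustrative only.

```python
CLOCK_HZ = 32000  # timestamp clock frequency
FRAME_MS = 20     # frame length from section 2.1
TICKS_PER_FRAME = CLOCK_HZ * FRAME_MS // 1000


def next_timestamp(ts, frames_in_payload):
    """RTP timestamp of the payload following one that starts at `ts`.

    Assumes the next payload starts with the frame that immediately
    follows the last frame of the current payload.
    """
    return (ts + frames_in_payload * TICKS_PER_FRAME) & 0xFFFFFFFF
```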

   The marker bit (M) of each of the RTP streams of the session SHALL be
   set to value 1 if the payload carries an EDU belonging to the first
   speech frame after an inactive period, i.e. an EDU from the first
   speech frame of a talkspurt. For all other packets the marker bit is
   set to value 0.

4. Codec bit-rate and layer configuration control

   The media parameters defined in section 5.1 can be used to
   negotiate/declare the employed layer configuration by defining the
   lowest and highest layer numbers for each employed RTP session. The
   highest layer defined for the session specifies the initial maximum
   bandwidth, which cannot be changed without a session re-negotiation
   or codec bit-rate control signaling during the session.

   The codec bit-rate and layer configuration control during a session
   can be performed by instructing the far-end transmitter to change the
   layer configuration it is sending. Reasons for letting the receiver
   indicate its preference include:

   o  If the bit-rate needs to be reduced due to changed bandwidth
      availability, the receiver may wish to express its preference for
      the new configuration. For example, if the original configuration
      consisted of the full set of layers including SWB and stereo, it
      would be useful to enable the receiver to indicate whether it
      wants to drop SWB or stereo.

   o  There might be a receiver-originated reason, unrelated to
      bandwidth, to change the bit-rate or layer configuration. Such a
      reason could be, for example, a change in the available
      computational resources due to other applications running on the
      device.

   In an EV-VBR session consisting of a single RTP session, such a
   change alters the size of the inbound payloads. In an EV-VBR session
   employing multiple RTP sessions, the number of streams, the
   configuration of the streams carrying the highest layers, or both
   will change.

   The mechanism to deliver the codec bit-rate and layer configuration
   control request to the far-end transmitter is currently an open
   issue. A proposal for the data to be sent is illustrated below using
   an application-defined RTCP packet. Please note, however, that this
   approach is currently not considered an appropriate use of the RTCP
   APP packet; the illustration serves only to indicate the
   configuration request data to be delivered. Candidate mechanisms
   that could be used include:

   o  New RTCP packet type

   o  New AVPF FB packet type

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P| subtype |   PT=APP=204  |             length            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           SSRC/CSRC                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          name (ASCII)                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      padding          |  SWB  |Stereo |  ML   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   SWB

        Indicates the super-wideband configuration the sender of this
        packet would like to receive. Value 0 indicates that the sender
        of this packet would not like to receive the super-wideband
        option. Other values are T.B.D.

   Stereo

        Indicates the stereo configuration the sender of this packet
        would like to receive. Value 0 indicates that the sender of
        this packet would not like to receive the stereo option. Other
        values are T.B.D.

   ML

        Indicates the number of the highest layer of the core codec the
        sender of this packet wishes to receive. The valid values of ML
        are in the range from 1 to 5. Other values are reserved for
        future use. The receiver MUST ignore invalid values.

   A transmitter that has received a codec bit-rate control request
   SHOULD start using the requested mode configuration as soon as
   possible. The transmitter MAY transmit fewer layers than requested.

   The codec bit-rate control request is valid until the next request is
   received. Thus, a transmitter that has received a codec bit-rate
   control request message SHOULD follow the most recent request until a
   new request is received.
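
   To make the field layout of the illustrated request concrete, the
   following Python sketch packs and parses the 16-octet packet shown in
   the figure. It is only an illustration of the proposed data, not a
   defined mechanism. The 4-character name "EVBR" and the default
   subtype value 0 are hypothetical placeholders, since neither is
   specified above. Returning None for an invalid ML value (that is,
   ignoring the whole request) is one possible reading of the rule that
   invalid values are ignored.

```python
import struct

RTCP_APP = 204       # RTCP APP packet type, as in the figure
APP_NAME = b"EVBR"   # ASSUMPTION: hypothetical name, not defined in this document
FORMAT = "!BBHI4sI"  # V/P/subtype, PT, length, SSRC, name, padding+SWB+Stereo+ML


def pack_request(ssrc, swb, stereo, ml, subtype=0):
    """Build the request; subtype=0 is an assumed placeholder value."""
    if not 1 <= ml <= 5:
        raise ValueError("ML must be in the range 1..5")
    if not (0 <= swb <= 15 and 0 <= stereo <= 15 and 0 <= subtype <= 31):
        raise ValueError("field value out of range")
    first_octet = (2 << 6) | subtype          # V=2, P=0, 5-bit subtype
    length = 3                                # 32-bit words minus one
    config = (swb << 8) | (stereo << 4) | ml  # upper 20 bits: zero padding
    return struct.pack(FORMAT, first_octet, RTCP_APP, length,
                       ssrc, APP_NAME, config)


def parse_request(packet):
    """Return (ssrc, swb, stereo, ml), or None if ML is not in 1..5."""
    first_octet, pt, _length, ssrc, name, config = struct.unpack(FORMAT, packet)
    if first_octet >> 6 != 2 or pt != RTCP_APP or name != APP_NAME:
        raise ValueError("not a configuration request")
    ml = config & 0xF
    if not 1 <= ml <= 5:
        return None  # invalid ML: the request is ignored in this sketch
    return ssrc, (config >> 8) & 0xF, (config >> 4) & 0xF, ml
```

   For example, parse_request(pack_request(0x12345678, 0, 1, 3)) yields
   the tuple (0x12345678, 0, 1, 3): no SWB, stereo value 1, and layers
   up to L3.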

5. Payload Format Parameters

   This section defines the parameters that may be used to configure
   optional features in the EV-VBR RTP transmission.

   The parameters are defined here as part of the media subtype
   registration for the EV-VBR codec.  Mapping of the parameters into
   the Session Description Protocol (SDP) [RFC4566] is also provided for
   those applications that use SDP.  In control protocols that do not
   use MIME or SDP, the media type parameters must be mapped to the
   appropriate format used with that control protocol.

5.1. Media Type Registration

   This registration is done using the template defined in RFC 4288
   [RFC4288] and following RFC 4855 [RFC4855].

   Type name:  audio

   Subtype name:  EV-VBR

   Required parameters:  none

   Optional parameters:

      layers:    The numbers of the layers (in the range from 1 to 5,
                 denoting layers from L1 to L5, respectively)
                 transmitted in this session, expressed as a
                 comma-separated list of layer numbers. If the
                 parameter is present, at least layer L1 MUST be
                 included in the list of layers (at least?) in one of
                 the RTP sessions included in the EV-VBR session. If
                 the parameter is not present, all layers up to layer
                 L5 MAY be used in the session.

      ptime:     the recommended length of time (in milliseconds)
                 represented by the media in a packet.  See Section 6
                 of [RFC4566].

      maxptime:  the maximum length of time (in milliseconds) that can
                 be encapsulated in a packet.  See Section 6 of
                 [RFC4566].

      Author's note: Some further study needed to see if separate
      parameters for sending and receiving capabilities/preferences are
      needed -- especially for upcoming stereo and SWB options.

      Author's note: The support for upcoming SWB and stereo options
      needs to be taken into account. Basically we can either 1) extend
      the parameter "layers" to cover also this aspect, or 2) define
      separate parameter(s) for these new options when more details on
      the stereo/SWB support are available.

   Encoding considerations:

     This media type is framed and contains binary data; see Section 4.8
     of [RFC4288].

   Security considerations:  See Section 6 of RFC xxxx

   Interoperability considerations:  none

   Published specification:  RFC xxxx

   Applications which use this media type:

     For example Voice over IP, audio and video conferencing, audio
     streaming and voice messaging.

   Additional information:  none

   Person & email address to contact for further information:

     Ari Lakaniemi, ari.lakaniemi@nokia.com

   Intended usage:  COMMON

   Restrictions on usage:

     This media type depends on RTP framing, and hence is only defined
     for transfer via RTP [RFC3550].

   Author:

     Ari Lakaniemi, ari.lakaniemi@nokia.com

   Change controller:

     IETF Audio/Video Transport working group delegated from the IESG


5.2. Mapping to SDP Parameters

   The information carried in the media type specification has a
   specific mapping to fields of the SDP [RFC4566], which is commonly
   used to describe RTP sessions.  When SDP is used to specify sessions
   employing the EV-VBR codec, the mapping is as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype ("EV-VBR") goes in SDP "a=rtpmap" as the
      encoding name.  The RTP clock rate in "a=rtpmap" MUST be 32000 for
      EV-VBR.

      Author's note: The current choice for the RTP clock rate is a
      'placeholder'. The clock rate needs to be set according to SWB
      sampling rate, which is still T.B.D. Since the core codec employs
      16000 Hz sampling rate, an integer multiple of 16000 Hz seems a
      preferable choice.

   o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type string as a semicolon
      separated list of parameter=value pairs.

5.3. Offer/answer considerations

   The following considerations apply when using the SDP offer/answer
   [RFC3264] mechanism to negotiate the EV-VBR transport. The parameter
   "layers" MAY be used to indicate, for each RTP session belonging to
   the current EV-VBR session, the layer configuration that the
   end-point making the offer is ready to transmit and wishes to
   receive.

   o  In case the EV-VBR session consists of a single RTP session, it is
      RECOMMENDED not to impose any layer restrictions for the session
      but to use the rate control functionality to set possible
      restrictions on the usage of the higher layers. If the offer
      includes a layer configuration parameter, the answer MAY use a
      different configuration, but the highest layer in the answer MUST
      NOT be higher than the highest layer of the offered configuration.

      Author's note: Support for answer modifying the layer
      configuration is FFS.

   In case the EV-VBR session consists of multiple RTP sessions, the
   answer MUST use the layer configurations provided in the offer for
   the sessions it accepts.
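
   The two rules above can be sketched as a simple check on the
   answerer's configuration. The data model (a mapping from a session
   identifier to the list of layer numbers signalled for that RTP
   session) and the function name are assumptions of this example; a
   rejected session is represented simply by its absence from the
   answer.

```python
def answer_is_acceptable(offer, answer):
    """Check the layer configuration of an answer against the offer.

    offer and answer map a session identifier to the list of layer
    numbers given in "layers" for that RTP session. The answer holds
    only the sessions it accepts.
    """
    if len(offer) == 1:
        # Single RTP session: the configuration may differ, but the
        # highest answered layer must not exceed the highest offered one.
        (offered,) = offer.values()
        return all(max(layers) <= max(offered) for layers in answer.values())
    # Multiple RTP sessions: accepted sessions keep the offered layers.
    return all(
        sid in offer and sorted(layers) == sorted(offer[sid])
        for sid, layers in answer.items()
    )
```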

5.4. Declarative usage of SDP

   In declarative usage, such as SDP in RTSP [RFC2326] or SAP [RFC2974],
   the parameter "layers" SHALL be interpreted to provide a set of
   layers that the sender may use in the session.
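
   As an illustration, a receiver of such a description could extract
   the set of layers from the "a=fmtp" parameter string roughly as
   follows. The function name is hypothetical, and the fallback to
   layers 1 to 5 reflects the rule in Section 5.1 that all layers up to
   L5 may be used when the parameter is absent.

```python
def parse_layers(fmtp_params):
    """Return the sorted layer numbers from an a=fmtp parameter string.

    fmtp_params is the text following the payload type, for example
    "layers=1,2".
    """
    for item in fmtp_params.split(";"):
        name, _, value = item.strip().partition("=")
        if name.strip().lower() == "layers":
            layers = sorted({int(v) for v in value.split(",")})
            if layers[0] < 1 or layers[-1] > 5:
                raise ValueError("layer numbers must be in the range 1..5")
            return layers
    # Parameter absent: all layers up to L5 may be used.
    return [1, 2, 3, 4, 5]
```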

5.5. SDP examples

   Some example SDP session descriptions utilizing EV-VBR encodings are
   provided below.

   The first example illustrates the simple case where an EV-VBR session
   employing a single RTP session and the AVPF profile is offered, and
   the answer accepts the offer without any changes.

   Offer:

     m=audio 49120 RTP/AVPF 97
     a=rtpmap:97 EV-VBR/32000/1

   Answer:

     m=audio 49120 RTP/AVPF 97
     a=rtpmap:97 EV-VBR/32000/1

   The second example shows a slightly more complex case where an EV-VBR
   session using a single RTP session and the AVPF profile is offered
   with a restriction to send and receive only layers L1 and L2. The
   answer indicates that the other end-point is willing to receive (and
   send) layers up to L5.

   Offer:

     m=audio 49120 RTP/AVPF 97
     a=rtpmap:97 EV-VBR/32000/1
     a=fmtp:97 layers=1,2

   Answer:

     m=audio 49120 RTP/AVPF 97
     a=rtpmap:97 EV-VBR/32000/1
     a=fmtp:97 layers=1,2,3,4,5

   The third example shows an EV-VBR session using multiple RTP sessions
   with the AVPF profile. The answerer wishes to use only layers up to
   L3.

   Offer:

     m=audio 49120 RTP/AVPF 97
     a=rtpmap:97 EV-VBR/32000/1
     a=fmtp:97 layers=1,2
     a=mid:1

     m=audio 49122 RTP/AVPF 98
     a=rtpmap:98 EV-VBR/32000/1
     a=fmtp:98 layers=3
     a=mid:2
     a=depend:lay 1

     m=audio 49124 RTP/AVPF 99
     a=rtpmap:99 EV-VBR/32000/1
     a=fmtp:99 layers=4,5
     a=mid:3
     a=depend:lay 1 2

   Answer:

     m=audio 49120 RTP/AVPF 97
     a=rtpmap:97 EV-VBR/32000/1
     a=fmtp:97 layers=1,2
     a=mid:1

     m=audio 49122 RTP/AVPF 98
     a=rtpmap:98 EV-VBR/32000/1
     a=fmtp:98 layers=3
     a=mid:2
     a=depend:lay 1

   Note that the dependency signaling according to [smd-sdp] is used in
   the third example above to indicate the relationship between the
   layers distributed into separate RTP sessions.
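
   The mapping rules of Section 5.2 that lead to media descriptions like
   those in the examples can be sketched as follows. The function name
   and its arguments are hypothetical; the clock rate of 32000 and the
   channel count of 1 are taken from the examples in this section.
   Grouping and dependency attributes are left out of the sketch.

```python
def media_description(port, pt, layers=None, ptime=None, maxptime=None,
                      profile="RTP/AVPF"):
    """Return the SDP lines describing one EV-VBR RTP session."""
    lines = [
        f"m=audio {port} {profile} {pt}",
        f"a=rtpmap:{pt} EV-VBR/32000/1",
    ]
    if layers is not None:
        # Remaining media type parameters go into a=fmtp.
        layer_list = ",".join(str(n) for n in layers)
        lines.append(f"a=fmtp:{pt} layers={layer_list}")
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines
```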

6. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC3550], and in any appropriate RTP profile (for
   example [RFC3551] or [RFC4585]).  This implies that confidentiality
   of the media streams is achieved by encryption; for example, through
   the application of SRTP [RFC3711].  Because the data compression used
   with this payload format is applied end-to-end, any encryption needs
   to be performed after compression.

   A potential denial-of-service threat exists for data encodings using
   compression techniques that have non-uniform receiver-end
   computational load.  The attacker can inject pathological datagrams
   into the stream that will increase the processing load of the decoder
   and may cause the receiver to be overloaded. For example, inserting
   additional EDUs representing the higher enhancement layers on top of
   the ones actually transmitted may increase the decoder load. However,
   the EV-VBR codec is not particularly vulnerable to such an attack,
   since the majority of the computational load in an EV-VBR session is
   associated with the encoder.  Another possible form of attack is the
   forging of codec bit-rate control messages, which may cause the
   encoder to employ a higher number of enhancement layers than
   originally intended and thereby to require a larger amount of
   computational resources. Therefore, the usage of data origin
   authentication and data integrity protection of at least the RTP
   packet is RECOMMENDED; for example, with SRTP [RFC3711].

   Note that the appropriate mechanism to ensure confidentiality and
   integrity of RTP packets and their payloads is very dependent on the
   application and on the transport and signaling protocols employed.
   Thus, although SRTP is given as an example above, other possible
   choices exist.

   Note that end-to-end security with either authentication, integrity
   or confidentiality protection will prevent a network element not
   within the security context from performing media-aware operations
   other than discarding complete packets.  To allow any (media-aware)
   intermediate network element to perform its operations, it is
   required to be a trusted entity which is included in the security
   context establishment.

7. Congestion control

   As a scalable codec, EV-VBR implicitly provides a means for
   congestion control by making it possible to 'thin' the bitstream.
   The RTP payload format according to this specification provides
   several different means for reducing the EV-VBR session bandwidth.
   The most appropriate mechanism (in terms of impact on the user
   experience) depends on the employed payload structure and also on the
   employed session configuration (single RTP session or multiple RTP
   sessions). The following means (in no particular order) can be used
   to assist congestion control procedures, either by the sender or by
   an intermediate node.

   o  The transport blocks carrying the EDUs representing the highest
      layers within the payload are dropped.

   o  The payloads carrying the EDUs representing the highest layers in
      an EV-VBR session are dropped.

   o  Transport blocks or payloads carrying EDUs belonging to redundant
      frames included in the payload are dropped.
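
   The transport-block-level options listed above can be sketched as a
   filter over the transport blocks of one payload. The dictionary keys
   used to describe a block ("lowest", "highest", "redundant", "data")
   are hypothetical and stand in for whatever a real implementation
   derives from the payload header; the sketch does not model the
   removal of individual EDUs from inside a transport block.

```python
def thin_payload(blocks, max_layer, drop_redundant=False):
    """Drop whole transport blocks to reduce the payload size.

    blocks is a list of dicts with the hypothetical keys "lowest" and
    "highest" (layer numbers carried), "redundant" (bool) and "data".
    """
    kept = []
    for block in blocks:
        if drop_redundant and block["redundant"]:
            continue
        # Only blocks lying entirely above max_layer are dropped.
        if block["lowest"] > max_layer:
            continue
        kept.append(block)
    return kept
```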

8. IANA Considerations

   IANA is kindly requested to register a media type for the EV-VBR
   codec for RTP transport, as specified in section 5.1 of this
   document.

APPENDIX A: Payload examples

   The EV-VBR payload structure enables flexible transport either by
   carrying all layers in the same payload or separating the layers into
   separate payloads. The following subsections illustrate different
   possibilities for transport by simple examples. Note that examples do
   not show the full payload structure to keep the illustration simple.

A.1. Simple payload examples

A.1.1. All the layers in the same payload

   The illustration below shows layers L1-L3 from two encoded frames
   encapsulated into separate payloads, each using a single transport
   block.

    +-------+--------+-----+------+------+------+
    | RTP1  | L-ID=3 |NF=0 |F1-L1 |F1-L2 |F1-L3 |
    +-------+--------+-----+------+------+------+

    +-------+--------+-----+------+------+------+
    | RTP2  | L-ID=3 |NF=0 |F2-L1 |F2-L2 |F2-L3 |
    +-------+--------+-----+------+------+------+


   In case the same layers from two input frames are encapsulated into
   one payload using a single transport block, the structure is as shown
   below.

    +-------+--------+-----+------+------+------+------+------+------+
    | RTP1  | L-ID=3 |NF=1 |F1-L1 |F2-L1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
    +-------+--------+-----+------+------+------+------+------+------+
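
   The ordering of the EDUs inside a transport block that carries
   several frames (all frames of the lowest layer first, then all frames
   of the next layer, and so on) can be sketched as below. Reading NF as
   the number of frames minus one is inferred from the examples in this
   appendix, and the function name is hypothetical.

```python
def edu_order(nf, layers):
    """Return EDU labels in transmission order for one transport block.

    nf is the NF field value (number of frames minus one, as the
    examples suggest) and layers the layer numbers the block carries.
    """
    frames = range(1, nf + 2)
    # Layer-major order: every frame of a layer before the next layer.
    return [f"F{frame}-L{layer}" for layer in layers for frame in frames]
```

   For two frames and layers L1-L3, edu_order(1, [1, 2, 3]) lists F1-L1,
   F2-L1, F1-L2, F2-L2, F1-L3 and F2-L3, in that order.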


   The third example illustrates the case where the layers L1-L3 from
   two input frames are encapsulated into one payload using two separate
   transport blocks, the first one carrying L1 and the other one
   containing L2 and L3.

    +-------+--------+-----+------+------+
    | RTP1  | L-ID=1 |NF=1 |F1-L1 |F2-L1 |
    +-------+--------+-----+------+------+------+------+
            | L-ID=7 |NF=1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
            +--------+-----+------+------+------+------+

A.1.2. Layers in separate RTP streams

   In this case the data for each layer is transmitted in its own
   payload.

   In the first example each transport block including a single EDU is
   carried in its own RTP payload.

    +-------+--------+-----+-----+    +-------+--------+-----+-----+
    | RTP1a | L-ID=1 |NF=0 |F1-L1|    | RTP1b | L-ID=6 |NF=0 |F1-L2|
    +-------+--------+-----+-----+    +-------+--------+-----+-----+

    +-------+--------+-----+-----+    +-------+--------+-----+-----+
    | RTP1c |L-ID=10 |NF=0 |F1-L3|    | RTP2a | L-ID=1 |NF=0 |F2-L1|
    +-------+--------+-----+-----+    +-------+--------+-----+-----+

    +-------+--------+-----+-----+    +-------+--------+-----+-----+
    | RTP2b | L-ID=6 |NF=0 |F2-L2|    | RTP2c |L-ID=10 |NF=0 |F2-L3|
    +-------+--------+-----+-----+    +-------+--------+-----+-----+


   If the payloads carry data from two consecutive input frames, the
   same encoded data as in the previous example is arranged as follows.

    +-------+--------+-----+-----+-----+
    | RTP1a | L-ID=1 |NF=1 |F1-L1|F2-L1|
    +-------+--------+-----+-----+-----+

    +-------+--------+-----+-----+-----+
    | RTP1b | L-ID=6 |NF=1 |F1-L2|F2-L2|
    +-------+--------+-----+-----+-----+

    +-------+--------+-----+-----+-----+
    | RTP1c |L-ID=10 |NF=1 |F1-L3|F2-L3|
    +-------+--------+-----+-----+-----+


A.2. Advanced examples

A.2.1. Different update rate for subset of layers

   The example below employs different update rates (i.e. different
   numbers of frames per packet) for selected subsets of layers. All
   core codec layers L1-L5 are shown.

    +-------+--------+-----+-----+-----+-----+-----+
    | RTP1  | L-ID=1 |NF=3 |F1-L1|F2-L1|F3-L1|F4-L1|
    +-------+--------+-----+-----+-----+-----+-----+

    +-------+--------+-----+-----+-----+-----+-----+
    | RTP2a | L-ID=7 |NF=1 |F1-L2|F2-L2|F1-L3|F2-L3|
    +-------+--------+-----+-----+-----+-----+-----+

    +-------+--------+-----+-----+-----+
    | RTP3a |L-ID=14 |NF=0 |F1-L4|F1-L5|
    +-------+--------+-----+-----+-----+

    +-------+--------+-----+-----+-----+
    | RTP3b |L-ID=14 |NF=0 |F2-L4|F2-L5|
    +-------+--------+-----+-----+-----+

    +-------+--------+-----+-----+-----+-----+-----+
    | RTP2b | L-ID=7 |NF=1 |F3-L2|F4-L2|F3-L3|F4-L3|
    +-------+--------+-----+-----+-----+-----+-----+

    +-------+--------+-----+-----+-----+
    | RTP3c |L-ID=14 |NF=0 |F3-L4|F3-L5|
    +-------+--------+-----+-----+-----+

    +-------+--------+-----+-----+-----+
    | RTP3d |L-ID=14 |NF=0 |F4-L4|F4-L5|
    +-------+--------+-----+-----+-----+


A.2.2. Redundant frames with limited set of layers

   An example transmitting layers L1-L3 as primary data and L1 (of the
   previous frame) as redundant data is shown below. Each payload
   carries one primary (i.e. new) frame in one transport block and one
   redundant frame, which in this example is the frame preceding the
   primary frame, in another transport block.

    +-------+--------+-----+-----+--------+-----+-----+-----+-----+
    | RTP1  | L-ID=1 |NF=0 |F0-L1| L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3|
    +-------+--------+-----+-----+--------+-----+-----+-----+-----+

    +-------+--------+-----+-----+--------+-----+-----+-----+-----+
    | RTP2  | L-ID=1 |NF=0 |F1-L1| L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3|
    +-------+--------+-----+-----+--------+-----+-----+-----+-----+

    +-------+--------+-----+-----+--------+-----+-----+-----+-----+
    | RTP3  | L-ID=1 |NF=0 |F2-L1| L-ID=3 |NF=0 |F3-L1|F3-L2|F3-L3|
    +-------+--------+-----+-----+--------+-----+-----+-----+-----+
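
   The packetization pattern of the figure above, with the redundant
   transport block placed before the primary one, can be generated with
   a short sketch such as the following. The function name and the
   representation of a payload as a pair of EDU label lists are
   assumptions of this example.

```python
def payloads_with_redundancy(num_frames, primary_layers=(1, 2, 3),
                             redundant_layers=(1,)):
    """List the EDUs of each payload: redundant block, then primary block.

    Payload n carries the redundant layers of frame n-1 and the primary
    layers of frame n, for n = 1..num_frames.
    """
    payloads = []
    for n in range(1, num_frames + 1):
        redundant = [f"F{n - 1}-L{layer}" for layer in redundant_layers]
        primary = [f"F{n}-L{layer}" for layer in primary_layers]
        payloads.append((redundant, primary))
    return payloads
```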


   Alternatively, a payload that also carries redundant data for a
   subset of layers can be arranged differently, as shown in the example
   below.

    +-------+--------+-----+-----+-----+-----+--------+-----+-----+
    | RTP1  | L-ID=3 |NF=0 |F0-L1|F0-L2|F0-L3| L-ID=1 |NF=0 |F1-L1|
    +-------+--------+-----+-----+-----+-----+--------+-----+-----+

    +-------+--------+-----+-----+-----+-----+--------+-----+-----+
    | RTP2  | L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| L-ID=1 |NF=0 |F2-L1|
    +-------+--------+-----+-----+-----+-----+--------+-----+-----+

    +-------+--------+-----+-----+-----+-----+--------+-----+-----+
    | RTP3  | L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| L-ID=1 |NF=0 |F3-L1|
    +-------+--------+-----+-----+-----+-----+--------+-----+-----+


   Now the first transport block carries the primary data and the second
   transport block carries the redundant data, which in this case covers
   the frame following the primary frame. The benefit of this approach
   is that the redundant data is included in the last (secondary)
   transport block of the payload, which might be beneficial for
   possible payload scaling operation within the network.

9. References

9.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550] Schulzrinne, H., Casner, S., Frederick, R. and Jacobson,
             V., "RTP: A Transport Protocol for Real-Time Applications",
             STD 64, RFC 3550, July 2003.

   [ev-vbr]  ITU-T Recommendation G.xxx

   [amr-wb]  3GPP TS 26.171, "Adaptive Multi-Rate Wideband (AMR-WB)
             speech codec; General description (Release 7)", v7.0.0,
             September 2006.

   [RFC4867] Sjoberg, J., Westerlund, M., Lakaniemi, A., Xie, Q., "RTP
             Payload Format and File Storage Format for the Adaptive
             Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB)
             Audio Codecs", RFC 4867, April 2007.

   [ccm]     Wenger, S., Chandra, U., Westerlund, M., Burman, B., "Codec
             Control Messages in the RTP Audio-Visual Profile with
             Feedback (AVPF)", draft-ietf-avt-avpf-ccm-10 (work in
             progress), October 2007.

   [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
             "Extended RTP Profile for Real-Time Transport Control
             Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
             2006.

   [RFC4566] Handley, M., Jacobson, V. and Perkins, C., "SDP: Session
             Description Protocol", RFC 4566, July 2006.

   [RFC4288] Freed, N., Klensin, J., "Media Type Specifications and
             Registration Procedures", BCP 13, RFC 4288, December 2005.

   [RFC4855] Casner, S., "Media Type Registration of RTP Payload
             Formats", RFC 4855, February 2007.

   [RFC3264] Rosenberg, J., Schulzrinne, H., "An Offer/Answer Model with
             Session Description Protocol (SDP)", RFC 3264, June 2002.

   [smd-sdp] Schierl, T., Wenger, S., "Signaling media decoding
             dependency in Session Description Protocol (SDP)", draft-
             schierl-mmusic-layered-codec-04 (work in progress), June
             2007.

   [RFC3551] Schulzrinne, H., Casner, S., "RTP Profile for Audio and
             Video Conferences with Minimal Control", STD 65, RFC 3551,
             July 2003.

   [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., Norrman,
             K., "The Secure Real-Time Transport Protocol (SRTP)", RFC
             3711, March 2004.

9.2. Informative References

   [topo]    Westerlund, M., Wenger, S., "RTP Topologies", draft-ietf-
             avt-topologies-07 (work in progress), October 2007.

   [RFC2326] Schulzrinne, H., Rao, A., Lanphier, R., "Real Time
             Streaming Protocol (RTSP)", RFC 2326, April 1998.

   [RFC2974] Handley, M., Perkins, C., Whelan, E., "Session Announcement
             Protocol", RFC 2974, October 2000.

Authors' Addresses

   Ari Lakaniemi
   Nokia
   P.O.Box 407
   FIN-00045 Nokia Group, FINLAND

   Phone: +358-71-8008000
   Email: ari.lakaniemi@nokia.com


   Ye-Kui Wang
   Nokia Research Center
   P.O. Box 1000
   33721 Tampere
   Finland

   Phone: +358-50-466-7004
   EMail: ye-kui.wang@nokia.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.