Internet Engineering Task Force             Audio Visual Transport WG
   INTERNET_DRAFT                                 C.Guillemot, P.Christ,
                                                    S.Wesner, A. Klemets
   draft-guillemot-genrtp-01.txt         INRIA / Univ. Stuttgart - RUS /
                                                               Microsoft
                                                           June, 25 1999
                                              Expires: December, 24 1999




                     RTP Payload Format for MPEG-4
                                  with
                 Scaleable & Flexible Error Resiliency


                            STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


                                 Abstract

   This document describes a payload format that can be used for the
   transport of both MPEG-4 Elementary Streams (ES) and MPEG-4 Sync
   Layer packet streams in RTP [1] packets.  The payload format allows
   for protection against loss in a generic way, through fragmentation,
   grouping and extension data mechanisms, which can dynamically adapt
   to network conditions.  These mechanisms can operate on both full
   and partial MPEG-4 Access Units, such as Sync Layer packets or typed
   "segments".  They can cover a broad range of protection schemes and
   avoid extra connection management complexity - e.g. for separate FEC
   channels - in MPEG-4 applications with a high number of streams.







C.Guillemot, P.Christ, S.Wesner, A. Klemets                   [Page 1]


INTERNET-DRAFT      draft-guillemot-genrtp-01.txt       June 25, 1999

                             Table of Contents


    1      Introduction..............................................3
    2      MPEG-4 overview...........................................4
    3      Design Considerations.....................................5
    4      Payload Format specification..............................7
    4.1    RTP Header Usage..........................................7
    4.2    Payload Header............................................8
    5      Examples of payload headers..............................10
    5.1    The payload contains Extension data followed by one object
           containing AU data.......................................10
    5.2    The payload contains Extension data followed by 2 AUs....11
    6      Usage of Extension data field for redundant data.........12
    7      Extension data field for FEC data........................12
    7.1    Extension data field for Parity Codes....................12
    7.2    Extension data field for RS-based Unequal Error Protection
           of typed segments........................................15
    8      Multiplexing.............................................16
    9      Security Considerations..................................16
    10     Authors Addresses........................................17
    11     References...............................................18

                              List of Figures


    Figure 1  Architecture..........................................6
    Figure 2  Example of ESI........................................6
    Figure 3  Sample RTP payload, using the payload format..........8
    Figure 4  Portrait of the unified approach for transport of ES and
              SL packetized streams................................10
    Figure 5  Sample RTP payload for SL-PDU transport..............10
    Figure 6  RTP Payload Example 1................................11
    Figure 7  RTP Payload Example 2................................11
    Figure 8  FEC Header for Parity Codes..........................13
    Figure 9  Simplified FEC Header for Parity Codes  (with default
              masks)...............................................14
    Figure 10 FEC Header for Reed-Solomon Codes....................15
    Figure 11 Example of Interleaving (for P=7)....................15
    Figure 12 Example of data organization for RS-based UEP........16
















   1     Introduction

   This document is motivated by the large variety of MPEG-4 compressed
   streams, and by the large variety of error control mechanisms that
   can be applied to them.  In addition to providing a single payload
   format for both MPEG-4 Elementary Streams (ES) and Synchronization
   Layer packet streams (SL-PDU streams), another motivation is
   flexibility in associating error control mechanisms with the
   compressed media streams.  The error control mechanisms can be
   dynamically adapted to network characteristics and to different
   types of stream segments.  They can evolve without requiring the
   definition of a new payload format.

   The design of this payload format has been inspired by previous
   proposals for generic payload formats [2-3].  Additionally, it
   attempts to federate different error control approaches under a
   single protocol support mechanism.  The rationale for this payload
   format is as follows:

     - Generality - a unified approach for both MPEG-4 ES and MPEG-4
        sync layer packet streams - with simple fragmentation and
        grouping mechanisms.

     - Protection against packet loss with a generic protocol support.
        The mechanism could also be used for adding protection against
        bit errors, in the case of IP over wireless.  If used for
        protection against packet loss, this in-band mechanism avoids
        the extra connection management complexity that separate FEC
        channels may bring.  Indeed, in MPEG-4 applications, the number
        of streams can potentially be high.

     - Flexible support of a range of error control mechanisms, from
        no protection to FEC and redundant data, which can be adapted
        to network characteristics and applied to typed segments.
        Typed segments are partial Access Units (AUs), i.e. parts of an
        AU that are - in terms of the encoding syntax - syntactically
        and semantically meaningful (cf. [4], 7.2.3, "Such partial AUs
        may have significance for improved error resilience").  Access
        Units are the smallest entities in the bitstream that can be
        attributed individual timestamps.  Redundant data, in the sense
        of [5] or of [6-8] (e.g. in the form of repeated picture
        headers, or of the HEC field of the MPEG-4 video syntax [9]),
        can be supported by a single mechanism.

     - A common solution for "live" and "VOD" (or "pre-recorded") con-
        tent.



   The list of all the protection schemes supported will be announced
   via out-of-band signaling at the beginning of the session, using for
   example SDP [10].  The protection scheme used at a specific instant
   during the session will be signaled via the extension type (XT)
   field in the payload header.

   2     MPEG-4 overview

   An MPEG-4 scene is composed of media objects. The MPEG-4 dynamic-
   scene description framework, which defines the spatio-temporal rela-
   tion of the media objects as well as their contents, is inspired by
   VRML.  The compressed binary representation of the scene description
   is called BIFS (Binary Format for Scenes), [4]. The compressed scene
   description is conveyed through one or more Elementary Streams (ES).

   A compression layer produces the compressed representations of the
   audio-visual objects that will be inserted into the scene.  These
   compressed representations are organized into Elementary Streams
   (ES).  Elementary Stream Descriptors provide information relative to
   the stream, such as the compression scheme used.  Elementary stream
   data is partitioned into Access Units.  The delineation of an Access
   Unit is completely determined by the entity - the compression layer
   - that generates the elementary stream.  An Access Unit is the
   smallest data entity to which timing information can be attributed.
   Two Access Units shall never refer to the same point in time.

   Natural and animated synthetic objects may refer to an Object De-
   scriptor (OD), which points to one or more Elementary Streams that
   carry the coded representation of the object or its animation data.
   An OD serves as a grouping of one or more Elementary Stream
   Descriptors that refer to a single media object.  The OD also
   defines the hierarchical relations and properties of the Elementary
   Stream Descriptors.

   A complete set of ODs can be seen as an MPEG-4 resource or session
   description.  The Object Descriptors are conveyed through one or
   more Elementary Streams.  By conveying the session (or resource) de-
   scription as well as the scene description through their own Elemen-
   tary Streams, it becomes possible to change portions of scenes
   and/or properties of media streams separately and dynamically at
   well-known instants of time.

   The MPEG-4 Systems specification [4] also defines a packetization of
   ES data into access units or parts thereof.  The packets are called
   SL packets, or SL-PDUs.  The resulting sequence of SL packets is
   called the SL-Packetized Stream (SPS).  Access Units are the only
   semantic entities at this layer and their content is opaque.  Pack-
   etization information has to be exchanged between the entity that
   generates an elementary stream and the sync layer.  This relation is
   best described by a conceptual interface between both layers, termed
   the Elementary Stream Interface (ESI).

   An SL packet (SL-PDU) consists of an SL packet header and an SL
   packet payload.  The SL packet header provides means for continuity
   checking in case of data loss and carries the coded representation
   of the time stamps and associated information.  Its syntax is
   configurable to adapt to the needs of different types of elementary
   streams and is defined in the SLConfigDescriptor (as defined in
   [4]).

   An SL-PDU does not contain an indication of its length.  Therefore,
   SL packets must be framed by a suitable lower layer protocol.
   Consequently, an SL-PDU stream is not a self-contained data stream
   that can be stored or decoded without such framing.


   3     Design Considerations

   The design goals of this RTP payload format are to provide the fol-
   lowing:
    - a unified solution, with error protection easily adaptable to
       varying network conditions, for both "live" and "pre-recorded"
       contents.
    - a unified solution for the transport of SL packet streams - with
       a possible 1-to-N mapping - and for the transport of robust ES
       data.

   Figure 1 shows the adopted model. It relies on an optional network
   adaptation layer, which supports protection mechanisms. Ideally,
   this network adaptation layer is both media and network aware.

   The compression layer organizes the ESs in Access Units (AU).  The
   AUs are the smallest entities that can be attributed individual
   timestamps.  The timestamps may be obtained directly, through the
   ESI, with syntax as specified by the SLConfigDescriptor.  If the
   SLConfigDescriptor indicates that timestamps are absent, the time-
   stamps may be obtained indirectly, for example, by using the frame
   rate.

   The compression layer passes full or partial Access Units (i.e.
   typed "segments"), together with indications of AU boundaries,
   random access points and the desired timing information as described
   by the SLConfigDescriptor, either directly to the network adaptation
   layer or indirectly via the sync layer.  It is however preferable,
   for implementation efficiency, to pass the ES data directly to the
   network adaptation layer, i.e. to avoid producing the full SL
   packets.  Partial AUs or typed segments are - in terms of the
   encoding syntax - syntactically and semantically meaningful parts of
   an AU (cf. [4], 7.2.3, "Such partial AUs may have significance for
   improved error resilience").












   ---         ----------------------------------
   |S|        |         Compression Layer        |   Media aware
   |L|        -----------------------------------
   | |                                         |
   |C|           ES Descriptor |               |
   |o|              |----------|---------|     |
   |n|            ES Type   RAP Flag    QoS    |
   |f|              |          |         |     |
   |.| -------------V----------V---------V-----|---- ESI
   |D|                                         |
   |e|        -------------------------------  |
   |s|        |                             |  |
   |c|        | Network Adaptation Layer    |<-O     Network aware
   |r|        | ->Redundancy, FEC |         |  |
   |.|        | |                 |         |  |
   -----------|-+- - - - - - - - -| - - -|  |  |
              --|-----------------|------|---  |
                |                 |      |     |
   -------------|--  -------------V------V-----V------
   | QoS          |  | RTP | | Ext.  | |"SL" | Media |
   | monitoring   |  | Hdr.| | Data= | |     |       |
   ----------------  |     | | e.g.  | |     |       |
                     |     | | FEC   | |     |       |
                     ---------------------------------
                         Figure 1   Architecture


   Figure 2 lists parameters that should be passed along with the ES
   data.  The SLConfigDescriptor indicates the presence or absence of
   each parameter.  When any of these parameters are present, then the
   adaptation layer will directly produce the "stripped down" SL header
   to be inserted in the payload of the RTP packet.

   Note that the normative behavior is assured by the
   SLConfigDescriptor, which is visible in the compression layer.


                   DTS: Decoding Time Stamp
                   CTS: Composition Time Stamp
                   OCR: Object Clock Reference
                   IdleFlag
                   loop(randomAccess Flag
                        AUStartFlag
                        AUEndFlag
                        Esdata
                        dataLength
                        degradationPriority
                        segmentType )

                        Figure 2   Example of ESI.


   The payload format also specifies a mechanism for grouping an AU, a
   partial AU or an SL-PDU together with protection data (FEC,
   redundant data).  This mechanism makes it possible to adapt the
   protection of the different typed segments, or SL-PDUs, to varying
   network conditions during the session, as well as to a degradation
   priority indicated by the SLConfigDescriptor.  The grouping
   mechanism can be used for grouping SL-PDUs with different SL header
   parameters (CTS, DTS, etc.).  This mechanism also allows several
   AUs, with possibly non-monotonically increasing time stamps, to be
   grouped in a single packet.  The grouping mechanism can also be used
   for grouping low bit rate data streams with low delay requirements,
   such as facial animation parameters.  The mechanism can also be used
   for interleaving data in order to increase the error resiliency.

   Consecutive segments (e.g. video packets [9]) of the same type will
   be packed consecutively in the same RTP payload without using the
   grouping mechanism.  The grouping mechanism will be used to group
   partial AUs (or typed segments) of different types only if Unequal
   Error Protection (UEP) is used (see section 7.2).

   The payload format also supports a fragmentation mechanism where the
   full AUs or the partial AUs passed by the compression layer are
   fragmented at arbitrary boundaries.  This may result in fragments
   that are not independently decodable.  This kind of fragmentation
   may be used in situations when the RTP packets are not allowed to
   exceed the path-MTU size.  However, this media-unaware fragmentation
   is not recommended.  It is preferable that the compression layer
   provides partial AUs, in the form of typed segments, of a size small
   enough so that the resulting RTP packet can fit the MTU size.  Note
   that passing partial AUs of small size will also facilitate conges-
   tion and rate control based on the real output buffer management.
   RTP packets that transport fragments belonging to the same AU will
   have their RTP timestamp set to the same value.
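
   As an illustration of the media-unaware fragmentation described
   above, the following Python sketch splits one typed segment (or AU)
   at arbitrary byte boundaries.  The function name, the size limit
   parameter and the returned tuple layout are hypothetical and are not
   part of this payload format.  Each tuple carries the byte offset of
   the fragment within the segment (the value that a 16-bit fragment
   offset field could carry), the fragment bytes, and a flag telling
   whether this is the last fragment.

```python
from typing import Iterator, Tuple

MAX_OFFSET = 0xFFFF  # assumption: the offset must fit in a 16-bit field


def fragment_segment(
    segment: bytes, max_fragment_size: int
) -> Iterator[Tuple[int, bytes, bool]]:
    """Yield (byte_offset, fragment, is_last) for one typed segment."""
    if max_fragment_size <= 0:
        raise ValueError("max_fragment_size must be positive")
    for offset in range(0, len(segment), max_fragment_size):
        if offset > MAX_OFFSET:
            raise ValueError("fragment offset does not fit in 16 bits")
        chunk = segment[offset:offset + max_fragment_size]
        # The last fragment is the one that reaches the end of the segment.
        is_last = offset + len(chunk) == len(segment)
        yield offset, chunk, is_last
```

   A sender would typically choose max_fragment_size from the path MTU
   minus the RTP and payload header sizes, and would give every packet
   produced from the same AU the same RTP timestamp.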

   The protocol support for fragmentation and grouping is inspired by
   [2-3], with an attempt at simplification.


   4     Payload Format specification

   The packet will consist of an RTP header followed by possibly
   multiple payloads.

   4.1  RTP Header Usage

   Each RTP packet starts with a fixed RTP header. The following fields
   of the fixed RTP header are used:

   - Marker bit (M bit): The marker bit of the RTP header is set to 1
     when the current packet carries the end of an Access Unit (AU), or
     the last fragment of an AU.

   - Payload Type (PT): The payload type shall be set either to a value
     assigned to this format or to a payload type chosen in the dynamic
     range.


   - Timestamp: The RTP timestamp encodes the presentation time of the
     first AU contained in the packet. The RTP timestamp may be the
     same on successive packets if an AU occupies more than one packet.
     If the packet contains only "extension" data objects (see below),
     then the RTP timestamp is set to the presentation time of the AU
     to which the first extension data object (e.g. FEC or redundant
     data) applies.

   The RTP timestamp is set to the composition timestamp (CTS), if its
   presence is indicated by the SLConfigDescriptor, and if its length
   is not more than 32 bits.  Otherwise, the RTP timestamp should be
   set to the sampling instant of the first AU contained in the packet.

   - SSRC: A mapping between the ES identifiers and the SSRCs should be
     provided via out-of-band signaling (e.g. SDP).

   4.2  Payload Header

   The payload header is always present, with a variable length, and is
   defined as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|    XT     |        LENGTH                 | TSOFFSET      .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .TSOFFSET(cont) |     Extension Data                            .
   +-+-+-+-+-+-+-+-+                               +-+-+-+-+-+-+-+-+
   .     Extension Data (continued)                |G|E|F|  res    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             LENGTH            |               FOFFSET         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               .
   .                  Media Payload                                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 3   Sample RTP payload, using the payload format.


   G (Group) (1 bit): If this field is 1, it indicates that the object
   associated with the current header is followed by another object.

   E (Extension) (1 bit): If its value is 1, then the next object
   contains Extension data. If its value is 0, then the next object
   contains AU data (a full AU or a partial AU, i.e. a typed segment).

   F (Fragmentation) (1 bit): This field is only present when the E
   field is 0. If its value is 1, then the next object is a fragment of
   a typed segment.  If this field is 0, then the next object is a
   complete typed segment or a complete AU.

   res (Reserved) (5 bits): This field is only present if the E field
   is 0.  The first byte of a payload header is thus always either
   {G,E=1,XT} or {G,E=0,F,res}.


   XT (Extension type) (6 bits):  This field is only present if E is
   set to 1. It then specifies the type of extension data. Examples of
   types are FEC data, with the specification of the FEC coding scheme
   (parity codes, block codes such as Reed-Solomon codes, etc.),
   redundant data, with the specification of the redundant data
   encoding scheme, and duplicated high priority data (e.g. headers).

   LENGTH (16 bits): this field specifies the length in bytes of the
   next object. If the object is the last object of the payload (G=0)
   then this field is not present.

   FOFFSET (16 bits): This field is present only when the F field is
   present and F=1. It contains the byte offset of the first byte of
   the fragment from the beginning of the typed segment to which the
   fragment belongs.  This field should rarely be present.

   TSOFFSET (Time Stamp OFFSET) (16 bits): The value of the field is an
   unsigned 16 bit integer. The default value is 0. If the E field is
   "1", then the next object carries extension data, and the TSOFFSET
   added to the value of the RTP timestamp yields the presentation time
   of the AU to which the extension data apply. In this case, the
   TSOFFSET is set to the difference between the media TS and the TS of
   the media to which the extension data apply. If the E field is "0",
   then the next object contains AU data. If this object is not the
   first object in the payload containing AU data, then the TSOFFSET
   added to the value of the RTP timestamp yields the presentation time
   of the following AU data. If this object is the first in the payload
   containing AU data (even if it has been preceded by extension data),
   then this field is not present.  Note that the TSOFFSET is also
   useful for grouping AUs with non-monotonically increasing Time
   Stamps, as well as for data interleaving.
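
   The field definitions above can be illustrated by a minimal Python
   sketch that walks the objects of one RTP payload.  It is not a
   normative parser and relies on several assumptions that this
   document does not state: G is taken to be the most significant bit
   of the first header byte, multi-byte fields are read in network byte
   order, LENGTH counts only the object bytes that follow the header,
   and, when an AU object carries a TSOFFSET, it is assumed to follow
   LENGTH and to precede FOFFSET.  The names PayloadObject and
   parse_payload are hypothetical.

```python
import struct
from typing import List, NamedTuple, Optional


class PayloadObject(NamedTuple):
    is_extension: bool        # E flag
    xt: Optional[int]         # extension type, only when E=1
    is_fragment: bool         # F flag, only meaningful when E=0
    foffset: Optional[int]    # only when F=1
    tsoffset: Optional[int]   # absent for the first AU object
    data: bytes


def parse_payload(payload: bytes) -> List[PayloadObject]:
    """Split an RTP payload into its objects (sketch, see assumptions)."""
    objects: List[PayloadObject] = []
    pos = 0
    seen_au = False
    while True:
        first = payload[pos]
        pos += 1
        g = first >> 7
        e = (first >> 6) & 1
        xt = (first & 0x3F) if e else None
        f = bool((first >> 5) & 1) if not e else False

        length = None
        if g:  # LENGTH is absent for the last object (G=0)
            (length,) = struct.unpack_from("!H", payload, pos)
            pos += 2

        tsoffset = None
        if e or seen_au:  # absent only for the first AU object
            (tsoffset,) = struct.unpack_from("!H", payload, pos)
            pos += 2

        foffset = None
        if f:
            (foffset,) = struct.unpack_from("!H", payload, pos)
            pos += 2

        end = pos + length if length is not None else len(payload)
        objects.append(
            PayloadObject(bool(e), xt, f, foffset, tsoffset, payload[pos:end])
        )
        pos = end
        if not e:
            seen_au = True
        if not g:
            return objects
```

   A truncated payload makes this sketch raise IndexError or
   struct.error; a real receiver would need explicit bounds checks.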


   Media payload:

   If the presence of the DTS - Decoding Time Stamp - is indicated by
   the SLConfigDescriptor, then the DTS value is placed as the first
   data of the media payload, the length of the field being provided by
   the SLConfigDescriptor.

   If the presence of the OCR - Object Clock Reference - is indicated
   by the SLConfigDescriptor, then the OCR value is placed as the sec-
   ond field of the media payload, the length of the field being pro-
   vided by the SLConfigDescriptor.

   If the payload format is used to accommodate SL-packet streams, the
   sequence number (SN), if present, can be placed as the third field
   of the media payload.  Corresponding length values are provided by
   the SLConfigDescriptor.





   If the resulting optional parameters consume a non-integer number of
   bytes, zero padding bits must be inserted at the end of these pa-
   rameters to byte-align the rest of the payload.
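
   The byte-alignment rule can be sketched as follows in Python.  The
   function name is hypothetical, and the sketch assumes that the
   optional parameters are written most significant bit first, in the
   order given above (DTS, OCR, SN), each with the bit length announced
   by the SLConfigDescriptor.

```python
from typing import Iterable, Tuple


def pack_optional_parameters(fields: Iterable[Tuple[int, int]]) -> bytes:
    """Pack (value, bit_length) pairs MSB first and zero-pad to a byte."""
    acc = 0
    nbits = 0
    for value, bits in fields:
        if value < 0 or value >> bits:
            raise ValueError("value does not fit in the given bit length")
        acc = (acc << bits) | value
        nbits += bits
    pad = (-nbits) % 8          # number of zero padding bits, 0..7
    acc <<= pad
    return acc.to_bytes((nbits + pad) // 8, "big")
```

   For instance, a 3-bit field followed by a 1-bit field occupies 4
   bits, so 4 zero bits are appended and the result is a single byte.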

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Payload Header | Optional Extension| Opt. parameters | Media   |
   |               | data              | as indicated by |.........|
   |               |                   | SLConfigDesc    | payload |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 4   Portrait of the unified approach for transport of ES and
                          SL packetized streams.


   In scenarios where the sync layer is used without a need for further
   protection, the payload will be as illustrated in Figure 5.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|F|  res    | optional SL header parameters as indicated by .
   +-+-+-+-+-+-+-+-+    the SLConfigDescriptor                     .
   |                                                               .
   .                  Media payload                                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 5   Sample RTP payload for SL-PDU transport.


   5     Examples of payload headers


   5.1  The payload contains Extension data followed by one object
         containing AU data

   First payload header:  G=1, E=1, so F not present, FOFFSET
                          not present;
   Second payload header: G=0, E=0, F=0, XT not present, res present,
                          FOFFSET not present (F=0).
                          last object (G=0) containing AU data in the
                          payload, so the length field is not
                          present.











    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|  X T      |            LENGTH             |  TSOFFSET     .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .TSOFFSET(cont) |          Extension Data                       .
   +-+-+-+-+-+-+-+-+                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|F|  res    |                                               .
   +-+-+-+-+-+-+-+-+                                               .
   .                                                               .
   .                              AU data                          .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 6   RTP Payload Example 1.
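
   The byte layout of Figure 6 can be reproduced with the following
   Python sketch.  The function name and the XT value used in the
   usage note are hypothetical; the sketch further assumes that G is
   the most significant bit of the header byte, that LENGTH and
   TSOFFSET are written in network byte order, and that LENGTH counts
   the bytes of the extension data only.

```python
import struct


def build_example_1(
    xt: int, tsoffset: int, extension_data: bytes, au_data: bytes
) -> bytes:
    """Build: extension object (G=1, E=1) followed by one AU object."""
    if not 0 <= xt < 64:
        raise ValueError("XT is a 6-bit field")
    # First payload header: G=1, E=1, XT, then LENGTH and TSOFFSET.
    first_header = bytes([0x80 | 0x40 | xt]) + struct.pack(
        "!HH", len(extension_data), tsoffset
    )
    # Second payload header: G=0, E=0, F=0, res=0. It is the last
    # object and the first AU object, so LENGTH, FOFFSET and TSOFFSET
    # are all absent.
    second_header = bytes([0x00])
    return first_header + extension_data + second_header + au_data
```

   With xt=1, tsoffset=0, three bytes of extension data and a six-byte
   AU, the resulting payload is 5 + 3 + 1 + 6 = 15 bytes long.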


   5.2  The payload contains Extension data followed by 2 AUs


   First payload header:  G=1, E=1, so F field not present
   Second payload header: G=1, E=0, F=0, XT not present, res present,
                          first object containing AU data in the
                          payload, so TSOFFSET is not present.
   Third payload header:  G=0, E=0, F=0, XT field not present,
                          Last object in the payload, so LENGTH
                          field not present

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|    X T    |            LENGTH             |    TSOFFSET   .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .TSOFFSET(cont) |          Extension Data                       .
   +-+-+-+-+-+-+-+-+                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|F|   res   |           LENGTH              |               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               .
   .                                                               .
   .                              AU data                          .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|E|F|   res   |                                               .
   +-+-+-+-+-+-+-+-+                                               .
   .                                                               .
   .                              AU data                          .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 7   RTP Payload Example 2



   6     Usage of Extension data field for redundant data

   All AU-level decoder configuration information can be considered as
   information of high priority, since, if lost, the whole AU is lost.
   In addition, it does not tolerate increased latency.

   The extension data field may hence contain duplicated data (e.g. du-
   plicated headers) in "n" consecutive packets. The parameter "n" may
   be chosen so that the probability that "n" consecutive packets are
   lost is below a given threshold.  But these decision mechanisms are
   outside the scope of this document.
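
   Purely as an illustration of how "n" might be chosen, the following
   Python sketch assumes that packet losses are independent with a
   fixed probability p, so that "n" consecutive packets are all lost
   with probability p^n.  This loss model and the function name are
   assumptions of the sketch, not part of this document; losses on
   real networks are often bursty, in which case a larger "n" would be
   needed.

```python
def min_repetitions(loss_probability: float, threshold: float) -> int:
    """Smallest n such that loss_probability ** n is below threshold."""
    if not 0.0 <= loss_probability < 1.0:
        raise ValueError("loss_probability must be in [0, 1)")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    n = 1
    # Under the independence assumption, p ** n is the probability
    # that all n copies are lost.
    while loss_probability ** n >= threshold:
        n += 1
    return n
```

   For example, with a loss probability of 0.5 and a threshold of 0.1,
   four consecutive packets have to carry the duplicated data.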

   As a special type of FEC, it has been proposed in [5] to use a
   lower-rate secondary encoding of the media data to be protected.
   The mechanism described above is directly usable for the transport
   of secondary compressed streams along with primary compressed data.
   Note that the secondary compressed stream can also be a lower layer
   (with a lower rate) of a scaleable compression scheme, such as
   specified in [9] and [11] for video and audio respectively.


   7     Extension data field for FEC data

   7.1  Extension data field for Parity Codes

   The Extension data field can be used for transporting FEC (parity
   codes) data in the spirit of [12]. The XT field is set to the type
   associated with the FEC mechanism (parity codes) used. The semantics
   of the XT field, covering all the FEC mechanisms supported, are
   announced via non-RTP out-of-band signaling, such as SDP [10] with
   appropriate extensions. The FEC mechanism can then be adapted during
   the session, depending on the segment type and on the network
   characteristics, with simple in-band signaling.

   The FEC operation, as defined in [12], acts on a stream of media
   packets without extension data, and generates a stream of FEC pack-
   ets. The media payload of the above media packets is then encapsu-
   lated in the object containing the AU data. The FEC header and FEC
   data are encapsulated in the extension data field. The extension
   data length field is set to the length of the FEC header plus FEC
   payload.

   The FEC header in the case of parity codes is given in Figure 8.

   It is inspired by the header specified in [12], with the following
   modifications: 1)- the PT recovery field is not used, since the
   payload type of the packets transported in a given channel is
   assumed to be known, namely the type corresponding to this proposed
   payload; 2)- an R bit has been added in order to protect the marker
   bit of the media packets; 3)- in order for the FEC header to be
   byte-aligned, the mask length is reduced by 2 bits (22 bits instead
   of 24). This should be acceptable, since a 24-bit mask already
   induces a very high delay.

C.Guillemot, P.Christ, S.Wesner, A. Klemets                  [Page 12]


INTERNET-DRAFT      draft-guillemot-genrtp-01.txt       June 25, 1999


   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      SN Base                  |        length recovery        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|R|                Mask                       |               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                 TS Recovery                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 8   FEC Header for Parity Codes
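
   As an illustration of the layout in Figure 8, the following Python
   sketch packs and parses the 11-octet header (16-bit SN Base, 16-bit
   length recovery, E bit, R bit, 22-bit Mask, 32-bit TS Recovery) in
   network byte order. The function names and the dictionary returned
   by the parser are hypothetical and chosen only for this example.

```python
import struct

HEADER_LEN = 11  # 16 + 16 + 1 + 1 + 22 + 32 bits = 88 bits


def pack_parity_fec_header(sn_base, length_recovery, e, r, mask, ts_recovery):
    """Serialize the Figure 8 header fields into 11 octets (big-endian)."""
    if e not in (0, 1) or r not in (0, 1):
        raise ValueError("E and R are single bits")
    if not 0 <= mask < (1 << 22):
        raise ValueError("Mask must fit in 22 bits")
    # Octets 4..6 carry E (1 bit), R (1 bit) and Mask (22 bits).
    flags_and_mask = (e << 23) | (r << 22) | mask
    return (
        struct.pack("!HH", sn_base, length_recovery)
        + flags_and_mask.to_bytes(3, "big")
        + struct.pack("!I", ts_recovery)
    )


def unpack_parity_fec_header(data):
    """Parse the first 11 octets of data into a dict of header fields."""
    if len(data) < HEADER_LEN:
        raise ValueError("truncated FEC header")
    sn_base, length_recovery = struct.unpack_from("!HH", data, 0)
    flags_and_mask = int.from_bytes(data[4:7], "big")
    (ts_recovery,) = struct.unpack_from("!I", data, 7)
    return {
        "sn_base": sn_base,
        "length_recovery": length_recovery,
        "e": (flags_and_mask >> 23) & 1,
        "r": (flags_and_mask >> 22) & 1,
        "mask": flags_and_mask & 0x3FFFFF,
        "ts_recovery": ts_recovery,
    }


if __name__ == "__main__":
    raw = pack_parity_fec_header(0x1234, 0x0056, 0, 1, 0x3FFFFF, 0xDEADBEEF)
    print(raw.hex(), unpack_parity_fec_header(raw))
```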

   On the receiver side, the FEC packets will be reconstructed as de-
   fined in [12], by copying the sequence number, SSRC, CC field, RTP
   version and extension bit from the RTP header of the packets re-
   ceived.


   The fields SN base, E, Mask and TS recovery of the FEC header are
   defined as in [12]. The R bit is the marker recovery bit. It is
   computed by applying the protection operation to the marker bits M
   of the RTP media packets.

   The Length Recovery field determines the length of the recovered
   packets. It is computed here by applying the protection operation to
   the 16-bit natural binary representation of the lengths (in bytes)
   of the media payload, CSRC list, extension and padding of the media
   packets associated with this FEC data, PLUS THE MARKER BIT.


   The length recovery field makes it possible to apply the procedure
   to media packets that are not of the same length.
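
   The role of the R bit and of the length recovery field can be
   illustrated with the following Python sketch. It assumes that the
   protection operation of [12] is a bitwise exclusive-or (XOR) and
   that shorter payloads are padded with zero octets. For brevity it
   protects only the marker bit, the payload length and the payload
   octets; the CSRC list, extension, padding, timestamp and mask
   handling are left out. All function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a.ljust(n, b"\x00")
    b = b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def protect(packets):
    """packets: iterable of (marker_bit, payload) pairs.

    Returns (R bit, 16-bit length recovery value, FEC payload).
    """
    r_bit, length_recovery, fec_payload = 0, 0, b""
    for marker, payload in packets:
        r_bit ^= marker & 1
        length_recovery ^= len(payload) & 0xFFFF
        fec_payload = xor_bytes(fec_payload, payload)
    return r_bit, length_recovery, fec_payload


def recover(received, r_bit, length_recovery, fec_payload):
    """Rebuild the single missing packet of a protected group."""
    m, l, data = protect(received)
    marker = m ^ r_bit
    length = l ^ length_recovery
    # Truncating to the recovered length removes the zero padding.
    payload = xor_bytes(data, fec_payload)[:length]
    return marker, payload


if __name__ == "__main__":
    group = [(0, b"abc"), (1, b"hello"), (0, b"xy")]
    r, lr, fec = protect(group)
    # Pretend the second packet was lost in transit.
    print(recover([group[0], group[2]], r, lr, fec))
```

   Because the lengths themselves are protected, the receiver can trim
   the recovered payload even though the three packets of the example
   differ in size.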

   When the extension data carries this type of FEC, then the TSOFFSET
   of the extension data header is not used and should be set to zero.
   The protection also applies to sync layer parameters when present in
   the payload of the media packets. The advantage of the approach -
   with respect to having separate FEC packets - is a reduced overhead
   for sending the FEC data.

   It is also proposed to allocate 3 Extension Types to parity codes,
   each with a different default mask. The mask can then be omitted
   from the FEC header, which reduces its overhead to the form shown
   in Figure 9:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      SN Base                  |        length recovery        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|R|    res    |                  TS Recovery                  .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .               |
   +-+-+-+-+-+-+-+-+

            Figure 9   Simplified FEC Header for Parity Codes
                           (with default masks)

   The Extension data field can also be used for transporting FEC
   (Reed-Solomon codes) data in the spirit of [13]. The XT field is
   set to the type associated with the FEC mechanism (Reed-Solomon
   codes) used. The semantics of the XT field, covering all the FEC
   mechanisms supported, are announced via non-RTP out-of-band
   signaling, such as SDP [10] with appropriate extensions.

   The FEC operation, as defined in [13], acts on a stream of media
   packets without extension data, generating a stream of FEC packets.
   The media payload of the above media packets is then encapsulated in
   the object containing the AU data. The FEC header and FEC data are
   encapsulated in the extension data field. The extension data length
   field is set to the length of the FEC header plus FEC payload.

   The FEC header for Reed-Solomon codes is provided in Figure 10. It
   is inspired by the header specified in [13], with the following
   modifications: 1)- the PT recovery field is not used, since the
   payload type of the packets transported in a given channel is
   assumed to be known, namely the type corresponding to this proposed
   payload; 2)- an R bit has been added in order to protect the marker
   bit of the media packets; 3)- in order for the FEC header to be
   byte-aligned, the length of the K field is reduced to 6 bits
   instead of 8 bits. Indeed, 8 bits would allow 256 media packets to
   be processed together, inducing a very high delay. The length of
   the N field is also reduced to 7 bits (corresponding to the maximum
   code rate of 1/2) instead of 8 bits, and the length of the i field
   is accordingly reduced from 8 to 6 bits, since the i field
   indicates the position of the packet within the N-K FEC packets;
   4)- a P field has been added, allowing for interleaving in order to
   create a FEC code capable of correcting longer bursts of packet
   losses. The P field defines the interleaving periodicity minus 1,
   as illustrated in Figure 11 below for the special case of P=7.

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      SN Base                  |        length recovery        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|R|       N     |        k  |       i   |  P  |TS Recovery    .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                  TS Recovery (cont'd)         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 10  FEC Header for Reed-Solomon Codes
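
   The field widths of Figure 10 (N: 7 bits, k: 6 bits, i: 6 bits,
   P: 3 bits) can be checked with the table-driven Python sketch
   below, which packs and parses the 11-octet header in network byte
   order. The sketch stores raw integers in each field; how code
   parameters are mapped onto the field values is not addressed here.
   The names RS_FIELDS, pack_rs_header and unpack_rs_header are
   hypothetical.

```python
# (field name, width in bits), in the order drawn in Figure 10.
RS_FIELDS = (
    ("sn_base", 16), ("length_recovery", 16),
    ("e", 1), ("r", 1), ("n", 7), ("k", 6), ("i", 6), ("p", 3),
    ("ts_recovery", 32),
)
HEADER_BYTES = sum(width for _, width in RS_FIELDS) // 8  # 88 bits


def pack_rs_header(**values) -> bytes:
    """Concatenate the fields, most significant first, into 11 octets."""
    word = 0
    for name, width in RS_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(HEADER_BYTES, "big")


def unpack_rs_header(data: bytes) -> dict:
    """Split the first 11 octets of data back into named fields."""
    if len(data) < HEADER_BYTES:
        raise ValueError("truncated FEC header")
    word = int.from_bytes(data[:HEADER_BYTES], "big")
    fields = {}
    for name, width in reversed(RS_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


if __name__ == "__main__":
    raw = pack_rs_header(sn_base=1, length_recovery=2, e=1, r=0,
                         n=10, k=8, i=1, p=6, ts_recovery=0x01020304)
    print(raw.hex(), unpack_rs_header(raw))
```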


   When the extension data carries this type of FEC, then the TSOFFSET
   of the extension data header is not used and should be set to zero.
   The advantage of the approach - with respect to having separate FEC
   packets - is a reduced overhead for sending the FEC data.

                  +-----+-----+-----+-----+-------+-----+
                  |  1  |  8  |  15 | 22  | ...   |mn-6 |
                  +-----+-----+-----+-----+-------+-----+
                  |  2  |  9  |  16 | 23  | ...   |mn-5 |
                  +-----+-----+-----+-----+-------+-----+
                  .                                     .
                  +-----+-----+-----+-----+-------+-----+
                  |  7  |  14 |  21 | 28  | ...   |mn   |
                  +-----+-----+-----+-----+-------+-----+

               Figure 11  Example of Interleaving (for P=7)
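
   The numbering of Figure 11 can be reproduced with the short Python
   sketch below. It assumes that each row of the figure forms one
   group of packets protected together, so that a burst of
   consecutive losses no longer than the number of rows removes at
   most one packet per group. The sketch takes the number of rows
   directly as a parameter; the function name is hypothetical.

```python
def interleave_groups(seq_numbers, period):
    """Distribute consecutive packets over `period` rows, column by column.

    Row r receives the packets at positions r, r + period, r + 2*period, ...
    which matches the numbering shown in Figure 11 for period = 7.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    rows = [[] for _ in range(period)]
    for position, sn in enumerate(seq_numbers):
        rows[position % period].append(sn)
    return rows


if __name__ == "__main__":
    for row in interleave_groups(range(1, 29), 7):
        print(row)
```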

   7.2  Extension data field for RS-based Unequal Error Protection of
         typed segments.

   The separation of data with different priority levels into separate
   packets, in order to apply different levels of protection, is not
   always feasible. Indeed, in most situations, data of different
   priority levels are not independently decodable. In this case, the
   typed segments corresponding to different degradation priorities
   should be grouped into one packet, as shown in Figure 12. The
   grouping mechanism provided by the payload format makes this
   possible.

   The scheme assumes a fixed pattern in terms of the number of objects
   carrying AU data in one packet, e.g. 3 in the example of Figure 12
   below.

   The protection operation, as described above for Reed-Solomon
   codes, applies to objects in consecutive packets transporting
   partial AUs of the same type (typed segments of the same priority).

   +-----------------+-----------------+-----------------+
   |AU data object   | AU data object  | AU data object  |RTP packet 1
   +-----------------+-----------------+-----------------+------
   |Typed segment  1 | Typed segment 2 | Typed segment 3 |RTP packet 2
   +-----------------------------------------------------+------
   .                 .                 .                 .RTP packet 3
   +-----------------+-----------------+-----------------+------
   .                 .                 .                 ...
   +-----------------+-----------------+-----------------+------
   |     R-S         |                 |                 .RTP packet i
   ------------------+-----------------+-----------------+------
   |                 |                 |                 |
   ------------------------------------------------------+------
   |     K1/N        |     R-S         |                 |
   ------------------------------------+-----------------+------
   |                 |     K2/N        |     R-S         |
   -------------------------------------------------------------
   |                 |                 |     K3/N        |RTP packet n
   +-----------------+-----------------+-----------------+

         Figure 12: Example of data organization for RS-based UEP.
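
   The unequal protection suggested by the K1/N, K2/N and K3/N labels
   of Figure 12 can be summarized with the Python sketch below. It
   assumes one Reed-Solomon code per column of typed segments, a
   common block length N, and a smaller K for the segments that need
   more protection. The function name and the numeric values are
   purely illustrative.

```python
def uep_plan(n, ks):
    """Describe the per-column code for a block of n objects.

    ks[j] is the number of data objects K in column j + 1; the remaining
    n - K objects of that column carry Reed-Solomon redundancy.
    """
    plan = []
    for column, k in enumerate(ks, start=1):
        if not 0 < k <= n:
            raise ValueError(f"K{column}={k} must satisfy 0 < K <= N={n}")
        plan.append({"column": column, "data": k,
                     "parity": n - k, "rate": k / n})
    return plan


if __name__ == "__main__":
    # Hypothetical values: column 1 is the most strongly protected.
    for entry in uep_plan(12, [6, 8, 10]):
        print(entry)
```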

   8     Multiplexing

   MPEG-4 applications can involve a large number of ESs, and thus also
   a large number of RTP sessions. A multiplexing scheme allowing
   selective bundling of ESs may therefore be necessary for some
   applications. The multiplexing problem is outside the scope of this
   payload format and can be solved by using a generic solution such
   as the one defined in [14].

   9     Security Considerations


   RTP packets transporting information with the proposed payload for-
   mat are subject to the security considerations discussed in the RTP
   specification [1]. This implies that confidentiality of the media
   streams is achieved by encryption.

   If the entire stream (extension data and AU data) is to be secured
   and all the participants are expected to have the keys to decode the
   entire stream, then the encryption is performed in the usual manner,
   and there is no conflict between the two operations (encapsulation
   and encryption).

   If a portion of the stream (e.g. extension data) needs to be
   encrypted with a different key, or not to be encrypted at all,
   application-level signaling protocols would have to be aware of the
   usage of the XT field, and to exchange keys and negotiate their
   usage on the media and extension data separately.

   10    Authors' Addresses

   Christine Guillemot
   INRIA
   Campus Universitaire de Beaulieu
   35042 RENNES Cedex, FRANCE
   email: Christine.Guillemot@irisa.fr

   Paul Christ
   Computer Center - RUS University of Stuttgart
   Allmandring 30
   D70550 Stuttgart, Germany.
   email: Paul.Christ@rus.uni-stuttgart.de

   Stefan Wesner
   Computer Center - RUS University of Stuttgart
   Allmandring 30
   D70550 Stuttgart, Germany.
   email: wesner@rus.uni-stuttgart.de

   Anders Klemets
   1 Microsoft Way
   Redmond, WA 98052-6399
   USA.
   email: anderskl@microsoft.com

   11    References

  [1]   H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, "RTP: A
        Transport Protocol for Real-Time Applications", RFC 1889,
        Internet Engineering Task Force, January 1996.
  [2]   A. Klemets, "Common Generic RTP Payload Format",
        draft-klemets-generic-rtp-00, March 13, 1998.
  [3]   A. Periyannan, D. Singer, M. Speer, "Delivering Media
        Generically over RTP", draft-periyannan-generic-rtp-00,
        March 13, 1998.
  [4]   ISO/IEC 14496-1 FDIS, MPEG-4 Systems, November 1998.
  [5]   C. Perkins, I. Kouvelas, O. Hodson, V. Hardman, M. Handley,
        J. Bolot, A. Vega-Garcia, S. Fosse-Parisis, "RTP Payload for
        Redundant Audio Data",
        draft-ietf-avt-redundancy-revised-00.txt, August 10, 1998.
  [6]   C. Zhu, "RTP payload format for H.263 Video Streams", RFC 2190.
  [7]   C. Bormann, L. Cline, G. Deisher, T. Gardos, C. Maciocco, D.
        Newell, J. Ott, S. Wenger, C. Zhu, "RTP payload format for the
        1998 version of ITU-T Rec. H.263 video (H.263+)",
        draft-ietf-avt-rtp-h263-video-02.txt, May 7, 1998.
  [8]   D. Hoffman, G. Fernando, V. Goyal, M. Civanlar, "RTP Payload
        format for MPEG1/MPEG2 video", RFC 2250, January 1998.
  [9]   ISO/IEC 14496-2 FDIS, MPEG-4 Visual, November 1998.
  [10]  M. Handley, V. Jacobson, "SDP: Session Description Protocol",
        draft-ietf-mmusic-sdp-07.txt, April 2, 1998.
  [11]  ISO/IEC 14496-3 FDIS, MPEG-4 Audio, November 1998.
  [12]  J. Rosenberg, H. Schulzrinne, "An RTP Payload format for
        Generic Forward Error Correction", draft-ietf-avt-fec-05.txt,
        February 26, 1999.
  [13]  J. Rosenberg, H. Schulzrinne, "An RTP Payload format for Reed
        Solomon Codes", draft-ietf-avt-reedsolomon-00.txt, November 3,
        1998.
  [14]  M. Handley, "GeRM: Generic RTP Multiplexing", work in progress,
        draft-ietf-avt-germ-00.txt, November 1998.
  [15]  S. Bradner, "Key words for use in RFCs to Indicate Requirement
        Levels", RFC 2119, March 1997.
