Audio/Video Transport WG                                M.M. Hannuksela
Internet Draft                                               Y.-K. Wang
Intended status: Standards track                                  Nokia
Expires: January 2009                                     July 14, 2008





                    Session Multiplexing for SVC Video
                    draft-hannuksela-avt-rtp-svc-01.txt




Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 14, 2009.

Copyright Notice

   Copyright (C) The IETF Trust (2008).








Hannuksela, Wang       Expires January 14, 2009                [Page 1]


Internet-Draft    Session Multiplexing for SVC Video          July 2008


Abstract

   This memo describes two alternative methods for decoding order
   recovery of the Network Abstraction Layer (NAL) units carried in
   multiple RTP sessions for Scalable Video Coding (SVC).  SVC is
   defined in Annex G of ITU-T Recommendation H.264; this annex is
   technically identical to Amendment 3 of ISO/IEC International
   Standard 14496-10.  The methods apply when NAL units are transmitted
   in a non-interleaved manner using the Single NAL Unit packetization
   mode or the Non-Interleaved packetization mode defined in RFC 3984.



Table of Contents


   Status of this Memo...............................................1
   Copyright Notice..................................................1
   Abstract..........................................................2
   Table of Contents.................................................2
   1. Introduction...................................................4
   2. Conventions....................................................4
   3. Definitions and Abbreviations..................................4
      3.1. Definitions...............................................4
         3.1.1. Definitions from the SVC Specification...............4
         3.1.2. Definitions Specific to This Memo....................4
      3.2. Abbreviations.............................................4
   4. RTP Payload Format.............................................5
      4.1. Design Principles.........................................5
      4.2. RTP Header Usage..........................................5
      4.3. Common Structure of the RTP Payload Format................5
      4.4. NAL Unit Header Usage.....................................5
      4.5. Packetization Modes.......................................5
         4.5.1. Packetization Modes for Multi-Session Transmission...5
      4.6. Decoding Order Number (DON)...............................7
      4.7. Identification of Access Units for Decoding Order Recovery in
      Multi-Session Transmission.....................................7
         4.7.1. Access Unit Identifier (AUID) for the NI-A Mode......8
         4.7.2. Timestamp Difference (TSD) for the NI-TSD Mode.......8
      4.8. Aggregation Packets.......................................9
      4.9. Fragmentation Units (FUs).................................9
      4.10. Payload Content Scalability Information (PACSI) NAL
      Unit..........................................................10
         4.10.1. PACSI NAL Unit Modifications for the NI-A Mode.....10
         4.10.2. PACSI NAL Unit Modifications for the NI-TSD Mode...10
   5. Packetization Rules...........................................10
      5.1. Packetization Rules for Multi-Session Transmission.......10
         5.1.1. NI-A and NI-TSD MST Packetization Rules.............11
         5.1.2. Packetization rules for non-VCL NAL units...........12
         5.1.3. Packetization rules for Prefix NAL units............12
   6. De-Packetization Process......................................12
      6.1. De-Packetization Process for Multi-Session Transmission..12
         6.1.1. Decoding Order Recovery for the NI-A Mode...........12
            6.1.1.1. Example 1 (Informative)........................13
            6.1.1.2. Example 2 (Informative)........................15
         6.1.2. Decoding Order Recovery for the NI-TSD Mode.........17
            6.1.2.1. Example 1 (Informative)........................18
            6.1.2.2. Example 2 (Informative)........................20
         6.1.3. Informative Algorithm for NI-A and NI-TSD Decoding Order
         Recovery within an Access Unit.............................22
   7. Payload Format Parameters.....................................22
      7.1. Media Type Registration..................................22
      7.2. SDP Parameters...........................................23
      7.3. Examples.................................................23
      7.4. Parameter Set Considerations.............................23
   8. Security Considerations.......................................23
   9. Congestion Control............................................23
   10. IANA Consideration...........................................23
   11. Informative Appendix: Application Examples...................23
   12. References...................................................23
      12.1. Normative References....................................23
      12.2. Informative References..................................24
   13. Authors' Addresses...........................................24
   Intellectual Property Statement..................................24
   Disclaimer of Validity...........................................25
   Copyright Statement..............................................25

1. Introduction

   Section 1 of draft-ietf-avt-rtp-svc-13 applies.

   This memo specifies two alternative methods for decoding order
   recovery of NAL units carried in a non-interleaved manner in multiple
   RTP sessions, referred to as Multi-Session Transmission (MST).
   Either of these two introduced MST packetization modes could be used
   to replace those specified in draft-ietf-avt-rtp-svc-13.

2. Conventions

   Section 2 of draft-ietf-avt-rtp-svc-13 applies.

3. Definitions and Abbreviations

3.1. Definitions

3.1.1. Definitions from the SVC Specification

   Section 3.1.1 of draft-ietf-avt-rtp-svc-13 applies.

3.1.2. Definitions Specific to This Memo

   Section 3.1.2 of draft-ietf-avt-rtp-svc-13 applies with the following
   addition.

      access unit identifier (AUID): A variable that is derived for
      each access unit when the single NAL unit packetization mode or
      the non-interleaved packetization mode is in use in Multi-Session
      Transmission.  The value of AUID is identical for all NAL units
      of an access unit regardless of the session in which the NAL
      units are conveyed.  The AUID values of consecutive access units
      differ regardless of which sessions are decoded, but there are no
      other constraints on AUID values.

3.2. Abbreviations

   Section 3.2 of draft-ietf-avt-rtp-svc-13 applies with the following
   additions.

      AUID:     Access Unit Identifier

      TSD:      Timestamp Difference

4. RTP Payload Format

4.1. Design Principles

   Section 5.1 of draft-ietf-avt-rtp-svc-13 applies.

4.2. RTP Header Usage

   Section 5.2 of draft-ietf-avt-rtp-svc-13 applies.

4.3. Common Structure of the RTP Payload Format

   Section 5.3 of draft-ietf-avt-rtp-svc-13 applies.

4.4. NAL Unit Header Usage

   Section 5.4 of draft-ietf-avt-rtp-svc-13 applies.

4.5. Packetization Modes

   Section 5.4 of RFC 3984 applies when MST is not in use.  The
   packetization modes specified in Section 5.4 of RFC 3984 are also
   referred to as session packetization modes.

   When MST is in use, the following applies in addition.

4.5.1. Packetization Modes for Multi-Session Transmission

   This memo specifies two MST packetization modes for non-interleaved
   MST:

   o  Non-interleaved AUID-based mode (NI-A)

   o  Non-interleaved timestamp-difference-based mode (NI-TSD)

   In the NI-A and NI-TSD modes, NAL units in each RTP session are
   transmitted in NAL unit decoding order.

   NI-A or NI-TSD could be used instead of the MST packetization modes
   NI-T, NI-C, and NI-TC specified in draft-ietf-avt-rtp-svc-13.  The
   NI-A and NI-TSD modes simplify the packetization rules compared to
   those of NI-T, NI-C, and NI-TC.  In the NI-A and NI-TSD modes,
   senders need not add NAL units to the stream and receivers need not
   remove the added NAL units as must be done in the NI-T and NI-TC
   modes.  Moreover, the NI-MTAP packet introduced for NI-T and NI-TC
   modes is not needed and hence one precious NAL unit type value (the
   last one left for use in RTP payload specifications after the
   introduction of the PACSI NAL unit in the SVC draft) is saved for
   future extensions.  The decoding order recovery process for the NI-A
   and NI-TSD modes does not require the reception and processing of
   RTCP sender reports, which makes the decoding order recovery process
   more straightforward compared to that of the NI-T mode.

   The operation of the NI-A mode is very similar to that of the NI-TSD
   mode; the only difference is how access units are identified.  The
   NI-A mode labels each access unit with an identifier, while the
   NI-TSD mode identifies access units by their RTP timestamp, which is
   indicated relative to the current packet in order to avoid
   dependencies on the random initial RTP timestamp.  However, when the
   NI-TSD mode is in use, the same initial RTP timestamp offset MUST be
   used in each associated RTP session as proposed in
   [I-D.lennox-avt-rtp-layered-encoding-timestamps].  As the NI-TSD
   mode leaves less implementation freedom for senders and hence
   reduces the likelihood of ill-behaving sender implementations, it is
   the preferred mode proposed in this memo.  However, as the use of
   the same initial RTP timestamp offset in all sessions as proposed in
   [I-D.lennox-avt-rtp-layered-encoding-timestamps] has not yet been
   agreed, both NI-A and NI-TSD are included in this memo.

   This memo does not specify any MST mode for interleaved transmission,
   which would allow transmission of NAL units out of NAL unit decoding
   order in each RTP session.

   The MST packetization mode in use is signaled by the pmode media type
   parameter or by external means.

   The MST packetization mode in use governs which session
   packetization modes are allowed in the involved RTP sessions, which
   in turn govern which NAL unit types are allowed as RTP payloads.

   Table 3.1 summarizes the allowed session packetization modes for the
   NI-A and NI-TSD MST packetization modes.



   Table 3.1  Summary of allowed session packetization modes for the
   NI-A and NI-TSD MST packetization modes (yes = allowed, no =
   disallowed)

      Session-Specific Mode    Base Session    Enhancement Session
      ----------------------------------------------------------
      Single NAL Unit Mode         yes             no
      Non-Interleaved Mode         yes            yes
      Interleaved Mode              no             no

   Table 3.2 summarizes the allowed packet payload types for each
   allowed session packetization mode of the NI-A and NI-TSD MST
   packetization modes.

    Table 3.2  Summary of allowed packet payload types for each session
     packetization mode of the NI-A and NI-TSD MST packetization modes
               (yes = allowed, no = disallowed, ig = ignore)

      Packet    Packet  Single NAL    Non-Interleaved
      Payload   Type    Unit Mode           Mode
      Type
      ------------------------------------------------
      0      undefined     ig               ig
      1-23   NAL unit     yes              yes
      24     STAP-A        no              yes
      25     STAP-B        no               no
      26     MTAP16        no               no
      27     MTAP24        no               no
      28     FU-A          no              yes
      29     FU-B          no        no (base session)
                                     yes (enh. session)
      30     PACSI        yes              yes
      31     undefined     ig               ig

         Informative note: FU-B packets are allowed in enhancement
         sessions as specified in Section 4.9.

   The packet payload type values indicated as undefined in Table 3.2
   are reserved for future extensions.  NAL units of those type values
   SHOULD NOT be sent by a sender (as packet payloads in single NAL unit
   packets or aggregation units in aggregation packets, or in FU
   packets) and MUST be ignored by a receiver.  Note that NAL unit types
   30 and 31 are indicated as undefined in RFC 3984, therefore RFC 3984
   receivers MUST ignore NAL units of these types, if present.
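
   As an illustration, Tables 3.1 and 3.2 can be encoded as a small
   lookup function.  The following is a minimal sketch, not part of
   the payload format: the function name payload_rule, the mode
   constants, and the choice to raise an error for disallowed session
   modes are assumptions made for this example only.

```python
# Sketch of Tables 3.1 and 3.2 as a lookup (names are hypothetical).
SINGLE_NAL = "single-nal-unit"
NON_INTERLEAVED = "non-interleaved"


def payload_rule(nal_type, session_mode, is_base_session):
    """Return "yes", "no" or "ig" for a packet payload type (0..31)."""
    if not 0 <= nal_type <= 31:
        raise ValueError("packet payload type must be in 0..31")
    # Table 3.1: the interleaved mode is never allowed, and the single
    # NAL unit mode is allowed in the base session only.
    if session_mode not in (SINGLE_NAL, NON_INTERLEAVED):
        raise ValueError("session mode not allowed for NI-A/NI-TSD")
    if session_mode == SINGLE_NAL and not is_base_session:
        raise ValueError("single NAL unit mode is base-session only")
    # Table 3.2 follows.
    if nal_type in (0, 31):                    # undefined: ignore
        return "ig"
    if 1 <= nal_type <= 23 or nal_type == 30:  # NAL unit, PACSI
        return "yes"
    if nal_type in (24, 28):                   # STAP-A, FU-A
        return "yes" if session_mode == NON_INTERLEAVED else "no"
    if nal_type == 29:                         # FU-B: enh. session only
        ok = session_mode == NON_INTERLEAVED and not is_base_session
        return "yes" if ok else "no"
    return "no"                                # STAP-B, MTAP16, MTAP24
```

   For example, payload_rule(29, NON_INTERLEAVED, False) yields "yes"
   for an FU-B packet in an enhancement session, while the same packet
   type in the base session yields "no".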

4.6. Decoding Order Number (DON)

   Section 5.5 of [RFC3984] applies when MST is not in use.

4.7. Identification of Access Units for Decoding Order Recovery in
   Multi-Session Transmission

   The decoding order recovery process in the NI-A and NI-TSD MST
   packetization modes proposed in this memo consists of three steps.
   First, a set of candidate access units is formed by including the
   next access unit in transmission order (in relation to the access
   unit that has just been processed) in each of the sessions.  Second,
   for each candidate access unit, the previous access unit in decoding
   order in the same or a lower session is identified by information in
   the associated PACSI NAL unit or FU-B NAL unit.  In the NI-A mode,
   the Access Unit Identifier is used for the identification of the
   previous access unit.  In the NI-TSD mode, the signed timestamp
   difference between the current access unit and the previous access
   unit in the same or a lower session is indicated.  Third, the next
   access unit in decoding order is the access unit in the highest
   session among the candidate access units for which the indicated
   previous access unit is not a candidate access unit.

4.7.1. Access Unit Identifier (AUID) for the NI-A Mode

   When the NI-A MST packetization mode is in use, the packetization of
   each session MUST be as specified in Section 5.1, and the following
   applies.

   The NI-A mode uses two fields, AUID and PAUID, for the recovery of
   the decoding order of NAL units.  AUID and PAUID are conveyed in
   PACSI NAL units or in FU-B packets.  AUID and PAUID MUST be conveyed
   in at least one PACSI NAL unit or FU-B packet for each access unit in
   each session.

   AUID indicates the access unit identifier.  The AUID value for all
   NAL units having the same NALU-time MUST be identical.  The AUID
   value for consecutive access units in any set of sessions in the
   session dependency order MUST differ.

   PAUID indicates the access unit identifier of the previous access
   unit in decoding order among the session containing the packet
   including the PAUID field and the sessions below it in the session
   dependency hierarchy specified according to
   [I-D.ietf-mmusic-decoding-dependency].

   AUID and PAUID are 8-bit unsigned integers.
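
   The PAUID semantics above can be illustrated from the sender's
   point of view.  The following is a minimal sketch under stated
   assumptions: the sender knows the decoding order of access units
   and which sessions carry NAL units of each access unit; sessions
   are numbered in dependency order with 0 as the base session; and
   the function name derive_pauids and the use of None for an
   undetermined PAUID are hypothetical choices for this example.

```python
def derive_pauids(access_units, num_sessions):
    """Derive the (AUID, PAUID) pair sent in each session.

    access_units: list of (auid, sessions) in decoding order, where
    sessions is the set of session indices carrying NAL units of the
    access unit.  Returns one dict per access unit that maps a session
    index to its (AUID, PAUID) pair.
    """
    # last[n]: AUID of the most recent access unit, in decoding order,
    # with NAL units in session n or in any session below it.
    last = {}
    result = []
    for auid, sessions in access_units:
        if not 0 <= auid <= 255:
            raise ValueError("AUID is an 8-bit unsigned integer")
        result.append({n: (auid, last.get(n)) for n in sorted(sessions)})
        # This access unit is now the previous one for every session
        # at or above its lowest session.
        for n in range(min(sessions), num_sessions):
            last[n] = auid
    return result
```

   Applied to the access units of Figure 1 (A = 0, B = 1, C = 2), this
   sketch reproduces the pairs shown there, for instance (4,3) in
   session A and (4,5) in sessions B and C for the access unit with
   AUID equal to 4.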

4.7.2. Timestamp Difference (TSD) for the NI-TSD Mode

   When the NI-TSD MST packetization mode is in use, the packetization
   of each session MUST be as specified in Section 5.1, and the
   following applies.

   The NI-TSD mode uses the RTP timestamp and one field, TSD, for the
   recovery of the decoding order of NAL units.  TSD is conveyed in
   PACSI NAL units or in FU-B packets.  TSD MUST be conveyed in at least
   one PACSI NAL unit or FU-B packet for each access unit in each
   session.

   The TSD field SHALL be set as follows:

   TSD = (TS(p) - TS(c)) / AUTICK, when abs(TS(p) - TS(c)) <= 2^31

   TSD = (TS(p) - 2^32 - TS(c)) / AUTICK, when TS(p) - TS(c) > 2^31

   TSD = (2^32 + TS(p) - TS(c)) / AUTICK, when TS(c) - TS(p) > 2^31

   where TS(p) is the RTP timestamp of the previous access unit
   containing NAL units within this session (conveying the TSD field),
   TS(c) is the RTP timestamp of the current access unit (conveying the
   TSD field), and AUTICK is the value of the sprop-au-tick media type
   parameter.

         Informative note: The second and third equations above cover
         the cases where TS(c) and TS(p), respectively, have wrapped
         around the maximum value of a 32-bit unsigned integer, while
         the first equation covers the cases where neither TS(p) nor
         TS(c) has wrapped around.

   TSD is a 16-bit signed integer.
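
   The TSD derivation can be sketched as follows.  This is a minimal
   illustration, not a normative procedure.  It reads the three cases
   as described in the informative note above, that is, as
   compensation for a 32-bit wraparound of TS(c) or TS(p).  The
   function names, and the choice to reject timestamp differences that
   are not an exact multiple of AUTICK, are assumptions of this
   example.

```python
def compute_tsd(ts_prev, ts_cur, au_tick):
    """Sender side: derive TSD from TS(p), TS(c) and AUTICK."""
    diff = ts_prev - ts_cur
    if diff > 2**31:        # TS(c) has wrapped around
        diff -= 2**32
    elif -diff > 2**31:     # TS(p) has wrapped around
        diff += 2**32
    tsd, remainder = divmod(diff, au_tick)
    if remainder:
        # Assumption: differences are whole multiples of AUTICK.
        raise ValueError("difference is not a multiple of AUTICK")
    if not -2**15 <= tsd < 2**15:
        raise ValueError("TSD does not fit a 16-bit signed integer")
    return tsd


def previous_timestamp(ts_cur, tsd, au_tick):
    """Receiver side: recover TS(p) modulo 2^32 from TS(c) and TSD."""
    return (ts_cur + tsd * au_tick) % 2**32
```

   For example, with AUTICK equal to 3000, TS(p) equal to 2^32 - 3000
   and TS(c) equal to 0 (TS(c) has wrapped around), compute_tsd
   returns -1, and previous_timestamp(0, -1, 3000) returns
   2^32 - 3000.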

4.8. Aggregation Packets

   Section 5.6 of draft-ietf-avt-rtp-svc-13 applies.

4.9. Fragmentation Units (FUs)

   Section 5.7 of draft-ietf-avt-rtp-svc-13 applies with the following
   modifications.

   When fragmentation units are used in the NI-A mode, FU-B MUST be used
   in enhancement sessions for the first fragmentation unit of a
   fragmented NAL unit.  The DON field of the FU-B header in enhancement
   sessions is replaced by the AUID field followed by the PAUID field.
   The AUID field MUST be equal to the AUID value for the access unit
   containing the fragmented NAL unit.  The semantics of the PAUID field
   are specified in Section 4.7.1.

   When fragmentation units are used in the NI-TSD mode, FU-B MUST be
   used in enhancement sessions for the first fragmentation unit of a
   fragmented NAL unit.  The DON field of the FU-B header in enhancement
   sessions is replaced by the TSD field.  The semantics of the TSD
   field are specified in Section 4.7.2.

4.10. Payload Content Scalability Information (PACSI) NAL Unit

   Section 5.8 of draft-ietf-avt-rtp-svc-13 applies with the following
   modifications.

4.10.1. PACSI NAL Unit Modifications for the NI-A Mode

   The DONC field is replaced by the AUID field followed by the PAUID
   field.

   The semantics of DONC are removed.

   The occurrences of "DONC" are replaced with "AUID and PAUID".

   The semantics of AUID and PAUID are specified as follows.

   o  When present, the field AUID indicates the access unit identifier
      for all the NAL units in the aggregation packet (when the PACSI
      NAL unit is included in an aggregation packet) or the AUID of the
      next non-PACSI NAL unit in transmission order (when the PACSI NAL
      unit is included in a single NAL unit packet).  The constraints in
      Section 4.7.1 apply to the AUID.

   o  The semantics of the PAUID field are specified in Section 4.7.1.

4.10.2. PACSI NAL Unit Modifications for the NI-TSD Mode

   The DONC field is replaced by the TSD field.

   The semantics of DONC are removed.

   The occurrences of "DONC" are replaced with "TSD".

   The semantics of TSD are specified in Section 4.7.2.
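
   In both FU-B headers (Section 4.9) and PACSI NAL units, the new
   fields take the place of an existing field.  The following sketch
   shows one way to serialize them.  It assumes that the replaced
   DON or DONC field is 16 bits wide, as the DON field is in RFC 3984,
   and that the fields are written in network byte order; the helper
   names are hypothetical.

```python
import struct


def pack_auid_pauid(auid, pauid):
    """NI-A: AUID followed by PAUID, two 8-bit unsigned integers."""
    return struct.pack("!BB", auid, pauid)


def unpack_auid_pauid(data):
    """Return the (AUID, PAUID) tuple from two bytes."""
    return struct.unpack("!BB", data)


def pack_tsd(tsd):
    """NI-TSD: TSD as one 16-bit signed integer."""
    return struct.pack("!h", tsd)


def unpack_tsd(data):
    """Return the signed TSD value from two bytes."""
    return struct.unpack("!h", data)[0]
```

   Both encodings occupy the same two bytes, so the surrounding header
   layout does not change between the two modes under these
   assumptions.  Values outside the 8-bit or 16-bit ranges cause
   struct.pack to raise struct.error.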

5. Packetization Rules

   Section 6 of draft-ietf-avt-rtp-svc-13 applies.

5.1. Packetization Rules for Multi-Session Transmission

   When MST is used, decoding order recovery for NAL units carried in
   the associated RTP sessions is needed.  The following packetization
   rules ensure that the decoding order of NAL units carried in the
   associated sessions can be correctly recovered for each of the MST
   packetization modes according to the de-packetization process
   specified in Section 6.1.

5.1.1. NI-A and NI-TSD MST Packetization Rules

   When the NI-A or NI-TSD mode is in use, the following applies.

   o  For each single NAL unit packet containing a non-PACSI NAL unit,
      if present, the previous packet MUST have the same RTP timestamp
      as the single NAL unit packet, and the following applies.

         If the NALU-time of the non-PACSI NAL unit is not equal to the
          NALU-time of the previous non-PACSI NAL unit in decoding
          order, the previous packet MUST contain a PACSI NAL unit
          containing the AUID and PAUID fields when the NI-A mode is in
          use or the TSD field when the NI-TSD mode is in use;

         Otherwise (the NALU-time of the non-PACSI NAL unit is equal to
          the NALU-time of the previous non-PACSI NAL unit in decoding
          order), the previous packet MAY contain a PACSI NAL unit
          containing the AUID and PAUID fields when the NI-A mode is in
          use or the TSD field when the NI-TSD mode is in use.

   o  For each STAP-A packet, if present, if the RTP timestamp is
      different from the RTP timestamp of the previous STAP-A packet,
      the first NAL unit in the STAP-A packet MUST be a PACSI NAL unit
      containing the AUID and PAUID fields when the NI-A mode is in use
      or the TSD field when the NI-TSD mode is in use.

   o  For each FU-A packet, if present, the previous packet MUST have
      the same RTP timestamp as the FU-A packet, and the following
      applies.

         If the FU-A packet is the start of the fragmented NAL unit, the
          following applies;

              If the NALU-time of the fragmented NAL unit is not equal
               to the NALU-time of the previous non-PACSI NAL unit in
               decoding order, the previous packet MUST contain a PACSI
               NAL unit containing the AUID and PAUID fields when the
               NI-A mode is in use or the TSD field when the NI-TSD mode
               is in use;

              Otherwise (the NALU-time of the fragmented NAL unit is
               equal to the NALU-time of the previous non-PACSI NAL unit
               in decoding order), the previous packet MAY contain a
               PACSI NAL unit containing the AUID and PAUID fields when
               the NI-A mode is in use or the TSD field when the NI-TSD
               mode is in use.

   o  For each single NAL unit packet containing a PACSI NAL unit, if
      present, the PACSI NAL unit MUST contain the AUID and PAUID fields
      when the NI-A mode is in use or the TSD field when the NI-TSD mode
      is in use.

5.1.2. Packetization rules for non-VCL NAL units

   Section 6.1.4 of draft-ietf-avt-rtp-svc-13 applies.

5.1.3. Packetization rules for Prefix NAL units

   Section 6.1.5 of draft-ietf-avt-rtp-svc-13 applies.

6. De-Packetization Process

   For single-session transmission, where a single RTP session is used,
   the de-packetization process specified in Section 7 of [RFC3984]
   applies.

   For multi-session transmission, where more than one RTP session is
   used to receive data from the same SVC bitstream, the
   de-packetization process is specified in Section 6.1.

6.1. De-Packetization Process for Multi-Session Transmission

6.1.1. Decoding Order Recovery for the NI-A Mode

   The following process SHALL be applied when the NI-A mode is in use.

   The decoding order recovery SHOULD start from an access unit where
   NAL units are present for the base session, herein referred to as
   access unit F.  Any packets preceding the first received packet of
   access unit F in reception order SHOULD be discarded.  The decoding
   order of NAL units of access unit F is specified below.

   For subsequent access units to be ordered, the following applies.
   Let AUID(n) and PAUID(n) be the AUID and PAUID values, respectively,
   of the first access unit in decoding order containing data in session
   n.  The first access unit in decoding order containing data in
   session n can be identified by the smallest value of RTP sequence
   number within session n (taking into account the potential wraparound
   of RTP sequence numbers) among those packets whose payloads have not
   been passed to the decoder yet.  Let a set of sessions S consist of
   those values of n for which NAL units are present in the first access
   unit in decoding order containing data in session n but are not
   present in a higher session in the same access unit.  In other words,
   the set of sessions S contains the highest session of each access
   unit that is a candidate for being next in decoding order.

   The next access unit in decoding order is the access unit with the
   greatest value of m, where PAUID(m) is not equal to AUID(i), where m
   is any value within the set of sessions S and i is any value less
   than m within the set of sessions S.  In other words, the next access
   unit in decoding order is found by investigating the candidate access
   units in session dependency order from the highest session to the
   lowest session according to the highest session for which the
   candidate access units contain NAL units.  The next access unit in
   decoding order is the first access unit in the above investigation
   order that is not indicated to follow any candidate access unit in a
   lower session in decoding order.  The decoding order of NAL units of
   the access unit having AUID equal to AUID(m) is specified below.
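
   The selection rule above can be sketched as follows.  This is a
   minimal illustration under stated assumptions, not a complete
   receiver: each session is modeled as a queue of (AUID, PAUID) pairs
   in transmission order, sessions are indexed in dependency order
   with 0 as the base session, AUID values are assumed to be unique
   among the buffered access units, and the function names are
   hypothetical.

```python
from collections import deque


def next_access_unit(queues):
    """Return (m, AUID(m)) for the next access unit, or None.

    queues[n] is a deque of (AUID, PAUID) tuples for the access units
    buffered in session n, in transmission order.
    """
    # First buffered access unit of every session that has data.
    heads = {n: q[0] for n, q in enumerate(queues) if q}

    # Set S: sessions whose first access unit has no NAL units
    # buffered in any higher session.
    s = [
        n for n, (auid, _) in heads.items()
        if not any(a == auid for q in queues[n + 1:] for a, _ in q)
    ]

    # Scan S from the highest session down and take the first
    # candidate whose PAUID does not name a lower candidate in S.
    for m in sorted(s, reverse=True):
        pauid_m = heads[m][1]
        if all(pauid_m != heads[i][0] for i in s if i < m):
            return m, heads[m][0]
    return None


def pop_access_unit(queues, auid):
    """Remove the selected access unit from the head of each queue."""
    for q in queues:
        if q and q[0][0] == auid:
            q.popleft()
```

   With the values of Example 2 in Section 6.1.1.2, where the first
   access units of sessions A, B, and C carry the pairs (9,3), (5,3),
   and (1,5), the sketch selects session B with AUID 5; once that
   access unit is removed and (8,9) is next in session B, it selects
   session C with AUID 1.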

         Informative note: In practical implementations, the set of
         sessions S can be formed by considering only those access
         units that have arrived within a certain inter-session jitter
         compensation period.  Consequently, it may not be necessary to
         wait for access units from all sessions to arrive at a
         particular time for decoding order recovery.

   If several NAL units share the same value of AUID, the order in which
   NAL units are passed to the decoder is specified as follows:

   o  Collect all NAL units NU(y) associated with the same value of
      AUID.

   o  Place the collected NAL units in the session dependency order
      specified according to [I-D.ietf-mmusic-decoding-dependency] and
      then in the consecutive order of appearance within each session
      into an access unit while satisfying the NAL unit order rules in
      SVC access units as specified in [SVC] and summarized as an
      informative algorithm in Section 6.1.3.
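
   The two steps above can be sketched as a filter followed by a sort.
   This is a simplified illustration: it orders NAL units by session
   dependency order and then by order of appearance within each
   session, and it does not apply the additional NAL unit order rules
   of [SVC] summarized in Section 6.1.3.  The tuple layout and the
   function name are assumptions of this example.

```python
def order_access_unit(nal_units, auid):
    """Order the NAL units of one access unit for the decoder.

    nal_units: iterable of (auid, session, index, payload) tuples,
    where session is the position in the session dependency order
    (0 = base session) and index is the order of appearance of the
    NAL unit within its session.
    """
    # Step 1: collect all NAL units with the given AUID.
    collected = [nu for nu in nal_units if nu[0] == auid]
    # Step 2: session dependency order first, then order of
    # appearance within each session.
    return sorted(collected, key=lambda nu: (nu[1], nu[2]))
```

   A monotonic per-session counter is used for the order of
   appearance instead of the raw RTP sequence number, so that
   sequence number wraparound does not affect the sort.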

6.1.1.1. Example 1 (Informative)

   The example shown in Figure 1 refers to three RTP sessions A, B and C
   containing a multiplexed SVC bitstream.  In the example, the
   dependency signaling [I-D.ietf-mmusic-decoding-dependency] indicates
   that Session A is the base RTP session, B is the first enhancement
   RTP session and depends on A, and C is the second RTP enhancement
   session and depends on A and B.  In the example, Session A has the
   lowest frame rate and Session B and C have the same, but a higher
   frame rate (using a hierarchical prediction structure).  Arbitrary
   AUID values have been used in the example.

   Figure 1 shows an example for de-jitter buffering with different
   jitters present in the sessions, i.e. at buffering startup not all
   packets with the same timestamp are available in all the de-jittering
   buffers.  Jitter between the sessions is first assumed to be
   compensated by removing all NAL units preceding NAL unit with AUID
   equal to 2 (TS[1]).

   At the next step, the first access unit with data present in the base
   session is identified.  In this example, it is the access unit with
   AUID equal to 4 (TS[8]).  The preceding access units (with AUID equal
   to 2 (TS[1]) and AUID equal to 5 (TS[3])) are removed.  NAL units of
   the access unit with AUID equal to 4 (TS[8]) are passed to the
   decoder in layer dependency order.

   The next access unit (with AUID equal to 6 (TS[6])) has NAL units
   present in each session, hence it is selected as the next access unit
   to be decoded.

   Considering each session independently, the next NAL units in
   decoding order belong to the access unit with AUID equal to 8 (TS[5])
   (in sessions B and C) and to the access unit with AUID equal to 9
   (TS[12]) (in session A).  As session B and session A are not the
   highest sessions for the access units with AUID equal to 8 and 9,
   respectively, the set of sessions S consists of only one session
   (session C), and the access unit with AUID equal to AUID(C) is
   selected as the next access unit in decoding order.

   The decoding order recovery process then continues similarly for
   the following access units.

   Decoding order and dependency of NAL units per received RTP session
   with different jitter in sessions at buffering startup time:

   C: -------------(2,3)-(5,2)-(4,5)-(6,4)-(8,6)-(7,8)-(9,7)-
        |     |     |     |     |     |     |     |     |
   B: -(1,a)-(3,1)-(2,3)-(5,2)-(4,5)-(6,4)-(8,6)-(7,8)-(9,7)-
        |     |                 |     |                 |
   A: -------(3,a)-------------(4,3)-(6,4)-------------(9,6)-
   ---------------------------------------------------------->
   TS: [4]   [2]   [1]   [3]   [8]   [6]   [5]   [7]   [12]


   Key:
   A, B, C                - RTP sessions
   '( )'                  - (AUID, PAUID) a=any value in this example
   '|'                    - indicates corresponding NAL units of the
                            same access unit AU(TS[..]) in the RTP
                            sessions
   Integer values in '[]' - media Timestamp (TS), sampling time as
                            derived from RTP timestamps associated to
                            the access unit AU(TS[..]).

          Figure 1  Example for MST with different jitter in session at
                                     startup

6.1.1.2. Example 2 (Informative)

   The example shown in Figure 2 refers to three RTP sessions A, B and C
   containing a multiplexed SVC bitstream.  In the example, the
   dependency signaling [I-D.ietf-mmusic-decoding-dependency] indicates
   that Session A is the base RTP session, B is the first enhancement
   RTP session and depends on A, and C is the second enhancement RTP
   session and depends on A and B.  Sessions A, B and C represent
   different levels of temporal scalability.  Arbitrary AUID values
   have been used in the example.  The initial de-jittering is assumed
   to be handled as in the previous example and is not illustrated in
   Figure 2.

   At the beginning, the first access unit with data present in the base
   session is identified.  In this example, it is the access unit with
   AUID equal to 3 (TS[8]).  The preceding access unit (with AUID equal
   to 2 (TS[3])) is removed.

   The next NAL units in decoding order belong to the access units with
   AUID equal to 9, 5, and 1 for sessions A, B, and C, respectively,
   hence AUID(A)=9, PAUID(A)=3, AUID(B)=5, PAUID(B)=3, AUID(C)=1, and
   PAUID(C)=5.  All three sessions are present in the set of sessions S.
   As PAUID(C) is equal to AUID(B), the access unit with AUID equal to
   AUID(C) is not selected as the next access unit in decoding order.
   As PAUID(B) is not equal to AUID(A), the access unit with AUID equal
   to AUID(B) is selected as the next access unit in decoding order.

   The next NAL units in decoding order belong to the access units with
   AUID equal to 9, 8, and 1 for sessions A, B, and C, respectively,
   hence AUID(A)=9, PAUID(A)=3, AUID(B)=8, PAUID(B)=9, AUID(C)=1, and
   PAUID(C)=5.  All three sessions are present in the set of sessions S.
   As PAUID(C) is not equal to AUID(B) or AUID(A), the access unit with
   AUID equal to AUID(C) is selected as the next access unit in decoding
   order.  After that, the access unit with AUID equal to 4 is selected
   in the same way as the next in decoding order.

   The next NAL units in decoding order belong to the access units with
   AUID equal to 9, 8, and 7 for sessions A, B, and C, respectively,
   hence AUID(A)=9, PAUID(A)=3, AUID(B)=8, PAUID(B)=9, AUID(C)=7, and
   PAUID(C)=8.  All three sessions are present in the set of sessions S.
   As PAUID(C) is equal to AUID(B) and PAUID(B) is equal to AUID(A),
   neither the access unit with AUID equal to AUID(C) nor the one with
   AUID equal to AUID(B) is selected as the next access unit in decoding
   order.  As there is no session below session A, the access unit with
   AUID equal to AUID(A) is selected as the next access unit in decoding
   order.

   The decoding order recovery process then continues similarly for the
   following access units.

   Decoding order and dependency of NAL units per received RTP session:

   C: --(2,a)-------------(1,5)-(4,1)-------------(7,8)-(6,7)-

   B: --------------(5,3)-------------------(8,9)-------------

   A: --------(3,a)-------------------(9,3)-------------------
   ----------------------------------------------------------->
   TS:  [3]   [8]   [6]   [5]   [7]   [12]  [10]  [9]   [11]


   Key:
   A, B, C                - RTP sessions
   '( )'                  - (AUID, PAUID) a=any value in this example
   '|'                    - indicates corresponding NAL units of the
                            same access unit AU(TS[..]) in the RTP
                            sessions
   Integer values in '[]' - media Timestamp (TS), sampling time as
                            derived from RTP timestamps associated to
                            the access unit AU(TS[..]).

          Figure 2  Example for MST with different jitter in session at
                                     startup
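
   The selection steps of this example can be replayed with a short
   Python sketch.  It is illustrative only and not part of the payload
   format: the queue-based data model, the name recover_order_ni_a, and
   the use of None for "any value" are assumptions of this sketch.  AUID
   wraparound and packet loss are ignored, and the input is assumed to
   start at the first access unit with data in the base session.

```python
from collections import deque


def recover_order_ni_a(sessions):
    """Return AUIDs in recovered decoding order.

    sessions: list of deques, index 0 = base session, higher index =
    higher session.  Each deque holds (auid, pauid) pairs in the
    per-session decoding order.
    """
    order = []
    while any(sessions):
        # First not-yet-delivered access unit of every non-empty session.
        heads = {n: q[0] for n, q in enumerate(sessions) if q}
        # S: sessions whose head access unit has no data in a higher session.
        s = [n for n, (auid, _) in heads.items()
             if not any(auid == a
                        for q in sessions[n + 1:] for a, _ in q)]
        # Walk S from the highest session down and stop at the first
        # candidate whose PAUID does not point at a lower candidate.
        # The lowest member of S always qualifies, so the loop breaks.
        for m in sorted(s, reverse=True):
            auid_m, pauid_m = heads[m]
            if all(pauid_m != heads[i][0] for i in s if i < m):
                break
        order.append(auid_m)
        # Remove the selected access unit from every session carrying it.
        for q in sessions:
            if q and q[0][0] == auid_m:
                q.popleft()
    return order


# Figure 2 after startup: the access unit with AUID 2 is already removed.
session_a = deque([(3, None), (9, 3)])
session_b = deque([(5, 3), (8, 9)])
session_c = deque([(1, 5), (4, 1), (7, 8), (6, 7)])
print(recover_order_ni_a([session_a, session_b, session_c]))
# prints [3, 5, 1, 4, 9, 8, 7, 6]
```

   The printed order matches the selections described in the text above
   (AUID 3, 5, 1, 4, 9) and continues in the same way for the remaining
   access units.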

6.1.2. Decoding Order Recovery for the NI-TSD Mode

   The following process SHALL be applied when the NI-TSD session-
   multiplexing packetization mode is in use.

   The decoding order recovery SHOULD start from an access unit where
   NAL units are present for the base session, herein referred to as
   access unit F.  Any packets preceding the first received packet of
   access unit F in reception order SHOULD be discarded.  The decoding
   order of NAL units of access unit F is specified below.

   For subsequent access units to be ordered, the following applies.
   Let TS(n) and TSD(n) be the RTP timestamp and TSD values,
   respectively, of the first access unit in decoding order containing
   data in session n.  The first access unit in decoding order
   containing data in session n can be identified by the smallest value
   of RTP sequence number within session n (taking into account the
   potential wraparound of RTP sequence numbers) among those packets
   whose payloads have not been passed to the decoder yet.  Let a set of
   sessions S consist of those values of n for which NAL units are
   present in the first access unit in decoding order containing data in
   session n but are not present in a higher session in the same access
   unit.  In other words, the set of sessions S contains the highest
   session of those access units that are candidates for being next in
   decoding order.

   The next access unit in decoding order is the first access unit in
   decoding order of session m, where m is the greatest value within the
   set of sessions S for which TS(m) + TSD(m) * AUTICK (where AUTICK is
   the value of the sprop-au-tick media type parameter) is not equal to
   TS(i) for any value i within the set of sessions S that is less than
   m.  In other words, the next access unit in decoding order is found
   by investigating the candidate access units in session dependency
   order from the highest session to the lowest session according to the
   highest session for which the candidate access units contain NAL
   units.  The next access unit in decoding order is the first access
   unit in the above investigation order that is not indicated to follow
   any candidate access unit in a lower session in decoding order.  The
   decoding order of NAL units of the access unit having RTP timestamp
   equal to TS(m) is specified below.
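
   As a non-normative illustration, the selection rule can be sketched
   in Python as follows.  The function name select_next_ni_tsd and the
   dictionary layout are assumptions of this sketch; sessions are
   numbered so that 0 is the base session, and the set of sessions S is
   assumed to have been formed already.  The sum is reduced modulo 2^32
   because RTP timestamps are 32-bit values.

```python
def select_next_ni_tsd(candidates, au_tick=1):
    """Pick the next access unit among the candidates of set S.

    candidates: dict mapping a session index (0 = base session, larger
    = higher session) to the (ts, tsd) pair of that session's first
    not-yet-delivered access unit.  Only sessions in S are included.
    Returns the session index m of the selected access unit.
    """
    for m in sorted(candidates, reverse=True):
        ts_m, tsd_m = candidates[m]
        # RTP timestamp of the access unit indicated to precede
        # candidate m, with 32-bit wraparound.
        predecessor_ts = (ts_m + tsd_m * au_tick) % 2**32
        lower = [ts for i, (ts, _) in candidates.items() if i < m]
        if predecessor_ts not in lower:
            return m
    raise ValueError("candidate set S is empty")


# Illustrative values: TS(0)=12, TSD(0)=-4, TS(1)=6, TSD(1)=2,
# TS(2)=5, TSD(2)=1, with AUTICK equal to 1.
print(select_next_ni_tsd({0: (12, -4), 1: (6, 2), 2: (5, 1)}))
# prints 1
```

   In the usage line, session 2 is skipped because 5 + 1 equals the
   timestamp of the candidate in session 1, and session 1 is selected
   because 6 + 2 does not equal the timestamp of the candidate in
   session 0.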

         Informative note: In practical implementations, the set of
         sessions S can be formed by considering only those access
         units that have arrived within a certain inter-session jitter
         compensation period.  Consequently, it may not be necessary to
         wait for access units from all sessions to arrive at a
         particular time for decoding order recovery.

   If several NAL units share the same value of RTP timestamp, the order
   in which NAL units are passed to the decoder is specified as follows:

   o  Collect all NAL units NU(y) associated with the same value of RTP
      timestamp.

   o  Place the collected NAL units in the session dependency order
      specified according to [I-D.ietf-mmusic-decoding-dependency] and
      then in the consecutive order of appearance within each session
      into an access unit while satisfying the NAL unit order rules in
      SVC access units as specified in [SVC] and summarized as an
      informative algorithm in Section 6.1.3.
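
   The two steps above can be sketched in Python as follows.  This is a
   minimal illustration under stated assumptions: the NalUnit record and
   its field names are hypothetical, the session rank is assumed to be
   known from the dependency signaling, and the sketch does not apply
   the NAL unit order rules of [SVC], which an implementation would
   still have to satisfy.

```python
from typing import NamedTuple


class NalUnit(NamedTuple):
    session: int    # rank in session dependency order, 0 = base session
    position: int   # order of appearance within its session
    rtp_ts: int     # RTP timestamp of the carrying packet
    payload: bytes


def order_within_access_unit(nal_units, rtp_ts):
    """Collect the NAL units sharing rtp_ts and place them in session
    dependency order, then in order of appearance within each session."""
    same_au = [nu for nu in nal_units if nu.rtp_ts == rtp_ts]
    return sorted(same_au, key=lambda nu: (nu.session, nu.position))


received = [
    NalUnit(2, 0, 5, b"c0"),
    NalUnit(0, 1, 5, b"a1"),
    NalUnit(1, 0, 5, b"b0"),
    NalUnit(0, 0, 5, b"a0"),
    NalUnit(0, 2, 7, b"next-au"),
]
print([nu.payload for nu in order_within_access_unit(received, 5)])
# prints [b'a0', b'a1', b'b0', b'c0']
```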

6.1.2.1. Example 1 (Informative)

   The video stream in this example is identical to that of Section
   6.1.1.1.

   The example shown in Figure 3 refers to three RTP sessions A, B and C
   containing a multiplexed SVC bitstream.  In the example, the
   dependency signaling [I-D.ietf-mmusic-decoding-dependency] indicates
   that Session A is the base RTP session, B is the first enhancement
   RTP session and depends on A, and C is the second RTP enhancement
   session and depends on A and B.  In the example, Session A has the
   lowest frame rate and Session B and C have the same, but a higher
   frame rate (using a hierarchical prediction structure).

   Figure 3 shows an example for de-jitter buffering with different
   jitters present in the sessions, i.e. at buffering startup not all
   packets with the same timestamp are available in all the de-jittering
   buffers.  Jitter between the sessions is first assumed to be
   compensated by removing all NAL units preceding the NAL units with
   TS[1].

   At the next step, the first access unit with data present in the base
   session is identified.  In this example, it is the access unit with
   TS[8].  The preceding access units (with TS[1] and TS[3]) are
   removed.  NAL units of access unit with TS[8] are passed to the
   decoder in layer dependency order.

   The next access unit (with TS[6]) has NAL units present in each
   session, hence it is selected as the next access unit to be decoded.

   Within independent sessions the next NAL units in decoding order
   belong to the access unit with TS[5] (in sessions B and C) and to the
   access unit with TS[12] (in session A).  As session B and session A
   are not the highest sessions for the access units with TS[5] and
   TS[12], respectively, the set of sessions S consists of only one
   session, C, and the access unit with TS[5] is selected as the next
   access unit in decoding order.

   The decoding order recovery process then continues similarly for the
   following access units.

   Decoding order and dependency of NAL units per received RTP session
   with different jitter in sessions at buffering startup time:

   C: -------------(1)---(-2)--(-5)--(2)---(1)---(-2)--(-5)--
        |     |     |     |     |     |     |     |     |
   B: -( )---(2)---(1)---(-2)--(-5)--(2)---(1)---(-2)--(-5)--
        |     |                 |     |                 |
   A: -------(2)---------------(-6)--(2)---------------(-6)--
   ---------------------------------------------------------->
   TS: [4]   [2]   [1]   [3]   [8]   [6]   [5]   [7]   [12]


   Key:
   A, B, C                - RTP sessions
   '( )'                  - (TSD)
   '|'                    - indicates corresponding NAL units of the
                            same access unit AU(TS[..]) in the RTP
                            sessions
   Integer values in '[]' - media Timestamp (TS), sampling time as
                            derived from RTP timestamps associated to
                            the access unit AU(TS[..]).

          Figure 3  Example for MST with different jitter in session at
                                     startup
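
   How the set of sessions S collapses to a single session in this
   example can be illustrated with a small Python sketch.  The function
   name form_set_s and the list-based buffers are assumptions of this
   sketch; each buffer holds the RTP timestamps of the access units of
   one session that have not yet been passed to the decoder, in
   per-session decoding order, with index 0 being the base session.

```python
def form_set_s(buffers):
    """Return the indices of the sessions that belong to the set S.

    A session is in S when the first access unit in its buffer has no
    NAL units in any higher session.
    """
    s = []
    for n, buf in enumerate(buffers):
        if not buf:
            continue
        head_ts = buf[0]
        in_higher = any(head_ts in higher for higher in buffers[n + 1:])
        if not in_higher:
            s.append(n)
    return s


# Figure 3 after the access units with TS[8] and TS[6] have been
# passed to the decoder.
session_a = [12]
session_b = [5, 7, 12]
session_c = [5, 7, 12]
print(form_set_s([session_a, session_b, session_c]))
# prints [2]
```

   Only session C (index 2) remains in S, so the access unit with TS[5]
   is selected without any TSD comparison.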

6.1.2.2. Example 2 (Informative)

   The video stream in this example is identical to that of Section
   6.1.1.2.

   The example shown in Figure 4 refers to three RTP sessions A, B and C
   containing a multiplexed SVC bitstream.  In the example, the
   dependency signaling [I-D.ietf-mmusic-decoding-dependency] indicates
   that Session A is the base RTP session, B is the first enhancement
   RTP session and depends on A, and C is the second RTP enhancement
   session and depends on A and B.  Sessions A, B and C represent
   different levels of temporal scalability.  The initial de-jittering
   is assumed to be handled as in the previous example and is not
   illustrated in Figure 4.

   At the beginning, the first access unit with data present in the base
   session is identified.  In this example, it is the access unit with
   TS[8].  The preceding access unit (with TS[3]) is removed.

   The next NAL units in decoding order belong to the access units with
   TS[12], TS[6], and TS[5] for sessions A, B, and C, respectively,
   hence TS(A)=12, TSD(A)=-4, TS(B)=6, TSD(B)=2, TS(C)=5, and TSD(C)=1.
   AUTICK is assumed to be equal to 1 in this example.  All three
   sessions are present in the set of sessions S.  As
   TS(C) + TSD(C) = 5 + 1 = 6 = TS(B), the access unit with TS[5] is not
   selected as the next access unit in decoding order.  As
   TS(B) + TSD(B) = 6 + 2 = 8 is not equal to TS(A), the access unit
   with TS[6] is selected as the next access unit in decoding order.

   The next NAL units in decoding order belong to the access units with
   TS[12], TS[10], and TS[5] for sessions A, B, and C, respectively,
   hence TS(A)=12, TSD(A)=-4, TS(B)=10, TSD(B)=2, TS(C)=5, and TSD(C)=1.
   All three sessions are present in the set of sessions S.  As
   TS(C) + TSD(C) = 5 + 1 = 6 is not equal to TS(A) or TS(B), the access
   unit with TS[5] is selected as the next access unit in decoding
   order.  After that, the access unit with TS[7] is selected in the
   same way as the next in decoding order.

   The next NAL units in decoding order belong to the access units with
   TS[12], TS[10], and TS[9] for sessions A, B, and C, respectively,
   hence TS(A)=12, TSD(A)=-4, TS(B)=10, TSD(B)=2, TS(C)=9, and TSD(C)=1.
   All three sessions are present in the set of sessions S.  As
   TS(C) + TSD(C) = 9 + 1 = 10 = TS(B) and TS(B) + TSD(B) = 10 + 2 = 12
   = TS(A), neither the access unit with TS[9] nor the access unit with
   TS[10] is selected as the next access unit in decoding order.  As
   there is no session below session A, the access unit with TS[12] is
   selected as the next access unit in decoding order.

   The decoding order recovery process then continues similarly for the
   following access units.

   Decoding order and dependency of NAL units per received RTP session:

   C: --(-2)--------------(1)---(-2)--------------(1)---(-2)-

   B: --------------(2)---------------------(2)--------------

   A: --------(-4)--------------------(-4)-------------------
   ---------------------------------------------------------->
   TS:  [3]   [8]   [6]   [5]   [7]   [12]  [10]  [9]   [11]


   Key:
   A, B, C                - RTP sessions
   '( )'                  - (TSD)
   '|'                    - indicates corresponding NAL units of the
                            same access unit AU(TS[..]) in the RTP
                            sessions
   Integer values in '[]' - media Timestamp (TS), sampling time as
                            derived from RTP timestamps associated to
                            the access unit AU(TS[..]).

          Figure 4  Example for MST with different jitter in session at
                                     startup

6.1.3. Informative Algorithm for NI-A and NI-TSD Decoding Order Recovery
   within an Access Unit

   Section 7.1.1.1 of draft-ietf-avt-rtp-svc-13 applies.

7. Payload Format Parameters

   Section 8 of draft-ietf-avt-rtp-svc-13 applies.

7.1. Media Type Registration

   Section 8.1 of draft-ietf-avt-rtp-svc-13 applies with the following
   modifications.

      pmode:
         This parameter signals the properties of a NAL unit stream
         carried in more than one RTP session using MST or the
         capabilities of a receiver implementation.  When the value of
         pmode is equal to "NI-A", the NI-A mode MUST be used.  When
         the value of pmode is equal to "NI-TSD", the NI-TSD mode MUST
         be used.  This parameter MUST NOT be present when
         "packetization-mode" is present.

      sprop-au-tick:
         This parameter indicates the number of 90-kHz clock ticks used
         as a multiplier in the NI-TSD mode.  The parameter MUST NOT be
         present when pmode is not equal to "NI-TSD".  If the parameter
         is not present and the NI-TSD mode is in use, sprop-au-tick is
         inferred to be equal to 1.  The value of sprop-au-tick MUST be
         a positive integer.
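
   A receiver could check these constraints roughly as in the following
   Python sketch.  The function name check_mst_parameters and the
   assumption that the media type parameters have already been parsed
   into a dictionary of strings are illustrative choices, not part of
   this specification.

```python
def check_mst_parameters(params):
    """Validate pmode and sprop-au-tick; return (pmode, au_tick).

    params: dict mapping media type parameter names to string values.
    au_tick is None when the NI-TSD mode is not in use.
    """
    pmode = params.get("pmode")
    if pmode is not None and "packetization-mode" in params:
        raise ValueError(
            "pmode MUST NOT be present with packetization-mode")
    au_tick = None
    if "sprop-au-tick" in params:
        if pmode != "NI-TSD":
            raise ValueError(
                "sprop-au-tick MUST NOT be present unless pmode is NI-TSD")
        au_tick = int(params["sprop-au-tick"])
        if au_tick <= 0:
            raise ValueError("sprop-au-tick MUST be a positive integer")
    elif pmode == "NI-TSD":
        au_tick = 1  # inferred value when the parameter is absent
    return pmode, au_tick


print(check_mst_parameters({"pmode": "NI-TSD", "sprop-au-tick": "3000"}))
# prints ('NI-TSD', 3000)
```

   The value 3000 in the usage line is only an example: it corresponds
   to one frame interval of 30 Hz video on a 90 kHz clock
   (90000 / 30 = 3000), so that a TSD value of 2 would stand for a
   timestamp difference of 6000 ticks.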

7.2. SDP Parameters

   Section 8.2 of draft-ietf-avt-rtp-svc-13 applies.

7.3. Examples

   Section 8.3 of draft-ietf-avt-rtp-svc-13 applies.

7.4. Parameter Set Considerations

   Section 8.4 of draft-ietf-avt-rtp-svc-13 applies.

8. Security Considerations

   Section 9 of draft-ietf-avt-rtp-svc-13 applies.

9. Congestion Control

   Section 10 of draft-ietf-avt-rtp-svc-13 applies.

10. IANA Considerations

   Section 11 of draft-ietf-avt-rtp-svc-13 applies.

11. Informative Appendix: Application Examples

   Section 12 of draft-ietf-avt-rtp-svc-13 applies.

12. References

12.1. Normative References

   Section 13.1 of draft-ietf-avt-rtp-svc-13 applies with the following
   additions.

   [I-D.ietf-avt-rtp-svc]  Wenger, S., Wang, Y.-K., Schierl, T., and
             Eleftheriadis, A., "RTP payload format for SVC video",
             draft-ietf-avt-rtp-svc-13 (work in progress), July 2008.

   [I-D.lennox]   Lennox, J., Schierl, T., and Ganesan, S., "Real-Time
             Transport Protocol (RTP) Timestamps for Layered Encodings",
             draft-lennox-avt-rtp-layered-encoding-timestamps-00, June
             2, 2008.

12.2. Informative References

   Section 13.2 of draft-ietf-avt-rtp-svc-13 applies.

13. Authors' Addresses

   Miska M. Hannuksela
   Nokia Research Center
   P.O. Box 1000
   33721 Tampere
   Finland

   Phone: +358-7180-73151
   EMail: miska.hannuksela@nokia.com

   Ye-Kui Wang
   Nokia Research Center
   P.O. Box 1000
   33721 Tampere
   Finland

   Phone: +358-50-466-7004
   EMail: ye-kui.wang@nokia.com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.