MMUSIC Working Group                                         T. Schierl
Internet Draft
Document: draft-schierl-mmusic-layered-codec-01
Expires: April 2007
                                                           October 2006





            Signaling of layered and multi description media
                 in Session Description Protocol (SDP)

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 23, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).


Abstract

This memo defines semantics that allow for signaling decoding dependency
of different media descriptions with the same media type in the Session
Description Protocol (SDP).  This is required, for example, if media
data is separated and transported in different network streams as a
result of the use of a layered media coding process.

INTERNET-DRAFT    draft-schierl-mmusic-layered-codec-01    October 2006

A new grouping type "DDP" -- decoding dependency -- is defined, to be
used in conjunction with RFC 3388 entitled "Grouping of Media Lines in
the Session Description Protocol".  In addition, an attribute is
specified describing the relationship of the media streams in a "DDP"
group.
Finally, this memo defines SDP semantics indicating SSRC multiplexing
for media sessions in case RTP is used as the protocol for media
transport.

[Ed. note: This is one of the key questions: should this draft address
RTP specifics?  Should it address a concept that may make sense in niche
applications for SVC, but perhaps nowhere else?  Or should we move the
SSRC stuff to the SVC payload spec instead?]










































Schierl                     Standards Track                   [page 2]



Table of Contents

   1.   Introduction.................................................4
   2.   Terminology..................................................4
   3.   Motivation and use cases.....................................4
   3.1.   Motivation for media dependency signaling..................4
   3.2.   Use cases for layered and MDC coding and transport.........6
   4.   Signaling in SDP for media dependency........................6
   4.1.   Design Principles..........................................6
   4.2.   Definitions................................................7
   4.3.   Semantics..................................................8
   4.3.1.  SDP grouping semantics for decoding dependency............8
   4.3.2.  Attribute for dependency signaling per media-stream.......8
   4.3.3.  Attribute for signaling implicit SSRC multiplexing........9
   5.   Usage of new semantics in SDP...............................10
   5.1.1.  Usage with the SDP Offer/Answer Model....................10
   5.1.2.  Network elements not supporting dependency signaling.....10
   5.2.   Examples..................................................10
   6.   Security Considerations.....................................12
   7.   IANA Consideration..........................................12
   8.   Acknowledgements............................................12
   9.   References..................................................12
   9.1.   Normative References......................................12
   9.2.   Informative References....................................13
   10.  Author's Addresses..........................................13
   11.  Intellectual Property Statement.............................13
   12.  Disclaimer of Validity......................................14
   13.  Copyright Statement.........................................14
   14.  RFC Editor Considerations...................................14
   15.  Open Issues.................................................14
   16.  Changes Log.................................................14

























1. Introduction

   An SDP session description may contain one or more media
   descriptions, each identifying a single media stream.  A media
   description is identified by one "m=" line.  If more than one "m="
   line exists for the same media type, a receiver or network element
   cannot identify any relationship between those "m=" lines.  This is
   particularly true if the receiver or network element is not aware of
   the media-specific information that may be carried in the "fmtp:"
   attribute.

   Recently, interest has been expressed in signaling relationships
   between media streams.  Several reasons can be envisioned, for
   example the transport of bitstream partitions of a hierarchical
   media coding process (also known as a layered media coding process)
   or of a multi description coding (MDC) process in different network
   streams.  The trigger for this draft has been the standardization
   process of the SVC payload format [SVCpayld].

   At present, SDP does not allow for signaling such relations.

   This memo also defines signaling extensions specifically for use
   with SSRC multiplexing techniques when RTP is used as the transport
   protocol.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119].

3. Motivation and use cases

3.1. Motivation for media dependency signaling

   There may be various reasons for the concurrent transport of various
   media (as identified by a media description) of the same media type,
   among which certain dependencies may exist.  But the basic idea for
   all cases is the separation of partitions of a media bitstream to
   allow scalability in network elements.

   Two types of dependency are discussed in the following in more
   detail, as they are conceptually well understood:

   o Layered/Hierarchical decoding dependency:




   In layered coding, the partitions of a media bitstream are known as
   media layers or simply layers.  One or more layers may be transported
   in different network streams.  A classic use case is known as
   receiver-driven layered multicast, in which a receiver selects a
   combination of media streams conveyed in their own (in this case
   RTP) sessions in response to quality or bit-rate requirements.

   Back in the mid 1990s, the then available layered media formats and
   codecs envisioned primarily (or even exclusively) a one-dimensional
   hierarchy of layers.  That is, each so-called enhancement layer
   referred to exactly one layer "below".  The single exception has been
   the base layer, which is self-contained.  Therefore, an
   identification of one enhancement layer fully specifies the operation
   point of a layered decoding scheme, including knowledge about all the
   other layers that need to be decoded.
   [RFC4566] contains rudimentary support for exactly this use case and
   media formats, in that it allows for signaling a range of transport
   addresses for a certain media description.  By definition, a higher
   transport address identifies a higher layer in the one-dimensional
   hierarchy.  A receiver needs only to decode data conveyed over this
   transport address and lower transport addresses to decode this
   operation point of the scalable bit stream.
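
   As an illustration of the one-dimensional case, the following Python
   sketch expands a contiguous range of transport addresses and returns
   those a receiver would need for a given layer.  The function names
   and the addresses used are hypothetical; only the rule "a higher
   transport address identifies a higher layer" is taken from the text
   above.

```python
import ipaddress


def layer_addresses(base_address, layer_count):
    """Expand a contiguous IPv4 range, one address per layer."""
    base = ipaddress.IPv4Address(base_address)
    return [str(base + offset) for offset in range(layer_count)]


def addresses_for_layer(base_address, layer_count, layer_index):
    """Addresses needed to decode layer `layer_index` (0 = base layer).

    In a one-dimensional hierarchy a layer needs itself and all
    lower layers, i.e. this address and every lower address.
    """
    if not 0 <= layer_index < layer_count:
        raise ValueError("layer index out of range")
    return layer_addresses(base_address, layer_count)[: layer_index + 1]


# Hypothetical range of three addresses starting at 224.2.1.1.
print(addresses_for_layer("224.2.1.1", 3, 1))
```

   The sketch shows why this scheme cannot express tree-shaped
   hierarchies: the set of required streams is always a prefix of the
   address range.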

   Newer media formats depart from this simple one-dimensional
   hierarchy, in that highly complex (at least tree-shaped) dependency
   hierarchies can be implemented.  Compelling use cases for these
   complex hierarchies have been identified by industry as well.
   Support for them is therefore desirable.  However, SDP in its
   current form does not take into account that different combinations
   of the layers of a layered media bitstream result in different
   operation points (each represented by a layer or a combination of
   layers) of the media bitstream.

   o Multi descriptive decoding dependency:

   In the most basic form of multiple descriptive coding (MDC), each
   partition forms an independent representation of the media.  That is,
   decoding of any of the partitions yields useful reproduced media data.
   When more than one partition is available, then a decoder can process
   them jointly, and the resulting media quality increases.  The highest
   reproduced quality is available if all original partitions are
   available for decoding.

   More complex forms of multiple descriptive coding can also be
   envisioned, for example where, as a minimum, N out of M total
   partitions need to be available to allow meaningful decoding.





   MDC has not yet been embraced heavily by the media standardization
   community, though it is the subject of a lot of academic research.
   As an example, we refer to [MDC].

3.2. Use cases for layered and MDC coding and transport

   o Receiver driven layered multicast
   This technology is discussed in [RFC3550] and references therein.  We
   refrain from elaborating further; the subject is well known and
   understood.

   o Multiple end-to-end transmission with different properties
   Assume a unicast (point-to-point) topology, wherein one endpoint
   sends media to another.  Assume further that different forms of media
   transmission are available.  The difference may lie in the cost of
   the transmission (free, charged), in the available protection
   (unprotected/secure), in the quality of service (guaranteed quality /
   best effort) or other factors.
   Layered and MDC coding allow the media characteristics to be matched
   to the available transmission path.  For example, in layered coding
   it makes sense to convey the base layer over high QoS and/or over an
   encrypted transmission path.  Enhancement layers, on the other hand,
   can be conveyed over best effort, as they are "optional" in their
   characteristic -- nice to have, but non-essential for media
   consumption.  Similarly, while it is essential that the base layer is
   encrypted, there is (at least conceptually) no need to encrypt the
   enhancement layer, as the enhancement layer may be meaningless
   without the (encrypted) base layer.  In a different scenario, the
   base layer may be offered in a non-encrypted session as a free
   preview.  And an encrypted enhancement layer allowing optimal quality
   play-back may be only accessible for users activated by a conditional
   access mechanism.

   o Differentiation on transport level within a media stream (e.g. RTP
   session):
   An application may benefit from a more detailed differentiation on
   transport level.  This may particularly be the case when RTP is used
   with SSRC multiplexing as described in section 13.5 of [SVCpayld].

4. Signaling in SDP for media dependency

4.1. Design Principles

   Dependency signaling is only feasible between media descriptions
   that are each described by an "m=" line and have an assigned media
   identification attribute ("mid") as defined in [RFC3388].





   If an application requires SSRC multiplexing to be used, this memo
   describes a media level attribute for signaling the use of this RTP
   multiplexing type.

4.2. Definitions

   Media stream:
   As used in [RFC4566].

   Media bitstream:
   A valid, decodable stream, containing ALL media partitions generated
   by the encoder.  A media bitstream normally conforms to a media
   coding standard.

   Media partition:
   A subset of a media bitstream intended for independent
   transportation.  An integer number of partitions form a media
   bitstream.  In layered coding, a media partition represents a layer.
   In MDC coding, a media partition represents a description.

   Decoding dependency:
   The class of relationship media partitions have to each other.  At
   present, this memo defines two decoding dependencies: layering and
   multiple description.

   Hierarchical/layered coding dependency:
   Each media partition is only useful (i.e. can be decoded) when ALL
   media partitions it depends on are available.  The dependencies
   between the media partitions create a directed graph.  Note:
   normally, in layered/hierarchical coding, the more media partitions
   are employed (following the rule above), the better the reproduced
   quality becomes.

   Multi description coding (MDC) dependency:
   N of M media partitions are required to form a valid media bitstream,
   but there is no hierarchy between these media partitions.  Most MDC
   schemes aim at an increase of reproduced media quality when more
   media partitions are decoded than necessarily required to form an
   Operation Point.

   Operation point:
   A subset of a layered or MDC media bitstream that includes all
   partitions required for reconstruction at a certain point of quality
   or error resilience, and does not include any other Media Partitions.
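
   The two dependency classes defined above can be sketched in Python
   as follows.  This is a minimal illustration under stated
   assumptions, not part of the signaling: the partition names and the
   example graph are hypothetical, and the value of N for MDC is
   assumed to be known from the media format, since it is not signaled.

```python
def operation_point(target, depends_on):
    """Layered case: the target partition plus everything it needs.

    `depends_on` maps a partition to the partitions it depends on
    directly; the result follows the directed graph transitively.
    """
    needed = set()
    stack = [target]
    while stack:
        partition = stack.pop()
        if partition in needed:
            continue
        needed.add(partition)
        stack.extend(depends_on.get(partition, ()))
    return needed


def mdc_decodable(received, n_required):
    """MDC case: any N distinct descriptions suffice, with no hierarchy."""
    return len(set(received)) >= n_required


# Hypothetical tree-shaped hierarchy with two branches above the base.
graph = {
    "base": [],
    "spatial": ["base"],
    "temporal": ["base"],
    "spatial-snr": ["spatial"],
}
print(sorted(operation_point("spatial-snr", graph)))
print(mdc_decodable(["d1", "d3"], n_required=2))
```

   Note that "temporal" is not part of the operation point reached via
   "spatial-snr", which is exactly what a one-dimensional address range
   cannot express.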


   The following terms are itemized for clarification on RTP [RFC3550]
   multiplexing techniques.  Further discussion can be found in section
   5.2 of [RFC3550].



   Session multiplexing:  The scalable SVC bitstream is distributed onto
   different RTP sessions, whereby each RTP session carries one RTP
   packet stream.  Each RTP session requires separate signaling and
   has a separate Timestamp, Sequence Number, and SSRC space.
   Dependency between sessions MUST be signaled according to this memo.

   SSRC multiplexing:  The scalable SVC bitstream is distributed in a
   single RTP session, but that session comprises more than one RTP
   packet stream, identified by its SSRC.
   The use of SSRC multiplexing MUST be signaled according to this memo.

4.3. Semantics

4.3.1.    SDP grouping semantics for decoding dependency

   This specification defines the new grouping semantics
   Decoding Dependency "DDP":

   DDP associates a media stream, identified by its mid attribute, with
   a DDP group.  Each media stream MUST be composed of an integer number
   of media partitions.  All media streams of a DDP group MUST have the
   same type of coding dependency (as signaled by the attribute defined
   in 4.3.2) and MUST belong to one media bitstream.  All media streams
   MUST be part of at least one operation point.  The DDP group type
   informs a receiver about the requirement for treating the media
   streams of the group according to the new media level attribute
   "depend", as defined in 4.3.2.

4.3.2.    Attribute for dependency signaling per media-stream

   This memo defines a new media-level value attribute, "depend", with
   the following BNF [RFC2234]. The "identification-tag" (if used) is
   defined in [RFC3388]:

          depend-attribute     = "a=depend:" dependency-type-tag
                                  *(space identification-tag)
          dependency-type-tag  = dependency
          dependency           = "lay" / "mdc" / "lay-ssrc"


   The "depend"-attribute describes the decoding dependency.  The
   "depend"-attribute may be followed by a sequence of identification-
   tag(s) which identify the directly related media streams.  The
   attribute MAY be used with multicast as well as with unicast
   transport addresses.  The following types of dependencies are
   defined:




   o lay:  Layered decoding dependency -- identifies the described media
   stream as one or more partitions of a layered media bitstream.  When
   lay is used, all media streams that are required for successful use
   of the described media stream MUST be identified by the
   identification-tag(s) that follow.  The identification-tag(s) MUST
   be present when lay is in use.  Further, the described media stream
   represents one operation point of the layered media bitstream.  As a
   result, all other media streams belonging to the same dependency
   group, but not identified by an identification-tag in the media
   description, are not required for a successful reproduction of the
   operation point.  Hence, a media sender MAY omit sending them when
   that is advantageous from a scalability or transport viewpoint.

   o lay-ssrc: Layered decoding dependency within a media stream.  This
   value indicates the presence of a hierarchical relationship within
   the media stream.  For more details refer to section 4.3.3.  This
   value MUST NOT be used with an identification-tag.

   o mdc:  Multi descriptive decoding dependency -- signals that the
   described media stream is one or more partitions of a multi
   description coding (MDC) media bitstream.  By definition, at least N
   out of M streams of the group MUST be received to allow decoding of
   the media, whereby N and M are media stream dependent and not
   signaled.  Receiving more than one media stream of the group may
   enhance the decodable quality of the media bitstream.  This type of
   dependency does not require signaling of the media streams depended
   upon.
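
   The syntax and the per-type rules above can be sketched as a small
   Python parser.  This is an illustration only: the function name is
   hypothetical, a single space is assumed as the separator, and the
   identification-tag is treated simply as a run of non-space
   characters rather than the full token definition of [RFC3388].

```python
import re

# "lay-ssrc" is listed before "lay" so the longer value is tried first.
DEPEND_RE = re.compile(r"^a=depend:(lay-ssrc|lay|mdc)((?: [^ ]+)*)$")


def parse_depend(line):
    """Return (dependency_type, identification_tags) for one SDP line."""
    match = DEPEND_RE.match(line)
    if match is None:
        raise ValueError("not a depend attribute: %r" % line)
    dependency_type = match.group(1)
    tags = match.group(2).split()
    if dependency_type == "lay" and not tags:
        # lay: the identification-tag(s) MUST be present.
        raise ValueError("lay requires at least one identification-tag")
    if dependency_type == "lay-ssrc" and tags:
        # lay-ssrc: MUST NOT be used with an identification-tag.
        raise ValueError("lay-ssrc must not carry identification-tags")
    return dependency_type, tags


print(parse_depend("a=depend:lay 1 3"))
print(parse_depend("a=depend:mdc"))
print(parse_depend("a=depend:lay-ssrc"))
```

   For mdc the sketch accepts a line with or without tags, since the
   text only states that signaling them is not required.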

4.3.3.    Attribute for signaling implicit SSRC multiplexing

   This specification defines a new media-level value attribute,
   "ssrcmux".  Its format in SDP is described by the following BNF
   [RFC2234]:

          ssrcmux-attribute     = "a=ssrcmux:" 1*DIGIT

   The "ssrcmux" attribute indicates that implicit SSRC multiplexing is
   used.  Therefore the transport protocol of the media MUST be RTP
   [RFC3550] and the RTP profile MUST be one of RTP/AVP [RFC3551],
   RTP/SAVP [RFC3711], RTP/AVPF [RFC4585], or RTP/SAVPF [SAVPF].
   Implicit SSRC multiplexing implies that layers or combinations of
   layers are conveyed in their own respective RTP packet streams
   within the same RTP session.  The dependency order, from more to
   less important layers, is indicated by the SSRC values: the higher
   the importance of a layer, the higher its SSRC value.  The number
   following the "ssrcmux" attribute indicates the number of SSRC
   values used, and therefore the number of different RTP packet
   streams within a media description.  This attribute SHALL only be
   used in combination with the a=depend:lay-ssrc attribute.


   This signaling SHALL NOT be used with multicast transport addresses.
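
   The constraints on "ssrcmux" can be collected into one check, as in
   the following Python sketch.  The function names and the way the
   connection address is passed in are assumptions made for the
   illustration; the rules themselves are those stated in this section.

```python
import ipaddress
import re

ALLOWED_PROFILES = {"RTP/AVP", "RTP/SAVP", "RTP/AVPF", "RTP/SAVPF"}


def check_ssrcmux(media_lines, connection_address):
    """Validate one media description that uses a=ssrcmux.

    media_lines: the "m=" line followed by its attribute lines.
    connection_address: address part of the applicable "c=" line.
    Returns the announced number of RTP packet streams.
    """
    lines = [line.strip() for line in media_lines]
    fields = lines[0].split()  # m=<media> <port> <proto> <fmt> ...
    if len(fields) < 4 or fields[2] not in ALLOWED_PROFILES:
        raise ValueError("ssrcmux requires one of the listed RTP profiles")
    count = None
    for line in lines[1:]:
        match = re.fullmatch(r"a=ssrcmux:([0-9]+)", line)
        if match:
            count = int(match.group(1))
    if count is None:
        raise ValueError("no valid a=ssrcmux attribute found")
    if "a=depend:lay-ssrc" not in lines:
        raise ValueError("ssrcmux is only used with a=depend:lay-ssrc")
    host = connection_address.split("/")[0]  # drop any TTL suffix
    if ipaddress.ip_address(host).is_multicast:
        raise ValueError("ssrcmux is not used with multicast addresses")
    return count


def order_by_importance(ssrcs):
    """Most important first: a higher SSRC value means higher importance."""
    return sorted(ssrcs, reverse=True)
```

   A receiver could compare the returned count with the number of
   distinct SSRC values it observes and use order_by_importance() to
   decide which RTP packet streams to drop first.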

5. Usage of new semantics in SDP

5.1.1.    Usage with the SDP Offer/Answer Model

   If an Answerer does not understand the decoding dependency signaling,
   it SHOULD be able to detect the 'base' media only for a layered media
   session or SHOULD be able to detect only one partition of an MDC
   media session.  That is, the session description MUST offer a
   backward compatible partition of the media stream with a separate
   media description.  This media description may point to the same
   transport address as used for an extended media session description
   using the features defined in this memo.  Thus, in both cases, an
   Answerer may not understand the full media description, but may be
   able to request a valid subset of the offered media.

   If an Offerer is not able to interpret the decoding dependency
   signaling, the Offerer SHALL NOT offer the features defined in this
   memo.

5.1.2.    Network elements not supporting dependency signaling

   Network elements that do not understand the new grouping type, but
   understand grouping in general, MAY detect a general requirement of
   treating the media streams of the group in a certain way.  Network
   elements that do not understand the decoding dependency signaling MAY
   treat all media streams of a session in the same way or MAY use their
   knowledge about the media format description for treatment of media
   streams, if such knowledge does exist.  Receivers that do not
   understand the signaling defined in this memo may detect a subset of
   the separated media only, thus the receiver may not understand the
   full media description, but may be able to understand and/or request
   a subset of the media.

5.2. Examples

   a.)  Example for signaling transport of operation points of a layered
        video bitstream in different network streams:


          v=0
          o=svcsrv 289083124 289083124 IN IP4 host.example.com
          s=LAYERED VIDEO SIGNALING Seminar
          t=0 0

          c=IN IP4 224.2.17.12/127
          a=group:DDP 1 2 3 4



          m=video 40000 RTP/AVP 94
          b=AS:96
          a=framerate:15
          a=rtpmap:94 h264/90000
          a=mid:1

          m=video 40002 RTP/AVP 95
          b=AS:64
          a=framerate:15
          a=rtpmap:95 svc1/90000
          a=mid:2
          a=depend:lay 1

          m=video 40004 RTP/AVP 96
          b=AS:128
          a=framerate:30
          a=rtpmap:96 svc1/90000
          a=mid:3
          a=depend:lay 1

          m=video 40004 RTP/SAVP 100
          c=IN IP4 224.2.17.13/127
          b=AS:512
          k=uri:conditional-access-server.example.com
          a=framerate:30
          a=rtpmap:100 svc1/90000
          a=mid:4
          a=depend:lay 1 3
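
   To make example a.) concrete, the following Python sketch derives,
   for each media description, the media streams that form its
   operation point and the sum of their b=AS values.  It relies on the
   rule that a=depend:lay lists all required media streams.  The
   function names are hypothetical, only the lines needed for the
   calculation are reproduced, and adding up b=AS values is merely an
   estimate made for this illustration.

```python
EXAMPLE_A_MEDIA = """\
m=video 40000 RTP/AVP 94
b=AS:96
a=mid:1
m=video 40002 RTP/AVP 95
b=AS:64
a=mid:2
a=depend:lay 1
m=video 40004 RTP/AVP 96
b=AS:128
a=mid:3
a=depend:lay 1
m=video 40004 RTP/SAVP 100
b=AS:512
a=mid:4
a=depend:lay 1 3
"""


def parse_layers(sdp_text):
    """Map each mid to its b=AS value and the mids listed by depend:lay."""
    layers = {}
    current = None
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            current = {"bandwidth": 0, "depends_on": []}
        elif current is None:
            continue  # session-level line, not needed here
        elif line.startswith("b=AS:"):
            current["bandwidth"] = int(line[len("b=AS:"):])
        elif line.startswith("a=mid:"):
            layers[line[len("a=mid:"):]] = current
        elif line.startswith("a=depend:lay "):
            current["depends_on"] = line.split()[1:]
    return layers


def operation_point_bandwidth(mid, layers):
    """Sum b=AS over the described stream and every stream it lists."""
    needed = {mid, *layers[mid]["depends_on"]}
    return sum(layers[m]["bandwidth"] for m in needed)


layers = parse_layers(EXAMPLE_A_MEDIA)
for mid in sorted(layers):
    print(mid, operation_point_bandwidth(mid, layers))
# prints: 1 96, 2 160, 3 224, 4 736 (one pair per line)
```

   A receiver choosing the operation point of mid 4 therefore needs
   the media streams with mid 1, 3 and 4, but not the one with mid 2.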


   b.)  Example for signaling transport of streams of a multi
        description (MDC) video bitstream in different network streams:

          v=0
          o=mdcsrv 289083124 289083124 IN IP4 host.example.com
          s=MULTI DESCRIPTION VIDEO SIGNALING Seminar
          t=0 0

          c=IN IP4 224.2.17.12/127
          a=group:DDP 1 2 3
          m=video 40000 RTP/AVP 94
          a=mid:1
          a=depend:mdc

          m=video 40002 RTP/AVP 95
          a=mid:2
          a=depend:mdc

          m=video 40004 RTP/AVP 96
          c=IN IP4 224.2.17.13/127
          a=mid:3
          a=depend:mdc

   c.)  Example for signaling implicit SSRC multiplexing for an RTP
        session containing three RTP packet streams:

          v=0
          o=svcsrv 289083124 289083124 IN IP4 host.example.com
          s=LAYERED SSRC MUX VIDEO SIGNALING Seminar
          t=0 0

          c=IN IP4 131.160.1.112

          m=video 40000 RTP/AVP 96
          b=AS:512
          a=framerate:30
          a=rtpmap:96 svc1/90000
          a=ssrcmux:3
          a=depend:lay-ssrc


6. Security Considerations


7. IANA Consideration


8. Acknowledgements

   Funding for the RFC Editor function is currently provided by the
   Internet Society.  Further, the author Thomas Schierl of Fraunhofer
   HHI is sponsored by the European Commission under the contract number
   FP6-IST-0028097, project ASTRALS.

9. References

9.1. Normative References

[RFC4566]    Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
             Description Protocol", RFC 4566, July 2006.
[RFC3388]    Camarillo, G., Holler, J., and H. Schulzrinne, "Grouping of
             Media Lines in the Session Description Protocol (SDP)",
             RFC 3388, December 2002.
[RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.




[RFC3550]    Schulzrinne, H., Casner, S., Frederick, R., and V.
             Jacobson, "RTP: A Transport Protocol for Real-Time
             Applications", STD 64, RFC 3550, July 2003.
[RFC2234]    Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
             Specifications: ABNF", RFC 2234, November 1997.
[RFC3551]    Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
             Video Conferences with Minimal Control", STD 65, RFC 3551,
             July 2003.
[RFC3711]    Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
             Norrman, "The Secure Real-time Transport Protocol (SRTP)",
             RFC 3711, March 2004.
[RFC4585]    Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
             "Extended RTP Profile for Real-time Transport Control
             Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
             2006.
[SAVPF]      Ott, J., and E. Carrara, "draft-ietf-avt-profile-savpf-
             08.txt", October 2006.


9.2. Informative References

[SVCpayld]   Wenger, S., Wang, Y.-K., and T. Schierl, "RTP Payload Format
             for SVC Video", "draft-wenger-avt-rtp-svc-03.txt",
             October 2006.
[RFC3984]    Wenger, S., Hannuksela, M., Stockhammer, T.,
             Westerlund, M. and D. Singer, "RTP Payload Format for H.264
             Video", RFC 3984, February 2005.
[MDC]        Vitali, A., Borneo, A., Fumagalli, M., and R. Rinaldo,
             "Video over IP using Standard-Compatible Multiple
             Description Coding: an IETF proposal", Packet Video
             Workshop, April 2006, Hangzhou, China.

10.  Author's Addresses

   Thomas Schierl                       Phone: +49-30-31002-227
   Fraunhofer HHI                       Email: schierl@hhi.fhg.de
   Einsteinufer 37
   D-10587 Berlin
   Germany

11.  Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

12.  Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

13.  Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

14.  RFC Editor Considerations

   none


















