MMUSIC Working Group                                         T. Schierl
Internet Draft                                           Fraunhofer HHI
Intended status: Standards Track                              S. Wenger
Expires: May 11, 2008                                             Nokia
                                                      November 12, 2007

  Signaling media decoding dependency in Session Description Protocol

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on May 11, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).


This memo defines semantics that allow for signaling the decoding
dependency of different media descriptions with the same media type in
the Session Description Protocol (SDP).  This is required, for example,
if media data is separated and transported in different network streams
as a result of the use of a layered or multiple descriptive media coding

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

A new grouping type "DDP" -- decoding dependency -- is defined, to be
used in conjunction with RFC 3388 entitled "Grouping of Media Lines in
the Session Description Protocol".  In addition, an attribute is
specified describing the relationship of the media streams in a "DDP"

Schierl & Wenger         Expires May 11, 2008             [page 2]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

Table of Content

   1.  Introduction .................................................. 4
   2.  Terminology ................................................... 4
   3.  Definitions ................................................... 5
   4.  Motivation, Use Cases, and Architecture ....................... 6
   4.1.  Motivation .................................................. 6
   4.2.  Use cases ................................................... 7
   5.  Signaling Media Dependencies .................................. 8
   5.1.  Design Principles ........................................... 8
   5.2.  Semantics ................................................... 8
   5.2.1.  SDP grouping semantics for decoding dependency ............ 8
   5.2.2.  Attribute for dependency signaling per media-stream ....... 9
   6.  Usage of new semantics in SDP ................................ 10
   6.1.  Usage with the SDP Offer/Answer Model ...................... 10
   6.2.  Declarative usage .......................................... 10
   6.3.  Usage with Capability Negotiation .......................... 10
   6.4.  Examples ................................................... 10
   7.  Security Considerations ...................................... 12
   8.  IANA Considerations .......................................... 12
   9.  Acknowledgements ............................................. 12
   10. References ................................................... 13
   10.1.  Normative References ...................................... 13
   10.2.  Informative References .................................... 13
   11. Author's Addresses ........................................... 13
   12. Intellectual Property Statement .............................. 14
   13. Disclaimer of Validity ....................................... 14
   14. Copyright Statement .......................................... 14
   15. RFC Editor Considerations .................................... 14
   16. Change Log: .................................................. 15

Schierl & Wenger         Expires May 11, 2008             [page 3]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

1. Introduction

   An SDP session description may contain one or more media
   descriptions, each identifying a single media stream.  A media
   description is identified by one "m=" line.  Today, if more than one
   "m=" lines exist indicating the same media type, a receiver cannot
   identify a specific relationship between those media.

   A Multiple Description Coding (MDC) or layered Media Bitstream
   contains, by definition, one or more Media Partitions that are
   conveyed in their own media stream.  In Multi View Coding (MVC) [MVC]
   dependencies between views are used for increasing coding efficiency.
   The cases we are interested in are a layered, MDC and MVC Bitstreams
   with two or more Media Partitions.  Carrying more than one Media
   Partition in its own session is one of the key use cases for
   employing layered or MDC coded media.  In MVC, different views or
   Media Partitions, all e.g. depending on a base view, are conveyed in
   different sessions.  Senders, network elements, or receivers can
   suppress sending/forwarding/subscribing/decoding individual Media
   Partitions and still preserve perhaps suboptimal, but still useful
   media quality.

   One property of all Media Bitstreams relevant to this memo is that
   their Media Partitions have a well-defined usage relationship.  For
   example, in layered coding, "higher" Media Partitions are useless
   without "lower" ones.  In MDC coding, Media Partitions are
   complementary -- the more Media Partitions one receives, the better
   the reproduced quality may be possible.  At present, SDP and its
   supporting infrastructure of RFCs lack the means to express such a
   usage relationship.

   Trigger for the present memo has been the standardization process of
   the RTP payload format for the Scalable Video Coding extension to
   ITU-T Rec. H.264 / MPEG-4 AVC [I-D.ietf-avt-rtp-svc].  When drafting
   [I-D.ietf-avt-rtp-svc] , it was observed that the aforementioned lack
   in signaling support is one that's not specific to SVC, but applies
   to all layered or MDC codecs.  Therefore, this memo presents a
   generic solution.

   The mechanisms defined herein are media transport protocol
   independent, i.e. applicable beyond the use of RTP [RFC3550].

   The SDP grouping of Media Lines of different media types is out of
   scope of this memo.

2. Terminology

Schierl & Wenger         Expires May 11, 2008             [page 4]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   document are to be interpreted as described in BCP 14, RFC 2119

3. Definitions

   Media stream:
   As per [RFC4566].

   Media Bitstream:
   A valid, decodable stream, containing all media partitions generated
   by the encoder.  A Media Bitstream normally conforms to a media
   coding standard.

   Media Partition:
   A subset of a Media Bitstream intended for independent
   transportation.  An integer number of Media Partitions forms a Media
   Bitstream.  In layered coding, a Media Partition represents one or
   more layers that are handled as a unit.  In MDC coding, a Media
   Partition represents one or more descriptions that are handled as a
   unit.  In Multi View Coding (MVC), a media partition is a view which
   may depend on other views.

   Decoding dependency:
   The class of relationships media partitions have to each other.  At
   present, this memo defines two decoding dependencies: layering and
   multiple description.

   Layered coding dependency:
   Each Media Partition is only useful (i.e. can be decoded) when all of
   the Media Partitions it depends on are available.  The dependencies
   between the Media Partitions therefore create a directed graph.
   Note: normally, in layered coding, the more Media Partitions are
   employed (following the rule above), the better a reproduced quality
   is possible.  The dependencies in a layered Media Bitstream can be
   also caused by Multi View Coding (MVC), using inter-view dependencies
   for increasing coding efficiency.

   Multi description coding (MDC) dependency:
   N of M Media Partitions are required to form a Media Bitstream, but
   there is no hierarchy between these Media Partitions.  Most MDC
   schemes aim at an increase of reproduced media quality when more
   media partitions are decoded.  Some MDC schemes require more than one
   Media Partition to form an Operation point.

   Operation point:

Schierl & Wenger         Expires May 11, 2008             [page 5]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

   In layered coding, a subset of a layered Media Bitstream that
   includes all Media Partitions required for reconstruction at a
   certain point of quality,  error resilience, or another property, and
   does not include any other Media Partitions.  In MDC coding, a subset
   of an MDC Media Bitstream that is compliant with the MDC coding
   standard in question.  In MVC, an operation point, represents a
   number of views, which are decodeable with the set of available Media

4. Motivation, Use Cases, and Architecture

4.1. Motivation

   This memo is concerned with two types of decoding dependencies:
   layered, and multi-description.  The transport of layered and multi
   description coding share as key motivators the desire for media
   adaptation to network conditions, i.e. related to bandwidth, error
   rates, connectivity of endpoints in multicast or broadcast scenarios,
   and similar.

   o Layered decoding dependency:

   In layered coding, the partitions of a Media Bitstream are known as
   media layers or simply layers.  One or more layers may be transported
   in different media streams in the sense of [RFC4566].  A classic use
   case is known as receiver-driven layered multicast, in which a
   receiver selects a combination of media streams in response to
   quality or bit-rate requirements.

   Back in the mid 1990s, the then available layered media formats and
   codecs envisioned primarily (or even exclusively) a one-dimensional
   hierarchy of layers.  That is, each so-called enhancement layer
   referred to exactly one layer "below".  The single exception has been
   the base layer, which is self-contained.  Therefore, the
   identification of one enhancement layer fully specifies the operation
   point of a layered coding scheme, including knowledge about all the
   other layers that need to be decoded.

   [RFC4566] contains rudimentary support for exactly this use case and
   media formats, in that it allows for signaling a range of transport
   addresses in a certain media description.  By definition, a higher
   transport address identifies a higher layer in the one-dimensional
   hierarchy.  A receiver needs only to decode data conveyed over this
   transport address and lower transport addresses to decode this
   Operation Point.

   Newer media formats depart from this simple one-dimensional
   hierarchy, in that highly complex (at least tree-shaped) dependency

Schierl & Wenger         Expires May 11, 2008             [page 6]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

   hierarchies can be implemented.  Compelling use cases for these
   complex hierarchies have been identified by industry.  Support for it
   is therefore desirable.  However, SDP, in its current form, does not
   allow for the signaling of these complex relationships.  Therefore,
   receivers cannot make an informed decision on which layers to
   subscribe (in case of layered multicast).

   Layered decoding dependency may also exit in a Multi View Coding
   (MVC).  Views may be coded using inter-view dependencies for
   increasing coding efficiency.  This results in Media Bitstreams,
   which logically may be separated into Media Partitions representing
   different views by the reconstructed video signal.  These Media
   Partitions cannot be decoded independently, thus other Media
   Partitions are used for reconstruction.  This requires signaling of
   dependencies of views separated in Media Partitions.

   o Multi descriptive decoding dependency:

   In the most basic form of MDC, each Media Partition forms an
   independent representation of the media.  That is, decoding of any of
   the Media Partitions yields useful reproduced media data.  When more
   than one Media Partition is available, then a decoder can process
   them jointly, and the resulting media quality increases.  The highest
   reproduced quality is available if all original Media Partitions are
   available for decoding.

   More complex forms of multiple description coding can also be
   envisioned, i.e. where, as a minimum, N out of M total Media
   Partitions need to be available to allow meaningful decoding.

   MDC has not yet been embraced heavily by the media standardization
   community, though it is subject of a lot of academic research.  As an
   example, we refer to [MDC].

   In this memo, we cover MDC because we a) envision that MDC media
   formats will come into practical use within the lifetime of this
   memo, and b) the solution for its signaling is very similar to the
   one of layered coding.

4.2. Use cases

   o Receiver driven layered multicast
   This technology is discussed in [RFC3550] and references therein.  We
   refrain from elaborating further; the subject is well known and

   o Multiple end-to-end transmission with different properties

Schierl & Wenger         Expires May 11, 2008             [page 7]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

   Assume a unicast and point-to-point topology, wherein one endpoint
   sends media to another.  Assume further that different forms of media
   transmission are available.  The difference may lie in the cost of
   the transmission (free, charged), in the available protection
   (unprotected/secure), in the quality of service (guaranteed quality /
   best effort), or other factors.

   Layered and MDC coding allow to match the media characteristics to
   the available transmission path(s).  For example, in layered coding
   it makes sense to convey the base layer over high QoS.  Enhancement
   layers, on the other hand, can be conveyed over best effort, as they
   are "optional" in their characteristic -- nice to have, but non-
   essential for media consumption.  In a different scenario, the base
   layer may be offered in a non-encrypted session as a free preview.
   An encrypted enhancement layer references this base layer and allows
   optimal quality play-back; however, it is only accessible to users
   who have the key, which may have been distributed by a conditional
   access mechanism.

5. Signaling Media Dependencies

5.1. Design Principles

   The dependency signaling is only feasible between media descriptions
   described with an "m="-line and with an assigned media identification
   attribute ("mid"), as defined in [RFC3388].

5.2. Semantics

5.2.1.    SDP grouping semantics for decoding dependency

   This specification defines a new grouping semantic
   Decoding Dependency "DDP":

   DDP associates a media stream, identified by its mid attribute, with
   a DDP group.  Each media stream MUST be composed of an integer number
   of Media Partitions.  All media streams of a DDP group MUST have the
   same type of decoding dependency (as signaled by the attribute
   defined in 5.2.2), and MUST belong to one Media Bitstream.  All media
   streams (identified by an "m="-line) MUST contain at least one
   operation point.  The DDP group type informs a receiver about the
   requirement for treating the payload type numbers of the media
   streams of the group according to the new media level attribute
   "depend", as defined in 5.2.2.
   When using multiple codecs, e.g. for Offer/Answer model, the media
  streams MUST have the same dependency structure, regardless which
  payload type number is used.

Schierl & Wenger         Expires May 11, 2008             [page 8]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

5.2.2.    Attribute for dependency signaling per media-stream

   This memo defines a new media-level attribute, "depend", with the
   following ABNF [RFC4234]. The "identification-tag" is defined in

     depend-attribute     = "a" "=" "depend" ":"
                            payload-type dependency-tag
                            *("," SP payload- type dependency-tag )
     dependency-tag       = dependency-type 1*( SP identification-tag":"
                            payload-type-dependency )
     dependency-type      = "lay" / "mdc"

   "payload-type", indicates the payload type number, as defined in
   [RFC4566], of the media description in question for which the
   dependencies are indicated in the pair(s) of "identification-tag" and
   "payload-type-dependency" following the "dependency-type".

   "payload-type-dependency", indicates the payload type number of the
   media stream indicated by the identification-tag, which the payload
   type number of the media stream in question depends on.

   The "depend"-attribute describes the decoding dependency.  The
   "depend"-attribute MAY be followed by a sequence of identification-
   tag(s) which identify all related media streams.  The attribute MAY
   be used with multicast as well as with unicast transport addresses.
   The following types of dependencies are defined:

   o lay:  Layered decoding dependency -- identifies the described media
   stream as one or more Media Partitions of a layered or multi view
   coding (MVC) Media Bitstream.  When "lay" is used, all required media
   streams for the Operation Point MUST be identified by identification-
   tag(s) following the "lay" string.

   o mdc:  Multi descriptive coding dependency -- signals that the
   described media stream is part of a set of a MDC Media Bitstream

     By definition, at least N out of M media streams of the group need
   to be available to from an Operation Point. The values of N and M
   depend on the properties of the Media Bitstream and are not signaled
   within this context.    When "mdc" is used, all required media
   streams for the Operation Point MUST be identified by identification-
   tag(s) following the "lay" string.

Schierl & Wenger         Expires May 11, 2008             [page 9]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

6. Usage of new semantics in SDP

6.1. Usage with the SDP Offer/Answer Model

   The backward compatibility in offer / answer is generally handled as
   specified in [RFC3388].

   Depending on the implementation, a node that does not understand DDP
   grouping (either does not understand line grouping at all, or just
   does not understand the DDP semantics) SHOULD respond to an offer
   containing DDP grouping either (1) with an answer that ignores the
   grouping attribute (only possible with "lay" dependency) or (2) with
   a refusal to the request (e.g., 488 Not acceptable here or 606 Not
   acceptable in SIP).

   In the first case, the original sender of the offer MUST respond by
   offering a single media stream that represents an Operation Point.
   Note: in most cases, this will be the base layer of a layered Media
   Bitstream, equally possible are Operation Points containing a set of
   enhancement layers as long as all are part of a single media stream.
   In the second case, if the sender of the offer still wishes to
   establish the session, it SHOULD re-try the request with an offer
   including only a single media stream.

6.2. Declarative usage

   If an RTSP receiver understands signaling according to this memo, it
   SHALL setup all media streams that are required to decode the
   Operation Point of its choice.

   If an RTSP receiver does not understand the signaling defined within
   this memo, it falls back to normal SDP processing.  Two likely cases
   have to be distinguished: (1) if at least one of the media types
   included in the SDP is within the receiver's capabilities, it selects
   among those candidates according to implementation specific criteria
   for setup, as usual. (2) If none of the media type included in the
   SDP can be processed, then obviously no setup can occur.

6.3. Usage with Capability Negotiation

   This memo does not cover the interaction with Capability Negotiation
   [I-D.ietf-mmusic-sdp-capability-negotiation].  This issue will be
   addressed in a second memo.

6.4. Examples

   a.)  Example for signaling layered decoding dependency dependency:

Schierl & Wenger         Expires May 11, 2008             [page 10]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

          o=svcsrv 289083124 289083124 IN IP4
          t=0 0

          c=IN IP4
          a=group:DDP 1 2 3 4

          m=video 40000 RTP/AVP 94 194
          a=rtpmap:94 H264/90000
          a=rtpmap:194 H264/90000

          m=video 40002 RTP/AVP 95 195
          a=rtpmap:95 SVC/90000
          a=rtpmap:195 SVC/90000
          a=depend:95 lay 1:94,195 lay 1:194

          m=video 40004 RTP/AVP 96 196
          a=rtpmap:96 SVC/90000
          a=rtpmap:196 SVC/90000
          a=depend:96 lay 1:94, 196 lay 1:194

          m=video 40004 RTP/SAVP 100 200
          c=IN IP4

          a=rtpmap:100 SVC/90000
          a=rtpmap:200 SVC/90000
          a=depend:100 lay 1:94 3:96,200 lay 1:194, 3:196

   b.)  Example for signaling of multi descriptive coding dependency:

          o=mdcsrv 289083124 289083124 IN IP4
          t=0 0

Schierl & Wenger         Expires May 11, 2008             [page 11]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

          c=IN IP4
          a=group:DDP 1 2 3
          m=video 40000 RTP/AVP 94
          a=depend:94 mdc 2:95 3:96

          m=video 40002 RTP/AVP 95
          a=depend:95 mdc 1:94 3:96

          m=video 40004 RTP/AVP 96
          c=IN IP4
          a=depend:96 mdc 1:94 2:95

7. Security Considerations

   All security implications of SDP apply.

   There may be a risk of manipulation the dependency signaling of a
   session description by an attacker.  This may mislead a receiver or
   middle box, e.g. a receiver may try to compose a bitstream that does
   not form an Operation Point, although the signaling made it believe
   it would form a valid Operation Point, with potential fatal
   consequences for the media decoding process.  It is recommended that
   the receiver SHOULD perform an integrity check on SDP and follow the
   security considerations of SDP to only trust SDP from trusted

8. IANA Considerations

   This document defines the "DDP" semantics to be used with grouping of
   media lines in SDP as defined in RFC 3388. The "DDP" semantics
   defined in this memo are to be registered by the IANA when it is
   published in standard track RFCs.

   The attribute "depend" is to be registered by IANA as a new media-
   level attribute.  The purpose of this attribute is to express a
   dependency, which may exist between "m"-lines of a media session.

9. Acknowledgements

   Funding for the RFC Editor function is currently provided by the
   Internet Society.  Further, the author Thomas Schierl of Fraunhofer
   HHI is sponsored by the European Commission under the contract number
   FP6-IST-0028097, project ASTRALS.

Schierl & Wenger         Expires May 11, 2008             [page 12]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

10.  References

10.1.     Normative References

[RFC4566]    Handley, M., Jacobson, V, and C. Perkins, "SDP: Session
             Description Protocol", RFC 4566, July 2006.
[RFC3388]    Camarillo, G., Holler, J., and H. Schulzrinne, "Grouping of
             Media Lines in the Session Description Protocol (SDP)",
             RFC 3388, December 2002.
[RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3550]    Schulzrinne, H., Casner, S., Frederick, R., and V.
             Jacobson, "RTP: A Transport Protocol for Real-Time
             Applications", STD 64, RFC 3550, July 2003.
[RFC4234]    Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
             Specifications: ABNF", RFC 4234, October 2005.
             Andreasen, F., "SDP Capability Negotiation",
             draft-ietf-mmusic-sdp-capability-negotiation-07, (work in
             progress), October 2008

10.2.     Informative References

             Wenger, S., Wang Y.-K. and T. Schierl, "RTP Payload Format
             for SVC Video", draft-ietf-avt-rtp-svc-02 (work in
             progress), June 2007.
[MDC]        Vitali, A., Borneo, A., Fumagalli, M., and R. Rinaldo,
             "Video over IP using Standard-Compatible Multiple
             Description Coding: an IETF proposal", Packet Video
             Workshop, April 2006, Hangzhou, China
[MVC]        Joint Video Team, "Joint Draft 4 of MVC ", available
             site/2007_06_Geneva/, Geneva,
             Switzerland, June 2007.

11.  Author's Addresses

   Thomas Schierl                       Phone: +49-30-31002-227
   Fraunhofer HHI                       Email:
   Einsteinufer 37
   D-10587 Berlin

   Stephan Wenger                       Phone: +1-650-862-7368
   Nokia                                Email:
   955 Page Mill Road

Schierl & Wenger         Expires May 11, 2008             [page 13]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

   Palo Alto, CA, 94304

12.  Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at

13.  Disclaimer of Validity

   This document and the information contained herein are provided on an

14.  Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

15.  RFC Editor Considerations


Schierl & Wenger         Expires May 11, 2008             [page 14]

Internet-Draft  draft-ietf-mmusic-decoding-dependency-00  November 2007

16.  Change Log:

   19Dec06 / TS:
   removed SSRC multiplexing and with that various information about RTP
   draft title correction
   corrected SDP reference
   editorial modifications throughout the document
   added Stephan Wenger to the list of authors
   removed section "network elements not supporting dependency
   20-28Dec06 / TS, StW: Editorial improvements
   3Mar07 / TS: adjustment for new I-D style, added Offer/Answer text,
   corrected ABNF reference, added Security and IANA considerations,
   added section Usage with existing entities not supporting new
   signaling, added text for Declarative usage section, added Open
   issues section.
   21-Jun07: Numerous editorial changes and reworked section 6.
   11-Nov07: Added Payload Type of media stream in question to
   dependency signaling. Note on usage with Cap. Negotiation. Added
   multi view coding (MVC) dependency as part of 'lay'-dependency. Added
   ref. to MVC activity at ITU-T/MPEG.

Schierl & Wenger         Expires May 11, 2008             [page 15]