Network Working Group                                      M. Westerlund
Internet-Draft                                                 B. Burman
Intended status: Standards Track                                Ericsson
Expires: April 25, 2014                                    S. Nandakumar
                                                                   Cisco
                                                        October 22, 2013


                    Using Simulcast in RTP Sessions
               draft-westerlund-avtcore-rtp-simulcast-03

Abstract

   In some application scenarios it may be desirable to send multiple
   differently encoded versions of the same Media Source in independent
   Source Packet Streams.  This is called Simulcast.  This document
   discusses the best way of accomplishing Simulcast in RTP and how to
   signal it in SDP.  A solution is defined by making three extensions
   to SDP, and using RTP/RTCP identification methods to relate RTP
   Source Packet Streams.  The first SDP extension consists of two new
   session level SDP attributes that express capability to send or
   receive Simulcast Source Packet Streams, respectively.  The second
   SDP extension introduces an SDP media level attribute that groups and
   identifies a selected set of media level parameters for a specific
   direction, called a media configuration.  The third SDP extension
   describes how to group such media configurations on SDP session or
   media level for Simulcast purposes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 25, 2014.

Copyright Notice





Westerlund, et al.       Expires April 25, 2014                 [Page 1]


Internet-Draft                RTP Simulcast                 October 2013


   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
     2.2.  Requirements Language . . . . . . . . . . . . . . . . . .   4
   3.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . .   4
     3.1.  Reaching a Diverse Set of Receivers . . . . . . . . . . .   5
     3.2.  Application Specific Media Source Handling  . . . . . . .   6
     3.3.  Receiver Adaptation in Multicast/Broadcast  . . . . . . .   7
     3.4.  Receiver Media Source Preferences . . . . . . . . . . . .   7
   4.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   8
   5.  Proposed Solution Overview  . . . . . . . . . . . . . . . . .   9
   6.  Proposed Signaling  . . . . . . . . . . . . . . . . . . . . .  10
     6.1.  Simulcast Capability  . . . . . . . . . . . . . . . . . .  11
       6.1.1.  Declarative Use . . . . . . . . . . . . . . . . . . .  12
       6.1.2.  Offer/Answer Use  . . . . . . . . . . . . . . . . . .  12
     6.2.  Media Configuration . . . . . . . . . . . . . . . . . . .  13
       6.2.1.  Simulcast Limitations . . . . . . . . . . . . . . . .  16
       6.2.2.  Declarative Use . . . . . . . . . . . . . . . . . . .  17
       6.2.3.  Offer/Answer Use  . . . . . . . . . . . . . . . . . .  17
     6.3.  Grouping Simulcast Configurations . . . . . . . . . . . .  18
       6.3.1.  Declarative Use . . . . . . . . . . . . . . . . . . .  19
       6.3.2.  Offer/Answer Use  . . . . . . . . . . . . . . . . . .  19
     6.4.  Relating Simulcast Versions . . . . . . . . . . . . . . .  20
     6.5.  Two-Phase Negotiation . . . . . . . . . . . . . . . . . .  20
     6.6.  Signaling Examples  . . . . . . . . . . . . . . . . . . .  21
       6.6.1.  Unified Plan Client . . . . . . . . . . . . . . . . .  21
       6.6.2.  Multi-Transport Client  . . . . . . . . . . . . . . .  24
       6.6.3.  Multi-Source Client . . . . . . . . . . . . . . . . .  26
   7.  Network Aspects . . . . . . . . . . . . . . . . . . . . . . .  28
   8.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  29
   9.  Security Considerations . . . . . . . . . . . . . . . . . . .  29
   10. Contributors  . . . . . . . . . . . . . . . . . . . . . . . .  29
   11. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  30
   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .  30
     12.1.  Normative References . . . . . . . . . . . . . . . . . .  30
     12.2.  Informative References . . . . . . . . . . . . . . . . .  31
   Appendix A.  Discussion on Receiver Diversity . . . . . . . . . .  32
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  34

1.  Introduction

   Most of today's multiparty video conference solutions make use of
   centralized servers to reduce the bandwidth and CPU consumption in
   the endpoints.  Those servers receive Source Packet Streams from each
   participant and send some suitable set of possibly modified streams
   to the rest of the participants, which usually have heterogeneous
   capabilities (screen size, CPU, bandwidth, codec, etc.).  One of the
   biggest issues is how to perform stream adaptation to different
   participants' constraints with the minimum possible impact on video
   quality and server performance.

   Simulcast is the act of simultaneously sending multiple different
   versions of the same media content, e.g. the same video source
   encoded with different video encoder types or image resolutions.
   This can be done in several ways and for different purposes.  This
   document focuses on the case where it is desirable to provide a Media
   Source as multiple Source Packet Streams over RTP [RFC3550] towards
   an intermediary so that the intermediary can provide the wanted
   functionality by selecting which Source Packet Stream to forward to
   other participants in the session, and more specifically how the
   identification and grouping of the involved Source Packet Streams are
   done.  From an RTP perspective, Simulcast is a specific application
   of the aspects discussed in RTP Multiplexing Guidelines
   [I-D.ietf-avtcore-multiplex-guidelines].

   The purpose of this document is to describe a few scenarios where
   the use of Simulcast is motivated, and to propose a suitable solution
   for signaling and performing RTP Simulcast.

2.  Definitions

2.1.  Terminology

   This document makes use of the terminology defined in RTP Taxonomy
   [I-D.lennox-raiarea-rtp-grouping-taxonomy].  In addition, the
   following terms are used:

   Media Configuration:  A specific set of parameter values applied on
      the encoding and packetization process that creates a specific
      Source Packet Stream.  In SDP, the applicable parameter values are
      described by the joint set of "rtpmap" parameters, "fmtp"
      parameters, and the "config-id" (Section 6.2) parameters,
      including extensions.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Use Cases

   Many use cases of Simulcast as described in this document relate to a
   multi-party Communication Session where one or more central nodes are
   used to adapt the view of the Communication Session towards
   individual Participants, and facilitate the Media Transport between
   Participants.  Thus, these cases target the RTP Mixer topology
   defined in [RFC5117] (Section 3.4: Topo-Mixer), further elaborated
   and extended with other topologies in
   [I-D.ietf-avtcore-rtp-topologies-update] (Sections 3.6 to 3.9).

   There are two principal approaches for an RTP Mixer to provide this
   adapted view of the Communication Session to each receiving
   Participant:

   o  Transcoding (decoding and re-encoding) received Source Packet
      Streams with characteristics adapted to each receiving
      Participant.  This often includes mixing or composition of Media
      Sources from multiple Participants into a mixed Media Source
      originated by the RTP Mixer.  The main advantage of this approach
      is that it achieves close to optimal adaptation to individual
      receiving Participants.  The main disadvantages are that it can be
      very computationally expensive to the RTP Mixer and typically also
      degrades media Quality of Experience (QoE) such as end-to-end
      delay for the receiving Participants.

   o  Switching a subset of all received Source Packet Streams or sub-
      streams to each receiving Participant, where the used subset is
      typically specific to each receiving Participant.  The main
      advantages of this approach are that it is computationally cheap
      to the RTP Mixer and it has very limited impact on media QoE.  The
      main disadvantage is that it can be difficult to combine a subset
      of received Source Packet Streams into a perfect fit to the
      resource situation of a receiving Participant.

   The use of Simulcast relates to the latter approach, where it is
   more important to reduce the load on the RTP Mixer and/or minimize
   QoE impact than to achieve an optimal adaptation of resource usage.




   A multicast/broadcast case where the receivers themselves select the
   most appropriate simulcast version and tune in to the right transport
   to receive that version is also considered (Section 3.3).  This
   enables large receiver populations that are heterogeneous in terms of
   capabilities and the bandwidth of the network paths used.

   In this section, an "RTP switch" is used as a common short term for
   the terms "switching RTP mixer", "source projecting middlebox", and
   "video switching MCU" as discussed in
   [I-D.ietf-avtcore-rtp-topologies-update].

3.1.  Reaching a Diverse Set of Receivers

   The Media Sources provided by a sending Participant potentially need
   to reach several receiving Participants that differ in terms of
   available resources.  A discussion on that topic is included in
   Appendix A. The receiver resources that typically differ include, but
   are not limited to:

   Codec:  This includes codec type (such as SDP MIME type) and can
      include codec configuration options (e.g. SDP fmtp parameters).
      Two codec resources that differ only in codec configuration will
      be "different" if they are somehow not "compatible", for example
      if they differ in video codec profile or in the transport
      packetization configuration.

   Sampling:  This relates to how the Media Source is sampled, in
      spatial as well as in temporal domain.  For video streams, spatial
      sampling affects image resolution and temporal sampling affects
      video frame rate.  For audio, spatial sampling relates to the
      number of audio channels and temporal sampling affects audio
      bandwidth.  This may be used to suit different rendering
      capabilities or needs at the receiving endpoints, as well as a
      method to achieve different transport capabilities, bitrates and
      eventually QoE by controlling the amount of source data.

   Bitrate:  This relates to the number of bits spent per second to
      transmit the Media Source as a Source Packet Stream, which
      typically also affects the Quality of Experience (QoE) for the
      receiving user.

   Letting the sending Participant create a Simulcast of a few
   differently configured Source Packet Streams per Media Source can be
   a good trade-off when using an RTP switch as middlebox, instead of
   sending a single Source Packet Stream and using an RTP Mixer to
   create individual transcodings to each receiving Participant.

   This requires that the receiving Participants can be categorized in
   terms of available resources and that the sending Participant can
   choose a matching configuration for a single Source Packet Stream per
   category and Media Source.

   For example, assume for simplicity a set of receiving Participants
   that differ only in that some have support to receive Codec A, and
   the others have support to receive Codec B. Further assume that the
   sending participant can send both Codec A and B. It can then reach
   all receivers by creating two Simulcasted Source Packet Streams from
   each Media Source; one for Codec A and one for Codec B.

   In another simple example, a set of receiving Participants differ
   only in screen resolution; some are able to display video with at
   most 360p resolution and some support 720p resolution.  A sending
   Participant can then reach all receivers by creating a Simulcast of
   Source Packet Streams with 360p and 720p resolution for each sent
   video Media Source.

   In more elaborate cases, the receiving Participants differ both in
   available Sampling and Bitrate, and maybe also Codec, and it is up to
   the RTP switch to find a good trade-off in which Simulcasted stream
   to choose for each intended receiver.  It is also the responsibility
   of the RTP switch to negotiate a good fit of Simulcast streams with
   the sending Participant.
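
   As an illustration only, the choice made by the RTP switch for each
   receiver could be sketched as below.  The SimulcastVersion and
   Receiver types, their fields, and the "highest bitrate that fits"
   policy are assumptions of this sketch; this document does not define
   a selection algorithm.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SimulcastVersion:
    # Hypothetical description of one Simulcast version of a Media Source.
    label: str
    codec: str
    height: int          # vertical resolution in lines, e.g. 360 or 720
    bitrate_kbps: int


@dataclass(frozen=True)
class Receiver:
    # Hypothetical summary of a receiving Participant's resources.
    codecs: frozenset
    max_height: int
    max_bitrate_kbps: int


def select_version(versions: Sequence[SimulcastVersion],
                   receiver: Receiver) -> Optional[SimulcastVersion]:
    """Return the highest-bitrate version the receiver can handle, or None."""
    usable = [v for v in versions
              if v.codec in receiver.codecs
              and v.height <= receiver.max_height
              and v.bitrate_kbps <= receiver.max_bitrate_kbps]
    return max(usable, key=lambda v: v.bitrate_kbps, default=None)


# Made-up example values: a 360p and a 720p version of the same source.
versions = [
    SimulcastVersion("low", "H264", 360, 500),
    SimulcastVersion("high", "H264", 720, 1500),
]
phone = Receiver(frozenset({"H264"}), 360, 800)
desktop = Receiver(frozenset({"H264"}), 720, 4000)
```

   With these example values, select_version(versions, phone) picks the
   360p version and select_version(versions, desktop) picks the 720p
   version.  A receiver that supports none of the offered codecs gets
   None, which a real switch would have to handle by other means.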

   The maximum number of Simulcasted Source Packet Streams that can be
   sent is mainly limited by the amount of processing and uplink network
   resources available to the sending Participant.

3.2.  Application Specific Media Source Handling

   The application logic that controls the Communication Session may
   include special handling of some Media Sources.  It is for example
   commonly the case that the media from a sending Participant is not
   sent back to itself.

   It is also common that a currently active speaker Participant is
   shown in larger size or higher quality than other Participants (the
   Sampling or Bitrate aspects of Section 3.1).  Not sending the active
   speaker media back to itself means that some other Participant's
   media instead receives special handling towards the active speaker;
   typically the previous active speaker.  This way, the previously
   active speaker is needed both in larger size (to the current active
   speaker) and in small size (to the rest of the Participants),
   which can be solved with a Simulcast from the previously active
   speaker to the RTP switch.
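
   The reasoning above can be checked with a small sketch.  The function
   name, the "large"/"small" labels, and the participant names are
   hypothetical and only serve to illustrate why the previously active
   speaker is the one that needs to send a Simulcast.

```python
from typing import Dict, Sequence, Set


def sizes_needed(participants: Sequence[str],
                 active: str,
                 previous: str) -> Dict[str, Set[str]]:
    """Return, per sending Participant, the sizes the switch must forward."""
    needed: Dict[str, Set[str]] = {p: set() for p in participants}
    for receiver in participants:
        # The active speaker sees the previous speaker in large size;
        # everybody else sees the active speaker in large size.
        large = previous if receiver == active else active
        for sender in participants:
            if sender == receiver:
                continue  # media is not sent back to its own sender
            needed[sender].add("large" if sender == large else "small")
    return needed


needed = sizes_needed(["alice", "bob", "carol"],
                      active="alice", previous="bob")
```

   In this example only "bob", the previously active speaker, ends up
   with both "large" and "small" in its set, so only bob's media is
   needed in two versions at the same time.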




3.3.  Receiver Adaptation in Multicast/Broadcast

   When using Broadcast or Multicast technology to distribute real-time
   media streams to large populations of receivers there can still be
   significant heterogeneity among the receiver population.  This can
   depend on several factors:

   Network Bandwidth:  The network paths to individual receivers will
      vary in bandwidth, thus putting different limits on the bit-rates
      that can be received.

   Endpoint Capabilities:  The endpoint's hardware and software can have
      varying capabilities in relation to screen resolution, decoding
      capabilities, and supported media codecs.

   To handle these variations, a transmitter of real-time media may want
   to apply Simulcast to its Source Packet Streams and provide a set of
   media configurations, enabling the receivers to select the best fit
   from these sets themselves.  The endpoint capabilities will usually
   result in a single initial choice.  However, the network bandwidth
   can vary over time, which requires a client to continuously monitor
   its reception to determine if the received media streams still fit
   within the available bandwidth.  If not, another Simulcast media
   configuration containing a thinner set of Source Packet Streams will
   have to be chosen.
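
   A receiver-side selection of this kind could look roughly like the
   following sketch.  The configuration labels, the bitrates, and the
   "headroom" safety margin are assumptions made for illustration; how
   a client measures available bandwidth is outside the sketch.

```python
from typing import Optional, Sequence, Tuple

# Each entry is (configuration label, total bitrate in kbit/s).
# Labels and bitrates are made-up values for illustration.
Config = Tuple[str, int]


def pick_configuration(configs: Sequence[Config],
                       available_kbps: float,
                       headroom: float = 0.9) -> Optional[Config]:
    """Pick the richest configuration that fits the measured bandwidth.

    `headroom` (an assumption of this sketch) keeps a safety margin so
    that small bandwidth fluctuations do not cause constant switching.
    Returns None when not even the thinnest configuration fits.
    """
    budget = available_kbps * headroom
    fitting = [c for c in configs if c[1] <= budget]
    return max(fitting, key=lambda c: c[1], default=None)


configs = [("thin", 300), ("medium", 1000), ("full", 2500)]
```

   A client would call pick_configuration() again whenever its bandwidth
   estimate changes, and switch to the returned configuration if it
   differs from the one currently received.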

   When IP multicast is used, the granularity at which a receiver can
   select among Simulcast versions is the multicast address.  Thus,
   different Simulcast versions need to be put on different Media
   Transports using different multicast addresses.  If
   these Simulcast versions are described using SDP, they need to be
   part of different SDP media descriptions, as SDP binds to transport
   on media description level.  To enable more than the initial choice
   to function well, there is a need to enable correct mapping of Source
   Packet Streams in one Simulcast media configuration to a
   corresponding Source Packet Stream in another Simulcast media
   configuration on another multicast group.

3.4.  Receiver Media Source Preferences

   The application logic that controls the Communication Session may
   allow receiving Participants to apply preferences to the
   characteristics of the Source Packet Stream they receive, for example
   in terms of the aspects listed in Section 3.1.  Sending a Simulcast
   of Source Packet Streams is one way of accommodating receivers with
   conflicting or otherwise incompatible preferences.

4.  Requirements

   The following requirements need to be met to support the use cases in
   previous sections:

   REQ-1:  Identification.  It must be possible to identify a set of
      simulcasted Source Packet Streams as originating from the same
      Media Source:

      REQ-1.1:  In SDP signaling.

      REQ-1.2:  On RTP/RTCP level.

   REQ-2:  Transport usage.  The solution must work when distributing
      different Simulcast versions on:

      REQ-2.1:  Same Media Transport and RTP session.

      REQ-2.2:  Different Media Transports and RTP sessions.

   REQ-3:  Capability negotiation.  It must be possible that:

      REQ-3.1:  Sender can express capability of sending simulcast.

      REQ-3.2:  Receiver can express capability of receiving simulcast.

      REQ-3.3:  Sender can express maximum number of Simulcast versions
            that can be provided.

      REQ-3.4:  Receiver can express maximum number of Simulcast
            versions that can be received.

      REQ-3.5:  Sender can detail the characteristics of the Simulcast
            versions that can be provided.

      REQ-3.6:  Receiver can detail the characteristics of the Simulcast
            versions that it prefers to receive.

   REQ-4:  Distinguishing features.  It must be possible to have
      different Simulcast versions use different values for any
      combination of:

      REQ-4.1:  Codec.  This includes both codec type and configuration
            options for both codec and RTP packetization.  It also
            includes different layers from a scalable codec, but only as
            long as those layers are possible to identify on RTP level.

      REQ-4.2:  Bitrate of Source Packet Stream.

      REQ-4.3:  Sampling in spatial as well as in temporal domain.

   REQ-5:  Compatibility.  It must be possible to use Simulcast in
      combination with other RTP mechanisms that generate additional
      Source Packet Streams:

      REQ-5.1:  RTP Retransmission [RFC4588].

      REQ-5.2:  RTP Forward Error Correction [RFC5109].

   REQ-6:  Interoperability.  The solution must also be usable in:

      REQ-6.1:  Interworking with non-simulcast legacy clients using a
            single Media Source per media type.

      REQ-6.2:  WebRTC "Unified Plan" environment.

5.  Proposed Solution Overview

   Signaling Simulcast is about negotiating between media sender and
   receiver what the different Simulcast versions should be, how to
   identify them in terms of Source Packet Streams, and how to inter-
   relate those Source Packet Streams.

   The proposed solution consists of:

   o  Signaling Simulcast capability in an optional, pre-stage Offer/
      Answer:

      *  Separate send and receive Simulcast capabilities as SDP session
         level attributes.

      *  Media properties that are supported as base for different
         Simulcast versions are listed as parameters that are also
         possible to rank.

      *  Early indication of maximum number of available encoding/
         decoding resources on SDP media level.

   o  Including detailed information for the Simulcast in a main Offer/
      Answer:

      *  Including the Simulcast capability indications described above,
         carried over from the pre-stage Offer/Answer, if any.

      *  Defining and labeling of the media configuration for each
         Simulcast version to be sent or received.

      *  The media configuration for a Simulcast version can include
         acceptable parameter ranges for parameters that are most likely
         used to distinguish Simulcast versions.

      *  Indicating the use of Simulcast, separately per direction, by
         grouping the defined media configurations, not individual
         streams, that will constitute the Simulcast.

      *  Allowing that any one of the media configurations in a specific
         Simulcast is signaled inactive from the start of the session.
         This is defined as equivalent to the affected Source Packet
         Stream being in PAUSED state
         [I-D.westerlund-avtext-rtp-stream-pause].

      *  Adding and/or modifying SDP media descriptions as needed to
         accommodate the negotiated Simulcast streams.

      *  Parameter limits to the aggregate of media configurations are
         signaled by existing SDP attributes on session and media
         description level.

      *  Including media level indication of maximum number of available
         encoding/decoding resources on SDP media level.  They MAY be
         modified compared to the pre-stage Offer/Answer, if any.

      *  Identifying which Source Packet Stream corresponds to which
         media configuration by including the configuration label as
         part of the SDES item SRCNAME
         [I-D.westerlund-avtext-rtcp-sdes-srcname] information included
         in the RTP and RTCP packets.  The optional mechanism for source
         specific signaling defined in SRCNAME could be used to let a
         Simulcast sender pre-announce such a relationship before
         sending the Source Packet Stream.

   o  Adding Simulcast information to the Source Packet Stream:

      *  Identifying Source Packet Streams from the same Media Source
         using the new RTCP SDES Item SRCNAME
         [I-D.westerlund-avtext-rtcp-sdes-srcname], and as described
         there including the possibility to send the same information as
         an RTP Header Extension [RFC5285].

      *  Using PAUSE/RESUME [I-D.westerlund-avtext-rtp-stream-pause]
         functionality to temporarily turn individual Simulcast versions
         on or off.

6.  Proposed Signaling

   This section further details the signaling solution outlined above
   (Section 5).

6.1.  Simulcast Capability

   There are numerous media properties that can be varied to construct a
   set of Simulcast versions.  A Simulcast enabled endpoint could also
   support Simulcast based on several of those properties.  If those
   properties are relatively independent and each Simulcast version
   needs explicit definition in the SDP, the number of Simulcast version
   candidates grows exponentially with the number of properties,
   resulting in a very long SDP that is likely also hard to interpret.
   There is thus a need to limit the Simulcast version candidates
   included in the SDP to cover as small a set of properties as
   possible.
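
   The growth in the number of candidates can be made concrete with a
   short calculation.  The property names and the alternatives per
   property below are invented for the example.

```python
from itertools import product

# Hypothetical per-property alternatives an endpoint could support.
alternatives = {
    "codec": ["H264", "VP8"],
    "resolution": ["1280x720", "640x360", "320x180"],
    "frame rate": ["30", "15"],
}

# Every combination of one value per property is a candidate version.
candidates = list(product(*alternatives.values()))
print(len(candidates))  # 12 = 2 * 3 * 2
```

   Each added independent property multiplies the candidate count by
   its number of alternatives, so restricting the set of properties
   shortens the SDP far more than trimming individual values does.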

   If a legacy endpoint not supporting Simulcast were to be presented
   with an SDP including media descriptions for a set of Simulcast
   versions, it may not know how to correctly handle or interpret these
   "surplus" media descriptions.

   Based on the functionality that Simulcast is intended to achieve, it
   should be clear that the reasons to send Simulcast versions are not
   the same as to receive Simulcast versions, seen from a single
   endpoint.

   For these reasons, it is proposed to define two new SDP session level
   attributes, "a=sim-send-cap" and "a=sim-recv-cap", which explicitly
   signal support for Simulcast media transmission and Simulcast media
   reception, respectively, for that media description.  "a=sim-send-
   cap" and "a=sim-recv-cap" MAY be used independently and
   simultaneously.  These attributes are also proposed to have
   parameters indicating the media properties used to create the
   Simulcast versions, and their preferred ranking.  The meaning of the
   attributes on SDP media level is undefined, and they MUST NOT be used
   on that level.

   simulcast-cap   = "a="( "sim-send-cap:" / "sim-recv-cap:" )
                     cap-prop-list
   cap-prop-list   = cap-prop-entry *(WSP cap-prop-entry)
   cap-prop-entry  = cap-prop ["=" q-value]
   cap-prop        = "rtpmap"
                   / "fmtp"
                   / "imageattr"
                   / "framerate"
                   / token ; for future extensions
   q-value         = ( "0" "." 1*2DIGIT )
                   / ( "1" "." 1*2("0") )
                   ; Values between 0.00 and 1.00
   ; WSP and DIGIT defined in [RFC5234]
   ; token defined in [RFC4566]


                  Figure 1: ABNF for Simulcast Capability

   The media property values are taken from existing (and could be
   extended to cover other or future) SDP attributes that express media
   properties that can be varied to create different Simulcast versions:

   rtpmap:  Differences in codec type, sampling rate (see Section 4),
      and number of channels.

   fmtp:  Differences in codec-specific encoding parameters.

   imageattr:  Differences in video resolution and aspect ratio
      [RFC6236].

   framerate:  Differences in framerate.

   The optional q-value expresses the relative preference to base a
   Simulcast version on that media property, with 1.00 meaning maximum
   (100%) preference and 0.00 meaning no (0%) preference.  Several media
   properties can share the same q-value, in which case they are equally
   preferred.  Not including any q-value for a media property value
   SHALL default to a q-value of 1.00.

   The list of media properties is made extensible, to allow introducing
   additional dimensions for Simulcast versions.
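
   A minimal parser following the ABNF in Figure 1 might look like the
   sketch below.  It is not a normative implementation: the function
   name and return type are assumptions, and the token character class
   reflects this sketch's reading of the token definition in [RFC4566].

```python
import re
from typing import Dict, Tuple

# token-char as read from [RFC4566]; note that "=" is not a token-char.
_TOKEN = r"[\x21\x23-\x27\x2A\x2B\x2D\x2E\x30-\x39\x41-\x5A\x5E-\x7E]+"
# q-value: "0." plus one or two digits, or "1." plus one or two zeros.
_QVALUE = r"0\.[0-9]{1,2}|1\.0{1,2}"
_ENTRY = re.compile(rf"({_TOKEN})(?:=({_QVALUE}))?")
_LINE = re.compile(r"a=(sim-send-cap|sim-recv-cap):(.+)")


def parse_simulcast_cap(line: str) -> Tuple[str, Dict[str, float]]:
    """Parse an "a=sim-send-cap:" or "a=sim-recv-cap:" line.

    Returns the direction ("send" or "recv") and a mapping from media
    property name to q-value.  A missing q-value defaults to 1.00.
    """
    m = _LINE.fullmatch(line.strip())
    if m is None:
        raise ValueError("not a Simulcast capability attribute")
    direction = "send" if m.group(1) == "sim-send-cap" else "recv"
    props: Dict[str, float] = {}
    # Entries are separated by exactly one WSP (space or tab).
    for entry in re.split(r"[ \t]", m.group(2)):
        em = _ENTRY.fullmatch(entry)
        if em is None:
            raise ValueError(f"malformed cap-prop-entry: {entry!r}")
        props[em.group(1)] = float(em.group(2)) if em.group(2) else 1.0
    return direction, props
```

   For instance, the made-up line "a=sim-send-cap:rtpmap=0.5 imageattr"
   parses to the direction "send" with q-value 0.5 for rtpmap and the
   default 1.0 for imageattr, while an out-of-range q-value such as
   "fmtp=1.5" is rejected.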

6.1.1.  Declarative Use

   When used as a declarative media description, sim-recv-cap indicates
   the configured end-point's required capability to recognize and
   receive a specified set of Source Packet Streams as Simulcast
   streams.  In the same fashion, sim-send-cap requests the end-point to
   send a specified set of Source Packet Streams as Simulcast streams.
   sim-recv-cap and sim-send-cap MAY be used independently and at the
   same time and they need not specify the same capability properties.

6.1.2.  Offer/Answer Use

   An offerer wanting to use Simulcast SHALL include either one or both
   of those attributes, depending on the direction(s) in which Simulcast
   is both supported and desirable.  An offerer that receives an answer
   without "a=sim-send-cap" or "a=sim-recv-cap" MUST NOT define or use
   any Simulcast alternatives in that direction to the answerer.

   An answerer that does not understand the concept of Simulcast will
   also not know those attributes and will remove them in the SDP
   answer, as defined in existing SDP Offer/Answer procedures.  An
   answerer that does understand the attributes and that wants to
   support Simulcast in the indicated direction SHALL reverse
   directionality of the attribute; "sim-send-cap" becomes "sim-recv-
   cap" and vice versa, and include it in the answer.

   An offerer that intends to send Simulcast alternatives and thus
   includes "a=sim-send-cap", MUST also include at least one media
   property parameter that it intends to use to construct the Simulcast
   alternatives, but it MAY include more media property parameters.
   Including multiple media property parameters in "a=sim-send-cap"
   SHALL be interpreted as an offer to send Simulcast versions covering
   all combinations thereof, but MAY be further restricted by other
   information in the SDP such as for example the number of simulcast-
   related media descriptions in the SDP or use of max-ssrc signaling
   [I-D.westerlund-mmusic-max-ssrc].

   An offerer that is capable of receiving Simulcast alternatives and
   thus includes "a=sim-recv-cap", MUST also include at least one media
   property parameter that it is willing to use as discriminator between
   received Simulcast alternatives, but MAY include more media property
   parameters.  Including multiple media property parameters in "a=sim-
   recv-cap" SHALL be interpreted as an offer to receive Simulcast
   versions covering all combinations thereof, but MAY be further
   restricted by other information in the SDP such as for example the
   number of simulcast-related media descriptions in the SDP or use of
   max-ssrc signaling [I-D.westerlund-mmusic-max-ssrc].

   An answerer that either lacks the capability or does not desire to
   use Simulcast versions based on a certain media property parameter in
   a specific direction MUST remove such media property parameter from
   "a=sim-send-cap" or "a=sim-recv-cap".  The answerer MUST NOT add any
   media property parameters that were not included in the offer.

   An answerer SHOULD take the offerer's q-values into account when
   choosing which media configurations (Section 6.2) to include in the
   answer and how to group them (Section 6.3) into the resulting
   Simulcast(s).

6.2.  Media Configuration

   Media that constitutes a Simulcast version has certain desirable
   characteristics that are meant to suit one category of diverse
   receivers (Section 3.1).  A receiver that is willing to receive
   Simulcast streams must be given sufficient means to express what it
   is capable of and desires to receive.  A sender that is willing to
   send Simulcast streams must similarly be given sufficient means to
   express what it is capable of and desires to send.

   An obvious candidate to express those characteristics is the media
   format in an SDP media description, defined by the rtpmap and fmtp
   attributes, which is typically mapped to an RTP Payload Type.  Some
   of the most interesting characteristics for Simulcast purposes are
   however not included in rtpmap or fmtp, but are instead defined as
   separate attributes.  Some of those individual attributes are
   possible to directly relate to a defined media format and could form
   a configuration together with the media format, but some attributes
   cannot be related to a specific media format and using the existing
   media format as a common identifier for a media configuration is not
   fully sufficient.

   The act of Simulcast is trying to handle senders and receivers
   belonging to the vast multi-dimensional parameter space of "media
   configuration" by sub-dividing that parameter space into manageable
   and meaningful sub-sets.  Communication between a sender and a
   receiver can be established successfully only when the actually sent
   media configuration (sub-set) fits within the receiver's available
   media configuration sub-set.  At the same time, practical and
   implementation aspects often limit the size of those sub-sets.  When
   that receiver or sender sub-set is either too small or is not known,
   the probability of successful communication decreases significantly.
   To increase the probability of finding a match between sender and
   receiver media configurations, it is essential that a media
   configuration can be a set instead of a single point in the parameter
   space, i.e. include parameter listings and/or ranges instead of
   single values.

   Therefore, it is proposed to define a new media level SDP attribute,
   "a=config-id", which relates the needed parameter types and the
   corresponding value ranges that together constitute a Simulcast media
   configuration.  Each SDP media description MAY contain zero or more
   config-id attributes.  The meaning of the attribute on SDP session
   level is undefined, and the attribute MUST NOT be used there.

   configuration    = "a=config-id:" config-id WSP config-dir
                       WSP config-list
   config-id        = token
   config-dir       = "send"
                    / "recv"
   config-list      = config-entry *(WSP config-entry)
   config-entry     = "pt" "=" pt-value *("," pt-value)
                    / image-attr
                    / "framerate" "=" fr-param
                    / "b" "=" bw-mod ":" bw-value *1("-" bw-value)
                    / ext-config-id [ "=" ext-config-value ]
                       ; for future ext
   image-attr       = "imageattr" "=" resolution-list
   resolution-list  = resolution-set *("," resolution-set)
   ext-config-id    = token
   ext-config-value = non-ws-string
   pt-value         = 1*3DIGIT ; could be made more strict
   resolution-set   = "[" "x=" xyrange "," "y=" xyrange *key-values "]"
   key-values       = ( "," key-value )
   key-value        = ( "sar=" srange )
                    / ( "par=" prange )
                    / ( "q=" qvalue )
   onetonine        = "1" / "2" / "3" / "4" / "5"
                    / "6" / "7" / "8" / "9"
   xyvalue          = onetonine *5DIGIT
   step             = xyvalue
   xyrange          = ( "[" xyvalue ":" [ step ":" ] xyvalue "]" )
                    / ( "[" xyvalue 1*( "," xyvalue ) "]" )
                    / ( xyvalue )
   spvalue          = ( "0" "." onetonine *3DIGIT )
                    / ( onetonine "." 1*4DIGIT )
   srange           =  ( "[" spvalue 1*( "," spvalue ) "]" )
                    / ( "[" spvalue "-" spvalue "]" )
                    / ( spvalue )
   prange           =  ( "[" spvalue "-" spvalue "]" )
   qvalue           = ( "0" "." 1*2DIGIT )
                    / ( "1" "." 1*2("0") )
   fr-param         = fr-value *("," fr-value)
                    / fr-value "-" fr-value
   fr-value         = 1*3DIGIT [ "." 1*2DIGIT ]
   bw-mod           = "AS"
                    / "TIAS"
                    / token ; for future extensions
   bw-value         = 1*DIGIT
   ; WSP, DQUOTE and DIGIT defined in [RFC5234]
   ; token and non-ws-string defined in [RFC4566]


                  Figure 2: ABNF for Media Configuration
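
   The following non-normative sketch shows one way a single
   "a=config-id" line could be split into its parts.  It is a
   simplification of the ABNF above, not a full implementation: the
   function names and the returned dictionary layout are assumptions,
   the imageattr value and any extension entries are kept as
   uninterpreted text, and a line folded with "\" as in the examples of
   Section 6.6 is assumed to have been joined beforehand.

```python
def _parse_framerate(value):
    # fr-param is either a range "low-high" or a comma-separated list.
    if "-" in value:
        low, high = value.split("-")
        return ("range", float(low), float(high))
    return ("list", [float(v) for v in value.split(",")])


def parse_config_id(line):
    """Parse one joined "a=config-id:" line into a dictionary."""
    prefix = "a=config-id:"
    if not line.startswith(prefix):
        raise ValueError("not a config-id attribute")
    fields = line[len(prefix):].split()
    if len(fields) < 3:
        raise ValueError("need config-id, direction and at least one entry")
    config_id, direction = fields[0], fields[1]
    if direction not in ("send", "recv"):
        raise ValueError("config-dir must be 'send' or 'recv'")
    entries = {}
    for entry in fields[2:]:
        # Split on the first "=" only; imageattr values contain more.
        key, _, value = entry.partition("=")
        if key in entries:
            raise ValueError("duplicate entry type: " + key)
        if key == "pt":
            entries[key] = [int(pt) for pt in value.split(",")]
        elif key == "framerate":
            entries[key] = _parse_framerate(value)
        elif key == "b":
            modifier, _, rng = value.partition(":")
            low, _, high = rng.partition("-")
            # A single bw-value is treated as a one-point range.
            entries[key] = (modifier, int(low), int(high or low))
        else:
            entries[key] = value  # imageattr and extensions kept as text
    return {"id": config_id, "dir": direction, "entries": entries}


cfg = parse_config_id("a=config-id:b send pt=97 imageattr=[x=640,y=360] "
                      "framerate=25-30 b=AS:150-400")
print(cfg["id"], cfg["dir"], cfg["entries"]["b"])
```

   Splitting on whitespace is sufficient here because none of the entry
   productions in Figure 2 allow whitespace inside an entry.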

   A media configuration is thus identified by:

   config-id:  A token that identifies the media configuration, which
      MUST be unique across all media configurations and media
      descriptions in the SDP.

   config-dir:  The direction of the stream(s) to which the media
      configuration applies, as seen from the party issuing the SDP.

   The media configuration MUST contain at least one and MAY contain
   more of the below media configuration entries.  Each entry type MUST
   NOT appear more than once in any given media configuration.

   pt:  A comma-separated list of media formats, RTP payload types,
      which MUST be defined within the same media description as config-
      id.  This describes the allowed set of codecs or codec
      configurations for this media configuration.  MUST be present in
      every media configuration.

   imageattr:  An OPTIONAL listing of preferred image resolutions for
      this media configuration.  MUST NOT be used with other than video
      and image media types.  An imageattr media configuration entry
      MUST NOT conflict with any "a=imageattr" attribute present in the
      same media description.

   framerate:  An OPTIONAL range or enumeration of preferred framerates
      for this media configuration.  MUST NOT be used with other than
      video media types.  The high end of the range MUST be equal to or
      larger than the low end.  An enumerating framerate media
      configuration entry MUST include the value of the "a=framerate"
      attribute, if any.  A framerate range media configuration entry
      MUST include the "a=framerate" value in the range.

   b: An acceptable bandwidth range for this media configuration.
      Either one of the defined bandwidth modifiers MAY be used, which
      MUST share semantics with corresponding bandwidth modifiers from
      the SDP bandwidth attribute.  The bandwidth value MUST be
      interpreted as defined by the bandwidth modifier.  The high end of
      the range MUST be equal to or larger than the low end.  The high
      end of the range MUST NOT exceed the bandwidth parameter in the
      same media description, if any.  The sum of bandwidth range low
      ends for all media configurations within a media description MUST
      NOT exceed the value of that media description's bandwidth
      parameter.  MUST be present in every media configuration.

   Media configuration entry types "pt" and "b" MUST be supported by all
   implementations of this specification.  Otherwise, an implementation
   MAY ignore any media configuration entry types that are not
   understood.  A media configuration MAY be re-used to describe more
   than a single Source Packet Stream.
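
   A few of the entry rules above lend themselves to a mechanical
   check.  The sketch below is non-normative and covers only a subset:
   single occurrence of each entry type, presence of "pt" and "b",
   payload types taken from the enclosing media description, and range
   ordering.  The function name and the input representation (a list of
   already parsed entries, with ranges as (low, high) pairs and the
   bandwidth modifier omitted) are assumptions made for illustration.

```python
def check_media_configuration(entries, media_pts, media_bw=None):
    """Return a list of rule violations for one media configuration.

    entries:   list of (entry-type, value) pairs in order of appearance
    media_pts: payload types listed in the enclosing media description
    media_bw:  value of the media description's "b=" line, or None
    """
    problems = []
    types = [etype for etype, _ in entries]
    for etype in sorted(set(types)):
        if types.count(etype) > 1:
            problems.append(etype + " appears more than once")
    for required in ("pt", "b"):
        if required not in types:
            problems.append(required + " is missing")
    for etype, value in entries:
        if etype == "pt":
            unknown = [pt for pt in value if pt not in media_pts]
            if unknown:
                problems.append("pt not in media description: %s" % unknown)
        elif etype in ("framerate", "b"):
            low, high = value
            if high < low:
                problems.append(etype + " range high end below low end")
            if etype == "b" and media_bw is not None and high > media_bw:
                problems.append("b range exceeds media-level bandwidth")
    return problems


print(check_media_configuration([("pt", [97]), ("b", (150, 400))], [97], 520))
```

   An empty list means that none of the checked rules were violated; it
   does not imply that the media configuration is valid in every other
   respect.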

6.2.1.  Simulcast Limitations

   The Session and Media level attributes and parameters outside of
   individual media configurations (a=config-id) provide limitations on
   the set of media configurations in simultaneous use.  For example, a
   media description bandwidth limitation using b=AS would apply to all
   the Packet Streams sent within the scope of that media description,
   thus forcing the sum of the media configuration bandwidths in use to
   share that available bandwidth.  Other Packet Streams, such as RTP
   retransmission or FEC flows, also need to be included in that sum.

   There exist a number of different limitations, and this section does
   not intend to be complete.  The payload formats and their
   configurations can impose limitations; for example, video profiles
   and levels impose joint limits on bit-rate, frame-rate and
   resolution.  The bandwidth parameters on session and media
   description level apply according to their semantics and their
   level.  Packetization limitations, e.g. maxptime, as well as
   recommendations apply to all the configurations within the scope
   where the parameter is defined.

   It is important to note that limits, such as bandwidth, expressed
   within a media configuration are not limited by the media description
   values.  First of all, the sum of bit-rates across all media
   configurations in a media description can be greater than the media
   description limit, as not all configurations may be in simultaneous
   use.  For example, only a single configuration can be enabled, which
   is then allowed to consume the full outer limit.  Secondly, the media
   configuration directionality needs to be taken into account; for
   example, SDP receiver limitations are not applied to the sender
   configuration.
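
   The bandwidth sharing described in this section can be illustrated
   with a small non-normative calculation.  The ranges and the media
   level limit of 520 kbit/s are taken from Figure 8; the function
   name, the notion of a "current rate" per configuration and the
   other_streams term (standing in for retransmission or FEC flows) are
   assumptions of this sketch.

```python
def check_usage(ranges, rates, media_bw, other_streams=0):
    """Check currently used bit-rates against the applicable limits.

    ranges:        config-id -> (low, high) bandwidth range in kbit/s
    rates:         config-id -> current rate of each stream in use
    media_bw:      media-level b=AS value in kbit/s
    other_streams: bit-rate of retransmission/FEC flows in the same scope
    """
    for cid, rate in rates.items():
        low, high = ranges[cid]
        if not low <= rate <= high:
            return False  # stream is outside its own configuration range
    # All streams in use share the media description's bandwidth.
    return sum(rates.values()) + other_streams <= media_bw


ranges = {"b": (150, 400), "c": (100, 150)}
print(check_usage(ranges, {"b": 400}, 520))            # c paused
print(check_usage(ranges, {"b": 400, "c": 150}, 520))  # both at high end
```

   With configuration c paused, b may use its full range.  With both in
   use at their high ends the sum is 550 kbit/s, which exceeds the
   media level limit even though each range is individually below it.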

6.2.2.  Declarative Use

   When used as a declarative media description, config-id with recv
   parameter indicates the configured end-point's required media
   configuration to receive a specified set of Source Packet Streams as
   Simulcast streams.  In the same fashion, config-id with send
   parameter requests the end-point to use the specified media
   configuration when sending a specified set of Source Packet Streams
   as Simulcast streams.

6.2.3.  Offer/Answer Use

   An offerer wanting to use Simulcast in a specific direction SHALL use
   config-id to describe the media configurations to use in that
   direction in the Offer.

   An answerer that receives a config-id media configuration for a
   specific direction and agrees to use that media configuration SHALL
   include a corresponding media configuration with the reverse
   direction in the Answer.  The config-id identification value MUST be
   kept between the Offer and the Answer.  An answerer that does not
   accept a specific media configuration SHALL remove it from the
   Answer.

   The Answer MUST keep exactly the same media configuration entry
   types in a media configuration as were present in the corresponding
   media configuration in the Offer.

   The answerer MAY remove values from enumerations and MAY reduce
   ranges of media configuration entries in the Answer.  If the reduced
   media configuration entry relates to the answerer's send direction,
   negotiation is complete and no further action is needed.  If the
   reduced media configuration relates to the answerer's receive
   direction, the offerer SHOULD send another Offer where that related,
   send direction media configuration is reduced at least to the level
   in the previous Answer, but MAY be reduced even more, and MAY be
   removed entirely.
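
   The reduction of ranges and enumerations can be viewed as an
   intersection between what was offered and what the answerer accepts.
   The following non-normative sketch illustrates this; all function
   names are hypothetical, and treating a non-overlapping range as
   "no acceptable value" (None) is an assumption of the sketch.

```python
def reduce_range(offered, acceptable):
    """Intersect an offered (low, high) range with the acceptable one."""
    low = max(offered[0], acceptable[0])
    high = min(offered[1], acceptable[1])
    return (low, high) if low <= high else None


def reduce_enumeration(offered, acceptable):
    """Keep only offered values that are acceptable, in offered order."""
    return [value for value in offered if value in acceptable]


def needs_followup_offer(answer_dir, was_reduced):
    """True when the offerer SHOULD send another, reduced Offer.

    answer_dir is the config-dir in the Answer, as seen by the answerer.
    """
    return was_reduced and answer_dir == "recv"


# Values as in Figures 7 and 8: framerate 25-60 is reduced to 25-30.
print(reduce_range((25, 60), (0, 30)))
print(reduce_range((150, 500), (0, 400)))
```

   Since the result is always contained in the offered range or
   enumeration, the answer can only narrow, never widen, what was
   offered.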

6.3.  Grouping Simulcast Configurations

   A set of media configurations (Section 6.2) is needed to describe a
   Simulcast.  Each Source Packet Stream in the Simulcast shares the
   same Media Source, but has a different media configuration.  Thus,
   the actual grouping of media configurations is what defines a
   specific Simulcast.  It is proposed to define two new media level
   and session level SDP attributes, "a=sim-send" and "a=sim-recv",
   which use config-id values to group media configurations for the
   purpose of Simulcast transmission and reception, respectively.
   "a=sim-send" and "a=sim-recv" MAY be used independently and
   simultaneously.  They MAY be used on session level to group media
   configurations when different Simulcast encodings of a Media Source
   are to be sent in different Media Transports and RTP sessions.  They
   MAY also be used on media level to group media configurations when
   different Simulcast encodings of a Media Source are to be sent based
   on the same media description and thus use the same Media Transport
   and RTP session.  When used on media level, the Simulcast direction
   MAY conflict with the general media description direction, but a
   conflict MUST be interpreted as the Simulcast being effectively
   inhibited.  For example, sim-send in a recvonly media description
   means that no Simulcast Source Packet Streams are sent.
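
   The direction conflict rule can be expressed as a small lookup.
   This sketch is non-normative; only the case of sim-send in a
   recvonly media description is taken from the text above.  The other
   combinations, the function name, and the treatment of an "inactive"
   media description are assumptions that follow the same reasoning.

```python
# Media description directions under which each attribute is effective.
_EFFECTIVE = {
    "sim-send": ("sendrecv", "sendonly"),
    "sim-recv": ("sendrecv", "recvonly"),
}


def simulcast_effective(sim_attr, media_direction="sendrecv"):
    """Return False when the media direction inhibits the Simulcast."""
    return media_direction in _EFFECTIVE[sim_attr]


print(simulcast_effective("sim-send", "recvonly"))
```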

   simulcast         = "a="( "sim-send:" / "sim-recv:" ) config-id-list
   config-id-list    = config-item *(WSP config-item)
   config-item       = config-id [":" config-param-list]
   config-id         = token
   config-param-list = config-param *("," config-param)
   config-param      = "inactive"
                     / token ["=" param-value] ; for future extension
   param-value       = 1*(value-char)
                     / DQUOTE non-ws-string DQUOTE
   value-char        = token-char / %x28 / %x29 / %x2F / %x3A-3C
                     / %x3E-40 / %x5B-5D
                     ; VCHAR except DQUOTE, "," and "="
   ; WSP, DQUOTE and VCHAR defined in [RFC5234]
   ; token, token-char and non-ws-string defined in [RFC4566]


            Figure 3: ABNF for Simulcast Configuration Grouping

   The config-id identification of a media configuration MUST be defined
   by a "config-id" attribute in one of the media descriptions that are
   part of the SDP.
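
   As a non-normative illustration, a "sim-send" or "sim-recv" line
   could be parsed and its config-id references checked as sketched
   below.  The function names are hypothetical, and parameter values
   quoted with DQUOTE are not handled, which is a simplification
   relative to Figure 3.

```python
def parse_simulcast(line):
    """Parse "a=sim-send:..." or "a=sim-recv:..." into (name, items).

    Each item is a (config-id, [config-param, ...]) pair.
    """
    name, _, rest = line.partition(":")
    if name not in ("a=sim-send", "a=sim-recv"):
        raise ValueError("not a Simulcast grouping attribute")
    items = []
    for item in rest.split():
        # A token cannot contain ":", so the first ":" ends the config-id.
        config_id, _, params = item.partition(":")
        items.append((config_id, params.split(",") if params else []))
    return name[2:], items


def undefined_ids(items, defined_ids):
    """Return referenced config-ids that no config-id attribute defines."""
    return [cid for cid, _ in items if cid not in defined_ids]


name, items = parse_simulcast("a=sim-send:b c:inactive")
print(name, items, undefined_ids(items, {"b", "c"}))
```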

6.3.1.  Declarative Use

   When used as a declarative media description, sim-recv indicates the
   configured end-point's required ability to receive Source Packet
   Streams with the specified set of media configurations as Simulcast
   streams.  In the same fashion, sim-send requests the end-point to
   send Source Packet Streams with the specified set of media
   configurations as Simulcast streams.

   The configuration parameter "inactive" SHALL be interpreted as
   meaning that the related Source Packet Stream is in PAUSED state
   [I-D.westerlund-avtext-rtp-stream-pause] at the start of the session,
   and applicable RTP level procedures from that specification SHALL be
   applied.

6.3.2.  Offer/Answer Use

   An offerer wanting to send a set of Source Packet Streams as
   Simulcast streams includes sim-send in the Offer to describe which
   media configurations to use for that Simulcast.  Similarly, an
   offerer wanting to receive a set of Source Packet Streams as
   Simulcast streams includes sim-recv in the Offer to describe which
   media configurations to use for that Simulcast.

   An answerer that receives sim-send and agrees to receive those
   media configurations as Simulcast Source Packet Streams SHALL
   include sim-recv with the accepted media configurations in the
   Answer.  Similarly, an answerer that receives sim-recv and agrees
   to send those media configurations as Simulcast Source Packet
   Streams SHALL include sim-send with the accepted media
   configurations in the Answer.  An answerer MAY remove media
   configurations from sim-send or sim-recv included in the Answer
   compared to the ones included in the sim-send or sim-recv in the
   Offer.  The answerer MUST NOT add any media configurations to
   sim-send or sim-recv in the Answer that were not in the
   corresponding ones in the Offer.

   An "inactive" parameter present in the Offer MUST be kept in the
   Answer.  The Answer MAY add an "inactive" parameter to any of the
   media configurations.  An "inactive" parameter on a media
   configuration in "sim-recv" is equivalent to a PAUSE (or in some
   cases, an equivalent TMMBR 0) message
   [I-D.westerlund-avtext-rtp-stream-pause] being sent for the received
   Source Packet Stream at the start of the session, and applicable RTP
   level procedures from that specification SHALL be applied.  An
   "inactive" parameter on a media configuration in "sim-send" is
   equivalent to the related Source Packet Stream being in PAUSED state
   at the start of the session, and applicable RTP level procedures
   SHALL be applied.
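
   Two of the Answer constraints above, namely that no media
   configuration may be added and that an offered "inactive" parameter
   must be kept, can be checked mechanically.  The sketch below is
   non-normative; the function name and the dictionary representation
   of a sim-send or sim-recv attribute are assumptions.

```python
def check_group_answer(offer, answer):
    """Compare a grouping attribute in the Answer with the Offer.

    offer, answer: dict mapping config-id -> set of config-params
    Returns a list of violations; removals and added "inactive"
    parameters are allowed and therefore not reported.
    """
    problems = []
    for cid in answer:
        if cid not in offer:
            problems.append("added configuration: " + cid)
        elif "inactive" in offer[cid] and "inactive" not in answer[cid]:
            problems.append("dropped 'inactive' on: " + cid)
    return problems


# Grouping from Figures 7 and 8: "a" is removed, "c" becomes inactive.
offer = {"a": set(), "b": set(), "c": set()}
answer = {"b": set(), "c": {"inactive"}}
print(check_group_answer(offer, answer))
```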

   The number of different Source Packet Streams used for a Simulcast
   related to a single media description MUST NOT exceed the number of
   listed media configurations in the corresponding sim-recv in that
   media description sent by the media receiver.

6.4.  Relating Simulcast Versions

   To ensure that Simulcast Packet Streams can be related correctly on
   RTP level, SDES SRCNAME [I-D.westerlund-avtext-rtcp-sdes-srcname]
   MUST be used to label Simulcast versions belonging to the same Media
   Source.  The RTP Header Extension option of that specification MAY be
   used with Simulcast.

   The SRCNAME identifier for Simulcast MUST contain a first part that
   uniquely identifies the Media Source within a given CNAME, followed
   by a single "." (period) and the config-id as defined above
   (Section 6.2).
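
   A non-normative sketch of composing and splitting such a SRCNAME
   follows.  The function names are hypothetical.  Since a config-id
   token may itself contain a period, the sketch assumes that the Media
   Source part never does and splits at the first period.

```python
def make_srcname(source_id, config_id):
    """Compose "<media source>.<config-id>"."""
    if not source_id or "." in source_id:
        raise ValueError("source part must be non-empty without '.'")
    if not config_id:
        raise ValueError("config-id must be non-empty")
    return source_id + "." + config_id


def split_srcname(srcname):
    """Split a Simulcast SRCNAME at the first period."""
    source_id, sep, config_id = srcname.partition(".")
    if not sep or not source_id or not config_id:
        raise ValueError("not a Simulcast SRCNAME: " + srcname)
    return source_id, config_id


print(make_srcname("1", "b"), split_srcname("2.2h"))
```

   Two Source Packet Streams belong to the same Simulcast when they
   share CNAME and the first element returned by split_srcname.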

   The SRCNAME parameter to source-specific signaling [RFC5576]
   ("a=ssrc") MAY be used for Source Packet Streams in the send
   direction to relate SRCNAME to SSRC already in the SDP.

6.5.  Two-Phase Negotiation

   The new "a=sim-send-cap" and "a=sim-recv-cap" attributes MAY be
   included in the SDP as an optional pre-stage in a two-phased
   approach, where the pre-stage involves a first SDP Offer/Answer
   procedure that only establishes Simulcast capability at both the
   offerer and the answerer.  This has the additional advantage of
   avoiding sending media descriptions related to Simulcast to an
   endpoint that does not support Simulcast.  In case two Offer/Answer
   procedures are already used for other reasons, it will not incur any
   significant extra signaling round-trips.  Such other two-phase
   techniques include use of SIP OPTIONS, SIP UPDATE [RFC3311] with
   reliable provisional responses, and BUNDLE
   [I-D.ietf-mmusic-sdp-bundle-negotiation].

   Thus, when using the pre-stage Offer/Answer, it SHOULD NOT include
   any simulcast-grouped media descriptions, which SHOULD then instead
   be added in a main Offer/Answer phase.  When using the pre-stage
   Offer/Answer, half a signaling round-trip time can sometimes be saved
   if the main phase is initiated by the Simulcast receiver, meaning
   that the endpoint that included "a=sim-recv-cap" in the pre-stage
   SDP is the offerer in the main phase.  If both endpoints are
   Simulcast receivers, it does not matter which endpoint sends the
   main Offer, using regular Offer/Answer rules to handle any race
   conditions.

   It is not possible to use any pre-stage to establish capability with
   declarative SDP, in which case it SHALL be by-passed, using only the
   main phase directly.

6.6.  Signaling Examples

   These examples are for a case of client to video conference service
   using a centralized media topology with an RTP mixer.

                    +---+      +-----------+      +---+
                    | A |<---->|           |<---->| B |
                    +---+      |           |      +---+
                               |   Mixer   |
                    +---+      |           |      +---+
                    | F |<---->|           |<---->| J |
                    +---+      +-----------+      +---+

                Figure 4: Four-party Mixer-based Conference

6.6.1.  Unified Plan Client

   Alice is calling in to the mixer with a Simulcast-enabled Unified
   Plan client capable of a single Media Source per media type.  The
   only difference to a non-Simulcast client is capability to send video
   resolution [RFC6236] ("imageattr") and framerate based Simulcast.
   Alice uses a pre-stage Offer, which looks like:

   v=0
   o=alice 2362969037 2362969040 IN IP4 192.0.2.156
   s=Simulcast Enabled Unified Plan Client
   t=0 0
   c=IN IP4 192.0.2.156
   b=AS:665
   a=sim-send-cap:imageattr framerate
   m=audio 49200 RTP/AVP 96 8
   b=AS:145
   a=rtpmap:96 G719/48000/2
   a=rtpmap:8 PCMA/8000
   m=video 49300 RTP/AVP 97
   b=AS:520
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42c01e
   a=imageattr:97 send [x=640,y=360] [x=320,y=180] \
       recv [x=640,y=360] [x=320,y=180]


             Figure 5: Unified Plan Simulcast Pre-Stage Offer

   In this pre-stage, the only thing in the SDP that indicates Simulcast
   capability is the session level line containing the "sim-send-cap"
   attribute, which also indicates that sent Simulcast versions can
   differ in video resolution and/or framerate.

   The Answer from the server indicates both that it too is Simulcast
   capable and that it would prefer to use video resolution
   ("imageattr") based Simulcast, but that it supports both video
   resolution and framerate.  Should it not have been Simulcast capable,
   the "a=sim-recv-cap" line would not have been present and
   communication would have started with the media negotiated in the
   SDP.

   v=0
   o=server 823479283 1209384938 IN IP4 192.0.2.2
   s=Answer to Simulcast Enabled Unified Plan Client
   t=0 0
   c=IN IP4 192.0.2.43
   b=AS:665
   a=sim-recv-cap:imageattr=1.0 framerate=0.8
   m=audio 49200 RTP/AVP 96
   b=AS:145
   a=rtpmap:96 G719/48000/2
   m=video 49300 RTP/AVP 97
   b=AS:520
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42c01e
   a=imageattr:97 send [x=640,y=360] [x=320,y=180] \
       recv [x=640,y=360] [x=320,y=180]


             Figure 6: Unified Plan Simulcast Pre-Stage Answer

   Since the server is the Simulcast media receiver, it immediately
   initiates another Offer/Answer including details on the Simulcast
   versions.  The server also keeps the "sim-recv-cap" as explicit
   Simulcast capability indication in this main Offer/Answer.  Note that
   the "non-simulcast" media can be started already now, before the main
   Offer/Answer, with the only restriction that the Simulcast
   functionality is not yet established.

   v=0
   o=server 823479283 1209384938 IN IP4 192.0.2.2
   s=Server Inviting Simulcast Enabled Unified Plan Client
   t=0 0
   c=IN IP4 192.0.2.43
   b=AS:825
   a=sim-recv-cap:imageattr=1.0 framerate=0.8
   m=audio 49200 RTP/AVP 96
   b=AS:145
   a=rtpmap:96 G719/48000/2
   m=video 49300 RTP/AVP 97
   b=AS:2200
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42c01e
   a=config-id:a recv pt=97 imageattr=[x=640,y=360],[x=1280,y=720] \
       framerate=25-60 b=AS:500-2500
   a=config-id:b recv pt=97 imageattr=[x=320,y=180],[x=640,y=360] \
       framerate=25-60 b=AS:150-500
   a=config-id:c recv pt=97 imageattr=[x=256,y=144],[x=320,y=180] \
       framerate=10-30 b=AS:100-250
   a=sim-recv:a b c


                Figure 7: Unified Plan Simulcast Main Offer

   The server chooses to structure the Offer according to Unified Plan
   and has added three config-id lines in the video media description,
   one for each Simulcast media configuration that it is prepared to
   receive.  Each media configuration refers to a defined media format,
   and lists a set of preferred video resolutions as well as a range of
   acceptable framerates, concluded by a bandwidth range.  It also
   includes the sim-recv attribute for those three media configurations,
   indicating that the Simulcast it is prepared to receive in this media
   description can include one or more of those media configurations.

   Alice's Answer is:

   v=0
   o=alice 2362969037 2362969040 IN IP4 192.0.2.156
   s=Final answer from Simulcast Enabled Unified Plan Client
   t=0 0
   c=IN IP4 192.0.2.156
   b=AS:825
   a=sim-send-cap:imageattr framerate
   m=audio 49200 RTP/AVP 96
   b=AS:145
   a=rtpmap:96 G719/48000/2
   m=video 49300 RTP/AVP 97
   b=AS:520
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42c01e
   a=config-id:b send pt=97 imageattr=[x=640,y=360] \
       framerate=25-30 b=AS:150-400
   a=config-id:c send pt=97 imageattr=[x=320,y=180] \
       framerate=10-12.5 b=AS:100-150
   a=sim-send:b c:inactive
   a=ssrc:31053821 cname=SDIe93850aQFid9P srcname=1.b
   a=ssrc:43298172 cname=SDIe93850aQFid9P srcname=1.c
   a=imageattr:97 send [x=640,y=360] [x=320,y=180] \
       recv [x=640,y=360] [x=320,y=180]


               Figure 8: Unified Plan Simulcast Main Answer

   The Simulcast capability, sim-send-cap, is kept from Alice's previous
   Offer.  One of the media configurations from the server Offer,
   config-id:a, is not acceptable to Alice's client for some reason and
   is removed from the Answer.  The resulting Simulcast, described by
   sim-send, thus contains two media configurations, b and c, where c is
   initially set to "inactive", which effectively means that it is
   paused from the start of the session.  The media configuration
   parameter value ranges are in some cases reduced, which gives a more
   precise definition of what will actually be sent.  This Answer SDP
   also includes a specification of the SSRC values that will be sent
   and what media configurations those SSRCs will carry, by including
   the srcname parameter.  The first part of srcname, before the ".", is
   the Media Source identification.  Both SSRCs share the same Media
   Source identification, since they are part of the same Simulcast.
   The second part, after the ".", is the config-id of the media
   configuration sent with that SSRC.

6.6.2.  Multi-Transport Client

   Bob is calling in to the mixer with a Simulcast-enabled client, like
   Alice's, capable of a single Media Source per media type, but also
   capable of sending Source Packet Streams as Simulcast versions on
   separate Media Transports.  In this example, Bob's client knows that
   the server is capable of Simulcast and does not use any pre-stage
   Offer, but goes straight to the main Offer.

   v=0
   o=bob 94572932847 3429478298 IN IP4 192.0.2.93
   s=Offer from Simulcast Enabled Multi-Transport Client
   t=0 0
   c=IN IP4 192.0.2.93
   b=AS:825
   a=sim-send-cap:imageattr=1.0 framerate=0.9
   a=sim-send:x y
   m=audio 50138 RTP/AVP 101
   b=AS:145
   a=rtpmap:101 G719/48000/2
   m=video 50226 RTP/AVP 118
   b=AS:500
   a=rtpmap:118 H264/90000
   a=fmtp:118 profile-level-id=42c01e
   a=config-id:x send pt=118 imageattr=[x=320,y=180],[x=640,y=360] \
       framerate=25-50 b=AS:200-500
   a=ssrc:3929384298 cname=Nsdko39Oen828FKn srcname=M.x
   a=imageattr:118 send [x=640,y=360] [x=320,y=180] \
       recv [x=640,y=360] [x=320,y=180]
   m=video 50228 RTP/AVP 119
   b=AS:150
   a=rtpmap:119 H264/90000
   a=fmtp:119 profile-level-id=42c01e
   a=config-id:y send pt=119 imageattr=[x=256,y=144],[x=320,y=180] \
       framerate=12.5-25 b=AS:100-200
   a=ssrc:1923419284 cname=Nsdko39Oen828FKn srcname=M.y
   a=imageattr:119 send [x=320,y=180] [x=256,y=144]
   a=sendonly


              Figure 9: Multi-Transport Simulcast Main Offer

   As can be seen from above, this Offer uses sim-send on session level
   and has split the Simulcast media configurations on two media
   descriptions, in order to be able to use separate Media Transports
   and enable differentiated treatment of the two Simulcast streams.

   The server accepts this structure and mirrors it in the Answer:

   v=0
   o=server 283479882 9384298374 IN IP4 192.0.2.2
   s=Server Answering Simulcast Enabled Multi-Transport Client
   t=0 0
   c=IN IP4 192.0.2.45
   b=AS:825
   a=sim-recv-cap:imageattr framerate
   a=sim-recv:x y
   m=audio 49200 RTP/AVP 96
   b=AS:145
   a=rtpmap:96 G719/48000/2
   m=video 49300 RTP/AVP 118
   b=AS:500
   a=rtpmap:118 H264/90000
   a=fmtp:118 profile-level-id=42c01e
   a=config-id:x recv pt=118 imageattr=[x=640,y=360] \
       framerate=25-50 b=AS:350-500
   a=imageattr:118 send [x=640,y=360] [x=320,y=180] \
       recv [x=640,y=360] [x=320,y=180]
   m=video 49300 RTP/AVP 119
   b=AS:150
   a=rtpmap:119 H264/90000
   a=fmtp:119 profile-level-id=42c01e
   a=config-id:y recv pt=119 imageattr=[x=256,y=144] \
       framerate=12.5-25 b=AS:120-150
   a=imageattr:119 recv [x=320,y=180] [x=256,y=144]
   a=recvonly


             Figure 10: Multi-Transport Simulcast Main Answer

6.6.3.  Multi-Source Client

   Fred is calling in to the same conference as in the examples above
   with a three-camera, three-display system, thus capable of handling
   three separate Media Sources in each direction, where each Media
   Source is also Simulcast-enabled in the send direction.  Fred's
   client is a Unified Plan client, restricted to a single Media Source
   per media description.

   v=0
   o=fred 238947129 823479223 IN IP4 192.0.2.125
   s=Offer from Simulcast Enabled Multi-Source Client
   t=0 0
   c=IN IP4 192.0.2.125
   b=AS:825
   a=sim-send-cap:imageattr=1.0 framerate=0.5

   m=audio 49200 RTP/AVP 98
   b=AS:145
   a=rtpmap:98 G719/48000/2

   m=video 49600 RTP/AVP 100
   b=AS:3500
   a=rtpmap:100 H264/90000
   a=fmtp:100 profile-level-id=42c02a
   a=config-id:1h send pt=100 imageattr=[x=1920,y=1080] \
       framerate=30-60 b=AS:2000-3500
   a=config-id:1m send pt=100 imageattr=[x=1280,y=720] \
       framerate=15-60 b=AS:1000-2000
   a=config-id:1l send pt=100 imageattr=[x=640,y=360] \
       framerate=10-60 b=AS:200-1000
   a=sim-send:1h 1m 1l
   a=ssrc:2397234521 cname=EkeS32892FeO29DK srcname=1.1h
   a=ssrc:1023894789 cname=EkeS32892FeO29DK srcname=1.1m
   a=ssrc:4029284928 cname=EkeS32892FeO29DK srcname=1.1l
   a=imageattr:100 send [x=1920,y=1080] [x=1280,y=720] [x=640,y=360] \
       recv [x=1920,y=1080] [x=1280,y=720] [x=640,y=360]

   m=video 49600 RTP/AVP 100
   b=AS:3500
   a=rtpmap:100 H264/90000
   a=fmtp:100 profile-level-id=42c02a
   a=config-id:2h send pt=100 imageattr=[x=1920,y=1080] \
       framerate=30-60 b=AS:2000-3500
   a=config-id:2m send pt=100 imageattr=[x=1280,y=720] \
       framerate=15-60 b=AS:1000-2000
   a=config-id:2l send pt=100 imageattr=[x=640,y=360] \
       framerate=10-60 b=AS:200-1000
   a=sim-send:2h 2m 2l
   a=ssrc:2301017618 cname=EkeS32892FeO29DK srcname=2.2h
   a=ssrc:639711316 cname=EkeS32892FeO29DK srcname=2.2m
   a=ssrc:3293473905 cname=EkeS32892FeO29DK srcname=2.2l
   a=imageattr:100 send [x=1920,y=1080] [x=1280,y=720] [x=640,y=360] \
       recv [x=1920,y=1080] [x=1280,y=720] [x=640,y=360]

   m=video 49600 RTP/AVP 100
   b=AS:3500
   a=rtpmap:100 H264/90000
   a=fmtp:100 profile-level-id=42c02a
   a=config-id:3h send pt=100 imageattr=[x=1920,y=1080] \
       framerate=30-60 b=AS:2000-3500
   a=config-id:3m send pt=100 imageattr=[x=1280,y=720] \
       framerate=15-60 b=AS:1000-2000
   a=config-id:3l send pt=100 imageattr=[x=640,y=360] \
       framerate=10-60 b=AS:200-1000
   a=sim-send:3h 3m 3l
   a=ssrc:4115355057 cname=EkeS32892FeO29DK srcname=3.3h
   a=ssrc:3196538337 cname=EkeS32892FeO29DK srcname=3.3m
   a=ssrc:3757973912 cname=EkeS32892FeO29DK srcname=3.3l
   a=imageattr:100 send [x=1920,y=1080] [x=1280,y=720] [x=640,y=360] \
       recv [x=1920,y=1080] [x=1280,y=720] [x=640,y=360]


            Figure 11: Fred's Multi-Source Simulcast Main Offer

   The three media descriptions for video are essentially the same,
   except that values that need to be unique are given unique values.
   The above also assumes that BUNDLE will be used across these three
   video media descriptions to create a common RTP session.
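
   As an informal illustration, and not as part of the proposed
   signaling, the following Python sketch shows how a receiver of an
   offer like Figure 11 might collect the config-id and sim-send
   attributes of each media description.  The function names unfold
   and parse_simulcast are hypothetical.  The sketch assumes that the
   trailing backslash is only a line-folding convention used in the
   figures, and that config-id parameters are separated by whitespace
   as they appear in the figure.

```python
def unfold(sdp_text):
    """Join each line ending in a backslash with the line after it."""
    lines, pending = [], ""
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines in the figures are presentation only
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
        else:
            lines.append(pending + line)
            pending = ""
    if pending:
        lines.append(pending.rstrip())
    return lines


def parse_simulcast(sdp_text):
    """Return one dict per m= line with its config-ids and sim-send list."""
    media = []
    for line in unfold(sdp_text):
        if line.startswith("m="):
            media.append({"m": line[2:], "configs": {}, "sim_send": []})
        elif not media:
            continue  # session-level lines are ignored in this sketch
        elif line.startswith("a=config-id:"):
            fields = line[len("a=config-id:"):].split()
            ident, direction, params = fields[0], fields[1], fields[2:]
            media[-1]["configs"][ident] = {
                "direction": direction,
                # Split on the first "=" only, so that values such as
                # "[x=1920,y=1080]" and "AS:2000-3500" stay intact.
                "params": dict(p.split("=", 1) for p in params),
            }
        elif line.startswith("a=sim-send:"):
            media[-1]["sim_send"] = line[len("a=sim-send:"):].split()
    return media
```

   Applied to the offer in Figure 11, such a parser would yield three
   video entries, each with three send configurations and a sim-send
   list naming them.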

7.  Network Aspects

   Simulcast is defined as the act of sending multiple alternative
   encodings of the same underlying media source.  When transmitting
   multiple independent streams that originate from the same source, it
   could potentially be done in several different ways using RTP.  A
   general discussion on considerations for use of the different RTP
   multiplexing alternatives can be found in Guidelines for Multiplexing
   in RTP [I-D.ietf-avtcore-multiplex-guidelines].  Discussion and
   clarification on how to handle multiple streams in an RTP session can
   be found in [I-D.ietf-avtcore-rtp-multi-stream].

   The network aspects that are relevant for Simulcast are:

   Quality of Service:  When using Simulcast it might be of interest to
      prioritize a particular Simulcast version, rather than applying
      equal treatment to all versions.  For example, lower bit-rate
      versions may be prioritized over higher bit-rate versions to
      minimize congestion or packet losses in the low bit-rate versions.
      Thus, it is beneficial to use a Simulcast solution that supports
      QoS as well as possible.  By separating Simulcast versions into
      different RTP sessions and sending those RTP sessions over
      different Media Transports, a Simulcast version can be
      prioritized by existing flow based QoS mechanisms.  When using
      unicast, QoS mechanisms based on individual packet marking are
      also feasible; these do not require separation of Simulcast
      versions into different RTP sessions to apply different QoS.

   NAT/FW Traversal:  Using multiple RTP sessions will incur more cost
      for NAT/FW traversal unless they can re-use the same transport
      flow.  Such re-use can be achieved either by multiplexing
      multiple RTP sessions on a single lower layer transport
      [I-D.westerlund-avtcore-transport-multiplexing] or by
      Multiplexing Negotiation Using SDP Port Numbers
      [I-D.ietf-mmusic-sdp-bundle-negotiation].  If flow based QoS with
      any differentiation is desirable, the cost of additional
      transport flows is likely unavoidable.

   Multicast:  Multiple RTP sessions will be required to enable
      combining Simulcast with multicast.  Different Simulcast versions
      have to be separated into different multicast groups to allow a
      multicast receiver to pick the version it wants, rather than
      receive all of them.  In this case, the only reasonable
      implementation is to use different RTP sessions for each multicast
      group so that reporting and other RTCP functions operate as
      intended.
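
   The per-packet marking alternative mentioned under Quality of
   Service can be sketched as follows.  This is an illustration only.
   The mapping from Simulcast version to DSCP value (AF41, AF42, AF43)
   is an assumption of this example and is not defined by this
   document, the names tos_byte and send_marked are hypothetical, and
   whether a network honors the marking is deployment specific.  The
   sketch uses the IP_TOS socket option for IPv4.

```python
import socket

# Hypothetical mapping: the lowest bit-rate version gets the lowest
# drop precedence within one Assured Forwarding class
# (AF41 = 34, AF42 = 36, AF43 = 38).
DSCP_BY_VERSION = {"low": 34, "mid": 36, "high": 38}


def tos_byte(dscp):
    """Place the 6-bit DSCP in the upper six bits of the IPv4 TOS octet."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def send_marked(sock, payload, addr, version):
    """Send one packet of a Simulcast version with that version's DSCP."""
    tos = tos_byte(DSCP_BY_VERSION[version])
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return sock.sendto(payload, addr)
```

   Because all versions are sent on the same socket, they share one
   transport flow and one RTP session, which is the property that
   distinguishes packet marking from flow based QoS.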

8.  IANA Considerations

   This document requests registration of five new SDP attributes:
   sim-send-cap, sim-recv-cap, sim-send, sim-recv, and config-id.  It
   also requests creation of a new registry of defined parameters taken
   from existing SDP attributes for sim-send-cap, sim-recv-cap, and
   config-id.

   Formal registrations to be written.

9.  Security Considerations

   The Simulcast capability and configuration attributes and parameters
   are vulnerable to attacks in signaling.

   A false inclusion of Simulcast attributes may result in generation of
   a second phase SDP that potentially contains a large number of
   unsupported media descriptions expressing Simulcast alternatives.  A
   correct SDP implementation will, however, be able to reject any
   unsupported media descriptions, so the effect should be limited.

   A hostile removal of the Simulcast attributes will cause any second
   phase Offer/Answer to be skipped, with the result that Simulcast is
   not used.

   The Simulcast grouping semantics are vulnerable to attacks in the
   signaling.  Changing the set of media configurations that are used
   in a Simulcast will impact the number of Source Packet Streams.

   A hostile removal of Simulcast grouping will prevent streams from
   being interpreted as Simulcast, which obviously prevents use of the
   Simulcast functionality.  It will also risk that intended Simulcast
   streams are instead presented as separate, independent streams to a
   receiver.

   None of the above is likely to have major consequences, and all can
   be mitigated by signaling that is at least integrity protected and
   source authenticated, which prevents an attacker from changing it.

10.  Contributors

   Morgan Lindqvist and Fredrik Jansson, both from Ericsson,
   contributed important material to the first versions of this
   document.

11.  Acknowledgements

12.  References

12.1.  Normative References

   [I-D.westerlund-avtext-rtcp-sdes-srcname]
              Westerlund, M., "RTCP Source Description Item SRCNAME to
              Label Individual Media Sources", draft-westerlund-avtext-
              rtcp-sdes-srcname-03 (work in progress), October 2013.

   [I-D.westerlund-avtext-rtp-stream-pause]
              Akram, A., Burman, B., Grondal, D., and M. Westerlund,
              "RTP Media Stream Pause and Resume", draft-westerlund-
              avtext-rtp-stream-pause-03 (work in progress), October
              2012.

   [I-D.westerlund-mmusic-max-ssrc]
              Holmberg, C., Westerlund, M., and F. Jansson, "Multiple
              Synchronization Sources (SSRC) in SDP Media Descriptions",
              draft-westerlund-mmusic-max-ssrc-02 (work in progress),
              September 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3311]  Rosenberg, J., "The Session Initiation Protocol (SIP)
              UPDATE Method", RFC 3311, October 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
              Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

   [RFC6236]  Johansson, I. and K. Jung, "Negotiation of Generic Image
              Attributes in the Session Description Protocol (SDP)", RFC
              6236, May 2011.

12.2.  Informative References

   [I-D.ietf-avtcore-multiplex-guidelines]
              Westerlund, M., Perkins, C., and H. Alvestrand,
              "Guidelines for using the Multiplexing Features of RTP to
              Support Multiple Media Streams", draft-ietf-avtcore-
              multiplex-guidelines-01 (work in progress), July 2013.

   [I-D.ietf-avtcore-rtp-multi-stream]
              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
              "Sending Multiple Media Streams in a Single RTP Session",
              draft-ietf-avtcore-rtp-multi-stream-01 (work in progress),
              July 2013.

   [I-D.ietf-avtcore-rtp-topologies-update]
              Westerlund, M. and S. Wenger, "RTP Topologies", draft-
              ietf-avtcore-rtp-topologies-update-00 (work in progress),
              April 2013.

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Multiplexing Negotiation Using Session Description
              Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp-
              bundle-negotiation-05 (work in progress), October 2013.

   [I-D.lennox-raiarea-rtp-grouping-taxonomy]
              Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro,
              "A Taxonomy of Grouping Semantics and Mechanisms for Real-
              Time Transport Protocol (RTP) Sources", draft-lennox-
              raiarea-rtp-grouping-taxonomy-03 (work in progress),
              October 2013.

   [I-D.westerlund-avtcore-transport-multiplexing]
              Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a
              Single Lower-Layer Transport", draft-westerlund-avtcore-
              transport-multiplexing-06 (work in progress), August 2013.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC3569]  Bhattacharyya, S., "An Overview of Source-Specific
              Multicast (SSM)", RFC 3569, July 2003.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
              July 2006.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
              January 2008.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245, April
              2010.

   [RFC6190]  Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
              "RTP Payload Format for Scalable Video Coding", RFC 6190,
              May 2011.

Appendix A.  Discussion on Receiver Diversity

   Receiver diversity can be handled in a number of different ways, each
   with its own advantages and disadvantages.  The approaches trade off
   RTP Mixer processing requirements, bandwidth usage on the uplink from
   the sending Participant to the RTP Mixer, bandwidth usage on the
   downlink from the RTP Mixer to the receiving Participant, and media
   Quality of Experience (QoE) at the receiving Participant.

   The following is a listing of possible approaches:

   1.  Lowest Common Denominator: Create a single Source Packet Stream
       per Media Source and, assuming that everyone can receive a
       "simple" stream, adapt the characteristics of that Source Packet
       Stream already at the sending Participant to the lowest common
       denominator among all receiving Participants.  Let the RTP Mixer
       forward this single Source Packet Stream to all receiving
       Participants.  The advantages are low bandwidth usage on both
       uplink and downlink and low RTP Mixer processing requirements.
       The disadvantage is that the least capable receiver and/or
       network path dictates the (low) QoE for everyone else.

   2.  Individual Transcoding: Create a single Source Packet Stream per
       Media Source with characteristics governed by resources available
       to the sending Participant and the network path to the RTP Mixer.
       Let the RTP Mixer transcode (decode and re-encode) that into
       individual Source Packet Streams for each receiving Participant,
       governed by the RTP Mixer resources, receiving Participant
       resources, and the network path to that Participant.  The
       advantages are QoE that is adapted to each Participant, although
       slightly lowered overall due to transcoding, and optimised
       bandwidth usage on both uplink and downlink.  The disadvantage
       is (very) high RTP Mixer processing requirements.

   3.  Individual Simulcast: Create individual Source Packet Streams of
       each Media Source to each receiving Participant, constituting a
       complete individual Simulcast.  Let the RTP Mixer forward each
       individual Source Packet Stream to the targeted receiving
       Participant.  The advantages are low RTP Mixer processing and
       optimised downlink bandwidth.  The disadvantage is (very) high
       uplink bandwidth.

   4.  Grouped Simulcast: For each Media Source, create a "suitable"
       logical grouping of receiving Participants in sub-groups with
       respect to available receiver resources, for example the
       resources listed above (Section 3.1).  Create a set of Source
       Packet Streams for this Media Source with well-chosen
       characteristics, where each Source Packet Stream in the set is a
       good-enough fit to the receiving sub-group of Participants.  This
       set of Source Packet Streams constitutes a Simulcast of the Media
       Source.  The size of the set and the characteristics of each
       Source Packet Stream can be adjusted to cater for various
       restrictions in the sending Participant, receiving Participants
       in the sub-group, and network path(s) to the Participants in the
       sub-group.  Let the RTP Mixer forward the same Source Packet
       Stream to all Participants in a sub-group, for all Source Packet
       Streams and sub-groups.  The advantages are low RTP Mixer
       processing, near optimum QoE, and near optimum downlink
       bandwidth.  The disadvantages are high uplink bandwidth and
       arguably that downlink bandwidth and QoE are optimum only for a
       sub-group and not per individual receiving Participant.

   A summary of the advantages and disadvantages of the above four
   principal alternatives is given below (Table 1):

     +--------+-----------+-----------+--------------+--------------+
     | Method | Mixer CPU | Uplink    | Downlink     | QoE          |
     +--------+-----------+-----------+--------------+--------------+
     | 1      | Low       | Low       | Low          | Low          |
     | 2      | Very high | Optimum   | Optimum      | Near optimum |
     | 3      | Low       | Very high | Optimum      | Optimum      |
     | 4      | Low       | High      | Near optimum | Near optimum |
     +--------+-----------+-----------+--------------+--------------+

              Table 1: Receiver Diversity Handling Comparison

   The authors of this document believe that alternative 4, the Grouped
   Simulcast, can be a good tradeoff whenever supported by sufficient
   uplink resources.
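
   The Grouped Simulcast alternative can be illustrated with a small
   Python sketch.  It is not a mechanism defined by this document: the
   grouping rule (each receiving Participant gets the highest-rate
   Source Packet Stream that fits its downlink, with the lowest-rate
   stream as fallback), the function names, and all bit-rates and
   Participant names are assumptions chosen for the example.

```python
def group_receivers(streams_kbps, receivers_kbps):
    """Assign each receiver to the highest-rate stream its downlink fits."""
    groups = {name: [] for name in streams_kbps}
    # Streams ordered from highest to lowest bit-rate.
    ordered = sorted(streams_kbps.items(), key=lambda kv: kv[1],
                     reverse=True)
    lowest = ordered[-1][0]
    for receiver, downlink in receivers_kbps.items():
        # Fall back to the lowest-rate stream if nothing fits.
        choice = next((name for name, rate in ordered
                       if rate <= downlink), lowest)
        groups[choice].append(receiver)
    return groups


def uplink_kbps(streams_kbps, groups):
    """Sum the rates of the streams that at least one receiver uses."""
    return sum(rate for name, rate in streams_kbps.items()
               if groups[name])


# Hypothetical example values.
STREAMS = {"h": 3500, "m": 2000, "l": 1000}
RECEIVERS = {"alice": 5000, "bob": 2500, "carol": 800}
GROUPS = group_receivers(STREAMS, RECEIVERS)
```

   In this example each stream serves one sub-group, so the sender's
   uplink carries the sum of all three stream rates.  A sender with
   limited uplink could drop streams whose sub-group is empty, which
   is the uplink versus QoE tradeoff summarized in Table 1.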

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com


   Bo Burman
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 13 11
   Email: bo.burman@ericsson.com


   Suhas Nandakumar
   Cisco
   170 West Tasman Drive
   San Jose, CA  95134
   USA

   Email: snandaku@cisco.com
















Westerlund, et al.       Expires April 25, 2014                [Page 34]