Network Working Group B. Burman
Internet-Draft M. Westerlund
Intended status: Standards Track Ericsson
Expires: April 21, 2016 S. Nandakumar
M. Zanaty
Cisco
October 19, 2015
Using Simulcast in SDP and RTP Sessions
draft-ietf-mmusic-sdp-simulcast-03
Abstract
In some application scenarios it may be desirable to send multiple
differently encoded versions of the same media source in different
RTP streams. This is called simulcast. This document discusses the
best way of accomplishing simulcast in RTP and how to signal it in
SDP. A solution is defined by making an extension to SDP, and using
RTP/RTCP identification methods to relate RTP streams belonging to
the same media source. The SDP extension consists of a new media
level SDP attribute that expresses capability to send and/or receive
simulcast RTP streams. RTP/RTCP identification using either payload
types or a separately defined method for RTP stream configuration is
defined.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 21, 2016.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
Burman, et al. Expires April 21, 2016 [Page 1]
Internet-Draft Simulcast October 2015
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3
2.2. Requirements Language . . . . . . . . . . . . . . . . . . 4
3. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 4
3.1. Reaching a Diverse Set of Receivers . . . . . . . . . . . 5
3.2. Application Specific Media Source Handling . . . . . . . 6
3.3. Receiver Media Source Preferences . . . . . . . . . . . . 7
4. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 7
5. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 8
6. Detailed Description . . . . . . . . . . . . . . . . . . . . 9
6.1. Simulcast Capability . . . . . . . . . . . . . . . . . . 9
6.1.1. Declarative Use . . . . . . . . . . . . . . . . . . . 11
6.1.2. Offer/Answer Use . . . . . . . . . . . . . . . . . . 12
6.2. Relating Simulcast Streams . . . . . . . . . . . . . . . 14
6.3. Signaling Examples . . . . . . . . . . . . . . . . . . . 14
6.3.1. Unified Plan Client . . . . . . . . . . . . . . . . . 14
6.3.2. Multi-Source Client . . . . . . . . . . . . . . . . . 16
7. Network Aspects . . . . . . . . . . . . . . . . . . . . . . . 18
8. Limitations . . . . . . . . . . . . . . . . . . . . . . . . . 18
8.1. Single RTP Session . . . . . . . . . . . . . . . . . . . 18
8.2. SDP Format Identification . . . . . . . . . . . . . . . . 19
8.3. RID Identification . . . . . . . . . . . . . . . . . . . 19
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
10. Security Considerations . . . . . . . . . . . . . . . . . . . 20
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 20
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 20
13. References . . . . . . . . . . . . . . . . . . . . . . . . . 20
13.1. Normative References . . . . . . . . . . . . . . . . . . 20
13.2. Informative References . . . . . . . . . . . . . . . . . 21
Appendix A. Changes From Earlier Versions . . . . . . . . . . . 23
A.1. Modifications Between WG Version -02 and -03 . . . . . . 23
A.2. Modifications Between WG Version -01 and -02 . . . . . . 24
A.3. Modifications Between WG Version -00 and -01 . . . . . . 24
A.4. Modifications Between Individual Version -00 and WG
Version -00 . . . . . . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24
1. Introduction
Most of today's multiparty video conference solutions make use of
centralized servers to reduce the bandwidth and CPU consumption in
the endpoints. Those servers receive RTP streams from each
participant and send some suitable set of possibly modified RTP
streams to the rest of the participants, which usually have
heterogeneous capabilities (screen size, CPU, bandwidth, codec, etc.).
One of the biggest issues is how to perform RTP stream adaptation to
different participants' constraints with the minimum possible impact
on both video quality and server performance.
Simulcast is defined in this memo as the act of simultaneously
sending multiple different encoded streams of the same media source,
e.g. the same video source encoded with different video encoder types
or image resolutions. This can be done in several ways and for
different purposes. This document focuses on the case where it is
desirable to provide a media source as multiple encoded streams over
RTP [RFC3550] towards an intermediary so that the intermediary can
provide the wanted functionality by selecting which RTP stream(s) to
forward to other participants in the session, and more specifically
how the identification and grouping of the involved RTP streams are
done. From an RTP perspective, simulcast is a specific application
of the aspects discussed in RTP Multiplexing Guidelines
[I-D.ietf-avtcore-multiplex-guidelines].
This document describes a few scenarios that motivate the use of
simulcast, and also defines the SDP signaling needed for it.
2. Definitions
2.1. Terminology
This document makes use of the terminology defined in RTP Taxonomy
[I-D.ietf-avtext-rtp-grouping-taxonomy], RTP Topology [RFC5117] and
RTP Topologies Update [I-D.ietf-avtcore-rtp-topologies-update]. In
addition, the following terms are used:
RTP Mixer: An RTP middle node, defined in [RFC5117] (Section 3.4:
Topo-Mixer), further elaborated and extended with other topologies
in [I-D.ietf-avtcore-rtp-topologies-update] (Section 3.6 to 3.9).
RTP Switch: A common short term for the terms "switching RTP mixer",
"source projecting middlebox", and "video switching MCU" as
discussed in [I-D.ietf-avtcore-rtp-topologies-update].
Simulcast Stream: One Encoded Stream or Dependent Stream from a set
of concurrently transmitted Encoded Streams and optional Dependent
Streams, all sharing a common Media Source, as defined in
[I-D.ietf-avtext-rtp-grouping-taxonomy]. Decoding a Dependent
Stream also requires the related (Dependent and) Encoded
Stream(s), but in the context of simulcast that is considered a
property of the Dependent Stream constituting the simulcast
stream. An example is HD and thumbnail video simulcast versions of
a single Media Source, sent concurrently as separate RTP Streams.
Simulcast Format: Different formats of a simulcast stream serve the
same purpose as alternative RTP payload types in non-simulcast
SDP, to allow multiple alternative media formats for a given RTP
Stream. As for multiple RTP payload types on the m-line, any one
of the alternative formats can be used at a given point in time,
but not more than one (based on RTP timestamp), and what format is
used can change dynamically from one RTP packet to another. For
example, if all participants in a group video call can decode
H.264 and H.265 video, but only some can encode H.265, both H.264
and H.265 can be kept as alternative formats, and the format may
dynamically switch between H.264 and H.265 as different
participants become the active speaker.
2.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Use Cases
Many use cases of simulcast as described in this document relate to a
multi-party communication session where one or more central nodes are
used to adapt the view of the communication session towards
individual participants, and facilitate the media transport between
participants. Thus, these cases target the RTP Mixer type of
topology.
There are two principal approaches for an RTP Mixer to provide this
adapted view of the communication session to each receiving
participant:
o Transcoding (decoding and re-encoding) received RTP streams with
characteristics adapted to each receiving participant. This often
includes mixing or composition of media sources from multiple
participants into a mixed media source originated by the RTP
Mixer. The main advantage of this approach is that it achieves
close to optimal adaptation to individual receiving participants.
The main disadvantages are that it can be very computationally
expensive to the RTP Mixer and typically also degrades media
Quality of Experience (QoE) such as end-to-end delay for the
receiving participants.
o Switching a subset of all received RTP streams or sub-streams to
each receiving participant, where the used subset is typically
specific to each receiving participant. The main advantages of
this approach are that it is computationally cheap to the RTP
Mixer and it has very limited impact on media QoE. The main
disadvantage is that it can be difficult to combine a subset of
received RTP streams into a perfect fit to the resource situation
of a receiving participant.
The use of simulcast relates to the latter approach, where it is more
important to reduce the load on the RTP Mixer and/or minimize QoE
impact than to achieve an optimal adaptation of resource usage.
3.1. Reaching a Diverse Set of Receivers
The media sources provided by a sending participant potentially need
to reach several receiving participants that differ in terms of
available resources. The receiver resources that typically differ
include, but are not limited to:
Codec: This includes codec type (such as SDP MIME type) and can
include codec configuration options (e.g. SDP fmtp parameters).
Two codec resources that differ only in codec configuration are
considered "different" if they are not "compatible", for example
if they differ in video codec profile or in the transport
packetization configuration.
Sampling: This relates to how the media source is sampled, in
spatial as well as in temporal domain. For video streams, spatial
sampling affects image resolution and temporal sampling affects
video frame rate. For audio, spatial sampling relates to the
number of audio channels and temporal sampling affects audio
bandwidth. This may be used to suit different rendering
capabilities or needs at the receiving endpoints, as well as a
method to achieve different transport capabilities, bitrates and
eventually QoE by controlling the amount of source data.
Bitrate: This relates to the amount of bits spent per second to
transmit the media source as an RTP stream, which typically also
affects the Quality of Experience (QoE) for the receiving user.
Letting the sending participant create a simulcast of a few
differently configured RTP streams per media source can be a good
tradeoff when using an RTP switch as middlebox, instead of sending a
single RTP stream and using an RTP mixer to create individual
transcodings to each receiving participant.
This requires that the receiving participants can be categorized in
terms of available resources and that the sending participant can
choose a matching configuration for a single RTP stream per category
and media source.
For example, assume for simplicity a set of receiving participants
that differ only in that some have support to receive Codec A, and
the others have support to receive Codec B. Further assume that the
sending participant can send both Codec A and B. It can then reach
all receivers by creating two simulcasted RTP streams from each media
source; one for Codec A and one for Codec B.
In another simple example, a set of receiving participants differ
only in screen resolution; some are able to display video with at
most 360p resolution and some support 720p resolution. A sending
participant can then reach all receivers by creating a simulcast of
RTP streams with 360p and 720p resolution for each sent video media
source.
In more elaborate cases, the receiving participants differ both in
available sampling and bitrate, and maybe also codec, and it is up to
the RTP switch to find a good trade-off in which simulcasted stream
to choose for each intended receiver. It is also the responsibility
of the RTP switch to negotiate a good fit of simulcast streams with
the sending participant.
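As an illustration only, the trade-off made by the RTP switch could be
sketched as picking, per receiver, the best simulcast stream that fits
the receiver's codec, resolution and bitrate limits. The function name
and all dictionary field names below are assumptions of this example;
this document does not define any such selection algorithm.

```python
def pick_stream(streams, receiver):
    """Pick the best-fitting simulcast stream for one receiver (sketch).

    Each stream is a dict with "codec", "height" and "kbps" keys, and
    the receiver is a dict with "codecs", "max_height" and "max_kbps";
    these field names are assumptions of this example.
    """
    candidates = [
        s for s in streams
        if s["codec"] in receiver["codecs"]
        and s["height"] <= receiver["max_height"]
        and s["kbps"] <= receiver["max_kbps"]
    ]
    # Prefer the highest resolution, then the highest bitrate, that fits.
    # None is returned when no offered stream suits the receiver.
    return max(candidates, key=lambda s: (s["height"], s["kbps"]), default=None)
```

A real RTP switch would likely also weigh frame rate, current network
conditions and application layout, which this sketch ignores.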
The maximum number of simulcasted RTP streams that can be sent is
mainly limited by the amount of processing and uplink network
resources available to the sending participant.
3.2. Application Specific Media Source Handling
The application logic that controls the communication session may
include special handling of some media sources. It is for example
commonly the case that the media from a sending participant is not
sent back to itself.
It is also common that a currently active speaker participant is
shown in larger size or higher quality than other participants (the
sampling or bitrate aspects of Section 3.1). Not sending the active
speaker media back to itself means there is some other participant's
media that instead has to receive special handling towards the active
speaker; typically the previous active speaker. This way, the
previously active speaker is needed both in larger size (to current
active speaker) and in small size (to the rest of the participants),
which can be solved with a simulcast from the previously active
speaker to the RTP switch.
3.3. Receiver Media Source Preferences
The application logic that controls the communication session may
allow receiving participants to apply preferences to the
characteristics of the RTP stream they receive, for example in terms
of the aspects listed in Section 3.1. Sending a simulcast of RTP
streams is one way of accommodating receivers with conflicting or
otherwise incompatible preferences.
4. Requirements
The following requirements need to be met to support the use cases in
previous sections:
REQ-1: Identification. It must be possible to identify a set of
simulcasted RTP streams as originating from the same media source:
REQ-1.1: In SDP signaling.
REQ-1.2: On RTP/RTCP level.
REQ-2: Transport usage. The solution must work when using:
REQ-2.1: Legacy SDP with separate media transports per SDP media
description.
REQ-2.2: Bundled [I-D.ietf-mmusic-sdp-bundle-negotiation] SDP
media descriptions.
REQ-3: Capability negotiation. It must be possible that:
REQ-3.1: Sender can express capability of sending simulcast.
REQ-3.2: Receiver can express capability of receiving simulcast.
REQ-3.3: Sender can express maximum number of simulcast streams
that can be provided.
REQ-3.4: Receiver can express maximum number of simulcast streams
that can be received.
REQ-3.5: Sender can detail the characteristics of the simulcast
streams that can be provided.
REQ-3.6: Receiver can detail the characteristics of the simulcast
streams that it prefers to receive.
REQ-4: Distinguishing features. It must be possible to have
different simulcast streams use different codec parameters, as can
be expressed by SDP format values and RTP payload types.
REQ-5: Compatibility. It must be possible to use simulcast in
combination with other RTP mechanisms that generate additional RTP
streams:
REQ-5.1: RTP Retransmission [RFC4588].
REQ-5.2: RTP Forward Error Correction [RFC5109].
REQ-5.3: Related payload types such as audio Comfort Noise and/or
DTMF.
REQ-6: Interoperability. The solution must be possible to use in:
REQ-6.1: Interworking with non-simulcast legacy clients using a
single media source per media type.
REQ-6.2: WebRTC "Unified Plan" environment with a single media
source per SDP media description.
5. Overview
As an overview, the above requirements are met by signaling simulcast
capability and configurations in SDP [RFC4566]:
o An offer or answer can contain a number of simulcast streams,
separate for send and receive directions.
o An offer or answer can contain multiple, alternative simulcast
streams in the same fashion as multiple, alternative codecs can be
offered in a media description.
o A single media source per SDP media description is assumed, which
is aligned with the concepts defined in
[I-D.ietf-avtext-rtp-grouping-taxonomy] and will specifically work
in a WebRTC context, both with and without BUNDLE
[I-D.ietf-mmusic-sdp-bundle-negotiation] grouping.
o The codec configuration for a simulcast stream can be expressed in
two alternative ways, with complementing drawbacks and benefits:
* Through existing SDP formats (corresponding to RTP payload
types), enabling the use of simulcast with a minimum set of
additions to existing SDP specifications.
* Through use of a separately specified RTP-level identification
mechanism [I-D.pthatcher-mmusic-rid], which complements and
effectively extends the available simulcast stream
identification and configuration possibilities provided by
using SDP formats.
o It is possible, but not required, to use source-specific signaling
[RFC5576] with the proposed solution.
6. Detailed Description
This section further details the overview above (Section 5).
6.1. Simulcast Capability
Simulcast capability is expressed as a new media level SDP attribute,
"a=simulcast". For each desired direction (send/recv), the simulcast
attribute defines a list of simulcast streams (separated by
semicolons), each of which is a list of simulcast formats (separated
by commas). The meaning of the attribute on SDP session level is
undefined and MUST NOT be used. The ABNF [RFC5234] for this
attribute is:
sc-attr = "a=simulcast:" 1*2( WSP sc-str-list ) [WSP sc-pause-list]
sc-str-list = sc-dir WSP sc-id-type "=" sc-alt-list *( ";" sc-alt-list )
sc-pause-list = "paused=" sc-alt-list
sc-dir = "send" / "recv"
sc-id-type = "pt" / "rid" / token
sc-alt-list = sc-id *( "," sc-id )
sc-id = fmt / rid-identifier / token
; WSP defined in [RFC5234]
; fmt, token defined in [RFC4566]
; rid-identifier defined in [I-D.pthatcher-mmusic-rid]
Figure 1: ABNF for Simulcast
There are separate and independent sets of parameters for simulcast
in send and receive directions. When listing multiple directions,
each direction MUST NOT occur more than once on the same line.
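As a non-normative illustration of the ABNF in Figure 1, the following
Python sketch splits an "a=simulcast" line into its directions,
simulcast streams (semicolon-separated) and alternative formats
(comma-separated). The function name parse_simulcast and the layout of
the returned dictionary are assumptions of this example. The sketch
does not check identifiers against the "fmt" or "rid-identifier"
syntax, and does not enforce that "paused=" comes last.

```python
def parse_simulcast(line):
    """Parse an "a=simulcast" line (hypothetical helper, not normative).

    Returns a dict with one entry per listed direction, e.g.
    {"send": ("pt", [["97"], ["98"]])}, plus "paused" if present.
    """
    prefix = "a=simulcast:"
    if not line.startswith(prefix):
        raise ValueError("not an a=simulcast attribute")
    tokens = line[len(prefix):].split()
    result = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in ("send", "recv"):
            if token in result:
                raise ValueError("direction listed more than once: " + token)
            if i + 1 >= len(tokens) or "=" not in tokens[i + 1]:
                raise ValueError("missing stream list for " + token)
            id_type, _, streams = tokens[i + 1].partition("=")
            # Semicolons separate simulcast streams; commas separate
            # alternative formats within one simulcast stream.
            result[token] = (id_type, [s.split(",") for s in streams.split(";")])
            i += 2
        elif token.startswith("paused="):
            result["paused"] = token[len("paused="):].split(",")
            i += 1
        else:
            raise ValueError("unexpected token: " + token)
    if "send" not in result and "recv" not in result:
        raise ValueError("at least one direction is required")
    return result
```

For example, parsing "a=simulcast: send pt=97;98 recv pt=97" yields two
single-format simulcast streams in the "send" direction and one in the
"recv" direction.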
Two simulcast stream identification methods are defined; "pt" using
RTP payload type (SDP format), and "rid" using an additional RTP-
level identification mechanism [I-D.pthatcher-mmusic-rid]. Different
identification methods MUST NOT be used for different directions on a
single "a=simulcast" line. Implementations that support both
identification methods MAY include one "a=simulcast" line for each
identification method for the same "m="-line. Multiple "a=simulcast"
lines with the same identification method MUST NOT be used for a
single "m="-line.
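The identification method rules above can be sketched as a simple
check over all "a=simulcast" lines under one "m="-line. The function
name and the input layout (one dictionary per line, mapping direction
to identification method) are assumptions made for this example.

```python
def check_identification_methods(lines):
    """Check the identification-method rules for one "m="-line.

    `lines` is a list with one dict per "a=simulcast" line, mapping each
    listed direction ("send"/"recv") to its identification method
    ("pt"/"rid"). This layout is an assumption of the sketch.
    Returns a list of human-readable violations (empty if none).
    """
    problems = []
    seen_methods = set()
    for number, directions in enumerate(lines, start=1):
        methods = set(directions.values())
        if len(methods) > 1:
            # Different methods for different directions on one line.
            problems.append("line %d mixes identification methods" % number)
            continue
        for method in methods:
            if method in seen_methods:
                # Same method used on more than one line.
                problems.append("line %d repeats method %s" % (number, method))
            seen_methods.add(method)
    return problems
```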
Attribute parameters are grouped by direction and consist of a
listing of simulcast stream identifications to be used. The number
of (non-alternative, see below) identifications in the list sets a
limit to the number of supported simulcast streams in that direction.
The order of the listed simulcast streams in the "send" direction
suggests a proposed order of preference, in decreasing order: the
stream listed first is the most preferred, and subsequent
streams have progressively lower preference. The order of the listed
simulcast streams in the "recv" direction expresses which simulcast
streams are preferred, with the leftmost being
most preferred. This can be of importance if the number of actually
sent simulcast streams has to be reduced for some reason.
Formats that have explicit dependencies [RFC5583]
[I-D.pthatcher-mmusic-rid] to other formats (even in the same media
description) MAY be listed as different simulcast streams.
Alternative simulcast formats MAY be specified as part of the
attribute parameters by expressing each simulcast stream as a comma-
separated list of alternative format identifiers. In this case,
there MUST NOT be any capability restriction in what alternative
formats can be used across different simulcast streams, like
requiring all simulcast streams to use the same codec format
alternative. The order of the format alternatives within a simulcast
stream is significant; the alternatives are listed from (left) most
preferred to (right) least preferred. For the use of simulcast, this
overrides the normal codec preference as expressed by format type
ordering on the "m="-line, using regular SDP rules. This is to
enable a separation of general codec preferences and simulcast stream
configuration preferences.
A simulcast stream can use a codec defined such that the same RTP
SSRC can change RTP payload type multiple times during a session,
possibly even on a per-packet basis. A typical example can be a
speech codec that makes use of Comfort Noise [RFC3389] and/or DTMF
[RFC4733] formats. In those cases, such "related" formats MUST NOT
be listed explicitly in the attribute parameters, since they are not
strictly simulcast streams of the media source, but rather a specific
way of generating the RTP stream of a single simulcast stream with
varying RTP payload type. Instead, only a single simulcast stream
identification MUST be used per simulcast stream or alternative
simulcast format (if there are such) in the SDP. The used simulcast
stream identification SHOULD be the codec format most relevant to the
media description, if possible to identify, for example the audio
codec rather than the DTMF. What codec format to choose in the case
of switching between multiple equally "important" formats is left
open, but it is assumed that in the presence of such strong relation
it does not matter which is chosen.
If RTP stream pause/resume [I-D.ietf-avtext-rtp-stream-pause] is
supported, the optional "paused=" parameter MAY be used in
conjunction with "rid" simulcast stream identification to specify
that a certain simulcast stream is initially paused already from
start of the RTP session. In this case, support for RTP stream
pause/resume MUST also be included under the same "m="-line listing
"a=simulcast". Initially paused simulcast streams MUST NOT be used
with "pt" identification. Initially paused simulcast streams are
resumed as described by the RTP pause/resume specification.
An initially paused simulcast stream in "send" direction MUST be
considered equivalent to an unsolicited locally paused stream, and be
handled accordingly.
An initially paused simulcast stream in "recv" direction SHOULD cause
the remote RTP sender to treat the stream as unsolicited locally
paused, unless there are other RTP stream receivers that do not mark
the simulcast stream as initially paused. The reason to require an
initially paused "recv" stream to be considered locally paused by the
remote RTP sender, instead of making it equivalent to implicitly
sending a pause request, is because the pausing RTP sender cannot
know which SSRC owns the restriction when TMMBR/TMMBN are used for
pause/resume signaling since the RTP receiver's SSRC in send
direction is not known yet.
Use of the redundant audio data [RFC2198] format could be seen as a
form of simulcast for loss protection purposes, but is not considered
conflicting with the mechanisms described in this memo and MAY
therefore be used as any other format. In this case the "red"
format, rather than the carried formats, SHOULD be the one to list as
a simulcast stream on the "a=simulcast" line.
6.1.1. Declarative Use
When used as a declarative media description, a=simulcast "recv"
direction formats indicate the configured end point's required
capability to recognize and receive a specified set of RTP streams as
simulcast streams. In the same fashion, a=simulcast "send" direction
requests the end point to send a specified set of RTP streams as
simulcast streams.
If multiple simulcast formats are listed, it means that the
configured end point MUST be prepared to receive any of the "recv"
formats, and MAY send any of the "send" formats for that simulcast
stream.
Editor's note: The RID identification mechanism currently lacks a
declarative use definition. As declarative use may also not
follow unified plan with a single media source per "m="-line, it
is uncertain if declarative can be defined for the mechanism in
its current shape.
6.1.2. Offer/Answer Use
An offerer wanting to use simulcast SHALL include the "a=simulcast"
attribute in the offer. An offerer that receives an answer without
"a=simulcast" MUST NOT use simulcast towards the answerer. An
offerer that receives an answer with "a=simulcast" not listing a
direction or without any simulcast stream identifications in a
specified direction MUST NOT use simulcast in that direction.
An answerer that does not understand the concept of simulcast will
also not know the attribute and will remove it in the SDP answer, as
defined in existing SDP Offer/Answer [RFC3264] procedures.
An answerer that does understand the attribute and that wants to
support simulcast in an indicated direction SHALL reverse
directionality of the unidirectional direction parameters; "send"
becomes "recv" and vice versa, and include it in the answer. Note
that, like all other use of SDP format tags ("pt:") for the send
direction in Offer/Answer, format tags related to the simulcast
stream identification send direction in an offer are placeholders
that refer to information in the offer SDP, and the actual formats
that will be used on the wire (including RTP Payload Format numbers)
depend on information included in the SDP answer.
An offerer listing a set of receive simulcast streams and/or
alternative formats in the offer MUST be prepared to receive RTP
streams for any of those simulcast streams and/or alternative formats
from the answerer.
An answerer that receives an offer containing an
"a=simulcast" attribute listing alternative formats for simulcast
streams MAY keep all the alternatives in the answer, but it MAY also
choose to remove any non-desirable alternatives per simulcast stream
in the answer. The answerer MUST NOT add any alternatives that were
not present in the offer.
An answerer that receives an offer with simulcast that lists a number
of simulcast streams, MAY reduce the number of simulcast streams in
the answer, but MUST NOT add simulcast streams.
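The answerer behavior described so far (reversing directions, removing
unwanted alternatives, and reducing but never adding simulcast streams)
can be sketched as follows. The function name, its parameters and the
data layout are assumptions of this example, not part of this
specification.

```python
def build_answer(offer, supported, max_streams):
    """Derive answer-side simulcast parameters from an offer (sketch).

    `offer` maps a direction ("send"/"recv") to a list of simulcast
    streams, each a list of alternative format identifiers in
    preference order. `supported` is the set of identifiers the
    answerer accepts, and `max_streams` maps the answerer's own
    direction to the number of streams it can handle. All three
    layouts are assumptions of this example.
    """
    reverse = {"send": "recv", "recv": "send"}
    answer = {}
    for direction, streams in offer.items():
        own_direction = reverse[direction]
        kept = []
        for alternatives in streams:
            # Only remove alternatives; never add ones absent from the offer.
            usable = [fmt for fmt in alternatives if fmt in supported]
            if usable:
                kept.append(usable)
        # Streams are listed most preferred first, so truncate from the end.
        kept = kept[:max_streams.get(own_direction, len(kept))]
        if kept:
            answer[own_direction] = kept
    return answer
```

A direction for which no simulcast stream remains is omitted from the
result, which corresponds to not listing that direction in the answer.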
An offerer that receives an answer where some simulcast formats are
kept MUST be prepared to receive any of the kept send direction
alternatives, and MAY send any of the kept receive direction
alternatives from the answer. Similarly, the answerer MUST be
prepared to receive any of the kept receive direction alternatives,
and MAY send any of the kept send direction alternatives in the
answer.
The offerer and answerer MUST NOT send more than a single alternative
format at a time (based on RTP timestamps) per simulcast stream, but
MAY change format on a per-RTP packet basis. This corresponds to the
existing (non-simulcast) SDP offer/answer case when multiple formats
are included on the "m="-line in the SDP answer.
An offerer that receives an answer where some of the simulcast
streams are removed MAY release the corresponding resources (codec,
transport, etc.) in its receive direction and MUST NOT send any RTP
streams corresponding to the removed simulcast streams.
Simulcast streams or formats using undefined simulcast stream
identifications MUST NOT be used as valid simulcast streams by an RTP
stream receiver.
An offerer that is capable of using both simulcast stream
identification methods MAY include one "a=simulcast" line per
identification method in the offer. Note that it is in general not
expected that the "pt" identification method will provide feature
parity with the "rid" method, and the different "a=simulcast" lines
can therefore express different use of simulcast functionality.
However, for some configurations the different identification methods
can be equivalent.
An answerer receiving an offer listing both simulcast stream
identification methods MUST choose only one and remove the other from
the answer. An answerer not supporting a simulcast stream
identification method in the offer MUST remove the non-supported
"a=simulcast" line from the answer, possibly falling back to not
using simulcast at all.
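A minimal sketch of the answerer's choice of identification method is
given below. The function name and the default preference order ("rid"
before "pt") are assumptions of this example; the text above leaves the
choice between supported methods to the answerer.

```python
def choose_identification_method(offered, supported, preference=("rid", "pt")):
    """Pick the single identification method to keep in the answer.

    `offered` and `supported` are collections of method names; the
    default `preference` order is an assumption of this sketch.
    Returns None when no offered method is supported, which
    corresponds to falling back to not using simulcast at all.
    """
    for method in preference:
        if method in offered and method in supported:
            return method
    return None
```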
The media formats and corresponding characteristics of encoded
streams used in a simulcast SHOULD be chosen such that they are
different. If this difference is not required, RTP duplication
[RFC7104] procedures SHOULD be considered instead of simulcast.
Note: The inclusion of "a=simulcast" or the use of simulcast does
not change any of the interpretation or Offer/Answer procedures
for other SDP attributes, like "a=fmtp" or "a=rid".
6.2. Relating Simulcast Streams
As long as there is only a single media source per SDP media
description, simulcast RTP streams can be related on RTP level
through the RTP payload type and (optionally) RID
[I-D.pthatcher-mmusic-rid], as specified in the SDP "a=simulcast"
attribute (Section 6.1) parameters. When using BUNDLE
[I-D.ietf-mmusic-sdp-bundle-negotiation] with multiple SDP media
descriptions to specify a single RTP session, there is an
identification mechanism that allows relating RTP streams back to
individual media descriptions, after which the above RTP payload type
and RID relations can be used.
BUNDLE's MID is an RTCP source description (SDES) item. To ensure
rapid initial reception, required to correctly process the RTP
streams, it is also defined as an RTP header extension [RFC5285].
6.3. Signaling Examples
These examples describe a client calling in to a video conference service, using a
centralized media topology with an RTP mixer.
+---+ +-----------+ +---+
| A |<---->| |<---->| B |
+---+ | | +---+
| Mixer |
+---+ | | +---+
| F |<---->| |<---->| J |
+---+ +-----------+ +---+
Figure 2: Four-party Mixer-based Conference
6.3.1. Unified Plan Client
Alice is calling in to the mixer with a simulcast-enabled Unified
Plan client capable of a single media source per media type. The
client can send a simulcast of 2 video resolutions and frame rates:
HD 1280x720p 30fps and thumbnail 320x180p 15fps. This is defined
below using the "imageattr" attribute [RFC6236]. Media formats (RTP payload
types) are used as simulcast stream identification. Alice's Offer:
v=0
o=alice 2362969037 2362969040 IN IP4 192.0.2.156
s=Simulcast Enabled Unified Plan Client
t=0 0
c=IN IP4 192.0.2.156
m=audio 49200 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49300 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=simulcast: send pt=97;98 recv pt=97
Figure 3: Unified Plan Simulcast Offer
The only thing in the SDP that indicates simulcast capability is the
line in the video media description containing the "simulcast"
attribute. The included format parameters indicate that sent
simulcast streams can differ in video resolution.
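To make the structure of that line concrete, the following Python
sketch parses an "a=simulcast" line of the simple form used in
Figure 3. It assumes payload-type identification only, with ";"
separating simulcast streams. The function name parse_simulcast_pt
is hypothetical, and the sketch does not cover the complete
attribute syntax.

```python
from typing import Dict, List


def parse_simulcast_pt(line: str) -> Dict[str, List[int]]:
    """Parse e.g. 'a=simulcast: send pt=97;98 recv pt=97'.

    Returns a mapping from direction ('send' or 'recv') to the list of
    payload types, one per simulcast stream. Assumes 'pt=' lists only.
    """
    prefix = "a=simulcast:"
    if not line.startswith(prefix):
        raise ValueError("not an a=simulcast line")
    tokens = line[len(prefix):].split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected direction/value pairs")
    result: Dict[str, List[int]] = {}
    for i in range(0, len(tokens), 2):
        direction, value = tokens[i], tokens[i + 1]
        if direction not in ("send", "recv"):
            raise ValueError("unknown direction: " + direction)
        method, _, ids = value.partition("=")
        if method != "pt":
            raise ValueError("this sketch only handles pt= lists")
        result[direction] = [int(pt) for pt in ids.split(";")]
    return result
```

Applied to the line in Figure 3, this yields payload types 97 and 98
in the send direction and payload type 97 in the receive direction.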
The Answer from the server indicates that it too is simulcast
capable. Should it not have been simulcast capable, the
"a=simulcast" line would not have been present and communication
would have started with the media negotiated in the SDP.
v=0
o=server 823479283 1209384938 IN IP4 192.0.2.2
s=Answer to Simulcast Enabled Unified Plan Client
t=0 0
c=IN IP4 192.0.2.43
m=audio 49672 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49674 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=simulcast: recv pt=97;98 send pt=97
Figure 4: Unified Plan Simulcast Answer
Since the server is the simulcast media receiver, it reverses the
direction of the "simulcast" attribute parameters.
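The direction reversal can be sketched in Python as follows. The
function name reverse_simulcast_directions is hypothetical. The
sketch only mirrors the "send" and "recv" keywords and does not
model an answerer that accepts a subset of the offered streams.

```python
def reverse_simulcast_directions(line: str) -> str:
    """Swap the 'send' and 'recv' keywords of an a=simulcast line."""
    swap = {"send": "recv", "recv": "send"}
    name, sep, rest = line.partition(":")
    if name != "a=simulcast" or not sep:
        raise ValueError("not an a=simulcast line")
    # Only whole tokens are swapped; stream lists are left untouched.
    tokens = [swap.get(token, token) for token in rest.split()]
    return name + ": " + " ".join(tokens)
```

Given the "a=simulcast" line from Figure 3, the function returns the
line shown in Figure 4.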
6.3.2. Multi-Source Client
Fred is calling in to the same conference as in the example above
with a two-camera, two-display system, thus capable of handling two
separate media sources in each direction, where each media source is
simulcast-enabled in the send direction. Fred's client is restricted
to a single media source per media description.
The first two simulcast streams for the first media source use
different codecs, H264-SVC [RFC6190] and H264 [RFC6184]. These two
simulcast streams also have a temporal dependency. Two different
video codecs, VP8 [I-D.ietf-payload-vp8] and H264, are offered as
alternatives for the third simulcast stream for the first media
source. RID is used as simulcast stream identification, reducing the
number of media formats needed. Only the highest fidelity simulcast
stream is sent from the start, the lower fidelity streams being
initially paused.
The second media source is offered with three different simulcast
streams. All video streams of this second media source are loss
protected by RTP retransmission [RFC4588]. RID is used as simulcast
stream identification. Here too, all simulcast streams except the
highest fidelity one are initially paused.
Fred's client is also using BUNDLE to send all RTP streams from all
media descriptions in the same RTP session on a single media
transport. Although this example uses many different simulcast
streams, using RID as simulcast stream identification keeps the
number of RTP payload types low. Note that when both BUNDLE and RID
are used, it is recommended to carry these fields in the RTP header
extension [RFC5285].
v=0
o=fred 238947129 823479223 IN IP4 192.0.2.125
s=Offer from Simulcast Enabled Multi-Source Client
t=0 0
c=IN IP4 192.0.2.125
a=group:BUNDLE foo bar zen
m=audio 49200 RTP/AVP 99
a=mid:foo
a=rtpmap:99 G722/8000
m=video 49600 RTP/AVPF 100 101 103
a=mid:bar
a=rtpmap:100 H264-SVC/90000
a=rtpmap:101 H264/90000
a=rtpmap:103 VP8/90000
a=fmtp:100 profile-level-id=42400d; max-fs=3600; max-mbps=108000; \
mst-mode=NI-TC
a=fmtp:101 profile-level-id=42c00d; max-fs=3600; max-mbps=54000
a=fmtp:103 max-fs=900; max-fr=30
a=rid:1 send pt=100;max-width=1280;max-height=720;max-fr=60;depend=2
a=rid:2 send pt=101;max-width=1280;max-height=720;max-fr=30
a=rid:3 send pt=101;max-width=640;max-height=360
a=rid:4 send pt=103;max-width=640;max-height=360
a=depend:100 lay bar:101
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:rid
a=rtcp-fb:* ccm pause nowait
a=simulcast: send rid=1;2;4,3 paused=2,3,4
m=video 49602 RTP/AVPF 96 104
a=mid:zen
a=rtpmap:96 VP8/90000
a=fmtp:96 max-fs=3600; max-fr=30
a=rtpmap:104 rtx/90000
a=fmtp:104 apt=96;rtx-time=200
a=rid:5 send pt=96;max-fs=921600;max-fr=30
a=rid:6 send pt=96;max-fs=614400;max-fr=15
a=rid:7 send pt=96;max-fs=230400;max-fr=30
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:rid
a=rtcp-fb:* ccm pause nowait
a=simulcast: send rid=5;6;7 paused=6,7
Figure 5: Fred's Multi-Source Simulcast Offer
Note: Empty lines in the SDP above are added only for readability
and would not be present in an actual SDP.
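The "a=simulcast" values in Figure 5 can be read with a small Python
sketch such as the one below. It assumes, based on the description
of this example, that ";" separates simulcast streams, that ","
separates alternative formats within one stream, and that "paused="
carries a comma-separated list of RIDs. The function names are
hypothetical, and the sketch covers only the send-direction value
that follows the "send" keyword.

```python
from typing import List, Set, Tuple


def parse_rid_streams(value: str) -> Tuple[List[List[str]], Set[str]]:
    """Parse e.g. 'rid=1;2;4,3 paused=2,3,4'.

    Returns (streams, paused), where streams holds one list of
    alternative RIDs per simulcast stream, in the order given.
    """
    streams: List[List[str]] = []
    paused: Set[str] = set()
    for token in value.split():
        name, _, body = token.partition("=")
        if name == "rid":
            streams = [alts.split(",") for alts in body.split(";")]
        elif name == "paused":
            paused = set(body.split(","))
        else:
            raise ValueError("unexpected token: " + token)
    return streams, paused


def initially_active(streams: List[List[str]], paused: Set[str]) -> List[List[str]]:
    """Per simulcast stream, keep only the alternatives that are not paused."""
    return [[rid for rid in alts if rid not in paused] for alts in streams]
```

For both video media descriptions in Figure 5, only the first
simulcast stream has a format that is not initially paused.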
7. Network Aspects
In this memo, simulcast is defined as the act of sending multiple
alternative encoded streams of the same underlying media source.
Transmitting multiple independent streams that originate from the
same source can be done in several different ways using RTP. A
general discussion on considerations for use of
the different RTP multiplexing alternatives can be found in
Guidelines for Multiplexing in RTP
[I-D.ietf-avtcore-multiplex-guidelines]. Discussion and
clarification on how to handle multiple streams in an RTP session can
be found in [I-D.ietf-avtcore-rtp-multi-stream].
The network aspects that are relevant for simulcast are:
Quality of Service: When using simulcast it might be of interest to
prioritize a particular simulcast stream, rather than applying
equal treatment to all streams. For example, lower bit-rate
streams may be prioritized over higher bit-rate streams to
minimize congestion or packet losses in the low bit-rate streams.
Thus, it is beneficial to use a simulcast solution that supports
QoS as well as possible.
NAT/FW Traversal: Using multiple RTP sessions incurs more cost for
NAT/FW traversal unless they can re-use the same transport flow,
which can be achieved by Multiplexing Negotiation Using SDP Port
Numbers [I-D.ietf-mmusic-sdp-bundle-negotiation].
8. Limitations
The chosen approach has a few limitations that are described in this
section. Some relate to the use of a single RTP session for all
simulcast formats of a media source, while others relate to the two
different simulcast stream identification methods.
8.1. Single RTP Session
The limitations in this section come from sending all simulcast
streams related to a media source under the same SDP media
description, which also means they are sent in the same RTP session.
It is not possible to use different simulcast streams on different
transports, limiting the possibilities to apply different QoS to
different simulcast streams. When using unicast, QoS mechanisms
based on individual packet marking are feasible, since they do not
require separation of simulcast streams into different RTP sessions
to apply different QoS.
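As a rough illustration of per-packet marking on a single transport,
the Python sketch below selects a DSCP codepoint per simulcast
stream and sets it on the shared UDP socket before each send. The
stream labels, the choice of AF41/AF42/AF43 codepoints, and the
function names are assumptions made for this example only. Whether
the IP_TOS socket option is available and honored depends on the
platform and the network.

```python
import socket
from typing import Dict, Tuple

# Illustrative mapping only: the lowest fidelity stream gets the lowest
# drop precedence (AF41 = 34, AF42 = 36, AF43 = 38).
DSCP_BY_STREAM: Dict[str, int] = {"low": 34, "mid": 36, "high": 38}


def tos_byte(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the IPv4 TOS octet."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def send_marked(sock: socket.socket, packet: bytes,
                addr: Tuple[str, int], stream: str) -> None:
    """Mark and send one packet; all streams share the same socket."""
    tos = tos_byte(DSCP_BY_STREAM[stream])
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    sock.sendto(packet, addr)
```

Because the marking is applied per packet, all simulcast streams can
remain in the same RTP session and on the same transport.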
It is not possible to separate different simulcast streams into
different multicast groups to allow a multicast receiver to pick the
stream it wants, rather than receive all of them. In this case, the
only reasonable implementation is to use different RTP sessions for
each multicast group so that reporting and other RTCP functions
operate as intended.
8.2. SDP Format Identification
The limitations in this section come from, and thus apply only when,
using SDP format (RTP payload type) as the simulcast stream
identification method.
The available RTP payload type number space may not be sufficient
when many different media formats and/or simulcast streams are used
in the SDP. This can be particularly prominent when BUNDLE is used,
and for any technology that adds to the number of required RTP
payload types in a multiplicative way, such as for example adding RTP
retransmission [RFC4588] and Forward Error Correction [RFC5109].
Flexible FEC Scheme [I-D.ietf-payload-flexible-fec-scheme] can be
used for RTP retransmissions and would avoid the double consumption
of the PT space that RTP Retransmission [RFC4588] causes.
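The multiplicative effect can be illustrated with simple arithmetic,
sketched here in Python. The sketch makes the simplifying assumption
that RTP retransmission and FEC each add one extra payload type per
media format, and that payload types are taken from the dynamic
range 96-127. The function names are hypothetical.

```python
DYNAMIC_PT_RANGE = range(96, 128)  # 32 dynamic RTP payload type values


def payload_types_needed(media_formats: int,
                         with_rtx: bool = False,
                         with_fec: bool = False) -> int:
    """Count payload types under the one-extra-PT-per-format assumption."""
    per_format = 1 + int(with_rtx) + int(with_fec)
    return media_formats * per_format


def fits_dynamic_range(total: int) -> bool:
    """True if the total fits in the dynamic payload type range."""
    return total <= len(DYNAMIC_PT_RANGE)
```

Under these assumptions, 12 media formats with both retransmission
and FEC would need 36 payload types, which exceeds the 32 values of
the dynamic range.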
Only existing SDP attributes and parameters can be used to define
codec configuration for a simulcast format. A codec that neither
defines a sufficient set of codec parameters in "a=fmtp" nor can make
use of other SDP attributes may be unable to express the desired
simulcast format dimensions (Section 3.1) with the necessary
precision, or at all. One example is separating simulcast formats by
bandwidth for codecs that lack a codec-specific bandwidth parameter,
since the SDP "b="-line covers all RTP payload types listed on an
"m="-line.
A simulcast stream signaled as initially paused cannot be resumed by
a remote peer, because the peer cannot know which target SSRC to use
in the RESUME message [I-D.ietf-avtext-rtp-stream-pause].
8.3. RID Identification
The limitations in this section come from, and thus apply only when,
using RID as the simulcast stream identification method.
Use of the additional "a=rid"-line in SDP and the corresponding RID
RTCP SDES item and RTP header extension requires some additional
implementation complexity, and incurs some extra bandwidth cost to
carry the RID RTCP SDES item and RTP header extension.
9. IANA Considerations
This document requests to register a new SDP attribute, simulcast.
Formal registrations to be written.
10. Security Considerations
The simulcast capability, configuration attributes and parameters are
vulnerable to attacks in signaling.
A false inclusion of the "a=simulcast" attribute may result in
simultaneous transmission of multiple RTP streams that would
otherwise not be generated. The impact is limited by the media
description joint bandwidth, shared by all simulcast streams
irrespective of their number. There may however be a large number of
unwanted RTP streams that will impact the share of bandwidth
allocated for the originally wanted RTP stream.
A hostile removal of the "a=simulcast" attribute will result in
simulcast not being used.
Neither of the above is likely to have major consequences, and both
can be mitigated by signaling that is at least integrity protected
and source authenticated, preventing an attacker from changing it.
11. Contributors
Morgan Lindqvist and Fredrik Jansson, both from Ericsson,
contributed important material to the first versions of this
document. Robert Hansen and Cullen Jennings, from Cisco, and Peter
Thatcher, from Google, contributed significantly to subsequent
versions.
12. Acknowledgements
13. References
13.1. Normative References
[I-D.ietf-avtext-rtp-stream-pause]
Burman, B., Akram, A., Even, R., and M. Westerlund, "RTP
Stream Pause and Resume", draft-ietf-avtext-rtp-stream-
pause-10 (work in progress), September 2015.
[I-D.pthatcher-mmusic-rid]
Thatcher, P., Zanaty, M., Nandakumar, S., Burman, B.,
Roach, A., and B. Campen, "RTP Payload Format
Constraints", draft-pthatcher-mmusic-rid-02 (work in
progress), October 2015.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
July 2003, <http://www.rfc-editor.org/info/rfc3550>.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
July 2006, <http://www.rfc-editor.org/info/rfc4566>.
[RFC5109] Li, A., Ed., "RTP Payload Format for Generic Forward Error
Correction", RFC 5109, DOI 10.17487/RFC5109, December
2007, <http://www.rfc-editor.org/info/rfc5109>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008,
<http://www.rfc-editor.org/info/rfc5234>.
[RFC7104] Begen, A., Cai, Y., and H. Ou, "Duplication Grouping
Semantics in the Session Description Protocol", RFC 7104,
DOI 10.17487/RFC7104, January 2014,
<http://www.rfc-editor.org/info/rfc7104>.
13.2. Informative References
[I-D.ietf-avtcore-multiplex-guidelines]
Westerlund, M., Perkins, C., and H. Alvestrand,
"Guidelines for using the Multiplexing Features of RTP to
Support Multiple Media Streams", draft-ietf-avtcore-
multiplex-guidelines-03 (work in progress), October 2014.
[I-D.ietf-avtcore-rtp-multi-stream]
Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
"Sending Multiple Media Streams in a Single RTP Session",
draft-ietf-avtcore-rtp-multi-stream-09 (work in progress),
September 2015.
[I-D.ietf-avtcore-rtp-topologies-update]
Westerlund, M. and S. Wenger, "RTP Topologies", draft-
ietf-avtcore-rtp-topologies-update-10 (work in progress),
July 2015.
[I-D.ietf-avtext-rtp-grouping-taxonomy]
Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
B. Burman, "A Taxonomy of Semantics and Mechanisms for
Real-Time Transport Protocol (RTP) Sources", draft-ietf-
avtext-rtp-grouping-taxonomy-08 (work in progress), July
2015.
[I-D.ietf-mmusic-sdp-bundle-negotiation]
Holmberg, C., Alvestrand, H., and C. Jennings,
"Negotiating Media Multiplexing Using the Session
Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle-
negotiation-23 (work in progress), July 2015.
[I-D.ietf-payload-flexible-fec-scheme]
Singh, V., Begen, A., Zanaty, M., and G. Mandyam, "RTP
Payload Format for Flexible Forward Error Correction
(FEC)", draft-ietf-payload-flexible-fec-scheme-01 (work in
progress), October 2015.
[I-D.ietf-payload-vp8]
Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
Galligan, "RTP Payload Format for VP8 Video", draft-ietf-
payload-vp8-17 (work in progress), September 2015.
[RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
DOI 10.17487/RFC2198, September 1997,
<http://www.rfc-editor.org/info/rfc2198>.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
DOI 10.17487/RFC3264, June 2002,
<http://www.rfc-editor.org/info/rfc3264>.
[RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for
Comfort Noise (CN)", RFC 3389, DOI 10.17487/RFC3389,
September 2002, <http://www.rfc-editor.org/info/rfc3389>.
[RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
DOI 10.17487/RFC4588, July 2006,
<http://www.rfc-editor.org/info/rfc4588>.
[RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
Digits, Telephony Tones, and Telephony Signals", RFC 4733,
DOI 10.17487/RFC4733, December 2006,
<http://www.rfc-editor.org/info/rfc4733>.
[RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
DOI 10.17487/RFC5117, January 2008,
<http://www.rfc-editor.org/info/rfc5117>.
[RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP
Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
2008, <http://www.rfc-editor.org/info/rfc5285>.
[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
Media Attributes in the Session Description Protocol
(SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009,
<http://www.rfc-editor.org/info/rfc5576>.
[RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding
Dependency in the Session Description Protocol (SDP)",
RFC 5583, DOI 10.17487/RFC5583, July 2009,
<http://www.rfc-editor.org/info/rfc5583>.
[RFC6184] Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
Payload Format for H.264 Video", RFC 6184,
DOI 10.17487/RFC6184, May 2011,
<http://www.rfc-editor.org/info/rfc6184>.
[RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
"RTP Payload Format for Scalable Video Coding", RFC 6190,
DOI 10.17487/RFC6190, May 2011,
<http://www.rfc-editor.org/info/rfc6190>.
[RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image
Attributes in the Session Description Protocol (SDP)",
RFC 6236, DOI 10.17487/RFC6236, May 2011,
<http://www.rfc-editor.org/info/rfc6236>.
Appendix A. Changes From Earlier Versions
NOTE TO RFC EDITOR: Please remove this section prior to publication.
A.1. Modifications Between WG Version -02 and -03
o Removed text on multicast / broadcast from use cases, since it is
not supported by the solution.
o Removed explicit references to unified plan draft.
o Added possibility to initiate simulcast streams in paused mode.
o Enabled an offerer to offer multiple stream identification (pt or
rid) methods and have the answerer choose which to use.
o Added a preference indication also in send direction offers.
o Added a section on limitations of the current proposal, including
identification method specific limitations.
A.2. Modifications Between WG Version -01 and -02
o Relying on the new RID solution for codec constraints and
configuration identification. This has resulted in changes in
syntax to identify if pt or RID is used to describe the simulcast
stream.
o Renamed simulcast version and simulcast version alternative to
simulcast stream and simulcast format respectively, and improved
definitions for them.
o Clarification that it is possible to switch between simulcast
version alternatives, but that only a single one can be used at any
point in time.
o Changed the definition so that the ordering of simulcast formats
for a specific simulcast stream does have a preference order.
A.3. Modifications Between WG Version -00 and -01
o No changes. Only preventing expiry.
A.4. Modifications Between Individual Version -00 and WG Version -00
o Added this appendix.
Authors' Addresses
Bo Burman
Ericsson
Kistavagen 25
SE-164 80 Stockholm
Sweden
Email: bo.burman@ericsson.com
Magnus Westerlund
Ericsson
Farogatan 2
SE-164 80 Stockholm
Sweden
Phone: +46 10 714 82 87
Email: magnus.westerlund@ericsson.com
Suhas Nandakumar
Cisco
170 West Tasman Drive
San Jose, CA 95134
USA
Email: snandaku@cisco.com
Mo Zanaty
Cisco
170 West Tasman Drive
San Jose, CA 95134
USA
Email: mzanaty@cisco.com