ControLling mUltiple streams for tElepresence

Document: Charter, ControLling mUltiple streams for tElepresence WG (clue)
Title: ControLling mUltiple streams for tElepresence
Last updated: 2011-01-11
State: Approved
WG State: Concluded
IESG Responsible AD: Adam Roach
Charter Edit AD: (None)
Send notices to: (None)


  In the context of this WG, the term telepresence is used in a general
  manner to describe systems that provide high-definition, high-quality
  audio/video enabling a "being-there" experience.  One example is an
  immersive telepresence system using specially designed, special-purpose
  rooms with multiple displays permitting life-size image
  reproduction using multiple cameras, encoders, decoders, microphones,
  and loudspeakers.
  Current telepresence systems are based on open standards such as RTP,
  SIP, H.264, and the H.323 suite. However, they cannot easily interoperate
  with each other without operator assistance and expensive additional
  equipment that translates from one vendor to another. A major factor
  limiting the interoperability of telepresence systems is the lack of a
  standardized way to describe and negotiate the use of the multiple
  streams of audio and video comprising the media flows.
  The WG will create specifications for SIP-based conferencing systems
  to enable communication of information about media streams so that a
  sending system, receiving system, or intermediate system can make
  reasonable decisions about transmitting, selecting, and rendering
  media streams. This enables systems to make choices that optimize user
  This working group is chartered to specify the following information
  about media streams from one entity to another entity:
  * Spatial relationships of cameras, displays, microphones, and
    loudspeakers - relative to each other and to likely positions of
  * Viewpoint, field of view/capture for
    camera/microphone/display/loudspeaker - so that senders and
    intermediate devices can understand how best to compose streams for
    receivers, and the receiver will know the characteristics of its
    received streams
  * Usage of the stream, for example whether the stream is a presentation
    or document camera output
  * Aspect ratio of cameras and displays
  * Which sources a receiver wants to receive.  For example, it might
    want the source for the left camera, or might want the source chosen
    by VAD (Voice Activity Detection)
  Sources and sinks will exchange information about media stream
  capabilities.
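  The kinds of information listed above can be pictured as a small data
  structure. The following is only an illustrative sketch: the charter
  does not define any syntax, and every name here (Capture, position_m,
  voice_activity, select_by_vad, and so on) is a hypothetical choice made
  for this example. It assumes a sender advertises a list of captures and
  a receiver picks one either by name or by voice activity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Capture:
    """Hypothetical description of one media source (names are illustrative)."""
    capture_id: str
    media: str                               # "audio" or "video"
    position_m: Tuple[float, float, float]   # x, y, z in metres from an assumed room origin
    usage: str                               # e.g. "participants", "presentation", "document-camera"
    field_of_view_deg: Optional[float] = None     # horizontal field of view/capture
    aspect_ratio: Optional[Tuple[int, int]] = None
    voice_activity: float = 0.0              # assumed VAD score in [0, 1] for this source


def select_by_id(captures: List[Capture], capture_id: str) -> Capture:
    """Receiver asks for a specific source, e.g. the left camera."""
    for capture in captures:
        if capture.capture_id == capture_id:
            return capture
    raise KeyError(capture_id)


def select_by_vad(captures: List[Capture]) -> Capture:
    """Receiver asks for the participant video source with the highest VAD score."""
    video = [c for c in captures if c.media == "video" and c.usage == "participants"]
    if not video:
        raise ValueError("no participant video captures advertised")
    return max(video, key=lambda c: c.voice_activity)


# Example advertisement from a hypothetical two-camera room with a document camera.
ROOM = [
    Capture("cam-left", "video", (-1.0, 0.0, 1.2), "participants", 60.0, (16, 9), 0.1),
    Capture("cam-right", "video", (1.0, 0.0, 1.2), "participants", 60.0, (16, 9), 0.8),
    Capture("doc-cam", "video", (0.0, 0.5, 1.0), "document-camera", 40.0, (4, 3)),
]
```

  With this sketch, select_by_id(ROOM, "cam-left") returns the left
  camera's description, while select_by_vad(ROOM) returns whichever
  participant camera currently has the highest assumed voice activity
  score.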
  The working group will define the semantics, syntax, and transport
  mechanism for communicating the necessary information. It will
  consider whether existing protocols for signaling, messaging and
  transport are adequate or need to be extended. Any extensions to IETF
  protocols will be done in appropriate WGs, for example extensions to
  The scope of the work includes describing relatively static relations
  between entities (participants and devices). It also includes handling
  more dynamic relationships, such as specifying the audio and video
  streams for defined speakers. The location of the current
  speakers relative to display microphones needs to be provided
  dynamically as speakers move.
  As part of the receiver telling the sender what it wants dynamically,
  explicit receiver notification to the sender of the desired video
  stream and video pause will be considered.
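  One way to picture explicit receiver notification is a sender that
  tracks which advertised streams each receiver currently wants. This is
  a minimal sketch under assumptions, not a mechanism defined by the
  charter: the SenderState class, the "start" and "pause" action names,
  and the capture identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SenderState:
    """Hypothetical sender-side record of what one receiver has requested."""
    advertised: Set[str]                              # capture ids the sender offers
    sending: Set[str] = field(default_factory=set)    # capture ids currently transmitted

    def handle_request(self, capture_id: str, action: str) -> bool:
        """Apply a receiver request; return False if it cannot be honoured."""
        if capture_id not in self.advertised:
            return False
        if action == "start":
            self.sending.add(capture_id)
        elif action == "pause":
            self.sending.discard(capture_id)
        else:
            return False
        return True
```

  In this sketch a receiver that stops displaying a stream would send a
  "pause" request, and the sender would stop transmitting that stream
  until a later "start" request arrives.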
  The scope includes both systems that provide a fully immersive
  experience, and systems that interwork with them and therefore need to
  understand the same multiple stream semantics.
  The focus of this work is on multiple RTP audio and video streams.
  Other media types may be considered; however, development of
  methodologies for them is not within the scope of this work.
  Interoperation with SIP and related standards for audio and video is
  required.  However, backwards compatibility with existing
  non-standards-compliant telepresence systems is not required.
  This working group is not currently chartered to work on issues of
  continuous conference control, including far-end camera control, floor
  control, and conference roster. The working group may identify
  interoperability obstacles in existing open standards. If so, the WG
  will develop requirements to be communicated to other IETF WGs or
  Standards Forums, or recharter as appropriate.
  Reuse of existing protocols and backwards compatibility with
  SIP-compliant audio/video endpoints are important factors for the
  working group to consider. The work will closely coordinate with the
  appropriate areas (e.g., OPS and SEC), and working groups including