Internet Engineering Task Force (IETF)                      E. Ivov, Ed.
Request for Comments: 6465                                         Jitsi
Category: Standards Track                                E. Marocco, Ed.
ISSN: 2070-1721                                           Telecom Italia
                                                               J. Lennox
                                                                   Vidyo
                                                           December 2011

       A Real-time Transport Protocol (RTP) Header Extension for
                 Mixer-to-Client Audio Level Indication

Abstract

   This document describes a mechanism for RTP-level mixers in audio
   conferences to deliver information about the audio level of
   individual participants.  Such audio level indicators are transported
   in the same RTP packets as the audio data they pertain to.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc6465.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Ivov, et al.                 Standards Track                    [Page 1]
RFC 6465         Mixer-to-Client Audio Level Indication    December 2011

Table of Contents

   1. Introduction ....................................................2
   2. Terminology .....................................................4
   3. Protocol Operation ..............................................4
   4. Audio Levels ....................................................5
   5. Signaling Information ...........................................7
   6. Security Considerations .........................................9
   7. IANA Considerations ............................................10
   8. Acknowledgments ................................................10
   9. References .....................................................10
      9.1. Normative References ......................................10
      9.2. Informative References ....................................11
   Appendix A. Reference Implementation ..............................12
      A.1. AudioLevelCalculator.java .................................12

1.  Introduction

   "A Framework for Conferencing with the Session Initiation Protocol
   (SIP)" [RFC4353] presents an overall architecture for multi-party
   conferencing.  Among others, the framework borrows from RTP [RFC3550]
   and extends the concept of a mixer entity "responsible for combining
   the media streams that make up a conference, and generating one or
   more output streams that are delivered to recipients".  Every
   participant would hence receive, in a flat single stream, media
   originating from all the others.

   Using such centralized mixer-based architectures simplifies support
   for conference calls on the client side, since they would hardly
   differ from one-to-one conversations.  However, the method also
   introduces a few limitations.  The flat nature of the streams that a
   mixer would output and send to participants makes it difficult for
   users to identify the original source of what they are hearing.
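
   For context, an RTP mixer lists the sources it combined in the CSRC
   list of the fixed RTP header [RFC3550].  The following is a minimal
   sketch, not part of this specification, of how a receiver might read
   that list.  The class and method names (CsrcReader, readCsrcList)
   and the sample packet bytes are assumptions made for illustration
   only.  The list shows which sources contributed to a packet but says
   nothing about how loud each one was.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (illustration only): reads the CSRC list of an RTP packet. */
final class CsrcReader {

    /** Size in bytes of the fixed RTP header defined in RFC 3550, Section 5.1. */
    private static final int FIXED_HEADER_LENGTH = 12;

    /**
     * Returns the CSRC identifiers carried in the given RTP packet,
     * each as an unsigned 32-bit value stored in a long.
     */
    static long[] readCsrcList(byte[] packet) {
        if (packet.length < FIXED_HEADER_LENGTH) {
            throw new IllegalArgumentException("shorter than the fixed RTP header");
        }

        // The two most significant bits of the first octet hold the version.
        int version = (packet[0] & 0xC0) >>> 6;
        if (version != 2) {
            throw new IllegalArgumentException("unsupported RTP version: " + version);
        }

        // The four least significant bits of the first octet hold the CSRC count (CC).
        int csrcCount = packet[0] & 0x0F;
        if (packet.length < FIXED_HEADER_LENGTH + 4 * csrcCount) {
            throw new IllegalArgumentException("packet truncated inside the CSRC list");
        }

        // CSRC identifiers follow the SSRC as 32-bit values in network byte order,
        // which is also ByteBuffer's default (big-endian).
        ByteBuffer buffer = ByteBuffer.wrap(packet, FIXED_HEADER_LENGTH, 4 * csrcCount);
        long[] csrcs = new long[csrcCount];
        for (int i = 0; i < csrcCount; i++) {
            csrcs[i] = buffer.getInt() & 0xFFFFFFFFL;
        }
        return csrcs;
    }

    public static void main(String[] args) {
        // Made-up packet: version 2, CC = 2, followed by two CSRC identifiers.
        byte[] packet = {
            (byte) 0x82, 0x00,          // V=2, P=0, X=0, CC=2; M=0, PT=0
            0x00, 0x01,                 // sequence number
            0x00, 0x00, 0x00, 0x00,     // timestamp
            0x00, 0x00, 0x00, 0x0A,     // SSRC of the mixer
            0x00, 0x00, 0x00, 0x0B,     // first contributing source
            0x00, 0x00, 0x00, 0x0C      // second contributing source
        };
        for (long csrc : readCsrcList(packet)) {
            System.out.println("contributing source: " + csrc);
        }
    }
}
```

   A source's presence in the list is the only information conveyed;
   the sketch has no way to tell a loud contributor from a quiet one.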

   Mechanisms that let the mixer send participants cues about the
   current speakers (e.g., the contributing source (CSRC) fields in RTP
   [RFC3550]) only provide binary speaking/silent indications.  There
   are, however, a number of use cases where one would require more
   detailed information.  Possible examples include the presence of
   background chat/noise/music/typing, someone breathing noisily in
   their microphone, or other cases where identifying the source of the
   disturbance would make it easy to remove it (e.g., by sending a
