Network Working Group                                         H. Kaplan
Internet Draft                                              Acme Packet
Intended status: Informational                               D. Burnett
Expires: April 21, 2012                                           Voxeo
                                                           N. Stratford
                                                                  Voxeo
                                                             Tim Panton
                                                      PhoneFromHere.com
                                                       October 21, 2011


               API Requirements for RTCWEB-enabled Browsers

                      draft-kaplan-rtcweb-api-reqs-00


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 21, 2012.

Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with



Kaplan                  Expires April 24, 2012                [Page 1]


Internet-Draft                Tao of Web                  October 2011


   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the BSD License.

Abstract

   This document discusses the advantages and disadvantages of several
   proposed approaches to what type of API and architectural model
   RTCWeb Browsers should expose and use.  The document then defines
   the requirements for an API that treats the Browser as a library and
   interface as opposed to a self-contained application agent.

Table of Contents

   1. Terminology...................................................2
   2. Introduction..................................................2
   3. Defining an RTCWeb Protocol in the Browser....................4
   4. Leaving Logic to Web Developers...............................6
   5. API Requirements..............................................8
      5.1. Browser User-Interface Requirements......................8
      5.2. Media Properties.........................................9
      5.3. RTP/RTCP Properties.....................................10
      5.4. Data-stream Properties..................................11
      5.5. IP and ICE Properties...................................11
      5.6. API Design Recommendations..............................12
   6. Security Considerations......................................12
   7. IANA Considerations..........................................12
   8. Acknowledgments..............................................12
   9. References...................................................13
      9.1. Informative References..................................13
   Authors' Addresses...............................................13


1. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.  The
   terminology in this document conforms to RFC 2828, "Internet
   Security Glossary".


2. Introduction

   There has been a long discussion on the RTCWeb WG mailing list
   concerning whether any "signaling" or protocol should be
   standardized between the Browser and its Server, other Browsers, or
   the JavaScript it runs.


   Within the context of the RTCWeb Browser architecture, shown in
   Figure 1 below, the discussion is centered on how much intelligence,
   logic, state, and decisions are built into the Browser, vs. provided
   by Javascript.

                           +------------------------+  On-the-wire
                           |                        |  Protocols
                           |      Servers           |--------->
                           |                        |
                           |                        |
                           +------------------------+
                                       ^
                                       |
                                       |
                                       | HTTP/
                                       | Websockets
                                       |
                                       |
                         +----------------------------+
                         |    Javascript/HTML/CSS     |
                         +----------------------------+
                      Other  ^                 ^RTCWEB
                      APIs   |                 |API
                         +---|-----------------|------+
                         |   |                 |      |
                         |                 +---------+|
                         |                 | Browser ||  On-the-wire
                         | Browser         | RTC     ||  Protocols
                         |                 | Function|----------->
                         |                 |         ||
                         |                 |         ||
                         |                 +---------+|
                         +---------------------|------+
                                               |
                                               V
                                          Native OS Services

                          Figure 1: Browser Model

   There has been some discussion of standardizing the protocol that
   runs over the HTTP/Websockets connection between the Javascript and
   the Server; that proposal is discussed in this document.

   There has also been discussion that the interface between the
   Javascript and Browser be a protocol, rather than an API, such that
   the JavaScript could pass the protocol's messages as opaque blobs
   between two Browsers to establish the media-plane characteristics.



   An example of such a protocol interface is described in
   [draft-roap].  That will also be discussed in this document.

   The conclusion of the discussion in this document concerning those
   designs is that they are detrimental to the applications enabled
   with RTCWeb, and contrary to the Web Application model in general.
   Therefore, this document defines the beginning list of requirements
   for an API that we feel is more appropriate for Browsers to expose.

3. Defining an RTCWeb Protocol in the Browser

   Proposals have been made for integrating an entire session signaling
   protocol into the Browser, in [draft-signaling], and for integrating
   an SDP offer/answer protocol in the Browser, in [draft-roap].  This
   section discusses the benefits and drawbacks of doing such things in
   Browsers.

   1) For a session signaling mechanism to work, it is not sufficient
     to just implement something in the browser.  The Server also has
     to be involved in the protocol, in order to forward the protocol
     messages between the appropriate Browsers.  Minimally, this
     requires identity and location services, such as a user database
     and a mapping of Browser connections to users.  Often it involves
     authentication and authorization decisions as well.  In SIP, for
     example, this would be the role of a Proxy and Registrar.  In Web
     applications, such things are typically handled in an
     application-specific way based on the needs and architecture of
     the specific Web application.  For example, a gaming website
     already knows who its users are and where they are as part of its
     game application, while Facebook already knows who its users are
     and where they are in its specific application.  There is no need
     to standardize this in any way, and attempting to do so would be
     fruitless: a standard would have to make assumptions about
     applications that cannot possibly be known in advance, and thus
     would not be usable in practice.
   2) For either session signaling or SDP offer/answer protocols,
     integrating the protocols into the Browser means more logic and
     state is in the Browser, and ultimately more code.  This leads to
     the following properties:
        a.  It is easier for simple Web-application developers to
          initially deploy, if the code they need is built into the
          Browser the way they need it to be.
        b.  But, the more logic is placed into the Browser, the more
          need there is to extend/enhance/fix that logic in the future,
          and Web-application developers have little control over
          whether and when users upgrade their Browsers.
        c.  History has shown that the more complex the interface is
          between two implementations, the more interoperability
          problems occur.  Ultimately the best way to provide
          interoperability is to run the same actual source-code; short
          of that, the less logic placed into it, the better the odds
          of interoperability.
   3) For the SDP offer/answer protocol proposal, a benefit is that for
     some very simple applications it makes deployment easier, if the
     simple application does not need to know anything about the SDP
     content or offer/answer semantics.  If an application needs
     control at the media layer, then it could create a fake
     shim/interface from some real SDP in Javascript, to the SDP
     offer/answer protocol in the Browser.  Thus it trades off
     additional simplicity for simple applications, against additional
     complexity for advanced applications.  If the goal of RTCWEB is
     purely simplicity, this might seem a reasonable trade-off; if the
     goal is innovation, however, then making it harder for advanced
     uses means making it harder to innovate.
   4) For the SDP offer/answer protocol proposal, an argument has been
     made that the logic/state required for media already has to exist
     in the Browser itself, and thus splitting the domain of
     responsibility between the Browser and Javascript is more
     difficult than keeping it all in the Browser.  We believe this
     conclusion is drawn from an implicit assumption that the Browser
     should be dealing with SDP to begin with.  Unfortunately, SDP is
     not just about media characteristics.  There are numerous
     attributes in SDP that are actually properties of a higher layer
     than RTP and codecs.  For example, the following IANA-registered
     SDP attributes would be unknown to a media library in the browser
     and only known to the Javascript: cat, keywds, tool, type,
     charset, lang, setup, connection, confid, userid, floorid, and
     probably a bunch more (we haven't investigated them all).  The
     point is that it is NOT true that "all the SDP information needs
     to be handled by the Browser, so why not put offer/answer in it
     too?".
   5) Building the SDP offer/answer model into the Browser restricts
     the Web application to only being able to do things that can be
     encoded and communicated with the SDP offer/answer model.  As an
     example of something that cannot be accomplished because of this:
     imagine a Web-application that allows the Browser to communicate
     with a TelePresence (TP) system.  TP systems have multiple
     cameras, screen displays, microphones, and speakers.  A PC-based
     Browser typically only has a single microphone and camera, but can
     display multiple video feeds separately and can render-mix the
     incoming audio streams.  Thus, a Browser to TP system would
     produce an asymmetric media stream model: multiple video streams
     from the TP system to the Browser, and one video stream from the
     Browser to the TP system, and the same for audio.  Each TP stream
     is an independent RTP session and has unique attributes to
     indicate position (left/center/right).  Encoding that is currently
     not possible with SDP offer/answer; not only because the SDP
     attributes aren't yet defined, but because the offer/answer model
     assumes a symmetric number of media-lines (m= lines), and also
     that attributes represent media-receiving characteristics as
     opposed to media-sending capabilities.  Clearly if and when SDP is
     changed to handle TelePresence cases, Browsers could be upgraded
     to handle it as well sometime after; but they wouldn't need to if
     the Browser hadn't been involved in SDP to begin with.  SDP
     information isn't strictly that of an RTP library layer; it's not
     a one-to-one correlation.
   6) Some Web application developers may prefer to make the decision
     of which codecs/media-properties to use in the Server, and command
     all the Browsers in a given session to do so.  In some respects
     this is the very simplest model possible; but with the SDP
     offer/answer model being forced on the developer it becomes much
     more complicated to achieve.
   7) Since the SDP offer/answer mechanism is a protocol, involving
     both state machines and encoding schemes, interoperability between
     different vendor implementations is not guaranteed.  In fact,
     real-world SIP deployments have experienced interoperability
     problems with both SDP and the offer/answer model.
   8) For both session signaling and SDP offer/answer, troubleshooting
     and debugging become difficult for the web-app provider if a
     problem occurs in the protocol built in the Browser.  Even if the
     Javascript snoops on SIP or ROAP message exchanges and pushes back
     copies to the server in case of failure, the developer has to
     guess what the cause of an error response is.  In other words,
     it's the difference between having only Wireshark traces to debug
     with vs. also having internal logs from code procedures.
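
   As a rough illustration of point 6 above, the following Javascript
   sketch shows a Server choosing one codec for a whole session and
   commanding every Browser to use it.  It is not taken from any
   proposal: the function names, the message format, and the codec
   lists in the usage example are assumptions made up for this
   example.

```javascript
// Hypothetical sketch: the Server picks one codec for a whole session.
// `preference` is the application's ordered codec list (most preferred
// first); `browserCodecs` maps a Browser id to the codec names that
// Browser reported through the (assumed) Web API.
function pickSessionCodec(preference, browserCodecs) {
  const lists = Object.values(browserCodecs);
  for (const codec of preference) {
    // A codec is usable only if every Browser in the session has it.
    if (lists.every((supported) => supported.includes(codec))) {
      return codec;
    }
  }
  return null; // no codec is common to all Browsers
}

// Build one "command" message per Browser.  There is no offer or
// answer: the Server decides, and each Browser is told what to use.
function buildCodecCommands(preference, browserCodecs) {
  const codec = pickSessionCodec(preference, browserCodecs);
  if (codec === null) {
    throw new Error("no common codec for this session");
  }
  return Object.keys(browserCodecs).map((browserId) => ({
    to: browserId,
    command: "use-codec",
    codec,
  }));
}

// Usage example with made-up Browser ids and capability lists.
const commands = buildCodecCommands(["opus", "G722", "PCMU"], {
  alice: ["PCMU", "opus", "G722"],
  bob: ["G722", "PCMU"],
});
console.log(commands);
```

   In this usage example the Server selects G722, the first codec in
   the application's preference order that both Browsers report.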


4. Leaving Logic to Web Developers

   The alternative to embedding protocols in the Browser is to leave
   the work up to Javascript, for whatever "work" might be required for
   the particular application.  After all, the actual knowledge of what
   the specific Web application does, wants, how it encodes it, etc. is
   only fully known by the Web developer for that application, and thus
   by the Javascript+Server-code combination employed (i.e., the
   application "source-code").

   Clearly the Browser needs to perform quite a bit of "logic": for
   implementing codec rendering/encoding, RTP/RTCP protocols, SRTP, and
   ICE.  That is unavoidable, and not in question.  The question is who
   should be in control, where any additional logic should be placed,
   and what the API model should be.

   There has been discussion that RTCWeb should strive to enable a
   media communication session with about "20 lines of code".  We
   assert the only means of achieving that goal in a
   production-deployment manner is to use Javascript, and in particular
   Javascript libraries.


   Javascript libraries are used by a huge number of Web applications,
   and they work.  Some of the libraries are so popular that reference
   books have been published for them.  Yes, there are a lot of
   libraries, but that's a *good thing*.

   The properties of using Javascript, and Javascript libraries, are as
   follows:
   1) The logic is under the control of the Web developer.  This means:
        a.  If something is broken, the Web developer can generate log-
          type debug information within the Javascript and push it back
          to the server or log collector, and determine what is broken
          and when to fix it; they do not need to rely on asking for
          the user to provide Browser logs, rely on the Browser to
          generate useful logs, understand the logs, nor rely on
          Browser manufacturers to prioritize fixing them.
        b.  If an enhancement can be made, the Web developer can decide
          if and when to do so; they do not need to rely on Browser
          manufacturer decisions and timescales.
   2) All the Browsers using a given application site run the same
     literal Javascript source-code provided for that application.
     There is no greater means of achieving interoperability than that.
   3) If specific Browsers enable something proprietary, or some new
     media extension, the Web developer can decide whether to use it or
     not, when, and how.  And the mechanism can be made flexible; for
     example, new codecs do not need new Javascript code to be used,
     unless the Javascript wishes to follow a model where they do need
     new Javascript to be used (see Section 5.2 on this).  In other
     words, the Web-application developer can be as conservative or
     liberal as he/she wishes.  There are already known use-cases where
     a Web-app will never want to use new codecs or capabilities
     introduced into Browser RTP libraries, and there are known
     use-cases for the opposite.  Let the Web-app developer make that
     choice.
   4) The Javascript code does need to be downloaded (although Browser
     caching does exist), and clearly the larger the Javascript, the
     longer it takes to download.  BUT, popular Javascript libraries
     are so necessary in modern Web applications that they are often
     available for free and fast download from local content delivery
     networks.
   5) There are properties of the media library API that Javascript may
     need to access that cannot and should not be expressed in SDP.
     Some of these are described in the "Hints" and "Stats" section of
     [draft-roap].  Learning these will require a true API rather than
     an SDP offer/answer protocol, yet they are tied to the information
     in the SDP regarding the media streams and codecs.  Therefore, the
     Javascript cannot simply treat SDP as an opaque blob; it does
     need to understand it.
   6) There are settings of the media library API that Javascript may
     want to set that cannot be expressed in SDP.  For example, setting
     which local audio or video sources to fork to two or more remote
     parties.  Another example is local Javascript setting the media
     library to use audio only even if an incoming session's remote
     peer Browser indicates both audio and video, because the local
     user only wants to use audio right now (e.g., they pressed some
     Javascript-provided button which meant "audio-only" because
     they're not wearing proper attire for this particular session).
     These types of decisions and logic are not the domain of the
     Browser, but rather the Javascript, yet they are also integral to
     the SDP offer/answer.
   7) Debugging issues with signaling or SDP and the offer/answer model
     (if it's even used to begin with) is easier for the Web-app
     provider because they can save whatever information they need to
     debug their Javascript, within their own Javascript code's
     execution logic, and push it to their Server over HTTP/websocket
     however they see fit.
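
   Points 1a and 7 above can be sketched in a few lines of Javascript.
   The following is only an illustration under stated assumptions: the
   DebugLog class, its record format, and the `send` callback (which
   stands in for whatever HTTP or websocket transport the application
   already uses) are invented for this example.

```javascript
// Hypothetical sketch: application-level debug logging in Javascript.
// `send` stands in for the application's own transport (an HTTP POST,
// a websocket send, ...); it receives one JSON string.
class DebugLog {
  constructor(send, maxRecords = 100) {
    this.send = send;
    this.maxRecords = maxRecords;
    this.records = [];
  }

  // Record one event; the oldest record is dropped once the buffer
  // holds more than `maxRecords` entries.
  add(event, detail = {}) {
    this.records.push({ time: Date.now(), event, detail });
    if (this.records.length > this.maxRecords) {
      this.records.shift();
    }
  }

  // Push everything buffered so far to the Server and clear the
  // buffer.  Returns the number of records that were sent.
  flush(reason) {
    const count = this.records.length;
    if (count === 0) {
      return 0;
    }
    this.send(JSON.stringify({ reason, records: this.records }));
    this.records = [];
    return count;
  }
}

// Usage example: collect events during session setup and upload them
// only when something goes wrong.
const uploaded = [];
const log = new DebugLog((body) => uploaded.push(body));
log.add("codec-selected", { codec: "PCMU" });
log.add("ice-state", { state: "failed" });
log.flush("ice-failure");
console.log(JSON.parse(uploaded[0]).records.length); // 2
```

   Because the buffer and the upload decision live in the
   application's own code, the Web developer chooses what is recorded
   and when it is sent.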

5. API Requirements

   Some requirements for an API are already documented in
   [draft-use-cases-and-requirements].  This section expands upon those
   in further detail, and adds new ones.  In all cases, the term
   "Application" means the Javascript, and "Web API" refers to the
   Javascript <-> Browser API.

   It is not the goal of this document to define the actual API -
   that's W3C's job.  [Note: this is a strawman list]

5.1. Browser User-Interface Requirements

      REQ-ID          DESCRIPTION
      ----------------------------------------------------------------
      A1-1            The Web API MUST provide a means for the
                      application to ask the browser for permission
                      to use cameras and microphones as input devices.
      ----------------------------------------------------------------
      A1-2            The Web API MUST provide a means for the
                      application to ask the browser for permission
                      to use the screen, a certain area on the screen,
                      or what a certain application displays on the
                      screen as input to streams, and for which stream.
      ----------------------------------------------------------------
      A1-3            The Web API MUST provide a means for the
                      application to disable/enable the microphone and
                      camera inputs. [Note: this does NOT mean
                      disabling RTP transmission]
      ----------------------------------------------------------------
      A1-4            The Web API MUST provide a means for the
                      application to disable/enable the rendering of


Kaplan, et al            Expires - April 2011                 [Page 8]


Internet-Draft                Tao of Web                  October 2011


                      received audio and video, per stream.
      ----------------------------------------------------------------


5.2. Media Properties

      REQ-ID          DESCRIPTION
      ----------------------------------------------------------------
      A2-1            The Web API MUST provide a means for the
                      application to learn what codecs and codec
                      properties the Browser supports
      ----------------------------------------------------------------
      A2-2            The Web API MUST provide a means for the
                      Browser to indicate codecs and codec properties
                      such that the application does not need to know
                      about the specific codec type in advance
      ----------------------------------------------------------------
      A2-3            The Web API MUST provide a means for the
                      Browser to indicate codecs and codec properties
                      such that the application can use them in SDP,
                      for example by providing the IANA-registered
                      encoding name for the payload format, and the
                      format specific parameters as strings, such that
                      they could be used in the 'a=rtpmap' and 'a=fmtp'
                      lines of SDP should the Javascript wish to
                      create SDP containing codecs unknown to it.
      ----------------------------------------------------------------
      A2-4            The Web API MUST provide means for the
                      application to get the following media codec
                      properties: bandwidth, clock rate, number of
                      channels, type (audio vs. video)
      ----------------------------------------------------------------
      A2-5            The Web API MUST provide a means for the
                      application to get the bandwidth values for
                      codecs which support multiple levels, and set
                      it for codecs which can be controlled/primed.
      ----------------------------------------------------------------
      A2-6            The Web API MUST provide a means for the
                      application to set whether to use silence
                      suppression or not, for codecs which support it.
      ----------------------------------------------------------------
      A2-7            The Web API MUST provide a means for the
                      Browser to notify the application when a used
                      codec falls below a given quality threshold
                      [Note: it is TBD what "quality" means]
      ----------------------------------------------------------------
      A2-8            The Web API MUST provide a means for the web
                      application to detect the level in audio
                      streams.
      ----------------------------------------------------------------
      A2-9            The Web API MUST provide a means for the web
                      application to adjust the level in audio
                      streams.
      ----------------------------------------------------------------
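
   To illustrate A2-2 and A2-3, the following Javascript sketch turns
   codec descriptions reported by the Browser into 'a=rtpmap' and
   'a=fmtp' lines without the application knowing the codec in
   advance.  The descriptor fields and the function name are
   assumptions made for this example, and "X-FUTURE" with its
   parameters is a made-up codec; only the SDP line syntax follows
   RFC 4566.

```javascript
// Hypothetical codec descriptor, as an (assumed) Web API might report:
//   payloadType  - RTP payload type number chosen for this codec
//   encodingName - IANA-registered encoding name of the payload format
//   clockRate    - RTP clock rate in Hz
//   channels     - number of audio channels (omitted for video)
//   fmtp         - format specific parameters as one opaque string
function codecToSdpLines(codec) {
  let rtpmap =
    `a=rtpmap:${codec.payloadType} ${codec.encodingName}/${codec.clockRate}`;
  // The channel count is only written when it is present and not 1.
  if (codec.channels !== undefined && codec.channels !== 1) {
    rtpmap += `/${codec.channels}`;
  }
  const lines = [rtpmap];
  if (codec.fmtp) {
    // The parameters are copied through untouched; the Javascript does
    // not need to understand them.
    lines.push(`a=fmtp:${codec.payloadType} ${codec.fmtp}`);
  }
  return lines;
}

// Usage example: one codec the application knows and one it has never
// heard of ("X-FUTURE" and its parameters are made up).
const reported = [
  { payloadType: 0, encodingName: "PCMU", clockRate: 8000, channels: 1 },
  {
    payloadType: 101,
    encodingName: "X-FUTURE",
    clockRate: 48000,
    channels: 2,
    fmtp: "mode=7;foo=bar",
  },
];
for (const codec of reported) {
  console.log(codecToSdpLines(codec).join("\r\n"));
}
```

   The application only copies strings; support for a new codec in the
   Browser therefore needs no new Javascript unless the application
   chooses to inspect the values.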


5.3. RTP/RTCP Properties

      REQ-ID          DESCRIPTION
      ----------------------------------------------------------------
      A3-1            The Web API MUST provide a means for the
                      application to get and set the SSRC value(s)
      ----------------------------------------------------------------
      A3-2            The Web API MUST provide a means for the
                      application to get and set the CNAME value(s)
      ----------------------------------------------------------------
      A3-3            The Web API MUST provide a means for the
                      application to get and set the Payload Type
                      value(s) for each of the codecs
      ----------------------------------------------------------------
      A3-4            The Web API MUST provide a means for the
                      application to set the audio and video codecs
                      to be used for each stream, for both rendering
                      and generating separately, at any time.
      ----------------------------------------------------------------
      A3-5            The Web API MUST provide means for the
                      application to set whether to use SRTP, its
                      encryption algorithm and key length, with or
                      without authentication
      ----------------------------------------------------------------
      A3-6            The Web API MUST provide a means for the
                      application to set whether to use SRTP or not,
                      and which key exchange type to use
                      [Note: this is TBD pending SRTP decisions of WG]
      ----------------------------------------------------------------
      A3-7            The Web API MUST provide a means for the
                      application to set the SRTP master key value(s)
      ----------------------------------------------------------------
      A3-8            The Web API MUST provide a means for the
                      application to get DTLS-SRTP fingerprint value(s)
      ----------------------------------------------------------------
      A3-10           The Web API MUST provide a means for the
                      application to enable/disable generating RTP per
                      stream [Note: this does not disable RTCP]
      ----------------------------------------------------------------
      A3-11           The Web API MUST provide a means for the
                      application to be notified by the Browser if
                      RTCP is no longer being received from the far-end
      ----------------------------------------------------------------
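
   If the application is allowed to set SSRC, CNAME, and Payload Type
   values (A3-1 to A3-3), it will usually want to check them before
   handing them to the Browser.  The following Javascript sketch shows
   one way to do that; the settings object and the function name are
   assumptions for illustration, while the value ranges come from the
   RTP header format in RFC 3550.

```javascript
// Hypothetical sketch: validating discrete RTP settings before they
// are handed to an (assumed) Web API setter.
function validateRtpSettings(settings) {
  const errors = [];
  const { ssrc, cname, payloadTypes = {} } = settings;

  // An SSRC is a 32-bit unsigned integer.
  if (!Number.isInteger(ssrc) || ssrc < 0 || ssrc > 0xffffffff) {
    errors.push("ssrc must be an integer in 0..4294967295");
  }
  if (typeof cname !== "string" || cname.length === 0) {
    errors.push("cname must be a non-empty string");
  }

  // The RTP payload type field is 7 bits wide, and one value cannot
  // be mapped to two codecs on the same stream.
  const seen = new Map();
  for (const [codec, pt] of Object.entries(payloadTypes)) {
    if (!Number.isInteger(pt) || pt < 0 || pt > 127) {
      errors.push(`payload type for ${codec} must be in 0..127`);
    } else if (seen.has(pt)) {
      errors.push(`payload type ${pt} used for ${seen.get(pt)} and ${codec}`);
    } else {
      seen.set(pt, codec);
    }
  }
  return errors;
}

// Usage example with made-up values.
console.log(validateRtpSettings({
  ssrc: 0x1234abcd,
  cname: "user@example.invalid",
  payloadTypes: { PCMU: 0, VP8: 100 },
})); // []
```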


5.4. Data-stream Properties

   This section will detail requirements for the API for the client-to-
   client data connection stream.

   [TBD, since no other document has proposed anything for this yet
   either]

5.5. IP and ICE Properties

      REQ-ID          DESCRIPTION
      ----------------------------------------------------------------
      A5-1            The Web API MUST provide a means for the
                      application to get IPv4/v6 addresses and ports
                      for receiving ICE/RTP/RTCP on, per stream
      ----------------------------------------------------------------
      A5-2            The Web API MUST provide a means for the
                      application to set a list of the remote IPv4/v6
                      addresses and ports to send to, per stream
      ----------------------------------------------------------------
      A5-3            The Web API MUST provide a means for the
                      application to set a list of TURN servers to use,
                      including passwords
      ----------------------------------------------------------------
      A5-4            The Web API MUST provide a means for the
                      application to set a list of STUN servers to use
      ----------------------------------------------------------------
      A5-5            The Web API MUST provide a means for the
                      application to set the local ICE username and
                      password
      ----------------------------------------------------------------
      A5-6            The Web API MUST provide a means for the
                      application to set the remote ICE username and
                      password to perform connectivity checks with
      ----------------------------------------------------------------
      A5-7            The Web API MUST provide a means for the
                      application to set the remote IP Addresses and
                      ports to perform connectivity checks with
      ----------------------------------------------------------------
      A5-8            The Web API MUST provide a means for the
                      application to get any IP Addresses and ports
                      learned by the Browser from STUN, TURN, or other
                      methods (such as UPnP, NAT-PMP, PCP), including
                      their candidate-type, foundation, etc.
      ----------------------------------------------------------------
      A5-9            The Web API MUST provide a means for the
                      application to be notified by the Browser for
                      ICE event state changes
      ----------------------------------------------------------------
      A5-10           The Web API MUST provide a means for the
                      application to be notified by the Browser if
                      the local in-use IP address changes or becomes
                      inactive (e.g., link loss)
      ----------------------------------------------------------------
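
   As an illustration of A5-8, the following Javascript sketch takes a
   candidate reported as discrete fields and produces the priority and
   the 'a=candidate' line defined by ICE (RFC 5245, Section 4.1.2 for
   the priority formula and recommended type preferences).  The field
   names of the candidate record are assumptions for this example, and
   the addresses are documentation addresses.

```javascript
// Recommended type preferences from RFC 5245.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * type preference + 2^8 * local preference
//            + (256 - component ID)
// Multiplication is used rather than bit shifts because Javascript
// shift operators work on signed 32-bit values.
function candidatePriority(type, localPreference, componentId) {
  return (
    TYPE_PREFERENCE[type] * 2 ** 24 +
    localPreference * 2 ** 8 +
    (256 - componentId)
  );
}

// Hypothetical discrete candidate record -> SDP "a=candidate" line.
function candidateToSdp(c) {
  const priority = candidatePriority(
    c.type, c.localPreference, c.componentId);
  let line =
    `a=candidate:${c.foundation} ${c.componentId} ${c.transport} ` +
    `${priority} ${c.address} ${c.port} typ ${c.type}`;
  // Reflexive and relayed candidates also carry a related address.
  if (c.relatedAddress !== undefined) {
    line += ` raddr ${c.relatedAddress} rport ${c.relatedPort}`;
  }
  return line;
}

console.log(candidateToSdp({
  foundation: "1",
  componentId: 1,
  transport: "UDP",
  localPreference: 65535,
  address: "192.0.2.10",
  port: 50000,
  type: "host",
}));
// a=candidate:1 1 UDP 2130706431 192.0.2.10 50000 typ host
```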


5.6. API Design Recommendations

   Technically the API design is the role of the W3C.  That hasn't
   stopped people on the IETF RTCWEB mailing list from discussing it ad
   nauseam, however, and even defining a protocol for it.  This
   document therefore recommends the following to W3C:
   1) That the API setters/getters function-arguments use
     separate/discrete values, instead of one long string of separate
     tokens in a pseudo-arbitrary order with weak and complex encoding
     rules.
   2) That when the Javascript calls an API setter function to the
     Browser, that it be treated as a *command*, not a protocol
     negotiation.
   3) *IF* any "blob" of information should be passed from the Browser
     to the Javascript and vice-versa, for use in such things as SDP,
     that it be something for which there would not likely be any use
     to a Javascript programmer and for which future extensions/changes
     would require Browser changes only but would not be easily
     representable in discrete fields.  The most likely candidate for
     such a need for a "blob" would be ICE-specific SDP attributes.
   4) That when IETF documents start telling you how to build
     Javascript APIs, you should run far away... quickly.  :)
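
   Recommendations 1 and 2 can be sketched as follows.  This is not a
   proposed API: the makeStream function, its method names, and the
   values in the usage example are hypothetical, and only show setters
   that take discrete values and behave as commands, which either take
   effect as given or fail.

```javascript
// Hypothetical sketch of command-style setters with discrete values.
// `supported` lists the codecs the (imagined) media engine can use.
function makeStream(supported) {
  const state = { sendCodec: null, silenceSuppression: false };
  return {
    // A command either takes effect exactly as given or fails
    // outright; the Browser never answers with a counter-proposal.
    setSendCodec({ encodingName, payloadType }) {
      if (!supported.includes(encodingName)) {
        throw new Error(`unsupported codec: ${encodingName}`);
      }
      state.sendCodec = { encodingName, payloadType };
    },
    setSilenceSuppression(enabled) {
      state.silenceSuppression = Boolean(enabled);
    },
    // Returns a copy so callers cannot change settings behind the
    // setters' backs.
    getState() {
      return {
        sendCodec: state.sendCodec && { ...state.sendCodec },
        silenceSuppression: state.silenceSuppression,
      };
    },
  };
}

// Usage example with made-up values.
const stream = makeStream(["PCMU", "G722"]);
stream.setSendCodec({ encodingName: "G722", payloadType: 9 });
stream.setSilenceSuppression(true);
console.log(stream.getState());
```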


6. Security Considerations

   There are no security implications for this document, yet - this is
   just a strawman document.

7. IANA Considerations

   This document makes no request of IANA.

8. Acknowledgments

   Many of the topics discussed in this document came from numerous
   email posts and threads on the IETF RTCWEB mailing list over the
   past couple months, so we will likely forget to recognize some
   people who have had their input written herein.  We believe, though,
   that the following folks have possibly emailed something we've
   stolen^M^M borrowed: Matthew Kaufman, Roman Shpount, Inaki Baz
   Castillo, Albert Einstein, Saul Ibarra Corretge, Victor Pascual,
   Henry Sinnreich, and Bernard Aboba.

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

9. References

9.1. Informative References

   TBD


Authors' Addresses

   Hadriel Kaplan
   Acme Packet
   Email: hkaplan@acmepacket.com

   Dan Burnett
   Voxeo
   Email: dburnett@voxeo.com

   Neil Stratford
   Voxeo
   Email: nstratford@voxeo.com

   Tim Panton
   PhoneFromHere.com
   Email: tim@phonefromhere.com