SIPPING                                                      J. Van Dyke
Internet-Draft                                            E. Burger, Ed.
Expires: September 10, 2004                                   A. Spitzer
                                                SnowShore Networks, Inc.
                                                          March 12, 2004


       Media Server Control Markup Language (MSCML) and Protocol
                         draft-vandyke-mscml-04

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3667.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 10, 2004.

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   Media Server Control Markup Language (MSCML) is a markup language
   used in conjunction with SIP to provide advanced conferencing
   functions.  MSCML presents an application-level model for conference
   control, as opposed to device-level conference control models.  One
   use of this protocol is for communications between a conference focus
   and mixer in the IETF SIP Conferencing Framework.

Intellectual Property Rights




Van Dyke, et al.       Expires September 10, 2004               [Page 1]


Internet-Draft                   MSCML                        March 2004


   SnowShore Networks, Inc. is making their intellectual property right
   interest in MSCML available on a royalty-free basis, per the terms
   described in SnowShore's IPR disclosure in the online IETF list of
   claimed rights at
   <http://www.ietf.org/ietf/IPR/SNOWSHORE-draft-vandyke-mscml.txt>.

Conventions used in this document

   RFC2119 [1] provides the interpretations for the key words "MUST",
   "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
   "RECOMMENDED", "MAY", and "OPTIONAL" found in this document.

Table of Contents

   1.    Introduction . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.    MSCML Approach . . . . . . . . . . . . . . . . . . . . . . .  5
   3.    Use of SIP Request Methods . . . . . . . . . . . . . . . . .  5
   4.    MSCML Design . . . . . . . . . . . . . . . . . . . . . . . .  7
   4.1   Transaction Model  . . . . . . . . . . . . . . . . . . . . .  7
   4.2   XML Usage  . . . . . . . . . . . . . . . . . . . . . . . . .  8
   5.    Advanced Conferencing  . . . . . . . . . . . . . . . . . . .  9
   5.1   Conference Model . . . . . . . . . . . . . . . . . . . . . .  9
   5.2   Configure Conference Tag . . . . . . . . . . . . . . . . . . 10
   5.2.1 Conference Leg Attributes  . . . . . . . . . . . . . . . . . 11
   5.3   Terminating a Conference . . . . . . . . . . . . . . . . . . 12
   5.4   Conference Manipulation  . . . . . . . . . . . . . . . . . . 13
   6.    Interactive Voice Response (IVR) . . . . . . . . . . . . . . 15
   6.1   Play Audio <play>  . . . . . . . . . . . . . . . . . . . . . 17
   6.2   Collect Digits <playcollect> . . . . . . . . . . . . . . . . 17
   6.3   Recording Audio <playrecord> . . . . . . . . . . . . . . . . 20
   6.4   Stop Request <stop>  . . . . . . . . . . . . . . . . . . . . 22
   6.5   Prompt Block <prompt>  . . . . . . . . . . . . . . . . . . . 22
   7.    Fax Processing . . . . . . . . . . . . . . . . . . . . . . . 23
   7.1   Recording Fax <faxrecord>  . . . . . . . . . . . . . . . . . 23
   7.2   Sending Fax <faxplay>  . . . . . . . . . . . . . . . . . . . 25
   8.    Response Attributes and Return Codes . . . . . . . . . . . . 27
   8.1   Mechanism  . . . . . . . . . . . . . . . . . . . . . . . . . 27
   8.2   <response> Attributes  . . . . . . . . . . . . . . . . . . . 27
   9.    Formal Syntax  . . . . . . . . . . . . . . . . . . . . . . . 29
   9.1   Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
   10.   IANA Considerations  . . . . . . . . . . . . . . . . . . . . 41
   10.1  IANA Registration of MIME media type
         application/mediaservercontrol+xml . . . . . . . . . . . . . 41
   11.   Security Considerations  . . . . . . . . . . . . . . . . . . 41
         Normative References . . . . . . . . . . . . . . . . . . . . 42
         Informative References . . . . . . . . . . . . . . . . . . . 42
         Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 43
   A.    Contributors . . . . . . . . . . . . . . . . . . . . . . . . 44
   B.    Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 44
         Intellectual Property and Copyright Statements . . . . . . . 45

1. Introduction

   This document describes the Media Server Control Markup Language
   (MSCML).  This document describes payloads that one can send with a
   standard SIP INVITE to a media server.  Basic Network Media Services
   with SIP [6] describes media server SIP URI formats.

   Prior to MSCML, there was no standard way to deliver SIP-based
   enhanced conferencing.  Basic SIP constructs, such as those described
   in Basic Network Media Services with SIP [6], serve simple n-way
   conferencing well.  The SIP URI provides a natural mechanism for
   identifying a specific SIP conference, while INVITE and BYE methods
   elegantly implement conference join and leave semantics.  However,
   enhanced conferencing applications also require features such as
   sizing and resizing, in-conference IVR operations (e.g. recording and
   playing participant names to the full conference) and conference
   event reporting.  MSCML payloads within standard SIP methods realize
   these features.

   The structure and approach of MSCML satisfy the requirements set out
   in conferencing-framework [7] and cc-framework [8].  In particular,
   MSCML serves as the interface between the conference factory and a
   centralized conference mixer.  In this case, a media server has the
   role of the conference mixer.

   There are two broad classes of MSCML functionality.  The first class
   includes primitives for advanced conferencing such as conference
   configuration, participant leg manipulation and conference event
   reporting.  The second class comprises primitives for interactive
   voice response (IVR).  These include playing audio, collecting
   digits, and recording audio.

   The IVR features of MSCML began as an adjunct for conferencing.  In
   many scenarios it was impractical or inconvenient to establish a
   dialog with a distinct IVR resource and then re-join the conference.
   Over time, many SIP Proxy Servers, Media Gateway Controllers, and
   SIP-based applications have used MSCML for simple IVR such as
   prompt-and-collect.  Do note that for complex IVR it may be more
   appropriate to employ a full IVR markup language such as VoiceXML
   [9].

   In general, a media server offers services to SIP UACs such as
   application servers, feature servers, and media gateway controllers.
   See the IPCC Reference Architecture [10] for definitions of these
   terms.  It is unlikely, but not prohibited, for end user SIP UACs to
   have a direct signaling relationship with a media server.

   This document describes a working framework and protocol with which
   there is considerable implementation experience.  Application
   developers and service providers have created several MSCML-based
   services since the initial version was made available more than a
   year ago.  This experience is highly relevant to the ongoing work of
   the IETF, particularly the SIP
   <http://www.ietf.org/html.charters/sip-charter.html>, SIPPING
   <http://www.ietf.org/html.charters/sipping-charter.html>, MMUSIC
   <http://www.ietf.org/html.charters/mmusic.html>, and XCON
   <http://www.ietf.org/html.charters/xcon.html> working groups, as
   well as the CCXML work in the Voice Browser Working Group of the
   W3C.

2. MSCML Approach

   The goal of MSCML is to provide a development environment that
   follows the SIP, HTTP, and XML development paradigm.  That is, the
   mixing resource is a server that operates on application-level
   constructs such as call participants.

   Some developers may desire low-level control over DSP resources.
   Examples of such control include path establishment between DSP
   blocks such as tone detectors, tone generators, or other speech
   resources.  For such users, we STRONGLY suggest using a protocol such
   as H.248.1 [2].  Such control does not fit the SIP model.  It is, of
   course, possible to transport such low-level instructions in SIP.
   However, the programming model moves from the client-server peer
   paradigm of SIP to the master-slave controller model of H.248.1, in
   which case H.248.1 is a much more appropriate solution.

   The MSCML paradigm is important to the developer community, in that
   developers and operators conceptually write applications about calls,
   conferences, and call legs.  The H.248.1 paradigm is conceptually
   about resources and plumbing.  That is a whole level of
   implementation details that, for the majority of developers, adds no
   value.

3. Use of SIP Request Methods

   MSCML payloads may be carried in either SIP INVITE or INFO
   requests.  The initial INVITE that creates an enhanced conference
   MUST include an MSCML payload.  The initial INVITE that joins a
   participant leg to an enhanced conference MAY include an MSCML
   payload.  All mid-call MSCML payloads are sent via SIP INFO
   requests.

   MSCML responses are transported in the final response to the SIP
   INVITE containing the matching MSCML request or in a SIP INFO
   message.  The only allowable final response to a SIP INFO containing
   a message body is a 200 OK, per RFC2976 [11].  Therefore, when the
   MSCML request is sent via SIP INFO, the MSCML response is carried in
   a separate INFO request.  In general, these responses are
   asynchronous in nature and require a separate transaction due to
   timing considerations.

   There has been considerable debate on the use of the SIP INFO method
   for any purpose.  Our experience is that MSCML would not have been
   possible without it.  When MSCML was implemented, the first SIP
   Event Notification draft had just been published.  At that time, use
   of SUBSCRIBE/NOTIFY within an existing dialog was undefined.  This
   prevented its use in MSCML, since all events occurred in an
   INVITE-established dialog.  And while SUBSCRIBE/NOTIFY was well
   suited for reporting conference events, its semantics seemed
   inappropriate for modifying a participant leg or conference setting,
   where the only "event" was the success or failure of the request.
   Lastly, since SIP INFO was an established RFC, it was well supported
   in all the SIP stack implementations available at that time.  We had
   few if any interoperability issues as a result.

   As it turns out, using NOTIFY is not appropriate, as the NOTIFY would
   be in response to an implicit subscription.  The issues of implicit
   subscription have been discussed on the SIP and SIPPING lists.

   Using SUBSCRIBE is not appropriate for two reasons.  The first is
   semantic.  The purpose of SUBSCRIBE is to register interest in User
   Agent state.  However, using SUBSCRIBE for MSCML results in the
   SUBSCRIBE modifying the User Agent state.  The second reason is that
   MSCML is inherently call-based.  The association of a SIP dialog
   with a call leg keeps MSCML straightforward.  For example, if one
   used SUBSCRIBE or another SIP method to send commands about some
   context, one would have to identify that context somehow.  Relating
   commands to the SIP dialog they arrive on defines the context for
   free.  Moreover, it is conceptually easy for the developer.

   Recently we have reconsidered using the NOTIFY method for events, as
   used in, for example, KPML [12].  NOTIFY is appropriate for KPML as
   there is usually only a single response to a given KPML document.
   An MSCML request, by contrast, may result in multiple responses.
   Moreover, MSCML mid-call requests can go in both directions, which is
   not the case for KPML.

   Because of the multiple response and peer mid-call request nature of
   MSCML, we also considered MSRP [13].  MSRP may be the appropriate
   technology.  The main benefit of MSRP is that only proxies interested
   in seeing MSCML signaling see the MSCML messages.  This is in
   contrast to the current scheme, where the interested proxies, as well
   as any other proxies that happen to record-route, see the MSCML
   messages.  The trade-off here is that many of the interested proxies
   are border proxies.  In the interest of interoperability, we chose to
   continue using INFO.

   In order to guarantee interoperability with this specification, as
   well as with SIP User Agents that are unaware of MSCML, SIP UACs that
   wish to use MSCML services MUST Require the "mscml" service in the
   initial INVITE.  The media server, as a SIP UAS, MUST respond
   appropriately to the INVITE, as well as advertise its support of
   MSCML in the response to OPTIONS requests.  This alleviates the major
   issue with using INFO for the transport of application data, namely
   the User Agent's proper interpretation of what is, by design, an
   opaque message request.

   SIP continues to progress incredibly quickly and we will continually
   reevaluate some of the decisions that resulted in the original design
   of MSCML.  However, we can confidently say that the availability of a
   widely supported, flexible request method was very important to the
   development and adoption of MSCML.

4. MSCML Design

4.1 Transaction Model

   To avoid undue complexity, two rules govern MSCML usage.  The first
   is that only one MSCML body may be present in a SIP request.  The
   second is that each MSCML body may contain only one request or
   response.  These rules greatly simplify transaction management.
   MSCML syntax does provide for the unique identification of multiple
   requests in a single body part, but this is not currently allowed.

   Per the guidelines of RFC3470 [14], MSCML bodies MUST be well formed
   and valid.

   MSCML is a direct request-response protocol.  There are no
   provisional responses, only final responses.  A request may result in
   multiple notifications, as in the case of requesting active talker
   reports.  This maps to the three major tag trees for MSCML:
   <request>, <response>, and <notification>.

   Figure 1 shows a request body.  Depending on the command, one can
   send the request in an INVITE or an INFO.  Figure 2 shows a response
   body.  The SIP INFO method transports response bodies.  Figure 3
   shows a notification body.  The SIP INFO method transports
   notifications.

   <?xml version="1.0" encoding="utf-8"?>
     <MediaServerControl version="1.0">
       <request>
         ... request body ...
       </request>
     </MediaServerControl>

                           Figure 1: Request


   <?xml version="1.0" encoding="utf-8"?>
     <MediaServerControl version="1.0">
       <response>
         ... response body ...
       </response>
     </MediaServerControl>

                           Figure 2: Response


   <?xml version="1.0" encoding="utf-8"?>
     <MediaServerControl version="1.0">
       <notification>
         ... notification body ...
       </notification>
     </MediaServerControl>

                         Figure 3: Notification
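
   The transaction rules above can be checked mechanically.  The
   following is a minimal sketch in Python, not part of MSCML itself.
   It assumes the un-namespaced element names used in Figures 1 to 3;
   the function name mscml_kind is hypothetical.  It checks only well
   formedness and the one-transaction-per-body rule, not validity
   against the schema in Section 9.

```python
import xml.etree.ElementTree as ET

# The three top-level tag trees named in the text.
KINDS = ("request", "response", "notification")


def mscml_kind(body):
    """Return the kind of a single-transaction MSCML body.

    Returns "request", "response", or "notification", or None when the
    body is not well formed or does not hold exactly one transaction.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None  # not well formed
    if root.tag != "MediaServerControl":
        return None
    children = list(root)
    if len(children) != 1 or children[0].tag not in KINDS:
        return None
    return children[0].tag


# A body as it might arrive in a SIP message (bytes), and a body that
# breaks the "only one request or response" rule.
REQUEST = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<MediaServerControl version="1.0">'
    b'<request><configure_leg mixmode="mute"/></request>'
    b'</MediaServerControl>'
)
TWO_REQUESTS = (
    '<MediaServerControl version="1.0">'
    '<request/><request/>'
    '</MediaServerControl>'
)

print(mscml_kind(REQUEST))
print(mscml_kind(TWO_REQUESTS))
```

   A receiver could use such a check to decide whether to process a
   body or to reject it before any command handling takes place.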


4.2 XML Usage

   In the philosophy of XML as a text-based description language, and
   not a programming language, MSCML chooses many attribute values for
   human readability.  Thus many attributes that would often be
   "boolean" instead take "yes" or "no" values.  For example, the
   meaning of 'report="false"' or 'report="1"' is not obvious at a
   glance, whereas 'report="yes"' is clear: I want a report.  That
   said, some programmers prefer the precision of a boolean.  To
   satisfy both styles, MSCML defines an XML type, "yn", that takes on
   the values "yes" and "no" as well as the boolean values "true",
   "false", "1", and "0".
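
   As an illustration, a receiver might map "yn" values to a native
   boolean as sketched below in Python.  The function name parse_yn is
   hypothetical, and the sketch assumes that values match exactly and
   in lower case, as XML Schema enumerations normally do.

```python
# Accepted spellings of the "yn" type, as listed in the text.
_YES = frozenset({"yes", "true", "1"})
_NO = frozenset({"no", "false", "0"})


def parse_yn(value):
    """Convert an MSCML "yn" attribute value to a Python bool."""
    if value in _YES:
        return True
    if value in _NO:
        return False
    raise ValueError(f"not a yn value: {value!r}")


print(parse_yn("yes"), parse_yn("0"))
```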

   Many attributes in the MSCML schema have default values.  In order to
   limit demands on the XML parser, MSCML applies these values at the
   protocol, not XML, level.  The MSCML schema documents these defaults
   as XML annotations to the appropriate attribute.

5. Advanced Conferencing

5.1 Conference Model

   The advanced conferencing model is a star controller model, with both
   signaling and media directed to a central location.  Figure 4 depicts
   a typical signaling relationship between end users' UACs, a
   conference application server, and a media server.

   The document conferencing-framework [7] makes use of this model.  The
   application server is an instantiation of the conference focus.  The
   media server is an instantiation of the media mixer.  Note that
   user-level constructs, such as event notifications, are in the
   purview of the application server.  This is why, for example, the
   media server reports active talkers to the application server using
   MSCML, while the application server should use the conference
   package [15] for individual notifications to SIP user agents.  Note
   that the use of the conference package for media server to
   application server notifications is not recommended because none of
   the filtering and membership information is available at the media
   server.

     +-------+
     | UAC 1 |---\   Public URI  +-------------+
     +-------+    \ _____________| Application |
                   /    /        |   Server    |     Not shown:
     +-------+    /    /         +-------------+     RTP flows directly
     | UAC 2 |---/    /                 | Private    between UACs and
     +-------+       /                  |   URI      Media Server
         .          /            +--------------+
         :         /             |              |
     +-------+    /              | Media Server |
     | UAC n |---/               |              |
     +-------+                   +--------------+

                       Figure 4: Conference Model

   Each UAC sends an INVITE to a Public Conference URI.  Presumably the
   Application Server publishes this URI, or it is an ad hoc URI.  In
   any event, the Application Server generates a Private URI, following
   the rules specified by Basic Network Media Services with SIP [6].
   That is, the URI is of the form:
   sip:conf=UniqueID@ms.example.net

   Where UniqueID is a unique conference identifier, and ms.example.net
   is the host name or IP address of the media server.  There is nothing
   to prevent the UACs from contacting the media server directly.
   However, one would expect the owner of the media server to restrict
   who can use media server resources.
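
   The Private URI form above is simple enough to build and take apart
   with string operations.  The following Python sketch is illustrative
   only.  Both function names are hypothetical, and the restriction
   that UniqueID contain no "@" is an assumption made to keep the
   example unambiguous; it is not a rule stated in this document.

```python
def private_conference_uri(unique_id, media_server):
    """Build a Private Conference URI of the form sip:conf=ID@host."""
    if not unique_id or "@" in unique_id:
        raise ValueError("UniqueID must be non-empty and contain no '@'")
    return f"sip:conf={unique_id}@{media_server}"


def conference_id_from_uri(uri):
    """Return the UniqueID from a Private Conference URI, else None."""
    prefix = "sip:conf="
    if not uri.startswith(prefix) or "@" not in uri:
        return None
    return uri[len(prefix):].split("@", 1)[0]


uri = private_conference_uri("ab34h76z", "ms.example.net")
print(uri)
print(conference_id_from_uri(uri))
```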



   As with basic conferencing, described by Basic Network Media Services
   with SIP [6], the first INVITE to the media server with a UniqueID
   creates a conference.  However, in advanced conferencing, the first
   INVITE includes an MSCML <configure_conference> payload.  The MSCML
   payload conveys extended session parameters (e.g., number of
   participants) that are not readily expressed in SDP but must be known
   to allocate the appropriate resources.

   The first dialog established for an enhanced conference has several
   useful properties and is referred to as the "Conference Control Leg."
   The control leg is used to play audio to, or record audio from, the
   entire conference.  No RTP is expected on the Conference Control
   Leg.  Therefore, the application must send either no SDP or hold SDP
   (c=0.0.0.0) in the initial INVITE request.  In addition, the
   lifetime of the conference is the same as that of its control leg.
   This ensures that the conference remains in existence even if one or
   more participant legs unintentionally leave the conference.

5.2 Configure Conference Tag

   The <configure_conference> tag has two attributes that control the
   resources the media server sets aside for the conference.  The
   attributes are reservedtalkers and reserveconfmedia.  Reservedtalkers
   sets the maximum number of talker legs.  Reserveconfmedia, if set to
   "yes", allocates resources for playing or recording audio to or from
   the entire conference.  The default for reserveconfmedia is "yes".

   When an INVITE arrives that would add a talker leg beyond the
   reservedtalkers limit (that is, INVITE number reservedtalkers+1), the
   media server responds with a 486 Busy Here.
      NOTE:  It would be symmetric to have a reservedlisteners
      parameter.  However, the practical limitation on the media server
      is the number of talkers for a mixer to monitor.  In either case,
      the application server regulates who gets into the conference by
      either proxying the INVITEs from the user agent clients or
      metering who it gives the conference URI to.
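
   The admission behavior can be pictured with a small model.  The
   Python sketch below is not media server code.  The class name is
   hypothetical, and it assumes that only talker legs count against
   reservedtalkers and that the Conference Control Leg is not counted.
   Listener legs are not modeled.

```python
class ConferenceModel:
    """Toy model of talker admission against reservedtalkers."""

    def __init__(self, reservedtalkers):
        self.reservedtalkers = reservedtalkers
        self.talkers = 0

    def invite_talker(self):
        """Return the SIP status code for one more talker INVITE."""
        if self.talkers >= self.reservedtalkers:
            return 486  # Busy Here: all reserved talker legs in use
        self.talkers += 1
        return 200


conf = ConferenceModel(reservedtalkers=2)
print([conf.invite_talker() for _ in range(3)])
```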

   The application server can include any MSCML command in the initial
   INVITE, with the exception of asynchronous commands, such as <play>
   or <playrecord>.  The application server must issue asynchronous
   commands separately (e.g., in INFO messages) to avoid ambiguous
   responses.

   For example, to create a conference with up to 120 active talkers and
   the ability to play audio into the conference or record parts or all
   of the conference, the application server specifies both attributes,
   as shown in Figure 6.

        <?xml version="1.0" encoding="utf-8"?>
          <MediaServerControl version="1.0">
            <request>
              <configure_conference reservedtalkers="120"
                                    reserveconfmedia="yes"/>
            </request>
          </MediaServerControl>

                  Figure 6: 120 Speaker MSCML Example

   Figure 7 shows a conference with up to five active speakers without
   the capability to play audio into, or record audio from, the
   conference.

        <?xml version="1.0" encoding="utf-8"?>
          <MediaServerControl version="1.0">
            <request>
              <configure_conference reservedtalkers="5"
                                    reserveconfmedia="no"/>
            </request>
          </MediaServerControl>

                   Figure 7: 5 Speaker MSCML Example
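
   An application server would typically generate such payloads rather
   than write them by hand.  The following Python sketch shows one way
   to do so with the standard library.  The function name is
   hypothetical.  Omitting reserveconfmedia leaves the media server to
   apply its default of "yes".

```python
import xml.etree.ElementTree as ET


def build_configure_conference(reservedtalkers, reserveconfmedia=None):
    """Return a <configure_conference> request body as UTF-8 bytes."""
    root = ET.Element("MediaServerControl", {"version": "1.0"})
    request = ET.SubElement(root, "request")
    attrs = {"reservedtalkers": str(reservedtalkers)}
    if reserveconfmedia is not None:
        # Use the human-readable "yn" spellings.
        attrs["reserveconfmedia"] = "yes" if reserveconfmedia else "no"
    ET.SubElement(request, "configure_conference", attrs)
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(root, encoding="utf-8")


# Equivalent in content to the five-speaker example.
print(build_configure_conference(5, reserveconfmedia=False).decode("utf-8"))
```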

   Once the application server has created the Conference Control Leg,
   it can join participants to the conference.  The application server
   directs each participant's INVITE to the Private Conference URI
   described above.  In the example given, this would be
   sip:conf=UniqueID@ms.example.net.

5.2.1 Conference Leg Attributes

   Conference legs have a number of parameters the application server
   can modify.  The defaults are in Table 1.

   +----------------------+----------------------+---------------------+
   | Parameter            | Default              | Description         |
   +----------------------+----------------------+---------------------+
   | type                 | talker               | Consider this leg's |
   |                      |                      | audio for mixing in |
   |                      |                      | the output mix.     |
   |                      |                      | Alternative is      |
   |                      |                      | "listener".         |
   | dtmfclamp            | yes                  | Remove detected     |
   |                      |                      | DTMF digit from     |
   |                      |                      | audio               |
   | toneclamp            | yes                  | Remove loud         |
   |                      |                      | single-frequency    |
   |                      |                      | tone from audio     |
   | mixmode              | full                 | Be a candidate for  |
   |                      |                      | the full mix.       |
   |                      |                      | Alternatives are    |
   |                      |                      | "mute" to not allow |
   |                      |                      | audio in the mix,   |
   |                      |                      | "parked" to remove  |
   |                      |                      | any media streams   |
   |                      |                      | from the leg, and   |
   |                      |                      | "preferred" to give |
   |                      |                      | this stream         |
   |                      |                      | preferential        |
   |                      |                      | selection in the    |
   |                      |                      | mix (i.e., even if  |
   |                      |                      | not loudest talker, |
   |                      |                      | include media, if   |
   |                      |                      | present, from this  |
   |                      |                      | leg in the mix).    |
   +----------------------+----------------------+---------------------+

                   Table 1: Conference Leg Parameters

   In addition to these attributes, there are two tags defined:
   inputgain and outputgain.  Each takes one of two child values:
   <auto/>, to use automatic gain control (AGC) to determine the gain
   for the leg, or <fixed/>, to use a fixed gain.  <auto/> takes the
   attributes "startlevel", "targetlevel", and "silencethreshold".  All
   of these parameters are in dB.  <fixed/> takes the attribute "level",
   which is in dB.

   The default for both inputgain and outputgain is <auto/>.

   If the default parameters are acceptable for the leg the application
   server wishes to enter into the conference, then a normal SIP INVITE
   is sufficient.  However, if the application server wishes to modify
   one or more of the parameters, the application server can include an
   MSCML body in addition to the SDP body.

   The application server can modify the conference leg parameters by
   issuing a SIP INFO on the selected dialog representing the conference
   leg.  Of course, the application server cannot modify SDP in an INFO
   message.
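
   Because defaults apply at the protocol level rather than in the XML
   parser, a receiver has to merge them itself.  The Python sketch
   below shows one way to compute the parameters in effect when a leg
   joins, using the defaults and alternatives from Table 1.  It assumes
   the parameters arrive as attributes of a <configure_leg> element, as
   in the mute example in Section 5.4.  The function name is
   hypothetical, and gain settings are not modeled.

```python
import xml.etree.ElementTree as ET

# Defaults from Table 1.
LEG_DEFAULTS = {
    "type": "talker",
    "dtmfclamp": "yes",
    "toneclamp": "yes",
    "mixmode": "full",
}

# Enumerated alternatives named in Table 1.
ALLOWED = {
    "type": {"talker", "listener"},
    "mixmode": {"full", "mute", "parked", "preferred"},
}


def effective_leg_parameters(configure_leg_xml):
    """Merge <configure_leg> attributes over the Table 1 defaults."""
    leg = ET.fromstring(configure_leg_xml)
    params = dict(LEG_DEFAULTS)
    for name, value in leg.attrib.items():
        if name not in LEG_DEFAULTS:
            continue  # not a Table 1 parameter; ignored in this sketch
        if name in ALLOWED and value not in ALLOWED[name]:
            raise ValueError(f"bad value for {name}: {value!r}")
        params[name] = value
    return params


print(effective_leg_parameters('<configure_leg mixmode="mute"/>'))
```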

5.3 Terminating a Conference

   To remove a leg from the conference, the application server issues a
   SIP BYE request on the selected dialog representing the conference
   leg.

   The application server can terminate all legs in a conference by
   issuing a SIP BYE request on the Conference Control Leg.  If one or
   more participants are still in the conference when the media server
   receives a SIP BYE request on the Conference Control Leg, the media
   server issues SIP BYE requests on all of the remaining conference
   legs to ensure clean up of the legs.

   The media server returns a 200 OK to the SIP BYE request as it sends
   BYE requests to the other legs.  This is because a provisional
   response cannot be issued for a non-INVITE request, yet the teardown
   of the other legs may take some time.
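
   The teardown rules in this section can be summarized as a small
   model.  The Python sketch below is illustrative only.  The names
   Conference and handle_bye are hypothetical, dialogs are represented
   by plain strings, and the ordering of the returned actions (200 OK
   first, then BYE requests) is an assumption that mirrors the
   description above.

```python
from dataclasses import dataclass, field


@dataclass
class Conference:
    control_leg: str
    participant_legs: list = field(default_factory=list)


def handle_bye(conf, dialog_id):
    """Return the actions a media server takes for a BYE on dialog_id.

    Each action is a (message, dialog) pair.
    """
    if dialog_id == conf.control_leg:
        # Answer at once, then tear down every remaining leg.
        actions = [("200 OK", dialog_id)]
        actions += [("BYE", leg) for leg in conf.participant_legs]
        conf.participant_legs.clear()
        return actions
    if dialog_id in conf.participant_legs:
        conf.participant_legs.remove(dialog_id)
        return [("200 OK", dialog_id)]
    raise KeyError(f"unknown dialog: {dialog_id}")


conf = Conference("control", ["leg-a", "leg-b"])
print(handle_bye(conf, "leg-a"))
print(handle_bye(conf, "control"))
```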

5.4 Conference Manipulation

   Once the conference has begun, the application server can manipulate
   the conference as a whole by issuing commands on the Conference
   Control Leg.  For example, the application server can request the
   media server to record the conference, play a prompt to the
   conference, change the input or output gain for the conference as a
   whole, and report on events.  The elements for these commands are
   <playrecord>, <play>, <inputgain>, <outputgain>, and <subscribe>,
   respectively.

   Figure 8 and Figure 9 show two sample commands.  The first plays a
   prompt into the conference.  The second records the entire conference
   to the URI specified by recurl.  In this example, the file: URI
   causes the media server to write the recording over NFS, per
   configuration at the media server.
      NOTE:  The provisioning of NFS mount points and their mapping to
      the file: scheme is purely a local matter at the media server.

        <?xml version="1.0" encoding="utf-8"?>
          <MediaServerControl version="1.0">
            <request>
              <play
    prompturl="http://prompts.example.net/us_EN/welcome.au"/>
            </request>
          </MediaServerControl>

             Figure 8: Full Conference Audio Command - Play

        <?xml version="1.0" encoding="utf-8"?>
          <MediaServerControl version="1.0">
            <request>
              <playrecord
    recurl="file://archive.example.net/conferences/archives/011208.au"
                            beep="no"
                            initsilence="-1" endsilence="-1" />
            </request>
          </MediaServerControl>

            Figure 9: Full Conference Audio Command - Record

   The response to this last request will be similar to Figure 10.

     <?xml version="1.0" encoding="utf-8"?>
       <MediaServerControl version="1.0">
        <response request="playrecord" code="200" text="OK"
                  reclength="1420374"/>
     </MediaServerControl>

               Figure 10: Sample Change Command Response

   A request for an active talker report is in Figure 11.

   <?xml version="1.0" encoding="utf-8"?>
     <MediaServerControl version="1.0">
        <request>
          <configure_conference>
            <subscribe>
              <events>
                <activetalkers/>
              </events>
            </subscribe>
          </configure_conference>
        </request>
     </MediaServerControl>

                    Figure 11: Active Talker Request

   Later event reporting comes through SIP INFO messages.  Figure 12
   shows an example report.

        <?xml version="1.0" encoding="utf-8"?>
          <MediaServerControl version="1.0">
            <notification>
              <conference uniqueID="ab34h76z" numtalkers="16"
                          numlisteners="1382">
                <activetalkers>
                  <talker callID="myhost4sn123"/>
                  <talker callID="myhost2sn456"/>
                  <talker callID="myhost12sn78"/>
                </activetalkers>
              </conference>
            </notification>
          </MediaServerControl>

                 Figure 12: Active Talker Event Example
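
   As an illustration, an application server could extract the talker
   list from the notification in Figure 12 along the following lines.
   This is a minimal sketch, not part of MSCML: it uses the Python
   standard library XML parser, the function name active_talkers is
   hypothetical, and it assumes the element and attribute names shown
   in Figure 12.

```python
import xml.etree.ElementTree as ET

# Body of the notification from Figure 12 (XML declaration omitted).
NOTIFICATION = """
<MediaServerControl version="1.0">
  <notification>
    <conference uniqueID="ab34h76z" numtalkers="16" numlisteners="1382">
      <activetalkers>
        <talker callID="myhost4sn123"/>
        <talker callID="myhost2sn456"/>
        <talker callID="myhost12sn78"/>
      </activetalkers>
    </conference>
  </notification>
</MediaServerControl>
"""


def active_talkers(body):
    """Return (conference uniqueID, list of talker callIDs).

    Hypothetical helper: assumes the element and attribute names
    shown in Figure 12.
    """
    root = ET.fromstring(body.strip())
    conference = root.find("./notification/conference")
    if conference is None:
        raise ValueError("no <conference> element in notification")
    talkers = [t.get("callID") for t in conference.iter("talker")]
    return conference.get("uniqueID"), talkers
```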

   An application server can modify a leg by issuing an INFO on the
   dialog associated with the participant leg.  For example, Figure 13
   mutes a conference leg.

        <?xml version="1.0" encoding="utf-8"?>
          <MediaServerControl version="1.0">
            <request>
              <configure_leg mixmode="mute"/>
            </request>
          </MediaServerControl>

                  Figure 13: Sample Change Leg Command

   In Figure 8 we saw a request to play a prompt to the entire
   conference.  We can also request to play a prompt to an individual
   call leg.  To play a prompt or collect digits on only a single leg,
   we issue the commands within the dialog of the desired conference
   participant.

   Section 6 describes the interactive voice response (IVR) services
   offered.  If an IVR command arrives on the control channel, it takes
   effect on the whole conference.  This is a mechanism for playing
   prompts to the entire conference (e.g., announcing new participants).
   If an IVR command arrives on an individual leg, it affects only that
   leg.  This is a mechanism for interacting with users, such as
   creating "waiting rooms", allowing a user to mute themselves using
   key presses, allowing a moderator to out-dial, etc.

6. Interactive Voice Response (IVR)

   In the IVR model, the Media Server acts as a media processing proxy
   for the UAC.  This is particularly useful when the UAC is a media
   gateway or other device with limited media processing capability.

                        SIP      +--------------+
                    Service URI  | Application  |
                 /---------------|    Server    |
                /(e.g., RFC3087) +--------------+
               /                        |  MSCML
              /                     SIP | Session
             /                   +--------------+
     +-----+/       RTP          |              |
     | UAC |=====================| Media Server |
     +-----+                     |              |
                                 +--------------+

                          Figure 14: IVR Model

   The IVR service supports basic Interactive Voice Response functions
   (playing announcements, collecting DTMF digits, and recording audio)
   based on Media Server Control Markup Language (MSCML) directives
   added to the message body of a SIP request.  Figure 14 shows the
   signaling relationship between a client UAC, an Application Server,
   and a Media Server.

   Multifunction media servers SHOULD use the URI conventions described
   in Basic Network Media Services with SIP [6].  For review, the MSCML
   IVR service indicator is "ivr":

      sip:ivr@ms.example.net

      NOTE:  The VoiceXML IVR service indicator is "dialog".

   One may carry the request payload for IVR in either the initial SIP
   INVITE or an INFO request.

   Mid-call requests must use the INFO method.  The INFO method reduces
   certain timing issues that occur with re-INVITEs and also requires
   less processing on both the application server and Media Server.

   The Media Server notifies the application that the command has
   completed through a <response> message containing final status
   information and data such as collected DTMF digits.

   The media server does not queue IVR requests.  If the media server
   receives a request while another is in progress, the media server
   stops the first operation and carries out the new request.  The
   Media Server generates a <response> message for the first request and
   returns any data collected up to that point.  If an application
   wishes to stop a request in progress but does not wish to initiate
   another operation, it issues a <stop> request.  This also causes the
   Media Server to generate a <response> message.

   The Media Server treats a SIP re-INVITE with hold media (c=0.0.0.0)
   as an implicit <stop> request.  The media server immediately
   terminates the running <play>, <playcollect> or <playrecord> request,
   and sends a <response>, indicating "reason=stopped".
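
   The request handling described above can be modeled roughly as
   follows.  This is a sketch under stated assumptions, not media
   server code: the class IvrLeg and its methods are hypothetical, a
   <response> is reduced to a dictionary, and the behavior of <stop>
   when no request is in progress (nothing is returned) is an
   assumption.

```python
class IvrLeg:
    """Toy model of one dialog's IVR state (hypothetical, not MSCML)."""

    def __init__(self):
        self.current = None  # name of the request in progress, if any

    def _finish(self, reason):
        # Describe the <response> for the request that is ending,
        # then mark the leg as idle.
        response = {"request": self.current, "reason": reason}
        self.current = None
        return response

    def start(self, request):
        """Begin a new request; return the response for a preempted one."""
        preempted = self._finish("stopped") if self.current else None
        self.current = request  # requests are never queued
        return preempted

    def stop(self):
        """Handle <stop>: end the request in progress, if any."""
        return self._finish("stopped") if self.current else None

    def hold(self):
        """Handle a re-INVITE with hold media: an implicit <stop>."""
        return self.stop()
```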

6.1 Play Audio <play>

   The application issues a <play> request to play an announcement
   without interruption and with no digit collection.  One use, for
   example, is to announce the name of a new participant to the entire
   conference.

   The application specifies the announcement to play by the prompt
   block in the body of the request.

   Attributes include promptencoding (optional), which explicitly
   specifies the encoding (mu-law, A-law, msgsm, etc.), and id (also
   optional).  The id is an application-defined request identifier that
   correlates the asynchronous response with its original request; the
   Media Server echoes it back to the application in its response.

   When the announcement has finished playing, the Media Server sends a
   <response> payload to the application in a SIP INFO message.

   The response may carry the id, the status code (e.g., 200), the
   status text (e.g., OK), and the reason (EOF or stopped).
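
   For illustration, an application might build a <play> request and
   read back the <response> attributes as sketched below.  The helper
   names build_play and read_response are hypothetical, and the sketch
   uses the prompturl attribute form shown in Figure 8 rather than a
   prompt block.

```python
import xml.etree.ElementTree as ET


def build_play(prompturl, request_id=None):
    """Serialize a <play> request body (hypothetical helper)."""
    root = ET.Element("MediaServerControl", version="1.0")
    request = ET.SubElement(root, "request")
    play = ET.SubElement(request, "play", prompturl=prompturl)
    if request_id is not None:
        # The id is optional; the Media Server echoes it in its response.
        play.set("id", request_id)
    return ET.tostring(root, encoding="unicode")


def read_response(body):
    """Return the attributes of the <response> element as a dict."""
    response = ET.fromstring(body).find("response")
    if response is None:
        raise ValueError("no <response> element")
    return dict(response.attrib)
```

   The application can then match the id in the returned dictionary
   against the id it placed in the request.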

6.2 Collect Digits <playcollect>

   The application issues a <playcollect> request to optionally play an
   announcement and then collect digits.

   This request has multiple attributes, all of which are optional.

   The presence or absence of the prompt block controls whether there
   will be an announcement or the result of the request is to be digit
   collection only.

   Whenever the media server receives a <playcollect> request, it will
   continuously buffer and examine collected digits.  The media server
   compares previously buffered digits to the returnkey, escapekey, and
   maxdigits attributes to determine if any immediate action is
   required.  This provides the type-ahead behavior for menu traversal
   and other types of IVR interactions.

   The application may override type-ahead behavior by setting the
   cleardigits parameter to "yes", which removes all previously-buffered
   digits such that the only user input considered is what occurs after
   the request.

   If cleardigits is set to "no", digits previously buffered will result
   in the prompt being barged immediately.  Prompt play would never
   begin, and digit collection would start immediately.

   The default for barge is "yes".  If the barge attribute is set to
   "no", the cleardigits attribute implicitly has a value of "yes".
   This ensures that DTMF input occurring before the current collection
   is not left in the buffer after the request completes.

   The application can set two special strings to invoke special
   processing when detected:
   o  The escapekey, which defaults to *, indicates that the user
      intends to terminate the current operation without saving any
      input collected to that point.  Detection terminates the request
      immediately and generates a response.
   o  The returnkey, which defaults to #, indicates the user has
      completed input and wants to return all collected digits to the
      application.  When the media server detects the returnkey, it
      immediately terminates collection and returns the collected digits
      to the application in the <response> message.

   The media server may also support three additional strings to support
   VCR controls while playing a prompt.  These strings modify the
   behavior of the playing of the prompt block.  If the media server
   does not support VCR controls, it must silently ignore the request.
   o  The skipinterval, which defaults to "6s", indicates how far the
      media server should skip backwards or forwards in the currently
      playing object from prompturl.
   o  The ffkey indicates the user wishes to skip forward skipinterval
      in the currently playing object from prompturl.
   o  The rwkey indicates the user wishes to skip backward skipinterval
      in the currently playing object from prompturl.
   Note that it is an error to have the digits mapped to ffkey or rwkey
   in the digitmap.

   The VCR controls only work within a single <prompt> block.
   Skipping back before the beginning of the block results in playback
   at the beginning of the block.  Skipping forward past the end of the
   block results in the media server treating the prompt as played.
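
   The boundary behavior can be sketched as follows.  The function
   name skip and the use of millisecond offsets into the <prompt> block
   are assumptions made for illustration; treating a skip that lands
   exactly on the end of the block as "played" is also an assumption.

```python
def skip(position_ms, length_ms, skipinterval_ms, forward):
    """Return (new_position_ms, finished) after one ffkey or rwkey press.

    Hypothetical helper: positions are offsets into a single <prompt>
    block, in milliseconds.
    """
    if forward:
        target = position_ms + skipinterval_ms
        if target >= length_ms:
            # Skipping past the end: the prompt counts as played.
            return length_ms, True
        return target, False
    # Skipping back before the start resumes at the beginning.
    return max(0, position_ms - skipinterval_ms), False
```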

      NOTE: This is only a very rudimentary implementation of VCR
      controls.  Application developers are STRONGLY RECOMMENDED to use
      the interrupt time returned in the digit report to calculate where
      to skip.  This enables sensible handling of composite voice
      objects, etc.  If an application developer needs real VCR
      controls, they are STRONGLY RECOMMENDED to use VoiceXML with VCR
      extensions.

   Several timer attributes control how long the Media Server waits for
   digits in the input sequence.  All timer settings are in
   milliseconds.
   firstdigittimer controls how long the Media Server waits for the
      initial DTMF input before terminating collection.
   interdigittimer controls how long the Media Server waits between DTMF
      inputs.
   extradigittimer controls how long the Media Server waits for
      additional user input after the specified number of digits has
      been collected.

   The extradigittimer setting enables the "returnkey" input to be
   associated with the current collection.  For example, if maxdigits is
   set to 3 and returnkey is set to #, the user may enter either "x#",
   "xx#" or "xxx#", where x represents a DTMF digit.

   If the "returnkey" pattern is detected during the "extradigit"
   interval, the collected digits are returned to the application and
   the "returnkey" is removed from the digit buffer.

   If this were not the case, the example would return "xxx" to the
   application and leave the terminating "#" in the digit buffer to be
   processed by the next <playcollect> request.  This might result in
   the termination of the following prompt; clearly not what the user
   intended.

   The extradigittimer has no effect unless returnkey has been set.
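
   The effect of extradigittimer on the digit buffer can be sketched as
   follows.  This is a simplified model, not the media server
   algorithm: the function name collect and the (time, key) event
   format are hypothetical, only maxdigits, returnkey, and
   extradigittimer are modeled, and ending collection on any other key
   pressed during the extra-digit interval is an assumption.

```python
def collect(events, maxdigits, returnkey="#", extradigittimer=0):
    """Return (digits, leftover) for a list of (time_ms, key) events.

    digits is what the application receives; leftover is what stays
    in the buffer for the next <playcollect> request.
    """
    digits = ""
    full_at = None  # time at which maxdigits was reached
    for index, (time_ms, key) in enumerate(events):
        leftover = [k for _, k in events[index:]]
        if full_at is not None and time_ms - full_at > extradigittimer:
            # Extra-digit interval already expired: key stays buffered.
            return digits, leftover
        if key == returnkey:
            # returnkey ends collection and is removed from the buffer.
            return digits, leftover[1:]
        if full_at is not None:
            # Assumed: any other key after maxdigits is left buffered.
            return digits, leftover
        digits += key
        if len(digits) == maxdigits:
            full_at = time_ms
    return digits, []
```

   With maxdigits set to 3, the input "123#" returns "123" and leaves
   nothing behind when the "#" arrives within extradigittimer, but
   leaves "#" in the buffer when extradigittimer is 0.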

   The <regex> element specifies a digit pattern for the media server to
   look for.  MSCML supports three modes of digit map specification:
   regular expressions, MGCP [3] digit maps, and H.248.1 [2] digit maps.
   The type attribute indicates what kind of digit map appears in the
   expression.
   regex The default; use regular expression matching.
   mgcpdigitmap Use digit maps as specified in MGCP [3].
   megacodigitmap Use digit maps as specified in H.248.1 [2].

   When the <playcollect> request has completed, the Media Server sends
   a <response> payload to the application in a SIP INFO message.

   The response may carry the id, the code (e.g., 200), the text (e.g.,
   OK), the reason (match, timeout, returnkey, escapekey, or stopped),
   and the collected digits.

6.3 Recording Audio <playrecord>

   The <playrecord> request directs the Media Server to capture the RTP
   it receives and deliver it to a URI specified by the controlling
   application in the appropriate codec and content encoding.

   This tag has multiple attributes. The required recurl attribute
   identifies the URI target for the recorded audio.  All other
   attributes are optional.

   The presence or absence of the prompt block controls whether or not a
   prompt plays before recording begins.

   When the application requests the media server to prompt the caller
   before recording audio, <playrecord> has two stages.  The first is
   equivalent to a <playcollect> operation.  The application may set the
   prompt phase to be interruptible by DTMF input (barge) and may also
   specify an escape key that will terminate the <playrecord> request
   before the recording phase begins.

   Detection of the escape key generates a response message, and the
   operation returns immediately.  If any other keys are pressed and if
   the prompt has been set as interruptible (barge="yes"), then the play
   stops immediately and the recording phase begins.

   Any digits collected in the prompt phase, with the exception of the
   recstopmask, are buffered and returned in the response.

   If the request proceeds to the recording phase, any digits from the
   collect phase are discarded from the buffer to eliminate unintended
   termination of the recording.

   The media server compares digits detected during the recording phase
   to the digits specified in the recstopmask to determine if they
   indicate a recording termination request.

   The media server ignores digits not present in the recstopmask and
   passes them into the recording.  If the recording is terminated
   because of a DTMF input, the collected digits are returned to the
   application in the <response>.


   Once recording has begun, the media server writes the audio to the
   specified recurl URL no matter what DTMF events are detected.  It is
   the responsibility of the application to examine the DTMF input
   returned in the <response> message to determine whether the audio
   file should be saved or if it should be deleted and potentially
   re-recorded.

   Two attributes control how long the Media Server waits for the start
   of speech to begin the recording and the absence of speech to end the
   recording:
   initsilence determines how long to wait for initial speech input
      before terminating (canceling) the recording.  This parameter may
      take an integer value in milliseconds, or may be set to -1, which
      directs the Media Server to wait indefinitely. The default is 3000
      ms (3 seconds).
   endsilence determines how long the Media Server waits after speech
      has ended to stop the recording.  This parameter may take an
      integer value in milliseconds, or may be set to -1. With a value
      of -1, the recording will continue indefinitely after speech has
      ended and may terminate due to a DTMF keypress or because the
      maximum desired duration has been reached. The default value is
      4000 ms (4 seconds).

   If the endsilence timer expires, the Media Server trims the end of
   the recorded audio by an amount equal to the endsilence parameter.
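
   As a worked illustration of the trimming rule, the stored length
   can be computed as below.  The function name trimmed_length_ms is
   hypothetical, and the sketch assumes that trimming applies only
   when the recording ended with the reason "end_silence".

```python
def trimmed_length_ms(recorded_ms, endsilence_ms, reason):
    """Return the length of the stored audio in milliseconds.

    Hypothetical helper: trimming applies only when the recording
    ended because the endsilence timer expired.
    """
    if reason == "end_silence" and endsilence_ms >= 0:
        # Remove the trailing silence that triggered termination.
        return max(0, recorded_ms - endsilence_ms)
    return recorded_ms
```

   For example, a 12000 ms capture that ends on the default 4000 ms
   endsilence timer is stored as 8000 ms of audio.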

   Additional attributes are:
   mode whether the recording will overwrite or append.
   recencoding whether encoding is mu-law, A-law, msgsm, etc.
   duration time in ms for the entire recording.
   beep whether a beep will signify the start of recording.

   When the recording is finished, the media server generates a
   <response> message and sends it to the application in a SIP INFO
   message.  The response contains the id, the code (e.g., 200, 400,
   501), the reason (e.g., digit, end_silence, init_silence,
   max_duration, escapekey, error, or stopped), collected digits, and
   the reclength (size of the recorded file in bytes).

   The recording example (Figure 16) plays a prompt ("SayName.g711") and
   records the caller's audio to the recurl in MS-GSM format,
   wave-encoded.

     <playrecord
       prompturl="http://prompts.example.net/us_EN/SayName.g711"
       recurl="file://nfs.example.net/names/greet-joij34923119.wav"
       recencoding="msgsm"
       initsilence="15s"
       endsilence="2s"
       duration="8s">
     </playrecord>

                      Figure 16: Recording Example


6.4 Stop Request <stop>

   The application issues a <stop> request when the objective is to stop
   a request in progress and not initiate another operation. This
   request generates a <response> message from the Media Server.

   The only attribute is id, which is optional.

   The application-defined request id correlates the asynchronous
   response with its original request; the Media Server echoes it back
   to the application in its response.

   The response may carry the id, the code (e.g., 200), and the text
   (e.g., OK).

   Note that the Media Server treats a SIP re-INVITE with hold media as
   an implicit <stop> request.  The media server immediately terminates
   the running <play>, <playcollect> or <playrecord> request, and sends
   a <response>, indicating "reason=stopped".

6.5 Prompt Block <prompt>

   This block in the body of the <play>, <playcollect>, or <playrecord>
   request contains one or more references to physical audio files,
   provisioned sequences, or variables that are played in the order in
   which they appear.

   Figure 17 shows a sample prompt block.

       <prompt baseurl="file:////opt/snowshore/prompts/conf/">
         <audio url="please_enter.wav"/>
         <variable type="silence" value="1"/>
         <audio url="your.raw" encoding="a-law"/>
         <variable type="silence" value="1"/>
         <audio url="http://prompts.example.net/pin_number.wav"/>
       </prompt>

                    Figure 17: Prompt Block Example

   The baseurl attribute is the base URL prepended to the URL attributes
   within the <prompt> block.

   Each audio element in a <prompt> block refers to an audio file or
   provisioned sequence for the media server to play.  The media server
   plays audio files in the order in which they are listed in the block.

7. Fax Processing

7.1 Recording Fax <faxrecord>

   The <faxrecord> request directs the Media Server to process a fax in
   answer mode.  This request uses a separate tag from <playrecord>
   because the Media Server needs to know to process the T.30 [16] or
   T.38 [17] fax protocols.

   This tag has multiple attributes.  The lclid attribute is a string
   that identifies the called station.  The lclid attribute is optional.
   The default is null.

   The <faxrecord> request operates in one of three modes: receive,
   poll, and turnaround poll.

   In receive mode, the Media Server receives the fax and writes the fax
   data to the URI specified by the recurl attribute.

   In poll mode, the Media Server sends a fax, but as a polled (called)
   device.

   In turnaround poll mode, the Media Server will record a fax that the
   remote machine sends.  If the remote machine requests a transmission,
   then the Media Server will send the fax.

   The recurl attribute is the URI to record the fax to, if specified.

   The prompturl attribute is the URI to fetch the fax to transmit, if
   specified.

   The rmtid attribute specifies the calling station identifier of the
   remote terminal.  If specified, the media server MUST reject
   transactions with the remote terminal if the remote terminal's
   identifier does not match rmtid.

   The combination of prompturl and recurl defines the mode.  See Table
   2.

   +-----------+--------+------------+--------------------------------+
   | prompturl | recurl | Mode       | Operation                      |
   +-----------+--------+------------+--------------------------------+
   | no        | no     | Invalid    | Request fails.                 |
   | no        | yes    | Receive    | Record fax into recurl.        |
   | yes       | no     | Poll       | Send fax from prompturl. If    |
   |           |        |            | rmtid is specified, it must    |
   |           |        |            | match remote terminal's        |
   |           |        |            | identifier, or the request     |
   |           |        |            | will fail.                     |
   | yes       | yes    | Turnaround | If the remote terminal wishes  |
   |           |        | Poll       | to transmit, the Media Server  |
   |           |        |            | records the fax into recurl.   |
   |           |        |            | If the remote terminal wishes  |
   |           |        |            | to receive, the Media Server   |
   |           |        |            | sends the fax from prompturl.  |
   |           |        |            | If rmtid is specified, it must |
   |           |        |            | match remote terminal's        |
   |           |        |            | identifier, or the send        |
   |           |        |            | request will fail. A receive   |
   |           |        |            | operation will still succeed,  |
   |           |        |            | however.                       |
   +-----------+--------+------------+--------------------------------+

                       Table 2: Fax Receive Modes

   The Media Server MUST flush any quarantined digits when it receives a
   <faxrecord> request.

7.2 Sending Fax <faxplay>

   The <faxplay> request directs the Media Server to process a fax in
   originate mode.  This request uses a separate tag from <play>
   because the Media Server needs to know to process the T.30 [16] or
   T.38 [17] fax protocols.

   This tag has multiple attributes.  The lclid attribute is a string
   that identifies the Media Server as the calling station in the DIS
   message.  The lclid attribute is optional.  The default is null.

   The <faxplay> request operates in one of three modes: send, remote
   poll, and turnaround poll.

   In send mode, the Media Server sends the fax.

   In remote poll mode, the Application Server places a call on behalf
   of the Media Server.  The Media Server requests a fax transmission
   from the remote fax terminal.

   In turnaround poll mode, the Media Server will record a fax that the
   remote machine sends.  If the remote machine requests a transmission,
   then the Media Server will send the fax.

   The recurl attribute is the URI to record the fax to, if specified.
   The Media Server will advertise in the DIS message it can receive
   fax transmissions.

   The prompturl attribute is the URI to fetch the fax to transmit, if
   specified.  The Media Server will advertise in the DIS message it can
   send fax transmissions.

   The rmtid attribute specifies the calling station identifier of the
   remote terminal.  If specified, the media server MUST reject
   transactions with the remote terminal if the remote terminal's
   identifier does not match rmtid.

   The combination of prompturl and recurl defines the mode.  See Table
   3.

   +-----------+--------+------------+--------------------------------+
   | prompturl | recurl | Mode       | Operation                      |
   +-----------+--------+------------+--------------------------------+
   | no        | no     | Invalid    | Request fails.                 |
   | yes       | no     | Send       | Send fax from prompturl. If    |
   |           |        |            | rmtid is specified, it must    |
   |           |        |            | match remote terminal's        |
   |           |        |            | identifier, or the request     |
   |           |        |            | will fail.                     |
   | no        | yes    | Remote     | Record fax into recurl,        |
   |           |        | Poll       | assuming the remote terminal   |
   |           |        |            | specifies in its DIS message   |
   |           |        |            | that it can send a fax. If the |
   |           |        |            | remote terminal does not       |
   |           |        |            | support reverse polling, the   |
   |           |        |            | request will fail. If rmtid is |
   |           |        |            | specified, it must match       |
   |           |        |            | remote terminal's identifier,  |
   |           |        |            | or the request will fail.      |
   | yes       | yes    | Turnaround | If the remote terminal wishes  |
   |           |        | Poll       | to transmit, the Media Server  |
   |           |        |            | records the fax into recurl.   |
   |           |        |            | If the remote terminal wishes  |
   |           |        |            | to receive, the Media Server   |
   |           |        |            | sends the fax from prompturl.  |
   |           |        |            | If rmtid is specified, it must |
   |           |        |            | match remote terminal's        |
   |           |        |            | identifier, or the send        |
   |           |        |            | request will fail. A receive   |
   |           |        |            | operation will still succeed,  |
   |           |        |            | however.                       |
   +-----------+--------+------------+--------------------------------+

                        Table 3: Fax Send Modes

   The Media Server MUST flush any quarantined digits when it receives a
   <faxplay> request.
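
   The mode selection in Tables 2 and 3 can be condensed into a small
   lookup, sketched below.  The function name fax_mode is
   hypothetical; the mode names follow the prose of Sections 7.1 and
   7.2, and the "Invalid" row is modeled as an exception.

```python
def fax_mode(request, prompturl=None, recurl=None):
    """Return the mode name for a <faxrecord> or <faxplay> request.

    Hypothetical helper: raises ValueError for the "Invalid" row of
    Tables 2 and 3 (neither attribute present).
    """
    if request not in ("faxrecord", "faxplay"):
        raise ValueError("unknown request: " + request)
    if prompturl and recurl:
        return "turnaround poll"
    if not prompturl and not recurl:
        raise ValueError("request fails: no prompturl and no recurl")
    if request == "faxrecord":
        # Answer mode: receive into recurl, or be polled for prompturl.
        return "receive" if recurl else "poll"
    # Originate mode: send prompturl, or poll the remote into recurl.
    return "send" if prompturl else "remote poll"
```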

8. Response Attributes and Return Codes

8.1 Mechanism

   The Media Server acknowledges receipt of an application request by
   sending a response of either 200 OK or 415 Unsupported Media Type.
   (The latter is sent when the SIP request contains a content type
   other than "application/sdp" or
   "application/mediaservercontrol+xml".)
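
   The acknowledgment rule can be sketched as follows.  The function
   name acknowledge is hypothetical; comparing only the media type
   (ignoring case and any parameters such as charset) is an assumption
   of this sketch.

```python
ACCEPTED_TYPES = ("application/sdp", "application/mediaservercontrol+xml")


def acknowledge(content_type):
    """Return the SIP status code used to acknowledge a request.

    Hypothetical helper: compares only the media type, ignoring case
    and any parameters that follow a semicolon.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return 200 if media_type in ACCEPTED_TYPES else 415
```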

   The <response> message is transported in a SIP INFO request.

   If there is an error in the request or the request cannot be
   completed, the Media Server sends the <response> message immediately
   after receiving the request.  If the request is able to proceed, the
   Media Server sends the <response> when the operation terminates, and
   the <response> contains the final status information listed below.

8.2 <response> Attributes

   If the request specified an ID, the response echoes the ID.



Van Dyke, et al.       Expires September 10, 2004              [Page 27]


Internet-Draft                   MSCML                        March 2004


   The "code" is the result code for the request.  It can take the
   following values.
   o  200 indicates command completed.
   o  400 for <playrecord>, <faxrecord>, and <faxplay> indicates command
      not accepted due to an error. The text attribute describes the
      cause of the error.
   o  501 for <playrecord>, <faxrecord>, and <faxplay> indicates an
      error because the media server does not support the URL type
      specified.
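
   The following Python sketch shows one way an application might read
   the "code" attribute from a <response> and map it to the meanings
   listed above.  It is not a normative implementation.  The function
   name, the version value "1.0", the id "rec-1", and the text
   "Bad recurl" are hypothetical values chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Meanings of the result codes listed in Section 8.2.
CODE_MEANINGS = {
    "200": "command completed",
    "400": "command not accepted due to an error",
    "501": "URL type not supported",
}


def summarize_response(xml_text):
    """Extract the main attributes of an MSCML <response>."""
    root = ET.fromstring(xml_text)
    response = root.find("response")
    if response is None:
        raise ValueError("message does not contain a <response>")
    code = response.get("code")
    return {
        "request": response.get("request"),
        "id": response.get("id"),  # echoed only if the request set it
        "code": code,
        "meaning": CODE_MEANINGS.get(code, "unknown"),
        "text": response.get("text"),
    }


# Hypothetical message; attribute values are illustrative assumptions.
example = (
    '<MediaServerControl version="1.0">'
    '<response request="playrecord" id="rec-1" code="400" '
    'text="Bad recurl"/>'
    '</MediaServerControl>'
)
summary = summarize_response(example)
```

   Here summary["meaning"] reports that the command was not accepted,
   and summary["text"] carries the descriptive text for the error.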

   The "digits" attribute carries the digits returned for <playcollect>
   and <playrecord>.  Its value is the collected digits, if any.

   The "reason" is why the command terminated.  For all requests, the
   reason "stopped" indicates that a <stop> request, another command, or
   a re-INVITE with hold media stopped the request.

   For the <play> request, the "EOF" reason means the media server
   played out to the end of the file.

   For the <playcollect> request, a reason of "match" means a match was
   found; "timeout" means no digit was received before the time-out
   timer expired; "returnkey" and "escapekey" mean the return key or
   escape key, respectively, terminated the operation; and "interrupted"
   means another request interrupted the <playcollect> request.

   For the <playrecord> request, a reason of "digit" means a digit was
   detected; "end_silence" means the recording terminated because the
   trailing silence timer expired; "init_silence" means that no voice
   was detected; "max_duration" means the recording terminated because
   the maximum recording time elapsed; "escapekey" means the user
   entered the escape key in either play or record mode, thus
   terminating the recording; and "error" means a general operation
   failure occurred.

   For the <faxplay> and <faxrecord> requests, a reason of "complete"
   means successful completion, i.e., a DCN was received, even if there
   were bad lines or minor negotiation problems; "disconnect" means the
   session was disconnected; and "notfax" means no DIS or DCS was
   received on the connection.
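
   The reason values above can be collected into a lookup table.  The
   Python sketch below is one possible arrangement; the names REASONS
   and is_known_reason are hypothetical, and the table lists only the
   values named in this section.

```python
# Reason values per request type, as listed in Section 8.2.
REASONS = {
    "play": {"EOF"},
    "playcollect": {
        "match", "timeout", "returnkey", "escapekey", "interrupted",
    },
    "playrecord": {
        "digit", "end_silence", "init_silence", "max_duration",
        "escapekey", "error",
    },
    "faxplay": {"complete", "disconnect", "notfax"},
    "faxrecord": {"complete", "disconnect", "notfax"},
}


def is_known_reason(request, reason):
    """Check whether a reason is one this section lists for a request."""
    # "stopped" applies to all requests.
    if reason == "stopped":
        return True
    return reason in REASONS.get(request, set())
```

   An application could use such a check to log unexpected reason
   values rather than to reject responses outright.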

   The "reclength" is the length of the recording in bytes for a
   <playrecord>.

   The "text" is the descriptive text associated with the response code.

   For the <faxplay> and <faxrecord> requests, the "faxcode" attribute
   is the bitwise OR of the following bit masks.  A value of 0, with no
   bits set, indicates that the operation failed.

            +------+--------------------------------------+
            | Mask | Description                          |
            +------+--------------------------------------+
            | 0    | Operation Failed                     |
            | 1    | Operation Succeeded                  |
            | 2    | Partial Success                      |
            | 4    | Image received and placed in recurl  |
            | 8    | Image sent from prompturl            |
            | 16   | rmtid did not match                  |
            | 32   | Error reading prompturl              |
            | 64   | Error writing recurl                 |
            | 128  | Negotiation failure on send phase    |
            | 256  | Negotiation failure on receive phase |
            | 512  | Reserved                             |
            | 1024 | Irrecoverable IP packet loss         |
            | 2048 | Line errors in received image        |
            +------+--------------------------------------+
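
   As a non-normative illustration, the Python sketch below decodes a
   faxcode value into the descriptions from the table above.  It
   assumes that faxcode is carried as a decimal string (the schema
   types it only as xs:string) and treats the value 0 as "Operation
   Failed", since a mask of 0 cannot be tested with a bitwise AND.  The
   names FAXCODE_BITS and decode_faxcode are hypothetical.

```python
# Bit masks and descriptions copied from the faxcode table.
FAXCODE_BITS = [
    (1, "Operation Succeeded"),
    (2, "Partial Success"),
    (4, "Image received and placed in recurl"),
    (8, "Image sent from prompturl"),
    (16, "rmtid did not match"),
    (32, "Error reading prompturl"),
    (64, "Error writing recurl"),
    (128, "Negotiation failure on send phase"),
    (256, "Negotiation failure on receive phase"),
    (512, "Reserved"),
    (1024, "Irrecoverable IP packet loss"),
    (2048, "Line errors in received image"),
]


def decode_faxcode(faxcode):
    """Return the descriptions of all bits set in a faxcode value."""
    value = int(faxcode)  # assumption: decimal string
    if value == 0:
        return ["Operation Failed"]
    return [name for mask, name in FAXCODE_BITS if value & mask]
```

   For instance, a faxcode of "5" decodes to "Operation Succeeded" and
   "Image received and placed in recurl".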


9. Formal Syntax

   The following syntax specification is an XML Schema.  It defines the
   structure of MSCML messages, which are XML [4] documents.

9.1 Schema

   <?xml version="1.0" encoding="UTF-8"?>
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
              elementFormDefault="qualified">
     <xs:annotation>
       <xs:documentation>MediaServerControl XML Schema (MSCML)
   Copyright (c) 2001-2004, SnowShore Networks, Inc.
   All Rights Reserved, except as offered by RFC 3667</xs:documentation>
     </xs:annotation>
     <xs:simpleType name="yesno">
       <xs:restriction base="xs:NMTOKEN">
         <xs:enumeration value="yes"/>
         <xs:enumeration value="no"/>
       </xs:restriction>
     </xs:simpleType>
     <xs:simpleType name="yn">
       <xs:annotation>
         <xs:documentation>MSCML historically only allowed "yes" and
          "no" for boolean fields, as that made them more readable.
          The "yn" construction allows for both "yes" and "no" as well
          as boolean values going forward.</xs:documentation>
       </xs:annotation>
       <xs:union memberTypes="xs:boolean yesno"/>
     </xs:simpleType>
     <xs:element name="MediaServerControl">
       <xs:annotation>
         <xs:documentation>There are three MSCML messages: request,
         response, and notification.</xs:documentation>
       </xs:annotation>
       <xs:complexType>
         <xs:choice>
           <xs:element name="request">
             <xs:annotation>
               <xs:documentation>MSCML action request.
               </xs:documentation>
             </xs:annotation>
             <xs:complexType>
               <xs:choice>
                 <xs:element name="configure_conference">
                   <xs:annotation>
                     <xs:documentation>Configure conference as an entire
                      object.</xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:sequence>
                       <xs:element ref="inputgain" minOccurs="0"/>
                       <xs:element ref="outputgain" minOccurs="0"/>
                       <xs:element name="subscribe" minOccurs="0">
                         <xs:complexType>
                           <xs:sequence>
                             <xs:element name="events">
                               <xs:complexType>
                                 <xs:sequence>
                                   <xs:element ref="activetalkers"/>
                                 </xs:sequence>
                               </xs:complexType>
                             </xs:element>
                           </xs:sequence>
                         </xs:complexType>
                       </xs:element>
                     </xs:sequence>
                     <xs:attribute name="id" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="reservedtalkers"
                                   type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="reserveconfmedia" type="yn"
                                   use="optional"/>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="configure_leg">
                   <xs:annotation>
                     <xs:documentation>Configure conference leg.
                     </xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:sequence>
                       <xs:element ref="inputgain" minOccurs="0"/>
                       <xs:element ref="outputgain" minOccurs="0"/>
                     </xs:sequence>
                     <xs:attribute name="id" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="type" use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "talker"
                         </xs:documentation>
                       </xs:annotation>
                       <xs:simpleType>
                         <xs:restriction base="xs:NMTOKEN">
                           <xs:enumeration value="talker"/>
                           <xs:enumeration value="listener"/>
                         </xs:restriction>
                       </xs:simpleType>
                     </xs:attribute>
                     <xs:attribute name="mixmode" use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "full".
                         </xs:documentation>
                       </xs:annotation>
                       <xs:simpleType>
                         <xs:restriction base="xs:NMTOKEN">
                           <xs:enumeration value="full"/>
                           <xs:enumeration value="mute"/>
                           <xs:enumeration value="preferred"/>
                           <xs:enumeration value="parked"/>
                         </xs:restriction>
                       </xs:simpleType>
                     </xs:attribute>
                     <xs:attribute name="dtmfclamp" type="yn"
                                   use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "yes"
                         </xs:documentation>
                       </xs:annotation>
                     </xs:attribute>
                     <xs:attribute name="toneclamp" type="yn"
                                   use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "yes"



Van Dyke, et al.       Expires September 10, 2004              [Page 31]


Internet-Draft                   MSCML                        March 2004


                         </xs:documentation>
                       </xs:annotation>
                     </xs:attribute>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="play">
                   <xs:annotation>
                     <xs:documentation>Plays an audio prompt without
                      barge-in or digit collection. Play implicitly
                      subscribes to a "complete" event which is
                      generated when the specified prompt has finished
                      playing.</xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:sequence minOccurs="0">
                       <xs:element ref="prompt"/>
                     </xs:sequence>
                     <xs:attribute name="id" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="prompturl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="offset" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="promptencoding"
                                   type="xs:string"
                                   use="optional"/>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="playcollect">
                   <xs:annotation>
                     <xs:documentation>Plays an audio prompt,
                      collects DTMF digits, and returns the digits
                      to the application.  May also be used to collect
                      digits only if no specified sequence.
                      Playcollect implicitly subscribes to a
                      "complete" event which is normally generated
                      when the desired digits have been collected or a
                      timeout has expired.</xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:sequence>
                       <xs:element ref="prompt" minOccurs="0"/>
                       <xs:element name="regex" minOccurs="0">
                         <xs:complexType>
                           <xs:attribute name="type" use="optional"
                                         default="regex">
                             <xs:simpleType>
                               <xs:restriction base="xs:NMTOKEN">
                                 <xs:enumeration value="regex"/>
                                 <xs:enumeration value="mgcpdigitmap"/>
                               </xs:restriction>
                             </xs:simpleType>
                           </xs:attribute>
                           <xs:attribute name="value" type="xs:string"
                                         use="required"/>
                           <xs:attribute name="id" type="xs:string"
                                         use="optional"/>
                         </xs:complexType>
                       </xs:element>
                     </xs:sequence>
                     <xs:attribute name="id" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="prompturl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="offset" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="barge" type="yn"
                                   use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "yes".
                         </xs:documentation>
                       </xs:annotation>
                     </xs:attribute>
                     <xs:attribute name="promptencoding"
                                   type="xs:string" use="optional"/>
                     <xs:attribute name="cleardigits" type="yn"
                                   use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "yes".
                         </xs:documentation>
                       </xs:annotation>
                     </xs:attribute>
                     <xs:attribute name="maxdigits" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="firstdigittimer"
                                   type="xs:string" use="optional"/>
                     <xs:attribute name="interdigittimer"
                                   type="xs:string" use="optional"/>
                     <xs:attribute name="extradigittimer"
                                   type="xs:string" use="optional"/>
                     <xs:attribute name="skipinterval" type="xs:string"
                                   use="optional" default="6s"/>
                     <xs:attribute name="ffkey" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="rwkey" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="returnkey" type="xs:string"
                                   use="optional" default="#"/>
                     <xs:attribute name="escapekey" type="xs:string"
                                   use="optional" default="*"/>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="playrecord">
                   <xs:annotation>
                     <xs:documentation>Playrecord takes the audio
                      from the associated session and records it to
                      the location and in the format specified. It
                      generates a "response" message if the request
                      is in error, or when the recording session has
                      been interrupted by DTMF, the specified
                      duration has been exceeded, or a timeout has
                      expired.
                      The request has an optional prompt to be played
                      prior to the start of recording.
                     </xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:sequence minOccurs="0">
                       <xs:element ref="prompt"/>
                     </xs:sequence>
                     <xs:attribute name="id" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="prompturl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="offset" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="barge" type="yn"
                                   use="optional"/>
                     <xs:attribute name="cleardigits" type="yn"
                                   use="optional"/>
                     <xs:attribute name="escapekey" type="xs:string"
                                   use="optional" default="*"/>
                     <xs:attribute name="recurl" type="xs:string"
                                   use="required"/>
                     <xs:attribute name="mode" use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "overwrite".
                         </xs:documentation>
                       </xs:annotation>
                       <xs:simpleType>
                         <xs:restriction base="xs:NMTOKEN">
                           <xs:enumeration value="append"/>
                           <xs:enumeration value="overwrite"/>
                         </xs:restriction>
                       </xs:simpleType>
                     </xs:attribute>
                     <xs:attribute name="recencoding" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="initsilence" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="endsilence" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="duration" type="xs:string"
                                   use="optional"/>
                     <xs:attribute name="beep" type="yn" use="optional">
                       <xs:annotation>
                         <xs:documentation>Default is "yes".
                         </xs:documentation>
                       </xs:annotation>
                     </xs:attribute>
                     <xs:attribute name="recstopmask" type="xs:string"
                                   use="optional"
                                   default="0123456789*#"/>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="faxplay">
                   <xs:annotation>
                     <xs:documentation>Send (or reverse poll
                      receive) a fax.</xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:attribute name="lclid" type="xs:string"
                                   use="optional" default='""'/>
                     <xs:attribute name="prompturl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="recurl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="rmtid" type="xs:string"
                                   use="optional"/>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="faxrecord">
                   <xs:annotation>
                     <xs:documentation>Receive (or reverse poll
                      send) a fax.</xs:documentation>
                   </xs:annotation>
                   <xs:complexType>
                     <xs:attribute name="lclid" type="xs:string"
                                   use="optional" default='""'/>
                     <xs:attribute name="prompturl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="recurl" type="xs:anyURI"
                                   use="optional"/>
                     <xs:attribute name="rmtid" type="xs:string"
                                   use="optional"/>
                   </xs:complexType>
                 </xs:element>
                 <xs:element name="stop">
                   <xs:annotation>
                     <xs:documentation>Stops a play, playcollect,
                      or playrecord operation.</xs:documentation>
                   </xs:annotation>
                   <xs:complexType/>
                 </xs:element>
               </xs:choice>
             </xs:complexType>
           </xs:element>
           <xs:element name="response">
             <xs:annotation>
               <xs:documentation>Response to MSCML "request".
               </xs:documentation>
             </xs:annotation>
             <xs:complexType>
               <xs:attribute name="request" use="required">
                 <xs:simpleType>
                   <xs:restriction base="xs:NMTOKEN">
                     <xs:enumeration value="configure_conference"/>
                     <xs:enumeration value="configure_leg"/>
                     <xs:enumeration value="play"/>
                     <xs:enumeration value="playcollect"/>
                     <xs:enumeration value="playrecord"/>
                     <xs:enumeration value="faxplay"/>
                     <xs:enumeration value="faxrecord"/>
                     <xs:enumeration value="stop"/>
                   </xs:restriction>
                 </xs:simpleType>
               </xs:attribute>
               <xs:attribute name="id" type="xs:string"
                             use="optional"/>
               <xs:attribute name="code" type="xs:string"
                             use="required"/>
               <xs:attribute name="text" type="xs:string"
                             use="required"/>
               <xs:attribute name="reason" type="xs:string"
                             use="optional"/>
               <xs:attribute name="reclength" type="xs:string"
                             use="optional"/>
               <xs:attribute name="digits" type="xs:string"
                             use="optional"/>
               <xs:attribute name="playduration" type="xs:string"
                             use="optional">
                 <xs:annotation>
                   <xs:documentation>How far into the object the
                    play/playrecord/playcollect got in milliseconds.
                   </xs:documentation>
                 </xs:annotation>
               </xs:attribute>
               <xs:attribute name="faxcode" type="xs:string"
                             use="optional"/>
               <xs:attribute name="pages_sent" type="xs:string"
                             use="optional"/>
               <xs:attribute name="pages_recv" type="xs:string"
                             use="optional"/>
             </xs:complexType>
           </xs:element>
           <xs:element name="notification">
             <xs:annotation>
               <xs:documentation>Mid-session MSCML event notification.
               </xs:documentation>
             </xs:annotation>
             <xs:complexType>
               <xs:sequence>
                 <xs:element name="conference">
                   <xs:complexType>
                     <xs:sequence>
                       <xs:element ref="activetalkers"/>
                     </xs:sequence>
                     <xs:attribute name="uniqueid" type="xs:string"
                                   use="required">
                       <xs:annotation>
                         <xs:documentation>The right-hand-side of the
                          left-hand-side of the conference SIP
                          Request-URI, i.e., the unique conference
                          identifier.</xs:documentation>
                       </xs:annotation>
                     </xs:attribute>
                     <xs:attribute name="numtalkers" type="xs:string"
                                   use="required"/>
                   </xs:complexType>
                 </xs:element>
               </xs:sequence>
             </xs:complexType>
           </xs:element>
         </xs:choice>
         <xs:attribute name="version" type="xs:string" use="required"/>
       </xs:complexType>
     </xs:element>
     <xs:element name="activetalkers">
       <xs:annotation>
         <xs:documentation>As part of a request, this element
          requests periodic lists of active talkers. As part
          of a notification, the talker elements indicate who
          is talking.
         </xs:documentation>
       </xs:annotation>
       <xs:complexType>
         <xs:sequence minOccurs="0">
           <xs:element name="talker" maxOccurs="unbounded">
             <xs:annotation>
               <xs:documentation>The talker element only makes sense
                as part of a notification. MSCML ignores talker
                elements in requests.</xs:documentation>
             </xs:annotation>
             <xs:complexType>
               <xs:attribute name="callid" type="xs:string"
                             use="required"/>
             </xs:complexType>
           </xs:element>
         </xs:sequence>
         <xs:attribute name="report" type="yn" use="optional">
           <xs:annotation>
             <xs:documentation>The default is no reporting
              (report="no").</xs:documentation>
           </xs:annotation>
         </xs:attribute>
         <xs:attribute name="interval" type="xs:string" use="optional">
           <xs:annotation>
             <xs:documentation>Acceptable values for interval
              are 1 to 60 (seconds).</xs:documentation>
           </xs:annotation>
         </xs:attribute>
       </xs:complexType>
     </xs:element>
     <xs:element name="prompt">
       <xs:annotation>
         <xs:documentation>Note that only one of the "gain_level"
          or "gain_delta" attributes can appear in a prompt tag.
         </xs:documentation>
       </xs:annotation>
       <xs:complexType>
         <xs:choice maxOccurs="unbounded">
           <xs:element name="audio">
             <xs:annotation>
               <xs:documentation>The "encoding" attribute is required
                for files that are not in .au or .wav format.
               </xs:documentation>
             </xs:annotation>
             <xs:complexType>
               <xs:attribute name="url" type="xs:anyURI"
                             use="required"/>
               <xs:attribute name="encoding" type="xs:string"
                             use="optional"/>
             </xs:complexType>
           </xs:element>
           <xs:element name="variable">
             <xs:complexType>
               <xs:attribute name="type" use="required">
                 <xs:simpleType>
                   <xs:restriction base="xs:NMTOKEN">
                     <xs:enumeration value="date"/>
                     <xs:enumeration value="digit"/>
                     <xs:enumeration value="duration"/>
                     <xs:enumeration value="month"/>
                     <xs:enumeration value="money"/>
                     <xs:enumeration value="number"/>
                     <xs:enumeration value="silence"/>
                     <xs:enumeration value="string"/>
                     <xs:enumeration value="time"/>
                     <xs:enumeration value="weekday"/>
                   </xs:restriction>
                 </xs:simpleType>
               </xs:attribute>
               <xs:attribute name="subtype" use="optional">
                 <xs:simpleType>
                   <xs:restriction base="xs:NMTOKEN">
                     <xs:enumeration value="mdy"/>
                     <xs:enumeration value="dmy"/>
                     <xs:enumeration value="ymd"/>
                     <xs:enumeration value="t12"/>
                     <xs:enumeration value="t24"/>
                     <xs:enumeration value="USD"/>
                     <xs:enumeration value="gen"/>
                     <xs:enumeration value="ndn"/>
                     <xs:enumeration value="crd"/>
                     <xs:enumeration value="ord"/>
                   </xs:restriction>
                 </xs:simpleType>
               </xs:attribute>
               <xs:attribute name="value" type="xs:string"
                             use="required"/>
             </xs:complexType>
           </xs:element>
         </xs:choice>
         <xs:attribute name="locale" type="xs:string"
                       use="optional"/>
         <xs:attribute name="baseurl" type="xs:string"
                       use="optional"/>
         <xs:attribute name="gain_level" type="xs:string"
                       use="optional">
           <xs:annotation>
             <xs:documentation>Range from -18dB to +12dB, default is 0.
             </xs:documentation>
           </xs:annotation>
         </xs:attribute>
         <xs:attribute name="gain_delta" type="xs:string"
                       use="optional">
           <xs:annotation>
             <xs:documentation>Range from -18dB to +18dB, default is 0.
             </xs:documentation>
           </xs:annotation>
         </xs:attribute>
       </xs:complexType>
     </xs:element>
     <xs:element name="inputgain">
       <xs:complexType>
         <xs:choice>
           <xs:element ref="auto"/>
           <xs:element ref="fixed"/>
         </xs:choice>
       </xs:complexType>
     </xs:element>
     <xs:element name="outputgain">
       <xs:complexType>
         <xs:choice>
           <xs:element ref="auto"/>
           <xs:element ref="fixed"/>
         </xs:choice>
       </xs:complexType>
     </xs:element>
     <xs:element name="auto">
       <xs:complexType>
         <xs:attribute name="startlevel" type="xs:string"
                       use="optional"/>
         <xs:attribute name="targetlevel" type="xs:string"
                       use="optional"/>
         <xs:attribute name="silencethreshold" type="xs:string"
                       use="optional"/>
       </xs:complexType>
     </xs:element>
     <xs:element name="fixed">
       <xs:complexType>
         <xs:attribute name="level" type="xs:string" use="optional"/>
       </xs:complexType>
     </xs:element>
   </xs:schema>
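
   To show how the schema above maps onto a concrete message, the
   Python sketch below builds a <playcollect> request and interprets
   "yn" attribute values.  It is an illustration under stated
   assumptions, not a reference implementation: the version value
   "1.0", the function names, and the prompt URL are hypothetical.
   The boolean spellings accepted by parse_yn are the lexical forms of
   xs:boolean ("true", "false", "1", "0") plus "yes" and "no".

```python
import xml.etree.ElementTree as ET


def build_playcollect(request_id, prompt_url, maxdigits):
    """Serialize a minimal <playcollect> request."""
    # Assumption: "1.0" as the version; the schema only requires that
    # a version string be present.
    root = ET.Element("MediaServerControl", {"version": "1.0"})
    request = ET.SubElement(root, "request")
    playcollect = ET.SubElement(request, "playcollect", {
        "id": request_id,
        "maxdigits": str(maxdigits),
        "returnkey": "#",   # schema default, written out explicitly
        "escapekey": "*",   # schema default, written out explicitly
    })
    prompt = ET.SubElement(playcollect, "prompt")
    ET.SubElement(prompt, "audio", {"url": prompt_url})
    return ET.tostring(root, encoding="unicode")


def parse_yn(value):
    """Map a "yn" attribute value to a Python bool."""
    if value in ("yes", "true", "1"):
        return True
    if value in ("no", "false", "0"):
        return False
    raise ValueError("not a yn value: %r" % value)


# Hypothetical prompt location used only for illustration.
message = build_playcollect(
    "pc-1", "http://prompts.example.net/enter-pin.wav", 4)
```

   The resulting string can be carried as the body of a SIP request
   with the content type "application/mediaservercontrol+xml".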


10. IANA Considerations

10.1 IANA Registration of MIME media type application/
     mediaservercontrol+xml

   MIME media type name: application
   MIME subtype name: mediaservercontrol+xml
   Required parameters: none
   Optional parameters: charset

      charset This parameter has identical semantics to the charset
         parameter of the "application/xml" media type as specified in
         XML Media Types [5].

   Encoding considerations: See RFC3023 [5].

   Interoperability considerations: See RFC3023 [5] and this document.

   Published specification: This document.

   Applications which use this media type: Multimedia, enhanced
   conferencing and interactive applications.

   Intended usage: COMMON
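
   As a non-normative illustration of the registration above, the
   following Python sketch builds and parses a Content-Type value for an
   MSCML body using the standard email.message module.  The function
   names are hypothetical; only the media type name and the optional
   charset parameter come from this registration.

```python
from email.message import Message

MSCML_TYPE = "application/mediaservercontrol+xml"


def content_type_header(charset=None):
    """Build a Content-Type value; charset is the only parameter."""
    if charset is None:
        return MSCML_TYPE
    return f"{MSCML_TYPE}; charset={charset}"


def parse_content_type(value):
    """Return (media_type, charset) for a Content-Type value.

    The media type is lower-cased; charset is None when absent.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_param("charset")


def is_mscml(value):
    """True if the Content-Type value names the MSCML media type."""
    media_type, _charset = parse_content_type(value)
    return media_type == MSCML_TYPE
```

   A receiver could use is_mscml() to decide whether a message body
   should be handed to the MSCML parser at all, and pass the charset, if
   present, to its XML decoder as described in RFC3023 [5].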

11. Security Considerations

   Because media flows through a media server in a conference, the media
   server itself MUST protect the integrity, confidentiality, and
   security of the sessions.  A conference participant acting on her own
   behalf should not be able to "tap in" to another conference without
   proper authorization.

   Because conferencing is a high-value application, the media server
   SHOULD implement appropriate security measures.  These include, but
   are not limited to, access lists for application servers.  That is,
   only a select list of application or proxy servers is allowed to
   create conferences, invite participants to sessions, and so on.  Note
   that the mechanisms for such security, such as private networks,
   shared certificates, and MAC white/black lists, are beyond the scope
   of this document.
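
   Purely as a non-normative sketch of what an access list for
   application servers might look like, the following Python fragment
   admits requests only from configured networks.  The class name, the
   choice of source IP address as the identifying attribute, and the
   example addresses (taken from documentation ranges) are all
   assumptions of this sketch and are not defined by MSCML.

```python
import ipaddress


class AccessList:
    """Allow list of application or proxy server networks."""

    def __init__(self, allowed_networks):
        # Accepts CIDR strings such as "192.0.2.0/24"; a bare address
        # is treated as a single-host network.
        self._networks = [
            ipaddress.ip_network(net) for net in allowed_networks
        ]

    def permits(self, source_address):
        """True if the address falls inside any configured network."""
        addr = ipaddress.ip_address(source_address)
        return any(addr in net for net in self._networks)


# Hypothetical configuration: one IPv4 and one IPv6 network.
acl = AccessList(["192.0.2.0/24", "2001:db8::/32"])
```

   A media server using such a list would evaluate acl.permits() on the
   source of each request to create a conference or add a participant
   and reject requests from any other source.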

   Because MSCML is an XML markup, all of the security considerations of
   RFC3023 [5] apply.

Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Groves, C., Pantaleo, M., Anderson, T. and T. Taylor, "Gateway
        Control Protocol Version 1", RFC 3525, June 2003.

   [3]  "Network call signalling protocol for the delivery of
        time-critical services over cable television networks using
        cable modems", ITU-T J.162, March 2001.

   [4]  Thompson, H., Beech, D., Maloney, M. and N. Mendelsohn, "XML
        Schema Part 1: Structures", W3C REC REC-xmlschema-1-20010502,
        May 2001.

   [5]  Murata, M., St. Laurent, S. and D. Kohn, "XML Media Types", RFC
        3023, January 2001.

Informative References

   [6]   Van Dyke, J., Burger (Ed.), E. and A. Spitzer, "Basic Network
         Media Services with SIP", draft-burger-sipping-netann-06 (work
         in progress), January 2003.

   [7]   Johnston, A. and O. Levin, "Session Initiation Protocol Call
         Control - Conferencing for User Agents",
         draft-ietf-sipping-cc-conferencing-02 (work in progress),
         October 2003.

   [8]   Mahy, R., "A Call Control and Multi-party usage framework for
         the Session Initiation Protocol (SIP)",
         draft-ietf-sipping-cc-framework-02 (work in progress), March
         2003.

   [9]   McGlashan, S., Burnett, D., Danielsen, P., Ferrans, J., Hunt,
         A., Karam, G., Ladd, D., Lucas, B., Porter, B., Rehor, K. and
         S. Tryphonas, "Voice Extensible Markup Language (VoiceXML)
         Version 2.0", W3C LastCall WD-voicexml20-20020424, April 2002.

   [10]  ISC, "ISC Reference Architecture V1.2", June 2002.

   [11]  Donovan, S., "The SIP INFO Method", RFC 2976, October 2000.

   [12]  Burger, E. and M. Dolly, "Keypad Stimulus Protocol (KPML)",
         draft-ietf-sipping-kpml-01 (work in progress), October 2003.

   [13]  Campbell, B., "Instant Message Sessions in SIMPLE",
         draft-ietf-simple-message-sessions-00 (work in progress), May
         2003.

   [14]  Hollenbeck, S., Rose, M. and L. Masinter, "Guidelines for the
         Use of Extensible Markup Language (XML) within IETF Protocols",
         BCP 70, RFC 3470, January 2003.

   [15]  Rosenberg, J. and H. Schulzrinne, "A Session Initiation
         Protocol (SIP) Event Package for Conference State",
         draft-ietf-sipping-conference-package-00 (work in progress),
         June 2002.

   [16]  "Procedures for document facsimile transmission in the general
         switched telephone network", ITU-T Recommendation T.30, April
         1999.

   [17]  "Procedures for real-time Group 3 facsimile communication over
         IP networks", ITU-T Recommendation T.38, March 2002.

   [18]  Klensin, J., "Simple Mail Transfer Protocol", RFC 2821, April
         2001.


Authors' Addresses

   Jeff Van Dyke
   SnowShore Networks, Inc.
   285 Billerica Rd.
   Chelmsford, MA  01824-4120
   USA

   EMail: jvandyke@snowshore.com


   Eric Burger (editor)
   SnowShore Networks, Inc.
   285 Billerica Rd.
   Chelmsford, MA  01824-4120
   USA

   EMail: e.burger@ieee.org


   Andy Spitzer
   SnowShore Networks, Inc.
   285 Billerica Rd.
   Chelmsford, MA  01824-4120
   USA

   EMail: woof@snowshore.com

Appendix A. Contributors

   Jeff Van Dyke, Andy Spitzer, and Terence Lobo at SnowShore Networks,
   Inc. were responsible for the concept, development, documentation,
   and execution of MSCML.  The IVR implementation was influenced by
   original work by Andy Spitzer while he was at The Telephone
   Connection, Inc.

   Cliff Schornak of Commetrex and Jeff Van Dyke developed the facsimile
   service.

   Terence Lobo, Srinivas Motamarri, Haj Elfadil, and Edwina Nowicki
   contributed by being the first to eat what got cooked up.

Appendix B. Acknowledgements

   The following individuals significantly assisted in the development,
   direction, or, most importantly, debugging of MSCML:
   o  Gaurav Srivastva and Subhash Verma from BayPackets
   o  Jon Hinckley from SkyWave/Sestro
   o  Wesley Hicks, Ravindra Kabre, and Kevin Summers from Sonus
      Networks
   o  Diana Rawlins and Sharadha Vijay from WorldCom
   o  Tim Wong from Z-Tel
   o  Stephane Bastien from BroadSoft
   o  Kevin Flemming for his feedback on the semantics of creation
      versus configuration for conferencing.

   The authors would like to thank Scotty Farber, technical writer
   extraordinaire, who turned our techno-geek into English.


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the IETF's procedures with respect to rights in IETF Documents can
   be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

   The IETF has been notified of intellectual property rights claimed in
   regard to some or all of the specification contained in this
   document. For more information consult the online list of claimed
   rights.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.