Network Working Group                                        J. Van Dyke
Internet Draft                                           E. Burger (ed.)
Document: draft-burger-sipping-netann-03.txt                  A. Spitzer
Category: Standards Track                       SnowShore Networks, Inc.
Expires: May 2003                                            W. O'Connor
                                                        November 2, 2002


                 Basic Network Media Services with SIP


Status of this Memo
   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 [1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts. Internet-Drafts are draft documents valid for a maximum of
   six months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Abstract

   In SIP-based networks, there is a need to provide basic network
   media services.  Such services include network announcements, user
   interaction, conferencing, and transcoding services.  These services
   are basic building blocks, from which one can construct interesting
   applications.  In order to have interoperability between servers
   offering these building blocks (also known as Media Servers) and
   application developers, one needs to be able to locate and invoke
   such services in a well-defined manner.

   This document describes a mechanism for providing an interoperable
   protocol interface between Application Servers, which provide
   application services to SIP-based networks, and Media Servers, which
   provide the basic media processing building blocks.









Burger, et. al.            Expires 5/2/2003                          1

                    Network Announcements with SIP       November 2002


   Table of Contents

   1. Conventions used in this document..............................2
   2. Overview.......................................................2
   3. Mechanism......................................................3
   4. Announcement Service...........................................5
     4.1. Operation..................................................7
     4.2. Established Call Announcement..............................7
          4.2.1. Description.........................................7
          4.2.2. Protocol Diagram....................................8
     4.3. Early Media Announcement...................................8
          4.3.1. Description.........................................8
          4.3.2. Protocol Diagram....................................9
     4.4. Formal Syntax..............................................9
   5. Prompt and Collect Service....................................10
     5.1. Explicit Service..........................................11
     5.2. Formal Syntax for Explicit Service........................11
   6. Conference Service............................................11
     6.1. Protocol Diagram..........................................12
     6.2. Formal Syntax.............................................14
   7. Transcoding Service...........................................14
     7.1. Trans-Coding Overview.....................................14
     7.2. Media Server Interface....................................14
     7.3. Call Flows................................................15
          7.3.1. Trans-coding bridge................................16
          7.3.2. URI Parameter Method...............................16
          7.3.3. Message Flow.......................................18
     7.4. Formal Syntax.............................................21
   8. The User Part.................................................21
   9. Security Considerations.......................................23
   10. References...................................................23
   11. Changes......................................................24
     11.1. Changes Made in Version 02...............................24
     11.2. Changes Made in Version 01...............................24
   12. Acknowledgments..............................................25
   13. Author's Addresses...........................................25


1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119 [2].


2. Overview

   In SIP-based media networks [3], there is a need to provide basic
   network media services.  Such services include playing
   announcements, initiating a media mixing session (conference),
   transcoding a stream, and prompting and collecting information with
   a user.


   These services are basic in nature, are few in number, and
   fundamentally have not changed in 25 years of enhanced telephony
   services.  Moreover, given their elemental nature, one would not
   expect them to change in the future.

   Announcements are media played to the user.  Announcements can be
   static media files, media files generated in real-time, media
   streams generated in real-time, or combinations of the above.

   In some situations, one must play the announcement without providing
   an answer indication.  In others, one must play the announcement
   after completing call setup.  This document describes how to provide
   such announcements in a SIP-based network.  The method described
   here is a media server service instance.

   Media mixing is the act of mixing different RTP streams, as
   described in [4].  Note that the service described here will suffice
   for simple mixing of media for a basic conferencing service.  One
   can create a complete conferencing service using this basic building
   block.  However, this service does not address the interesting
   application-level issues such as conferencing, floor control, etc.

   Transcoding is the act of taking an RTP stream coded with one codec
   and playing it as a new RTP stream coded with another codec.  For
   example, one can take a G.711-encoded stream and transcode it to
   G.729e.  In addition, the mechanism described here satisfies the
   transcoding requirements for hearing-impaired users [5].

   Prompt and collect is where the server prompts the user for some
   information, as in an announcement, and then collects the user's
   response.  This can be a one-step interaction, for example by
   playing an announcement, "Please enter your passcode", followed by
   collecting a string of digits.  It can also be a more complex
   interaction, specified, for example, by VoiceXML [6].


3. Mechanism

   In the context of SIP control of media servers, we take advantage of
   the fact that the standard SIP URI has a user part.  Media servers
   do not have a concept of a user.  Thus we use the user address, or
   the left-hand-side of the URI, as a service indicator.

   Note that the set of services is small, well-defined, and well-
   contained.  The section "The User Part", Section 8 below, discusses
   the issues with using a fixed set of user-space names.

   For per-service security, the media server MAY use any of the
   security protocols described in [3].

   The media server MAY issue 401 challenges for authentication.



   The media server, upon receiving the INVITE, notes the service
   indicator.  Depending on the service indicator, the media server
   will either honor the request or return a failure response code.

   The service indicator is the concatenation of the service name and
   an optional service instance identifier, separated by an equal sign.

   Per SIP, the service indicator is case insensitive.  The service
   name MUST be from the set of alphanumeric characters plus dash
   (US-ASCII %2D).  The service name MUST NOT include an equal sign
   (US-ASCII %3D).

   The service name MAY have long- and short-forms, as SIP does for
   headers.

   A given service indicator MAY have an associated set of parameters.
   Such parameters MUST follow the convention set out for SIP URI
   parameters.  That is, a semi-colon separated list of keyword=values.

   Certain services may have an association with a unique service
   instance on the media server.  For example, a given media server can
   host multiple, separate conference sessions.  To identify unique
   service instances, a unique identifier modifies the service name.
   The unique identifier MUST meet the rules for a legal user part of a
   SIP URI.  An equal sign, US-ASCII %3D, MUST separate the service
   name from the unique identifier.

   Note that since the service indicator is case insensitive, the
   service instance identifier is also case insensitive.
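
   The rules above can be sketched in a few lines of Python.  This is
   an illustration only, not part of the protocol: the names
   ServiceIndicator and parse_service_indicator are hypothetical, and
   the sketch does not check that the instance identifier is a legal
   SIP user part.

```python
import re
from typing import NamedTuple, Optional

# Service names: alphanumeric characters plus dash, as stated above.
_SERVICE_NAME = re.compile(r"[A-Za-z0-9-]+")


class ServiceIndicator(NamedTuple):
    name: str                    # e.g. "annc", "conf", "dialog"
    instance_id: Optional[str]   # e.g. "123" in "conf=123", else None


def parse_service_indicator(user_part: str) -> ServiceIndicator:
    """Split a SIP URI user part into service name and instance id.

    Hypothetical helper.  The input is lowercased because the draft
    treats the service indicator as case insensitive.
    """
    name, sep, instance = user_part.lower().partition("=")
    if not _SERVICE_NAME.fullmatch(name):
        raise ValueError(f"invalid service name: {name!r}")
    if sep and not instance:
        raise ValueError("empty service instance identifier")
    return ServiceIndicator(name, instance if sep else None)
```

   For example, "Conf=ABC123" yields the service name "conf" with the
   instance identifier "abc123", while "annc" yields the service name
   "annc" with no instance identifier.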

   The requesting client issues a SIP INVITE to the media server,
   specifying the requested service and any appropriate parameters.

   If the media server can perform the requested service, it does so,
   following the processing steps described in the service definition
   document (see IANA Considerations, below).

   If the media server cannot perform the requested service or does not
   recognize the service indicator, it MUST respond with the response
   code 488 NOT ACCEPTABLE HERE.  This is appropriate, as 488 applies
   only to the specific resource addressed by the Request URI.
   Moreover, 606 is not appropriate, as some other media server may be
   able to satisfy the request.  [3] describes the 488 and 606 response
   codes.

   Some services require a unique identifier.  Most services
   automatically create a service instance upon the first INVITE with
   the given identifier.  However, if a service requires an existing
   service instance, and no such service instance exists on the media
   server, the media server MUST respond with the response code 404 NOT
   FOUND.  This is appropriate as the service itself exists on the
   media server, but the particular service instance does not.  It is
   as if the user was not home.
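
   As an illustration, the choice between 488 and 404 described above
   might look as follows.  This is a sketch under assumptions: the
   function name failure_code and the shape of its arguments are
   hypothetical, and a real media server would also consider resource
   availability and authentication.

```python
from typing import Dict, Optional, Set


def failure_code(name: str, instance_id: Optional[str],
                 services: Dict[str, bool],
                 live: Set[str]) -> Optional[int]:
    """Return 488, 404, or None (meaning: process the INVITE).

    `services` maps each supported service name to True when that
    service requires an already-existing instance.  `live` holds the
    identifiers of the instances that currently exist.
    """
    if name not in services:
        return 488  # Not Acceptable Here: unknown or unsupported service
    if services[name] and instance_id not in live:
        return 404  # Not Found: service exists, this instance does not
    return None
```

   Services that create an instance on the first INVITE map to False
   in the table, so the 404 branch never applies to them.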


4. Announcement Service

   A network announcement is the delivery of an audio resource, such as
   a prompt file, to a terminal device.

   There are two types of network announcements.  The differentiating
   characteristic between the two types is whether the network fully
   sets up the call before playing the announcement.  The analog in the
   PSTN is whether answer supervision is supplied; i.e., whether the
   announcement server answers the call prior to delivering the
   announcement.

   Playing an announcement after call setup is straightforward.  First,
   the requesting device issues an INVITE to the media server
   requesting the announcement service.  The media server negotiates
   the SDP and responds with a 200 OK.  After receiving the ACK from
   the requesting device, the media server plays the requested prompt
   and issues a BYE to the requesting device.

   In replicating and expanding on the existing telephone network,
   there is a need to play announcements during call setup.  That is,
   the network delivers media to the caller before the setup completes.
   Network operators need this capability to provide informational
   network announcements, such as "The person you are trying to reach
   is unavailable.  Good Bye." or "We are sorry, but all circuits are
   busy.  Please try your call again later.  Good Bye."

   Note that simply redirecting the caller to a media server, with the
   media server issuing a 200 OK response, is not appropriate.  The
   call has not completed successfully.  To support the appropriate
   paradigm, the media server issues a 100 TRYING response, followed
   immediately by a 183 SESSION PROGRESS response with SDP.  This
   enables the media server to send early media to the caller.  The
   media server sends the requested audio.  After playing the audio,
   the media server issues a 487 REQUEST TERMINATED response code to
   the requesting device.

   If the media server does not support announcements, it MUST respond
   with the 488 NOT ACCEPTABLE HERE response code.

   If the media server supports announcements, but it cannot find the
   referenced URI, it MUST respond with the 404 NOT FOUND response
   code.

   If the media server receives an INVITE for the announcement service
   without a "play=" parameter, it MUST respond with the 404 NOT FOUND
   response code, as there is no default value for the announcement
   service.

   If there is an error retrieving the announcement, the media server
   MUST respond with an appropriate 4xx error code reflecting the
   error.

   The Request URI fully describes the announcement service through the
   use of the user part of the address and additional URI parameters.
   The user portion of the address, "annc", specifies the announcement
   service on the media server.  The service has several associated URI
   parameters that control the content and delivery of the
   announcement.  These parameters are described below:

   "play=" specifies the audio resource or announcement sequence to be
   played.

   "early=" Specifies whether early media treatment is desired.

   "repeat=" Specifies how many times the media server should repeat
   the announcement or sequence named by the "play=" parameter.

   "delay=" Specifies a delay interval between announcement
   repetitions.  The delay is measured in milliseconds.

   "duration=" Specifies the maximum duration of the announcement.  The
   media server will discontinue the announcement and end the call if
   the maximum duration has been reached.  The duration is measured in
   milliseconds.

   "locale=" Specifies the language and country variant of the
   announcement sequence named in the "play=" parameter.  The language
   is defined as a two letter code per ISO 639 [7].  The country
   variant is also defined as a two letter code per ISO 3166 [8].
   These elements are concatenated with a single underbar (%x5F)
   character.

   "param[n]=" Provides a mechanism for passing values that are to be
   substituted into an announcement sequence.  Up to 9 parameters
   ("param1=" through "param9=") may be specified.

   The "play=" parameter is mandatory and MUST be present.  All other
   parameters are OPTIONAL.
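
   The following Python sketch shows one way a media server might
   collect these parameters after splitting them out of the Request
   URI.  The class and function names are hypothetical.  The sketch
   assumes the caller has already split the URI on ";" and "=" and
   lowercased the keywords.  The "early=" default of "yes" comes from
   Section 4.1; the draft states no defaults for the other optional
   parameters, so they are left unset here.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AnnouncementRequest:
    play: str                          # mandatory
    early: bool = True                 # default "yes" per Section 4.1
    repeat: Optional[int] = None       # the draft states no default
    delay_ms: Optional[int] = None
    duration_ms: Optional[int] = None
    locale: Optional[str] = None
    params: Dict[int, str] = field(default_factory=dict)  # param1..param9


def parse_annc_params(raw: Dict[str, str]) -> AnnouncementRequest:
    """Build an AnnouncementRequest from already-split URI parameters."""
    if "play" not in raw:
        # The draft answers a missing "play=" with 404 NOT FOUND.
        raise LookupError("404 NOT FOUND: play= is mandatory")
    early = raw.get("early", "yes").lower()
    if early not in ("yes", "no"):
        raise ValueError(f"early= must be yes or no, got {early!r}")
    request = AnnouncementRequest(play=raw["play"], early=(early == "yes"))
    for keyword, attr in (("repeat", "repeat"), ("delay", "delay_ms"),
                          ("duration", "duration_ms")):
        if keyword in raw:
            if not re.fullmatch(r"[0-9]+", raw[keyword]):
                raise ValueError(f"{keyword}= must be one or more digits")
            setattr(request, attr, int(raw[keyword]))
    request.locale = raw.get("locale")
    for n in range(1, 10):
        if f"param{n}" in raw:
            request.params[n] = raw[f"param{n}"]
    return request
```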

   NOTE: Some encodings are not self-describing.  Should we specify
   something like content-type?  Alternatively, how about a "media="
   parameter?

   The form of the SIP Request URI for announcements is as follows.
   Note that the backslash, CRLF, and spacing before the "play=" is for
   readability purposes only.

        sip:annc@ms2.carrier.net; \
          play="http://audio.carrier.net/allcircuitsbusy.g711"; \
            early=yes

        sip:annc@ms2.carrier.net; \
          play="file://fileserver.carrier.net/geminii/yourHoroscope.wav"


4.1. Operation

   The scenarios below assume there is a SIP Proxy, application server,
   or SoftSwitch between the caller and the media server.  However, the
   announcement service works as described below even if the caller
   invokes the service directly.  We chose to discuss the proxy case,
   as it will be the most common case.

   As described above, the "early=" parameter determines whether the
   media server plays the prompt after call setup or as early media.
   The default value for the "early=" parameter MUST be "yes".  That
   is, the default action is for the media server to play the prompt
   before establishing the call.  We envision that this service
   will be most commonly used for network announcements that require
   early media; hence, that is the default behavior.

4.2. Established Call Announcement

4.2.1. Description

   The caller issues an INVITE to the serving SIP Proxy.  The SIP Proxy
   determines what audio prompt to play to the caller.  The proxy
   responds to the caller with 100 TRYING.

   The proxy issues an INVITE to the media server, with the prompt to
   play coded in the "play=" parameter.  The INVITE
   MUST contain the parameter "early=no" to invoke the Established Call
   Prompting service.  The media server responds with 200 OK.  The
   proxy sends a 200 OK to the caller.  The caller then issues an ACK.
   The proxy then issues an ACK to the media server.

   With the call set up, the media server plays the requested prompt.
   When the media server completes the play of the prompt, it issues a
   BYE to the proxy.  The proxy then issues a BYE to the caller.

4.2.2. Protocol Diagram

   Caller                   Proxy                 Media Server
     |   INVITE               |                        |
     |----------------------->|   INVITE               |
     |   100 TRYING           |----------------------->|
     |<-----------------------|   200 OK               |
     |   200 OK               |<-----------------------|
     |<-----------------------|                        |
     |   ACK                  |                        |
     |----------------------->|   ACK                  |
     |                        |----------------------->|
     |                        |                        |
     |              Play Announcement (RTP)            |
     |<================================================|
     |                        |                        |
     |                        |   BYE                  |
     |   BYE                  |<-----------------------|
     |<-----------------------|                        |
     |   200 OK               |    200 OK              |
     |----------------------->|----------------------->|
     |                        |                        |


4.3. Early Media Announcement

4.3.1. Description

   The caller issues an INVITE to the serving SIP Proxy.  Normally, the
   SIP Proxy would complete the call to the requested destination.
   However, if the destination is not available, the proxy will request
   a media server to play an audio prompt to the caller.  The proxy
   responds with a 100 TRYING.

   The proxy issues an INVITE to the media server, requesting the
   appropriate prompt to play.  The INVITE MUST contain the parameter
   "early=yes" or omit the "early=" parameter to invoke the Early Media
   Prompting service.  The media server responds with 100 TRYING
   followed by 183 SESSION PROGRESS.  At that point, the media server
   sends the announcement to the caller.  The document [3] describes
   the 183 SESSION PROGRESS result code.

   As stated above, if the Media Server cannot fetch the URI in the
   "play=" parameter, the Media Server will reply with a 404 NOT FOUND.
   Otherwise, after the media server completes the streaming of the
   prompt, it MUST send a 487 REQUEST TERMINATED to the Proxy.

   Note: When the early media service is used, the requester is
   implicitly asking the media server to cancel the transaction as soon
   as the announcement is played.  Since 487 is associated with an
   explicit CANCEL request, it seems appropriate for this use as well.



   The proxy sends the appropriate error response to the caller.  That
   could be 487 or any other appropriate code reflective of the failure
   situation.
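
   The difference between the two announcement modes, as seen from the
   media server, can be summarized in a short Python sketch.  The
   function name and the "<RTP announcement>" marker are hypothetical;
   messages the media server receives (INVITE, ACK, 200 OK to its BYE)
   are omitted.

```python
from typing import List


def media_server_messages(early: bool,
                          prompt_found: bool = True) -> List[str]:
    """List, in order, what the media server sends for one request."""
    if not prompt_found:
        # The "play=" URI could not be fetched.
        return ["404 NOT FOUND"]
    if early:
        # Early media: no answer indication; the transaction ends
        # with a final error response after the prompt.
        return ["100 TRYING", "183 SESSION PROGRESS",
                "<RTP announcement>", "487 REQUEST TERMINATED"]
    # Established call: answer first, play, then hang up.
    return ["200 OK", "<RTP announcement>", "BYE"]
```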

4.3.2. Protocol Diagram

   Caller                   Proxy                 Media Server
     |   INVITE               |                        |
     |----------------------->|   INVITE               |
     |   100 TRYING           |----------------------->|
     |<-----------------------|   100 TRYING           |
     |                        |<-----------------------|
     |                        |   183 SESSION PROGRESS |
     |   183 SESSION PROGRESS |<-----------------------|
     |<-----------------------|                        |
     |                        |                        |
     |              Play Announcement (RTP)            |
     |<================================================|
     |                        | 487 REQUEST TERMINATED |
     | 487 REQUEST TERMINATED |<-----------------------|
     |<-----------------------|                        |
     |   ACK                  |    ACK                 |
     |----------------------->|----------------------->|
     |                        |                        |


4.4. Formal Syntax

   The following syntax specification uses the augmented Backus-Naur
   Form (BNF) as described in RFC-2234 [9].

   ANNC-URL        = "sip:" annc-ind "@" hostport
                       annc-parameters

   annc-ind        = "annc"

   annc-parameters = ";" play-param [ ";" early-param ]
        [ ";" delay-param] [ ";" duration-param ] [ ";" repeat-param ]
        [ ";" locale-param ] [ ";" variable-params ]

   play-param      = "play=" prompt-url

   early-param     = "early=" ( "yes" / "no" )

   delay-param     = "delay=" delay-value

   delay-value     = 1*DIGIT

   duration-param  = "duration=" duration-value

   duration-value  = 1*DIGIT

   repeat-param    = "repeat=" repeat-value

   repeat-value    = 1*DIGIT

   locale-param    = "locale=" locale-value

   locale-value    = 2ALPHA %x5F 2ALPHA

   variable-params = param-name "=" variable-value

   param-name      = "param" DIGIT ; e.g., "param1"

   variable-value  = 1*(ALPHA / DIGIT)

   The locale-value consists of a 2 letter language code as specified
   in ISO 639 [7] and a 2 letter country code specified in ISO 3166 [8]
   separated by a single underbar (%x5F) character.

   The definition of hostport is as specified by [3].

   The syntax of prompt-url consists of a URL scheme as specified by
   [10] or a special token indicating a provisioned announcement
   sequence.  We expect the URL to be one of the following schemes.
      o http
      o ftp
      o file (referencing a local or NFS (RFC 2224) location)

   If a provisioned announcement sequence is to be played the value of
   prompt-url will have the following form:
   prompt-url      = "/provisioned/" announcement-id

   announcement-id = 1*(ALPHA / DIGIT)

   This document is strictly focused on the SIP interface for the
   announcement service and as such does not detail how announcement
   sequences are provisioned or defined.

   Note that the media type of the object the prompt-url refers to can
   be most anything, including audio file formats, text file formats,
   URI lists, or even VoiceXML scripts.  See the Prompt and Collect
   Service section below for more on this topic.
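
   As a rough check of two of the rules above, the locale-value and
   the provisioned form of prompt-url translate directly into regular
   expressions.  The following Python sketch is illustrative only; the
   helper names are hypothetical, and it does not validate the http,
   ftp, or file forms of prompt-url.

```python
import re
from typing import Optional

# locale-value = 2ALPHA %x5F 2ALPHA
_LOCALE_VALUE = re.compile(r"[A-Za-z]{2}_[A-Za-z]{2}")

# prompt-url = "/provisioned/" announcement-id, where the
# announcement-id is one or more letters or digits.
_PROVISIONED = re.compile(r"/provisioned/([A-Za-z0-9]+)")


def is_valid_locale(value: str) -> bool:
    """True if value matches the locale-value rule."""
    return _LOCALE_VALUE.fullmatch(value) is not None


def provisioned_id(prompt_url: str) -> Optional[str]:
    """Return the announcement-id, or None for any other prompt-url."""
    match = _PROVISIONED.fullmatch(prompt_url)
    return match.group(1) if match else None
```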


5. Prompt and Collect Service

   This service is also known as a dialog.  It establishes an aural
   dialog with the user.

   There is an implicit prompt and collect service and an explicit
   prompt and collect service.  The implicit service leverages the fact
   that the prompt URI of the play= parameter for the annc service can
   be any media type.  The explicit service allows for more flexibility
   in script management.


5.1. Explicit Service

   The dialog service follows the model of the announcement service.
   However, the service indicator is "dialog".  The dialog service
   takes a parameter, voicexml=, indicating the URI of the VoiceXML
   script to execute.

        sip:dialog@mediaserver.carrier.net;voicexml=dialog-uri

   A Media Server MAY accept additional SIP request URI parameters and
   deliver them to the VoiceXML interpreter session as session
   variables.

5.2. Formal Syntax for Explicit Service

   The following syntax specification uses the augmented Backus-Naur
   Form (BNF) as described in RFC-2234 [9].

   DIALOG-URL        = "sip:" dialog-ind "@" hostport
                          dialog-parameters

   dialog-ind        = "dialog"

   dialog-parameters = ";" dialog-param [ vxml-parameters ]

   dialog-param      = "voicexml=" dialog-url

   vxml-parameters   = vxml-param [ vxml-parameters ]

   vxml-param        = ";" vxml-keyword "=" vxml-value


   The dialog-url is the URI of the VoiceXML script.  If present, other
   parameters get passed to the VoiceXML interpreter session with the
   assigned vxml-keyword vxml-value pairs.  Note that all vxml-keywords
   MUST have values.

   If the Media Server does not support the passing of keyword-value
   pairs to the VoiceXML interpreter session, it MUST ignore the
   parameters.
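
   A requesting client might assemble the dialog Request URI along the
   following lines.  This Python sketch follows the example in Section
   5.1 and uses the "voicexml=" keyword shown there.  The function
   name and the sample script location are hypothetical, and escaping
   of reserved characters in parameter values is omitted for brevity.

```python
from typing import Mapping, Optional


def build_dialog_uri(host: str, script_uri: str,
                     session_vars: Optional[Mapping[str, str]] = None
                     ) -> str:
    """Return a dialog service Request URI for the given media server."""
    parts = [f"sip:dialog@{host}", f"voicexml={script_uri}"]
    for keyword, value in (session_vars or {}).items():
        if not value:
            # Every additional keyword MUST carry a value.
            raise ValueError(f"parameter {keyword!r} has no value")
        parts.append(f"{keyword}={value}")
    return ";".join(parts)
```

   For instance, passing the host "mediaserver.carrier.net", a script
   at "http://vxml.example.net/pin.vxml", and the session variable
   lang=en produces a URI with two parameters after the host.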


6. Conference Service

   One identifies mixing sessions through their SIP request URIs.  To
   create a mixing session, one sends an INVITE to a request URI that
   represents the session.  If the URI does not already exist on the
   media server and the requested resources are available, the media
   server creates a new mixing session.  If there is an existing URI
   for the session, then the media server interprets it as a request
   for the new session to join the existing session.  The form of the
   SIP request URI for conferencing is:

        sip:conf=uniqueIdentifier@mediaserver.carrier.net

   The string "conf=uniqueIdentifier" is the user part of both the
   request URI and the To header.  The host portion of the URI
   identifies a particular media server.  The "conf=" portion of the
   user part conveys to the media server that this is a request for the
   mixing service.  The
   uniqueIdentifier can be any value that is compliant with the SIP URI
   specification.  It is the responsibility of the conference control
   application to ensure the identifier is unique within the scope of
   any potential conflict.

   It is worth noting that the conference URI shared between the
   application and the media server provides enhanced security, as the
   SIP control interface does not have to be exposed to participants.
   It also allows the assignment of a specific media server to be
   delayed as long as possible, thereby simplifying resource management.

   One can add additional legs to the conference by INVITEing them to
   the above mentioned request URI.  Conversely, one can remove legs by
   issuing a BYE in the corresponding dialog.  The mixing session, and
   thus the conference-specific request URI, remains active so long as
   there is at least one SIP dialog associated with the given request
   URI.
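
   The lifetime rule above (create on the first INVITE, join on later
   INVITEs, tear down when the last dialog ends) can be sketched as a
   small registry.  This Python sketch is illustrative only: the class
   MixerRegistry and its methods are hypothetical, dialogs are
   represented by opaque strings, and resource checks are omitted.

```python
from typing import Dict, Set


class MixerRegistry:
    """Track which SIP dialogs belong to which mixing session."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Set[str]] = {}

    def invite(self, conf_id: str, dialog_id: str) -> bool:
        """Add a leg; return True if this INVITE created the session."""
        key = conf_id.lower()  # the identifier is case insensitive
        created = key not in self._sessions
        self._sessions.setdefault(key, set()).add(dialog_id)
        return created

    def bye(self, conf_id: str, dialog_id: str) -> bool:
        """Remove a leg; return True if the session was torn down."""
        key = conf_id.lower()
        legs = self._sessions.get(key)
        if legs is None:
            return False
        legs.discard(dialog_id)
        if not legs:
            # No dialogs remain, so the request URI stops being active.
            del self._sessions[key]
            return True
        return False

    def legs(self, conf_id: str) -> int:
        """Number of dialogs currently in the session (0 if none)."""
        return len(self._sessions.get(conf_id.lower(), ()))
```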

6.1. Protocol Diagram

   This diagram shows the establishment of a three-way conference.
   This section is informative.

    P1       P2        P3         Application Server     Media Server
     |       |        |                  |                   |
     |  INVITE sip:public-conf@as.c.net  |                   |
     |---------------------------------->| INVITE sip:conf=123@ms.c.net
     |       |        |                  |------------------>|
     |       |        |                  | 200 OK            |
     |  200 OK        |                  |<------------------|
     |<----------------------------------|                   |
     |       |        | RTP w/ P1        |                   |
     |<=====================================================>|
     |       |        |                  |                   |
     |  INVITE sip:public-conf@as.c.net  |                   |
     |       |-------------------------->| INVITE sip:conf=123@ms.c.net
     |       |        |                  |------------------>|
     |       |        |                  | 200 OK            |
     |       | 200 OK |                  |<------------------|
     |       |<--------------------------|                   |
     |       |        |                  |                   |
     |       |        | RTP w/ P1+P2     |                   |
     |       |        |<====================================>|
     |       |        |                  |                   |
     |  INVITE sip:public-conf@as.c.net  |                   |
     |       |        |----------------->| INVITE sip:conf=123@ms.c.net
     |       |        |                  |------------------>|
     |       |        |                  | 200 OK            |
     |       |        | 200 OK           |<------------------|
     |       |        |<-----------------|                   |
     |       |        |                  |                   |
     |       |        |                  |  RTP w/ P1+P2+P3  |
     |       |        |                  |<=================>|
     |       |        |                  |                   |

   Note that the above call flow does not show any 100 TRYING messages
   that would typically flow from the Application Server to the UACs,
   nor does it show the ACKs from the UACs to the Application Server
   or from the Application Server to the Media Server.

   Each leg can drop out either under the supervision of the UAC, by
   the UAC sending a BYE, or under the supervision of the Application
   Server, by the Application Server issuing a BYE.  In the first case,
   the Application Server issues a BYE to the Media Server on behalf of
   the UAC; in the second, it issues the BYE to the Media Server
   directly.

   How the Application Server can mute legs, create side conferences,
   and so forth is left as an exercise for the reader.

   Note that the Application Server is a server to the participants
   (UACs).  However, the Application Server is a client for mixing
   services to the Media Server.



6.2. Formal Syntax

   The following syntax specification uses the augmented Backus-Naur
   Form (BNF) as described in RFC-2234 [9].

   CONF-URL        = "sip:" conf-ind "=" instance-id "@" hostport

   conf-ind        = "conf"

   instance-id     = token


7. Transcoding Service

7.1. Trans-Coding Overview

   The media server provides an interface that enables a SIP UA to
   request conversion of RTP media from one form to another.  It relies
   on the sending/receiving UA or on a SIP proxy or application server
   to determine when trans-coding services are needed and to coordinate
   the signaling with the media server and other SIP endpoints.

   SIP UAs or applications may require trans-coding services in at
   least two scenarios.  The first occurs when two end devices do not
   share a common codec and therefore need a third-party translator to
   communicate.  In this scenario, one of the end devices would bring
   the media server into the call.  The second scenario is one of two
   peered networks, each of which mandates use of different codecs for
   their own operational reasons.  Calls that cross network boundaries
   require trans-coding services.  In this case, the end devices will
   likely not be aware of the operational requirements and a proxy or
   application server will bring the media server into the call.

   The trans-coding scenarios require that the end device or
   application server act as a back-to-back user agent (B2BUA).  This
   enables the entity requesting trans-coding services to coordinate
   SIP sessions between other end devices and the media server.

7.2. Media Server Interface

   There are two alternative approaches to providing a trans-coding
   service.  The first, and conceptually simplest, is a trans-coding
   bridge.  The signaling is similar to that used in conferencing
   scenarios.  The media server associates the input and output streams
   from the two endpoints using an application-supplied unique
   identifier that the Request URI carries.  This approach has the
   advantage that the end device does not need to determine the
   trans-coding parameters.  One limitation of this approach is that
   both call legs must terminate on the same media server.

   The second alternative, the URI parameter method, takes advantage of
   the unidirectional nature of RTP streams to set up two completely
   separate trans-coding paths between the callers.  There is no
   association between the call legs, so the end device must specify
   the trans-coding parameters in the Request URI.  An advantage of
   this approach is that one can use a different media server for each
   trans-coding path.

   SIP UAs that desire trans-coding services send a SIP INVITE to a
   Request URI whose user part begins with "xcod".  This conveys to
   the media server that trans-coding services are requested.  The
   remainder of the URI format depends upon whether the bridge or URI
   parameter method is desired.

   For the bridge method, the Request URI must contain a unique
   identifier that associates both call legs.  The URI takes the form:

        sip:xcod=uniqueID@mediaserver.provider.net

   where the uniqueID is supplied by the end device or controlling
   application.  SIP Call-IDs are globally unique, so the Call-ID for
   the first leg could potentially be used for this parameter.
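   The following Python sketch shows one way a controlling application
   might derive the identifier from the Call-ID of the first leg.
   Hashing the Call-ID is an assumption made for this example, chosen
   because a Call-ID such as 125@1.2.3.4 contains an "@", which cannot
   appear unescaped in the user part.  The function name
   bridge_request_uri is hypothetical.

```python
import hashlib


def bridge_request_uri(call_id, media_server):
    """Build a bridge-method Request URI with an identifier derived
    from the Call-ID of the first call leg."""
    # Hexadecimal digits are all legal in the user part, unlike the "@"
    # found in many Call-ID values.
    digest = hashlib.sha256(call_id.encode("utf-8")).hexdigest()
    unique_id = digest[:16]  # truncated for readability (assumption)
    return f"sip:xcod={unique_id}@{media_server}"
```

   Because the derivation is deterministic, the application can
   rebuild the same Request URI for the second leg from the Call-ID
   alone.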

   Since there is no association between the call legs in the URI
   parameter case, no unique identifier is needed.  However, the trans-
   coding parameters must be specified explicitly in the Request URI
   with URI parameters.  The URI takes the form:

        sip:xcod@mediaserver.provider.net;codec=g711;ptime=10

   The URI parameters codec and ptime describe the desired media format
   for input to the trans-coder.  The output format and destination IP
   address and port are defined by the SDP contained in the INVITE.  In
   its response, the media server returns SDP with a single media type
   matching the requested input format, along with the IP address and
   port number where it will receive that media.  The media server
   terminates RTP at this address:port, trans-codes it, and resends it
   to the output address:port.
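   As a minimal sketch of the division of labor described above, the
   following Python fragment builds the Request URI for one
   trans-coding path and summarizes what the media server is being
   asked to do.  The function names, the dictionary layout, and the
   already-parsed view of the SDP are assumptions made for this
   example.

```python
def xcod_request_uri(media_server, codec, ptime_ms):
    """Build a URI-parameter-method Request URI for one trans-coding path."""
    if ptime_ms <= 0:
        raise ValueError("ptime must be a positive number of milliseconds")
    return f"sip:xcod@{media_server};codec={codec};ptime={ptime_ms}"


def describe_path(codec, ptime_ms, offer):
    """Summarize one trans-coding request.

    `offer` is a hypothetical, already-parsed view of the SDP carried in
    the INVITE: a dict with "codec", "address", and "port" keys.
    """
    return {
        # Input side: taken from the URI parameters.
        "receive": {"codec": codec, "ptime_ms": ptime_ms},
        # Output side: taken from the SDP in the INVITE.
        "send": {
            "codec": offer["codec"],
            "to": (offer["address"], offer["port"]),
        },
    }
```

   For example, xcod_request_uri("mediaserver.provider.net", "g711",
   10) produces the URI shown above.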

   Because the Request URI signatures are different, a media server
   could support both trans-coding interfaces simultaneously.  Further
   discussions with customers and industry partners are needed to
   determine if there is demand for both methods or if one will
   suffice.

   The call flows below will further illustrate the use of both
   methods.

7.3. Call Flows

   The following call flows illustrate the use of the trans-coding
   interfaces described above.  In both scenarios, the end device
   receives a SIP INVITE containing SDP that it cannot support.  Rather
   than returning a 4XX class response, it uses third-party call
   control methods to bring a media server with trans-coding
   capabilities into the call.


7.3.1. Trans-coding bridge

   The following call flow depicts a trans-coding request utilizing the
   bridge signaling method.

   Caller (A)               Called (B)                 Media Server
      |                        |                             |
      |    INVITE (SDP A)      |                             |
      |----------------------->|                             |
      |    100 TRYING          |                             |
      |<-----------------------| INVITE sip:xcod=id (SDP B)  |
      |                        |---------------------------->|
      |                        | 200 OK (SDP M1)             |
      |                        |<----------------------------|
      |                        | ACK                         |
      |                        |---------------------------->|
      |                        |<========= RTP (B)==========>|
      |                        | INVITE sip:xcod=id (SDP A)  |
      |                        |---------------------------->|
      |                        | 200 OK (SDP M2)             |
      |    200 OK (SDP M2)     |<----------------------------|
      |<-----------------------|                             |
      |    ACK                 |                             |
      |----------------------->| ACK                         |
      |                        |---------------------------->|
      |<=================== RTP (A) * ======================>|


   * The Media Server implicitly transcodes between the associated
   legs.  At this point, the Media Server bridges the two legs.
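   One way a Media Server might associate the two legs of a bridge is
   sketched below in Python.  The class and method names are
   hypothetical, and each leg is represented by an opaque value
   standing in for its SDP; this document does not prescribe an
   implementation.

```python
class TranscodingBridge:
    """Pairs call legs that arrive with the same bridge identifier."""

    def __init__(self):
        self._legs = {}  # identifier -> list of leg descriptions

    def on_invite(self, identifier, leg):
        """Record a leg for `identifier`.

        Returns None while only one leg is present, and the pair of legs
        (in arrival order) once the second leg arrives.
        """
        legs = self._legs.setdefault(identifier, [])
        if len(legs) >= 2:
            raise ValueError(f"bridge {identifier!r} already has two legs")
        legs.append(leg)
        return tuple(legs) if len(legs) == 2 else None
```

   In the call flow above, the first INVITE (SDP B) leaves the bridge
   waiting, and the second INVITE (SDP A) with the same identifier
   completes the pair.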


7.3.2. URI Parameter Method

   The following depicts a trans-coding call-flow using the URI
   parameter method.

   Caller (A)               Called (B)                 Media Server
      |                        |                             |
      |  1. INVITE (SDP A)     |                             |
      |----------------------->|                             |
      |  2.  100 TRYING        |                             |
      |<-----------------------| 3. INVITE sip:xcod;codec=A  |
      |                        |---------------------------->| (1)
      |                        |           ;ptime=A (SDP B)  |
      |                        |                             |
      |                        | 4. 200 OK (SDP M1)          |
      |                        |<----------------------------| (2)
      |                        | 5. ACK                      |
      |                        |---------------------------->|
      |                        |                             |
      |                        | 6. INVITE sip:xcod;codec=B  |
      |                        |---------------------------->| (3)
      |                        |           ;ptime=B (SDP A)  |
      |                        |                             |
      |                        | 7. 200 OK (SDP M2)          |
      |  8.  200 OK (SDP M1)   |<----------------------------| (4)
      |<-----------------------|                             |
      |  9.  ACK               |                             |
      |----------------------->| 10. ACK                     |
      |                        |---------------------------->|
      |                        |                             |
      |==================== RTP (A) ========================>|\(5)
      |                        |<==== RTP (A in B format)====|/
      |                        |                             |
      |                        |===== RTP (B) ==============>|\(6)
      |<============ RTP (B in A format) ====================|/
      |                        |                             |

   Ladder diagram notes:
   (1)  Requests a session that can receive media A, transcode it to
        media format B, and send it to B's IP address:port as described
        in SDP B.
   (2)  Contains SDP with address:port for caller (A) to send to.
   (3)  Requests a session that can receive media B, transcode it to
        media format A, and send it to A's IP address:port as described
        in SDP A.
   (4)  Contains SDP with address:port for called party (B) to send to.
   (5)  Media Server loops RTP in media format A to B.
   (6)  Media Server loops RTP in media format B to A.
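   Notes (1) and (3) can be summarized by the following Python sketch,
   which computes the two one-way trans-coding requests from the two
   endpoints.  The Endpoint type, its field names, and the function
   name are hypothetical, as is any codec name other than g711.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str     # label used in the ladder diagram, e.g. "A"
    codec: str    # media format the endpoint sends and expects to receive
    address: str  # where the endpoint receives RTP, from its SDP
    port: int


def plan_transcoding_invites(caller, called):
    """Return the two one-way requests, in the order of notes (1) and (3)."""
    requests = []
    for source, sink in ((caller, called), (called, caller)):
        requests.append({
            "codec_param": source.codec,          # format the server receives
            "sdp": f"SDP {sink.name}",            # SDP carried in the INVITE
            "transcode": (source.codec, sink.codec),
            "send_to": (sink.address, sink.port),  # output address:port
        })
    return requests
```

   Each request is independent of the other, which is what allows the
   two paths to be placed on different Media Servers.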

   Note that messages 6, 7, and 10 can go to a different Media Server
   than 3, 4, and 5.  In this case, the second Media Server will do the
   B to A transcoding.


7.3.3. Message Flow

   Message 1

   INVITE sip:callee@company2.com SIP/2.0
   Via: SIP/2.0/UDP a.company1.com
   From: sip:caller@company1.com
   To: sip:callee@company2.com
   Call-ID: 125@1.2.3.4
   CSeq: 1 INVITE
   Contact: sip:caller@a.company1.com
   Content-Type: application/sdp
   Content-Length: XX

   <SDP A>



   Message 2

   SIP/2.0 100 Trying
   Via: SIP/2.0/UDP a.company1.com
   From: sip:caller@company1.com
   To: sip:callee@company2.com;tag=8abj8gh
   Call-ID: 125@1.2.3.4
   CSeq: 1 INVITE



   Message 3

   INVITE sip:xcod@mediaserver.carrier.net;codec=A;ptime=A SIP/2.0
   Via: SIP/2.0/UDP b.company2.com
   From: sip:callee@company2.com
   To: sip:xcod@mediaserver.carrier.net;codec=A;ptime=A
   Call-ID: 234@5.6.7.8
   CSeq: 1 INVITE
   Contact: sip:callee@b.company2.com
   Content-Type: application/sdp
   Content-Length: XX

   <SDP B>


   Message 4

   SIP/2.0 200 OK
   Via: SIP/2.0/UDP b.company2.com
   From: sip:callee@company2.com
   To: sip:xcod@mediaserver.carrier.net;codec=A;ptime=A;tag=9ab6g2
   Call-ID: 234@5.6.7.8
   CSeq: 1 INVITE
   Content-Type: application/sdp
   Content-Length: XX

   <SDP M1>



   Message 5

   ACK sip:xcod@mediaserver.carrier.net;codec=A;ptime=A SIP/2.0
   Via: SIP/2.0/UDP b.company2.com
   From: sip:callee@company2.com
   To: sip:xcod@mediaserver.carrier.net;codec=A;ptime=A;tag=9ab6g2
   Call-ID: 234@5.6.7.8
   CSeq: 1 ACK



   Message 6

   INVITE sip:xcod@mediaserver.carrier.net;codec=B;ptime=B SIP/2.0
   Via: SIP/2.0/UDP b.company2.com
   From: sip:callee@company2.com
   To: sip:xcod@mediaserver.carrier.net;codec=B;ptime=B
   Call-ID: 678@5.6.7.8
   CSeq: 1 INVITE
   Contact: sip:callee@b.company2.com
   Content-Type: application/sdp
   Content-Length: XX

   <SDP A>


   Message 7

   SIP/2.0 200 OK
   Via: SIP/2.0/UDP b.company2.com
   From: sip:callee@company2.com
   To: sip:xcod@mediaserver.carrier.net;codec=B;ptime=B;tag=7ab7gh
   Call-ID: 678@5.6.7.8
   CSeq: 1 INVITE
   Content-Type: application/sdp
   Content-Length: XX

   <SDP M2>


   Message 8

   SIP/2.0 200 OK
   Via: SIP/2.0/UDP a.company1.com
   From: sip:caller@company1.com
   To: sip:callee@company2.com;tag=8abj8gh
   Call-ID: 125@1.2.3.4
   CSeq: 1 INVITE
   Contact: sip:callee@b.company2.com
   Content-Type: application/sdp
   Content-Length: XX

   <SDP M1>



   Message 9

   ACK sip:callee@company2.com SIP/2.0
   Via: SIP/2.0/UDP a.company1.com
   From: sip:caller@company1.com
   To: sip:callee@company2.com;tag=8abj8gh
   Call-ID: 125@1.2.3.4
   CSeq: 1 ACK


   Message 10

   ACK sip:xcod@mediaserver.carrier.net;codec=B;ptime=B SIP/2.0
   Via: SIP/2.0/UDP b.company2.com
   From: sip:callee@company2.com
   To: sip:xcod@mediaserver.carrier.net;codec=B;ptime=B;tag=7ab7gh
   Call-ID: 678@5.6.7.8
   CSeq: 1 ACK



7.4. Formal Syntax

   The following syntax specification uses the augmented Backus-Naur
   Form (ABNF) as described in RFC 2234 [9].

   XCOD-URL        = "sip:" xcod-ind [ "=" instance-id ] "@" hostport
                       xcod-parameters

   xcod-ind        = "xcod"

   instance-id     = token

   xcod-parameters = *( ";" xcod-parameter )

   xcod-parameter  = codec-param / ptime-param

   The instance-id is present for the bridge method, and the
   xcod-parameters are present for the URI parameter method.

   Where codec-param is one of the RTP codec labels [verify source and
   input cross reference] and ptime-param is the packet time, in
   milliseconds.
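   As an illustration only, the following Python sketch parses both
   forms of the trans-coding Request URI shown in section 7.2.  It
   assumes that the instance-id appears only in the bridge form, that
   the parameters are written as "codec=" followed by a token and
   "ptime=" followed by decimal digits, and that hostport can be
   simplified as shown.  These assumptions and the name parse_xcod_url
   are not part of this specification.

```python
import re

# RFC 3261 "token" characters: alphanumerics plus - . ! % * _ + ` ' ~
TOKEN = r"[A-Za-z0-9\-.!%*_+`'~]+"

# Simplified hostport (assumption): host name or IPv4 address, optional port.
HOSTPORT = r"[A-Za-z0-9.\-]+(?::\d+)?"

XCOD_URL = re.compile(
    rf"sip:xcod(?:=(?P<instance_id>{TOKEN}))?@(?P<hostport>{HOSTPORT})"
    r"(?P<params>(?:;[^;=]+=[^;=]+)*)"
)


def parse_xcod_url(url):
    """Parse a trans-coding Request URI, or return None if it is not valid."""
    match = XCOD_URL.fullmatch(url)
    if match is None:
        return None
    params = {}
    # The params group is empty or looks like ";name=value;name=value".
    for item in match.group("params").split(";")[1:]:
        name, value = item.split("=")
        if name == "codec" and re.fullmatch(TOKEN, value):
            params["codec"] = value
        elif name == "ptime" and re.fullmatch(r"[0-9]+", value):
            params["ptime"] = int(value)  # packet time in milliseconds
        else:
            return None  # unknown parameter or malformed value
    return {
        "instance_id": match.group("instance_id"),
        "hostport": match.group("hostport"),
        "params": params,
    }
```

   A Media Server that supports both interfaces could use the
   presence of instance_id to select the bridge method and the
   presence of params to select the URI parameter method.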


8. The User Part

   There has been considerable debate about the wisdom of using fixed
   user parts in a request URI.  The most common objection is that the
   user part should be opaque and a local matter.  The other objection
   is that using a fixed user part removes those specified user
   addresses from the user address space.

   We will address the latter issue first.  The common example is the
   Postmaster address defined by RFC 2821 [11].  The objection is that
   by using the Postmaster token for something special, one makes that
   token unavailable to anyone else.  Thus, the Postmaster General of
   the United States, for example, cannot have the mail address
   Postmaster@usps.gov.  One may debate whether this is a significant
   limitation, however.

   One may point out that "annc", for example, has the potential for
   more conflict than Postmaster.  This is true.  However, one should
   not confuse the namespace at a Media Server with the namespace for
   an organization.

   For example, let us take the case where a network offers services
   for "Ann Charles".  She likes to use the name "annc", and thus she
   would like to use "sip:annc@provider.net".  We offer that there is
   ABSOLUTELY NO NAME COLLISION WHATSOEVER.  Why is this so?  This is
   so because sip:annc@provider.net will resolve to the specific user
   at a specific device for Ann.  As an example, provider.net's SIP
   Proxy Server can resolve sip:annc@provider.net to
   annc@anns-phone.provider.net.  One directs requests for the media
   service annc directly to the Media Server, e.g.,
   sip:annc@ms21.ap.provider.net.  Moreover, by definition, Ann
   Charles, or anything other than the announcement service, will NEVER
   be directly on the Media Server.  If that were not true, no phone in
   the world could use the user part "eburger", as eburger is a
   reserved user part in the SnowShore domain.

   The most important thing to note about this convention is that the
   left-hand side of the request URI is opaque to the network.  The
   only network elements that need to know about the convention are the
   Media Server and client.

   Some have proposed that such naming be a pure matter of local
   convention.  For example, the thesis of the informational RFC 3087
   [12] is that you can address services using a request URI.  However,
   some have taken the examples in that document to an extreme, namely,
   that the only way to address services is via arbitrary, opaque, long
   user parts.  It is possible to provision the service names rather
   than use fixed names.  While this can work in a closed network,
   where the Application Servers and Media Servers are in the same
   administrative domain, it does not work across domains.  This is
   because the client of the media service has to know the local name
   for each service / domain pair.  This is particularly onerous for
   situations where there is an ad hoc relationship between the
   application and the media service.  Without a well-known
   relationship between service and service address, how would the
   client locate the service?

   One very important result of using the user part as the service
   descriptor is that we can use all of the standard SIP machinery,
   without modification.  For example, Media Servers with different
   capabilities can use SIP REGISTER to register their capabilities as
   users.  A mixing-only device will register the "conf" user, while a
   multi-purpose Media Server will register all of the users.  Note
   that this is why the URI to play is a parameter.  Doing otherwise
   would overburden a normal SIP proxy or redirect server.  Likewise,
   this scheme lets us leverage the standard SIP proxy behavior of
   using an intelligent redirect server or proxy server to provide
   highly available services.  For example, two Media Servers can
   register with a SIP redirect server for the annc user.  If one of
   the Media Servers fails, its registration will expire and all
   requests for the announcement service ("calls to the annc user") get
   sent to the surviving Media Server.
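   The failover behavior described above can be sketched in Python as
   follows.  This is a toy model of a registrar, not SIP itself: the
   class name, the method names, the host names ms1 and ms2, and the
   3600 second default expiry are assumptions made for this example,
   and time is passed in explicitly so that the behavior is easy to
   follow.

```python
class Registrar:
    """Tracks which Media Servers are registered for each service user."""

    def __init__(self):
        self._bindings = {}  # user -> {contact: expiry time in seconds}

    def register(self, user, contact, now, expires=3600):
        """Add or refresh a binding that lasts `expires` seconds from `now`."""
        self._bindings.setdefault(user, {})[contact] = now + expires

    def lookup(self, user, now):
        """Return the contacts whose registration has not expired, sorted."""
        bindings = self._bindings.get(user, {})
        return sorted(c for c, expiry in bindings.items() if expiry > now)


registrar = Registrar()
registrar.register("annc", "sip:annc@ms1.provider.net", now=0)
registrar.register("annc", "sip:annc@ms2.provider.net", now=0)

# ms2 refreshes its registration; ms1 has failed and never refreshes.
registrar.register("annc", "sip:annc@ms2.provider.net", now=3000)
```

   A lookup for "annc" before the first registration expires returns
   both Media Servers; a lookup after it expires returns only the one
   that refreshed.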



9. Security Considerations

   Untrusted network elements could use the protocol described here for
   providing information services.  Many extant billing arrangements
   are for completed calls.  Successful call completion occurs with a
   2xx result code.  This can be an issue for the early media
   announcement service, and service providers should plan their
   network service offerings accordingly.

   Exposing network services with well-known addresses may not be
   desirable.  In this case, the Media Server should enforce a local
   policy, e.g., accept requests only from authorized clients.  Barring
   that, one can use a SIP Proxy to enforce the local policy.
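   As one possible form of such a local policy, the following Python
   sketch accepts requests only from a configured set of networks.
   The allow list, the example network, and the function name are
   assumptions made for this example; a deployment may well prefer
   SIP-level authentication over address checks.

```python
import ipaddress

# Hypothetical allow list: networks whose Application Servers may use
# the Media Server.  192.0.2.0/24 is a documentation-only network.
AUTHORIZED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_authorized(source_address, networks=AUTHORIZED_NETWORKS):
    """Return True if the request's source address is in an allowed network."""
    address = ipaddress.ip_address(source_address)
    return any(address in network for network in networks)
```

   A Media Server or SIP Proxy applying this policy would reject a
   request for which is_authorized returns False before invoking the
   requested service.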


10. References


   1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
      9, RFC 2026, October 1996.
      INFORMATIVE

   2  Bradner, S., "Key words for use in RFCs to Indicate Requirement
      Levels", BCP 14, RFC 2119, March 1997.
      NORMATIVE

   3  Rosenberg, J., et al., "SIP: Session Initiation Protocol", RFC
      3261, June 2002.
      NORMATIVE

   4  Schulzrinne, H., Casner, S., Frederick, R., and Jacobson, V.,
      "RTP: A Transport Protocol for Real-Time Applications", RFC 1889,
      January 1996.
      NORMATIVE

   5  Charlton, N., et al., "User Requirements for the Session
      Initiation Protocol (SIP) in support of deaf, hard of hearing and
      speech-impaired individuals", draft-ietf-sipping-deaf-req-03.txt,
      April 2002, work in progress.
      INFORMATIVE

   6  McGlashan, S., et al., "Voice Extensible Markup Language
      (VoiceXML) Version 2.0", http://www.w3.org/TR/voicexml20/, April
      2002.
      INFORMATIVE

   7  ISO 639, "Codes for the representation of names of languages",
      1998.
      NORMATIVE

   8  ISO 3166, "Codes for the representation of names of countries and
      their subdivisions", 1997.
      NORMATIVE

   9  Crocker, D. and Overell, P. (Editors), "Augmented BNF for Syntax
      Specifications: ABNF", RFC 2234, November 1997.
      NORMATIVE

   10 Berners-Lee, T., Fielding, R., and Masinter, L., "Uniform
      Resource Identifiers (URI): Generic Syntax", RFC 2396, August
      1998.
      NORMATIVE

   11 Klensin, J. (ed.), "Simple Mail Transfer Protocol", RFC 2821,
      April 2001.
      INFORMATIVE

   12 Campbell, B. and Sparks, R., "Control of Service Context using
      SIP Request-URI", RFC 3087, April 2001.
      INFORMATIVE


11. Changes
   <This section will be removed before final submission>

11.1. Changes Made in Version 03

   Removed Implicit Service.

   Separated Normative and Informative references.

11.2. Changes Made in Version 02

   Removed implicit play= operation in section 5.1.

11.3. Changes Made in Version 01

   This document underwent significant updating as a result of the Las
   Vegas Interim Workgroup Meeting.

   For the Announcement Service description:
      o Added duration, repeat, delay, locale and variable parameters.
      o Added the ability to reference a provisioned announcement.
      o Made early media treatment the default behavior for the
        service.
      o 487 REQUEST TERMINATED replaces 486 BUSY HERE as the media
        server's final response when early media treatment is desired.


12. Acknowledgments

   We would like to thank Kevin Summers and Ravindra Kabre of Sonus
   Networks for their constructive comments, as well as Jonathan
   Rosenberg of Dynamicsoft and Tim Melanchuk for their encouragement.
   In addition, the discussion at the Las Vegas Interim Workgroup
   Meeting in 2002 was invaluable for clearing up the issues
   surrounding the left-hand-side of the request URI.


13. Authors' Addresses

   Eric Burger (Editor)
   Andy Spitzer
   Jeff Van Dyke
   SnowShore Networks, Inc.
   285 Billerica Rd.
   Chelmsford, MA  01824-4120
   USA

   Phone: 978/367-8400
   Email: eburger@snowshore.com
   Email: woof@snowshore.com
   Email: jvandyke@snowshore.com


   Walter O'Connor
   Amherst, NH
   USA

   Email: woconnor@bit-net.com


Full Copyright Statement

   Copyright (C) The Internet Society (2001, 2002).  All Rights
   Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.  This
   document and the information contained herein is provided on an "AS
   IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
   FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT
   NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN
   WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Acknowledgement

   The Internet Society currently provides funding for the RFC Editor
   function.

   SnowShore Networks, Inc. is a member of the Internet Society.