Network Working Group P. Saint-Andre
Internet-Draft &yet
Intended status: Standards Track S. Ibarra
Expires: September 21, 2014 AG Projects
E. Ivov
Jitsi
March 20, 2014
Interworking between the Session Initiation Protocol (SIP) and the
Extensible Messaging and Presence Protocol (XMPP): Media Sessions
draft-ietf-stox-media-04
Abstract
This document defines a bidirectional protocol mapping for use by
gateways that enable the exchange of media signalling messages
between systems that implement the Session Initiation Protocol (SIP)
and systems that implement the Jingle extensions to the Extensible
Messaging and Presence Protocol (XMPP).
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 21, 2014.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
Saint-Andre, et al. Expires September 21, 2014 [Page 1]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Intended Audience . . . . . . . . . . . . . . . . . . . . . . 3
3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Compatibility with Offer/Answer Model . . . . . . . . . . . . 4
5. Syntax Mappings . . . . . . . . . . . . . . . . . . . . . . . 5
6. Call Hold . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7. Early Media . . . . . . . . . . . . . . . . . . . . . . . . . 11
8. Detecting Endless Loops . . . . . . . . . . . . . . . . . . . 12
9. SDP Format-Specific Parameters . . . . . . . . . . . . . . . 12
10. Dialog Forking . . . . . . . . . . . . . . . . . . . . . . . 13
11. Sample Scenarios . . . . . . . . . . . . . . . . . . . . . . 14
12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
13. Security Considerations . . . . . . . . . . . . . . . . . . . 22
14. References . . . . . . . . . . . . . . . . . . . . . . . . . 22
Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 25
1. Introduction
The Session Initiation Protocol [RFC3261] is a widely-deployed
technology for the management of media sessions (such as voice and
video calls) over the Internet. SIP itself provides a signalling
channel via TCP [RFC0793] or UDP [RFC0768], over which two or more
parties can exchange messages for the purpose of negotiating a media
session that uses a dedicated media channel such as the Real-time
Transport Protocol (RTP) [RFC3550]. The Extensible Messaging and
Presence Protocol (XMPP) [RFC6120] also provides a signalling
channel, typically via TCP (although HTTP and WebSocket bindings also
exist).
Given the significant differences between XMPP and SIP, traditionally
it was difficult to combine the two technologies in a single user
agent (although nowadays such implementations are not uncommon, as
described in [RFC7081]). As a result, in 2005 some developers
wishing to add media session capabilities to XMPP clients defined a
set of XMPP-specific session negotiation protocol extensions called
Jingle [XEP-0166].
Thankfully, Jingle was designed to easily map to SIP for
communication through gateways or other transformation mechanisms.
Nevertheless, given the significantly different technology
Saint-Andre, et al. Expires September 21, 2014 [Page 2]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
assumptions underlying XMPP and SIP, Jingle is naturally different
from SIP in several important respects:
o Base SIP messages and headers use a plaintext format similar in
some ways to the Hypertext Transport Protocol [RFC2616], whereas
Jingle messages are pure XML. Mappings between SIP headers and
Jingle message syntax are provided below.
o The SIP payloads defining session semantics use the Session
Description Protocol [RFC4566], whereas the equivalent Jingle
payloads are defined as XML child elements of the Jingle <content/
> element. However, the Jingle specifications defining such child
elements specify mappings to SDP for all Jingle syntax, making the
mapping relatively straightforward.
o The SIP signalling channel has historically often been transported
over UDP, whereas the signalling channel for Jingle is XMPP over
TCP. Mapping between the transport layers typically happens
within a gateway using techniques below the application level, and
therefore is not addressed in this specification.
Consistent with existing specifications for mapping between SIP and
XMPP (see [I-D.ietf-stox-core] and other related specifications),
this document describes a bidirectional protocol mapping for use by
gateways that enable the exchange of media signalling messages
between systems that implement SIP and systems that implement the
XMPP Jingle extensions.
It is important to note that SIP and Jingle sessions can be gatewayed
in a rather simple fashion if all media was always routed and
potentially even transcoded through a gateway. This specification
defines a mapping that allows gateways to (wherever possible) only
intervene at the signalling level, letting user agents exchange media
in an end-to-end manner. Such gateways focus on handling handling
RTP session establishment and control within the context of what
users would perceive as "calls". This document is hence primarily
dealing with calling scenarios as opposed to generic media sessions
with SIP.
The discussion venue for this document is the mailing list of the
STOX WG; visit https://www.ietf.org/mailman/listinfo/stox for
subscription information and discussion archives.
2. Intended Audience
The documents in this series are intended for use by software
developers who have an existing system based on one of these
technologies (e.g., SIP), and would like to enable communication from
Saint-Andre, et al. Expires September 21, 2014 [Page 3]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
that existing system to systems based on the other technology (e.g.,
XMPP). We assume that readers are familiar with the core
specifications for both SIP [RFC3261] and XMPP [RFC6120], with the
base document for this series [I-D.ietf-stox-core], and with the
following media-related specifications:
o RTP Profile for Audio and Video Conferences with Minimal Control
[RFC3551]
o The Secure Real-time Transport Protocol (SRTP) [RFC3711]
o SDP: Session Description Protocol [RFC4566]
o Interactive Connectivity Establishment (ICE): A Protocol for
Network Address Translator (NAT) Traversal for Offer/Answer
Protocols [RFC5245]
o Jingle [XEP-0166]
o Jingle RTP Sessions [XEP-0167]
o Jingle ICE-UDP Transport Method [XEP-0176]
o Jingle Raw UDP Transport Method [XEP-0177]
3. Terminology
A number of technical terms used here are defined in [RFC3261],
[RFC6120], [XEP-0166], and [XEP-0167]. The term "JID" is short for
"Jabber Identifier".
In flow diagrams, SIP traffic is shown using arrows such as "***>"
whereas XMPP traffic is shown using arrows such as "...>".
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
[RFC2119].
4. Compatibility with Offer/Answer Model
Even if Jingle semantics have many similarities with those used in
SIP, there are some use cases that cannot be handled in exactly the
same way due to the Offer/Answer model used in SIP in conjunction
with SDP.
More specifically, mapping SIP and SDP Offer/Answer to XMPP is often
complicated due to the difference in how each handles backward
Saint-Andre, et al. Expires September 21, 2014 [Page 4]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
compatibility. Jingle, as most other XMPP extensions, relies heavily
on the XMPP extension for service discovery [XEP-0030]. In other
words, XMPP entities are able to verify the capabilities of their
intended peer before actually attempting to establish a session with
it.
SDP Offer/Answer on the other hand uses a least common denominator
approach where every SDP offer has to be understandable by legacy
endpoints. Newer, unsupported aspects in this offer can therefore
only appear as optional, or their use needs to be limited to
subsequent Offer/Answer exchanges once their support has been
confirmed.
Use of "trickle ICE" (see [I-D.ietf-mmusic-trickle-ice] and
[I-D.ivov-mmusic-trickle-ice-sip]) is one example where this issue
occurs. SIP endpoints need to always behave like so-called "vanilla
ICE" agents when sending their first offer and make sure they gather
all candidates before sending a SIP INVITE. This is necessary
because otherwise ICE agents with no support for trickle can
prematurely declare failure. Jingle endpoints, on the other hand,
can verify support for trickle ICE prior to engaging in a session and
adapt their behavior accordingly.
In order to work around this disparity in relation to communication
of transport candidates, [XEP-0176] defines an Offer/Answer support
mode through the "urn:ietf:rfc:3264" feature tag. When an XMPP
entity such as a client (or, significantly, a gateway to a SIP
system) advertises support for this feature, the entity indicates
that it needs to receive multiple transport candidates in the initial
offer, instead of receiving them in a continuous trickle over time.
Although implementations conforming to this specification MUST
support the Offer/Answer model with Jingle, such endpoints are not
required to actually declare support for the "urn:ietf:rfc:3264"
service discovery feature since this would mean that they too would
be reachable only through Offer/Answer semantics not also through
trickle-ICE semantics.)
5. Syntax Mappings
5.1. Generic Jingle Syntax
Jingle is designed in a modular fashion, so that session description
data is generally carried in a payload within the generic Jingle
elements, i.e., the <jingle/> element and its <content/> child. The
following example illustrates this structure, where the XMPP stanza
is a request to initiate an audio session (via the <content/> and
<description/> elements) using a transport of RTP over raw UDP (via
the <transport/> element).
Saint-Andre, et al. Expires September 21, 2014 [Page 5]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Example 1: Structure of a Jingle session initiation request
| <iq from='romeo@example.net/v3rsch1kk3l1jk'
| id='ne91v36s'
| to='juliet@example.com/t3hr0zny'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='session-initiate'
| initiator='romeo@example.net/v3rsch1kk3l1jk'
| sid='a73sjjvkla37jfea'>
| <content creator='initiator'
| media='audio'
| name='this-is-the-audio-content'
| senders='both'>
| <description xmlns='urn:xmpp:jingle:app:rtp:1'>
| <payload-type id='96' name='speex' clockrate='16000'/>
| <payload-type id='97' name='speex' clockrate='8000'/>
| <payload-type id='18' name='G729'/>
| <payload-type channels='2'
| clockrate='16000'
| id='103'
| name='L16'/>
| <payload-type id='98' name='x-ISAC' clockrate='8000'/>
| </description>
| <transport xmlns='urn:xmpp:jingle:transport:raw-udp'>
| <candidate id='v3c18fgg' ip='10.1.1.104'
| port='13540' generation='0'/>
| </transport>
| </content>
| </jingle>
| </iq>
In the foregoing example, the syntax and semantics of the <jingle/>
and <content/> elements are defined in the core Jingle specification
[XEP-0166], the syntax and semantics of the <description/> element
qualified by the 'urn:xmpp:jingle:app:rtp:1' namespace are defined in
the Jingle RTP specification [XEP-0167], and the syntax and semantics
of the <transport/> element qualified by the
'urn:xmpp:jingle:transport:raw-udp' namespace are defined in the
Jingle Raw UDP specification [XEP-0177]. Other <description/>
elements are defined in specifications for the appropriate
application types (see for example [XEP-0234] for file transfer) and
other <transport/> elements are defined in the specifications for
appropriate transport methods (see for example [XEP-0176], which
defines an XMPP profile of ICE [RFC5245]).
At the core Jingle layer, the following mappings are defined.
Saint-Andre, et al. Expires September 21, 2014 [Page 6]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Table 1: High-Level Mapping from XMPP to SIP
+--------------------------------+--------------------------------+
| Jingle | SIP |
+--------------------------------+--------------------------------+
| <jingle/> 'action' | [ see next table ] |
+--------------------------------+--------------------------------+
| <jingle/> 'initiator' | [ no mapping ] |
+--------------------------------+--------------------------------+
| <jingle/> 'responder' | [ no mapping ] |
+--------------------------------+--------------------------------+
| <jingle/> 'sid' | local-part of Dialog ID |
+--------------------------------+--------------------------------+
| local-part of 'initiator' | <username> in SDP o= line |
+--------------------------------+--------------------------------+
| <content/> 'creator' | [ no mapping ] |
+--------------------------------+--------------------------------+
| <content/> 'name' | no mandatory mapping * |
+--------------------------------+--------------------------------+
| <content/> 'senders' value of | a= line of sendrecv, recvonly, |
| both, initiator, responder, or | sendonly, or inactive |
| none | |
+--------------------------------+--------------------------------+
* In can be appropriate to map to the a=mid value defined in
[RFC5888].
The 'senders' attribute is optional in Jingle, with a default value
of "both"; thus in case the attribute is absent the SDP direction
value MUST be considered as 'sendrecv'.
The 'action' attribute of the <jingle/> element has 15 allowable
values. In general they should be mapped as shown in the following
table, with some exceptions as described below.
Saint-Andre, et al. Expires September 21, 2014 [Page 7]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Table 2: Mapping of Jingle Actions to SIP Methods
+-------------------+------------------------------+
| Jingle Action | SIP Method |
+-------------------+------------------------------+
| content-accept | INVITE response (1xx or 2xx) |
+-------------------+------------------------------+
| content-add | INVITE request |
+-------------------+------------------------------+
| content-modify | INVITE request |
+-------------------+------------------------------+
| content-reject | unused in this mapping |
+-------------------+------------------------------+
| content-remove | INVITE request |
+-------------------+------------------------------+
| description-info | unused in this mapping |
+-------------------+------------------------------+
| security-info | unused in this mapping |
+-------------------+------------------------------+
| session-accept | INVITE response (1xx or 2xx) |
+-------------------+------------------------------+
| session-info | [varies] |
+-------------------+------------------------------+
| session-initiate | INVITE request |
+-------------------+------------------------------+
| session-terminate | BYE |
+-------------------+------------------------------+
| transport-accept | unused in this mapping |
+-------------------+------------------------------+
| transport-info | unused in this mapping |
+-------------------+------------------------------+
| transport-reject | unused in this mapping |
+-------------------+------------------------------+
| transport-replace | unused in this mapping |
+-------------------+------------------------------+
5.2. Application Formats
Jingle application formats for audio and video exchange via RTP are
specified in [XEP-0167]. These application formats effectively map
to the "RTP/AVP" profile specified in [RFC3551] and the "RTP/SAVP"
profile specified in [RFC3711], where the media types are "audio" and
"video" and the specific mappings to SDP syntax are provided in
[XEP-0167].
(As stated in [XEP-0167], future versions of that specification might
define how to use other RTP profiles such as "RTP/AVPF" and "RTP/
SAVPF" as defined in [RFC4585] and [RFC5124] respectively.)
Saint-Andre, et al. Expires September 21, 2014 [Page 8]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
5.3. Raw UDP Transport Method
A basic Jingle transport method for exchanging media over UDP is
specified in [XEP-0177]. This transport method involves the
negotiation of an IP address and port only. It does not provide NAT
traversal, effectively leaving the task to intermediary entities.
The Jingle 'ip' attribute maps to the connection-address parameter of
the SDP c= line and the 'port' attribute maps to the port parameter
of the SDP m= line. Use of SIP without ICE would generally map to
use of Raw UDP on the XMPP side of a session.
5.4. ICE-UDP Transport Method
A more advanced Jingle transport method for exchanging media over UDP
is specified in [XEP-0176]. Under ideal conditions this transport
method provides NAT traversal by following the Interactive
Connectivity Exchange methodology specified in [RFC5245].
The relevant SDP mappings are provided in [XEP-0176], however there
are a few syntax incompatibilities which need to be addressed by
gateways conforming to this specification:
o The 'foundation' attribute is defined as a number in Jingle
(unsigned byte) whereas ICE [RFC5245] defines it as a string,
which can contain letters, digits and the '+' and '/' symbols.
Gateway applications MUST therefore convert ICE originating
foundations into integer numbers and they MUST guarantee that such
a conversion preserves foundation uniqueness. The exact mechanism
for the conversion is undefined.
o Jingle defines a 'generation' attribute which is used to determine
if an ICE restart is required. This attribute has no counterpart
in SIP as ICE restarts are initiated by detecting a change in the
ICE ufrag and password. Gateways MUST therefore increase the
generation number when they detect such changes.
o The 'id' attribute defined by Jingle has no SIP counterpart thus
applications are free to choose means to generate unique
identifiers across the different candidates of an ICE generation.
o The 'network' attribute defined by Jingle has no counterpart in
SIP and SHOULD be ignored.
[[OPEN ISSUE: describe handling of ICE restarts.]]
Saint-Andre, et al. Expires September 21, 2014 [Page 9]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
6. Call Hold
[RFC3264] stipulates that streams are placed on hold by setting their
direction to "sendonly". A session is placed on hold by doing this
for all the streams it contains. The same semantics are also
supported by Jingle through the "senders" element and its "initiator"
and "responder" values (XEP-0166 also defines a value of "none",
which maps to an a= value of "inactive").
The following example shows how the responder would put the call on
hold (i.e., temporarily stop listening to media sent by the
initiator) using a Jingle content-modify action and a modified value
for the 'senders' attribute (here "responder" to indicate that the
responder might continue to send media, such as hold music).
Example 2: Call hold via 'senders' attribute
| <iq from='juliet@capulet.lit/balcony'
| id='hz73n2l9'
| to='romeo@montague.lit/orchard'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='content-modify'
| initiator='romeo@montague.lit/orchard'
| sid='a73sjjvkla37jfea'>
| <content creator='initiator'
| media='audio'
| name='this-is-the-audio-content'
| senders='responder'/>
| </jingle>
| </iq>
In addition to these semantics, however, the Jingle RTP Sessions
specification [XEP-0167] also defines a more concise way for
achieving the same end, which consists in sending a "hold" command
within a "session-info" action, as shown in the following example.
Saint-Andre, et al. Expires September 21, 2014 [Page 10]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Example 3: Call hold via session-info action
| <iq from='juliet@capulet.lit/balcony'
| id='xv39z423'
| to='romeo@montague.lit/orchard'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='session-info'
| initiator='romeo@montague.lit/orchard'
| sid='a73sjjvkla37jfea'>
| <hold xmlns='urn:xmpp:jingle:apps:rtp:info:1'/>
| </jingle>
| </iq>
Gateways that receive a "hold" command from their Jingle side MUST
generate a new offer on their SIP side, placing all streams in a
"sendonly" state.
When relaying offers from SIP to XMPP however, gateways are not
required to translate "sendonly" attributes into a "hold" command as
this would not always be possible (e.g. when not all streams have the
same direction). Additionally such conversions might introduce
complications in case further offers placing a session on hold also
contain other session modifications.
It is possible that, after one entity has put the other on hold, the
second entity might put the first entity on hold. In this case, the
effective direction would then be "inactive" in SDP and "none" in
Jingle.
7. Early Media
[RFC3959] and [RFC3960] describe a number of scenarios relying on
"early media". While similar attempts have also been made for XMPP,
support for early media is not currently widely supported in Jingle
implementations. Therefore, gateways SHOULD NOT forward SDP answers
from SIP to Jingle until a final response has been received, except
in cases where the gateway is in a position to confirm specific
support for early media by the endpoint (one approach to such support
can be found in [XEP-0269] but it has not yet been standardized).
Gateways MUST however store early media SDP answers when they are
sent inside a reliable provisional response. In such cases, a
subsequent final response can follow without an actual answer and the
one from the provisional response will need to be forwarded to the
Jingle endpoint.
Saint-Andre, et al. Expires September 21, 2014 [Page 11]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
8. Detecting Endless Loops
[RFC3261] defines a "Max-Forwards" header that allows intermediate
entities such as SIP proxies to detect and prevent loops from
occurring. The specifics of XMPP make such a prevention mechanism
unnecessary for XMPP-only environments. With the introduction of
SIP-to-XMPP gatewaying, however, it would be possible for loops to
occur where messages are being repeatedly forwarded from XMPP to SIP
to XMPP to SIP, etc.
To compensate for the lack of a "Max-Forwards" header in SIP,
gateways MUST therefore keep track of all SIP transactions and Jingle
sessions that they are currently serving and they MUST block re-
entrant messages.
[[OPEN ISSUE: In order for this to work, we need a consistent way of
translating dialog IDs into Jingle sessions, and vice versa, so that
the following can be verified: jingleSessID ==
toJingleSessID(toSipCallID( jingleSessID )). We need to mention
mention spirals here as well. Alice could call Bob, but Bob forwards
his call to Romeo. A spiral on the SIP side could end up becoming a
loop if the gateway is in between.]]
9. SDP Format-Specific Parameters
[RFC4566] defines "a=fmtp" attributes for the transmission of format
specific parameters as a single transparent string. Such strings can
be used to convey either a single value or a sequence of parameters,
separated by semi-colons, commas or whatever delimiters are chosen by
a particular payload type specification.
[XEP-0167] on the other hand defines a <parameter/> element as
follows:
<parameter name="paramName" value="paramValue"/>
A sequence of parameters is thus transmitted as an array of distinct
name/value couples, at least in the context of the Jingle RTP
extension.
These differences make it impossible to devise a generic mechanism
that accurately translates format parameters from Jingle RTP to SDP
without the specifics of the payload being known to the gateway. In
general this is not a major problem because many or most of the media
type definitions supported in existing Jingle implementations follow
the normal semicolon-separated parameter model. One likely exception
is the RTP Payload for DTMF Digits, Telephony Tones, and Telephony
Signals [RFC4733].
Saint-Andre, et al. Expires September 21, 2014 [Page 12]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
For implementations that wish to provide a general-purpose
translation method, this specification makes the following
recommendations:
1. Gateways that are aware of the formats in use SHOULD parse all
format parameters and generate <parameter/> arrays and "a=fmtp"
values accordingly.
2. When translating Jingle RTP to SIP, gateways that have no
explicit support for the formats that are being negotiated SHOULD
convert the list of <parameter/> elements into a single string,
containing a sequence of "name=value" pairs, separated by a semi-
colon and a space (i.e. "; ").
3. When translating SIP to Jingle RTP, gateways that have no
explicit support for the formats that are being negotiated SHOULD
tokenize the "a=fmtp" format string using one delimiter from the
following list: ";", "; ", ",", ", ". The resulting tokens
SHOULD then be parsed as "name=value" pairs. If this process
does actually yield any such pairs, they SHOULD be used for
generating the respective <parameter/> elements. If some of the
tokens cannot be parsed into a "name=value" pair because they do
not conform to the convention suggested in [RFC4855], or in case
the format string couldn't be tokenized with the above
delimiters, the remaining strings SHOULD be used as a value for
the "value" attribute of the <parameter/> element and the
corresponding "name" attribute SHOULD be left empty.
Here is an example of the foregoing transformations, using the
aforementioned example of DTMF digits:
SDP with format data
a=rtpmap:100 telephone-event/8000
a=fmtp:100 0-15,66,70
Jingle transformation
<parameter name="" value="0-15,66-70"/>
10. Dialog Forking
[RFC3261] defines semantics for dialog forking. Such semantics have
not been defined for Jingle and need to be hidden from XMPP
endpoints.
To achieve this, a SIP-to-XMPP gateway MUST NOT forward more than one
provisional response on the Jingle side. Typically they would do so
Saint-Andre, et al. Expires September 21, 2014 [Page 13]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
only for the first provisional response they receive and ignore the
rest. This provisional response SHOULD be forwarded as if it
originated from a "user@host" address (i.e., a "bare JID")
corresponding to the AOR URI found in the "From" header of the SIP
provisional response. The gateway MUST NOT attempt to translate
GRUUs into full JIDs because it cannot know at this stage which of
the dialogs established by these provisional responses will be used
for the actual session.
Likewise, a gateway conforming to this specification MUST NOT forward
more than a single final response received through SIP to the Jingle
side. The gateway SHOULD terminate the SIP sessions whose received
final response wasn't forwarded to the Jingle side.
11. Sample Scenarios
The following sections provide sample scenarios (or "call flows")
that illustrate the principles of interworking from Jingle to SIP.
These scenarios are not exhaustive.
11.1. Basic Voice Chat
The protocol flow for a basic voice chat for which an XMPP user
(juliet@example.com) is the initiator and a SIP user
(romeo@example.net) is the responder. The voice chat is consummated
through a gateway. To simplify the example, the Jingle transport
method negotiated is "raw user datagram protocol" as specified in
[XEP-0177].
XMPP XMPP XMPP-to-SIP SIP-to-XMPP SIP SIP
User Server Gateway Gateway Server User
| | | | | |
| (F1) XMPP | | | | |
| session- | | | | |
| initiate | | | | |
|..........>| | | | |
| | (F2) XMPP | | | |
| | session- | | | |
| | initiate | | | |
| |...........>| | | |
| | (F3) XMPP | | | |
| | IQ-result | | | |
| |<...........| | | |
| (F4) XMPP | | | | |
| IQ-result | | | | |
|<..........| | | | |
| | | (F5) SIP INVITE | |
| | |**************************>| |
Saint-Andre, et al. Expires September 21, 2014 [Page 14]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
| | | | | (F6) SIP |
| | | | | INVITE |
| | | | |*********>|
| | | | | (F7) SIP |
| | | | | 180 |
| | | | | ringing |
| | | | |<*********|
| | | | (F8) SIP | |
| | | | 180 ringing| |
| | | |<***********| |
| | (F9) XMPP session-info | | |
| | (ringing) | | |
| |<..........................| | |
| (F10) XMPP| | | | |
| session- | | | | |
| info | | | | |
| (ringing) | | | | |
|<..........| | | | |
| (F11) XMPP| | | | |
| IQ-result | | | | |
|..........>| | | | |
| | (F12) XMPP | | | |
| | IQ-result | | | |
| |...........x| | | |
| | | | | (F13) SIP|
| | | | | 200 OK |
| | | | |<*********|
| | | | (F14) SIP | |
| | | | 200 OK | |
| | | |<***********| |
| | (F15) XMPP session-accept | | |
| |<..........................| | |
| (F16) XMPP| | | | |
| session- | | | | |
| accept | | | | |
|<..........| | | | |
| (F17) XMPP| | | | |
| IQ-result | | | | |
|..........>| | | | |
| | (F18) XMPP | | | |
| | IQ-result | | | |
| |...........>| | | |
| | | (F19) SIP ACK | |
| | |**************************>| |
| | | | | (F20) SIP|
| | | | | ACK |
| | | | |*********>|
| | | | | |
Saint-Andre, et al. Expires September 21, 2014 [Page 15]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
|<====================MEDIA SESSION OVER RTP===================>|
| | | | | |
| | | | | (F21) SIP|
| | | | | BYE |
| | | | |<*********|
| | | | (F22) SIP | |
| | | | BYE | |
| | | |<***********| |
| | (F23) XMPP session- | | |
| | terminate | | |
| |<..........................| | |
| (F24) XMPP| | | | |
| session- | | | | |
| terminate | | | | |
|<..........| | | | |
| (F25) XMPP| | | | |
| IQ-result | | | | |
|..........>| | | | |
| | (F26) XMPP | | | |
| | IQ-result | | | |
| |...........x| | | |
The packet flow is as follows.
First the XMPP user sends a Jingle session-initiation request to the
SIP user.
Saint-Andre, et al. Expires September 21, 2014 [Page 16]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Example 4: Jingle session-initiate (F1)
| <iq from='juliet@example.com/t3hr0zny'
| id='hu2s61f4'
| from='romeo@example.net/v3rsch1kk3l1jk'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='session-initiate'
| initiator='juliet@example.com/t3hr0zny'
| sid='a73sjjvkla37jfea'>
| <content creator='initiator'
| media='audio'
| name='this-is-the-audio-content'>
| <description xmlns='urn:xmpp:jingle:app:rtp:1'>
| <payload-type id='96' name='speex' clockrate='16000'/>
| <payload-type id='97' name='speex' clockrate='8000'/>
| <payload-type id='18' name='G729'/>
| </description>
| <transport xmlns='urn:xmpp:jingle:transport:raw-udp'>
| <candidate component='1' generation='0' id='u3gscv289p'
| ip='192.0.2.101' port='49172'/>
| </transport>
| </content>
| </jingle>
| </iq>
The gateway returns an XMPP IQ-result to the initiator on behalf of
the responder.
Example 5: XMPP IQ-result from gateway (F3)
| <iq from='juliet@example.com/t3hr0zny'
| id='hu2s61f4'
| to='romeo@example.net/v3rsch1kk3l1jk'
| type='result'/>
The gateway transforms the Jingle session-initiate action into a SIP
INVITE.
Saint-Andre, et al. Expires September 21, 2014 [Page 17]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Example 6: SIP transformation of Jingle session-initiate (F5)
| INVITE sip:romeo@example.net SIP/2.0
| Via: SIP/2.0/TCP client.example.com:5060;branch=z9hG4bK74bf9
| Max-Forwards: 70
| From: Juliet Capulet <sip:juliet@example.com>;tag=t3hr0zny
| To: Romeo Montague <sip:romeo@example.net>
| Call-ID: 3848276298220188511@example.com
| CSeq: 1 INVITE
| Contact: <sip:juliet@client.example.com;transport=tcp>
| Content-Type: application/sdp
| Content-Length: 184
| v=0
| o=alice 2890844526 2890844526 IN IP4 client.example.com
| s=-
| c=IN IP4 192.0.2.101
| t=0 0
| m=audio 49172 RTP/AVP 18 96 97
| a=rtpmap:96 sppex/16000
| a=rtpmap:97 speex/8000
| a=rtpmap:18 G729
The responder returns a SIP 180 Ringing message.
Example 7: SIP 180 Ringing message (F7)
| SIP/2.0 180 Ringing
| Via: SIP/2.0/TCP client.example.com:5060;branch=z9hG4bK74bf9;\
| received=192.0.2.101
| From: Juliet Capulet <sip:juliet@example.com>;tag=t3hr0zny
| To: Romeo Montague <sip:romeo@example.net>;tag=v3rsch1kk3l1jk
| Call-ID: 3848276298220188511@example.com
| CSeq: 1 INVITE
| Contact: <sip:romeo@client.example.net;transport=tcp>
| Content-Length: 0
The gateway transforms the ringing message into XMPP syntax.
Saint-Andre, et al. Expires September 21, 2014 [Page 18]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Example 8: XMPP transformation of SIP 180 Ringing message (F7)
| <iq from='romeo@montague.net/v3rsch1kk3l1jk'
| id='ol3ba71g'
| to='juliet@example.com/t3hr0zny'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='session-info'
| initiator='juliet@example.com/t3hr0zny'
| sid='a73sjjvkla37jfea'>
| <ringing xmlns='urn:xmpp:jingle:apps:rtp:info:1'/>
| </jingle>
| </iq>
The initiator returns an IQ-result acknowledging receipt of the
ringing message, which is used only by the gateway and not
transformed into SIP syntax.
Example 9: XMPP entity acknowledges ringing message (F11)
| <iq from='juliet@example.com/t3hr0zny'
| id='ol3ba71g'
| to='romeo@example.net/v3rsch1kk3l1jk'
| type='result'/>
The responder sends a SIP 200 OK to the initiator in order to accept
the session initiation request.
Example 10: SIP user accepts session request (F13)
| SIP/2.0 200 OK
| Via: SIP/2.0/TCP client.example.com:5060;branch=z9hG4bK74bf9;\
| received=192.0.2.101
| From: Juliet Capulet <sip:juliet@example.com>;tag=t3hr0zny
| To: Romeo Montague <sip:romeo@example.net>;tag=v3rsch1kk3l1jk
| Call-ID: 3848276298220188511@example.com
| CSeq: 1 INVITE
| Contact: <sip:romeo@client.example.net;transport=tcp>
| Content-Type: application/sdp
| Content-Length: 147
|
| v=0
| o=romeo 2890844527 2890844527 IN IP4 client.example.net
| s=-
| c=IN IP4 192.0.2.201
| t=0 0
| m=audio 3456 RTP/AVP 97
| a=rtpmap:97 speex/8000
Saint-Andre, et al. Expires September 21, 2014 [Page 19]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
The gateway transforms the 200 OK into a Jingle session-accept
action.
Example 11: XMPP transformation of SIP 200 OK (F15)
| <iq from='romeo@example.net/v3rsch1kk3l1jk'
| id='pd1bf839'
| to='juliet@example.com/t3hr0zny'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='session-accept'
| initiator='juliet@example.com/t3hr0zny'
| responder='romeo@example.net/v3rsch1kk3l1jk'
| sid='a73sjjvkla37jfea'>
| <content creator='initiator'
| media='audio'
| name='this-is-the-audio-content'>
| <description xmlns='urn:xmpp:jingle:app:rtp:1'>
| <payload-type id='97' name='speex' clockrate='8000'/>
| </description>
| <transport xmlns='urn:xmpp:jingle:transport:raw-udp'>
| <candidate id='9eg13am7' ip='192.0.2.101'
| port='49172' generation='0'/>
| </transport>
| </content>
| </jingle>
| </iq>
If the payload types and transport candidate can be successfully used
by both parties, then the initiator acknowledges the session-accept
action.
Example 12: XMPP user acknowledges session-accept (F17)
| <iq from='romeo@example.net/v3rsch1kk3l1jk'
| id='pd1bf839'
| to='juliet@example.com/t3hr0zny'
| type='result'/>
The parties now begin to exchange media. In this case they would
exchange audio using the Speex codec at a clockrate of 8000 since
that is the highest-priority codec for the responder (as determined
by the XML order of the <payloadtype/> children).
The parties can continue the session as long as desired.
Eventually, one of the parties (in this case the responder)
terminates the session.
Saint-Andre, et al. Expires September 21, 2014 [Page 20]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
Example 13: SIP user ends session (F21)
| BYE sip:juliet@client.example.com SIP/2.0
| Via: SIP/2.0/TCP client.example.net:5060;branch=z9hG4bKnashds7
| Max-Forwards: 70
| From: Romeo Montague <sip:romeo@example.net>;tag=8321234356
| To: Juliet Capulet <sip:juliet@example.com>;tag=9fxced76sl
| Call-ID: 3848276298220188511@example.com
| CSeq: 1 BYE
| Content-Length: 0
The gateway transforms the SIP BYE into XMPP syntax.
Example 14: XMPP transformation of SIP BYE (F23)
| <iq from='romeo@example.net/v3rsch1kk3l1jk'
| id='rv301b47'
| to='juliet@example.com/t3hr0zny'
| type='set'>
| <jingle xmlns='urn:xmpp:jingle:1'
| action='session-terminate'
| initiator='juliet@example.com/t3hr0zny'
| sid='a73sjjvkla37jfea'/>
| <reason>
| <success/>
| </reason>
| </jingle>
| </iq>
The initiator returns an IQ-result acknowledging receipt of the
session termination, which is used only by the gateway and not
transformed into SIP syntax.
Example 15: XMPP user acknowledges end of session (F25)
<iq from='romeo@example.net/v3rsch1kk3l1jk'
id='rv301b47'
to='juliet@example.com/t3hr0zny'
type='result'/>
12. IANA Considerations
This document has no actions for the IANA.
Saint-Andre, et al. Expires September 21, 2014 [Page 21]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
13. Security Considerations
Detailed security considerations for session management are given for
SIP in [RFC3261] and for XMPP in [XEP-0166] (see also [RFC6120]).
The security considerations provided in [I-D.ietf-stox-core] also
apply.
14. References
14.1. Normative References
[I-D.ietf-stox-core]
Saint-Andre, P., Houri, A., and J. Hildebrand,
"Interworking between the Session Initiation Protocol
(SIP) and the Extensible Messaging and Presence Protocol
(XMPP): Core", draft-ietf-stox-core-11 (work in progress),
February 2014.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E.
Schooler, "SIP: Session Initiation Protocol", RFC 3261,
June 2002.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
Video Conferences with Minimal Control", STD 65, RFC 3551,
July 2003.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, March 2004.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
Digits, Telephony Tones, and Telephony Signals", RFC 4733,
December 2006.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload
Formats", RFC 4855, February 2007.
[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment
(ICE): A Protocol for Network Address Translator (NAT)
Traversal for Offer/Answer Protocols", RFC 5245, April
2010.
Saint-Andre, et al. Expires September 21, 2014 [Page 22]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
[RFC6120] Saint-Andre, P., "Extensible Messaging and Presence
Protocol (XMPP): Core", RFC 6120, March 2011.
[XEP-0030]
Hildebrand, J., Eatmon, R., and P. Saint-Andre, "Service
Discovery", XSF XEP 0030, June 2008.
[XEP-0166]
Ludwig, S., Beda, J., Saint-Andre, P., McQueen, R., Egan,
S., and J. Hildebrand, "Jingle", XSF XEP 0166, June 2007.
[XEP-0167]
Ludwig, S., Saint-Andre, P., Egan, S., and R. McQueen,
"Jingle RTP Sessions", XSF XEP 0167, February 2009.
[XEP-0176]
Beda, J., Ludwig, S., Saint-Andre, P., Hildebrand, J., and
S. Egan, "Jingle ICE-UDP Transport Method", XSF XEP 0176,
February 2009.
[XEP-0177]
Beda, J., Saint-Andre, P., Ludwig, S., Hildebrand, J., and
S. Egan, "Jingle Raw UDP Transport", XSF XEP 0177,
February 2009.
14.2. Informative References
[I-D.ietf-mmusic-trickle-ice]
Ivov, E., Rescorla, E., and J. Uberti, "Trickle ICE:
Incremental Provisioning of Candidates for the Interactive
Connectivity Establishment (ICE) Protocol", draft-ietf-
mmusic-trickle-ice-01 (work in progress), February 2014.
[I-D.ivov-mmusic-trickle-ice-sip]
Ivov, E., Marocco, E., and C. Holmberg, "A Session
Initiation Protocol (SIP) usage for Trickle ICE", draft-
ivov-mmusic-trickle-ice-sip-01 (work in progress), October
2013.
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
August 1980.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC
793, September 1981.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
Saint-Andre, et al. Expires September 21, 2014 [Page 23]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264, June
2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
[RFC3959] Camarillo, G., "The Early Session Disposition Type for the
Session Initiation Protocol (SIP)", RFC 3959, December
2004.
[RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing
Tone Generation in the Session Initiation Protocol (SIP)",
RFC 3960, December 2004.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
2006.
[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
Real-time Transport Control Protocol (RTCP)-Based Feedback
(RTP/SAVPF)", RFC 5124, February 2008.
[RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description
Protocol (SDP) Grouping Framework", RFC 5888, June 2010.
[RFC7081] Ivov, E., Saint-Andre, P., and E. Marocco, "CUSAX:
Combined Use of the Session Initiation Protocol (SIP) and
the Extensible Messaging and Presence Protocol (XMPP)",
RFC 7081, November 2013.
[XEP-0234]
Saint-Andre, P., "Jingle File Transfer", XSF XEP 0234,
February 2012.
[XEP-0269]
Cionoiu, D. and P. Saint-Andre, "Jingle Early Media", XSF
XEP 0269, May 2009.
Appendix A. Acknowledgements
Thanks to Dave Crocker, Philipp Hancke, Paul Kyzivat, and Jonathan
Lennox for their feedback.
Saint-Andre, et al. Expires September 21, 2014 [Page 24]
Internet-Draft SIP-XMPP Interworking: Media Sessions March 2014
The authors gratefully acknowledge the assistance of Markus Isomaki
and Yana Stamcheva as the working group chairs and Gonzalo Camarillo
and Alissa Cooper as the sponsoring Area Directors.
Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for
employing him during his work on earlier versions of this document.
Authors' Addresses
Peter Saint-Andre
&yet
P.O. Box 787
Parker, CO 80134
USA
Email: ietf@stpeter.im
Saul Ibarra Corretge
AG Projects
Dr. Leijdsstraat 92
Haarlem 2021RK
The Netherlands
Email: saul@ag-projects.com
Emil Ivov
Jitsi
Strasbourg 67000
France
Phone: +33-177-624-330
Email: emcho@jitsi.org
Saint-Andre, et al. Expires September 21, 2014 [Page 25]