MMUSIC J. Rosenberg
Internet-Draft Skype
Intended status: Standards Track A. Keranen
Expires: April 28, 2011 Ericsson
B. Lowekamp
Skype
A. Roach
Tekelec
October 25, 2010
TCP Candidates with Interactive Connectivity Establishment (ICE)
draft-ietf-mmusic-ice-tcp-10
Abstract
Interactive Connectivity Establishment (ICE) defines a mechanism for
NAT traversal for multimedia communication protocols based on the
offer/answer model of session negotiation. ICE works by providing a
set of candidate transport addresses for each media stream, which are
then validated with peer-to-peer connectivity checks based on Session
Traversal Utilities for NAT (STUN). ICE provides a general framework
for describing candidates, but only defines UDP-based transport
protocols. This specification extends ICE to TCP-based media,
including the ability to offer a mix of TCP and UDP-based candidates
for a single stream.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Rosenberg, et al. Expires April 28, 2011 [Page 1]
Internet-Draft ICE TCP October 2010
This Internet-Draft will expire on April 28, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Overview of Operation . . . . . . . . . . . . . . . . . . . . 5
4. Sending the Initial Offer . . . . . . . . . . . . . . . . . . 7
4.1. Gathering Candidates . . . . . . . . . . . . . . . . . . . 7
4.2. Prioritization . . . . . . . . . . . . . . . . . . . . . . 9
4.3. Choosing Default Candidates . . . . . . . . . . . . . . . 10
4.4. Encoding the SDP . . . . . . . . . . . . . . . . . . . . . 11
5. Candidate Collection Techniques . . . . . . . . . . . . . . . 11
5.1. Host Candidates . . . . . . . . . . . . . . . . . . . . . 12
5.2. Server Reflexive Candidates . . . . . . . . . . . . . . . 13
5.3. NAT-Assisted Candidates . . . . . . . . . . . . . . . . . 13
5.4. UDP-Tunneled Candidates . . . . . . . . . . . . . . . . . 13
5.5. Relayed Candidates . . . . . . . . . . . . . . . . . . . . 14
6. Receiving the Initial Offer . . . . . . . . . . . . . . . . . 14
6.1. Verifying ICE Support . . . . . . . . . . . . . . . . . . 14
6.2. Forming the Check Lists . . . . . . . . . . . . . . . . . 15
7. Connectivity Checks . . . . . . . . . . . . . . . . . . . . . 15
7.1. STUN Client Procedures . . . . . . . . . . . . . . . . . . 15
7.2. STUN Server Procedures . . . . . . . . . . . . . . . . . . 16
8. Concluding ICE Processing . . . . . . . . . . . . . . . . . . 16
9. Subsequent Offer/Answer Exchanges . . . . . . . . . . . . . . 17
9.1. ICE Restarts . . . . . . . . . . . . . . . . . . . . . . . 17
10. Media Handling . . . . . . . . . . . . . . . . . . . . . . . . 17
10.1. Sending Media . . . . . . . . . . . . . . . . . . . . . . 17
10.2. Receiving Media . . . . . . . . . . . . . . . . . . . . . 18
11. Connection Management . . . . . . . . . . . . . . . . . . . . 18
11.1. Connections Formed During Connectivity Checks . . . . . . 18
11.2. Connections Formed for Gathering Candidates . . . . . . . 19
12. Security Considerations . . . . . . . . . . . . . . . . . . . 20
13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21
15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
15.1. Normative References . . . . . . . . . . . . . . . . . . . 21
15.2. Informative References . . . . . . . . . . . . . . . . . . 21
Appendix A. Limitations of ICE TCP . . . . . . . . . . . . . . . 23
Appendix B. Implementation Considerations for BSD Sockets . . . . 23
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24
1. Introduction
Interactive Connectivity Establishment (ICE) [RFC5245] defines a
mechanism for NAT traversal for multimedia communication protocols
based on the offer/answer model [RFC3264] of session negotiation.
ICE works by providing a set of candidate transport addresses for
each media stream, which are then validated with peer-to-peer
connectivity checks based on Session Traversal Utilities for NAT
(STUN) [RFC5389]. However, ICE only defines procedures for UDP-based
transport protocols.
There are many reasons why ICE support for TCP is important.
Firstly, there are media protocols that only run over TCP. Such
protocols are used, for example, for screen sharing and instant
messaging [RFC4975]. For these protocols to work in the presence of
NAT, unless they define their own NAT traversal mechanisms, ICE
support for TCP is needed. In addition, RTP can also run over TCP
[RFC4571]. Typically, it is preferable to run RTP over UDP, and not
TCP. However, in a variety of network environments, overly
restrictive NAT and firewall devices prevent UDP-based communications
altogether, but general TCP-based communications are permitted. In
such environments, sending RTP over TCP, and thus establishing the
media session, may be preferable to having it fail altogether. With
this specification, agents can gather UDP and TCP candidates for a
media stream, list the UDP ones with higher priority, and then only
use the TCP-based ones if the UDP ones fail. This provides a
fallback mechanism that allows multimedia communications to be highly
reliable.
The usage of RTP over TCP is particularly useful when combined with
Traversal Using Relays around NAT (TURN) [RFC5766]. In this case,
one of the agents would connect to its TURN server using TCP, and
obtain a TCP-based relayed candidate. It would offer this to its
peer agent as a candidate. The answerer would initiate a TCP
connection towards the TURN server. When that connection is
established, media can flow over the connections, through the TURN
server. The benefit of this usage is that it only requires the
agents to make outbound TCP connections to a server on the public
network. This kind of operation is broadly interoperable through NAT
and firewall devices. Since it is a goal of ICE and this extension
to provide highly reliable communications that "just work" in as
broad a set of network deployments as possible, this use case is
particularly important.
This specification extends ICE by defining its usage with TCP
candidates. It also defines how ICE can be used with RTP and Secure
RTP (SRTP) to provide both TCP and UDP candidates. This
specification does so by following the outline of ICE itself, and
calling out the additions and changes necessary in each section of
ICE to support TCP candidates.
It should be noted that since TCP NAT traversal is more complicated
than with UDP, ICE TCP is not as efficient as UDP-based ICE.
Discussion about this topic can be found in Appendix A.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
This document uses the same terminology as ICE (see Section 3 of
[RFC5245]).
3. Overview of Operation
The usage of ICE with TCP is relatively straightforward. The main
area of specification is around how and when connections are opened,
and how those connections relate to candidate pairs.
When the agents perform address allocations to gather TCP-based
candidates, three types of candidates can be obtained. These are
active candidates, passive candidates, and simultaneous-open (S-O)
candidates. An active candidate is one for which the agent will
attempt to open an outbound connection, but will not receive incoming
connection requests. A passive candidate is one for which the agent
will receive incoming connection attempts, but not attempt a
connection. An S-O candidate is one for which the agent will attempt
to open a connection simultaneously with its peer.
Unlike for UDP, there is no lite implementation defined for TCP.
Instead, an implementation that meets the criteria for a lite
implementation as discussed in Appendix A of [RFC5245] can just use
the mechanisms defined in [RFC4145], with constraints defined here on
selection of attribute values (see Section 4).
When gathering candidates from a host interface, the agent typically
obtains active, passive, and S-O candidates. Similarly, one can use
different techniques for obtaining, e.g., server reflexive, NAT-
assisted, tunneled, or relayed candidates of these three types.
Connections to servers used for relayed and server reflexive
candidates are kept open during ICE processing.
When encoding these candidates into offers and answers, the type of
the candidate is signaled. In the case of active candidates, an IP
address and port is present, but it is meaningless, as it is ignored
by the peer. As a consequence, active candidates do not need to be
physically allocated at the time of address gathering. Rather, the
physical allocations, which occur as a consequence of a connection
attempt, occur at the time of the connectivity checks.
When the candidates are paired together, active candidates are always
paired with passive, and S-O candidates with each other. When a
connectivity check is to be made on a candidate pair, each agent
determines whether it is to make a connection attempt for this pair.
The actual process of generating connectivity checks, managing the
state of the check list, and updating the Valid list works
identically for TCP as it does for UDP.
ICE requires an agent to demultiplex STUN and application layer
traffic, since they appear on the same port. This demultiplexing is
described by ICE, and is done using the magic cookie and other fields
of the message. Stream-oriented transports introduce another
wrinkle, since they require a way to frame the connection so that the
application and STUN packets can be extracted in order to determine
which is which. For this reason, TCP media streams utilizing ICE use
the basic framing provided in RFC 4571 [RFC4571], even if the
application layer protocol is not RTP.
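The framing and demultiplexing described above can be sketched as follows. This is a minimal illustration, not a complete implementation: it assumes the RFC 4571 framing of a 16-bit big-endian length followed by the packet, and it classifies a packet as STUN using only the two leading zero bits and the magic cookie from RFC 5389. The function names are hypothetical, and a real agent would apply the further checks that ICE describes.

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined in RFC 5389


def frame(payload: bytes) -> bytes:
    """Prefix a packet with the 16-bit big-endian length used by RFC 4571."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit length field")
    return struct.pack("!H", len(payload)) + payload


def deframe(buffer: bytearray) -> list:
    """Remove and return every complete packet at the front of buffer.

    Bytes belonging to an incomplete trailing packet stay in buffer so
    that the caller can append more data from the TCP stream later.
    """
    packets = []
    while len(buffer) >= 2:
        (length,) = struct.unpack_from("!H", buffer, 0)
        if len(buffer) < 2 + length:
            break  # wait for the rest of this packet
        packets.append(bytes(buffer[2:2 + length]))
        del buffer[:2 + length]
    return packets


def looks_like_stun(packet: bytes) -> bool:
    """Heuristic only: 20-byte header, top two bits zero, magic cookie."""
    if len(packet) < 20 or packet[0] & 0xC0:
        return False
    (cookie,) = struct.unpack_from("!I", packet, 4)
    return cookie == STUN_MAGIC_COOKIE
```

An agent would feed bytes read from the TCP connection into the buffer, call deframe() after each read, and hand each returned packet either to its STUN processing or to the application (or (D)TLS) layer depending on looks_like_stun().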
When TLS is in use (for non-RTP traffic) or DTLS (for RTP traffic),
it runs over the RFC 4571 framing shim, so that STUN runs outside of
the (D)TLS connection. Pictorially:
                       +----------+
                       |          |
                       |   App    |
            +----------+----------+
            |          |          |
            |   STUN   |  (D)TLS  |
            +----------+----------+
            |                     |
            |      RFC 4571       |
            +---------------------+
            |                     |
            |         TCP         |
            +---------------------+
            |                     |
            |         IP          |
            +---------------------+

                  Figure 1: ICE TCP Stack
The implication of this is that, for any media stream protected by
(D)TLS, the agent will first run ICE procedures, exchanging STUN
messages. Then, once ICE completes, (D)TLS procedures begin. ICE
and (D)TLS are thus "peers" in the protocol stack. The STUN messages
are not sent over the (D)TLS connection, even ones sent for the
purposes of keepalive in the middle of the media session.
When an updated offer is generated by the controlling endpoint, the
SDP extensions for connection oriented media [RFC4145] are used to
signal that an existing connection should be used, rather than
opening a new one.
4. Sending the Initial Offer
If an offerer meets the criteria for lite as defined in Appendix A of
[RFC5245], it omits any ICE attributes for its TCP-based media
streams. Instead, the offerer follows the procedures defined in
[RFC4145] for constructing the offer. However, the offerer MUST use
a setup attribute of "actpass" for those streams.
For offerers making use of ICE for TCP streams, the procedures below
are used.
4.1. Gathering Candidates
Providers of real-time communications services may decide that it is
preferable to have no media at all than it is to have media over TCP.
To allow for choice, it is RECOMMENDED that agents be configurable
with whether they obtain TCP candidates for real time media.
Having it be configurable, and then configuring it to be off, is
far better than not having the capability at all. An important
goal of this specification is to provide a single mechanism that
can be used across all types of endpoints. As such, it is
preferable to account for provider and network variation through
configuration, instead of hard-coded limitations in an
implementation. Furthermore, network characteristics and
connectivity assumptions can, and will, change over time. Just
because an agent is communicating with a server on the public
network today, doesn't mean that it won't need to communicate with
one behind a NAT tomorrow. Just because an agent is behind a NAT
with endpoint independent mapping today, doesn't mean that
tomorrow they won't pick up their agent and take it to a public
network access point where there is a NAT with address and port
dependent mapping properties, or one that only allows outbound
TCP. The way to handle these cases and build a reliable system is
for agents to implement a diverse set of techniques for allocating
addresses, so that at least one of them is almost certainly going
to work in any situation. Implementors should consider very
carefully any assumptions that they make about deployments before
electing not to implement one of the mechanisms for address
allocation. In particular, implementors should consider whether
the elements in the system may be mobile, and connect through
different networks with different connectivity. They should also
consider whether endpoints which are under their control, in terms
of location and network connectivity, would always be under their
control. In environments where mobility and user control are
possible, a multiplicity of techniques is essential for
reliability.
First, agents SHOULD obtain host candidates as described in
Section 5.1. Then, each agent SHOULD "obtain" (allocate a
placeholder for) an active host candidate for each component of each
TCP capable media stream on each interface that the host has. The
agent does not have to yet actually allocate a port for these
candidates, but they are used for the creation of the check lists.
Next, the agents SHOULD obtain passive (and possibly S-O) relayed
candidates for each component as described in Section 5.5. Each
agent SHOULD also allocate a placeholder for an active relayed
candidate for each component of each TCP capable media stream.
The agent SHOULD then obtain server reflexive, NAT-assisted, and/or
UDP-tunneled candidates (see Section 5.2, Section 5.3, and
Section 5.4). The mechanisms for establishing these candidates and
the number of candidates to collect vary from technique to technique.
These considerations are discussed in the relevant sections, below.
It is highly recommended that a host obtain at least one set of host
and one set of relayed candidates. Obtaining additional candidates
will increase the chance of successfully creating a direct
connection.
Once the candidates have been obtained, the agent MUST keep the TCP
connections open until ICE processing has completed. See Appendix B
for important implementation guidelines.
If a media stream is UDP-based (such as RTP), an agent MAY use an
additional host TCP candidate to request a UDP-based candidate from a
TURN server (or some other relay with similar functionality). Usage
of such UDP candidates follows the procedures defined in ICE for UDP
candidates.
Like its UDP counterparts, TCP-based STUN transactions are paced out
at one every Ta seconds. This pacing refers strictly to STUN
transactions (both Binding and Allocate requests). If performance of
the transaction requires establishment of a TCP connection, then the
connection gets opened when the transaction is performed.
4.2. Prioritization
The transport protocol itself is a criterion for choosing one
candidate over another. If a particular media stream can run over
UDP or TCP, the UDP candidates might be preferred over the TCP
candidates. This allows ICE to use the lower latency UDP
connectivity if it exists, but fall back to TCP if UDP doesn't work.
To accomplish this, the local preference SHOULD be defined as:
      local-preference = (2^12)*(transport-pref) +
                         (2^7)*(class-pref) +
                         (2^0)*(other-pref)
Transport-pref is the relative preference for candidates with this
particular transport protocol (UDP or TCP), and class-pref is the
preference for candidates with this particular establishment
directionality and class (active, passive, or S-O with different
class of NAT traversal techniques). Other-pref is used as a
differentiator when two candidates would otherwise have identical
local preferences.
Transport-pref MUST be between 0 and 15, with 15 being the most
preferred. Class-pref MUST be between 0 and 31, with 31 being the
most preferred. Other-pref MUST be between 0 and 127, with 127 being
the most preferred. For RTP-based media streams, it is RECOMMENDED
that UDP have a transport-pref of 12 and TCP of 6. It is RECOMMENDED
that, for all connection-oriented media, candidates have a class-pref
assigned as follows:
29 Host active candidate
28 Host passive candidate
27 Host S-O candidate
23 NAT-assisted S-O candidate
22 NAT-assisted active candidate
21 NAT-assisted passive candidate
17 Server reflexive S-O candidate
16 Server reflexive active candidate
15 Server reflexive passive candidate
11 UDP-tunneled active candidate
10 UDP-tunneled passive candidate
9 UDP-tunneled S-O candidate
5 Relayed active candidate
4 Relayed passive candidate
3 Relayed S-O candidate
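As an illustration of the local preference formula and the recommended values, the following sketch computes the 16-bit local preference for a candidate. The dictionary keys, the function name, and the choice of other-pref values are assumptions made for this example only; the numeric transport-pref and class-pref values are the ones recommended in this section.

```python
# transport-pref values RECOMMENDED for RTP-based media streams
TRANSPORT_PREF = {"udp": 12, "tcp": 6}

# class-pref values RECOMMENDED for connection-oriented media.
# The (class, direction) key names are hypothetical labels.
CLASS_PREF = {
    ("host", "active"): 29,
    ("host", "passive"): 28,
    ("host", "so"): 27,
    ("nat-assisted", "so"): 23,
    ("nat-assisted", "active"): 22,
    ("nat-assisted", "passive"): 21,
    ("srflx", "so"): 17,
    ("srflx", "active"): 16,
    ("srflx", "passive"): 15,
    ("udp-tunneled", "active"): 11,
    ("udp-tunneled", "passive"): 10,
    ("udp-tunneled", "so"): 9,
    ("relayed", "active"): 5,
    ("relayed", "passive"): 4,
    ("relayed", "so"): 3,
}


def local_preference(transport_pref: int, class_pref: int, other_pref: int) -> int:
    """Combine the three preferences: 4 bits, 5 bits, and 7 bits."""
    if not 0 <= transport_pref <= 15:
        raise ValueError("transport-pref must be between 0 and 15")
    if not 0 <= class_pref <= 31:
        raise ValueError("class-pref must be between 0 and 31")
    if not 0 <= other_pref <= 127:
        raise ValueError("other-pref must be between 0 and 127")
    return (2 ** 12) * transport_pref + (2 ** 7) * class_pref + other_pref


# Worked example: a TCP host active candidate on the most preferred interface.
example = local_preference(TRANSPORT_PREF["tcp"], CLASS_PREF[("host", "active")], 127)
```

For the worked example, 6 * 4096 + 29 * 128 + 127 = 28415. The largest possible value, with all three preferences at their maximum, is 65535, so the result always fits in 16 bits.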
If it is more important to use a certain kind of candidate
(NAT-assisted, server reflexive, etc.) than a certain transport
protocol, it is RECOMMENDED that the type preference for NAT-assisted
candidates be set higher than that for server-reflexive candidates
and that the type preference for UDP-tunneled candidates be set lower
than that for server-reflexive candidates. The RECOMMENDED values
are 105 for NAT-assisted candidates and 75 for UDP-tunneled
candidates. However, if the transport protocol is more important,
NAT-assisted and UDP-tunneled candidates MAY use the same type
preference as the server-reflexive candidates.
The class-pref priorities listed above are simply recommendations
that try to strike a balance between success probability and
resulting path's efficiency. Depending on the scenario where ICE TCP
is used, different values may be appropriate. For example, if the
overhead of a UDP tunnel is not an issue, those candidates should be
prioritized higher since they are likely to have a high success
probability. Also, simultaneous-open is prioritized higher than
active and passive candidates for NAT-assisted and server reflexive
candidates since if TCP S-O is supported by the operating systems of
both endpoints, it should work at least as well as the act-pass
approach. If an implementation is uncertain whether S-O candidates
are supported, it may be reasonable to prioritize them lower. For
host, UDP-tunneled, and relayed candidates the S-O candidates are
prioritized lower than active and passive since act-pass candidates
should work with them at least as well as the S-O candidates.
If any two candidates have the same type-preference, transport-pref,
and class-pref, they MUST have a unique other-pref. With this
specification, this usually only happens with multi-homed hosts, in
which case other-pref is a preference amongst interfaces.
4.3. Choosing Default Candidates
The default candidate is chosen primarily based on the likelihood of
it working with a non-ICE peer. When media streams supporting mixed
modes (both TCP and UDP) are used with ICE, it is RECOMMENDED that,
for real-time streams (such as RTP), the default candidates be UDP-
based. However, the default SHOULD NOT be a simultaneous-open
candidate.
If a media stream is inherently TCP-based, the agent MUST select the
active candidate as default. This ensures proper directionality of
connection establishment for NAT traversal with non-ICE
implementations.
4.4. Encoding the SDP
TCP-based candidates are encoded into a=candidate lines identically
to the UDP encoding described in [RFC5245]. However, the transport
protocol (i.e., value of the transport-extension token defined in
[RFC5245] Section 15.1) is set to "tcp-so" for TCP simultaneous-open
candidates, "tcp-act" for TCP active candidates, and "tcp-pass" for
TCP passive candidates. For active candidates, the addr encoded into
the candidate attribute MUST be set to the IP address that will be
used for the attempt, and the port MUST be set to 9 (i.e.,
Discard). For active relayed candidates, the value for addr must be
identical to the IP address of a passive or simultaneous-open
candidate from the same relay server.
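A sketch of this encoding is shown below. It assumes the basic field order of the a=candidate attribute from [RFC5245] (foundation, component ID, transport, priority, address, port, and candidate type) and omits optional fields such as related addresses. The function name and the example values are hypothetical.

```python
DISCARD_PORT = 9  # port that MUST be signaled for active candidates

TCP_TRANSPORTS = ("tcp-so", "tcp-act", "tcp-pass")


def candidate_line(foundation: str, component_id: int, transport: str,
                   priority: int, addr: str, port: int, cand_type: str) -> str:
    """Build a minimal a=candidate line for a TCP candidate."""
    if transport not in TCP_TRANSPORTS:
        raise ValueError("unknown TCP transport token: " + transport)
    if transport == "tcp-act":
        # The peer ignores the port of an active candidate, so the
        # Discard port is signaled regardless of the caller's value.
        port = DISCARD_PORT
    return "a=candidate:{} {} {} {} {} {} typ {}".format(
        foundation, component_id, transport, priority, addr, port, cand_type)


active = candidate_line("1", 1, "tcp-act", 1000, "192.0.2.1", 54321, "host")
passive = candidate_line("2", 1, "tcp-pass", 900, "192.0.2.1", 54321, "host")
```

Here the active candidate is encoded with port 9 even though a different port was passed in, while the passive candidate keeps the port on which the agent is listening.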
If the default candidate is TCP, the agent MUST include the a=setup
and a=connection attributes from RFC 4145 [RFC4145], following the
procedures defined there as if ICE was not in use. In particular, if
an agent is the answerer, the a=setup attribute MUST meet the
constraints in RFC 4145 based on the value in the offer. Since an
ICE offerer always uses the active candidate as default, an ICE
answerer will always use the passive attribute as default and
include the a=setup:passive attribute in the answer.
If an agent is utilizing SRTP [RFC3711], it MAY include a mix of UDP
and TCP candidates. If ICE selects a TCP candidate pair, the agent
MUST still utilize SRTP, but run it over the connection established
by ICE. The alternative, RTP over TLS, MUST NOT be used. This
allows for the higher layer protocols (the security handshakes and
media transport) to be independent of the underlying transport
protocol. In the case of DTLS-SRTP [RFC5764], the directionality
attributes (a=setup) are utilized strictly to determine the direction
of the DTLS handshake. Directionality of the TCP connection
establishment is determined by the ICE attributes and procedures
defined here.
If an agent is securing non-RTP media over TCP/TLS, the SDP MUST be
constructed as described in RFC 4572 [RFC4572]. The directionality
attributes (a=setup) are utilized strictly to determine the direction
of the TLS handshake. Directionality of the TCP connection
establishment is determined by the ICE attributes and procedures
defined here.
5. Candidate Collection Techniques
The following sections discuss a number of techniques that can be
used to obtain candidates for use with ICE TCP. It is important to
note that this list is not intended to be exhaustive, nor is
implementation of any specific technique beyond Host Candidates
(Section 5.1) considered mandatory.
Implementors are encouraged to implement as many of the following
techniques as is practical, as well as to
explore additional NAT-traversal techniques beyond those discussed in
this document. However, to get a reasonable success ratio, one
SHOULD implement at least one relayed technique (e.g., TURN) and one
technique for discovering the address given for the host by a NAT
(e.g., STUN).
To increase the success probability with the techniques described
below and to aid with transition to IPv6, implementors SHOULD take
particular care to include both IPv4 and IPv6 candidates as part of
the process of gathering candidates. If the local network or host
does not support IPv6 addressing, then clients SHOULD make use of
other techniques, e.g., Teredo [RFC4380] or SOCKS IPv4-IPv6
gatewaying [RFC3089], for obtaining IPv6 candidates.
While implementations SHOULD support as many techniques as feasible,
they SHOULD also consider which of them to use if multiple options
are available. Since different candidates are paired with each
other, offering a large number of candidates results in a large
check list and potentially long-lasting connectivity checks. For
example, using multiple NAT-assisted techniques with the same NAT
usually results only in redundant candidates. Similarly, when
multiple UDP tunneling or relaying techniques with similar features
are available, using just one of them is often enough.
5.1. Host Candidates
Host candidates are the simplest candidates since they only
require opening TCP sockets on one of the host's interfaces and
sending/receiving connectivity checks from them. However, if the hosts are
behind different NATs, host candidates usually fail to work. On the
other hand, if there are no NATs between the hosts, host candidates
are the most efficient method since they require no additional NAT
traversal protocols or techniques.
For each TCP capable media stream the agent wishes to use (including
ones, like RTP, which can either be UDP or TCP), the agent SHOULD
obtain two host candidates (each on a different port) for each
component of the media stream on each interface that the host has -
one for the simultaneous-open, and one for the passive candidate. If
an agent is not capable of acting in one of these modes it would omit
those candidates.
5.2. Server Reflexive Candidates
Server reflexive techniques aim to discover the address a NAT has
given to the host by querying a server on the other side of the NAT,
and then to create proper bindings (unless such already exist) on
the NATs with connectivity checks sent between the hosts. Success
of these techniques depends on the NATs' mapping and filtering
behavior [RFC5382] and also on whether the NATs and hosts support
the TCP simultaneous-open technique.
A widely used protocol for obtaining server reflexive candidates is
STUN, whose TCP specific behavior is described in [RFC5389] Section
7.2.2.
5.3. NAT-Assisted Candidates
NAT-assisted techniques communicate with the NATs directly and in
this way discover the address the NAT has given to the host and also
create proper bindings on the NATs. The benefit of these techniques over
the server reflexive techniques is that the NATs can adjust their
mapping and filtering behavior so that connections can be
successfully created. A downside of NAT-assisted techniques is that
they commonly allow communicating only with a NAT that is in the same
subnet as the host and thus often fail in scenarios with multiple
layers of NATs. These techniques also rely on NATs supporting the
specific protocols and that the NATs allow the users to modify their
behavior.
These candidates are encoded in the ICE offer and answer like the
server reflexive candidates but they (commonly) use a higher priority
(as described in Section 4.2) and hence are tested before the server
reflexive candidates.
Currently, the UPnP forum's Internet Gateway Device (IGD) protocol
[UPnP-IGD] and the NAT Port Mapping Protocol (PMP)
[I-D.cheshire-nat-pmp] are widely supported NAT-assisted techniques.
Other known protocols include SOCKS [RFC1928], Realm Specific IP
(RSIP) [RFC3103], and SIMCO [RFC4540]. Also, MIDCOM MIB [RFC5190]
defines an SNMP-based mechanism for controlling NATs.
5.4. UDP-Tunneled Candidates
UDP-tunneled NAT traversal techniques utilize the fact that UDP NAT
traversal is simpler and more efficient than TCP NAT traversal. With
these techniques, the TCP packets (or possibly complete IP packets)
are encapsulated in UDP packets. Because of the encapsulation these
techniques increase the overhead for the connection and may require
support from both of the endpoints, but on the other hand UDP
tunneling commonly results in reliable and fairly simple TCP NAT
traversal.
UDP-tunneled candidates can be encoded in the ICE offer and answer
either as relayed or server reflexive candidates, depending on
whether the tunneling protocol utilizes a relay between the hosts.
For example, the Teredo protocol [RFC4380] provides automatic UDP
tunneling and IPv6 interworking. The Teredo UDP tunnel is visible to
the host application as an IPv6 address and thus Teredo candidates
are encoded as IPv6 addresses.
5.5. Relayed Candidates
Relaying packets through a relay server is often the NAT traversal
technique that has the highest success probability: communicating via
a relay that is in the public Internet looks like normal client-
server communication for the NATs and that is supported in practice
by all existing NATs, regardless of their filtering and mapping
behavior. However, using a relay has several drawbacks, e.g., it
usually results in a sub-optimal path for the packets, the relay
needs to exist and it needs to be discovered, the relay is a possible
single point of failure, relaying consumes potentially a lot of
resources of the relay server, etc. Therefore, relaying is often
used as the last resort when no direct path can be created with other
NAT traversal techniques.
With relayed candidates the host commonly needs to obtain only a
passive candidate since any of the peer's server reflexive (and NAT-
assisted if the peer can communicate with the outermost NAT) active
candidates should work with the passive relayed candidate. However,
if the relay is behind a NAT or a firewall, also using active and S-O
candidates will increase success probability.
Relaying protocols capable of relaying TCP connections include TURN
TCP [I-D.ietf-behave-turn-tcp] and SOCKS [RFC1928] (which can also be
used for IPv4-IPv6 gatewaying [RFC3089]). It is also possible to
use, e.g., an SSH [RFC4250] tunnel as a relayed candidate if a
suitable server is available and the server permits this.
6. Receiving the Initial Offer
6.1. Verifying ICE Support
Since this specification does not define a lite mode for ICE TCP, a
lite implementation will include candidate attributes for its UDP
streams, but no such attributes for its TCP streams. An agent
receiving such an offer MUST proceed with ICE in this case. ICE will
be used for the UDP streams, and [RFC4145] procedures will be used
for the TCP streams. However, if the offer indicates a setup
direction of actpass, the answerer MUST utilize a=setup:active in the
answer. This is required to ensure proper directionality of
connection establishment to work through NAT.
Similarly, if an agent is lite, and receives an offer that includes
streams with TCP candidates, it will omit candidates from the answer
for those streams. This will cause [RFC4145] procedures to be used
for those streams. In this case, the offer will indicate a direction
of active, and the agent will use passive in its answer.
6.2. Forming the Check Lists
When forming candidate pairs, the following types of candidates can
be paired with each other:
   Local           Remote
   Candidate       Candidate
   ----------------------------
   tcp-so          tcp-so
   tcp-act         tcp-pass
   tcp-pass        tcp-act
When the agent prunes the check list, it MUST also remove any pair
for which the local candidate is tcp-pass. In addition, NAT-assisted
candidates MUST be pruned from the check list in the same way as
server reflexive candidates when the same address is also used as an
active host candidate.
The remainder of check list processing works as in the UDP case.
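The pairing and pruning rules of this section can be sketched as
follows. This is a minimal illustration, not part of the
specification: the Candidate record and the form_tcp_pairs function
are hypothetical names, and priority ordering, the NAT-assisted
pruning rule, and the rest of the check list processing of [RFC5245]
are omitted.

```python
from collections import namedtuple

# Hypothetical candidate record (not defined by the specification):
# a TCP candidate type plus a transport address.
Candidate = namedtuple("Candidate", ["tcp_type", "address", "port"])

# Allowed (local, remote) type combinations from the pairing table.
PAIRABLE = {
    ("tcp-so", "tcp-so"),
    ("tcp-act", "tcp-pass"),
    ("tcp-pass", "tcp-act"),
}


def form_tcp_pairs(local_candidates, remote_candidates):
    """Return the pairable (local, remote) pairs left after pruning."""
    pairs = []
    for local in local_candidates:
        for remote in remote_candidates:
            if (local.tcp_type, remote.tcp_type) not in PAIRABLE:
                continue
            # Pruning rule: drop any pair whose local candidate is
            # tcp-pass.
            if local.tcp_type == "tcp-pass":
                continue
            pairs.append((local, remote))
    return pairs
```

With one candidate of each type on both sides, only the
(tcp-act, tcp-pass) and (tcp-so, tcp-so) pairs remain in the local
check list.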
7. Connectivity Checks
7.1. STUN Client Procedures
7.1.1. Sending the Request
When an agent wants to send a TCP-based connectivity check, it first
opens a TCP connection if none yet exists for the 5-tuple defined by
the candidate pair for which the check is to be sent. This
connection is opened from the local candidate of the pair to the
remote candidate of the pair. If the local candidate is tcp-act, the
agent MUST open a connection from the interface associated with that
local candidate. This connection MUST be opened from an unallocated
port. For host candidates, this is readily done by connecting from
the candidate's interface. For relayed candidates, the agent uses
procedures specific to the relaying protocol.
Once the connection is established, the agent MUST utilize the shim
defined in RFC 4571 [RFC4571] for the duration this connection
remains open. The STUN Binding requests and responses are sent on
top of this shim, so that the length field defined in RFC 4571
precedes each STUN message. If TLS or DTLS-SRTP is to be utilized
for the media session, the TLS or DTLS-SRTP handshakes will take
place on top of this shim as well. However, they only start once ICE
processing has completed. In essence, the TLS or DTLS-SRTP
handshakes are considered a part of the media protocol. STUN is
never run within the TLS or DTLS-SRTP session.
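As an illustration of the shim, the sketch below prefixes a message
with the 16-bit network-order length field of RFC 4571. The function
names are hypothetical. The Binding request built here is only a bare
20-byte STUN header, whereas a real ICE connectivity check also
carries the attributes required by [RFC5245].

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442
STUN_BINDING_REQUEST = 0x0001


def frame_rfc4571(payload: bytes) -> bytes:
    """Prefix payload with the 16-bit length field used by RFC 4571."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a 16-bit length field")
    return struct.pack("!H", len(payload)) + payload


def build_binding_request(transaction_id: bytes) -> bytes:
    """Build a bare STUN Binding request header with no attributes.

    Illustrative only: the attributes an ICE check needs are left out.
    """
    if len(transaction_id) != 12:
        raise ValueError("STUN transaction ID must be 12 bytes")
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id
```

The same frame_rfc4571 call would later wrap TLS or DTLS-SRTP
handshake records and media, since all of them share the connection
with STUN.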
If the TCP connection cannot be established, the check is considered
to have failed, and a full-mode agent MUST update the pair state to
Failed in the check list.
Once the connection is established, client procedures are identical
to those for UDP candidates. Note that STUN responses received on an
active TCP candidate will typically produce a remote peer reflexive
candidate.
7.2. STUN Server Procedures
An agent MUST be prepared to receive incoming TCP connection requests
on any host, relayed, or UDP-tunneled TCP candidate that is
simultaneous-open or passive. When the connection request is
received, the agent MUST accept it. The agent MUST utilize the
framing defined in RFC 4571 [RFC4571] for the lifetime of this
connection. Due to this framing, the agent will receive data in
discrete frames. Each frame could be media (such as RTP or SRTP),
TLS, DTLS, or STUN packets. The STUN packets are extracted as
described in Section 10.2.
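Because TCP delivers a byte stream, the receiver has to reassemble
the discrete frames itself. A minimal sketch of such a deframer is
shown below; the class name is hypothetical and error handling (for
example, connection loss in the middle of a frame) is left out.

```python
class Rfc4571Deframer:
    """Incrementally splits a TCP byte stream into RFC 4571 frames."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data: bytes):
        """Add received bytes and return all frames completed so far."""
        self._buffer.extend(data)
        frames = []
        while len(self._buffer) >= 2:
            # The 16-bit length is in network byte order.
            length = int.from_bytes(self._buffer[:2], "big")
            if len(self._buffer) < 2 + length:
                break  # wait for the rest of this frame
            frames.append(bytes(self._buffer[2:2 + length]))
            del self._buffer[:2 + length]
        return frames
```

Each payload returned by feed() is then classified as STUN, TLS,
DTLS, or media by the agent.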
Once the connection is established, STUN server procedures are
identical to those for UDP candidates. Note that STUN requests
received on a passive TCP candidate will typically produce a remote
peer reflexive candidate.
8. Concluding ICE Processing
If there are TCP candidates for a media stream, a controlling agent
MUST use a regular selection algorithm.
When ICE processing for a media stream completes, each agent SHOULD
close all TCP connections except the ones between the candidate pairs
selected by ICE.
These two rules are related; closing connections on completion of
ICE implies that a regular selection algorithm has to be used. This
is because aggressive selection might cause transient pairs to be
selected. Once such a pair is selected, the agents would close the
other connections, one of which may be about to be selected as a
better choice. This race condition may result in TCP connections
being accidentally closed for the pair that ICE selects.
9. Subsequent Offer/Answer Exchanges
9.1. ICE Restarts
If an ICE restart occurs for a media stream with TCP candidate pairs
that have been selected by ICE, the agents MUST NOT close the
connections after the restart. In the offer or answer that causes
the restart, an agent MAY include a simultaneous-open candidate whose
transport address matches the previously selected candidate. If both
agents do this, the result will be a simultaneous-open candidate pair
matching an existing TCP connection. In this case, the agents MUST
NOT attempt to open a new connection (or start new TLS or DTLS-SRTP
procedures). Instead, that existing connection is reused and STUN
checks are performed.
Once the restart completes, if the selected pair does not match the
previously selected pair, the TCP connection for the previously
selected pair SHOULD be closed by the agent.
10. Media Handling
10.1. Sending Media
When sending media, if the selected candidate pair matches an
existing TCP connection, that connection MUST be used for sending
media.
The framing defined in RFC 4571 MUST be used when sending media. For
media streams that are not RTP-based and do not normally use RFC
4571, the agent treats the media stream as a byte stream, and assumes
that it has its own framing of some sort. It then takes an arbitrary
number of bytes from the byte stream, and places that as a payload in
the RFC 4571 frames, including the length. Next, the sender checks
to see if the resulting set of bytes would be viewed as a STUN packet
based on the rules in Sections 6 and 8 of [RFC5389]. This includes a
check on the most significant two bits, the magic cookie, the length,
and the fingerprint. If, based on those rules, the bytes would be
viewed as a STUN message, the sender SHOULD utilize a different
number of bytes so that the length checks will fail. Though it is
normally highly unlikely that an arbitrary number of bytes from a
byte stream would resemble a STUN packet based on all of the checks,
it can happen if the content of the application stream happens to
contain a STUN message (for example, a file transfer of logs from a
client which includes STUN messages).
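The sender-side check described above might look roughly like the
following sketch. The function names are hypothetical. As a
simplifying assumption, the FINGERPRINT check is omitted, so this
test flags somewhat more payloads as STUN than the full rules of
[RFC5389] would; for a sender that errs on the safe side.

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20


def looks_like_stun(data: bytes) -> bool:
    """Header-only STUN test (FINGERPRINT deliberately not checked)."""
    if len(data) < STUN_HEADER_LEN:
        return False
    msg_type, msg_len, cookie = struct.unpack("!HHI", data[:8])
    if msg_type & 0xC000:  # the two most significant bits must be zero
        return False
    if cookie != STUN_MAGIC_COOKIE:
        return False
    # The length excludes the header and is a multiple of four.
    return msg_len % 4 == 0 and msg_len == len(data) - STUN_HEADER_LEN


def pick_chunk_size(stream: bytes, preferred: int) -> int:
    """Return a payload size <= preferred that does not resemble STUN."""
    size = min(preferred, len(stream))
    while size > 0 and looks_like_stun(stream[:size]):
        size -= 1  # a different size makes the length check fail
    return size
```

For a byte stream that begins with a 20-byte STUN header,
pick_chunk_size(stream, 20) falls back to 19 bytes, so the receiver's
length check fails and the frame is treated as application data.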
If TLS or DTLS-SRTP procedures are being utilized to protect the
media stream, those procedures start at the point that media is
permitted to flow, as defined in the ICE specification [RFC5245].
The TLS or DTLS-SRTP handshakes occur on top of the RFC 4571 shim,
and are considered part of the media stream for purposes of this
specification.
10.2. Receiving Media
The framing defined in RFC 4571 MUST be used when receiving media.
For media streams that are not RTP-based and do not normally use RFC
4571, the agent extracts the payload of each RFC 4571 frame, and
determines whether it is STUN or application layer data based on the
procedures in ICE [RFC5245]. If media is being protected with
DTLS-SRTP, the DTLS, RTP, and STUN packets are demultiplexed as
described in Section 5.1.2 of [RFC5764].
For non-STUN data, the agent appends this to the ongoing byte stream
collected from the frames. It then parses the byte stream as if it
had been directly received over the TCP connection. This allows for
ICE TCP to work without regard to the framing mechanism used by the
application layer protocol.
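The receive path for non-RTP media can be sketched as below. The
names are hypothetical, and the classifier passed as is_stun stands
in for the full procedures of [RFC5245]; the simple cookie test shown
here is only an assumption made to keep the example short.

```python
def has_stun_cookie(payload: bytes) -> bool:
    """Simplified stand-in classifier: top bits zero and magic cookie."""
    return (
        len(payload) >= 20
        and payload[0] & 0xC0 == 0
        and payload[4:8] == b"\x21\x12\xa4\x42"
    )


def split_frames(payloads, is_stun):
    """Separate STUN messages from application data in frame payloads.

    Non-STUN payloads are concatenated, so the application sees the
    same byte stream it would have read directly from TCP.
    """
    stun_messages = []
    app_stream = bytearray()
    for payload in payloads:
        if is_stun(payload):
            stun_messages.append(payload)
        else:
            app_stream.extend(payload)
    return stun_messages, bytes(app_stream)
```

Note that the frame boundaries chosen by the sender carry no meaning
for the application; only the concatenated byte stream does.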
11. Connection Management
11.1. Connections Formed During Connectivity Checks
Once a TCP or TCP/TLS connection is opened by ICE for the purpose of
connectivity checks, its life cycle depends on how it is used. If
that candidate pair is selected by ICE for usage for media, an agent
SHOULD keep the connection open until:
o The session terminates
o The media stream is removed
o An ICE restart takes place, resulting in the selection of a
different candidate pair.
In these cases, the agent SHOULD close the connection when that event
occurs. This applies to both agents in a session, in which case
usually one of the agents will end up closing the connection first.
If a connection has been selected by ICE, an agent MAY close it
anyway. As described in the next paragraph, this will cause it to be
reopened almost immediately, and in the interim media cannot be sent.
Consequently, such closures have a negative effect and are NOT
RECOMMENDED. However, there may be cases where an agent needs to
close a connection for some reason.
If an agent needs to send media on the selected candidate pair, and
its TCP connection has closed, either on purpose or due to some
error, then:
o If the agent's local candidate is tcp-act or tcp-so, it MUST
reopen a connection to the remote candidate of the selected pair.
o If the agent's local candidate is tcp-pass, the agent MUST await
an incoming connection request, and consequently, will not be able
to send media until it has been opened.
If the TCP connection is established, the framing of RFC 4571 is
utilized. If the agent opened the connection, it MUST send a STUN
connectivity check. An agent MUST be prepared to receive a
connectivity check over a connection it opened or accepted (note that
this is true in general; ICE requires that an agent be prepared to
receive a connectivity check at any time, even after ICE processing
completes). If an agent receives a connectivity check after re-
establishment of the connection, it MUST generate a triggered check
over that connection in response if it has not already sent a check.
Once an agent has sent a check and received a successful response,
the connection is considered Valid and media can be sent (which
includes a TLS or DTLS-SRTP session resumption or restart).
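The recovery procedure above can be summarized in a small sketch.
All names and the returned action strings are hypothetical, and the
actual sending of STUN messages, including the response to a
received check, is outside its scope.

```python
def recovery_action(local_tcp_type: str) -> str:
    """Decide what to do when the selected pair's connection is gone."""
    if local_tcp_type in ("tcp-act", "tcp-so"):
        return "reopen"
    if local_tcp_type == "tcp-pass":
        return "await-incoming"
    raise ValueError("unknown TCP candidate type: " + local_tcp_type)


class ReestablishedConnection:
    """Tracks whether media may flow on a re-established connection."""

    def __init__(self, opened_locally: bool):
        self.opened_locally = opened_locally
        self.check_sent = False
        self.valid = False

    def on_established(self):
        """The side that opened the connection sends a check."""
        if self.opened_locally:
            self.check_sent = True
            return ["send-check"]
        return []

    def on_check_received(self):
        """A received check triggers one of ours unless already sent."""
        if not self.check_sent:
            self.check_sent = True
            return ["send-triggered-check"]
        return []

    def on_success_response(self):
        """A successful response to our check makes the pair Valid."""
        if self.check_sent:
            self.valid = True
        return self.valid
```

Media, including any TLS or DTLS-SRTP resumption, would be sent only
once the valid flag is set.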
If the TCP connection cannot be established, the controlling agent
SHOULD restart ICE for this media stream. Such a failure occurs when
one of the agents is behind a NAT with connection-dependent mapping
properties [RFC5382].
11.2. Connections Formed for Gathering Candidates
If the agent opened a connection to a STUN server, or another similar
server, for the purposes of gathering a server reflexive candidate,
that connection SHOULD be closed by the client once ICE processing
has completed. This happens regardless of whether the candidate
learned from the server was selected by ICE.
If the agent opened a connection to a TURN server for the purposes of
gathering a relayed candidate, that connection MUST be kept open by
the client for the duration of the media session if:
o A relayed candidate learned from the TURN server was selected by
ICE,
o or an active candidate established as a consequence of a Connect
request sent through that TCP connection was selected by ICE.
Otherwise, the connection to the TURN server SHOULD be closed once
ICE processing completes.
If, despite efforts of the client, a TCP connection to a TURN server
fails during the lifetime of the media session utilizing a transport
address allocated by that server, the client SHOULD reconnect to the
TURN server, obtain a new allocation, and restart ICE for that media
stream. Similar measures SHOULD also apply to other types of relaying
servers.
12. Security Considerations
The main threat in ICE is hijacking of connections for the purposes
of directing media streams to DoS targets or to malicious users. ICE
TCP prevents that by only using TCP connections that have been
validated. Validation requires a STUN transaction to take place over
the connection. This transaction cannot complete without both
participants knowing a shared secret exchanged in the rendezvous
protocol used with ICE, such as SIP [RFC3261]. This shared secret,
in turn, is protected by that protocol exchange. In the case of SIP,
the usage of the sips mechanism is RECOMMENDED. When this is done,
an attacker, even if it knows or can guess the port on which an agent
is listening for incoming TCP connections, will not be able to open a
connection and send media to the agent.
A more detailed analysis of this attack and the various ways ICE
prevents it is given in [RFC5245]. Those considerations apply to
this specification.
13. IANA Considerations
There are no IANA considerations associated with this specification.
14. Acknowledgements
The authors would like to thank Tim Moore, Saikat Guha, Francois
Audet, Roni Even, Simon Perreault, and Alfred Heggestad for the
reviews and input on this document.
15. References
15.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, March 2004.
[RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in
the Session Description Protocol (SDP)", RFC 4145,
September 2005.
[RFC4571] Lazzaro, J., "Framing Real-time Transport Protocol (RTP)
and RTP Control Protocol (RTCP) Packets over Connection-
Oriented Transport", RFC 4571, July 2006.
[RFC4572] Lennox, J., "Connection-Oriented Media Transport over the
Transport Layer Security (TLS) Protocol in the Session
Description Protocol (SDP)", RFC 4572, July 2006.
[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment
(ICE): A Protocol for Network Address Translator (NAT)
Traversal for Offer/Answer Protocols", RFC 5245,
April 2010.
[RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer
Security (DTLS) Extension to Establish Keys for the Secure
Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.
15.2. Informative References
[RFC1928] Leech, M., Ganis, M., Lee, Y., Kuris, R., Koblas, D., and
L. Jones, "SOCKS Protocol Version 5", RFC 1928,
March 1996.
[RFC3089] Kitamura, H., "A SOCKS-based IPv6/IPv4 Gateway Mechanism",
RFC 3089, April 2001.
[RFC3103] Borella, M., Grabelsky, D., Lo, J., and K. Taniguchi,
"Realm Specific IP: Protocol Specification", RFC 3103,
October 2001.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E.
Schooler, "SIP: Session Initiation Protocol", RFC 3261,
June 2002.
[RFC4250] Lehtinen, S. and C. Lonvick, "The Secure Shell (SSH)
Protocol Assigned Numbers", RFC 4250, January 2006.
[RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through
Network Address Translations (NATs)", RFC 4380,
February 2006.
[RFC4540] Stiemerling, M., Quittek, J., and C. Cadar, "NEC's Simple
Middlebox Configuration (SIMCO) Protocol Version 3.0",
RFC 4540, May 2006.
[RFC4975] Campbell, B., Mahy, R., and C. Jennings, "The Message
Session Relay Protocol (MSRP)", RFC 4975, September 2007.
[RFC5190] Quittek, J., Stiemerling, M., and P. Srisuresh,
"Definitions of Managed Objects for Middlebox
Communication", RFC 5190, March 2008.
[RFC5382] Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
RFC 5382, October 2008.
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
"Session Traversal Utilities for NAT (STUN)", RFC 5389,
October 2008.
[RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
Relays around NAT (TURN): Relay Extensions to Session
Traversal Utilities for NAT (STUN)", RFC 5766, April 2010.
[I-D.ietf-behave-turn-tcp]
Perreault, S. and J. Rosenberg, "Traversal Using Relays
around NAT (TURN) Extensions for TCP Allocations",
draft-ietf-behave-turn-tcp-07 (work in progress),
July 2010.
[I-D.cheshire-nat-pmp]
Cheshire, S., "NAT Port Mapping Protocol (NAT-PMP)",
draft-cheshire-nat-pmp-03 (work in progress), April 2008.
[UPnP-IGD]
Warrier, U., Iyer, P., Pennerath, F., Marynissen, G.,
Schmitz, M., Siddiqi, W., and M. Blaszczak, "Internet
Gateway Device (IGD) Standardized Device Control Protocol
V 1.0", November 2001.
[IMC05] Guha, S. and P. Francis, "Characterization and Measurement
of TCP Traversal through NATs and Firewalls", Proceedings
of the 5th ACM SIGCOMM conference on Internet Measurement,
2005.
Appendix A. Limitations of ICE TCP
Compared to UDP-based ICE, ICE TCP in general has a lower probability
of establishing connectivity without a relay if both of the hosts are
behind a NAT. This happens because many of the currently deployed
NATs have endpoint-dependent mapping behavior or do not support the
flow of TCP handshake packets seen in TCP simultaneous-open. For
example, some NATs do not allow incoming TCP SYN packets from an
address to which a SYN packet has recently been sent, or they do not
process the subsequent SYNACK properly.
It has been reported in [IMC05] that with the population of NATs
deployed at the time of the measurements (2005), the simultaneous-open
technique worked in roughly 45% of the cases. Also, not all
operating systems implement TCP simultaneous-open properly, and
those that do not are unable to use such candidates. However, if and
when more NATs comply with the requirements set by [RFC5382] and
operating system TCP stacks are fixed, the success probability of
simultaneous-open is likely to increase.
Alternatively, using unidirectional opens (where one side is active
and the other is passive) is more reliable, but will commonly require
a relay if both sides are behind different NATs. Therefore, in the
spirit of the ICE philosophy, both simultaneous-open and
unidirectional candidates are tried. Simultaneous-open candidates
are preferred since, if they work, they do not require a relay even
when both sides are behind different NATs.
Appendix B. Implementation Considerations for BSD Sockets
This specification requires unusual handling of TCP connections, the
implementation of which in traditional BSD socket APIs is non-
trivial.
In particular, ICE requires an agent to obtain a local TCP candidate,
bound to a local IP and port, and then from that local port, initiate
a TCP connection (e.g., to the STUN server, in order to obtain server
reflexive candidates, to the TURN server, to obtain a relayed
candidate, or to the peer as part of a connectivity check), and be
prepared to receive incoming TCP connections (for passive and
simultaneous-open candidates). A "typical" BSD socket is used either
for initiating or receiving connections, and not for both. The code
required to allow incoming and outgoing connections on the same local
IP and port is non-obvious. The following pseudocode, contributed by
Saikat Guha, has been found to work on many platforms:
   for i in 0 to MAX
      sock_i = socket()
      set(sock_i, SO_REUSEADDR)
      bind(sock_i, local)

   listen(sock_0)
   connect(sock_1, stun)
   connect(sock_2, remote_a)
   connect(sock_3, remote_b)
The key here is that, prior to the listen() call, the full set of
sockets that need to be utilized for outgoing connections must be
allocated and bound to the local IP address and port. This number,
MAX, represents the maximum number of TCP connections to different
destinations that might need to be established from the same local
candidate. This number can be potentially large for simultaneous-
open candidates. If a request forks, ICE procedures may take place
with multiple peers. Furthermore, for each peer, connections would
need to be established to each passive or simultaneous-open candidate
for the same component. If we assume a worst case of five forked
branches and, for each peer, five simultaneous-open candidates, that
results in MAX=25.
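For illustration, the bind-before-listen ordering can be expressed
with the Python socket module as in the sketch below. This is not a
definitive implementation: the function name is hypothetical, IPv4 is
assumed, and whether several sockets may share one local address and
port is platform dependent. As a further assumption, SO_REUSEPORT is
set where the platform offers it, since some BSD-derived stacks are
understood to require it for identical binds.

```python
import socket


def open_shared_sockets(local, count):
    """Bind `count` TCP sockets to one local address, then listen.

    `local` is an (address, port) tuple; port 0 lets the first bind
    choose a port that the remaining sockets then reuse.
    """
    socks = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if hasattr(socket, "SO_REUSEPORT"):
            # Assumption: needed on some platforms for identical binds.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind(local)
        local = s.getsockname()  # later sockets reuse the chosen port
        socks.append(s)
    # Only after every socket is bound does one of them start listening.
    socks[0].listen(5)
    return socks
```

The first socket accepts incoming connections, while each remaining
socket can be passed to connect() for one destination (the STUN
server or a remote candidate), so count corresponds to MAX plus the
listening socket.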
Authors' Addresses
Jonathan Rosenberg
Skype
Email: jdrosen@jdrosen.net
URI: http://www.jdrosen.net
Ari Keranen
Ericsson
Hirsalantie 11
02420 Jorvas
Finland
Email: ari.keranen@ericsson.com
Bruce B. Lowekamp
Skype
Email: bbl@lowekamp.net
Adam Roach
Tekelec
17210 Campbell Rd.
Suite 250
Dallas, TX 75252
US
Email: adam@nostrum.com