Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal


(Ben Campbell)

No Objection

Alvaro Retana
(Alia Atlas)
(Alissa Cooper)
(Benoît Claise)
(Deborah Brungard)
(Spencer Dawkins)
(Terry Manderson)

Note: This ballot was opened for revision 17 and is now closed.

Alvaro Retana
No Objection
Warren Kumari
No Objection
Comment (2018-02-21 for -17)
Please also see Qin Wu's OpsDir review - it contains useful editorial comments.
Ben Campbell Former IESG member
Yes (for -17)

Adam Roach Former IESG member
(was Discuss) No Objection
No Objection (2018-03-12)
Thanks for addressing my discuss and comments.
Alexey Melnikov Former IESG member
No Objection
No Objection (2018-02-22 for -17)
I skimmed the document. I trust my ART co-ADs to do a better job at finding issues (if any).
Alia Atlas Former IESG member
No Objection
No Objection (for -17)

Alissa Cooper Former IESG member
No Objection
No Objection (for -17)

Benoît Claise Former IESG member
No Objection
No Objection (for -17)

Deborah Brungard Former IESG member
No Objection
No Objection (for -17)

Eric Rescorla Former IESG member
No Objection
No Objection (2018-02-18 for -17)
Review in context at:


*  If the agent's tie-breaker value is larger than or equal to the
         contents of the ICE-CONTROLLING attribute, the agent generates
IMPORTANT: This algorithm seems like it's not going to work properly.
Consider the case where A and B happen to have the same tie-breaker
and both think they are controlling, and the Binding Requests
cross. Each now sends a 487 and then they switch to
controlled. Ugh. Unless I'm missing something, if the tie-breakers
match, you are stuck. Given that the chance is 2^{-64} this seems
to not be a critical failing, but the algorithm still seems wrong.
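
A minimal sketch of the crossed-request scenario described in this comment. The `Agent` class and its method names are hypothetical and model only the role-conflict rules as summarized here: a controlling agent whose tie-breaker is larger than or equal to the received ICE-CONTROLLING value answers with a 487 and keeps its role, and an agent that receives a 487 for a request carrying ICE-CONTROLLING switches to controlled.

```python
from dataclasses import dataclass

ROLE_CONFLICT = 487  # STUN error code named in the comment above


@dataclass
class Agent:
    # Hypothetical model of an ICE agent; only role handling is represented.
    name: str
    tie_breaker: int
    role: str = "controlling"

    def on_request_with_ice_controlling(self, peer_tie_breaker):
        """Handle a Binding request that carries ICE-CONTROLLING.

        Returns 487 if this agent keeps the controlling role, otherwise None.
        """
        if self.role != "controlling":
            return None  # no conflict to resolve
        if self.tie_breaker >= peer_tie_breaker:
            return ROLE_CONFLICT  # keep role, reject the peer's request
        self.role = "controlled"  # smaller tie-breaker yields
        return None

    def on_response(self, error_code):
        """A 487 for a request that carried ICE-CONTROLLING forces a switch."""
        if error_code == ROLE_CONFLICT:
            self.role = "controlled"


def crossed_requests(a, b):
    """Both requests are in flight before either response is processed."""
    reply_from_b = b.on_request_with_ice_controlling(a.tie_breaker)
    reply_from_a = a.on_request_with_ice_controlling(b.tie_breaker)
    a.on_response(reply_from_b)
    b.on_response(reply_from_a)
    return a.role, b.role
```

Under these assumptions, distinct tie-breakers leave exactly one controlling agent, while equal tie-breakers make both agents send a 487 and both end up controlled, which is the stuck state the comment points out.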

   single solution that is flexible enough to work well in all
This section seems pretty dated. Aren't ICE and ICE-style solutions pretty much the de facto standard now?

   may not be aware of it.  The type of NAT and its properties are also
   unknown.  L and R are capable of engaging in an candidate exchange
   process, whose purpose is to set up a data session between L and R.
Nit: "a candidate exchange". This appears to be a result of changing "offer/answer" (where "an" was appropriate) to "candidate".

   At least one viable candidate has a transport address obtained
   directly from a local interface.  Such a candidate is called a host
Nit: this is awkward. Perhaps "The first category of candidates are those with a transport ..."

   When the agent sends the TURN Allocate request from IP address and
   port X:x, the NAT (assuming there is one) will create a binding
Nit: "a TURN Allocate"

   the next candidate pair on the list periodically.  These are called
   ordinary checks.  When a STUN transaction succeeds, one or more
   candidate pairs will become so called valid pairs, and will be added
Nit: I would quote "ordinary check" here and "triggered check" below.

   provide means to exchange candidate information between the ICE
   agents.  The specific details of (i.e how to encode candidate
   information and the actual candidate exchange process) for different
Nit: i.e -> i.e.,

   Nomination, Regular Nomination:  The process of the controlling agent
      indicating to the controlled agent which candidate pair the ICE
Given that you have removed Aggressive Nomination, do you still need to refer to "Regular Nomination"?

   candidate gathering, (2) candidate prioritization, (3) redundant
   candidate elimination, and (4) sending of the candidates to the peer.
This is an odd diagram. There's no reason why these have to happen in sequence, and in fact in Trickle ICE they don't, so this diagram seems misleading, as well as potentially contradicting the beginning of S 5.1.1.

"An ICE agent gathers candidates when it believes that communication is imminent. "

   When candidates are obtained, unless the agent knows for sure that
   RTP/RTCP multiplexing will be used (i.e. the agent knows that the
   other agent also supports, and is willing to use, RTP/RTCP
Nit: "i.e.,"

      addresses that do allow location tracking, that are configured on
      the same interface, and are part of the same network prefix MUST
      NOT be gathered.
You need to remove both of these commas, because they indicate a nonrestrictive clause, but this is a restrictive clause.

   The gathering process is controlled using a timer, Ta.  Every time Ta
   expires, the agent can generate another new STUN or TURN transaction.
No comma here.
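
As an illustration of the pacing rule in the quoted text, here is a small sketch that spaces gathering transactions by Ta. The function name `transaction_send_times` is hypothetical, and the 5 ms floor is taken from the draft text quoted later in this ballot ("Ta has a lower bound of 5 ms"); nothing here models retransmissions or per-transaction timeouts.

```python
from typing import List

MIN_TA_MS = 5  # lower bound on Ta, per the draft text quoted in this ballot


def transaction_send_times(count: int, ta_ms: int) -> List[int]:
    """Return send offsets in milliseconds for `count` paced transactions.

    One new STUN or TURN transaction may start each time Ta expires, so the
    i-th transaction starts i * Ta after the first one.
    """
    if ta_ms < MIN_TA_MS:
        raise ValueError("Ta must not be smaller than 5 ms")
    return [i * ta_ms for i in range(count)]
```

For example, four transactions with Ta set to 50 ms start at offsets 0, 50, 100, and 150 ms.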

   The agent will receive a Binding or Allocate response.  A successful
   Allocate response will provide the agent with a server reflexive
Or nothing or an error.

              (2^8)*(IP precedence) +
              (2^0)*(256 - component ID)
Isn't this the same formula as in S

      Foundation:  A sequence of up to 32 characters.
The Foundation is never transmitted, AFAIK. So why does it have to be up to 32 characters? It certainly wasn't exchanged in 5245.

   data stream, and for updating the peer with the ICE's selection, when
   needed.  The controlled agent is told which candidate pairs to use
   for each data stream, and does not require updating the peer to
Told by who?

   pair priorities, orders pairs by priority, prunes pairs, removes
   lower-priority pairs, and sets check list states.  If candidates are
   added to a check list (e.g, due to detection of peer reflexive
Please fix your subject verb agreement here.

      pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)
This was kinda terrible in 5245. Given that you use it once, maybe just have

+ GT(G, D)

And then say GT(G, D) == 1 if G>D and 0 otherwise.
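
The suggested rewrite can be sketched as follows. The function names `gt` and `pair_priority` are illustrative only; G and D are the priorities of the controlling and controlled agents' candidates, as in the quoted formula.

```python
def gt(g: int, d: int) -> int:
    """GT(G, D): 1 if G > D, and 0 otherwise."""
    return 1 if g > d else 0


def pair_priority(g: int, d: int) -> int:
    """Pair priority as in the quoted formula, with GT replacing (G>D?1:0)."""
    return 2**32 * min(g, d) + 2 * max(g, d) + gt(g, d)
```

Swapping G and D changes only the final GT term, so two pairs with the same two candidate priorities differ by at most one.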

                       Figure 8: Initial Pair State
This figure caption is kind of a mess. I suggest just removing it.

   state in the check list set has been processed, the first check list
   is picked again.  Etc.
Nit: "again, etc."

   pair to the remote candidate of the pair, as described in
   Section 7.2.4.
IMPORTANT: You don't just send a STUN request, you start a STUN transaction,

   lists.  On the other hand the responding agent either performs the
   triggered or ordinary checks as described above.
I don't understand this paragraph. What distinction are you trying to draw?

   o  The base is local candidate of the candidate pair from which the
      Binding request was sent.
Nit: "is the local"

   The ICE agent constructs a candidate pair whose local candidate
   equals the mapped address of the response, and whose remote candidate
IMPORTANT: When does this happen?

   a different check list than the one to which the pair that generated
   the connectivity checks), or it may be a pair not currently in any
   check list.
IMPORTANT: How would a valid pair be on some other check list?

      this specification.  There may be a conflict, but it cannot be
What previous version? This was required in 5245. Maybe at this point we can just deprecate this?

Learning Peer Reflexive Candidates
This entire section seems to duplicate

         in-progress transaction.  Cancellation means that the agent
         will not retransmit the request, will not treat the lack of
         response to be a failure, but will wait the duration of the
Why are you cancelling In-Progress checks when you receive a peer-reflexive check? If you receive two in a row, then it seems like this delays a successful check. More generally, this document should explain how you end up in this situation: you only get here when "the source transport address of the request does not match any existing remote candidate", so how can it be on a check list unless this is the second observation of a peer reflexive candidate?

   Prior to nominating, the controlling agent let connectivity checks
   continue until some stopping criterion is met.  After that, based on

   The criterion details for stopping the connectivity checks and for
   selecting a pair for nomination, are outside the scope of this
"criterion details" seems ungrammatical. 5245 had "criteria". What's wrong with that?

   have ceased using a given local candidate (a candidate may be used by
   multiple ICE sessions, e.g. in forking scenarios), the agent can free
   that candidate.  The three-second delay handles cases when aggressive
Nit: "e.g.,"

   Session Description Protocol (SDP) [RFC4566] is defined in
Presumably you want to cite 5245 S 14, which states:

Consequently, when a controlling agent is communicating with a peer
that supports options it doesn't know about, the agent MUST run a
regular nomination algorithm.  When regular nomination is used, ICE
will converge perfectly even when both agents use different pair
prioritization algorithms.

   15 seconds.  Agents MAY use a bigger value, but MUST NOT use a value
   smaller than 15 seconds.
This is a very old number. Is it supported by any modern measurement?

   will converge perfectly even when both agents use different pair
   prioritization algorithms.  One of the keys to such convergence is
   triggered checks, which ensure that the nominated pair is validated
Given that you have removed aggressive nomination, you presumably want to revise this section.

16.  STUN Extensions
None of the stuff here is "New" any longer, as it was allocated in RFC 5245.

   First and foremost, ICE makes use of TURN and STUN servers, which
   would typically be located in the data centers.  The STUN servers
   require relatively little bandwidth.  For each component of each data
Nit: this used to read "the network operators data centers", and when you removed "network operators" it became ungrammatical.

   there will be four transactions per call (one for RTP and one for
   RTCP, for both caller and callee).  Each transaction is a single
   request and a single response, the former being 20 bytes long, and
Is this currently true? How many people still don't support RTP-MUX?

   can and will vary over time.  In a network with 100% behave-compliant
   NAT, it is exactly zero.  At time of writing, large-scale consumer
   deployments were seeing between 5 and 10 percent of calls requiring
This text dates to 5245. Is that still true?

   something incorrect about the results of the connectivity checks.
   The possible false conclusions an attacker can try and cause are:
IMPORTANT: This section would be a lot stronger if it was factored by attacker capabilities as well. As-is it is very hard to understand.

   possible attack.  Fortunately, this attack is mitigated completely
   through the STUN short-term credential mechanism.  The attacker needs
   to inject a fake response, and in order for this response to be
This text is a bit confusing. If you can generate a drop, then you can mount this attack.

   invalid attack, this attack is mitigated by the STUN short-term
   credential mechanism in conjunction with a secure candidate exchange.
This also isn't quite correct. Consider the case where A is behind a filtering element (e.g., a firewall) and shares a broadcast network with the attacker. The attacker then captures an outgoing message that would have been filtered and tunnels it to B, and then tunnels the response back. This can cause B and A to think they have a valid path when they do not. It seems like this attack is described several paragraphs down, but in sort of a confusing way.

19.4.1.  STUN Amplification Attack
It's probably worth noting that this form of attack is a lot worse when WebRTC is involved.

   rate at which they will create new bindings.  Experiments have shown
   that once every 5 ms is well supported.  This is why Ta has a lower
   bound of 5 ms.  Furthermore, transmission of these packets on the
Citation needed.
Kathleen Moriarty Former IESG member
No Objection
No Objection (2018-02-20 for -17)
Thanks for addressing Stephen's SecDir review.
Mirja Kühlewind Former IESG member
No Objection
No Objection (2018-02-16 for -17)
Important nits; please address:

1) Sec 14.2: RFC 6928, RFC 5389, and RFC 6298 should be actual references and listed in the reference section.

2) Sec 14.2: Maybe you should really not use normative language in the reasoning part of this section?!
    I find especially this SHOULD in the following sentence confusing (as it doesn't give a reference to another RFC that defines this):
   "Let MinPacing be the minimum pacing interval between
          transactions, which SHOULD be 5ms."
   Given that the previous text says:
  "all transactions from all agents [...] MUST NOT be sent more often than once
   every 5ms"
  My recommendation: please use lowercase "should" and "must" in the reasoning part!

3) Sec 19.4.1: "(say, Ta=20ms)"
Maybe also use a value of 50ms here to avoid confusion.
Spencer Dawkins Former IESG member
No Objection
No Objection (for -17)

Suresh Krishnan Former IESG member
(was Discuss) No Objection
No Objection (2018-03-04 for -18)
Thanks for addressing my DISCUSS and for adding an additional example.
Terry Manderson Former IESG member
No Objection
No Objection (for -17)