Last Call Review of draft-ietf-ice-rfc5245bis-16
review-ietf-ice-rfc5245bis-16-tsvart-lc-westerlund-2018-01-30-00

Request Review of draft-ietf-ice-rfc5245bis
Requested rev. no specific revision (document currently at 17)
Type Last Call Review
Team Transport Area Review Team (tsvart)
Deadline 2018-01-26
Requested 2018-01-12
Other Reviews Genart Last Call review of -16 by Stewart Bryant (diff)
Secdir Last Call review of -16 by Stephen Farrell (diff)
Opsdir Last Call review of -16 by Qin Wu (diff)
Genart Telechat review of -17 by Stewart Bryant
Secdir Telechat review of -17 by Stephen Farrell
Review State Completed
Reviewer Magnus Westerlund
Review review-ietf-ice-rfc5245bis-16-tsvart-lc-westerlund-2018-01-30
Posted at https://mailarchive.ietf.org/arch/msg/tsv-art/S9LLiLgIikeUwx6L14bIGFRzg5M
Reviewed rev. 16 (document currently at 17)
Review result Ready with Issues
Draft last updated 2018-01-30
Review completed: 2018-01-30

Review
review-ietf-ice-rfc5245bis-16-tsvart-lc-westerlund-2018-01-30

I've reviewed this document as part of the transport area directorate's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors for their information and to allow them to address any issues raised. When done at the time of IETF Last Call, the authors should consider this review together with any other last-call comments they receive. Please always CC tsv-dir@… if you reply to or forward this review.

Unfortunately there was an error in the datatracker that resulted in that review requests was not timely sent. Thus, I got this request the day before the deadline. Therefore I am late with the review. 

I know that this is an update to an published RFC. But, I have reviewed it on a general level without considering what is changed or not.

This draft is on the right track but has open issues, described in the review.

Significant Issues:


A. Section 5.2:

    Lite implementations only utilize host candidates.  A lite
    implementation MUST, for each component of each data stream, allocate
    zero or one IPv4 candidates.  It MAY allocate zero or more IPv6
    candidates, but no more than one per each IPv6 address utilized by
    the host.  Since there can be no more than one IPv4 candidate per
    component of each data stream, if an ICE agent has multiple IPv4
    addresses, it MUST choose one for allocating the candidate.  If a
    host is dual-stack, it is RECOMMENDED that it allocate one IPv4
    candidate and one global IPv6 address.  With the lite implementation,
    ICE cannot be used to dynamically choose amongst candidates.
    Therefore, including more than one candidate from a particular scope
    is NOT RECOMMENDED, since only a connectivity check can truly
    determine whether to use one address or the other.

I find it quite strange that the above text says there can only be single IPv4 based candidate, while for IPv6 a LITE implementation may have one candidate per IPv6 address. Isn't the LITE implication of having multiple candidates for the same address family similar? Yes, IPv6 kind of forces the need for dealing with multiple IPv6 addresses on any host. However, I can see that certain servers will actually be multi-homed in IPv4 and thus can in a sensible way actually have multiple IPv4 candidates, and let the clients select which interface has the best reachability.

Can you please be explicit on what in ICE prevents things to work for IPv4 but the same case works for IPv6?

B. Section 6.1.1:

    An agent MUST be prepared that the peer might re-determine the roles
    as part of any ICE restart, even if the criteria for doing so are not
    fulfilled.  This can happen if the peer is compliant with an older
    version of this specification.

What does it mean to be prepared for a peer that re-determine the roles? What is it one MUST do? If the peer changes its role upon an ICE restart, isn't that going to result in a role mismatch? Thus causing yet another ICE restart, where also this peer will re-evalute? Isn't that good enough? Or is it something else it can do?

C. Section 6.1.3:

    The ICE agent has a state determined by the state of the check lists.
    The state is Completed if all check lists are Completed, Failed if
    all check lists are Failed, and Running otherwise.

Does failed really require all the checklists to fail, or simply any to fail if the others are completed?

D. Section 6.1.4.2:

I don't know if I misunderstand the algorithm here in the bullet list. To me it appears that it will terminate prior to have initiated all possible tests, as it appears that it will not unfreeze some of the candidate pairs. If one have tests running for a foundation, but all other candidate checks have been started, then the steps are aborted. Is the bullet list rechecked every Ta?

E. Section 7.2.5.2.2.  ICMP Error

    An ICE agent MAY support processing of ICMP errors for connectivity
    checks.  If the agent supports processing of ICMP errors, and if a
    Binging request generates an ICMP error, the agent SHOULD set the
    state of the candidate pair to Failed.

I am a bit worried by this blanket statement on ICMP errors. I think it should be clarified which ICMP message types that are relevant to consider as errors? I assume Type 3 (Destination Unreachable) but maybe not all responde codes as Codes 4, 11,12 may be addressable in other ways, and likely Type 11 (Time exceeded) with response code 0, response code 1 is not a clear indication of a non working path.

F. Section 7.2.5.2.3.  Timeout

    If the Binding request times out, the ICE agent MUST set the
    candidate pair state to Failed.

Isn't this erroneous? Timeout for the connectivity check is happening when all the (re-)transmissions have timed out, isn't it? or as simple as missing the word "transaction"?

G. Section 14.3:

     Num-Of-Pairs: the number of pairs of candidates
     with STUN or TURN servers.

I don't understand this definition. What does "with STUN or TURN servers" mean? Candidate pairs where the local side is server-reflexive or relay? 

H. Section 14.3:

     Num-Waiting: the number of checks in the check list in the
     Waiting state.

     Num-In-Progress: the number of checks in the In-Progress state.


Is "the number of checks" only per single checklist or across all the check lists?

I. Section 17.2.3:

When VAD is being
   used, keepalives will be sent during silence periods.

I would claim that this is only true for when VAD without any comfort noise is used. A lot of codecs with VAD operations still generates comfort noise on a frequency of a couple packets a second, way more often then the minimal for ICE keep-alives. 






Minor/Editorial Issues:


1.  Section 5.1.2:

This section doesn't make it clear that higher priority values are more prioritized over lower values. That really should be defined here. Now that information only becomes evident implicitly in section 5.1.2.1.


2. Section 2.1.


    In order to execute ICE, an ICE agent has to identify all of its
    address candidates.

I think this sentence is raising a too high requirement. An ICE agent has attempt to identify as many of the address candidates as possible. The better coverage of the potential candidates the more likely it is to function. I would also argue that there are multiple cases where you will not figure out that there are candidates that you don't know about. An obvious example is in cases of two NATs between the local address realm and the STUN server, the agent can't figure out that there was an address given to the flow in the middle address realm between the two NATs. That can be only learn if the agent has a STUN server in that address realm. Secondly, there are cases where policy may be applied to exclude certain interfaces and their related candidates.

I also noted that this first paragraph and the second has a strange relation. The first part of first paragraph is general, then there is the part of the host candidate. Then the second part starts with STUN and TURN derived candidates. Maybe the first paragraph should be split between the general and the host part, or some other bridging is needed.

3. Section 2.3:

If the transactions above succeed, the agents will set
    the nominated flag for the pairs, and will cancel any future checks
    for that component of the data stream.

Although what is stated is normal, it is not guaranteed to happen, I know this is intended as a simplified overview, but ignoring that there can occur some shuffling back and forth if high priority checks complete after low priority ones should at least be hinted or at least allowed by the use of words.

4. Section 3.

As RFC 7825 do describe a significant enough different usage of ICE from SIP, I think it would be good to actually included an informational reference to this usage.

5. Section 5.1.1.4:

An ICE agent SHOULD
    monitor the interfaces it uses, invalidate candidates whose base has
    gone away, and acquire new candidates as appropriate when new
    interfaces appear.

I am missing discussion of new addresses here. If the base disappears, it might be that there is a new IP address that one should use. That doesn't necessary imply a new interface.

6. Section 7.2.5.1:

    If the Binding request generates a 487 (Role Conflict) error
    response, and if the ICE agent included an ICE-CONTROLLED attribute
    in the request, the agent MUST switch to the controlling role. If
    the agent included an ICE-CONTROLLING attribute in the request, the
    agent MUST switch to the controlled role.

I think the first sentence should have a forward reference to Section 7.3.1.1 where the rest of the solution is described.

7. Section 7.3:

If the agent is using Diffserv Codepoint markings [RFC2475] in its
    data packets, it SHOULD apply the same markings to Binding responses.

I find this sentence a bit unclear. Is it intended to say:

If the agent receiving the binding request, intended to use DSCP markings !=0 for the data, it SHOULD set, the same marking to binding responses.

or

If the agent receives a binding request with DSCP markings, then it should apply to corresponding code point when forming the binding response?

There are unclarity of which agent is referenced and whom "it" is in the sentence.

8. Section 8.3.1:

   The procedures in Section 8 require that an ICE agent continue to
   listen for STUN requests and continue to generate triggered checks
   for a data stream, even once processing for that stream completes.

That reference to Section 8, should that in fact be to Section 8.1 specifically? It looks strange with a self reference, which in some aspect a reference to section 8 means. 

9. Section 15:

4.57566E+18 (note that
   an implementation would represent this as a 64-bit integer so as not
   to lose precision).

Why the floating point representation? Priorities are integer numbers and thus should be presented as such in this example. 

10. Section 17.2.1:

   First and foremost, ICE makes use of TURN and STUN servers, which
   would typically be located in the network operator's data centers.

Is really network operator's data centers the right entity here? I would claim it is the service operators data centers, they may contract STUN services from network operators, but if the service operator isn't providing any STUN service for its own service, then ICE is unlikely to work. 

11. Section 18.2:

Local IPv6 addresses can be preferred.

I think this sentence needs to clarify that it means local to host, rather than any form of relayed or translated address, rather than a local scope only IPv6 address. 

12. Section 18.5:

   A number of NAT boxes are now being deployed into the market that try
   to provide "generic" ALG functionality.  These generic ALGs hunt for
   IP addresses, either in text or binary form within a packet, and
   rewrite them if they match a binding. 

Are actually these generic ALG functionality relevant today? They proved to be a very bad idea very quickly. A note that this was a consideration at the time RFC 5245 was published. 


Cheers

Magnus Westerlund