Ballot for draft-ietf-hip-native-nat-traversal

Discuss [Treat as non-blocking comment] (2018-05-04 for -28) Unknown

Rich version of this review at:
https://mozphab-ietf.devsvcdev.mozaws.net/D3099


I am very familiar with ICE and yet I found this document extremely
hard to follow. The problem is that it cherry-picks pieces of ICE and
I'm just not sure that it's a complete specification when put all
together. I have noted a number of places where I actually am not sure
how to implement something, and fixing those will resolve this
DISCUSS, but IMO you really should totally rewrite this document
either (a) as a variant of ICE or (b) as an entirely new document not
with a pile of new text and then references out to ICE sections.

DETAIL
S 4.2.
>      request type SHOULD NOT create any state at the Control Relay Server.
>   
>      ICE guidelines [I-D.ietf-ice-rfc5245bis] for candidate gathering are
>      followed here.  A number of host candidates (loopback, anycast and
>      others) should be excluded as described in the ICE specification
>      [I-D.ietf-ice-rfc5245bis].  Relayed candidates SHOULD be gathered in

If you're going to normatively cherry-pick ICE, you need to note
specific sections, I think.


S 4.6.2.
>   
>      A host may receive a connectivity check before it has received the
>      candidates from its peer.  In such a case, the host MUST immediately
>      generate a response, and then continue waiting for the candidates.  A
>      host MUST NOT select a candidate pair until it has verified the pair
>      using a connectivity check as defined in Section 4.6.1.

Are you supposed to put this on a TODO check list as with ICE?


S 5.8.
>   
>   5.8.  RELAY_HMAC Parameter
>   
>      As specified in Legacy ICE-HIP [RFC5770], the RELAY_HMAC parameter
>      value has the TLV type 65520.  It has the same semantics as RVS_HMAC
>      [RFC8004].

What key is used for the HMAC?

Discuss [Treat as non-blocking comment] (2019-07-30 for -28) Not sent

1) This document should also update the IANA port registry to add a reference to this RFC-to-be to the existing entry for port 10500 (eventually even with note that this RFC-to-be discusses how to distinguish the services using NAT_TRAVERSAL_MODE).

2) Sec 4.4: "Hosts SHOULD NOT use values smaller than 5 ms for the minimum Ta,..."
In rfc5245bis this is a MUST. Why is this a SHOULD here?

Also in sec 4.6.2.:
"If neither one of the hosts announced a minimum pacing value, a value of 50 ms SHOULD be used."
This must be a MUST to be inline with sec 4.4.

3) Appendix A: "Ta value so that only two connectivity check messages are sent on every RTT."
Why two? RFC8085 recommends (SHOULD) at most one packet per RTT for non-congestion control transmissions

Yes (for -28) Unknown

No Objection (2020-02-19 for -29) Not sent

Thanks to the authors for taking some of the concerns I lay out below
into account. I still do not believe this approach is good for HIP's
benefit, but am no longer worried about collateral damage from
other protocols imitating this approach. Accordingly, I am
updating my ballot to "No Objection." The original comments
from my Abstain ballot are preserved below for posterity.

===========================================================================

I share Ben's concerns about the disjointedness of this document's
specification, and am likewise abstaining. My reasons for abstaining are
deeper than Ben's, however.

I recognize that the effort put into this document is substantial, and that
the recommendations I make are unlikely to be taken at this point in time,
but I believe that the HIP ecosystem would be far better served by an "RFC
5570 bis" approach rather than a modified form of ICE recast using HIP
messages. Among other reasons: several companies already offer
geographically-distributed hosted TURN solutions, largely due to the relative
popularity of WebRTC. HIP will have to reach a similar level of popularity
before HIP-specific relay nodes are similarly commercially available.  Using
ICE as-is would allow HIP to use those services that are available today.

As a further concern, I worry that this pattern may end up replicated in other
protocols. For example, although ICE was initially developed with RTP/RTCP in
mind, it was not implemented as a series of extensions to RTP or RTCP;
instead, it is its own protocol, intended to be re-used in other contexts. I
would not like to see, e.g., ICE-like extensions to the QUIC protocol to enable
its use in peer-to-peer situations; I would certainly hope that such an effort
would use ICE as currently defined.

Given that the headline rationale offered in this document is that
"Implementing a full ICE/STUN/TURN protocol stack as specified in Legacy
ICE-HIP results in a considerable amount of effort and code which could be
avoided by re-using and extending HIP messages and state machines for the same
purpose," this document puts forth an implication that all protocols could
benefit from similar not-quite-ICE-but-almost-ICE extensions. I believe
this implication is harmful. I also believe this analysis overlooks the
availability of multiple existing, open-source, already-debugged, and
"compatible with commercial use" ICE implementations.

It is not clear that the four additional reasons offered in Appendix B are
sufficient to justify the design. Taken in turn:

>  For example, ICE is meant for application-layer protocols, whereas HIP
>  operates at layer 3.5 between transport and network layers.

This doesn't have practical effect: ICE is designed to work with generic UDP
packet flows, subject only to the ability to demultiplex STUN from such
packets.

>  This is particularly problematic because the implementations employ IPsec
>  ESP as their data plane: demultiplexing of incoming ESP, HIP and TURN
>  messages required capturing of all UDP packets destined to port 10500 to
>  the userspace, thus causing additional software complexity and an
>  unnecessary latency/throughput bottleneck for the dataplane performance.

This doesn't seem like a foregone consequence. If you're using user-space HIP
implementations, this user-space diversion is already necessary. If you're
using kernel-space HIP implementations, it seems a modest step to add minimal
STUN demultiplexing to the kernel implementation that is already performing
ESP/HIP demultiplexing.  It's possible that I'm misunderstanding some subtle
aspect of the way these protocols interact with each other, but isn't this
described performance impact simply the result of specific implementation
design decisions rather than inherent to the design of RFC 5570's mechanism?

>  Also, relaying of ESP packets via TURN relays was not
>  considered that simple because TURN relays require adding and
>  removing extra TURN framing for the relayed packets.

While it's been a while since I've looked at network kernel code, my
recollection is that implementation of the POSIX "sendmsg()" system call
generally maintains scatter/gather buffers all the way down the stack until
such bytes are copied from system memory to the network hardware. Stripping
such headers on receipt can be accomplished with simple pointer arithmetic.
It's not clear what aspect of the system "simple" is intended to refer to, but
both implementation and performance impacts should be immeasurably small when
implemented in this way, unless I'm missing something.

>  Finally, the developers of the two Legacy ICE-HIP implementations concluded
>  that "effort needed for integrating an ICE library into a HIP
>  implementation turned out to be quite a bit higher that initially
>  estimated.  Also, the amount of extra code (some 10 kLoC) needed for all
>  the new parsers, state machines, etc., is quite high and by re-using the
>  HIP code one should be able to do with much less.  This should result in
>  smaller binary size, less bugs, and easier debugging.".

Such size is not inherent in the implementation of ICE: for example, the ICE
stack used by Firefox is 2.2 kLoC in size (if you ignore the ~1.2 kLoC of
boilerplate copyright notice). Having seen the debugging of an ICE stack
up-close-and-personal, I'm pretty comfortable saying that the effort to
integrate a working stack has to be orders of magnitude less than implementing
even the simplified ICE procedures defined in this document correctly. There
are a lot of surprising corner cases that don't really turn up until you get
into production.

For the foregoing reasons, it is my conclusion that publication of this
document is harmful for HIP and is harmful as a precedent that other protocols
may mistakenly emulate. I believe that a restructuring of the document to more
clearly explain why HIP chose this path while other protocols should not would
limit some of these concerns. However, I do not believe that the fundamental
flaw -- a partial respecification of ICE for the cited reasons --  can be
fixed. I do not support publication of a document describing this mechanism,
and encourage the working group to withdraw the document from consideration
for publication.

To be clear: despite the length and detail of my preceding objections, I
recognize that I may be off in the weeds. I am happy to be corrected about
any of the assertions I make above, up to and including corrections that make my
conclusion incorrect. I will further note that this is not a blocking ballot
position, and that, procedurally, the document can be published despite my
misgivings.

===========================================================================

I have included some additional comments below.

---------------------------------------------------------------------------

§1:

>  As one solution, the HIP experiment report [RFC6538] mentions that
>  Teredo based NAT traversal for HIP and related ESP traffic (with
>  double tunneling overhead).

This isn't a sentence. Perhaps remove "that"?

Also: "Teredo-based"

---------------------------------------------------------------------------

§4.6:

>  The connectivity checks follow the ICE methodology [MMUSIC-ICE], but
>  UDP encapsulated HIP control messages are used instead of ICE
>  messages.  Only normal nomination MUST be used for the connectivity
>  checks, i.e., aggressive nomination MUST NOT be employed.

The cited document does not describe aggressive nomination, and deprecates its
use. Consider removing the mention of aggressive nomination in this document.

---------------------------------------------------------------------------

§5.4:

>    The following NAT traversal mode IDs are defined:
>
>        ID name            Value
>        RESERVED             0
>        UDP-ENCAPSULATION    1
>        ICE-STUN-UDP         2
>        ICE-HIP-UDP          3

This should probably point to the IANA table rather than replicating a snapshot
of its contents.

---------------------------------------------------------------------------

Appendix B:

>  o  Unlike in ICE, the addresses are not XOR-ed in Native ICE-HIP
>     protocol in order to avoid middlebox tampering.

This bullet should explain why such obfuscation is unnecessary.

No Objection (2018-05-09 for -28) Unknown

I admit to not having much familiarity with HIP, so apologies if some of these questions seem off-base.

Why is this document on the standards track when RFC 5770 was experimental?

Section 6.1 says:

"The locators are in plain text format in favor of inspection at HIP-
   aware middleboxes in the future.  The current document does not
   specify encrypted versions of LOCATOR_SETs, even though it could be
   beneficial for privacy reasons to avoid disclosing them to
   middleboxes."

This seems to cut in the opposite direction of some of the other work we have going on in the IETF, where the justification for maintaining header information in the clear is for backwards-compatability with existing middleboxes, not to facilitate some to-be-developed middlebox behavior. Why is this justified for HIP?

Section 6.1 also says "an end-host may exclude certain host addresses from its LOCATOR_SET parameter," but I don't think this is totally clear in Section 4.5 where it talks about "all the HIP candidates." I also wonder if it would be possible to provide some guidance about the circumstances under which an initiator might choose to exclude certain addresses, e.g. if there is a common deployment scenario where it's clear that certain candidates are meant to remain private.

Nits: 

= Section 1 =

" As one solution, the HIP experiment report [RFC6538] mentions that
   Teredo based NAT traversal for HIP and related ESP traffic (with
   double tunneling overhead)."

This is a sentence fragment.

= Section 2 =

The paragraph about RFC2119 should also reference RFC8174.

No Objection (2018-05-09 for -28) Unknown

Like Benjamin, I also found the relationship between this document and ICE somewhat confusing until the very end (Appendix B).  The attempt to "follow ICE methodology but reuse HIP messaging format to achieve the same functionality" results in lack of clarity, so I also agree with Eric's opinion that a rewrite would help, and support his DISCUSS.

No Objection (2018-05-09 for -28) Unknown

This document does a poor job at convincing me that there
is a need to re-specify ICE for use in HIP contexts as opposed to
just using ICE directly, up until Appendix B.  I'd suggest moving
some of the key points into at least the Introduction and arguably
the Abstract as well, to make it clear that this is not just
needless duplication for ideological purity.

I think there's some lingering terminology confusion (as a result of
needing to align the new bits in Native HIP-ICE with those retained
from RFC 5077, along with the move from 5245 to 5245bis.
Specifically, while it's fine for this document to refer to "ICE
offer" and "ICE answer", 5245bis itself does not talk of "offer" and
"answer", which are now concepts only in SDP, IIUC.  Things also
seem a bit hazy about server reflexive vs. server relay candidates
(though maybe here the confusion is just on my end) -- in regular
ICE, a server reflexive candidate is obtained by a STUN client
making a one-shot request of a STUN server to find out what address
is being used on the other side of a NAT, and doesn't require any
state on the STUN server.  But in this document we have a
SERVER_REFLEXIVE_CANDIDATE_ALLOCATION_FAILED Notify/error packet
that implies that state is needed on the Data Relay Server for a
reflexive candidate, text about "when the relay service is split
between hosts, the server reflexive candidate [from Control SHOULD
be used over the one from Data]", and also other discussion about
needing to register new reflexive candidates to avoid collisions or
in potential multihoming future work.

Some section-by-section comments follow.

Section 2

   Responder:
      The Responder is the host that receives the I1 packet from the
      Initiator.

Does this still hold if the message is misdelivered or an attacker
is in the network?


Section 3

   [...] At this point, also the HIP signaling can be sent over the
   same address/port pair, and is demultiplexed from IPsec as described
   in the UDP encapsulation standard for IPsec [RFC3948].

"demultiplex" does not appear anywhere in RFC 3948; it might be
worth using a few more words here to clarify what is going on.


Section 4.1

   A Control Relay Server MUST silently drop packets to a Control Relay
   Client that has not previously registered with the HIP relay.

How does the relay know where they are targetted without the
registration information?

   This applies both renewals of service registration but also to host
   movement, where especially the latter requires the Control Relay
   Client to learn its new server reflexive address candidate.

This is kind of awkward wording; maybe:

   This applies to both renewals of service registration and to host
   movement.  It is especially important for the case of host
   movement, as this is the mechanism that allows the Control Relay
   Client to learn its new server reflexive address candidate.


Section 4.2

   [...] A host SHOULD employ only a single server for
   gathering the candidates for a single HIP association; either one
   server providing both Control and Data Relay Server functionality, or
   one Control Relay Server and also Data Relay Server if the
   functionality is offered by another server.

The second half of this sentence seems to contradict the SHOULD, so
probably some rewording is in order.

   [...] If a relayed candidate is identical to a host
   candidate, the relayed candidate must be discarded.  NAT64
   considerations in [I-D.ietf-ice-rfc5245bis] apply as well.

I don't think this has enough detail of reference to ICE -- a
relayed candidate being identical to a host candidate is, IIUC,
quite unexpected.  This is even noted in 5245bis, at the bottom of
page 20 (of the -20).

   Unlike ICE, this protocol only creates a single UDP flow between the
   two communicating hosts, so only a single component exists.  Hence,
   the component ID value MUST always be set to 1.

ICE or SDP?


Section 4.5

   In step 2, the Control Relay Server receives the I1 packet.  If the
   destination HIT belongs to a registered Responder, the Control Relay
   Server processes the packet.

Do HIP participants register specifically as Responder/Initiator, or
is this just that the entity that is Responder in this exchange, has
registered at the Control Relay Server?

   [...] The
   RELAY_TO parameter is not integrity protected by the signature of the
   R1 to allow pre-created R1 packets at the Responder.

This seems to allow an attacker to replay R1 packets and have the
Relay transmit them to the Initiator.  Are there cases where this
presents an increase in attacker capabilities (e.g., when the Relay
is allowed to send to the Initiator but the attacker is not)?
(When would such pre-created R1 packets be useful?)

Should step 7 explicitly duplicate the "validate REPLAY_HMAC" part
of step 3?


Section 4.6

The situation with the (non-)existence of aggressive nomination between
5245bis and MMUSIC-ICE probably merits further investigation.
(That is, it may not be necessary to mention explicitly that regular
nomination is used.)

   The Initiator MUST take the role of controlling host and the
   Responder acts as the controlled host.  The roles MUST persist
   throughout the HIP associate lifetime (to be reused in the possibly
   mobility UPDATE procedures).  In the case both communicating nodes
   are initiating the communications to each other using an I1 packet,
   the conflict is resolved as defined in section 6.7 in [RFC7401]: the
   host with the "larger" HIT changes to its Role to Responder.  In such
   a case, the host changing its role to Responder MUST also switch to
   controlling role.

The last clause seems to not match the earlier text about the
Initiator being the controlling role.


Section 4.6.1

   In step 2, the Initiator sends a connectivity check, using the same
   address pair candidate as in the previous step, and the message
   traverses successfully the NAT boxes.  The message includes the same
   parameters as in the previous step.  It should be noted that the
   sequence identifier is locally assigned by the Responder, so it can
   be different than in the previous step.

Step 2 is from Initiator to Responder, and the message in step 1 got
dropped, so I'm quite confused at how the sequence identifier in
step 2 could be assigned by the Responder as opposed to the
Initiator.

Step 4 could say whether the sequence number from step 1 is reused
or a new one is allocated for the retransmission (even though it is
clarified later as "SHOULD be sent with the same sequence
identifier").


Section 4.7.2

   [...] However, the
   Responder SHOULD NOT send any ESP to the Initiator's address before
   it has received data from the Initiator, as specified in Sections
   4.4.3. and 6.9 of [RFC7401] and in Sections 3.2.9 and 5.4 of
   [RFC8046].

I don't see a Section 3.2.9 in RFC 8046.

   Since an I2 packet with UDP-ENCAPSULATION NAT traversal mode selected
   MUST NOT be sent via a Control Relay Server, the Responder SHOULD
   reject such I2 packets and reply with a
   NO_VALID_NAT_TRAVERSAL_MODE_PARAMETER NOTIFY packet (see
   Section 5.10).

How does this mesh with a few paragraphs back, when the Responder
MUST include the address it got by registering at a Control Relay
Server in its R1 (only when it is including UDP-ENCAPSULATION NAT as
one of its supported modes)?


Section 4.7.3

   [...] The Initiator may receive multiple R1 packets, with
   and without UDP encapsulation, from the Responder.  However, after
   receiving a valid R1 and answering it with an I2, further R1 packets
   that are not retransmissions of the original R1 message MUST be
   ignored.

We just said there may be multiple, so which one is "the" original
R1 message?


Should Section 4.8 repeat the earlier admonitions against a Relay
Server forwarding traffic that does not involve a Client that has 
egistered with that Relay Server?


Section 4.9

The relay still has to apply RELAY_HMAC; that's just not currently
shown in the diagram, right?


Section 5.6

   The format of the REG_FROM, RELAY_FROM, and RELAY_TO parameters is
   shown in Figure 10.  All parameters are identical except for the
   type.  REG_FROM is the only parameter covered with the signature.

I suggest "Of the three, only REG_FROM is covered by the signature."


Section 6.1

The RFC 5770 text talks about TURN servers, but that's no longer
applicable in this document (instead, Data Relay Servers are used to
relay data in a similar usage).

Also, the protection against DoS described in the last paragraph
seems to only be partial protection, since incoming connections can
still be affected by DoS, but outgoing ones are still possible.


Section 6.2

This is the first mention of Opportunistic Mode at all in the
document; it might be nice to refer to the section of 7401 where
it's documented.


Section 6.6

Is the invalid list of candidates sent *for* its peer or *to* its
peer?


Appendix B

   o  Unlike in ICE, the addresses are not XOR-ed in Native ICE-HIP
      protocol in order to avoid middlebox tampering.

This reads a bit oddly.  Just to check: we don't need to do the
XORing in Native ICE-HIP because the addresses are covered by an
HMAC and the "tampering" wouldn't work?
If so I'd suggest:

   o  In ICE, addresses being conveyed across a NAT are XOR-ed to
      prevent middlebox tampering.  Native ICE-HIP does not need to
      use XOR because such tampering is prevented at the HIP layer.

No Objection (for -28) Unknown

No Objection (2018-05-09 for -28) Unknown

I'm balloting No Objection, but I'm watching the discussion in Eric's ballot thread about reusing pieces of ICE, and I look forward to some discussion about the provisions being made for middleboxes in this draft - I'm not denying that such things exist, only that it would be best if we understood why middleboxes are needed for this usage.

No Objection (for -28) Unknown

Abstain (2018-05-09 for -28) Unknown

I support all points of Ekr's discuss and comment points. I think this either needs to use ICE mostly as is (maybe with some minor profiling) or it needs to be self-contained here. I understand the material in appendix B, but the current mix seems untenable for implementors. Therefore I am balloting "abstain".  I will reconsider that position if there is a substantial reorganization.

Substantive Comments:

I share Alissa's question about why this is standard track when the previous work has been experimental.

§1, second paragraph: The citation for the version of ICE used by "legacy ICE-HIP" should be RFC5245, not the bis version.

§2: There are a number of lower-case keywords. Please use the RFC 8174 boilerplate.

§4.2: 
- paragraph 5: Is everything in this paragraph from the ICE specification? I suspect not, but it's hard to tease out what is from ICE and what is new specification. It would be helpful to reference the ICE bits by section number.
- paragraph 6: I'm confused in that I thought the previous text said that native ICE-HIP does not use STUN.

§6: I am skeptical of the assertion that the security considerations for Native ICE-HIP are no different than those for Legacy ICE-HIP.

Editorial Comments:

§1, 2nd paragraph: 
- "responsible of NAT traversal": s/of/to
- "responsible of end-host": s/of/to

§4.3: "This section describes the usage of a new non-critical parameter type. ": Which is?

§4.6, first paragraph: 2nd sentence is hard to parse.