Last Call Review of draft-ietf-tls-rfc4347-bis-
review-ietf-tls-rfc4347-bis-secdir-lc-kaufman-2010-12-16-00

Request Review of draft-ietf-tls-rfc4347-bis
Requested revision No specific revision (document currently at 06)
Type Last Call Review
Team Security Area Directorate (secdir)
Deadline 2010-12-17
Requested 2010-11-30
Authors Eric Rescorla, Nagendra Modadugu
I-D last updated 2010-12-16
Completed reviews Secdir Last Call review of -?? by Charlie Kaufman
Assignment Reviewer Charlie Kaufman
State Completed
Request Last Call review on draft-ietf-tls-rfc4347-bis by Security Area Directorate Assigned
Completed 2010-12-16
I have reviewed this document as part of the security directorate's ongoing
effort to review all IETF documents being processed by the IESG.  These
comments were written primarily for the benefit of the security area directors.
Document editors and WG chairs should treat these comments just like any other
last call comments.

This spec is a refresh of rfc4347, which specified DTLS v1.0 as a set of deltas
from TLS v1.1. This spec defines DTLS v1.2 as a set of deltas from TLS v1.2.
The deltas are mostly the same, so this spec is nearly identical to rfc4347
except that it adds some clarifications, updates the references, and changes
the version number. It would be nice to have a structure where if and when TLS
v1.3 appears, there would not be a need for a DTLS v1.3 spec. Unfortunately,
since there might at that time be a need for some DTLS specific changes, there
appears to be no way to do such a spec in advance. I've never looked at DTLS in
detail before, so this is a review with fresh eyes. (That means please forgive
me if I raise issues that were long debated and finally closed on the mailing
list). I found what appears to be a minor flaw in the protocol (where it hangs
if the wrong packet is lost), and some suspicious things in the spec.

The spec doesn't describe the changes from DTLS v1.0 to DTLS v1.2 or their
implications for interoperability. This would be a section that was not needed
in rfc4347. I assume the transition is smooth, picking up the version number
negotiation from TLS v1.2, but it would be worth mentioning whether there are
any known issues.

Section 3.2.2 says that DTLS queues up out-of-order packets for future
processing. The protocol is designed so that it can alternatively drop
out-of-order packets (since they will be retransmitted). It's a
space/bandwidth trade-off (as noted in section 4.2.2).
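
The trade-off can be illustrated with a minimal sketch. It is not taken from
the draft: the class name HandshakeReceiver and the buffer_future flag are
hypothetical, and the only assumption about the protocol is that handshake
messages carry a sequence number that increases by one per message.

```python
from dataclasses import dataclass, field


@dataclass
class HandshakeReceiver:
    """Delivers handshake messages in sequence order (illustrative only).

    buffer_future=True queues messages that arrive early (costs memory).
    buffer_future=False drops them and relies on the peer retransmitting
    (costs bandwidth).
    """
    buffer_future: bool = True
    next_seq: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, message_seq: int, body: bytes) -> None:
        if message_seq < self.next_seq:
            return  # duplicate of a message that was already processed
        if message_seq > self.next_seq:
            if self.buffer_future:
                self.pending[message_seq] = body  # hold until its turn
            return  # otherwise dropped; a retransmission is expected
        self.delivered.append(body)
        self.next_seq += 1
        # Drain any queued messages that are now in order.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

With buffer_future=False the receiver holds no per-message state beyond
next_seq, at the price of waiting for the sender's retransmission timer.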

The next-to-last paragraph of section 3.2.1 says that on a timeout, the client
retransmits the unacknowledged handshake message and (if it was the response
that was lost) the server will retransmit its response. It should be noted that
the server's response must be bit-for-bit identical to the response it
previously sent (since otherwise fragmentation could interleave parts of two
responses). The protocol depends on the HelloVerifyRequest being short enough
to fit in a single packet because it cannot reliably recover if that message is
fragmented and a fragment is lost.

That retransmission strategy does not work on the last message of the protocol
(the client's Finished) in the session-resuming exchange since the client is
not expecting a response. As specified, I believe the protocol is broken in the
case where that packet is lost. The obvious way to fix the protocol would be to
add a fourth message to the session-resuming handshake. An uglier fix, but one
less disruptive to the wire protocol, would be for the server to interpret any
properly encrypted data packet in a new epoch as evidence that the
ChangeCipherSpec message was lost. The server does not need any information
from the lost message.
That works unless in the encapsulated protocol the server was expected to speak
first.
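
The second suggestion could look roughly like the sketch below. Everything in
it is an assumption made for illustration: the names protect, unprotect and
ResumingServer are hypothetical, and an HMAC tag stands in for whatever record
protection the negotiated cipher suite provides. It follows the suggestion as
written and does not address whether skipping the lost handshake message is
acceptable from a security standpoint.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag


def protect(key: bytes, epoch: int, payload: bytes) -> bytes:
    """Stand-in for record protection: payload followed by an HMAC tag."""
    tag = hmac.new(key, epoch.to_bytes(2, "big") + payload, hashlib.sha256)
    return payload + tag.digest()


def unprotect(key: bytes, epoch: int, record: bytes):
    """Return the payload if the tag verifies, else None."""
    if len(record) < TAG_LEN:
        return None
    payload, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(
        key, epoch.to_bytes(2, "big") + payload, hashlib.sha256
    ).digest()
    return payload if hmac.compare_digest(tag, expected) else None


class ResumingServer:
    """Server that has sent its flight and awaits the client's last flight."""

    def __init__(self, pending_read_key: bytes):
        self.read_epoch = 0
        self.read_key = None
        self.pending_read_key = pending_read_key
        self.handshake_complete = False

    def on_record(self, epoch: int, record: bytes):
        if not self.handshake_complete and epoch == self.read_epoch + 1:
            payload = unprotect(self.pending_read_key, epoch, record)
            if payload is None:
                return None  # silently discard
            # A valid record in the next epoch: infer that the client has
            # already switched ciphers and that its last flight was lost.
            self.read_epoch = epoch
            self.read_key = self.pending_read_key
            self.handshake_complete = True
            return payload
        if self.read_key is not None and epoch == self.read_epoch:
            return unprotect(self.read_key, epoch, record)
        return None
```

As noted above, this only helps when the client is the first to send
application data.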

That paragraph also says that servers maintain a retransmission timer and
retransmit when that timer expires. It notes that retransmission does not apply
to HelloVerifyRequest messages. Retransmission is not required or helpful for
any of the messages, but it is also harmless.

In sections 4.1.2.1 and 4.1.2.7, it says that invalid packets should normally
be silently discarded but can alternatively cause a fatal alert. I believe that
it's worth noting that logging the discarded packets (or at least a count of
them) is included in the definition of "silently discarding" and is often
useful for diagnostic purposes.
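
A minimal sketch of discarding without replying while still keeping a count
for diagnostics. The class RecordFilter, the reason strings and the logger
name are hypothetical; the draft does not define any of them.

```python
import logging
from collections import Counter

log = logging.getLogger("dtls.sketch")  # hypothetical logger name


class RecordFilter:
    """Drops invalid records without replying, but keeps a tally."""

    def __init__(self):
        self.discarded = Counter()

    def discard(self, reason: str) -> None:
        # "Silent" refers to the wire: nothing is sent to the peer.
        # Local counters and debug logs are still available to operators.
        self.discarded[reason] += 1
        log.debug("discarded record: %s", reason)

    def accept(self, record: bytes, mac_ok: bool, replayed: bool):
        """Return the record if it passes both checks, else None."""
        if not mac_ok:
            self.discard("bad_mac")
            return None
        if replayed:
            self.discard("replay")
            return None
        return record
```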

Implementing sequence numbers correctly in the handshake protocol has some
subtle requirements implied in the phrase "(at least notionally)" in the last
paragraph of section 4.2.2. Section 4.2.1 says that there can be multiple round
trips where a server keeps telling a client to use different cookies. The spec
contains no upper bound on the number of exchanges there could be, but it also
implies that each HelloVerifyRequest should have a new sequence number. Couple
that with a stateless server, and the only way a correct implementation can
work is for the server to accept *any* sequence number from a client for a
ClientHello and use that as an initial sequence number for its responses. I
don't know whether those semantics were intended. Either way the text should
probably explain what to do or implementers are likely to do incompatible
things.
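
The behavior described above, where a stateless server accepts whatever
sequence number the ClientHello carries and echoes it in its response, can be
sketched as follows. This is an illustration under stated assumptions, not
text from the draft: the function name handle_client_hello is hypothetical,
the cookie is assumed to be an HMAC over the client address only, and "seq"
stands for whichever sequence number the spec intends.

```python
import hashlib
import hmac


def make_cookie(secret: bytes, client_addr: tuple) -> bytes:
    """Cookie derived only from a server secret and the client address."""
    ip, port = client_addr
    return hmac.new(secret, f"{ip}:{port}".encode(), hashlib.sha256).digest()


def handle_client_hello(secret: bytes, client_addr: tuple,
                        seq: int, cookie: bytes):
    """Process a ClientHello without any per-client state.

    Returns (message_kind, seq_to_use, cookie). The sequence number of the
    response is taken from the incoming ClientHello, because a stateless
    server has nothing else to derive it from.
    """
    expected = make_cookie(secret, client_addr)
    if cookie and hmac.compare_digest(cookie, expected):
        return ("ServerHello", seq, expected)
    return ("HelloVerifyRequest", seq, expected)
```

Because the server keeps nothing between the two ClientHello messages, the
number of cookie round trips does not affect its behavior.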

Just as the cookie exchange was added to DTLS because TLS got that benefit by
running over TCP, there is another problem which this protocol does not appear
to address very well. If a connection is broken uncleanly (e.g. an endpoint
crashes) and then someone attempts to create a new connection between the same
IP addresses and UDP ports (e.g. an endpoint reboots), there appears to be no
way in this protocol to distinguish a plaintext ClientHello from a malformed
encrypted packet on the old connection. Since the best practice for a server is
to silently discard malformed encrypted packets, when a client reboots and
tries to reconnect it is likely that the server will appear dead. It would have
been helpful if in the record header the Type field distinguished an encrypted
handshake message from an unencrypted handshake message in order to identify
this case. In the case of a fixed pair of UDP ports communicating, it would
still be tricky to recover (since this could be confused with a DoS attack),
but at least the server could figure out what was probably going on. Then it
could implement some strategy like killing the DTLS connection if it had not
received any messages in the last X minutes. This problem doesn't come up for
TLS because the initialization of a TCP connection includes a SYN message that
does not appear in the middle of a connection and random ISNs that prevent
accidental aliasing of connections.
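
The idle-timeout strategy mentioned above might be sketched like this. The
class AssociationTable, the method names and the idle_limit_s parameter are
hypothetical, and the sketch assumes the server has some way to suspect that
an undecryptable packet is really a fresh ClientHello, which is the capability
the record header currently lacks.

```python
class AssociationTable:
    """Tracks when each peer last sent a valid record (illustrative only)."""

    def __init__(self, idle_limit_s: float):
        self.idle_limit_s = idle_limit_s
        self.last_valid = {}  # (ip, port) -> time of last valid record

    def on_valid_record(self, peer: tuple, now: float) -> None:
        self.last_valid[peer] = now

    def should_restart(self, peer: tuple, now: float) -> bool:
        """Called when a packet from `peer` looks like a new ClientHello.

        Returns True if a new handshake may start: either there is no
        association, or the old one has been idle for at least idle_limit_s
        seconds and is torn down. Returns False while the old association is
        still active, since the packet could be part of a DoS attempt.
        """
        last = self.last_valid.get(peer)
        if last is None:
            return True
        if now - last >= self.idle_limit_s:
            del self.last_valid[peer]  # kill the stale association
            return True
        return False
```

Passing the current time in as an argument keeps the sketch deterministic; a
real server would read a monotonic clock.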

Nits/Typos:

Section 4.1 next to last line: "In an other case" -> "In any other case"
Section 4.1.2.7 formatting glitch (spacing) from copying text from one document
to another