Skip to main content

Last Call Review of draft-ietf-lwig-minimal-esp-06
review-ietf-lwig-minimal-esp-06-tsvart-lc-briscoe-2021-09-01-00

Request Review of draft-ietf-lwig-minimal-esp
Requested revision No specific revision (document currently at 12)
Type Last Call Review
Team Transport Area Review Team (tsvart)
Deadline 2021-08-26
Requested 2021-08-12
Authors Daniel Migault , Tobias Guggemos
I-D last updated 2021-09-01
Completed reviews Secdir Last Call review of -03 by David Mandelberg (diff)
Genart Last Call review of -04 by Roni Even (diff)
Iotdir Last Call review of -03 by Nancy Cam-Winget (diff)
Secdir Last Call review of -06 by David Mandelberg (diff)
Tsvart Last Call review of -06 by Bob Briscoe (diff)
Assignment Reviewer Bob Briscoe
State Completed
Request Last Call review on draft-ietf-lwig-minimal-esp by Transport Area Review Team Assigned
Posted at https://mailarchive.ietf.org/arch/msg/tsv-art/T3jbja9lCwriOljiAIZRiDgELj4
Reviewed revision 06 (document currently at 12)
Result On the Right Track
Completed 2021-09-01
review-ietf-lwig-minimal-esp-06-tsvart-lc-briscoe-2021-09-01-00
This document has been reviewed as part of the transport area review team's
ongoing effort to review key IETF documents. These comments were written
primarily for the transport area directors, but are copied to the document's
authors and WG to allow them to address any issues raised and also to the IETF
discussion list for information.

When done at the time of IETF Last Call, the authors should consider this
review as part of the last-call comments they receive. Please always CC
tsv-art@ietf.org if you reply to or forward this review.

==A) General Comments==

===A.1) Caveat Regarding Transport Area Expertise ===

Although this review is on behalf of the transport area review team, I don't
claim to know much about the interactions between IPsec and the transport layer
(e.g. NAT traversal, etc), which is not my area of expertise.

===A.2). The document doesn't do what the Title, Abstract and Intro say===

Abstract
   This document describes a minimal implementation of the IP
   Encapsulation Security Payload (ESP) defined in RFC 4303.  Its
   purpose is to enable implementation of ESP with a minimal set of
   options to remain compatible with ESP as described in RFC 4303.

Introduction
   ...This document describes the minimal properties an ESP
   implementation needs to meet to remain interoperable with [RFC4303]
   ESP.  In addition, this document also provides a set of options to
   implement these properties under certain constrained environments.

The draft doesn't define a minimal implementation (singular). It gives
considerations that might help minimize each of many application-specific
implementations (plural).

Given the title, abstract and introduction, the reader expects a draft that
defines a single clear profile that is a subset of the generic ESP protocol -
perhaps like minimal IKEv2 [RFC7815] is a subset of IKEv2 [RFC7296]. That
expectation rapidly  leads to disappointment. Indeed, the style of this draft
is quite the opposite from a single recommended implementation. The draft's
style is not even clear-cut conditional statement like "in scenario X do Y".
The style is more like "Think carefully about this, that and the other".

So the word "Considerations" definitely ought to be included in the title.
Perhaps "Considerations for Minimizing Encapsulation Security Payload (ESP)
Implementations" (also note the abbreviation is expanded).

===A.3). Interoperability===

After Nancy Cam-Winget's review of -03, the sentence in the introduction was
clarified that interop meant interop with full RFC4303 implementations

   ... to remain interoperable with [RFC4303] ESP.

However, in the sections on specific fields, I believe there are cases where
interop seems to be restricted to an app-specific ESP peer rather than any
general RFC4303 peer. For instance: * §2 suggests use of an SPI that includes
the memory address of the SAD structure, but it is not spelled out how the
remote (possibly vanilla RFC4303) peer will know to apply an SPI to packets
that will have the appropriate value for the local peer to locate in its SAD,
given the SPI set by the remote peer is the local peer's inbound SPI. * In §3
it is remarked that the sender's use of time to generate the SN would require
the receiver to be appropriately configured, which implies the receiving peer
would somehow have to know the typical size of the sending peer's SN increments
to configure an appropriate window size.

===A.4) Completeness===

The IPsec protocol fields are used as the framework to hang the document
contents from. But surely the data plane is not the only part of an IPsec ESP
implementation? What about restricting the range of valid values in an SA
itself, to reduce complexity? What about reducing complexity of the SAD
(mentioned in §2, but not addressed specifically in its own right)? What about
simplifying the management interface? There's no mention of simplifying (or
eliminating) auditing, which is optional in RFC4303.

I am not involved in the security area, so apologies if all the above is dealt
with in other lwig drafts.

===A.5) 'It is recommended' ambiguity===

There are 5 occurrences of this ambiguous phrase. The first two examples (at
least) are ambiguous:

   For nodes supporting only unicast communications, it is recommended
   to index SA with the SPI only.
...
   For inbound traffic, it is recommended that any receiver provides
   anti-replay protection,

Do these mean that RFC4303 recommends these things, or that the present draft
recommends them for minimal implementations? Pls check the other 3 occurrences
too.

===A.6) 'May not' ambiguity ===

There's 5 occurrences of this too. I think only the first 2 are ambiguous.

  ... nodes designed to only send data may not implement this capability.

   ...a minimal ESP implementation may not
   generate such dummy packet.

Pls avoid it. In English, 'may not' can either mean 'might not' or 'must not',
depending where the emphasis is.

===A.7) Try to avoid words that would be normative if upper case===

It's up to the authors, but I would advise against using must, should, may etc.
in lower case in any RFC, given readers sometimes imagine that they might have
been intended to mean MUST, SHOULD, or MAY. Suggested lower case alternatives:
have to, ought to, and might.

==B) Specific Technical Points==
===B.1) Vague sentence in §2===

   Some other local constraints
   on the node may require a combination of the SPI as well as other
   parameters to index the SA.

===B.2) Last para of §2.1 ends up in the air===

I think this para is trying to say that there's no need to ensure that SPI
generation is properly random if a privacy analysis of the application doesn't
require this. It falls short, needing a final sentence to actually say this, if
that is what it intends to say.

Rather than make the controversial point that the state of a sensor reporting
the open/closed state of a door doesn't typically leak privacy info, the
controversy can be avoided. All that is needed is to say that /if/ messages for
a particular application are not privacy sensitive, a randomized SPI is not
necessary.

BTW, this does beg the question whether the implementer can foresee the
different trust environments an application might be used in (e.g a door
open/closed message might have no privacy implications for one application, but
then a more ambitious application might be built on top of the original app).

In summary, it's a moot point whether an app developer can be expected to
properly analyse privacy vulnerabilities, or whether it's just better to play
safe with a randomized SPI.

===B.3) Use of time-driven SN still requires sequence state===

   When the use of a
   clock is considered, one should take care that packets associated
   with a given SA are not sent with the same time value.

To check that no two time values are the same requires holding the same state
as would be needed to increment a counter - taking us back to square 1.

===B.4) Other problems with time-driven SN===

§3 says

   ...if not appropriately configured, the use of a
   significantly larger SN may result in the packet out of the
   receiver's windows and that packet being discarded.

BTW, I think that is meant to say 'a significantly larger SN *increment* may
result in...' Assuming that's what was meant, it's not just a question of a
standard receiver being 'appropriately configured', but also whether standard
receiver implementations would be even capable of supporting a much larger
anti-replay window, which would require receivers to manage a significantly
larger bit-map in memory.

Also, for devices that spend significant time sleeping, the SN would jump
hugely on first waking. That shouldn't require any larger window (unless a
stale packet from prior to the sleep was only released after a new packet on
waking). But the receiver would need to be able to somehow detect massive jumps
in the high order bits that are not communicated in the SN field.

===B.5) The need for anti-replay protection===

§3 says:

   Receiving peers may also
   implement their own anti-replay mechanism.

Assuming this is meant to mean "Receiving *applications* may also...", it is
true. However, as I understand it, IPsec provides a general anti-replay
facility for all the cases where anti-replay is not done e2e in higher layers.
Indeed, most e2e security protocols prevent replay, but most application logic
does not, and sometimes, just sometimes, this might open up a vulnerability .

===B.6) TFC more appropriate for IoT?===

One could argue that Traffic Flow Confidentiality (§4) is *more* applicable for
large classes of IoT applications than other 'traditional' applications of
IPsec.  Many IoT apps send only a few different types of message, which are
perhaps more likely to be distinguishable from the sizes of the encrypted
messages. This point is made in the draft, but not clearly enough:

   In
   most cases, the payload carried by these nodes is quite small, and
   the standard padding mechanism may also be used as an alternative to
   TFC,

This comes just after the draft has recommended that TFC is *not* used (because
standard devices haven't adopted it). But it's not really clear here that it is
recommending that TFC ought to be used in these cases with a limited set of
messages, and that 32-bit alignment will often be a sufficient mechanism.
Indeed the draft continues, saying it is preferred to poll, rather than send
only when an event occurs. That is surely a form of TFC as well.

If anything, the draft ought *not* to recommend against TFC, which is only on
the spurious grounds than non-IoT applications have tended not to need it.

===B.7) Sleep as well as reboot===

§8 on Security Considerations discusses using session keys across reboots. I
think it would appropriate to talk about across sleep periods as well.

==C) Editorial Nits==

I found loads. I'll send an edited XML file if time permits. Otherwise, we'll
have to trust that the RFC Editor will find them and fix them all.

Regards

Bob Briscoe