Last Call Review of draft-ietf-anima-bootstrapping-keyinfra-16
review-ietf-anima-bootstrapping-keyinfra-16-genart-lc-arkko-2018-10-05-00

Request Review of draft-ietf-anima-bootstrapping-keyinfra
Requested rev. no specific revision (document currently at 19)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2018-10-02
Requested 2018-09-18
Other Reviews Iotdir Telechat review of -17 by Russ Housley
Secdir Last Call review of -16 by Christian Huitema
Review State Completed
Reviewer Jari Arkko
Review review-ietf-anima-bootstrapping-keyinfra-16-genart-lc-arkko-2018-10-05
Posted at https://mailarchive.ietf.org/arch/msg/gen-art/QodztahGN3TXKlJXY2R5AQga1Kw
Reviewed rev. 16 (document currently at 19)
Review result Not Ready
Draft last updated 2018-10-05
Review completed: 2018-10-05

Review

I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document: draft-ietf-anima-bootstrapping-keyinfra-??
Reviewer: Jari Arkko
Review Date: 2018-09-27
IETF LC End Date: 2018-10-02
IESG Telechat date: Not scheduled for a telechat

Summary:

I have reviewed this document. My intent was to complete this review by the
original Gen-ART and IETF last call deadline on October 2. Unfortunately,
it took longer to read the document than expected, for which I apologize.
Hopefully these comments are still useful.

I have some overall comments and then a number of more detailed technical
and editorial comments.

First up, I agree with Christian Huitema's stellar review that was
posted earlier on the IETF list. He brought out many of the most
relevant questions about this specification.

I won't repeat those questions, but I'll observe that I *do* think it
is useful to build enrollment and imprinting protocols, for various
situations. It is beneficial for there to be standards in this space,
and for those standards to support use cases that fit me and others
that fit other people's goals. I do agree with Christian and others
that one has to be careful in doing so, though: for instance, the
technology should not, purely through how it is constructed, put any
party in a more controlling role than the others.

I did encounter a number of question marks and have several
suggestions throughout the document. It also wasn't an easy read,
particularly from the point of view of understanding what its
implications are.

My first bigger comment is that I believe the security and privacy
considerations section should have provided an actual in-depth
analysis of the characteristics offered by the protocol, perhaps under
several different situations, as the protocol can be operated in
different modes.

My second comment is that the protocol as defined is quite focused on
manufacturer-controlled situations. As Christian mentioned, there is
some discussion of other situations in the document (Section 6.4), but
not much, and there's little information on what happens if one tries
to use the protocol in this way. It would seem that better support
for obtaining a voucher or vouchers while the device is new and the
manufacturer is still around would alleviate some of the concerns that
were raised in the IETF list discussion. There's little information
about the security implications, little explanation of the various
lifetimes and when infinite lifetimes are OK and when not, etc. All
of those details would have been useful. But for now, it is difficult
to even evaluate those aspects, because I don't know if I can use
the protocol in a more owner-oriented fashion. And if I can't,
is that because of a fundamental limit or because we chose to design
the protocol that way? 

More detailed technical comments:

Section 1:

> It uses a TLS connection and an PKIX (X.509v3)
> certificate (an IEEE 802.1AR [IDevID] LDevID) of the pledge to answer
> points 1 and 2.  It uses a new artifact called a "voucher" that the
> registrar receives from a "Manufacturer Authorized Signing Authority"
> and passes to the pledge to answer points 3 and 2.

What is used to answer point 4?

Section 1.3.3:

> This document presumes that network access control has either already
> occurred, is not required, or is integrated by the proxy and
> registrar in such a way that the device itself does not need to be
> aware of the details.

I understand why this limitation is specified, but this does considerably
reduce the number of cases where BRSKI can be applied.

Section 2.1:

> A unique nonce can be
> included ensuring that any responses can be associated with this
> particular bootstrapping attempt.

It seems odd to have such a fundamental request-response matching
be optional. Also, are these the same nonces as were described earlier
for one-time usage limits, or different?
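
To illustrate what mandatory request-response matching would involve,
here is a minimal Python sketch. The function names and the 16-byte
nonce length are assumptions made for illustration; they are not taken
from the draft.

```python
import secrets
from typing import Optional


def new_nonce(num_bytes: int = 16) -> str:
    """Return a fresh, cryptographically strong nonce as a hex string."""
    return secrets.token_hex(num_bytes)


def response_matches(request_nonce: str, response_nonce: Optional[str]) -> bool:
    """Accept a response only if it echoes the nonce sent in the request.

    A response without a nonce cannot be tied to this bootstrapping
    attempt, so it is rejected in this sketch.
    """
    if response_nonce is None:
        return False
    # Constant-time comparison of the two ASCII hex strings.
    return secrets.compare_digest(request_nonce, response_nonce)
```

With matching made mandatory in this way, a response that carries no
nonce, or a nonce from an earlier attempt, is simply dropped.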

Section 2.2:

> Vouchers provide a signed but non-encrypted communication channel
> among the pledge, the MASA, and the registrar.  The registrar
> maintains control over the transport and policy decisions allowing
> the local security policy of the domain network to be enforced.

Is this a potential privacy vulnerability, as there are serial numbers
and possibly some other information in the voucher? (To avoid this,
one could, for instance, expose only a hash of the relevant
information during the process and reveal the actual data only once
the encrypted channel is set up.)
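
The hash idea can be sketched as a simple salted commitment. This is
an illustration only: the function names, the use of SHA-256, and the
salt length are assumptions, not something the draft specifies. The
salt has to stay private until the reveal step, since serial numbers
alone are easy to guess.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(serial_number: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only the digest is sent in the clear."""
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + serial_number.encode("utf-8")).digest()
    return salt, digest


def verify(serial_number: str, salt: bytes, digest: bytes) -> bool:
    """Check a revealed serial number and salt against the earlier digest."""
    expected = hashlib.sha256(salt + serial_number.encode("utf-8")).digest()
    return hmac.compare_digest(expected, digest)
```

The digest would travel over the unencrypted channel, and the serial
number and salt would follow once the encrypted channel exists.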

Section 2.3:

> 4.  (Optionally) communicating the MUD URL (see Appendix C.

There's plenty of potentially useful information that could be
communicated. The linkage to MUDs seems a bit arbitrary. Could
this perhaps be generalized?

Section 2.4:

>  P---Voucher Request (include nonce)------>|                    |

Is this nonce the one inside the voucher or something different?

Section 3.1:

>   grouping voucher-request-grouping
>     +---- voucher
>        +---- created-on?                      yang:date-and-time
>        +---- expires-on?                      yang:date-and-time
>        +---- assertion                        enumeration
>        +---- serial-number                    string

I'm not sure it is necessary to base everything on a serial number.

Section 3.2:

> "proximity-registrar-cert": "base64encodedvalue=="

I do not understand the notation that you are using here, nor did my
grep for the above string find anything from the RFC directory other
than in RFC 8366. Is "base64encodedvalue==" a value that you are using
as an example, or some special format that signifies something else?
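
As a quick check of the placeholder reading, the string can be decoded
and compared with what a DER-encoded certificate would look like. This
is only a sanity check under the assumption that the field is meant to
carry a base64-encoded DER certificate; the 100-byte threshold is an
arbitrary illustrative lower bound. It does not settle what the draft
intends.

```python
import base64

# The literal string quoted from the example in the draft.
value = "base64encodedvalue=="

decoded = base64.b64decode(value)

# A DER-encoded X.509 certificate begins with the SEQUENCE tag 0x30 and
# is normally several hundred bytes long.
looks_like_der_certificate = len(decoded) > 100 and decoded[0] == 0x30

print(len(decoded), looks_like_der_certificate)
```

The string is syntactically valid base64, but it decodes to only a
handful of bytes, which is consistent with it being an example value.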

Section 4:

> A proxy MAY assume TLS framing for auditing purposes, but MUST NOT
> assume any TLS version.

Is this specification sufficient to understand what this means
in terms of actual behaviour? Must the proxy pass all bytes unchanged?
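
If the intended behaviour is "pass all bytes unchanged", the proxy
logic reduces to something like the following sketch. It works on
generic binary streams rather than real sockets, and the function name
and chunk size are assumptions made for illustration.

```python
import io
from typing import BinaryIO


def relay(source: BinaryIO, sink: BinaryIO, chunk_size: int = 4096) -> int:
    """Copy bytes from source to sink without inspecting or altering them.

    Returns the number of bytes relayed. No TLS record parsing takes
    place, so the relay makes no assumption about the TLS version.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Usage with in-memory streams standing in for the two TCP legs.
incoming = io.BytesIO(b"\x16\x03\x01" + bytes(range(256)))
outgoing = io.BytesIO()
relayed = relay(incoming, outgoing)
```

A proxy that additionally parses TLS framing for auditing would need
rules for records it does not recognize, and the draft does not seem
to give them.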

Section 4:

> In the ANI, the Autonomic Control Plane (ACP) secured instance of
> GRASP ([I-D.ietf-anima-grasp]) MUST be used for discovery of ANI
> registrar ACP addresses and ports by ANI proxies.  The TCP leg of the
> proxy connection between ANI proxy and ANI registrar therefore also
> runs across the ACP.

MUST seems a strong keyword to use here. Perhaps some systems would
like to use another mechanism, but this MUST prohibits any such
changes. MUST support maybe...

Also, a bit later in Section 4.1 it says "The pledge MAY listen
concurrently for other sources of information" which would seem to
contradict the MUST.

Section 4.1:

> The result of discovery is a logical communication with a registrar,
> through a proxy.  The proxy is transparent to the pledge but is
> always assumed to exist.

Is it really a requirement that it must exist? I could imagine
small networks where the registrar is on the network...

Section 4.1:

> A new temporary
> address SHOULD be allocated whenever the discovery process is
> forced to restart due to failures. 

Why is this necessary? It seems to set a pretty high requirement
on the integration of the IP stack and the BRSKI functionality.
Presumably RFC 4941 is used by many applications and there
should be support in the general facility for sufficiently
frequent change of addresses.

Section 5:

> o  The pledge either attempts concurrent connections, or it times out
>    quickly and tries connections in series.

Concurrent connections to where? You mean multiple proxies? Be more
specific.

Section 5.2:

> nonce:  The pledge voucher-request MUST contain a cryptographically
>     strong random or pseudo-random number nonce.  Doing so ensures
>     Section 2.6.1 functionality.  The nonce MUST NOT be reused for
>     multiple bootstrapping attempts.

But elsewhere in the document you talk about the nonceless case, so
how can the nonce actually be mandatory? Or are there multiple nonces?
The document is unclear on this.

Section 5.4:

> The registrar SHOULD verify that the serial number
> field it parsed matches the serial number field the pledge
> provided in its voucher-request.

It is surprising that this is only a SHOULD. If the two values
don't match, the voucher will be issued for the wrong device
and/or there is an attack somewhere.
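
A mandatory version of this check is small. The sketch below assumes
the registrar has already parsed one serial number from the pledge's
certificate and one from the voucher-request; the exception and
function names are hypothetical.

```python
class SerialNumberMismatch(Exception):
    """Raised when the certificate and the voucher-request disagree."""


def check_serial_number(certificate_serial: str, request_serial: str) -> None:
    """Reject the voucher-request unless both serial numbers are identical."""
    if certificate_serial != request_serial:
        raise SerialNumberMismatch(
            "serial number in voucher-request does not match the certificate"
        )
```

Treating a mismatch as a hard failure means the request is never
forwarded to the MASA for the wrong device.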

Section 5.4.7:

> It MAY perform a simple
> consistency check: If the registrar voucher-request contains a nonce
> and the prior-signed-voucher-request exists then the nonce in both
> MUST be consistent.

The formulation is odd, because you are specifying something optional
(MAY) which then has a MUST statement but that statement relates
to how things are supposed to be, not what the implementations should
do.

And again, I think these consistency checks should be mandatory.
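
One reading of the quoted rule, written as a check that an
implementation would always perform, is sketched below. The dictionary
representation of the prior-signed-voucher-request and the function
name are assumptions made for illustration.

```python
from typing import Dict, Optional


def nonces_consistent(
    registrar_nonce: Optional[str],
    prior_signed_request: Optional[Dict[str, str]],
) -> bool:
    """Return False only when both sides exist and their nonces differ.

    If the registrar voucher-request has no nonce, or there is no
    prior-signed-voucher-request, the quoted rule has nothing to compare.
    """
    if registrar_nonce is None or prior_signed_request is None:
        return True
    # A prior request without a nonce counts as inconsistent here.
    return prior_signed_request.get("nonce") == registrar_nonce
```

Phrased this way, the requirement is on what the implementation does
(reject when the function returns False) rather than on how the two
values are supposed to relate.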

Section 5.5:

> expires-on:  This is set for nonceless vouchers.  The MASA ensures
> the voucher lifetime is consistent with any revocation or pinned-
> domain-cert consistency checks the pledge might perform.  See
> section Section 2.6.1.  There are three times to consider: (a) a
> configured voucher lifetime in the MASA, (b) the expiry time for
> the registrar's certificate, (c) any certificate revocation
> information (CRL) lifetime.  The expires-on field SHOULD be before
> the earliest of these three values.  Typically (b) will be some
> significant time in the future, but (c) will typically be short
> (on the order of a week or less).  The RECOMMENDED period for (a)
> is on the order of 20 minutes, so it will typically determine the
> lifespan of the resulting voucher.

There are separate issues mixed together here: the lifetime of the
imprinting with the owner/registrar vs. the lifetimes of the
imprinting process artifacts. Please clarify. I *think* you are
talking about the process expiry times above, not about the expiry of
the imprinting. And certainly I'd hope 20 minutes is not the duration
of the imprinting; rather, that needs to be infinite in most cases.
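
The computation described in the quoted text, read as an upper bound
on the expires-on value of the voucher artifact, can be sketched as
follows. The function and parameter names are hypothetical; the
20-minute, one-week, and one-year figures are example values, with
the first two taken from the quoted text and the certificate expiry
invented for illustration.

```python
from datetime import datetime, timedelta, timezone


def expires_on_upper_bound(
    now: datetime,
    voucher_lifetime: timedelta,          # (a) configured in the MASA
    registrar_cert_not_after: datetime,   # (b) registrar certificate expiry
    crl_next_update: datetime,            # (c) revocation information lifetime
) -> datetime:
    """Return the earliest of the three limits; expires-on should precede it."""
    return min(now + voucher_lifetime, registrar_cert_not_after, crl_next_update)


now = datetime(2018, 10, 5, 12, 0, tzinfo=timezone.utc)
bound = expires_on_upper_bound(
    now,
    voucher_lifetime=timedelta(minutes=20),
    registrar_cert_not_after=now + timedelta(days=365),
    crl_next_update=now + timedelta(days=7),
)
```

With these example values the configured voucher lifetime (a) is the
binding limit. Note that this bounds the validity of the voucher, and
says nothing about how long the resulting imprint lasts.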

Section 5.5.2:

> The pinned-domain-cert MAY be installed as an trust anchor for future
> operations.

Could this be used for future imprinting, or only for other things?
Please clarify.

Section 5.7.1:

> 5.7.1.  MASA audit log response

This section does not provide enough detail to tell what
the MASA should return. Everything? Only log entries concerning
the device in question? Log entries for the same domain as in
the indicated voucher request?
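
The three readings lead to quite different results, as the following
sketch shows. The log entry fields "serial_number" and "domain_id" are
hypothetical names chosen for this illustration and are not the format
defined in the draft.

```python
from typing import Dict, List

LogEntry = Dict[str, str]


def all_entries(log: List[LogEntry]) -> List[LogEntry]:
    """Reading 1: return the complete audit log."""
    return list(log)


def entries_for_device(log: List[LogEntry], serial_number: str) -> List[LogEntry]:
    """Reading 2: only entries concerning the device in question."""
    return [e for e in log if e["serial_number"] == serial_number]


def entries_for_domain(log: List[LogEntry], domain_id: str) -> List[LogEntry]:
    """Reading 3: only entries for the domain in the voucher request."""
    return [e for e in log if e["domain_id"] == domain_id]


sample_log = [
    {"serial_number": "SN-1", "domain_id": "domain-A"},
    {"serial_number": "SN-1", "domain_id": "domain-B"},
    {"serial_number": "SN-2", "domain_id": "domain-A"},
]
```

Reading 1 would expose other owners' devices to the requester, so the
choice also has privacy implications.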

Section 5.7.2:

> A "proximity" assertion
> assures the registrar that the pledge was truly communicating with
> the prior domain and thus provides assurance that the prior domain
> really has deployed the pledge.

I don't think I understand the proximity assurances. Obviously, the
pledge and the owner's domain are in some kind of communication,
but I am not sure how that proves proximity, given that requests from
a pledge could easily be tunneled from somewhere else.

Section 6.2:

> 2.  The pledge MAY support "trust on first use" for physical
>     interfaces such as a local console port or physical user
>     interface but MUST NOT support "trust on first use" on network
>     interfaces.  This is because "trust on first use" permanently
>     degrades the security for all use cases.
> 
> 3.  The pledge MAY have an operational mode where it skips voucher
>     validation one time.  For example if a physical button is
>     depressed during the bootstrapping operation.  This can be useful
>     if the manufacturer service is unavailable.  This behavior SHOULD
>     be available via local configuration or physical presence methods
>     (such as use of a serial/craft console) to ensure new entities
>     can always be deployed even when autonomic methods fail.  This
>     allows for unsecured imprint.

I think #2 is actually too conservative; with a "TOFU" button one
would reasonably be able to pair devices to a local, previously
untrusted registrar. And #3 is, I suppose, something along those
lines, but it is defined quite imprecisely. One should verify the
exchanges and signatures and go through the whole process, but NOT
require a signature from the trusted MASA.

Section 6.4:

> 6.4.  MASA security reductions
> ...
> 1.  Not enforcing that a nonce is in the voucher.  This results in
>     distribution of a voucher that never expires and in effect makes
>     the Domain an always trusted entity to the pledge during any
>     subsequent bootstrapping attempts.

Depending on your viewpoint this may be a security reduction or
increase. It is only a reduction if you are the manufacturer. Owners
and users might wish to protect themselves against the manufacturer,
however, in which case permanent vouchers are a feature, not a bug.

Section 8:

> 8.  Privacy Considerations

This does not discuss the role of serial numbers and other
identifying information.

Section 9:

> 9.  Security Considerations

I expected to find a section that would go through the protocol and
explain the security properties that it has, along with some
discussion of residual vulnerabilities. What I found was a discussion
of a handful of specific issues, such as the impacts of trusting the
manufacturers, or why the nonce design is the way it is in the
protocol.

I'd find it useful to have, say, a discussion for each major
mode of the protocol and what security properties it provides
and does not provide.

Nits/editorial comments: 

> BRSKI provides a solution for secure zero-touch (automated) bootstrap
> of virgin (untouched) devices that are called pledges in this

Maybe s/virgin (untouched)/new (unconfigured)/

> Services that benefit from this:
>  o  Device management.
>  o  Routing authentication.
>  o  Service discovery.

Are these examples? If so, say so. Otherwise, please specify why these services
and not others are suitable for BRSKI.

> the following actions are required and MUST be performed by the
> pledge:
>
>  o  BRSKI: Request Voucher
>  o  EST: CA Certificates Request
>  o  EST: CSR Attributes
>  o  EST: Client Certificate Request
>  o  BRSKI: Enrollment status Telemetry

Actions? Operations? Steps? (Later you use the term operations.)

Would be good to clarify if the listed items are sending a message, or
performing a procedure called <something>, or something else. Wording
here is quite loose.

> 1.5.  Requirements for Autonomic Network Infrastructure (ANI) devices

FWIW, it would have felt more natural to have these requirements specified in the
ANI documents, rather than here.

> The pledge goes through a series of steps, which are outlined here at
> a high level.

The Section 2.1 steps in the figure do not match the "actions" from
Section 1.5.

> 4.  Imprint on the registrar.  This requires verification of the
>     manufacturer service provided voucher.  A voucher contains
>     sufficient information for the pledge to complete authentication
>     of a registrar.  (The embedded 'pinned-domain-certificate'
>     enables the pledge to finish authentication of the registrar TLS
>     server certificate).

There is a lot going on here. Might have been better to separate the
steps.

> After imprint a secure transport exists between pledge and registrar.

Do you mean there's an open TLS/TCP connection or that there are
credentials and keys to securely communicate between the two?

>   The pledge goes through a series of steps, which are outlined here at
>   a high level.
>
> ...
>
>   |            +------v-------+
>   |            | (6) Enrolled |
>   ^------------+              |
>    Factory     +--------------+

"Enrolled" is not a step, but rather a state...

> 2.  Securely authentating the pledges identity via TLS connection to

typo

> 2.  Securely authentating the pledges identity via TLS connection to

s/pledges/pledge's/?

> registrar.  This provides protection against cloned/fake pledged.

s/pledged/pledge/?

> 3.  Secure auto-discovery of the pledges MASA by the registrar via
>     the MASA URI in IDevID as explained below.

I can't parse the sentence.

> 4.  (Optionally) communicating the MUD URL (see Appendix C.

Missing closing paren.

> The following newly defined field SHOULD be in the PKIX IDevID
> certificate:

I didn't understand the formulation here. Are you defining a field
(it seems so from the text below), or are you referencing a field
newly defined elsewhere? I suggest a reformulation, e.g., "This
document defines a new field for PKIX certificates. This field SHOULD
appear in PKIX IDevID certificates when used with BRSKI."

> Specifically, the IDevID:
> 
> 1.  Uniquely identifying the pledge by the Distinguished Name (DN)
> ...
> 2.  Securely authentating the pledges identity via TLS connection to
> ...
>  3.  Secure auto-discovery of the pledges MASA by the registrar via
> ...
> 4.  (Optionally) communicating the MUD URL (see Appendix C.
>
> 5.  (Optional) Signing of voucher-request by the pledges IDevID to
> ...
> 6.  Authorizing pledge (via registrar) to receive certificate from

There's something wrong with the language here. Maybe "Specifically,
the IDevID enables the following: 1. Unique identification of the ..."

Section 2.4 protocol flow is very different from the explanation
in Section 2.1, which does not show the role of the background
systems (MASA) at all.

>  P                     |       /--->       |                    |
>  P                     |       |      [accept device?]          |
>  P                     |       |      [contact Vendor]          |
>  P                     |       |           |--Pledge ID-------->|
>  P                     |       |           |--Domain ID-------->|
>  P                     |       |           |--optional:nonce--->|
>  P                     |       |           |     [extract DomainID]
>  P                     |    optional:      |     [update audit log]
>  P                     |       |can        |                    |
>  P                     |       |occur      |                    |
>  P                     |       |in         |                    |
>  P                     |       |advance    |                    |
>  P                     |       |if         |                    |
>  P                     |       |nonceless  |                    |
>  P                     |       |           |<- voucher ---------|
>  P                     |       \---->      |                    |

I had difficulty understanding the notation here. Is the
thing in the middle with the label "optional" a bracket that
shows that a part of the flow can occur at a different point? Or is
it some entity with messages flowing from it, given that it has
arrows? I think the former, but this is unclear.

> The registrar uses an Implicit Trust Anchor database for

Upon the first reference to the term you should include a
reference to RFC 7030.

> be-determined mechanism (such as an Intent) to become the local

Reference needed to "Intent".

> 4.2.  CoAP connection to Registrar
> 
> The use of CoAP to connect from pledge to registrar is out of scope
> for this document, and may be described in future work.

I would just have said other mechanisms can be defined in future work.

> In the nonced case, validation of the registrar MAY be omitted if the

Nonced?

> assertion:  The assetion leaf in the voucher and audit log indicates

Typo