Last Call Review of draft-ietf-mpls-residence-time-12
review-ietf-mpls-residence-time-12-secdir-lc-kaduk-2017-01-18-00

Request Review of draft-ietf-mpls-residence-time
Requested rev. no specific revision (document currently at 15)
Type Last Call Review
Team Security Area Directorate (secdir)
Deadline 2017-01-17
Requested 2017-01-03
Draft last updated 2017-01-18
Completed reviews Rtgdir Early review of -11 by He Jia (diff)
Genart Last Call review of -12 by Robert Sparks (diff)
Secdir Last Call review of -12 by Benjamin Kaduk (diff)
Opsdir Last Call review of -12 by Jürgen Schönwälder (diff)
Secdir Telechat review of -14 by Benjamin Kaduk (diff)
Genart Telechat review of -13 by Robert Sparks (diff)
Genart Telechat review of -14 by Robert Sparks (diff)
Assignment Reviewer Benjamin Kaduk
State Completed
Review review-ietf-mpls-residence-time-12-secdir-lc-kaduk-2017-01-18
Reviewed rev. 12 (document currently at 15)
Review result Has Issues
Review completed: 2017-01-18

Review
review-ietf-mpls-residence-time-12-secdir-lc-kaduk-2017-01-18

I have reviewed this document as part of the security directorate's 
ongoing effort to review all IETF documents being processed by the 
IESG.  These comments were written primarily for the benefit of the 
security area directors.  Document editors and WG chairs should treat 
these comments just like any other last call comments.

This document is Almost Ready.

This document describes a mechanism for recording the residence time of
timing packets along a network path, so that time protocols (NTP, PTP) can obtain
more accurate estimates of transit time.  (Well, the document just claims
to record the residence time and I am inferring the bit about more accurate
estimates; maybe there should be an explicit mention.)  It is limited (at
present) to MPLS networks established using RSVP-TE, to the exclusion of
LSPs established via LDP and non-MPLS networks.  The residence time measurement
is performed in a Generic Associated Channel, and several new data structures
(and sub-data structures) and corresponding type values (and sub-type values)
are established to carry the needed information.

I almost marked this document as Ready with Issues (discussion below), but
ended up changing to Almost Ready, since I think the document structure
needs some more work in order to clearly describe what an interoperable
implementation would need and how the pieces fit together, as is required
for Standards-Track documents.

The security considerations section incorporates that section from
RFC 5586 by reference, but 5586's security considerations are basically
just pointers to those considerations in RFCs 4385 and 5085.
This document also mentions RFC 7384, whose entirety is security requirements
of time procotols, which probably contains more detail than this document would
need if discussion was inline.  However, the security considerations of
draft-ietf-mpls-residence-time-12 also contains discussion about how
PTP-aware nodes on the path are required to modify the messages, and the
needed trust model involves these nodes being trusted to perform those modifications.
That seems true and is probably fine for a protocol that is running on
"trusted infrastructure", but the claim is also made that the messages modified
by intermediate nodes "cannot be authenticated".  This is only somewhat
true, as one can create complex crypto schemes that involve giving key
material to intermediate nodes that can let them make authenticated
(but detectable) modifications.  Such schemes seem far too complex for the
topic at hand, though, as they are likely to increase the processing delay
for the time packets, and it seems fine to defer investigating them in the
same way that it is fine to defer investigating authenticating/encrypting
the RTM data that does not need to be modified by intermediate nodes, which
is explicitly noted in the security considerations.

I do think there are some relevant security considerations that are not
mentioned, though -- for the two-step flow, an RTM-capable node is
required to wait for the follow-up RTM message and make the corresponding
residence time update.  This requirement is unbounded and could lead to
a resource leak if that follow-up packet fails to arrive, for an implementation
that blindly follows the spec without resorting to practical engineering
knowledge.  I do not expect there to be any such implementations, but this
document should probably indicate that timing out is okay within
"reasonable" bounds, or whatever similar workaround is best practice in this
domain.

In terms of other security/privacy considerations that are new in this
document, there is some information exposure about nodes along the path
that could potentially be used for fingerprinting, but since the timing
packets carry destination addresses already, and the LSP setup appears
to involve declaring the path anyway, this doesn't seem to merit any concern.

The other main issue I have with this document is arguably not an issue
at all, but it relates to the plethora of TLVs and sub-TLVs and TLVs
in other registries, with an IANA considerations section that sometimes
does not clearly indicate what registry is to be updated.  As per the
checklist at https://www.ietf.org/iesg/template/doc-writeup-essay-style.html,
the IANA considerations shoudl refer to registries by their exact names,
which probably means the name of the sub-registry and the overarching
parent registry should be clearly written out.  It might also be nice
to have more descriptive names, so I do not have to keep track that there
are RTM G-ACh packets whose values are sometimes sub-TLVs in the PTP case;
RTM Capability (or is it Capabilities?) sub-TLVs that can be contained
in any of OSPFv2, IS-IS, and maybe OSPFv3 data structures; an RTM_SET
TLV whose presence is indicated as an Attribute Flag from some registry that does
not seem to be named; sub-TLVs within those RTM_SET messages; and also
the RTM sub-TLVs that get a registry created for them in section 8.3
and I apparently missed when paging through the document to create this list.
The names used to refer to these structures have me flipping back and
forth to figure out which is which, whereas a name like "RTM_SET
address identifier" (viz. section 8.6) would help the reader a lot.
In a similar vein, it would be nice to have some test vectors that show
the encoding of these structures and their encapsulation in the parent
data structures, to make sure that the implementor gets all the right
layers of T+L wrapping in place.  Alternately (or additionally!), more
clear text that the T+L in the figures here are included in the encoded
data that is the contents of a specific, named, parent data structure
would be useful.


Some other more nit-level things:

In the example in section 4.6, when F is updating the correction field
of the PTP message, I assume that F should also use its measurement of
the residence time on F in addition to the value received in the scratch
pad field, but the example seems to not indicate that.

A couple paragraphs later, "[a]n ingress node that is configured to
perform RTM [...] verifies that the selected egress [node supports RTM]";
should that be a MUST-level requirement that the verification is done?

On page 12, last paragraph, we have some text "If no RTM_SET TLV has been
found, then the LSP setup MUST fail [...]".  Is this only in the case
when the RTM_SET flag is set?  If so, that should probably be made more
clear in the text, as on my first reading I was surprised, since
the RTM_SET generally goes in the LSP_ATTRIBUTES and not the
LSP_REQUIRED_ATTRIBUTES, and as such would not be globally mandatory.

Section 5 makes an offhand note that a 4.6 nanosecond error would
probably be ignorable, which leads to the question: what is the
actual measurment precision that is needed for this scheme to be useful?
The scratch pad uses an IEEE double to count nanoseconds, so potentially
sub-nanosecond values are in-scope, but as someone not well-versed in
PTP I really have no idea how good things can/need to be.

The "A" and "B" subcases mentioned in section 7 get multiple paragraphs
each; it might be more clear to make them subsections instead.

I'm also left puzzled by the last paragraph of section 7; it seems to say
that the *last* RTM(-capable) node of the LSP will generate the follow-up
message, but I thought it was generally an earlier node that would be
setting the S bit and generating the follow-up message.

This document uses several abbreviations/acronyms without introduction
that do not appear on the RFC Editor's abbreviations list
(https://www.rfc-editor.org/materials/abbrev.expansion.txt) as not
needing expansion: G-ACh (also appears in the abstract; the RFC Editor
will likely want to not use the acronym at all in the abstract),
RSVP-TE, and PW are ones I noted.
(LDP is also used without expansion, but does appear on the list as
"well-known".)


There are also a lot of grammar nits (including very many missing
instances of the definite article), but it does not seem worth enumerating
them here.  I will try to send a diff to the authors later this week,
but time is a bit short at the moment.

-Ben