Last Call Review of draft-ietf-netconf-subscribed-notifications-23

Request Review of draft-ietf-netconf-subscribed-notifications
Requested rev. no specific revision (document currently at 26)
Type Last Call Review
Team Transport Area Review Team (tsvart)
Deadline 2019-04-12
Requested 2019-03-22
Authors Eric Voit, Alexander Clemm, Alberto Prieto, Einar Nilsen-Nygaard, Ambika Tripathy
Draft last updated 2019-04-02
Completed reviews Yangdoctors Last Call review of -10 by Andy Bierman (diff)
Yangdoctors Last Call review of -21 by Andy Bierman (diff)
Rtgdir Last Call review of -23 by Ravi Singh (diff)
Secdir Last Call review of -23 by Chris Lonvick (diff)
Tsvart Last Call review of -23 by Wesley Eddy (diff)
Opsdir Last Call review of -23 by Carlos Pignataro (diff)
Assignment Reviewer Wesley Eddy
State Completed
Review review-ietf-netconf-subscribed-notifications-23-tsvart-lc-eddy-2019-04-02
Reviewed rev. 23 (document currently at 26)
Review result Almost Ready
Review completed: 2019-04-02


This document has been reviewed as part of the transport area review team's
ongoing effort to review key IETF documents. These comments were written
primarily for the transport area directors, but are copied to the document's
authors and WG to allow them to address any issues raised and also to the IETF
discussion list for information.

When done at the time of IETF Last Call, the authors should consider this
review as part of the last-call comments they receive. Please always CC if you reply to or forward this review.

The document includes a mechanism by which a subscriber can ask a publisher to use a particular DiffServ Code Point (DSCP) value.  This is missing some context.  Why would a subscriber do this?  Is it asking for DSCP values that it assumes are honored, and not bleached or altered, all the way between it and the publisher?  Are there conditions normal for the use of this protocol where that assumption usually holds?

By asking for a particular DSCP to be set, the subscriber is perhaps attempting to optimize behavior during some type of overload, where it really wants some notifications quickly, while others aren't as important and may be fine to drop and have retransmitted by TCP, etc.  This all seems to be implicit, though, with no real discussion of what the protocol is attempting to enable with this option or how it would be productively used.

It seems to be considered fatal if a publisher can't write a requested DSCP value.  The requests fail with "dscp-unavailable".  This seems too drastic, since the DSCP is only advisory to nodes on the path and may be altered in transit anyway.  I'm not sure why this would be considered a fatal issue for the subscription.
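To illustrate the non-fatal alternative, here is a minimal sketch of best-effort marking on an IPv4 socket. It is not taken from the draft; the function names `dscp_to_tos` and `try_mark_socket` are hypothetical, and it assumes a platform where Python's `socket` module exposes `IP_TOS` (an IPv6 socket would use `IPV6_TCLASS` instead).

```python
import socket


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper six bits of the DS field octet."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def try_mark_socket(sock, dscp: int) -> bool:
    """Try to set the DSCP on an IPv4 socket.

    Returns False instead of raising when the platform or the OS
    refuses, so the caller can continue with an unmarked connection
    (an assumption of this sketch, not behavior defined by the draft).
    """
    ip_tos = getattr(socket, "IP_TOS", None)
    if ip_tos is None:
        # Platform does not expose the option at all.
        return False
    try:
        sock.setsockopt(socket.IPPROTO_IP, ip_tos, dscp_to_tos(dscp))
    except OSError:
        # Marking was refused; treat it as advisory and carry on.
        return False
    return True
```

A publisher following this pattern could report that the marking was not applied while still establishing the subscription, rather than rejecting the request outright.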

The feature description for "dscp" mischaracterizes it as a pure priority mechanism, rather than a more general indication of class of service treatment:  "This feature indicates a publisher supports the placement of suggested prioritization levels for network transport within notification messages."  It could be more correct to say something like "This feature indicates that a publisher supports the ability to set the DiffServ Code Point (DSCP) value in outgoing packets."

The same comment is applicable in the dscp leaf description on page 45.  It is mischaracterized as a pure priority mechanism, which is not how the IETF has defined DSCP.

The weighting feature seems to need a little bit more work.  It allows values between 0 and 255, and there is some description that bandwidth is supposed to be allocated somehow in proportion to the weighting, but it's not really clear how this would be done or that it makes sense.  Is it assumed that the publisher has some fixed bandwidth limit that it's trying to stay within, and that it can choose messages from streams (based on their size, frequency, etc.) in some way to honor the weights?  What is the method to compute the proportions?  Is it purely linear?  What if there is no contention?  What if there is no inherent bandwidth limit known to the publisher (since generally there probably wouldn't be)?  The intention and detail of what this is trying to achieve seem to need to be worked through a bit more, to avoid just having a complex feature that might not achieve much real result, depending on how it's implemented.
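To make the questions concrete, here is a minimal sketch of one possible reading, a purely linear split of a known capacity. It is an assumption for illustration only and is not specified by the draft; `linear_shares`, its parameters, and the handling of all-zero weights are hypothetical.

```python
def linear_shares(weights, capacity):
    """Split `capacity` among subscriptions in linear proportion to weight.

    `weights` maps a subscription id to a weight in 0..255.
    `capacity` is a bandwidth limit the publisher is ASSUMED to know,
    which is exactly the assumption questioned above.
    """
    for sid, w in weights.items():
        if not 0 <= w <= 255:
            raise ValueError(f"weight for {sid!r} must be in 0..255")
    if not weights:
        return {}
    total = sum(weights.values())
    if total == 0:
        # All weights are zero: the outcome is undefined here, so this
        # sketch arbitrarily falls back to an equal split.
        return {sid: capacity / len(weights) for sid in weights}
    return {sid: capacity * w / total for sid, w in weights.items()}


shares = linear_shares({"sub-a": 1, "sub-b": 3}, 100.0)
# sub-a receives 25.0 and sub-b receives 75.0 of the assumed capacity
```

Even this simple version has to invent answers for the zero-weight case and for where `capacity` comes from, and it says nothing about what happens when there is no contention.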

Has any thought been given to phase effects, where events that have many subscribers cause a large number of simultaneous writes on different streams?  Overall, there could be low bandwidth utilization for notifications, but then sudden spikes when the notifications going out on multiple streams become coupled.  This could lead to some connections seeing losses and others not, for instance, and there might be a reason to try to dither the writes or to encourage other logic for handling surges of outgoing notifications.  Since the underlying transports are TCP-based, losses will be recovered eventually, but latency could be created in the meantime.  This might motivate some of the QoS features that are cursorily discussed, but it's not clear how they would be effective.
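As a rough illustration of what dithering the writes could look like, the sketch below spreads the sends for a single event over a short window using uniform random offsets. This is one assumed approach, not something the draft describes; `dithered_offsets` and its parameters are hypothetical names.

```python
import random


def dithered_offsets(subscriber_count, window_seconds, rng=None):
    """Return one send delay per subscriber, spread over a window.

    Each delay is drawn uniformly from 0 to window_seconds so that the
    writes for one event do not all hit the network at the same instant.
    The delays are returned in ascending order.
    """
    if subscriber_count < 0 or window_seconds < 0:
        raise ValueError("count and window must be non-negative")
    rng = rng or random.Random()
    return sorted(rng.uniform(0.0, window_seconds)
                  for _ in range(subscriber_count))
```

The trade-off is that the added delay is itself latency, so the window would need to be small relative to whatever delay the subscribers can tolerate.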