Last Call Review of draft-ietf-tcpm-rto-consider-14
review-ietf-tcpm-rto-consider-14-genart-lc-bryant-2020-05-30-00

Request Review of draft-ietf-tcpm-rto-consider
Requested rev. no specific revision (document currently at 16)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2020-06-01
Requested 2020-05-18
Authors Mark Allman
Draft last updated 2020-05-30
Completed reviews Secdir Last Call review of -14 by Liang Xia (diff)
Genart Last Call review of -14 by Stewart Bryant (diff)
Iotdir Telechat review of -16 by Wesley Eddy
Assignment Reviewer Stewart Bryant
State Completed
Review review-ietf-tcpm-rto-consider-14-genart-lc-bryant-2020-05-30
Posted at https://mailarchive.ietf.org/arch/msg/gen-art/wUMdgNledJBa-wRrzFtw46QrVB0
Reviewed rev. 14 (document currently at 16)
Review result Not Ready
Review completed: 2020-05-30

Review
review-ietf-tcpm-rto-consider-14-genart-lc-bryant-2020-05-30

I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document: draft-ietf-tcpm-rto-consider-14
Reviewer: Stewart Bryant
Review Date: 2020-05-30
IETF LC End Date: 2020-06-01
IESG Telechat date: Not scheduled for a telechat

Summary:

I have concerns that there is inadequate scoping information provided in this proposed BCP.

The authors are clearly focused on some L4 cases and some applications. I am concerned
that this document, if published as is, will result in excessive work and delay if reviewers
insist that it applies to many network infrastructure cases. I am also concerned that past
experience may not be a good guide to some new applications, particularly those
from the deterministic stable.

Major issues:

As far as I can see this text only applies to exchanges between applications and network support applications such as DNS, i.e. it is targeted at layer 4 and above. Given the religious nature of BCPs in the eyes of some reviewers, and to prevent endless explanations by those who design routing protocols, OAM and other lower-layer sub-systems, I think there needs to be scoping text in block capitals at the very start of the document.

=========

      - The requirements in this document may not be appropriate in all
        cases and, therefore, inconsistent deviations may be necessary
        (hence the "SHOULD" in the last bullet).  However,
        inconsistencies MUST be (a) explained and (b) gather consensus. 
 
SB> That can be quite an onerous obligation and provide scope for endless
argument when reviewers are not domain experts in the protocol being 
designed.

=======

          While there are a bevy of uses for timers in protocols---from
          rate-based pacing to connection failure detection and
          beyond---these are outside the scope of this document.

SB> I am not sure what that means for the applicability of this document.

=========

    (1) As we note above, loss detection happens when a sender does not
        receive delivery confirmation within an some expected period of
        time.  In the absence of any knowledge about the latency of a
        path, the initial RTO MUST be conservatively set to no less than
        1 second. 

SB> This issue may be addressed by the scoping text, but 1s is no use
when you are trying to detect packet loss in the infrastructure in under 50ms.

=============

    (3) Each time the RTO is used to detect a loss, the value of the RTO
        MUST be exponentially backed off such that the next firing
        requires a longer interval.  The backoff SHOULD be removed after
        either (a) the subsequent successful transmission of
        non-retransmitted data, or (b) an RTO passes without detecting
        additional losses.  The former will generally be quicker.  The
        latter covers cases where loss is detected, but not repaired.
    
        A maximum value MAY be placed on the RTO.  The maximum RTO MUST
        NOT be less than 60 seconds (as specified in [RFC6298]).

        This ensures network safety.

SB> This does not work in OAM applications.
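
As a rough illustration of the timer behaviour that requirements (1) and (3) describe, here is a minimal Python sketch. The function name rto_schedule and its parameters are my own assumptions, not anything taken from the draft; the only values from the quoted text are the 1 second initial floor and the 60 second lower bound on any maximum. Doubling is assumed as the backoff factor, since the draft only says "exponentially".

```python
def rto_schedule(initial_rto=1.0, max_rto=60.0, retries=8, factor=2.0):
    """Return the successive RTO values (in seconds) used across repeated timeouts.

    Hypothetical helper for illustration only; not part of the draft.
    """
    # Requirement (1): with no knowledge of path latency, start at >= 1 second.
    if initial_rto < 1.0:
        raise ValueError("initial RTO must be at least 1 second")
    # Requirement (3): an optional cap must not be below 60 seconds.
    if max_rto < 60.0:
        raise ValueError("maximum RTO must not be less than 60 seconds")

    schedule = []
    rto = initial_rto
    for _ in range(retries):
        schedule.append(rto)
        # Exponential backoff (doubling assumed), clamped to the cap.
        rto = min(rto * factor, max_rto)
    return schedule


print(rto_schedule())
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Under these assumptions, three consecutive timeouts already consume 1 + 2 + 4 = 7 seconds, which is the kind of interval that does not fit OAM loss detection.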

Minor issues:

 "By waiting long enough that we are unambiguously
  certain a packet has been lost we cannot repair losses in a timely
  manner and we risk prolonging network congestion."

I have a concern here that the emphasis is on classical operation. We are beginning to see
applications run over the network where the timely delivery of a packet is critical
for correct operation of even SoL. As a BCP the text needs to recognise that the
scope and purpose of IP is changing and that classical learning and the rules derived from it
may not apply.

Also, if not ruled out of scope earlier, we need to be clear at this point that things like BFD
have different considerations.
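
To make the difference concrete, here is a small Python sketch comparing a BFD-style detection time with the 1 second initial RTO floor discussed above. The 10 ms transmit interval and the multiplier of 3 are illustrative assumptions of mine, not values from the draft, and the helper name is hypothetical; detection time is taken as interval times detect multiplier, as in BFD asynchronous mode.

```python
def detection_time_ms(tx_interval_ms, detect_mult):
    """Detection time for a BFD-style liveness check: interval x multiplier."""
    return tx_interval_ms * detect_mult


INITIAL_RTO_FLOOR_MS = 1000  # the 1 second minimum initial RTO from the draft

# Assumed example session: 10 ms interval, detect multiplier of 3.
bfd_ms = detection_time_ms(10, 3)

print(bfd_ms)                          # 30
print(INITIAL_RTO_FLOOR_MS // bfd_ms)  # 33
```

With these assumed values the liveness check declares a failure in 30 ms, inside a 50 ms budget, while a single initial RTO is more than thirty times longer.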

==========

      "- This document does not update or obsolete any existing RFC.
        These previous specifications---while generally consistent with
        the requirements in this document---reflect community consensus
        and this document does not change that consensus."

I think it needs to be clear that adherence to this RFC is not required for minor
updates and extensions to existing RFCs. Having seen minor routing extensions held
up by security concerns related to underlying protocols rather than the extension itself,
there is a lot of sensitivity on this point in some quarters of the IETF.

========

It might be useful to make it clear that there are some applications that would prefer
no data to late data.

Nits/editorial comments:

The terminology section confuses ID-nits - I think it should be a section in its own right
later in the document.

The following ID-nits issues need looking at:


  == Missing Reference: 'RFC5681' is mentioned on line 377, but not defined

  == Unused Reference: 'RFC3940' is defined on line 515, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4340' is defined on line 519, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC6582' is defined on line 540, but no explicit
     reference was found in the text