Relaxing Restrictions on Explicit Congestion Notification (ECN) Experimentation

Summary: Has a DISCUSS. Has enough positions to pass once DISCUSS positions are resolved.

Benoit Claise Discuss

Discuss (2017-09-28 for -06)
Below is Sue Hares' OPS DIR review. I would like to DISCUSS some points. I'm of two minds.
On one side, I read:

       The scope of this memo is limited to these three areas of
       experimentation.  This memo expresses no view on the likely outcomes
       of the proposed experiments and does not specify the experiments in
       detail.  Additional experiments in these areas are possible, e.g., on
       use of ECN to support deployment of a protocol similar to DCTCP
       [I-D.ietf-tcpm-dctcp] beyond DCTCP's current applicability that is
       limited to data center environments.  The purpose of this memo is to
       remove constraints in standards track RFCs that stand in the way of
       these areas of experimentation.

On the other side:

   An Experimental RFC MUST be published for any protocol
   or mechanism that takes advantage of any of these enabling updates.

Btw, the MUST is just plain wrong, as Ben mentioned.
So, if you go in the direction of providing requirements for the experiments (as opposed to just removing the specification constraints), then I agree with Sue. The document title goes in that direction: if it said something such as "Removing constraints to the ECN experimentation", that would be a different story.

Reviewer: Susan Hares
Review result: Has Issues

This is an OPS-DIR review, which focuses on issues in deployed technology
based on this RFC.

Summary: Has issues as a guide to experimental RFCs.  To me these operational
issues

General comment: Thank you, David, for addressing this area.  Better ECN
control is critical to many portions of the network.  My comments on this
draft are because I really hope you can do quality experiments.

How this might be resolved: if there is an operational guidelines section (or
separate document) that covers: a) how to set up an ECT(1) experiment and
determine whether it succeeds or fails, b) how to manage your ECT(1)
experiment in your network, c) how to manage and detect if your ECT experiment
is running into problems with other IETF technology (TRILL, MPLS VPNs, IP VPNs,
BIER and NVO3 technology), d) a recommended monitoring structure (e.g. YANG
modules, NETCONF/RESTCONF and monitoring tools).

Major issues:

1) There is nothing in this document which provides guidelines to the authors
of experimental RFCs based on this draft on specific ways to monitor the ECN
experiments, report the ECN experimental data, or disable the experimental
data.  If the success or failure of an experiment is based on "popular vote"
determined by deployment, then state this point.  I personally would object to
that because you cannot tell if a limited experiment in a specific location
(e.g. a data center) might be successful in another location.

  If the success or failure of an experimental RFC is based on a specific set
  of criteria for ECN, then it would be good to give an operational suggestion
  on how to: a) design an experiment, b) run an experiment and collect data,
  and c) report this data in order to standardize the ECN experiments using
  ECT(1).

  Page 10, section 4.2, last 2 paragraphs: the text hints that you have an
  experiment in mind without specifying the experiment's success or failure
  criteria other than popular vote.  Is this true?  If it is, this is
  problematic.  If I misunderstood your text, then please have someone re-read
  the text.

I have read lots of papers on ECN.

2) No discussion was given on how the TCP layer experimentation would impact
routing layer handling of ECN.

For example, the TRILL WG has the draft draft-ietf-trill-ecn-support.
Automated tunnel set-up for MPLS VPNs or IP VPNs may look at the ECN ECT(0) or
ECT(1).  TRILL's ECN supports the layer-2 within the data centers.  Some IP
VPNs or MPLS VPNs may be needed for the data-center to business site
or data-center backup traffic.

As written, this draft allows loosening of RFC3168 but does not
provide guidelines for network interaction.

3) Some networks also use the diff-service markings to guide traffic in the
network.  This document does not suggest an operational check list on how to
design an experiment that supports or does not support these markings.

4) Modern operational IETF protocols and data modules for automating the
tracking of these experiments should be suggested.

Comment (2017-09-28 for -06)
And again, I can only agree with Sue's comment below, as stressed by Ekr and Ben also.
However, this point was answered by Spencer and Mirja, so let's move on. 

Some reviews have hinted that the text repeats several sets of language.
People have found this lacked clarity.  One wonders why the authors did not
simply provide a set of bis documents for RFC3168, RFC6679, RFC4341, RFC4342,
and RFC5622 if it is just updating the language in the specifications.

This document tries to be both a revision of the specifications and the
architectural guidelines for experiments.  The dual nature does not lead to
clarity on either subject.

Alia Atlas Yes

Spencer Dawkins Yes

Deborah Brungard No Objection

Ben Campbell No Objection

Comment (2017-09-27 for -06)
I agree with Ekr's comment that specifying RFC updates as text "patches" is confusing and unfriendly to the reader. While I don't necessarily agree that we should do bis drafts instead, I hope people will remember that we never actually render an "updated" RFC with the patches applied. I think this sort of thing would be easier to understand if we just spoke of the changes at a conceptual level and avoided getting wrapped around changing specific words to other specific words.  I don't expect that to change with this particular draft, but I think the IESG should consider some guidance here. (And I fully recognize that a lot of update drafts in ART do the same thing.)

-1, first paragraph: "An Experimental RFC MUST be published for any protocol
   or mechanism that takes advantage of any of these enabling updates."

This seems like an odd use of 2119 language, at least for something other than a BCP.  Furthermore, I don't think this paragraph is the real authority on the matter, since there is much more precise text scattered through the draft about this particular requirement. Therefore, I suggest removing the 2119 language and stating this descriptively. (e.g. "Mechanisms that take advantage of these updates are to be specified in experimental RFCs.")

Speaking of which, there are several references to requiring experimental RFCs. Are those expected to be IETF stream RFCs?

-5, first "patch": 
Are there existing implementations that would become noncompliant with the new text?

-9, 1st paragraph: "As a process memo that only removes limitations on proposed
   experiments, there are no protocol security considerations."
I am skeptical that removing limitations from experiments can be assumed to be security neutral on its face. Please consider documenting the thought process that led to that conclusion.

-1, 2nd paragraph: "There is no need to make changes for protocols ..."
I'm a bit confused--those hypothetical standards track RFCs will need to make the sort of changes that this paragraph says are not needed. Is the point that _this_ document does not need to make those changes?

-2, Congestion Response Difference: "As discussed further in
      Section 4.1, an ECN congestion indication communicates a higher
      likelihood that a shorter queue exists at the network bottleneck
      node by comparison to a packet drop that indicates congestion ..."

I'm not sure what is being compared here. Is it a high chance of shorter queue, or higher chance of a short queue?

-- (same paragraph): "(next bullet)"
There are no bullets.

-2, Congestion Marking Difference:
The first sentence is hard to parse. Please consider breaking it into simpler sentences.

-3, first paragraph: "As specified in RFC 3168, ... [stuff]... , as specified in experimental
   RFC 3540 [RFC3540]."
This is hard to parse. In particular, I have trouble deciphering which part is supposed to be specified in 3168 vs 3540.

-- 2nd paragraph: "but might equally have been due to re-
   marking of the ECN field by an erroneous middlebox or router."
I think "erroneous" modifies "re-marking" rather than "middlebox or router".

-4.2, 1st paragraph: "... prevent the aggressive low latency traffic starving conventional
   traffic ...  "
Is there a missing "from" between "traffic" and "starving"? Or maybe an "of" after "starving"?

Suresh Krishnan No Objection

Warren Kumari No Objection

Comment (2017-09-28 for -06)
I'm supporting Benoit's discuss. I have nothing to add, and will let him carry it.

Mirja Kühlewind No Objection

Comment (2017-09-26 for -06)
1) Section 2 actually seems a bit redundant to me, but I guess that's not a problem. I would also be okay with the doc if section 2 were not there, but maybe this overview actually helps others who are less familiar with ECN.

2) I guess it could actually be helpful to include a tiny bit of reasoning why the ECN Nonce experiment is concluded instead of just saying that it was never deployed and further pointing to an appendix of a draft (which is actually appendix C and not B.1 I believe). I know this was discussed, but having one sentence along the following lines is probably not too hard: "The experiment is concluded with the result that ECN Nonce does not provide reliable integrity protection, as the other end can always pretend to not support this optional feature."

3) This sentence is a bit unclear:
"In addition, until the conclusion of the L4S experiment, use of
   ECT(1) in IETF RFCs is not appropriate, as the IETF may decide to
   allocate ECT(1) exclusively for L4S usage if the L4S experiment is
I guess the L4S exp RFC would use ECT(1), so the sentence does not really make sense to me. Also, given that there is no RFC for L4S yet, we actually don't know if there is finally community/group consensus to publish that RFC. I guess it is actually too early to say something like this. I know why you want to have this sentence in there, but from the processing point of view that does not seem to be the correct thing to do and it might be better to just remove it.
Otherwise, if you want to keep it, or something similar but maybe less specific, then, following up on the feedback provided by IANA, the TCP flag should probably just be marked as reserved with reference to this RFC.

4) I'm not certain about this part:
"An exception to this requirement occurs in
      responding to an ongoing attack.  For example, as part of the
      response, it may be appropriate to drop more ECT-marked TCP SYN
      packets than TCP SYN packets marked with not-ECT. "
Maybe it's too late here and that's the reason I don't get it, but what would be the reason to rather drop ECT-marked SYNs? Can you explain? I do understand that there is no reason to not drop ECT-marked SYNs in an attack situation (meaning don't try to mark, drop immediately) but I don't understand why you should drop ECT-marked SYNs preferentially. If the assumption is that attackers would more likely use ECN than non-attackers because their packets will usually not be dropped but marked, I'm uncertain if that is true and still not sure if the above recommendation is appropriate. In any case I think this needs at least more explanation in the document.

5) As a side note on this sentence:
"Random ECT values MUST NOT be used, as that may expose RTP to
      differences in network treatment of traffic marked with ECT(1) and
      ECT(0) and differences in associated endpoint congestion
      responses, e.g., as proposed in [I-D.ietf-tsvwg-ecn-l4s-id]."
I think random marking is not a good idea anyway, because the marking is sometimes used (incorrectly) as input for ECMP. Therefore I think the reference to the l4s draft is not actually needed here.
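
As a rough illustration of the ECMP concern in point 5, the following Python sketch (not taken from the draft or from any reviewer) decodes the two-bit ECN field using the RFC 3168 codepoints and compares a flow key that includes the ECN bits with one that masks them out. The names ECN, ecn_of, dscp_of and flow_key are hypothetical, and the flow key is only a simplified stand-in for whatever a real ECMP implementation hashes.

```python
from enum import IntEnum


class ECN(IntEnum):
    """ECN field codepoints as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


ECN_MASK = 0b11  # the ECN field is the two low-order bits of the TOS byte


def ecn_of(tos: int) -> ECN:
    """Return the ECN codepoint carried in an IPv4 TOS / IPv6 Traffic Class byte."""
    return ECN(tos & ECN_MASK)


def dscp_of(tos: int) -> int:
    """Return the six-bit DSCP carried in the same byte."""
    return (tos >> 2) & 0x3F


def flow_key(src, dst, sport, dport, proto, tos, include_ecn):
    """Hypothetical ECMP flow key (an assumption for illustration only).

    With include_ecn=True the whole TOS byte feeds the key, so packets of one
    flow that differ only in their ECN codepoint get different keys.
    With include_ecn=False the ECN bits are masked and the key stays stable.
    """
    tos_part = tos if include_ecn else tos & ~ECN_MASK & 0xFF
    return (src, dst, sport, dport, proto, tos_part)


# One RTP-like UDP flow, marked ECT(0) on one packet and ECT(1) on the next.
flow = ("192.0.2.1", "198.51.100.7", 5004, 5004, 17)
ect0_tos = (46 << 2) | ECN.ECT_0
ect1_tos = (46 << 2) | ECN.ECT_1

split = flow_key(*flow, tos=ect0_tos, include_ecn=True) != \
    flow_key(*flow, tos=ect1_tos, include_ecn=True)
stable = flow_key(*flow, tos=ect0_tos, include_ecn=False) == \
    flow_key(*flow, tos=ect1_tos, include_ecn=False)
```

In this sketch, split is true when the ECN bits feed the key and stable is true when they are masked, which is one way to picture why randomly alternating ECT(0) and ECT(1) within a flow could spread its packets across paths on equipment that hashes the full byte.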

Terry Manderson No Objection

Alexey Melnikov No Objection

Comment (2017-09-27 for -06)
I found section 2 to be useful.

Kathleen Moriarty No Objection

Comment (2017-09-28 for -06)
I support Benoit's discuss.

Eric Rescorla No Objection

Comment (2017-09-24 for -06)
Having a document which is sort of a verbal patch on another document is pretty hard to read. I recognize that this seems to be customary in some areas, so I'm not marking this as DISCUSS, but I really wish you would do a bis instead.

Line 98
   This memo updates RFC 3168 [RFC3168] which specifies Explicit
   Congestion Notification (ECN) as a replacement for packet drops as
   indicators of network congestion.  It relaxes restrictions in RFC
Replacement or additional indicator?

Line 164
      that for congestion indicated by ECN, a different sender
      congestion response (e.g., reduce the response so that the sender
      backs off by a smaller amount) may be appropriate by comparison to
nit: reducing

Line 170
      couples the backoff change to Congestion Marking Differences
      changes (next bullet).  This is at variance with RFC 3168's
      requirement that a sender's congestion control response to ECN
I'm having a lot of trouble reading this sentence. It seems like you are comparing the ECN response to a loss response, but these other two drafts are also about a less aggressive response. Perhaps this would be clearer as:

"indicated by loss. Two examples of such a reduced response are..."

Alvaro Retana No Objection

Adam Roach No Objection

Comment (2017-09-26 for -06)
Section 3 ends with the following text:

   Additional minor changes remove other mentions of the ECN nonce and
   implications that ECT(1) is intended for use by the ECN nonce; the
   specific text updates are omitted for brevity.

I'm not sure I follow what's meant here. Is this basically saying "there are other, less substantive edits required to RFC 3168, but those changes are left as an exercise for the reader"?

[Sorry for the additional noise, but I noticed one more issue that you'll want to correct]

Section 9 says "See Appendix B.1 of [I-D.ietf-tsvwg-ecn-l4s-id] for discussion of alternatives to the ECN nonce." I believe that this should refer to Appendix C.1 rather than Appendix B.1.

Alissa Cooper No Record