TRILL (TRansparent Interconnection of Lots of Links): ECN (Explicit Congestion Notification) Support
draft-ietf-trill-ecn-support-07

Note: This ballot was opened for revision 05 and is now closed.

(Alia Atlas) Yes

(Spencer Dawkins) Yes

Comment (2018-02-07 for -05)
I agree with Mirja about the status of L4S, but would go even further: L4S is only one of the ECN experiments that https://datatracker.ietf.org/doc/rfc8311/ was intended to accommodate, so you might want to capture that in the appendix (basically saying "L4S is one example of the ways TRILL ECN handling may evolve", or something like that).

Is 

  If an RBridge supports ECN, for the two cases of an IP and a non-IPR
   inner packet, the egress behavior is as follows:

really "non-IPR"? I'm guessing it should be "non-IP".

s/significnat/significant/

Deborah Brungard No Objection

(Ben Campbell) No Objection

Comment (2018-02-07 for -05)
Just a typo in section 6: s/significnat/significant/

(And I see Spencer already caught it :-) )

(Benoît Claise) No Objection

Alissa Cooper No Objection

Suresh Krishnan No Objection

Mirja Kühlewind No Objection

Comment (2018-02-07 for -05)
Given that L4S is not published yet and is in any case experimental, I would recommend removing section 4.2 entirely and keeping only the appendix as informational documentation of the proposed algorithm.

(Terry Manderson) No Objection

Alexey Melnikov No Objection

(Kathleen Moriarty) No Objection

(Eric Rescorla) No Objection

Alvaro Retana No Objection

Adam Roach (was Discuss) No Objection

Comment (2018-02-20 for -05)
I'm balloting "no objection" based on explanations from the author about
all three points raised in my original discuss. I have requested that the
authors include a summary of these explanations in the document to
aid implementors in understanding why Table 3 is defined the way it is,
so they don't erroneously conclude that the table is incorrect.

My original discuss text and original comments appear below for posterity.

---------------------------------------------------------------------------

Thanks to the authors, chairs, shepherd, and working group for the effort that
has been put into this document.

I have concerns about some ambiguity and/or self-contradiction in this
specification, but I suspect these should be easy to fix. In particular, the
behavior defined in Table 3 doesn't seem to be consistent with the behavior
described in the prose.

For easy reference, I've copied Table 3 here:

>       +---------+----------------------------------------------+
>       | Inner   |  Arriving TRILL 3-bit ECN Codepoint Name     |
>       | Native  +---------+------------+------------+----------+
>       | Header  | Not-ECT | ECT(0)     | ECT(1)     |     CE   |
>       +---------+---------+------------+------------+----------+
>       | Not-ECT | Not-ECT | Not-ECT(*) | Not-ECT(*) |  <drop>  |
>       |  ECT(0) |  ECT(0) |  ECT(0)    |  ECT(1)    |     CE   |
>       |  ECT(1) |  ECT(1) |  ECT(1)(*) |  ECT(1)    |     CE   |
>       |    CE   |      CE |      CE    |      CE(*) |     CE   |
>       +---------+---------+------------+------------+----------+
>
>                      Table 3. Egress ECN Behavior
>
>  An asterisk in the above table indicates a currently unused
>  combination that SHOULD be logged. In contrast to [RFC6040], in TRILL
>  the drop condition is the result of a valid combination of events and
>  need not be logged.

The prose in this document indicates:

 1. Ingress gateway either copies the native header value to the TRILL ECN
    codepoint (resulting in any of the four values above) or doesn't insert
    any ECN information in the TRILL header.

 2. Intermediate gateways can set the CCE flag, resulting in "CE" in the
    table above.

Based on the above, a packet arriving at an egress gateway can only be in one of
the following states:

 A. TRILL header is Not-ECT because no TRILL node inserted ECN information.

 B. TRILL header value == Native header value because the ingress gateway
    copied it from native to TRILL.

 C. TRILL header is "CE" because an intermediate node indicated congestion.

If that's correct, I would think that any state other than those three needs
to be marked with an (*). In particular, these two states fall into that
classification, and seem to require an asterisk:

  - Native==CE && TRILL==ECT(0)

  - Native==ECT(0) && TRILL==ECT(1)

In addition, the behavior this table defines for Native==ECT(0) && TRILL==ECT(1)
is somewhat perplexing: for this case, the value in the TRILL header takes
precedence; however, when Native==ECT(1) && TRILL==ECT(0) the Native header
takes precedence. Or, put another way, this table defines ECT(1) to always
override ECT(0). I don't find any prose in here to indicate why this needs to be
treated differentially, so I'm left to conclude that this is a typographical
error. If that's not the case, please add motivating text to Table 3 explaining
why ECT(1) is treated differently than ECT(0) for baseline ECN behavior.
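
The consistency check described above can be sketched in a few lines of Python. This is only an illustration of the reasoning in this discuss, not text or code from the draft: the names TABLE3, reachable, and unflagged_unreachable are hypothetical, and the reachable() rule encodes only the simplified states A, B, and C listed above, not any further explanation of how the combinations arise.

```python
# Illustrative model of Table 3. All names here are hypothetical
# and are not taken from the draft.
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"
CODEPOINTS = [NOT_ECT, ECT0, ECT1, CE]
DROP = "<drop>"

# TABLE3[native][trill] = (outgoing native codepoint, marked with asterisk?)
TABLE3 = {
    NOT_ECT: {NOT_ECT: (NOT_ECT, False), ECT0: (NOT_ECT, True),
              ECT1: (NOT_ECT, True), CE: (DROP, False)},
    ECT0: {NOT_ECT: (ECT0, False), ECT0: (ECT0, False),
           ECT1: (ECT1, False), CE: (CE, False)},
    ECT1: {NOT_ECT: (ECT1, False), ECT0: (ECT1, True),
           ECT1: (ECT1, False), CE: (CE, False)},
    CE: {NOT_ECT: (CE, False), ECT0: (CE, False),
         ECT1: (CE, True), CE: (CE, False)},
}


def reachable(native, trill):
    """True if the pair matches state A, B, or C from the list above."""
    return (
        trill == NOT_ECT      # A: no TRILL node inserted ECN information
        or trill == native    # B: ingress copied native to TRILL
        or trill == CE        # C: an intermediate node indicated congestion
    )


def unflagged_unreachable():
    """Cells outside A/B/C that Table 3 does not mark with an asterisk."""
    return [
        (native, trill)
        for native in CODEPOINTS
        for trill in CODEPOINTS
        if not reachable(native, trill) and not TABLE3[native][trill][1]
    ]


if __name__ == "__main__":
    for native, trill in unflagged_unreachable():
        print(f"Native=={native} && TRILL=={trill}")
```

Under these assumptions, the script reports exactly the two combinations listed above, Native==ECT(0) && TRILL==ECT(1) and Native==CE && TRILL==ECT(0), as cells that fall outside states A through C yet carry no asterisk.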

---------------------------------------------------------------------------

I also have a small handful of editorial suggestions and nits to propose.

Please expand "TRILL" upon first use and in the title; see
https://www.rfc-editor.org/materials/abbrev.expansion.txt for guidance.

---------------------------------------------------------------------------
§1:

>  In [RFC3168] it was recognized that tunnels and lower layer protocols

"In [RFC3168], it was..."
(insert comma)

---------------------------------------------------------------------------

§2:

>  These fields are show in Figure 2 as "ECN" and "CCE". The TRILL-ECN

"...are shown..."


>  The CRItE bit is the critical Ingress-to-Egress summary
>  bit and will be one if and only if any of the bits in the CItE range
>  (21-26) is one or there is a critical feature invoked in some further

"...if any of the bits... are one or..."
(replace "is" with "are")



>   The first three have the same meaning as the corresponding ECN field
>   codepoints in the IPv4 or IPv6 header as defined in [RFC3168].

Section 1.1 defines "IP" to mean both IPv4 and IPv6. It would seem cleaner and
easier to read if the document were to leverage that definition here.


>   However codepoint 0b11 is called Non-Critical Congestion Experienced

"However, codepoint..."
(insert comma)

---------------------------------------------------------------------------

§3.3.2:

>  If an RBridge supports ECN, for the two cases of an IP and a non-IPR

"...non-IP"

---------------------------------------------------------------------------

§4:

>  Section 3 specifies interworking between TRILL and the original
>  standardized form of ECN in IP [RFC3168].

Please indicate this at the top of Section 3. When I was puzzling over Table 3,
I spent some time trying to figure out whether the behavior I describe in my
DISCUSS above was due to behavior described in RFC 8311 or the experiments it
contemplates.

---------------------------------------------------------------------------

Appendix A:

>  o  the meaning of CE markings applied by an L4S queue is not the same
>     as the meaning of a drop by a "Classic" queue (contrary to the
>     original requirement for ECN [RFC3168]).

I think, when citing this exception, it makes much more sense to point to RFC
8311 (where the exception to RFC 3168's requirement is defined) than to RFC 3168
in a vacuum.

>     Instead the likelihood

Insert a comma after "Instead".

>     that the Classic queue drops packets is defined as the square of
>     the likelihood that the L4S queue marks packets (e.g. when there

Insert a comma after "e.g."