Shepherd writeup
draft-ietf-tsvwg-ecn-l4s-id

As required by RFC 4858, this is the current template for the Document
Shepherd Write-Up. Changes are expected over time.

This version is dated 1 November 2019.

(1) What type of RFC is being requested (BCP, Proposed Standard, Internet
Standard, Informational, Experimental, or Historic)? Why is this the proper
type of RFC? Is this type of RFC indicated in the title page header?

Experimental.  This is the proper type of RFC, and is indicated in the title
page header.  There is an active area of research in transports that meet the
requirements and use the L4S identifier described in this document.

(2) The IESG approval announcement includes a Document Announcement Write-Up.
Please provide such a Document Announcement Write-Up. Recent examples can be
found in the "Action" announcements for approved documents. The approval
announcement contains the following sections:

Technical Summary: (from abstract)

   This specification defines the protocol to be used for a new network
   service called low latency, low loss and scalable throughput (L4S).
   L4S uses an Explicit Congestion Notification (ECN) scheme at the IP
   layer that is similar to the original (or 'Classic') ECN approach,
   except as specified within.  L4S uses 'scalable' congestion control,
   which induces much more frequent control signals from the network and
   it responds to them with much more fine-grained adjustments, so that
   very low (typically sub-millisecond on average) and consistently low
   queuing delay becomes possible for L4S traffic without compromising
   link utilization.  Thus even capacity-seeking (TCP-like) traffic can
   have high bandwidth and very low delay at the same time, even during
   periods of high traffic load.

   The L4S identifier defined in this document distinguishes L4S from
   'Classic' (e.g. TCP-Reno-friendly) traffic.  It gives an incremental
   migration path so that suitably modified network bottlenecks can
   distinguish and isolate existing traffic that still follows the
   Classic behaviour, to prevent it degrading the low queuing delay and
   low loss of L4S traffic.  This specification defines the rules that
   L4S transports and network elements need to follow with the intention
   that L4S flows neither harm each other's performance nor that of
   Classic traffic.  Examples of new active queue management (AQM)
   marking algorithms and examples of new transports (whether TCP-like
   or real-time) are specified separately.
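
As a rough illustration of the identifier described in the summary above, the
sketch below classifies a packet by the two-bit ECN field of its IP header.
The codepoint values are those of RFC 3168.  Treating both ECT(1) and CE as
L4S is an assumption based on the default classifier in the L4S identifier
draft, not on anything stated in this write-up, and the names Ecn, ecn_field
and classify are hypothetical.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """ECN codepoints: the two low-order bits of the IPv4 TOS / IPv6
    Traffic Class octet, as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def ecn_field(tos_byte: int) -> Ecn:
    """Extract the ECN field; the upper six bits (the DSCP) are ignored."""
    return Ecn(tos_byte & 0b11)


def classify(tos_byte: int) -> str:
    """Assumed default classifier: ECT(1) and CE are treated as L4S,
    Not-ECT and ECT(0) as Classic."""
    if ecn_field(tos_byte) in (Ecn.ECT1, Ecn.CE):
        return "l4s"
    return "classic"
```

A CE mark by itself does not reveal whether the sender originally set ECT(0)
or ECT(1), which is one source of the ambiguity concerns discussed under
question 9.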

Working Group Summary:

The working group largely supports and is enthusiastic about L4S technology,
of which this document is a part.  It was last called along with two other L4S
documents.  A wider than normal level of support has been expressed for this
work, but it is not unanimous.  Some WG participants have concerns about L4S
or prefer an alternative approach.  The working group had many long email
threads and conducted several interim meetings focused on L4S, its goals,
technical challenges, and concerns with the approach and its potential impact
on Internet safety.  Full agreement was not obtained in all cases, but
significant work was done to address the issues that were agreed upon, and
each concern was considered in detail by the working group, even if some
resolutions were not unanimously agreed with.

Document Quality:

There are multiple existing implementations of the technology described, and
around 25 vendors and operators have indicated intentions to experiment with
the overall L4S technology.

Personnel:

Wesley Eddy is the document shepherd, and Martin Duke is the responsible AD.

(3) Briefly describe the review of this document that was performed by the
Document Shepherd. If this version of the document is not ready for
publication, please explain why the document is being forwarded to the IESG.

I have reviewed the document myself, and believe it is ready for publication. 
In addition, both other TSVWG co-chairs have closely reviewed and commented on
earlier revisions of the document.

(4) Does the document Shepherd have any concerns about the depth or breadth of
the reviews that have been performed?

No concerns.

(5) Do portions of the document need review from a particular or from broader
perspective, e.g., security, operational complexity, AAA, DNS, DHCP, XML, or
internationalization? If so, describe the review that took place.

Since the document deals with transport protocols, congestion control, packet
marking, and queuing behavior, there can be additional Security, Operations,
Internet, and Routing area interests.  There has been a healthy mix of
academic, industry/vendor, and operator participation in this work, and a
broad range of reviews has been incorporated.

(6) Describe any specific concerns or issues that the Document Shepherd has
with this document that the Responsible Area Director and/or the IESG should be
aware of? For example, perhaps he or she is uncomfortable with certain parts of
the document, or has concerns whether there really is a need for it. In any
event, if the WG has discussed those issues and has indicated that it still
wishes to advance the document, detail those concerns here.

There are no personal concerns of my own.  However, there have been concerns
raised in the working group, discussed in question 9 below.

(7) Has each author confirmed that any and all appropriate IPR disclosures
required for full conformance with the provisions of BCP 78 and BCP 79 have
already been filed. If not, explain why?

Yes.

(8) Has an IPR disclosure been filed that references this document? If so,
summarize any WG discussion and conclusion regarding the IPR disclosures.

No.

(9) How solid is the WG consensus behind this document? Does it represent the
strong concurrence of a few individuals, with others being silent, or does the
WG as a whole understand and agree with it?

There was a combined last call for this and the other two main L4S documents.
During that last call, it was clear that while a majority strongly supports the
work and is eager to experiment with the technology, a number of participants
still have concerns and/or prefer alternatives to L4S.  The working group
devoted considerable time, including multiple focused interim meetings, to
understanding objections and trying to address or mitigate all concerns.  It is
clear that while there are still some technical disagreements, there is a
great deal of support, wider than normal, for going forward with this work,
spanning industry, including vendors, network operators, and congestion
control researchers.

Among the three documents, this ECN one is where the specific usage of ECT(1),
which is the most contentious point, is discussed.

Regarding this ECN document specifically, a number of specific concerns
resulted in only a "rough" consensus and not full agreement:

- Since the beginning of this work, it was well understood that enhancing ECN
  involves tradeoffs in the choice of header encodings, since there are
  limited options in terms of the header bits, codepoints, and other fields
  like DSCP that can be leveraged.  The working group went through a process
  of assessing the pros and cons of all of the options that were put on the
  table.  The different weights that participants attach to the importance of
  these tradeoffs seem to be a major source of the inability to unanimously
  agree on the particular L4S usage of ECT(1) (and CE).  It was remarked in
  one WG meeting that we ideally would have 5 codepoints, but unfortunately
  need to live with 4 and engineer within that.  As a possible supplement for
  the limited ECN codepoints, the working group considered the pros and cons
  of different means of using DSCP values, multiple times over the life of
  these documents.  Some participants preferred approaches relying on DSCP,
  based on their view of the tradeoffs.
- Some WG participants do not like the ambiguity of meaning possible at
  different points in the network for CE and ECT(1) that this draft introduces
  when L4S is deployed alongside classic ECN queues or hosts.  This
  coexistence has been considered at length by the WG and accepted along with
  several mitigating factors that have been introduced.  In particular, there
  is a possibility of negatively impacting non-L4S flows when a classic
  bottleneck is shared and L4S responds to the CE marks differently.
   - There is not full agreement on whether L4S with this use of ECN
     codepoints is safe for experimentation on either parts of the Internet or
     the Internet as a whole.  The WG has a separate draft-ietf-tsvwg-l4s-ops
     document intended to help describe potential problems and how to avoid
     them when enabling L4S in networks.  Related to this, when the WG was
     polled in the May 2021 interim about suitability for experimentation in
     parts of the Internet, about 20 people agreed (over half of the ~35
     attendees), and about a half dozen disagreed.  Minutes of this meeting
     are at:
     https://datatracker.ietf.org/doc/minutes-interim-2021-tsvwg-01-202105101100/.
   - The matter of safety was discussed in detail in WG meetings and mailing
     list threads.  Specific work on classic bottleneck detection was
     performed and documented, and the operator guidelines I-D added further
     considerations on the presence of classic bottlenecks.  The implementers
     and operators discussing deployments seem to understand the concern and
     are comfortable with the situation and tools available.
- A few WG participants favor making this Standards Track rather than
  Experimental, because of the potential for classic incompatibility, and want
  to resolve that incompatibility by obsoleting RFC 3168 at the present time.
  There was not wide support for this, and it is felt that the WG is not yet
  ready to pursue this path of deprecating RFC 3168.  The WG may discuss
  deprecating RFC 3168 in the future, in parallel with L4S experimentation,
  but not as a prerequisite to L4S experimentation.
- One question that came up in the WG process was how this particular L4S
  identifier approach relates to the advice in BCP 124 (RFC 4774).  After WG
  discussions, and work between the chairs and editors, new text describing
  this was included in the document.  Although the WG feels it is safe and the
  most desirable available approach, the point remains that this does not have
  pure conformance with the BCP 124 (RFC 4774) advice as-written at that time
  and directly interpreted now.
   - Several working group participants reject the design based on this, and
     have agreed with this logic described to the working group in detail:
     https://mailarchive.ietf.org/arch/msg/tsvwg/BULldNtilkiChD7rPKDyEdssFlw/
   - Related to this, several also would prefer that, in conjunction with the
     required classic bottleneck detection, realtime fallback be mandatory for
     transports to implement and use.
      - Some believe that realtime fallback (rather than simply offline means)
        is needed for safety, noting that they don't think it is clear how a
        user would be notified of a safety problem or who they would contact
        in the event that one occurred.
      - On this topic specifically, one person said on the mailing list: "And
        rather than going in circles again, as I said originally I propose to
        agree to disagree on the one remaining point (re: SHOULD NOT be sent
        across classic queues, MUST NOT be so sent repeatedly and
        persistently).  I don’t think there’s enough evidence to claim that
        essentially all current and future deployed marking queues use a
        sufficiently good fq hash to allow for disregarding of RFC 4774's
        model for new ECN semantics, nor that the problem-detection approach
        is robust enough (not even sure it’s coherent enough) to rely on as
        the basis for the key normative requirements for a safe operational
        congestion response."
      - Reasons why the WG was satisfied with a 'SHOULD' rather than 'MUST'
        for realtime fallback are explained in Section 4.3.1 and Appendix
        A.1.5 of this document.
- A potential vector for DoS of tunneled traffic was described, based on
  marking the traffic selectively, and causing inadequately scaled replay
  windows (as in IPsec) to then be violated as some packets receive low
  latency and others do not.  This condition was recognized in the WG as able
  to affect regular traffic without any intent of an attack.  It was addressed
  and is explicitly discussed in Section 6.2; however, some who raised the
  concern do not agree that it has been fully settled.

(10) Has anyone threatened an appeal or otherwise indicated extreme discontent?
If so, please summarise the areas of conflict in separate email messages to the
Responsible Area Director. (It should be in a separate email because this
questionnaire is publicly available.)

Some participants may feel strongly enough against L4S and this particular
usage of ECN to consider an appeal.

(11) Identify any ID nits the Document Shepherd has found in this document.
(See http://www.ietf.org/tools/idnits/ and the Internet-Drafts Checklist).
Boilerplate checks are not enough; this check needs to be thorough.

There are only small nits that would be easy to handle after AD review.  There
are outdated references to the other L4S and NQB documents, and warnings about
non-ASCII characters.

(12) Describe how the document meets any required formal review criteria, such
as the MIB Doctor, YANG Doctor, media type, and URI type reviews.

N/A.

(13) Have all references within this document been identified as either
normative or informative?

Yes.

(14) Are there normative references to documents that are not ready for
advancement or are otherwise in an unclear state? If such normative references
exist, what is the plan for their completion?

No.

(15) Are there downward normative references (see RFC 3967)? If so, list these
downward references to support the Area Director in the Last Call procedure.

No.

(16) Will publication of this document change the status of any existing RFCs?
Are those RFCs listed on the title page header, listed in the abstract, and
discussed in the introduction? If the RFCs are not listed in the Abstract and
Introduction, explain why, and point to the part of the document where the
relationship of this document to the other RFCs is discussed. If this
information is not in the document, explain why the WG considers it unnecessary.

No.  During work on this document, it came up that BCP 124 (RFC 4774) may need
to be updated in the future, but this particular document does not do that.

(17) Describe the Document Shepherd's review of the IANA considerations
section, especially with regard to its consistency with the body of the
document. Confirm that all protocol extensions that the document makes are
associated with the appropriate reservations in IANA registries. Confirm that
any referenced IANA registries have been clearly identified. Confirm that newly
created IANA registries include a detailed specification of the initial
contents for the registry, that allocations procedures for future registrations
are defined, and a reasonable name for the new registry has been suggested (see
RFC 8126).

The IANA considerations clearly describe one update to an existing registry,
that is specifically identified and fully described.  There are no new
registries.

(18) List any new IANA registries that require Expert Review for future
allocations. Provide any public guidance that the IESG would find useful in
selecting the IANA Experts for these new registries.

No new registries.

(19) Describe reviews and automated checks performed by the Document Shepherd
to validate sections of the document written in a formal language, such as XML
code, BNF rules, MIB definitions, YANG modules, etc.

N/A - no formal languages are used.

(20) If the document contains a YANG module, has the module been checked with
any of the recommended validation tools
(https://trac.ietf.org/trac/ops/wiki/yang-review-tools) for syntax and
formatting validation? If there are any resulting errors or warnings, what is
the justification for not fixing them at this time? Does the YANG module comply
with the Network Management Datastore Architecture (NMDA) as specified in
RFC8342?

N/A.