
Minutes interim-2021-tsvwg-01: Mon 11:00
minutes-interim-2021-tsvwg-01-202105101100-00

Meeting Minutes Transport and Services Working Group (tsvwg) WG
Date and time 2021-05-10 15:00
Title Minutes interim-2021-tsvwg-01: Mon 11:00
State Active
Last updated 2021-05-20

*TSVWG - Virtual Interim Meeting
*May 10, 2021

*Attendees:
    Wes Eddy (MTI Systems) - Chair
    Gorry Fairhurst (University of Aberdeen) - Chair
    David Black (Dell EMC) - Chair
    Rodney W. Grimes (rgrimes@freebsd.org, netDEF/SCE)
    Pete Heist (SCE)
    Jonathan Morton (SCE)
    Ilpo Järvinen
    Greg White (CableLabs)
    Martin Duke (F5)
    Ingemar Johansson (Ericsson AB)
    Steve Blake
    Huichen Dai (Huawei)
    Mirja Kühlewind (Ericsson)
    Michael Scharf (HS Esslingen) - partly
    Ermin Sakic (NVIDIA)
    Stuart Cheshire (Apple)
    Vidhi Goel (Apple)
    Alex Burr (Kano Computing)
    Mohit P. Tahiliani (NITK Surathkal)
    Koen De Schepper (Nokia)
    David Millman
    Spencer Dawkins (Tencent)
    Michael Tüxen (Münster University of Applied Sciences)
    Sebastian Moeller
    Philip Anderson (Charter Communications)
    Bob Briscoe (Independent)
    Jason Livingood (Comcast)
    Asad Sajjad Ahmed
    Neal Cardwell (Google)
    Anna Brunstrom (Karlstad University)
    Jake Holland
    chi-jiun su (Hughes Network systems)

*Minutes:
*-- 1. Introduction & IETF Note Well - chairs (5 minutes)
Note Well and chairs slides shown (see slides)

*-- 2. Transport requirements from: draft-ietf-tsvwg-ecn-l4s-id
     - Status update from Bob Briscoe & Koen De Schepper et al (15 minutes)

    Use of SHOULD vs. MUST for reducing RTT bias will be worked out via email.
    The slide's summary of the secure VPN reordering concern was disputed in
    the meeting and will be taken to the list.  The result will need to be
    added to the security considerations.

    Ingemar:  I believe that the possible replay issue is not only an L4S
    problem. For that purpose I believe that this deserves a wider discussion
    (not just TSVWG) to get an understanding of under which conditions packet
    reordering becomes a problem (if they exist). This is however not something
    that will or should be handled now and I don't see it as something that
    should affect the L4S RFC timeline

    David: @Ingemar: There are other causes of reordering, diffserv in
    particular. Diffserv-caused reordering has been addressed in the relevant
    RFCs, e.g., for IPsec.  In contrast, L4S is new at this juncture.  With
    luck this & Ingemar's comment will get picked up in list discussion.

     Sebastian: L4S adds a whole new re-ordering mechanism, ECT(1) over ECT(0)
     even for packets with identical DSCP. The effect is that C-queue traffic
     gets easily starved/suppressed by ECT(1) traffic (if the ECT(1) packets
     are hoisted early enough to move the replay-window so much, that the
     NotECT/ECT(0) packets that were delayed in the C-queue arrive with a
     replay sequence number below the replay window's lower end). Conceptually
     that is similar to re-ordering from different latency paths through DSCPs,
     but DSCPs are rarely end-to-end and the ECN bits so far have not mattered
     for this kind of re-ordering.
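
     The replay-window effect described above can be illustrated with a
     minimal sketch. It assumes a simplified sliding anti-replay window in the
     general style used by IPsec; the class, function and parameter names
     (AntiReplayWindow, simulate, classic_delay) are hypothetical, and the
     alternating ECT(1)/ECT(0) traffic pattern is an illustration only, not a
     model of any measured deployment.

```python
class AntiReplayWindow:
    """Simplified sliding anti-replay window (illustrative, not a real IPsec stack)."""

    def __init__(self, size=64):
        self.size = size      # window width, in sequence numbers
        self.highest = 0      # highest sequence number accepted so far
        self.seen = set()     # accepted sequence numbers still inside the window

    def accept(self, seq):
        """Return True if seq is accepted, False if it is too old or a duplicate."""
        lower = self.highest - self.size + 1  # lowest sequence number still in window
        if seq < lower or seq in self.seen:
            return False
        if seq > self.highest:
            # A newer packet slides the window forward and forgets old entries.
            self.highest = seq
            lower = self.highest - self.size + 1
            self.seen = {s for s in self.seen if s >= lower}
        self.seen.add(seq)
        return True


def simulate(num_packets, classic_delay, window_size):
    """Count packets dropped by the receiver's anti-replay check.

    Assumption: odd sequence numbers stand for ECT(1) packets that pass
    straight through the low-latency queue, even ones for Not-ECT/ECT(0)
    packets held back by `classic_delay` packet slots in the classic queue.
    """
    arrivals = []
    for seq in range(1, num_packets + 1):
        is_ect1 = seq % 2 == 1
        arrival_time = seq if is_ect1 else seq + classic_delay
        arrivals.append((arrival_time, seq))
    arrivals.sort()  # deliver in arrival order, not sending order

    window = AntiReplayWindow(window_size)
    return sum(1 for _, seq in arrivals if not window.accept(seq))


if __name__ == "__main__":
    for delay in (0, 10, 100):
        print(delay, simulate(num_packets=400, classic_delay=delay, window_size=64))
```

     In this sketch no packets are dropped while the classic-queue delay stays
     well inside the window, but once the ECT(1) packets run further ahead than
     the window width, the delayed Not-ECT/ECT(0) packets arrive below the
     window's lower end and are discarded.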

     - Open discussion (40 minutes)

     Jonathan: Draft should provide a reference algorithm to implement
     monitoring. Reference code that is available is not reliable under lab
     conditions.  Algorithm not documented in an IETF draft.
     Koen: Reference code has parameters that need to be tuned.  Anticipates
     deployment-specific tuning.
     David: Ought to add discussion of algorithm and tuning to draft (Bob
     suggests: possibly in Appendix).
     Bob: Only affects long-running flows.
     Jonathan: Dispute that.
     Wes (chair): Would like to hear from implementers on this topic.
     Bob: Whitepaper contains an out-of-band detection algorithm for RFC 3168
     AQMs, could add to draft.

     David & Bob: There is a problem in L4S reordering interaction with
     anti-replay in secure VPNs, will take discussion to list. (Sebastian's
     above comment on re-ordering is related to this).

     Jonathan: Experiment success criteria are deployment-centric.  Need to
     look at safety, particularly with respect to RFC 4774 Option 2 (check that
     routers understand new ECN semantics) vs. Option 3 (new ECN semantics
     coexist well [friendly] with competing traffic).
     David: L4S was originally designed for RFC 4774 Option 3 - whether it has
     met the criteria to use that option is an open issue for the WG to
     discuss.  The RFC 4774 options are in Section 4:
         https://datatracker.ietf.org/doc/html/rfc4774#section-4
     Sebastian: The success measure for an AQM needs to be active use and NOT
     simply passive deployment.

     Jake: Asks about expected timeliness and responsibility for response to
     monitoring-detected problems.
     Bob: Recommendation is for real-time monitoring, relies upon absence (or
     near absence) of false negatives.

     Pete: Is congestion-control interaction of L4S/non-L4S flows in a shared
     RFC 3168 queue similar/analogous to interaction of DCTCP/non-DCTCP flows?
     Ingemar: SCReAM is driven by video encoders; network queues will often be
     empty because there is not an always-present backlog of data to send.
     Bob: Not sure why question is being asked, DCTCP does not meet L4S "Prague
     requirements".
     Pete: Reason for question was whether DCTCP restrictions settle
     coexistence question.
     Bob: L4S "Prague requirements" have improved on DCTCP, does not consider
     DCTCP to have settled RFC 3168 coexistence question.  Prague *in L4S mode*
     is not expected to coexist well.  More discussion to come on list.
     Koen: Sees role for both L4S and non-L4S services in future of Internet.

*-- 3. Safe Internet-wide experimentation: draft-white-tsvwg-l4s-ops (or newer
WG version)
    - Status update from Greg White et al (15 minutes)

    Sebastian: Recent paper with 5% use of ECN seems fishy, but 0.3% use on
    HTTP/HTTPS traffic agrees with Akamai results reported on slide.

    David: DSCP material ought to be added.
    Greg: Needs further discussion; couldn't figure out what to do from the
    (confusing) list discussion.
    David: Will send note to list on network-only use of DSCPs, without
    endpoint reaction to received DSCPs.

    - Open discussion (40 minutes)

    Stuart: Would like to see latency improved (has been ~0.5 sec for too long,
    RFC 3168 not widely deployed). Need a selector for L4S treatment, end
    devices want best behavior at bottlenecks, independent of whether they're
    RFC 3168 vs. L4S.  Interested in whether DSCP marking will provide a
    feasible path forward as alternative to ECT(1).  Seeing increasing areas of
    traffic that want both bandwidth and low-latency, e.g., video streaming.

    Greg: Hope to see widespread deployment of L4S, improve classic ECN
    deployed systems over time.
    Stuart: fq_codel deployed, deployments increasing, classic ECN will be
    with us for at least a decade.  Just by moving mobile phone around house,
    network bottleneck may shift between cable modem infrastructure and home
    WiFi AP, if latter has fq_codel/classic ECN, it's unlikely to be upgraded.

    Spencer: I strongly agree with Stuart that the idea that "some
    applications want high bandwidth and others want low latency" is not a
    useful strategy long-term. In discussions about applications that want to
    use multiple connections in the QUIC working group, I am seeing more and
    more people saying that they really care about close control of latency.
    (https://datatracker.ietf.org/doc/draft-dawkins-quic-what-to-do-with-multipath/
    and
    https://datatracker.ietf.org/doc/draft-dawkins-quic-multipath-questions/)
    Especially pleased with discussion about possible DSCP guard usage that may
    move this work forward soon.

    Jake: 0.3% of Internet users seems small, but that's millions of people.
    Need to consider reactions to breakage that occurs.
    Greg: Looking to future where RFC 3168 support is L4S-aware.

    Bob: RFC 3168 only causes problems when multiple flows are in same queue. 
    Not convinced that short flows are an important part of the problem - long
    flows may be appropriate focus.  Will be important for DSCP discussion to
    distinguish DSCP usage as 1) traffic marking, 2) classifier and 3) part of
    transport protocol behavior.  (David: Agrees)

    Koen: Has not seen a bulletproof DSCP solution.  Existing usage of DSCPs
    constrains what's possible.
    Stuart: If network treats ECT(1) as Not-ECT, that's bad, provides
    incentive for app developers to not use ECT(1) because result could be
    worse than ECT(1) [RFC 3168]. (Bob: agrees that this would be bad).

    Jonathan: fq_codel (in RFC 8290) provides improvement over prior
    technology, should be baseline for judging utility of L4S improvements.

    Ingemar (from chat): I have over the years tried to push ECN support into
    LTE but so far it has not materialized, the main reason is that it did not
    give a large enough delta improvement when I tried (~5 years ago). The
    situation is a bit different now with the emerging interest in XR/cloud
    gaming/remote control so ECN may be easier to push. There are however a few
    aspects related to 5G access that make L4S more appealing. One important
    one is fast fading, which is a natural part of cellular access. The high marking
    intensity of L4S makes it possible for an interactive application to react
    promptly and reach a working point that gives sufficient headroom for the
    fading dips. It has been hard to reach a similarly good balance with classic
    ECN, mainly because of the more sparse marking.
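
    The contrast between sparse and frequent marking can be sketched as
    follows. This is an illustration only: it assumes a classic RFC 3168-style
    sender that halves its window on any mark in a round trip, and a
    DCTCP-style sender (RFC 8257) that scales its reduction by a smoothed
    marking fraction. The function names are hypothetical, and the sketch is
    not the L4S "Prague requirements" nor SCReAM.

```python
def classic_ecn_response(cwnd, marked_packets):
    """RFC 3168-style reaction: any mark in a round trip halves the window."""
    return cwnd / 2 if marked_packets > 0 else cwnd


def scalable_response(cwnd, alpha, marked_packets, total_packets, gain=1 / 16):
    """DCTCP-style reaction (assumed here as a stand-in for a scalable control).

    alpha is a moving average of the fraction of marked packets; the window
    is reduced in proportion to it, so a few marks cause a small reduction
    and heavy marking approaches a halving.
    """
    fraction = marked_packets / total_packets
    alpha = (1 - gain) * alpha + gain * fraction
    if marked_packets > 0:
        cwnd = cwnd * (1 - alpha / 2)
    return cwnd, alpha


if __name__ == "__main__":
    # Two marks out of 100 packets in one round trip, starting from alpha = 0.
    print(classic_ecn_response(100.0, 2))
    print(scalable_response(100.0, 0.0, 2, 100))
```

    With frequent marks, each round trip carries fine-grained information
    about the marking level, so the sender can make many small adjustments
    rather than occasional large ones.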

*-- 4. Wrap-Up - chairs (5 minutes)

Wes (chair): Does L4S operations draft contain sufficient info to run L4S
experiment? Specific question asked:

Does the group agree that with these guidelines available, that L4S will be
suitable for experimentation in parts of the Internet?

Over half of attendees (close to 20 of about 35) agree.
A smaller number, about half a dozen attendees, do not agree.
A number of people did not express an opinion.