Minutes interim-2020-tsvwg-03: Mon 10:00

Meeting Minutes Transport Area Working Group (tsvwg) WG
State Active
Last updated 2020-04-28

Meeting Minutes

   TSVWG Interim Meeting, Monday Apr-27-2020
0700-0930 UTC-7 (San Francisco) = 1000-1230 UTC-4 (Boston)
1500-1730 UTC+1 (London) = 1600-1830 UTC+2 (Berlin)
2200-0030[+1] UTC+8 (Beijing - not on summer time, other 4 are)
WG Chairs: Gorry Fairhurst (remote), David Black (remote), Wes Eddy (remote)

Meeting started 0708, Note Well shown

TSVWG Interim 2: ECT(1) - 3rd interim of 2020.

1. Agenda - as advertised.

2. Chairs Update:
        Note Well

3. L4S and ECN
  Overview of slides
    Stuart comment 0724: agrees with the chairs that input vs. output is a
    useful decision to make

  - Greg discussion of L4S slide 5
  - Jonathan discussion of SCE slide 6

  - Wes moved on to comparison overview
  - slide clarification
    (some unintended slide modifications happened)
  - Greg covering at 0746, discussing L4S deployment plans
    - 2.5 minutes (slide 13), then Jonathan on SCE deployment (slide 14)
    - Stuart clarifying comments about Apple interest
    - Jake clarifying comments about Akamai interest
  - Wes moving on (if no consensus)
  - several comments from Dave Taht: please consider asymmetric paths, and more
  testing is needed before decisions.
  - Spencer question 0758: Do we have recent information about 3GPP intentions
  to adopt L4S? That was the case when I was talking with the 3GPP liaison to
  IETF three or four years ago, but I haven't heard anything since, and 3GPP
  does remove content from releases when it's not available to be included.
  That's a good question to ask them now.
  - Answer from Kevin Smith (Vodafone) saying it's strongly supported by them,
  and has a change request with Ericsson in 3GPP
  - Answer from Per Willars (Ericsson) 0801: L4S input is working as-is, matches
  with carrier signal well, some discussion has been postponed for a future
  release (but L4S can be deployed today from a 3GPP point of view.)

4. Discussion of ECT(1) - Discussion of slides

- Jana:
    - Thanks, this was a good summary.
    - The most important point is that low latency requires multiple queues in
    network.  An explicit classifier is preferred, not tractable to detect a
    lower latency intent in the network device
    - I prefer no further AQM innovation.  timescales for queue devices are
    intractable, vs. endpoints.
    - I like maturity.  L4S is more mature, has had a lot of time in IETF and
    much engagement.  L4S preferred.
- Stuart Cheshire (speaking for Apple with caveats):
    - ULL is critical.
    - Anecdote: 300kbps DSL outperforming gigabit link as perceived by voice
    connection partners, latency is important.
    - Stuart is fully behind L4S.
- Andrew Macgregor:
    - +1 to L4S
    - +1 Jana and Stuart.  L4S wins us more and trying it teaches us more.
    It's the bold thing to do and we should do it.
    - SCE is feasible, but L4S is feasible on deployed hardware.

- Uma: Are there MPLS queuing implications?
- Bob: There are specs. This is not necessarily impossible, but maybe it
depends on the deployment scenario.
- Uma: MPLS important in 3GPP, how will the L4S signaling be implemented?
- Chairs: This is a topic we can hear more of in a later TSVWG meeting as an
ID or possible presentation.

- Bob, answering multiple questions:
    - re Jana multiple queues: 2 queues are better than multiple queues.  FQ
    can get 8ms, dualq can get lower.
    - re Dave Taht's points: we've done more than congestion avoidance tests,
    disagree with criticism; we will add asymmetric testing, good point there.
- Mirja:
    - I want to underline the point about strong incentives.  The whole point
    is not about being fair.
    - If we pick something up that has only a minor benefit, we are missing an
    important opportunity!
- Dave Taht:
    - musicians are a good point.  Conflict with greedy traffic is the issue.
    - FQ solves the problem very well, non-greedy traffic does not experience
    latency on the link.
    - We have got end-to-end collaborative music over Internet using diffserv
- Martin Duke (Individual, no hat):
    - SCE is good work and would be a significant improvement if adopted, but
    deployment concerns may be a problem.
    - L4S is transformative and we should go big if possible.
    - My only concern is whether it is safe or not.  ask the WG to consider it,
    hard to assess but encourage consideration here, many tests run on L4S
    - unfairness issues in 3168 queues are complicated, but working to address
    them.  limits of scoping testing and limits of queueing testing are hard
    - personally comfortable with the tradeoffs
- Jonathan:
    - This is a big "if": our tests demonstrate this is currently not safe
    - detection of RFC 3168 AQMs is not reliable yet, we doubt it can be made so
    - SCE offers 2 levels of service:
        - one designed for general internet with long paths targeting 2.5ms
        - shaped to 1 Mbps to make this call, works well because normal AQM
    - the problem that Stuart describes is a cable network not using AQM at
    all.  Any normal AQM solves the problem.
- Roland:
    - The debate is not SCE vs. L4S, the decision is about ECT(1) semantics.
    Using a DSCP as a classifier and another as signal is a discussion he'd
    like to see.
- Pete:
    - I want to add to Jonathan's testing: we found false negatives in the
    RFC 3168 detection in L4S.  There are also false positives: 2ms of jitter
    can cause under-utilization
    - detection needs to work before deployment.  not clear if it can
    - question: Is low latency <1ms achievable for Internet bursty traffic?
- Stuart: I have a question of clarification: is it not possible or not useful
for 2.5ms->1ms?
- Pete: This is a tradeoff; under-utilization is an issue and needs
consideration.
- Stuart: This is a good argument for segregating the traffic.
- Pete: The goal is both will offer high throughput.
- Stuart: That's why it needs input differentiation.
- Jake:
    - Please don't break upgrading of RFC 3168.  This is the big problem.
- Jana:
    - I am not holding my breath on RFC 3168
    - The point is we must have at least 2 queues to get low latency
    - implicit or explicit is a question, implicit is
    - traffic mix: many connections are small, but 80% of bytes is long flows.
- Martin:
    - We need to consider newer CCs, not just sawtooth.  A real deployment
    needs to consider this.  Yes, safety is important, but we can discuss going
    forward.  The deployability is important to consider here.
    - RTT fairness is a silly discussion now.  We need to move forward with a
    deployable solution.  Safety can take a hit on legacy traffic.
- Bob:
    - In answer to Dave Taht: L4S does not assume all traffic is greedy, it
    aims to allow greedy traffic to have low latency
    - on testing of fallback: not yet happy with it, not complete yet, took
    longer than expected to get a visualization but now that we have it, more
    algorithms can be tested more quickly than current status.
    - In answer to Pete & Jonathan: About false positives - you think it's a
    classic queue but it isn't and you continue to behave aggressively.  These
    occurred in our 4k experiments only when no classic traffic was present,
    with 1 exception so far
    - Regarding false negatives (stop L4S even though not classic): this is ok,
    it just drops to 50/50 vs. cubic, and is fine in terms of fairness, though
    it underutilizes link
    - In reply to Jonathan: Though SCE is more safe on fairness issues, that's
    not the only possible safety issue.  In L4S there's ambiguity on CE, but in
    SCE there's ambiguity on ECT(0).  Then we need to consider that a firewall
    that blackholes ECT(1) packets would cause unsafety even for
    non-participating codepoints.  There are more safety problems
- David:
    - The discussion so far is mostly about edge networks.  Consider
    datacenters, also important for Internet traffic.  datacenter incast is a
    big problem, DCTCP is a good solution, but it is slow to deploy because of
    risks of getting into same queue as conventional traffic.  If we use 2
    codepoints to distinguish, this is very attractive, making it easy to
    deploy in a DC
    - to the extent that the problems vary according to network type, this is
    an area diffserv is intended to solve.
- Aidan (Mellanox), regarding the datacenter case:
    - This can be a controlled environment: useful to distinguish with explicit
- Stuart: Do you mean different CCs or different queuing?
- Aidan:
    - Differentiated in DCs by priority to avoid getting confused using DSCP.
    People are doing it today with high speed transports.  There are only 2
    bits in the ECN field, a better CC implies lower latency.  Using the ECN
    bits as output would enable low latency traffic in datacenters by enabling
    CC algorithms that would react faster to congestion state, and would be
    significantly important for datacenters.
    - L4S does not appear to enable something better than DSCP enables today
    for datacenters.
- Ingemar:
    - Jitter in WiFi is also an issue, but we shouldn't use today's snapshot
    for tomorrow's technology!
    - In 3GPP, we have solved this with timeslots and there's room to fix
    jitter in other ways.
    - About the non-greedy case: We are looking at L4S for non-greedy traffic
    primarily.  The upper limit of bitrate is a common use case.  It's not
    about only greedy traffic in this

- Nikki Pantelias (Broadcom):
     Where we sit is usually neither endpoint, nor datacenter switch.
     Our implementation concerns are:
        - not having to inspect layer 4 header
        - being compatible with ACK-thinning
        - being able to do this with a dualq implementation
        - FQ is not practical for us. We favor ECT(1) as input for a classifier.
- Ron Raganathan (CommScope?) cable modem/headend equip:
    - latency much more critical now, we're looking at it
    - same space as Nikki
    - AQM is doing its thing, very powerful tool
    - challenge is when there's a lot of bursty traffic from different flows on
    different pipes
    - consistent latency under 10ms is hard when bursty traffic arrives
    - with L4S we so far see usually <8ms
    - our experiments not finding impact on classic traffic
    - support L4S
- Greg:
    - these bits are precious; L4S looks to reclaim ECT(0) at some point, but
    SCE requires using all the codepoints forever.
- Lars:
    - no matter what we do we need to make a decision now, we've waited too
    long already.  don't want more discussion, need to decide something now.
- Stuart:
    - to "no point lowering latency, what about wifi": disagree, this is
    worthwhile (1ms vs. 2.5ms i think?)
    - to "video is inelastic": video will scale to fit capacity
    - to "some traffic will cheat": there's no benefit to cheating in L4S.
- Luca:
    - (first point missed, please check video)
    - semantic disagreement with Stuart over inelasticity of video conferencing
    - need to support wireless as well.  network will be heterogeneous, need to
    support them all.  many bottlenecks will exist in the future, important for
    all apps, not just low latency apps
- Jonathan:
    - In considering that "L4S is mature" there is the sunk cost fallacy.  It's
    not demonstrated that ambiguity can be resolved.  we have counterexamples
    and all sides agree at least more work is needed.  ECT(1) as input decision
    would be premature today.  Deciding not to use ECT(1) at all would be a
    better decision for today; the main thing is getting
    deployment, and we have much better today than in prior years.  please
    deploy ECN in some form.
- Sebastian:
    - In reply to Stuart: "cheating backfires": as currently implemented L4S
    gives both high throughput and high latency, there is no incentive not to
    cheat.
- Anna: If the goal for L4S is for all traffic to be in the L4S queue, how is
the throughput lower for ECT(1)?
- Stuart: The expectation is we have legacy greedy traffic for a long time,
hope of reclaiming ECT(0) is like retiring IPv4, need to keep it for
foreseeable future.  We focus on what we want, long tail will be there for a
while.
- Anna: I agree we have long tail, but both high throughput and low latency
would go in L4S queue
- Koen: For now we still need 2 queues, we will have non-ECT flows too.  L4S is
possible for high throughput with some gaps, but traffic can always use classic
queue.
- Bob: It took 20 years to reclaim ECN nonce from only 1 use somewhere in
Scandinavia...  It will take time to adopt ECN.

5. Chairs Summary of Position
- Gorry recap:
    - strong consensus in 2016 to continue L4S.  do we have enough information
    today to make decisions we need to make today?
- Wes, to summarize:
    - There continues to be support to use ECT(1) as input, but we heard a
    range of opinions, not unanimous support.  not sure if consensus, but
    majority wants input, some would like to use dscp+output.
    - If we go forward using ECT(1) as input, there are concerns about classic
    detection/fallback, will remain an important bit of work to be doing.
    - We saw open work to continue discussing about data links like wifi and
    cheating scenarios.

- Gorry: During the L4S BoF people were unconcerned about transport fallback,
but seems today to be the sticking point.  is this what should stop us doing
standardization?
- Jana: We expect to have work to do, need to stop running in circles, we can
do the work you've listed in the working group, and just need solutions.  let's
keep going and work out the problems.
- David: Safety is crucial.  This must not break things.  interesting point
mentioned: queue protection that might apply the disincentive, but right now
the safety case is not there.
- Stuart: A procedural request: Apple is not supporting SCE, please update the
slides.
- Gorry: OK, the L4S contributions checked this and we amended the slides to
reflect company positions, we'll do the same for SCE.

A consensus poll cannot be done in this meeting.

Summary of tentative positions from Etherpad at conclusion of meeting:

Sebastian Moeller (currently, I lean towards ECT(1) as output, this IMHO is
orthogonal to the question whether L4S and/or SCE are to proceed)
Dave Taht (TekLibre, aka  Rube GoldBerg) I am opposed to wide deployment of
undroppable packets. L4S mandates that, SCE offers a gradual and OPTIONAL
deployment path, and not widely deploying any form of ECN at all except for
traffic that truly benefits is the sanest path forward.

Greg White, CableLabs (Support L4S and ECT(1) as input)
George Hart, Rogers (support L4S and input signaling for ECT(1))
Ingemar Johansson, Ericsson (support L4S and input signaling for L4S)
Per Willars, Ericsson (support L4S, especially ECT(1) as input to 5G networks)
Julien Maisonneuve, Nokia (support L4S and input signaling for L4S)
Sebnem Ozer, Comcast (support L4S)
Kyle Rose, Akamai: I am agnostic *if* part of the L4S work is to explicitly
deprecate classic ECN

The chairs will review the outcomes and summarise the next steps on the list.