Minutes for L4S at IETF-96

Meeting Minutes Low Latency Low Loss Scalable throughput (l4s) WG
State Active
Last updated 2016-08-15

Meeting Minutes

   BOF: Low Latency Low Loss Scalable throughput (L4S) Draft Agenda for Berlin
TUESDAY, July 19, 2016
1400-1600  Afternoon Session I Potsdam I
Chairs: Lars Eggert, Philip Eardley

== Draft agenda ==
1. Introduction - Chairs

Tim Chown: is there a mailing list?
Yes: tcpprague@ietf.org

2. The problem and very high-level solution - Bob Briscoe

- Yuchung (Google): do you need to track the per-flow state?
- Bob: no, just ECN bit
- Yuchung:  Does the AQM for classic traffic turn the elephant into a rabbit?
- Linda Dunbar (Huawei): How do you prevent the problem of queueing in a core
router? Do you need this there?
- Bob: Usually you don't get queueing in a core router. You don't want drop or
delay in the core.
- Bob: Once you deal with your primary bottleneck, in the longer term you could
put a queue like this in, but queuing tends to get pushed to the edge.
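
Editor's illustration (not part of the session): Bob's answer that no per-flow
state is needed, only the ECN bits, can be sketched as a stateless classifier.
The codepoint values are those of RFC 3168; the names Ecn, ecn_field and
select_queue are hypothetical, and sending CE-marked packets to the L4S queue
is an assumption of this sketch rather than something stated in the session.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """ECN field values (low two bits of the IP TOS / Traffic Class byte)."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def ecn_field(tos_byte: int) -> Ecn:
    """Extract the two ECN bits from a TOS / Traffic Class byte."""
    return Ecn(tos_byte & 0b11)


def select_queue(tos_byte: int) -> str:
    """Pick a queue from the ECN bits alone; no flow table is consulted.

    Assumption of this sketch: ECT(1) and CE go to the low-latency queue,
    everything else goes to the classic queue.
    """
    if ecn_field(tos_byte) in (Ecn.ECT1, Ecn.CE):
        return "l4s"
    return "classic"
```

The decision depends only on the packet header, so the classifier keeps no
memory between packets.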

3. Demo: L4S in action - Koen De Schepper

- Jana Iyengar: explain the second graph
- Koen:  One cluster of delay is retransmit timeout of 200ms
- Dave Taht (bufferbloat.net): what was the baseline RTT?
- Koen: 7 msec
- Dave Taht: do you have data with larger (> 100msec) RTTs?
- Koen: Yes

4. L4S Applicability to Mobile, without flow inspection - Kevin Smith

- no questions/comments

5. L4S in a 4G/5G context - Ingemar Johansson

- no questions/comments

6. DCTCP evolution - Praveen Balasubramanian

- no questions/comments

7. Discussion about the technology

- Yuchung: Curvy RED on classic traffic doesn't solve the whole problem, still
need DCTCP
- Bob: DCTCP gives lower latency.  Want to enable new applications to motivate
deployment
- Yuchung: can't you tune your AQM?
- Bob: if you tune too hard, you lose throughput
- Magnus Westerlund: interactions with video interface (e.g., frame intervals)
- Koen: concern about rapidly changing throughput.  We can guarantee low
latency, but throughput might change rapidly.  Application must adapt.
- Magnus: how well does the congestion signal interact, different from what a
TCP connection would see.
- Koen: allows bursts (unlike FQ).
- Dave Taht: work is an outgrowth of the AQM working group.  What AQM has
specified is vastly superior to what has been deployed today.  Would find your
presentation more convincing if you compared it to FQ-Codel.
- Koen: could show you.
- Dave Taht: FQ-Codel shows near-zero latency for many forms of traffic.
- Mat Ford: How does the end system know that it is on a network with Dual-Q
coupled AQM?
- Praveen: end host doesn't know
- Bob: DCTCP just reverts to TCP Reno upon loss.  With classic ECN the end
system would see delay before the ECN signal and could guess that classic ECN
is in use.
- Yuchung: should we change routers before we update hosts?
- Koen: in parallel, gradually.  DCTCP exists today.  You can make a TCP that
works in both.
- Yuchung: Does this exist in the drafts?
- Bob: need fallback if classic ECN is in use instead of L4S ECN
- Pat Thaler (Broadcom): cover datacentre as well as carrier?
- Bob: yes. Original reason I started doing this was that there were too many
datacentres to deploy DCTCP (in BT).  Bottleneck is likely to be on the ToR
switch; solution for multi-tenant DCs.
- Pat Thaler: some apps care about low loss/latency; others don't.
- David Black: Where are the incentives to keep the other senders out of the
low latency queue?
- Bob: you've misunderstood: all large flows can also be in the L4S queue ...
- David Black: Assumes only DCTCP is supposed to use ECT(1), what is the
incentive for not using this codepoint by other applications that are not
DCTCP?
- Koen: same as today, you could always just ignore the ECN signal.  Can employ
other techniques such as rate limiting to stop cheating.
- Lars: can someone misbehaving destroy the queue for everyone?
- Praveen: will react badly to loss.
- David Black: some degree of trust that ECT(1) will be set appropriately.
- Bob: This needs to be mentioned in the security considerations section of
the draft.
- Jana: It's not just a security consideration - it does not need to be an
attacker, just a mistaken use.
- Koen: This is the same situation as with the current network with classic
ECN.  Eventually you just start dropping packets, as you do in any other AQM.
- Jana: The AQM WG may be about to close, what do you think your recommendation
will be to address this?
- Lars: Hold this question until the end of the session.
- Roland Bless: AQM can achieve low delay if you have more than a few flows.
Otherwise, you need to change TCP.
- Koen: tcpprague has to decide how to be scalable, future safe.  Updates may
require changes to the queues in routers.
- Roland Bless: It's not so easy to update the network routers after a change.
- Christian Huitema: There is a lot of discussion about the relation of ECN
feedback with two queues vs. multi-queue (e.g., FQ-Codel).  With flow
isolation, each flow can have its own feedback.  If you look at what you are
doing, you are specifying a new network feedback mechanism (please warn me very
quickly if the queueing is going up).  I have reservations about embedding the
square root coupling into the mechanism; TCP is not square root.  It's a hack.
- Koen: with slow start, short flows don't get any feedback.  With short RTT
you can respond quickly.  Not so much with longer RTT flows.  There are other
mechanisms for short flows.
- CH: very small fraction of flows are long flows.
- Koen: video quality performance depends on throughput and low latency.  VR
video needs lower latency/higher throughput.
- CH: be careful about hacks
- Bob: I agree it is worrying to put a formula into the architecture.  Having a
shallow threshold on the L4S side allows you to test the capacity more quickly
to get your short flows going faster.
- Colin Perkins: good story for not marking classic TCP traffic with ECT(1).
Not sure about RTP conferencing application; incentives might not apply in that
case.
- Bob: I want those flows in the L4S queue.
- Colin: If I build a non-adaptive video app I may just ignore all the
CE-marks, that's an important case to consider for other apps.
- Bob: Koen has done experiments.  If this thing moves to loss, it will control
itself.  Problem is the same as what we have now.
- Koen: If someone sends non-CC traffic above bottleneck share, same problem
happens.
- Colin: The disadvantage is that you push the TCP flows away quickly.
- Koen: today as well.  Don't necessarily have a bigger effect because the
queues are smaller.
- Yuchung: What about delay-based CC?
- Koen: There is some discussion ongoing.  Hard for delay-based CC to coexist
with loss-based CC (fairness).
- Yuchung: ?
- Koen: fairness, I don't know
- Bob: I want operators to change things to the right way quickly.
- Yuchung: So, you think ECN is a more straightforward approach?
- Bob: if delay is a problem, using delay measurements to control the system
won't help.
- Koen: A combination of loss and delay can help (improve ECN feedback).
- Lars: don't want to go into loss vs. delay debate.
- Jana: The model here is a separation of queues in bottleneck routers; one low
latency, one marking ECN bits.  Sender response to that also needs to be
considered - you could look at doing something different in TCP-Prague.
- (no answer)
- Phil: question about FQ-Codel comparison; see Koen at the end.
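
Editor's illustration (not part of the session): the "square root coupling"
that Christian Huitema questions can be sketched as below. The sketch assumes
the commonly used steady-state approximations that a Reno-like flow's rate
scales with 1/sqrt(p) while a DCTCP-like flow's rate scales with 1/p, so the
classic drop probability is made the square of a base probability and the L4S
marking probability a linear multiple of it. The function name and the
coupling factor k = 2.0 are hypothetical; the minutes give no values.

```python
def coupled_probabilities(p_base: float, k: float = 2.0) -> tuple[float, float]:
    """Return (p_classic, p_l4s) derived from one base probability.

    p_classic = p_base ** 2          (drop/mark applied to classic traffic)
    p_l4s     = min(1, k * p_base)   (ECN marking applied to L4S traffic)

    Equivalently p_classic = (p_l4s / k) ** 2 while p_l4s is below 1, which
    is the squared relation the discussion refers to. k is an assumed
    coupling factor, not a value taken from the minutes.
    """
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must be within [0, 1]")
    if k <= 0.0:
        raise ValueError("k must be positive")
    p_classic = p_base ** 2
    p_l4s = min(1.0, k * p_base)
    return p_classic, p_l4s
```

For example, coupled_probabilities(0.5) returns (0.25, 1.0): the L4S marking
probability saturates while the classic probability is still a quarter.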

8. Work required by the IETF - Marcelo Bagnulo

9. Discussion of the work required by IETF

- Colin Perkins: think you are missing some significant components.  Only
mentioned TCP and IP.  What about non-TCP transports?
- Marcelo: some mention of others.  Think we should initially focus on TCP
- Colin: existing RFCs of non-TCP transports using ECN.  You are violating
MUSTs
- Marcelo: and they also violate TCP MUSTs.  Just updating RFC 3168 (?) is
enough
- Colin: if you are updating the response at the IP level, you need to update
all of the other RFCs.
- Bob: we have just written text that does what you suggest we do.
- Colin: list missing things like circuit breakers in AVT.
- Bob: updating existing RFCs.  New work can adapt.
- Lars: you are changing things underneath them.
- Mirja: ongoing work we are doing anyway
- Gorry: not just RFCs, they are deployed.  Other areas of the IETF that will
be impacted.
- Bob: I checked ECN over RTP over UDP.  Nobody using ECT(1).
- Colin: this affects existing work.  Not listed.  Needs to be added.
- Tim Shepard: because you are changing things underneath them, they need to
change.  But if they are using ECT(0) they don't need to change.
- Lars:
- Tim: what changes for ECT(0)?
- Colin: discuss based on this that says nothing about ECT(1).  Impacting other
drafts even if they don't use ECT(1).
- Lars: proponents need to check other work that is standardized and ongoing to
see what will be affected.
- Mirja: Some changes to ECN are changes that are independent of L4S.  These
need to be changed anyway.
- Christian H: want to change to have high-volume feedback from the network.
Need to do the checks of other work.
- Colin: impacts more WGs than tsvwg.
- Lars: already a problem that the tsvwg folks need to take into account.
- David Black: author of RFC 3168 and tsvwg chair - agree with Colin.
Experimentation is called for.  RFC 3168 prohibits experimentation?  Draft to
change ECN response is being split into two drafts.  What do we need to do to
enable experimentation?  If we open this up, we are tinkering with things that
could break the Internet.  What are the appropriate controls to allow us to
experiment responsibly?
- Magnus W: feels that the assumption that everyone is using ECT(0) today is
wrong.  Some folks may be using ECT(1) today.  APIs that allow applications to
set ECT will allow more applications to use ECT(1).
- Koen: Apps can detect ECN-CE response behaviour that has delay and adapt.
- Magnus W: If you have an API to set the ECN bits for UDP, then you may
immediately expect to see more use of this codepoint...
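
Editor's illustration (not part of the session): Magnus's point about APIs
that let applications set the ECN bits on UDP can be made concrete with the
standard sockets interface. The sketch below assumes a platform, such as
Linux, where the IP_TOS socket option on an IPv4 UDP socket is applied to
outgoing packets including the two ECN bits; other stacks may mask or ignore
them. The helper names tos_byte and udp_socket_with_ecn are hypothetical.

```python
import socket

# ECN codepoints from RFC 3168 (low two bits of the TOS byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def tos_byte(dscp: int, ecn: int) -> int:
    """Combine a 6-bit DSCP and a 2-bit ECN codepoint into one TOS byte."""
    if not 0 <= dscp < 64:
        raise ValueError("dscp must fit in 6 bits")
    if not 0 <= ecn < 4:
        raise ValueError("ecn must fit in 2 bits")
    return (dscp << 2) | ecn


def udp_socket_with_ecn(ecn: int, dscp: int = 0) -> socket.socket:
    """Open an IPv4 UDP socket and request the given ECN codepoint.

    Assumption: the operating system honours the ECN bits passed through
    IP_TOS for UDP sockets; this is platform dependent.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp, ecn))
    return sock
```

An application would call udp_socket_with_ecn(ECT1) to request ECT(1) on its
datagrams, which takes one line of code and is the ease of use Magnus is
pointing at.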

10. Polls - Chairs

- Lars: do people believe that they understand the proposal being brought
forward?
* rough consensus for understanding

- Lars: do you believe this is something the IETF should take on?
* strong consensus for yes
- Lars: show hands if you want to help with the work
* approximately 20 hands up.  Send an email to tcpprague@ietf.org if you are
willing to help.  State if you are willing to review documents.
- Jana: question about who is willing to deploy.
- Lars: please show your hand if you build equipment or run networks
* 15-20 hands

- Mirja: do people believe that the IETF is able to do this work?
- Lars: if you believe, please hum
* strong consensus yes
- Jana: different pieces: dual-Q, response
- Jana: does tcpprague belong in iccrg?
- Mirja: the only thing that really needs standardization is the marking to
discriminate DCTCP vs. classic
- Phil: Koen will show demo now.