Notes for IETF108 LOOPS BOF

Unified Slides

Introduction, Tool fiddling, Scribe Kidnapping, and Agenda Bashing Chairs

Chairs: Brian Trammell and Spencer Dawkins
Scribes - Li Yizhou and Tommy Pauly

“What is LOOPS?” Carsten Bormann

Carsten presenting…
Second BoF
Packet loss is recovered locally within a network segment
Improve latency when doing so
Hosts are not required to participate
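
A minimal sketch of what "recovered locally within a network segment" could look like, assuming a sequence-numbered tunnel between two overlay nodes. The class names `Ingress` and `Egress`, the gap-based loss detection, and the retransmission request are illustrative assumptions, not the mechanism defined in the LOOPS drafts.

```python
class Ingress:
    """Hypothetical segment ingress: numbers packets and keeps copies for repair."""

    def __init__(self):
        self.next_seq = 0
        self.buffer = {}  # seq -> payload (aging out of old entries is not shown)

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.buffer[seq] = payload
        return seq, payload

    def retransmit(self, missing):
        # Resend only what is still buffered; anything else is left to the end hosts.
        return [(seq, self.buffer[seq]) for seq in missing if seq in self.buffer]


class Egress:
    """Hypothetical segment egress: detects gaps in the sequence numbers."""

    def __init__(self):
        self.received = {}
        self.highest = -1

    def receive(self, seq, payload):
        self.received[seq] = payload
        self.highest = max(self.highest, seq)

    def missing(self):
        return [s for s in range(self.highest + 1) if s not in self.received]


ingress, egress = Ingress(), Egress()
for data in (b"a", b"b", b"c", b"d"):
    seq, payload = ingress.send(data)
    if seq != 2:  # simulate the segment dropping packet 2
        egress.receive(seq, payload)

for seq, payload in ingress.retransmit(egress.missing()):
    egress.receive(seq, payload)  # repaired without involving the end hosts
```

Gap detection of this kind cannot see a loss at the tail of a burst, because no later packet arrives to reveal the gap.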

Looking for a minimum viable product
Previously, people were scared that this was trying to reinvent the internet; not doing that!

Have removed several features since IETF 105 - dynamic block sizes, FEC beyond a basic “default scheme”

Use drop-to-mark for congestion feedback
Currently, congestion feedback only for ECT traffic
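
A minimal sketch of drop-to-mark, assuming the segment egress knows which packets it had to recover locally. The ECN codepoint values are those of RFC 3168; the function name `egress_ecn` and the `recovered_locally` flag are hypothetical and not taken from the LOOPS drafts.

```python
# ECN codepoints (RFC 3168): the two low-order bits of the TOS / Traffic Class byte.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def egress_ecn(ecn: int, recovered_locally: bool) -> int:
    """Return the ECN codepoint to forward for a packet leaving the segment.

    Assumption: a packet that was lost on the segment and repaired locally
    carries a CE mark onward, so the end hosts still see a congestion signal.
    """
    if not recovered_locally:
        return ecn  # nothing was hidden from the end hosts
    if ecn in (ECT_0, ECT_1):
        return CE  # convert the hidden drop into a mark
    # Not-ECT traffic cannot be marked, so no signal is conveyed here;
    # a packet that is already CE stays CE.
    return ecn
```

The Not-ECT branch reflects the limitation noted above: without ECT there is no codepoint available to carry the feedback.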

Not a transport protocol, OK with losses. Also will deal with more packets than a transport connection often would.

Status after IETF 105 BOF Carsten Bormann

Previous meeting looked at design space. We are now in a WG-forming stage, trying to narrow the proposal.

Focusing on latency improvement now.
Tail loss is the most important problem to consider.
Narrow the broad problem to opportunities. The problem is too wide, so what we really need is an opportunity statement. LOOPS is also an incentive to switch on ECN.

Scenario 1: Aa1 B Aa2.
Scenario 2: Aa Bc

Not applicable to all environments (won’t see it in the US, or on the backbone), but specific deployment scenarios.

Needs to be standardized because different platforms, forming the two sides of a LOOPS implementation, will need to interoperate.

Quick review of use cases Yizhou Li and Jianglong

Overlay Path via Cloud (Huawei use case)
Consider Ca1 Ba3 Aa2 – three different clouds, operator A hosts running on each.
The default path C - A might be slow, but sending through B may help.
LOOPS does not address path selection, but only helps to recover packets once the path has been chosen like this.

SD-WAN branch office interconnect (China Telecom use case)
Aa1 Aa2
Overlay connected by tunnels
LOOPS trying to recover loss due to the underlay being best effort

Improving Multipath support (DT use case - Markus Amend)
LOOPS on a path level can improve any given path used by multipath endpoints
LOOPS end-to-end overlaying multipath can compensate for imperfections created by the multipath layer due to scheduling, etc

LOOPS only handles packet recovery, no direct interaction with path selection, routing, etc.

Clarifying questions All

Brian: Looking at Jabber, the answer to the question “do we understand the problem” is “not yet”.

Gorry: Talked about using ECN end-to-end. What about using networks that have ECN in the underlay? If you have ECN in the underlay, you would want to react differently to drops vs marking. You need to handle the case where a drop is trying to reduce load.

Carsten: Might simplify by using ECN in the underlay but found we don’t need that, by converting drops to marks. Indeed that’s an application of LOOPS. You don’t know where the routers that mark are; converting drops to marks is innocuous. Use circuit breakers to make sure you’re not completely swamping a segment.

Colin: What I thought was happening with LOOPS is that it’s a fancy tunnel. Use cases seem to be more general. Are the rest things that just use LOOPS, and the rest is out of scope?

Carsten: LOOPS is just the recovery part.

Martin: This is a fancy tunnel, and if your route selection puts you down that tunnel, then you get all the wins.

Carsten: Yes. Some caveats, don’t know how to signal without ECT though.

Martin: Reordering?

Carsten: Yes. Difference between DetNet and LOOPS. DetNet emulates Ethernet (ish), handles reordering/resequencing.

Martin: If the application is not reordering-tolerant, and repair/displacement causes latency, how is this a good thing?

Carsten: Don’t run it if displacement is serious. Flows are small. Traffic has different requirements.

Jana: No classification in scope. Trying to understand context of deployment, sounds like what already happens around problematic links. Satellites and WiFi look at RTX, because you know about the loss characteristics. At the IP level I do it across the Internet. How does LOOPS only apply to the traffic that wants this treatment?

Carsten: We do this in link-layer adaptation all the time. Our question: if that makes sense for one hop, why not two? This is a generalization.

Jana: My question is how do you know you can generalize it?

Carsten: True, we may not know why packets are lost. You might be overestimating LLP intelligence here.

Jana: One reason for loss in multihop is congestion. Not sure what happens when you set up LOOPS on top of that. Very important to consider that the recovery LOOPS does has to be seen in context with sender recovery. How does this interact?

Carsten: LOOPS can’t look into the transport, so it doesn’t know. If there are losses that there is budget to repair, the repair will happen.

Yizhou: When the underlay infrastructure operator is different from the overlay operator, the overlay doesn’t know about the conditions on the underlay. Overlays can often be provisioned much faster than underlays.

Jana: I hear a fundamental assumption that loss is fundamentally a latency-inducing thing, and that is bad. That’s a strong assumption: endpoint protocols are built around loss, and sometimes can correct without latency penalty. BBR for example reduces latency while increasing

Charter, charter discussion All

Is the draft charter ready to take forward?
Is it close enough to polish on the list during the review process?

Colin: Network coding is planning on winding down over the next year; it makes sense to talk to them, but they likely won’t be around. More generally, I don’t see any list of deliverables yet. Some applicability statement or operator guidelines for where it is appropriate to use LOOPS would be useful to produce. If you have RTTs of more than X, you would be adding too much reordering, etc. It seems like a reasonable engineering problem that people will try to fix whether or not we standardize it, so it is good for the IETF to make statements not only on how to do it, but when to do it.

Gorry: PEPs had to be known by the endpoint. You shouldn’t force the tradeoffs on endpoints. I am concerned about multiple teams trying to fix the same problem at different layers. Would like to include this in the charter.

Brian (individual): Looking at the charter in terms of Gorry’s question… is this a minimum viable protocol? Do we need signalling beyond ECN to opt out of this? You may not know what the underlay is doing, but you also don’t know what the connections above you are doing. Is there a signalling layer that is missing here, to ask for this behavior? As Colin said, on the other hand, a lot of this is happening already, and often occurs without endpoint cooperation. We should at least be able to talk about experiments in this space in the IETF. I think the charter isn’t ready but can be polished.

Carsten: I agree, and have wanted endpoint intention signalling for a long time. We tried to steer clear of the signalling, but it would be useful. I wouldn’t want to do that in this working group.

Martin Duke: The premise was being completely transparent to the endpoint, but we’re seeing feedback against that. Could we just use DSCP?

Carsten: If it were deployed in any way, sure, but it’s not.

Martin: As an experiment is that OK?

Martin Thomson: Fundamental problem here in that these enhancements are being applied over a fairly long span of the network. This isn’t free, and may interfere with what endpoints already would be doing. I get why there aren’t explicit signals, but I don’t think that is necessarily good. Without an applicability statement, this is actively harmful; we need to make sure it’s clear when this is OK.

Spencer: The PANRG what-not-to-do talks about endpoints and networks not trusting each other. I am more comfortable now about these entities making “side deals”.

Jana: At a high level, there is a big difference between this and link-level retransmissions. Those are under IP. This is above IP, as an overlay. As a link, the link knows that it is a small part of the end-to-end path. I can buy that the applicability statement can narrow the usage, but it’s still not clear to me that we really understand where it makes sense. Just because some people can deploy this today, that doesn’t mean it’s more than snake oil. It’s not clear that people want a standard here, and it’s not clear that we want to make this easy to deploy. Applicability may not be enough.

Spencer: As a former AD, one of the BoFs that didn’t charter was about tunnelling compression, etc. We had the concern about people actually needing to do this. It turned out that most people who needed it were in Africa, and weren’t at the BoF.

Calling the question Chairs

Let’s hum!

Should we solve this in the IETF?
Yes - Forte
No - Forte

Is it useful to have an applicability statement if we do this work?
Yes - Forte
No - Pianissimo

Questions about whether this is independent of a WG or not. It may be interesting to write up the applicability of when to do this or not, even if we don’t want to work on this.

Gorry: TSVWG could take on the applicability statement for when it is appropriate to do these items. (That is, it could discuss this topic; that doesn’t necessarily mean publishing a doc.)

Lars: The proponents of the work need to do the work for applicability before anyone tries to charter this. I’m feeling pretty negative about this, and about saying the IETF can do things here. We need more work.

Brian: So, go write the applicability and come back.

Lars: Yes, I want to read it first.

Brian: Outcome is to go write that, and maybe present it at TSVWG, and then see if we want a BoF later.

Martin: Agreed on that assessment of the room. We’re not ready to set this up as a WG. Applicability presented to TSVWG is a fine idea. I would encourage proponents to do some of the research behind this rather than waiting around. There’s a lot of work to make the unknowns more clear.

Colin: ICCRG is not a bad place to discuss this.

Jana: I’m not convinced this is limited to CC. Don’t want to punt, want to discuss, but it’s a broad transport question.

Colin: Agreed. The docs should be discussed in ICCRG, TSVWG, TSVAREA, etc, to get the community aware before we do any more long-term work.

Brian: Can you do this kind of work in the IETF 109 timeframe?

Carsten: To share with transport area, or second BoF?

Brian: Just to share with transport area. This would be a milestone before doing another BoF.

Spencer: We are forte on both we should do this and not; dumping this on just the proponents is a shrinkage. Anyone who piled in for forte should know where to contribute.

Brian: The discussion on spinning this up should be on the LOOPS mailing list.