Minutes IETF101: tsvwg
Transport Area Working Group
DRAFT - to be checked against audio archive.
IETF 101, TSVWG Meeting Minutes
Chairs: Gorry Fairhurst, David Black, and Wes Eddy.
Thanks to Richard Scheffenegger and Paul Congdon for assistance in taking notes.
1550-1750 Afternoon Session II
1. Chairs Update:
Four RFCs published since Singapore: 8260, 8261, 8311 and 8325.
Two drafts are expected to be submitted to IESG - SCTP errata and DSCP IANA
process changes. We expect 5 drafts to go to WGLC before next meeting.
There are 7 other WG drafts that are in progress; SCTP NAT, a set of L4S drafts,
UDP options, Tunnel Congestion Feedback and Datagram PLPMTUD. There will be
2 related drafts discussed in this meeting. There are 3 additional WG drafts that
are not on the agenda for this meeting. In the INTAREA WG there are 3 drafts
that are related to TSVWG activity - tunnel MTU, fragmentation and SOCKSv6.
Milestone updates and review of WG Progress:
SCTP NAT draft should be ready for WGLC at/after the Montreal meeting, so
October is a good new milestone date for it. Confirmed by Michael Tuexen.
Tunnel Congestion Feedback - Bob Briscoe reported that Donald Eastlake and
Andy Malis have become interested in using this for Service Function
Chaining (SFC). They were pointed to this for load balancing across
service functions, and are interested in helping.
2. Announcements and Heads-Up
2.1 Liaisons (none)
2.2 Other Drafts Related to TSVWG
draft-olteanu-intarea-socks-6 (For info - Please discuss on INTAREA list)
draft-saldana-tsvwg-simplemux (For info - Please discuss on TSVWG list)
draft-herbert-fast (For info - Please discuss on TSVWG list)
draft-han-tsvwg-cc (For info - Please discuss on TSVWG list)
3. Transport and Network
3.1 Gorry Fairhurst: IANA Action for DSCP Pools
draft-ietf-tsvwg-iana-dscp-registry (In WGLC)
Spencer Dawkins (as AD):
(regarding discussion about shortening the name of the draft)
The current name is meaningful, it is ok.
I commented on -01. Replace last part with "requires IANA action".
Gorry Fairhurst (as author):
OK, I plan to revise shortly after the meeting.
When you say there will not be any negative impact, would
you know there is no private use of this pool of DSCPs?
Gorry Fairhurst (as author):
The purpose of publishing this is to ensure people know about the change.
I do not think existing use is harmful, but there is a change, people
using local use DSCPs should already be aware that they may be
used by others. This will be called out to other WGs in IETF LC.
If those networks use the local DSCP, don't they need to just shoot
themselves in the foot when they use this LE PHB?
Gorry Fairhurst (as author):
They should not if they adhere to DiffServ specs.
DiffServ behaves best with complete configuration at the DiffServ perimeter.
I look forward to the shepherd writeup, that should call attention to
any possible impacts.
This should not be a big issue, as there was an indication these DSCPs
(xxxx01) were already marked as possible use for future standards action.
Local remapping is always possible.
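For readers unfamiliar with the pool notation used above, the following is a minimal sketch of how a 6-bit DSCP maps to a pool, assuming the bit patterns defined in RFC 2474 (xxxxx0, xxxx11, xxxx01). The function names dscp_pool and dscp_to_tos are hypothetical and exist only for this illustration.

```python
def dscp_pool(dscp: int) -> int:
    """Return the RFC 2474 pool number (1, 2 or 3) for a 6-bit DSCP."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must be a 6-bit value (0..63)")
    if dscp & 0b1 == 0:
        return 1  # pattern xxxxx0
    if dscp & 0b11 == 0b11:
        return 2  # pattern xxxx11
    return 3      # pattern xxxx01, the pool discussed in this item


def dscp_to_tos(dscp: int) -> int:
    """Place a DSCP in the upper six bits of the IPv4 TOS / IPv6 Traffic Class byte."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must be a 6-bit value (0..63)")
    return dscp << 2


# Illustrative values only: 0b000001 falls in the xxxx01 pool.
print(dscp_pool(0b000001), dscp_to_tos(0b000001))
```

A codepoint in the xxxx01 pool, such as 000001, therefore occupies the value 4 in the 8-bit field once the two low-order (ECN) bits are left at zero.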
3.2 Roland Bless: Lower Effort PHB
(Asks for any final feedback on the decision to allocate the DSCP).
I see no objection to use of codepoint 000001 as the recommendation
here for the LE PHB (to be confirmed on list).
The text currently says:
With respect to the use of LE traffic by different styles of congestion
control. The text says you don't want to harm people, you shouldn't
harm others' traffic.
I think this text should describe what happens if there would be harm.
Gorry Fairhurst (from the floor):
I prefer an approach that says SHOULD use a less than best effort
congestion control (e.g., LEDBAT) and explain why this is really desirable.
That is, if you don't, there may be unwanted or unexpected interaction
with other traffic. That would be better guidance. I (personally)
would be reluctant to use MUST, but really I would like to see people do this.
David Black (Floor Mic):
From an operator perspective SHOULD vs MUST is irrelevant, an operator
has to defend the service regardless of which words are in RFCs. I think
the text proposal by Gorry sounds OK.
The text just doesn't make sense in the current wording: whether LE traffic
is going to do harm should be written clearly, the important thing is to
write clearly words to say in what conditions harm can result.
The operator concern is not for a singular flow, but for the aggregate.
Gorry Fairhurst (as a chair):
Bob, please send example text to the list, to help find better wording.
Bob will send text about whether use of a less than best effort transport
(e.g. LEDBAT) ought to be a MUST or SHOULD requirement for use of the
LE PHB, and David will add operator perspective on this.
We will discuss this on the mailing list. Do we need extra text on tunnels?
A reference to RFC2983 will help.
We recommend not updating the document about 802.11 mapping, it is pretty
clear how this new DSCP maps to WiFi.
The ID with guidance on WebRTC (draft-ietf-tsvwg-rtcweb-qos-18, in RFC-Ed
queue) needs to be updated. This ID discussed CS1 and that advice will
now become obsolete.
Spencer Dawkins (as AD):
It is possible to update a draft in the RFC Editor queue so that the
RFC reflects the new recommended DSCP.
I will drop a note to RFC Editor, and ask how best to do this.
The Chairs should let WEBRTC people know of the updated consensus here.
We will let the IETF and W3C RTCWEB/WEBRTC groups know
and recommend the DSCP change. (Please cc tsvwg.)
3.3 Anais Finzi: Priority Switching Scheduler
(After discussion with the authors this is headed to the Independent
Submission Editor, with inputs from TSVWG.)
The scheduled talk did not occur in Singapore. The goal is to make the
AF Class more predictable. This scheme changes the priority of the traffic
based on credit thresholds.
Is EF not impacted? I am not quite sure about that conclusion.
EF has a definition of an error term.
In our simulations we did not see any impact.
The maximum impact is at the maximum frame size.
I will check the paper, thanks.
Roland volunteered as a reviewer of the draft.
This proposal would remove the req'd policer on the EF class.
Would this conflict with RFC3246?
So far, the IETF did not specify any schedulers. Is this Standard Track?
No. This is heading for an Independent Submission.
I am not sure if I want to have that scheduler on a 100G link.
I can review the draft/paper, but I am not an expert on scheduling.
Easy to configure? To set these 3 parameters...
The paper shows how to translate the current config into the
corresponding set of settings in PSS. This can be simplified by
removing some, but the result is not optimal.
Roland Bless and Ruediger Geib agreed to review the draft, and David will
send an announcement to the list requesting other reviews.
3.4 Tom Jones: Datagram Packetization Layer Path MTU Discovery
Have you looked at the draft in INTAREA on ICMP PTB signals?
Yes, we have read and sent a few comments, this does not conflict.
SCTP can do verification on the VTAG, not just the 5-tuple.
I will read this draft. The issue of authenticating messages
from the network is much broader than just PMTUD. It is good
to check if there are new solutions. In the absence of authentication,
I agree that messages are just advisory. You have to be able to
tolerate bad cases (e.g. byte-swapped lengths).
Gorry Fairhurst (as author):
Are you saying there can be an advertised link MTU much smaller than
the actual path MTU?
There is in principle a DoS attack trying to reduce the PMTU.
We did see cases where the two MTU bytes were swapped.
The intent was to facilitate Jumbo discovery - but this moved the
problem down to the design of NICs and switch buffer carving.
It is very important to make verifications robust.
I will review this draft.
When working with SCTP we found a middlebox (at my home) that just gave
some random number, instead of the VTAG!
We have been running into a wide range of MTU issues.
This is a nice optimization. How does it interact with load balancers?
This is important work, we need to consider how PTB gets mapped back
to the source. I also saw NATs and at least one TCP optimiser that did
The method only takes PTB as advisory, we plan in the next revision
to add a probe method to verify if the actual PMTU is larger than the
advertised link MTU.
Please tell us about any strange behaviours - that would be great input.
Gorry Fairhurst (as author):
Load balancing (i.e., use of more than one PMTU) was one motivation
for making it robust - we want to get this part right. For this, we
need some tales of what happens in the wild - please talk to us?
How does this work relate to the transport work in QUIC?
Spencer Dawkins (as AD):
I hope we can make progress on using this in QUIC, although the starting
point may be for QUIC to pick a PMTU number and just fallback if this
fails to be supported by the path.
We know PMTU using ICMP is broken. Anything that is doing this better
seems like a good thing.
We plan to have the algorithm finished for Montreal, and WGLC by December.
Lars Eggert (as QUIC Chair):
Most current QUIC implementations do something very simple,
a basic test/fallback. Having a functional PMTUD is not strictly required.
If this scheme is not too complex, some implementers might try this.
After we do a v1 of the QUIC spec, we will do v1.1 shortly after.
Speaking as an individual this mechanism might go into QUIC v1.1.
Gorry Fairhurst (as author): Actually the full algorithm has many states,
but once we have this, we can profile a simpler algorithm for
applications that do not need the full method.
There are two parts needed: First, we need some small paragraphs in
this TSVWG ID to describe requirements and features of each protocol.
This should be straightforward for the use of QUIC, there is some text
there already. Review by a QUIC subject matter expert would be great.
This would make sure nothing inconsistent is in this draft.
- We can also help with the actual text for how to implement this
in a QUIC transport - and that could be in a QUIC WG draft.
I prefer QUIC to be upwards compatible with this.
As someone who tried to run Jumbo frames, my conclusion is we all need
this. Network operators should make sure big PMTU works.
Should this become a BCP for all protocols? Saying we ought to do this?
BCP is plausible with much more experience. Qualitatively useful, so
BCP is the correct end-goal. OS should not ship without some
In BSD, the black-hole detection code is not currently good.
I also want some telemetry to tell me this has gone active, to help me
An entry in the syslog? netstat counter?
Yes, that sort of thing.
A BCP - implies more running code, more experience is needed.
There is currently a risk that we do not get to a state that is good
and so the endpoint remains in a very bad state, just like clamping MSS
low enough that it is bound to work, but way smaller than optimal.
If we do not add this as a default, then jumbo frames are never going
to happen. The Linux black-hole detector also needed improvement
when I looked a year ago.
I do want Blackhole detection and logging most.
There are differences when using PLPMTUD with TCP. TCP uses probe packets
that carry user data, this changes retransmission and the CC state. TCP
PLPMTUD is much more complex. In contrast, this datagram version here
has the assumption this is done without using user data, so probe loss
is not something that impacts CC or retransmission/repair logic.
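The "PTB is advisory" point made in this discussion can be illustrated with a minimal sketch. This is not the algorithm in the draft; the function name ptb_hint, its parameters, and the choice of the IPv6 minimum MTU (1280) as the default floor are assumptions made for illustration. The idea shown is only that a reported MTU is sanity-checked and then used as a hint for the next probe, never applied directly.

```python
from typing import Optional

IPV6_MIN_MTU = 1280  # assumption: use the IPv6 minimum link MTU as the floor


def ptb_hint(reported_mtu: int, probed_size: int,
             min_mtu: int = IPV6_MIN_MTU) -> Optional[int]:
    """Return a size worth probing next, or None if the PTB report is implausible.

    Hypothetical helper: the caller is still expected to confirm the
    returned size with a probe packet before relying on it.
    """
    # A plausible report is at least the protocol minimum and strictly
    # smaller than the packet that supposedly did not fit.
    if min_mtu <= reported_mtu < probed_size:
        return reported_mtu
    return None


# 1500 is 0x05DC; with its two bytes swapped it reads 0xDC05 = 56325.
print(ptb_hint(1400, 1500))   # plausible hint
print(ptb_hint(56325, 9000))  # byte-swapped 1500: rejected
```

Under these assumptions a byte-swapped length or an attempt to push the PMTU below the floor is simply ignored, and the sender falls back on its own probing.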
3.5 Bob Briscoe: ECN transport
(This draft is now ready for WGLC.
It will be WGLC'd with the second encaps draft)
Are there any IETF shims that got it right?
All have their own issues. If you do it when you first design it,
it is easy.
Bob (points to table slide).
Let's make this a management problem. My view ECN is an inter-operator
3.6 Bob Briscoe: ECN & L4S
(Bob presented slides on L4S. No comments or questions from the WG).
(Bob presented a new individual draft on Diffserv and L4S.)
There were no comments or questions from the WG. This draft will be
discussed in Montreal, to see if there was interest in this work.
4. Other presentations / New Work
4.1 Paul Congdon: Congestion Isolation in IEEE 802.1
Paul outlined proposed work in IEEE 802 on providing layer 2 congestion
support to switches and the possible interactions with other methods.
The IEEE will be making a possible decision in July.
How does this relate to routers?
How often is the xon/xoff transition?
Paul: This is basically propagation time, and a threshold.
Can the hardware actually scale to high speed links and support back to
back frames of the order of 1 microsec?
Is it feasible to xon/xoff of 10-12 packets?
We think hardware can provide solutions.
When the xoff is received, there might not be a packet to be sent.
Most implementations set xoff threshold to maximum and count on xon.
Send xoff on high thresh, and watermark at low threshold send xon.
That is not the way it's generally deployed.
What is the trust model?
This targets a single admin domain, probably a data centre.
Have you considered virtual queues, instead of thresholds for triggering?
A virtual queue slows you down before you have to buffer. I know
Broadcom/Cisco chipsets can utilise this.
We do not like PFC, ECN mostly works. If E2E ECN works, this is not that
Congestion comes up if multiple flows share the same queue.
How does the switch get to the bad flow?
Same way as an ECN mark for the offending flows.
Probabilistically the signal should reach the correct source.
Is there granularity of flow detection, can other flows also get stuck
into the same congested queue?
Yes, we are aware of this.
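The two-threshold xon/xoff behaviour mentioned in this discussion (xoff at a high threshold, xon at a low watermark) can be sketched as a small state machine. This is an illustration of that one description only; as noted above, deployed implementations may be configured differently. The class name PauseController and the use of a queue depth counted in packets are assumptions.

```python
from typing import Optional


class PauseController:
    """Hypothetical hysteresis controller: xoff at `high`, xon at `low`."""

    def __init__(self, high: int, low: int) -> None:
        if not 0 <= low < high:
            raise ValueError("require 0 <= low < high")
        self.high = high
        self.low = low
        self.paused = False

    def update(self, queue_depth: int) -> Optional[str]:
        """Return "xoff" or "xon" when a signal should be sent, else None."""
        if not self.paused and queue_depth >= self.high:
            self.paused = True
            return "xoff"
        if self.paused and queue_depth <= self.low:
            self.paused = False
            return "xon"
        return None


ctrl = PauseController(high=10, low=4)
print([ctrl.update(depth) for depth in [3, 9, 10, 12, 6, 4, 2]])
```

The gap between the two thresholds bounds how often the signal can toggle, which is the concern behind the question about the xon/xoff transition rate.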
1750-1810 Beverage Break
1810-1910 Afternoon Session III
5. Transport Protocols and Mechanisms
5.1 Vincent Roca: FEC drafts
(These drafts are now ready for WGLC, reviewers are needed.)
There were no comments on the final drafts.
The chairs will look for volunteer reviewers, and we will cross-post
review requests to the IRTF NWCRG.
5.2 Joe Touch: UDP Options (proxy by Gorry Fairhurst)
Is there going to be more text to explain the options?
This is the January draft.
Joe needs to continue the discussion on the list and revise draft
following the presentation.
5.3 Tom Jones: UDP Options Implementation
(Tom Jones presented a view that differs from Joe's view of the way forward,
especially a different way to handle the checksum option.)
We want to complete implementation for June.
Gorry Fairhurst (based on email from Joe):
Joe Touch is willing to look at different sizes of checksums, should
the TSVWG think that is useful and consider it efficient.
A 16 bit checksum is one less line of code, less code, and standard
algorithms are good.
Please follow-up with Joe on the list.
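As background to the remark about standard 16-bit algorithms, the sketch below shows the well-known Internet checksum (ones-complement sum, RFC 1071). These minutes do not state which computation the option would use, so this is offered only as an example of a standard 16-bit checksum; the function name internet_checksum is hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit ones-complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Example bytes taken from the worked example in RFC 1071.
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))
```

A receiver can verify by running the same function over the data with the checksum appended; a result of zero indicates no detected error.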
5.4 Michael Tuexen: RFC4960 Errata
(This talk presents updates following WGLC.)
You note a change to the CRC32c definition. This is often referred
to by other groups. What changed?
This change is to fix code typos by changing definition to make it
compile on all platforms, it is not an algorithmic change to the CRC32c
There were no further comments from the room. People are encouraged
to check the corrections and report any issues to the list.
The current version can now continue to AD review and IESG LC.
5.5 Michael Tuexen: SCTP NAT
There were no comments in the room. The authors have noted the new
milestone and expect to work on this ID for the Montreal IETF meeting.
5.6 Gorry Fairhurst/Colin Perkins: Impact of Transport Header Encryption
Are you interested in speculation about what might happen?
Could we imagine a standardised generic transport-agnostic header,
that is end-to-end and indicates embedded bytes across multiple transports?
This would be useful for debugging and a partial solution to some issues.
Who would turn this on?
It would be good for debugging applications and not so much
use for some other places.
OAM that is not in the flow may not be representative.
There are also issues relating to equipment that the stakeholders
do not wish to tell people about - this makes it tricky.
It may seem that some tools are duplicated, or other ways can be
used to measure the network path, but people do need to have multiple
tools to check the answer to the same question. This is needed so they
can be sure what is actually happening.
In-band OAM might be interesting as one of these approaches.
This seems like something that is already possible in some networks,
but I agree it is really interesting to find out more about, please
tell us about this as ID editors.
OK, I think there could be a generic instrumentation shim. This could
be pervasive where it needs to be.
Generic tools are much better than version-specific tools that need updates.
How much feedback have you had from operators and equipment manufacturers?
I worked with operations and insight was often very hard to come by.
On a different topic, tools based on machine learning are even more
demanding. How can we get good feedback?
We received good feedback from many operators off-list. A key problem is
knowing who needs what operational information. Very often transport
information is used by people to debug equipment/configuration issues -
to understand anomalies/health etc, and organisations as a whole do not
care about content, only specific people need this.
We are still looking for more people to provide feedback.
DDOS mitigation techniques are important, some rely on machine learning.
Some approaches suffer more from encryption than others.
Colin Perkins (author):
I think this is an interesting discussion. It would be wonderful to have
a measurement shim, but that may not be something of which we currently
have much experience. We would be interested to look at different approaches,
but that would be a different draft.
The IPPM WG did publish an instrumentation shim, for replacing timestamp
and IP ID in IPv4 networks. This use is not applicable in the public Internet.
This can be better than passive TCP monitoring.
Some discussions in QUIC will drive us on different designs in this space.
As a framing document, this ID should be adopted.
This provides a good view of what encryption has changed from the transport
side. This is a useful scope. When you say encryption has costs, adding
shims definitely adds very real costs.
It would be useful to quantify somehow the complexities that are added here,
and what this gives us. I am happy to read the next revision of the draft.
How many have read this? (A fair number.)
I ask people to read this and we will review the adoption next meeting.
(+1 read this, via jabber)
I have a suggestion that we should aim at consensus on this one.
By scoping to the transport layer, we may get around to this and
we can look at both sides of the coin.
Adoption now, or in Montreal - I would be happy with either.
These problems are not going away, we will be discussing in Montreal.
Colin Perkins (author):
Things are not going to get better. QUIC is going to be deployed,
and is using these kinds of header, so there is some time constraint.
At this time the bits that are exposed in QUIC are still changing, it is
not fully stabilized yet, we need to understand these tradeoffs.
End of meeting.