Minutes IETF101: tsvwg

Meeting Minutes Transport Area Working Group (tsvwg) WG
Title Minutes IETF101: tsvwg
State Active
Last updated 2018-04-11

Meeting Minutes

   DRAFT - to be checked against audio archive.

IETF 101, TSVWG Meeting Minutes
Chairs: Gorry Fairhurst, David Black, and Wes Eddy.

Thanks to Richard Scheffenegger and Paul Congdon for assistance in taking notes.

1550-1750  Afternoon Session II

1. Chairs Update:

  Four RFCs published since Singapore: 8260, 8261, 8311 and 8325.
  Two drafts are expected to be submitted to IESG - SCTP errata and DSCP IANA
  process changes.  We expect 5 drafts to go to WGLC before next meeting.
  There are 7 other WG drafts that are in progress; SCTP NAT, a set of L4S drafts,
  UDP options, Tunnel Congestion Feedback and Datagram PLPMTUD.  There will be
  2 related drafts discussed in this meeting.  There are 3 additional WG drafts that
  are not on the agenda for this meeting.  In the INTAREA WG there are 3 drafts
  that are related to TSVWG activity - tunnel MTU, fragmentation and SOCKSv6.
  Milestone updates and review of WG Progress:

  SCTP NAT draft should be ready for WGLC at/after the Montreal meeting, so
  October is a good new milestone date for it.  Confirmed by Michael Tuexen.

  Tunnel Congestion Feedback - Bob Briscoe reported that Donald Eastlake and
  Andy Malis have become interested in using this for Service Function
  Chaining (SFC).  They were pointed to this for load balancing across
  service functions, and are interested in helping.

2. Announcements and Heads-Up

2.1 Liaisons (none)

2.2 Other Drafts Related to TSVWG
    draft-olteanu-intarea-socks-6 (For info - Please discuss on INTAREA list)
    draft-saldana-tsvwg-simplemux (For info - Please discuss on TSVWG list)
    draft-herbert-fast (For info - Please discuss on TSVWG list)
    draft-han-tsvwg-cc (For info - Please discuss on TSVWG list)

3. Transport and Network

3.1 Gorry Fairhurst: IANA Action for DSCP Pools  
    draft-ietf-tsvwg-iana-dscp-registry (In WGLC)

Spencer Dawkins (as AD):
  (regarding discussion about shortening the name of the draft)
  The current name is meaningful, it is ok.

Roland Bless:
  I commented on -01. Replace last part with "requires IANA action".
Gorry Fairhurst (as author):
  OK, I plan to revise shortly after the meeting.

Bob Briscoe:
  When you say there will not be any negative impact, would
  you know there is no private use of this pool of DSCPs?
Gorry Fairhurst (as author):
  The purpose of publishing this is to ensure people know about the change.  
  I do not think existing use is harmful, but there is a change, people
  using local use DSCPs should already be aware that they may be 
  used by others. This will be called out to other WGs in IETF LC.

Spencer Dawkins:
  If those networks use the local DSCP, don't they need to just shoot 
  themselves in the foot when they use this LE PHB?
Gorry Fairhurst (as author):
  They should not if they adhere to DiffServ specs.
David Black:
  DiffServ behaves best with complete configuration at the DiffServ perimeter.
Spencer Dawkins: 
  I look forward to the shepherd writeup, that should call attention to 
  any possible impacts.
Roland Bless:
  This should not be a big issue, as there was an indication these DSCPs 
  (xxxx01) were already marked as possible use for future standards action.
  Local remapping is always possible.
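
Note (illustrative, not presented at the meeting): a minimal Python sketch of
the three DSCP pools defined in RFC 2474, to show why a codepoint of the form
xxxx01 (such as the 000001 value discussed under item 3.2) falls in the pool
affected by this draft. The function names are hypothetical and chosen only
for this example.

```python
LE_DSCP = 0b000001  # codepoint discussed for the LE PHB under item 3.2


def dscp_pool(dscp: int) -> int:
    """Return the RFC 2474 pool number (1, 2 or 3) for a 6-bit DSCP."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    if dscp & 0b1 == 0:
        return 1  # pattern xxxxx0
    if dscp & 0b11 == 0b11:
        return 2  # pattern xxxx11
    return 3      # pattern xxxx01


def dscp_to_ds_byte(dscp: int, ecn: int = 0) -> int:
    """Place the DSCP in the upper six bits of the DS byte; ECN uses the lower two."""
    if not 0 <= dscp <= 0x3F or not 0 <= ecn <= 0x3:
        raise ValueError("DSCP is 6 bits and ECN is 2 bits")
    return (dscp << 2) | ecn
```

For example, dscp_pool(LE_DSCP) reports pool 3, while the EF codepoint
(101110) is in pool 1.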

3.2 Roland Bless: Lower Effort PHB 

David Black:
  (Asks for any final feedback on the decision to allocate the DSCP).
   I see no objection to use of codepoint 000001 as the recommendation 
   here for the LE PHB (to be confirmed on list).
Bob Briscoe:
  The text on the use of LE traffic by different styles of congestion
  control currently says you don't want to harm people, and that you
  shouldn't harm others' traffic.
  I think this text should describe what happens if there would be harm.
Gorry Fairhurst (from the floor):
  I prefer an approach that says SHOULD use a less than best effort
  congestion control (e.g., LEDBAT) and explain why this is really desirable.
  That is, if you don't, there may  be unwanted or unexpected interaction 
  with other traffic. That would be better guidance. I (personally)
  would be reluctant to use MUST, but really I would like to see people do this.
David Black (Floor Mic):
  From an operator perspective SHOULD vs MUST is irrelevant, an operator 
  has to defend the service regardless of which words are in RFCs. I think
  the text proposal by Gorry sounds OK.
Bob Briscoe:
  The text just doesn't make sense in the current wording: whether LE traffic
  is going to do harm should be written clearly. The important thing is to
  state clearly under what conditions harm can result.
David Black: 
  The operator concern is not for a singular flow, but for the aggregate.
Gorry Fairhurst (as a chair):
  Bob, please send example text to the list, to help find better wording.
  Bob will send text about whether use of a less than best effort transport
  (e.g. LEDBAT) ought to be a MUST or SHOULD requirement for use of the 
  LE PHB, and David will add operator perspective on this.
Roland Bless: 
  We will discuss this on the mailing list. Do we need extra text on tunnels?
David Black: 
  A reference to RFC2983 will help.
  We recommend not updating the document about 802.11 mapping, it is pretty 
  clear how this new DSCP maps to WiFi.
Roland Bless:
  The ID with guidance on WebRTC (draft-ietf-tsvwg-rtcweb-qos-18, in RFC-Ed 
  queue) needs to be updated. This ID discussed CS1 and that advice will 
  now become obsolete.

Spencer Dawkins (as AD):
  It is possible to update a draft in the RFC Editor queue so that the
  RFC reflects the new recommended DSCP.
  I will drop a note to RFC Editor, and ask how best to do this.
Gorry Fairhurst:
  The Chairs should let WEBRTC people know of the updated consensus here.  
  We will let the IETF and W3C RTCWEB/WEBRTC groups know 
  and recommend the DSCP change. (Please cc tsvwg.)

3.3 Anais Finzi: Priority Switching Scheduler

(After discussion with the authors this is headed to the Independent 
Submission Editor, with inputs from TSVWG.)

The scheduled talk did not occur in Singapore.  The goal is to make the 
AF Class more predictable.  This scheme changes the priority of the traffic 
based on  credit thresholds.

Roland Bless:
  Is EF not impacted? I am not quite sure about that conclusion.
  EF has a definition of an error term.
Anais Finzi:
  In our simulations we did not see any impact.
  The maximum impact is at the maximum frame size.
Roland Bless:
  I will check the paper, thanks.
David Black:
  Roland volunteered as a reviewer of the draft.
Ruediger Geib:
  This proposal would remove the req'd policer on the EF class.
  Would this conflict with RFC3246?
  So far, the IETF did not specify any schedulers. Is this Standard Track?
David Black:
  No. This is heading for an Independent Submission.
Ruediger Geib:
  I am not sure if I want to have that scheduler on a 100G link.
  I can review the draft/paper, but I am not an expert on scheduling.
Roland Bless:
  Easy to configure? To set these 3 parameters...
  Their paper shows how to translate the current config into the 
  corresponding set of settings in PSS. This can be simplified by 
  removing some, but not optimal.

Roland Bless and Ruediger Geib agreed to review the draft, and David will 
send an announcement to the list requesting other reviews.

3.4 Tom Jones: Datagram Packetization Layer Path MTU Discovery

David Black:
  Have you looked at the draft in INTAREA on ICMP PTB signals?
Tom Jones: 
  Yes, we have read and sent a few comments, this does not conflict.
Michael Tuexen:
   SCTP can do verification on the VTAG, not just the 5-tuple.

Matt Mathis:
  I will read this draft. The issue of authenticating messages
  from the network is much broader than just PMTUD. It is good
  to check if there are new solutions. In the absence of authentication,
  I agree that messages are just advisory. You have to be able to
  tolerate bad cases (e.g. byte-swapped lengths).
Gorry Fairhurst (as author):
  Are you saying there can be an advertised link MTU much smaller than
  the actual path MTU? 
Matt Mathis: 
  There is in principle a DoS attack trying to reduce the PMTU.
  We did see cases where the two MTU bytes were swapped.
  The intent was to facilitate Jumbo discovery - but this moved the
  problem down to the design of NICs and switch buffer carving.

Magnus Westerlund:
  It is very important to make verifications robust.
  I will review this draft.
Michael Tuexen:
  When working with SCTP we found a middlebox (at my home) that just gave 
  some random number, instead of the VTAG!
Eric (Akamai):
  We have been running into a wide range of MTU issues. 
  This is a nice optimization. How does it interact with load balancers?
  This is important work, we need to consider how PTB gets mapped back
  to the source. I also saw NATs and at least one TCP optimiser that did
  strange things.
Tom Jones:
  The method only takes PTB as advisory, we plan in the next revision
  to add a probe method to verify if the actual PMTU is larger than the
  advertised link MTU.
  Please tell us about any strange behaviours - that would be great input.
Gorry Fairhurst (as author):
  Load balancing (i.e., use of more than one PMTU) was one motivation 
  for making it robust - we want to get this part right. For this, we
  need some tales of what happens in the wild - please talk to us?
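
Note (illustrative, not presented at the meeting): a minimal Python sketch of
the two ideas in this exchange, namely treating a PTB message as advisory and
confirming the usable size with probes. It is not the algorithm in the draft.
The binary search, the 1280-byte floor, and the names accept_ptb, search_pmtu
and probe are assumptions made for this example. The probe callback stands in
for sending a probe packet of the given size and reporting whether it was
acknowledged.

```python
from typing import Callable, Optional

MIN_PMTU = 1280  # assumption: use the IPv6 minimum link MTU as a floor


def accept_ptb(reported_mtu: int, current_pmtu: int,
               floor: int = MIN_PMTU) -> Optional[int]:
    """Treat a PTB report as advisory.

    Return a candidate PMTU to verify by probing, or None when the report
    is implausible (below the floor, or not smaller than the current PMTU,
    as a byte-swapped length field could be).
    """
    if reported_mtu < floor or reported_mtu >= current_pmtu:
        return None
    return reported_mtu


def search_pmtu(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Binary search for the largest size in [low, high] that probe() confirms.

    Assumes `low` is already known to work and that probe results are
    monotonic (if a size works, every smaller size works).
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # probe acknowledged: size is usable
        else:
            high = mid - 1   # probe lost: assume size is too large
    return low
```

For example, a report of 56325 (1500 with its two bytes swapped) is ignored
when the current PMTU is 1500, and a search between 1280 and 9000 over a path
that carries at most 1500 bytes settles on 1500.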

David Black:
  How does this work relate to the transport work in QUIC?
Spencer Dawkins (as AD):
  I hope we can make progress on using this in QUIC, although the starting
  point may be for QUIC to pick a PMTU number and just fallback if this
  fails to be supported by the path.
  We know PMTU using ICMP is broken. Anything that is doing this better 
  seems like a good thing.
Tom Jones:
  We plan to have the algorithm finished for Montreal, and WGLC by December.
Lars Eggert (as QUIC Chair):
  Most current QUIC implementations do something very simple, 
  a basic test/fallback. Having a functional PMTUD is not strictly required. 
  If this scheme is not too complex, some implementers might try this. 
  After we do a v1 of the QUIC spec, we will do v1.1 shortly after. 
  Speaking as an individual this mechanism might go into QUIC v1.1.
Gorry Fairhurst (as author): Actually the full algorithm has many states,
  but once we have this, we can profile a simpler algorithm for 
  applications that do not need the full method.
Tom Jones:
  There are two parts needed: First, we need some small paragraphs in
  this TSVWG ID to describe requirements and features of each protocol.
  This should be straightforward for the use of QUIC, there is some text
  there already. Review by a QUIC subject matter expert would be great. 
  This would make sure nothing inconsistent is in this draft.
  - We can also help with the actual text for how to implement this
  in a QUIC transport - and that could be in a QUIC WG draft.
David Black:
  I prefer QUIC to be upwards compatible with this.

Michael Abramson:
  As someone who tried to run Jumbo frames, my conclusion is we all need 
  this. Network operators should make sure big PMTU works.
  Should this become a BCP for all protocols? Saying we ought to do this? 
David Black:
  BCP is plausible with much more experience. Qualitatively useful, so 
  BCP is the correct end-goal. OS should not ship without some 
  PMTU support.
  In BSD, the black-hole detection code is not currently good.
Michael A:
  I also want some telemetry to tell me this has gone active, to help me
  for troubleshooting.
  An entry in the syslog? netstat counter?
Michael A: 
  Yes, that sort of thing.
Spencer Dawkins:
  A BCP - implies more running code, more experience is needed.
Eric (Akamai):
  There is currently a risk that we do not get to a state that is good 
  and so the endpoint remains in a very bad state, just like clamping MSS 
  low enough that it is bound to work, but way smaller than optimal.
  If we do not add this as a default, then jumbo frames are never going 
  to happen. The Linux black-hole detector also needed improvement
  when I looked a year ago.

Michael A:
  I do want Blackhole detection and logging most.
Michael Tuexen:
  There are differences when using PLPMTUD with TCP. TCP uses probe packets 
  that carry user data, this changes retransmission and the CC state. TCP
  PLPMTUD is much more complex. In contrast, this datagram version here
  has the assumption this is done without using user data, so probe loss 
  is not something that impacts CC or retransmission/repair logic.

3.5 Bob Briscoe: ECN transport 

(This draft is now ready for WGLC.
It will be WGLC'd with the second encaps draft)

Tom Herbert:
  Are there any IETF shims that got it right?
  All have their own issues. If you do it when you first design it, 
  it is easy.
Bob (points to table slide).
Philip Eardley:
  Let's make this a management problem. My view ECN is an inter-operator 

3.6 Bob Briscoe: ECN & L4S 

(Bob presented slides on L4S. No comments or questions from the WG).
(Bob presented a new individual draft on DiffServ and L4S.)
  There were no comments or questions from the WG. This draft will be 
  discussed in Montreal, to see if there was interest in this work.

4. Other presentations / New Work

4.1 Paul Congdon: Congestion Isolation in IEEE 802.1

Paul outlined proposed work in IEEE 802 on providing layer 2 congestion
support to switches and the possible interactions with other methods.
The IEEE will be making a possible decision in July.

Michael A:
  How does this relate to routers?
  How often is the xon/xoff transition?
Paul : This is basically propagation time, and a threshold.
Michael A:
  Can the hardware actually scale to high speed links and support back to 
  back frames of the order of 1 microsec?
  Is it feasible to xon/xoff of 10-12 packets?
  We think hardware can provide solutions.
Pat Thaler:
  When the xoff is received, there might not be a packet to be sent. 
  Most implementations set xoff threshold to maximum and count on xon.
  Send xoff on high thresh, and watermark at low threshold send xon.
  That is not the way it's generally deployed.
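
Note (illustrative, not presented at the meeting): a minimal Python sketch of
the two-threshold scheme mentioned above, where xoff is sent when the queue
reaches a high threshold and xon when it drains to a low watermark. As noted
in the discussion, this is not how it is generally deployed. The class name,
the thresholds and the use of a packet count as queue depth are assumptions
made for this example.

```python
from typing import Optional


class PauseController:
    """Emit "xoff" at the high threshold and "xon" at the low watermark."""

    def __init__(self, high: int, low: int) -> None:
        if not 0 <= low < high:
            raise ValueError("require 0 <= low < high")
        self.high = high
        self.low = low
        self.paused = False

    def update(self, depth: int) -> Optional[str]:
        """Report the signal to send for the current queue depth, if any."""
        if not self.paused and depth >= self.high:
            self.paused = True
            return "xoff"
        if self.paused and depth <= self.low:
            self.paused = False
            return "xon"
        return None  # no state change, so no signal is sent
```

The gap between the two thresholds provides hysteresis, so a queue hovering
near a single threshold does not generate a signal for every packet.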
Bob Briscoe:
  What is the trust model?
  This targets a single admin domain, probably a data centre. 
  One administrator. 
  Have you considered virtual queues, instead of thresholds for triggering?
  A virtual queue slows you down before you have to buffer. I know 
  Broadcom/Cisco chipsets can utilise this.

Sowmini Varadhan:
  We do not like PFC, ECN mostly works. If E2E ECN works, this is not that 
Mirja Kuehlewind:
  Congestion comes up if multiple flows share the same queue.
  How does the switch get to the bad flow?
  Same way as an ECN mark for the offending flows. 
  Probabilistically the signal should reach the correct source.
  Is there granularity of flow detection, can other flows also get stuck 
  into the same congested queue?
  Yes, we are aware of this.

1750-1810  Beverage Break

1810-1910  Afternoon Session III

5. Transport Protocols and Mechanisms

5.1 Vincent Roca: FEC drafts 

  (These drafts are now ready for WGLC, reviewers are needed.)
  There were no comments on the final drafts.

  The chairs will look for volunteer reviewers, and we will cross-post 
  review requests to the IRTF NWCRG.

5.2 Joe Touch: UDP Options (proxy by Gorry Fairhurst)

Tom Jones:
  Is there going to be more text to explain the options?
  This is the January draft. 
  Joe needs to continue the discussion on the list and revise draft 
  following the presentation.

5.3 Tom Jones: UDP Options Implementation 

 (Tom Jones presented a view that differs from Joe's view of the way forward,
  especially a different way to handle the checksum option.) 

  We want to complete implementation for June.
Gorry Fairhurst (based on email from Joe):
  Joe Touch is willing to look at different sizes of checksums, should 
  the TSVWG think that is useful and considers efficiency.
Tom Jones: 
  A 16 bit checksum is one less line of code, less code, and standard
  algorithms are good.
  Please follow-up with Joe on the list.
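
Note (illustrative, not presented at the meeting): a minimal Python sketch of
a standard 16-bit checksum. The minutes do not say which algorithm was meant.
This example assumes the Internet checksum described in RFC 1071 (the one used
by UDP and TCP) and does not reflect how the UDP Options draft defines or
places its checksum.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement checksum of `data` (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # sum big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF
```

A receiver can verify by running the same function over the data with the
checksum appended; the result is zero when nothing was corrupted.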

5.4 Michael Tuexen: RFC4960 Errata 

  (This talk presents updates following WGLC.)
Gorry Fairhurst: 
   You note a change to the CRC32c definition. This is often referred 
   to by other groups. What changed?
Michael Tuexen: 
   This change is to fix code typos by changing definition to make it 
   compile on all platforms, it is not an algorithmic change to the CRC32c 

  There were no further comments from the room. People are encouraged
  to check the corrections and report any issues to the list.

  The current version can now continue to AD review and IESG LC.
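
Note (illustrative, not presented at the meeting): a minimal bitwise Python
sketch of the CRC32c algorithm (Castagnoli polynomial), for readers who want
to see what is being computed. It is not the code from RFC 4960 or from the
errata draft, and it does not cover how SCTP places the result in a packet.
The function name is an assumption made for this example.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # bit-reflected form of 0x1EDC6F41


def crc32c(data: bytes) -> int:
    """Return the CRC32c of `data`, computed one bit at a time."""
    crc = 0xFFFFFFFF  # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # final inversion
```

The widely published check value for the ASCII string "123456789" is
0xE3069283, which this sketch reproduces. Table-driven or hardware-assisted
implementations compute the same value faster.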

5.5 Michael Tuexen: SCTP NAT 

  There were no comments in the room. The authors have noted the new
  milestone and expect to work on this ID for the Montreal IETF meeting.

5.6 Gorry Fairhurst/Colin Perkins: Impact of Transport Header Encryption 

Matt Mathis:
  Are you interested in speculation about what might happen?
  Could we imagine a standardised generic transport-agnostic header, 
  that is end-to-end and indicates embedded bytes across multiple transports?
  This would be useful for debugging and a partial solution to some issues.
Gorry Fairhurst: 
  Who would turn this on?  
Matt Mathis:
  It would be good for debugging applications and not so much
  use for some other places.
David Black:
  OAM that is not in the flow may not be representative.
Matt Mathis:
  There are also issues relating to equipment that the stakeholders
  do not wish to tell people about - this makes it tricky.
  It may seem that some tools are duplicated, or other ways can be    
  used to measure the network path, but people do need to have multiple 
  tools to check the answer to the same question. This is needed so they 
  can be sure what is actually happening.
  In-band OAM might be interesting as one of these approaches.
Gorry Fairhurst: 
  This seems like something that is already possible in some networks,
  but I agree it is really interesting to find out more about, please
  tell us about this as ID editors.
Matt Mathis:
  OK, I think there could be a generic instrumentation shim. This could
  be pervasive where it needs to be.
Gorry Fairhurst: 
  Generic tools are much better than version-specific tools that need updates.
Hannes Tschofennig: 
  How much feedback have you had from operators and equipment manufacturers?
  I worked with operations and insight was often very hard to come by.
  On a different topic, tools based on machine learning are even more
  demanding. How can we get good feedback?
Gorry Fairhurst:
  We received good feedback from many operators off-list. A key problem is
  knowing who needs what operational information. Very often transport
  information is used by people to debug equipment/configuration issues -  
  to understand anomalies/health etc, and organisations as a whole do not 
  care about content, only specific people need this.
  We are still looking for more people to provide feedback.
Hannes Tschofennig: 
  DDOS mitigation techniques are important, some rely on machine learning.
  Some approaches suffer more from encryption than others.
Gorry Fairhurst:  
Colin Perkins (author):
  I think this is an interesting discussion. It would be wonderful to have 
  a measurement shim, but that may not be something we currently have
  much experience with. We would be interested in looking at different
  approaches, but that would be a different draft.
Brian Trammell: 
  The IPPM WG did publish an instrumentation shim, for replacing timestamp 
  and IP ID in IPv4 networks. This use is not applicable in public Internet.
  This can be better than passive TCP monitoring.
  Some discussions in QUIC will drive us on different designs in this space. 
  As a framing document, this ID should be adopted.
Al Morton:
  This provides a good view of what encryption has changed from the transport 
  side. This is a useful scope. When you say encryption has costs, adding 
  shims definitely adds very real costs.
  It would be useful to quantify somehow the complexities that are added here,
  and what this gives us. I am happy to read the next revision of the draft.

David Black:
  How many have read this? (A fair number.)
  I ask people to read this and we will review the adoption next meeting.
  (+1 read this, via jabber)

Brian Trammell: 
  I have a suggestion that we should aim at consensus on this one.
  By scoping to the transport layer, we may get around to this and
  we can look at both sides of the coin.
  Adoption now, or in Montreal - I would be happy with either.
David Black:
  These problems are not going away, we will be discussing in Montreal.
Colin Perkins (author):
  Things are not going to get better. QUIC is going to be deployed, 
  and is using this kind of header, so there is some time constraint.
  At this time the bits that are exposed in QUIC are still changing, it is 
  not fully stabilized yet, we need to understand these tradeoffs.

End of meeting.