TCPM meeting at IETF-103

Meeting    : IETF103, Monday Nov 5, 2018, 16:10 - 18:10
Location   : Meeting 1
Chairs     : Michael Scharf
             Michael Tüxen
             Yoshifumi Nishida
AD         : Mirja Kühlewind
URL        : http://tools.ietf.org/wg/tcpm/
Note Taker : Gorry Fairhurst, Richard Scheffenegger

--------------------------------------------------------------------------
1: WG Updates - Chairs

One document had been passed to the RFC Editor: draft-tcpm-alternativebackoff-ecn.
RTO Consider draft. There was discussion about other RFCs referenced.
Other drafts are progressing.

--------------------------------------------------------------------------
2: TCP Usage Guidance in the Internet of Things - Carles Gomez

Gorry Fairhurst: The ECN case combined with the single MSS case seems like it needs a little explanation, and the ECN text may also need to be updated.
Carles: Not only for 1 MSS session; it also covers more complete implementations.
David Black: MD5 with TCP is obsolete crypto. Be careful not to use it as an example unless you very carefully explain why this is a special case for these networks. Better to leave it out and use a currently recommended cryptographic hash.
Gorry Fairhurst: I will review the document and I think I can offer some new words in places. I think an updated draft will then be ready for a WGLC.

--------------------------------------------------------------------------
3: ECN++: Adding ECN to TCP Control Packets - Bob Briscoe

Richard Scheffenegger: There is a private DCTCP implementation with ECT on the SYN. Does this draft rule that out?
Bob: No, that is in a private network; the IETF deals with the Internet as a whole. In separate networks you can do things like this.
Mirja Kuehlewind: I understand the problem of having dependencies on other drafts. On the SYN case, I think you must do the AccECN and generalized experiments together because AccECN needs to be negotiated.
Bob: For pure ACKs, you depend on the negotiation; for the SYN, you don't.
Some implementers may not want to implement AccECN yet because there are fallback implications with middleboxes.
Mirja: The experiment would be safer if aligned to AccECN.
Bob: ECN on the SYN is only in the AccECN case.
Mirja: OK, I am happy.
Gorry: What does the ACK ECN response mean?
Markku: I am not clear on that either. Are you talking about the sender or the client?
Bob: The sender of the pure ACK.
Mirja: An example is when you have sent data and have a larger cwnd, and are now sending ACKs; it then lets you reduce the cwnd appropriately.
Mirja: What was the problem with using the standard CC reaction?
Bob: You would be very unfair to yourself, as you massively reduce cwnd even when you are not sending. So you have much lower bandwidth when you start sending.
Magnus: Does this require more accounting?
Bob: This all depends on the specific CC algorithm.
Magnus: How does this apply to QUIC?
Bob: You will have to come up with your own scheme.
Gorry: This looks a little like the CWV considerations.
Bob: Yes, a little like that - the text says something about it. The details need to be worked out in a congestion control draft.
Mirja: This is close to a DCTCP-style adaptation. You could use something like CWV; you may need to do the experiment to figure it out.
Markku: For the third case (ACK-only), what do you do?
Bob: That could be ACK-CC. Like ACK-thinning.
Markku: It would only react to case 3. I fear it can become complicated in some scenarios. Is it too complicated to optimize?
Bob: ACK-ECN is required for our ACK with ECT; we should be as simple as possible.
Mirja: There was a case where a path set every packet to CE. I am surprised about the fallback for ECT(0), which seems to be wrong.
Bob: Yes.
M Abramson: Is this checking of the ECT in the server mentioned in RFC 3168?
Bob: No, the spec does not say what a server should do when it sees ECT on a SYN.
Mirja: If we get the OS code fixed to match RFC 3168, we should go for sending on the SYN.
David B: We should refine the text on RFC 8311.
RFC 8311 permits this experiment. The draft can now specify this. Please delete "follows from RFC 8311", justify the specific change, and just assert exactly what the experiment needs.
Gorry: +1. There are two parts, and this was the approach we tried to take in ABE. First say RFC 8311 permits this experiment. Then declare your experiment.
Bob: OK. Let's take it offline.
Yoshi: The draft will need to be updated.
Mirja: I suggest the text describing the response to CE-marking on a pure ACK could (also) be placed in the "experiment considerations" section.

--------------------------------------------------------------------------
4: RACK+L4S: A Way to Remove Head-of-Line Blocking from Links - Bob Briscoe

Gorry: Actually, if you say reordering is permitted, it's not just about reordering on the forwarding plane (e.g. parallelism in processing). It could be about how you map the flows to the multiple forwarding possibilities, and about retransmission over paths that happen to exhibit very different forwarding delay. So once you allow reordering, there could be very large reordering. The reordering options have a cost in performance (latency or throughput), processing/hardware cost, and utilization.
Yoshifumi: Note that RACK uses DupACKs as well as timers in the algorithm.
M Abramson: I think you will find many application developers using UDP will assume in-order delivery.
Christian: There is a lot crappy in the network, because there is no more
Richard: If the DupACK threshold is adaptive, will it be similar to this approach?
Bob: I don't think so. RTT is shared info.
Gorry: The DupACK at the start in RACK actually makes it quite safe, because at the start you may not have a known RTT, and even the RTT can change - e.g., on a resource-managed link that is capacity limited, where the actual RTT changes as you load the network.
M Abramson: This would be a fundamental change to the way the Internet works.
It cannot be a quick change; it needs to be done across the board for a class of traffic and takes time to deploy.
Bob: This gives a clean slate for new designs.
N Kuhn: There can be delayed ACKs. There are also cases of radio links (e.g. LEO satellite) that need to be considered, especially since RACK is delay-based. Fragmenting across layer 2 and using layer 2 retransmissions can be smaller and much more efficient.

--------------------------------------------------------------------------
Meeting    : IETF103, Tuesday Nov 6, 2018, 11:20 - 12:20
Location   :
Chairs     : Michael Scharf
             Michael Tüxen
             Yoshifumi Nishida
AD         : Mirja Kühlewind
URL        : http://tools.ietf.org/wg/tcpm/
Note Taker : David Black

1: Comparison between congestion control/loss recovery in QUIC and TCP - Jana Iyengar, Ian Swett

QUIC congestion control is effectively NewReno, with no major differences. This session is mostly about loss recovery.

QUIC packet numbers: Monotonically increasing 62-bit numbers that do *not* indicate delivery order. These are *not* TCP-like sequence numbers!

ACK frame carries the largest ACKed packet number, one or more ACK ranges, ACK delay, and ECN counts. ACK ranges provide SACK-like functionality. The largest ACKed packet number is not included in any ACK range.
QUIC uses the largest ACKed packet number (does not ACK all prior packets), which differs from TCP's cumulative ACK point (ACKs all prior packets).

-- Fast Retransmit
A lost packet is not retransmitted; it is instead resent with the next packet number.
FACK is a minor enhancement to Fast Retransmit. It helps when ACKs are dropped and enables the receiver to not send an ACK on every packet receipt after a gap occurs.
Discussion of sender time window before retransmitting.

-- Early Retransmit
Retransmit when a gap is first reported, *after* an RTT-based timeout expires. This backstops the usual 3-dupACK mechanism when the send queue is short (can't send enough packets to get 3 dupACKs back).

-- RTT and Timeouts
RTT samples exclude receiver-reported ACK delay.
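
As a rough illustration of the RTT point, the sketch below removes the receiver-reported ACK delay from a raw RTT measurement. It is a minimal sketch and not text from the QUIC recovery draft: the function name, the parameter names, and the guard that keeps the raw sample when the adjusted value would fall below the smallest RTT seen so far are all assumptions made for illustration.

```python
def adjusted_rtt_sample(send_time_ms, ack_receive_time_ms, ack_delay_ms, min_rtt_ms):
    """Return an RTT sample (in ms) with the receiver-reported ACK delay removed.

    All names and the min_rtt guard are illustrative assumptions,
    not taken from the minutes.
    """
    latest_rtt = ack_receive_time_ms - send_time_ms
    # Subtract the ACK delay only if the result stays at or above the
    # smallest RTT observed so far; otherwise keep the raw sample.
    if latest_rtt - ack_delay_ms >= min_rtt_ms:
        return latest_rtt - ack_delay_ms
    return latest_rtt


# Packet sent at t=1000 ms, ACK received at t=1130 ms, and the receiver
# reports that it held the ACK back for 25 ms.
print(adjusted_rtt_sample(1000, 1130, 25, min_rtt_ms=100))
```

The guard is one way to avoid an implausibly small sample when the receiver over-reports its delay; other policies are possible.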
-- TLP (Tail Loss Probe)
Quickly retransmit when the last packet in a set/train has been lost; the timeout is much smaller than the usual retransmit timeout. Unlike the TCP TLP proposal (expired draft), QUIC does two TLPs before falling back.

-- Spurious RTO Detection
QUIC does not reuse packet numbers, which helps disambiguate whether a spurious RTO occurred.

-- Crypto timeout
Aggressive timeout during setup to avoid losses causing longer delays to handshakes.

--------------- Potential Improvements (not in I-D), possible for QUIC v1 -------------------

Yoshi: Will these be included in V1?
Jana/Ian: Yes. But if some of them require more discussion, we can omit them from V1.

Generate ACKs less frequently than once every 2 packets?
- Drivers and some network nodes (e.g., wireless Access Points) compress TCP ACKs; that can't be done for QUIC due to encryption.
Bob: Does this not only determine the default value, but also include a sender mechanism to control the value?
Jana: Yes.

Remove MinRTO?
- QUIC is in better shape than TCP here due to spurious RTO detection improvements.

Combine TLP and RTO timeouts for simplicity?
- Again, possible benefit from better spurious RTO detection improvements.
- Primary cost (impact) of an undetected spurious RTO is congestion control changes that reduce throughput. This is why QUIC retransmits are cheaper than TCP's.

Stuart Cheshire: Supports unifying these timeouts.
Mirja: TCP has a spurious RTO detection mechanism.
Jana: Yes, but FRTO is not required. That's the problem.
Praveen (from remote): The TCP RACK draft subsumes the expired TCP TLP draft. For generation of ACKs, we already have LRO for TCP, so moving away from two looks fine. For combining the two timers, Microsoft already did similar things. So, it makes sense.
????: How do you unify TLP and RTO? RTO lowers cwnd to min cwnd.
Ian: I don't have a clear answer for that. That's why we bring it up here.
Jana: I think we can do it. We have to try and see.
Stuart Cheshire: We might need to think about the traditional RTO calculation because the units don't match. The mean is in time units and the variance is in time^2 units, which means we might see radical differences in some cases. In the old days this was expensive, but we should now take sqrt(variance) before adding to get the units correct.
Christian Huitema: Use of the term "variance" is inexact. What is being used is the average variation, which is sqrt(2) * standard deviation, hence no unit conflict.
Jana: We won't include this in V1, but discussing a better algorithm for this might be a topic after V1.
Praveen: Adaptive time thresholding for fast retransmit? How to combine both packet and time thresholds for fast retransmit? Windows hasn't implemented early retransmit and hasn't seen a problem, so the Early Retransmit timeout mechanism may not be needed.
Jana: We are discussing how to fix it.
Michael Welzl: Monotonically increasing packet numbers should make ACK congestion control easier (feasible?) for QUIC; it's very difficult for TCP due to non-monotonic sequence numbers.
Yoshi Nishida: TCP is not planning to standardize everything that QUIC is using. What is QUIC's approach to going beyond TCP? How well will QUIC/TCP bottleneck sharing work?
Jana: Trying to adopt common TCP practice, and pick up things that should be standardized. TLP is an example of something that TCP implementations usually do that has not been standardized.
Lars Eggert (QUIC WG chair): OK to continue to improve the QUIC recovery draft even while QUIC drafts are generally "frozen", which will start soon.
Tommy Pauly: Agrees with Lars on improving the QUIC recovery draft during the QUIC "frozen" phase. QUIC has more information than TCP, hence can do more - spurious RTO detection is an important example.
Yoshi: Shall we have another meeting like this?
Jana: Could repeat a session like this at the next IETF, but need mailing list discussion prior to the next IETF (in Prague) in order to get changes into QUIC v1.
Lars Eggert: Want to improve the current QUIC recovery draft for QUIC v1 in the short term. In the longer term, QUIC has more information than TCP, hence can do innovative/better congestion control beyond what is possible in TCP. This is a research area; ICCRG is a good forum to evaluate ideas in this area.
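
The unit question raised by Stuart Cheshire and Christian Huitema, and the "Remove MinRTO?" item, can be illustrated with the standard TCP retransmission timer from RFC 6298. The sketch below is an illustration under stated assumptions, not a proposal from the meeting: the class, method, and parameter names are hypothetical, and the clock granularity term of RFC 6298 is ignored. It shows that RTTVAR is a smoothed mean absolute deviation, so it carries time units just like SRTT, and that the 1-second floor can dominate the result on short paths.

```python
class RtoEstimator:
    """RFC 6298-style RTO estimator, times in seconds.

    Names are hypothetical; the clock granularity term G from
    RFC 6298 is ignored for brevity.
    """

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4           # multiplier applied to RTTVAR
    MIN_RTO = 1.0   # lower bound recommended by RFC 6298

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def on_rtt_sample(self, r):
        """Update SRTT and RTTVAR with a new RTT measurement r."""
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the previous SRTT. It is a
            # smoothed |SRTT - r|, i.e. a deviation in time units, not
            # a variance in time^2 units.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r

    def rto(self, min_rto=MIN_RTO):
        """Return the current RTO; pass min_rto=0.0 to see the unfloored value."""
        return max(min_rto, self.srtt + self.K * self.rttvar)


est = RtoEstimator()
for sample in (0.100, 0.200):
    est.on_rtt_sample(sample)
print(est.rto())             # value with the MinRTO floor applied
print(est.rto(min_rto=0.0))  # value without the floor
```

The `min_rto` parameter exists only to make the effect of the floor visible; it is not part of RFC 6298.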