High Performance Wide Area Network (HPWAN) BoF
13:00 - 15:00 UTC, Monday Session II, IETF-121 Dublin
The Auditorium
Chairs: Tim Chown and Gorry Fairhurst

1. Chairs: Introduction. Note Well, Agenda

No Agenda changes.

2. Chairs: Goals of the BoF & definition of HP WAN

No objection to the HPWAN definition.

3. State of the Art Congestion Control and Related Technologies (Michael Welzl)

Daniel Huang: What feedback from the network would be needed to help the application modify its windowing, to avoid congestion?
Michael: There is a variety of mechanisms available.
Daniel: Maybe some quantitative data, such as BW?
Michael: This is difficult in the public Internet.
[Chairs] We can discuss this further (after the initial presentations).
Erik Nygren from chat: How well does DCTCP (or L4S) work in HP-WAN contexts?
Eduard V from chat: BBRv2 is almost half of the Internet traffic now. It is strange to talk about RENO and CUBIC these days. DCTCP assumes around 20us RTT. It is for DC. L4S is a queuing that permits a wide set of CCs.

4. High-Volume Content Mover (Michał Zasadziński)

Gorry: You mention latency sensitive, can you expand?
Michal: These are very large files, but the expectation is minutes and the data is not continuous/streamed.
Jana: Does the platform use BBR?
Michal: Yes.
Jana: Does it also use QoS?
Michal: Yes, these transfers are assigned the lowest QoS tier.
Jordi: When you talk about fairness, is it proportional fairness or max-min fairness? Anything you can say about that?
Michal: Yeah, so it's basically proportional fairness, but we're doing maximization of the throughput. You can look at the paper or the slides; there are two experiments that explain our approach to fairness.
Jordi: And do you do enforcement when entering the network, some kind of traffic shaping based on bandwidth?
Michal: Yes, there is a Google Enforcer that checks and then allows the data transfer.
Spencer Dawkins from chat: Michael's "Run it for an hour, send a terabyte, you're good" is the best quote I've heard about high speed networking in a while … I'm intrigued by the mention of latency sensitive transfers in Effingo. Is that just trying to keep copies in sync as much as possible?

5. R&E Operator (Tim Chown)

Rodney: What do the X-axis and Y-axis mean in the slide on page seven?
Tim: The Y-axis is the number of flows of that size within the band, and the X-axis is the data rate in bytes per second. Multiply it by 8 to get the bits per second. It shows how many flows on the Y-axis there are over the sample period with different data rates.
Daniel: Are the CERN traffic requirements "predictable"?
Tim: Yes. Some experiments are planned, but not all.
China Mobile: With the comparison (London to CERN), BBRv3.
Jana: You mention the application, do you use specific metrics for optimisation?
Tim: Typically, there is no urgency to move the data, but there may be some types of data that might need to be prioritised.
Jana: Do you use any QoS mechanisms?
Tim: You can. Currently on Janet in the UK we don't, but some NRENs might treat traffic in an LHCONE L3VPN differently.
Weiqiang: Do you have examples/applications that use multicast?
Tim: No, but there may be some cases where a number of universities wish to receive copies of the same data. Maybe we can talk about the applicability of multicast during the Q/A?
Lixia Zhang from chat: A clarification question about CERN traffic volume: does everyone fetch data directly from the CERN site? (i.e. unicast point-to-point delivery to clients).
Kyle Rose: This is a histogram by throughput.
Tim Chown: Tier 1s all pull from CERN and they serve data to Tier 2s. Tier 2s don't pull from CERN.
Christian Kuhtz: How common is it for multiple sites to want the same data at the same time? What is multiple? 1-2? Dozens?
Tim Chown: I don't know, but it leads to the multicast question.
There are 4 major experiments, 2 of which have bigger data requirements, so there's likely some level of common data used.
Dale Carder: I don't think the amount of file replication is that high. Some of the dominant traffic flows are tape archival and data processing/reprocessing.

6. Public Operator with HP-WAN Applications (Kehan Yao)

John: Is iWARP a reliable protocol?
Kehan: Yes. RoCEv2 is not reliable and needs some upgrades, such as reliability and lossless operation, but it also needs to be lightweight.
John: I'll follow up offline.
Gorry: Do you refer to RDMA over WAN? What is the need: is it RoCEv2 or RDMA over WAN?
Kehan: RDMA is not an IETF protocol, so we need to see if we can work on it.
Christian Kuhtz from chat: Rather than calling it RDMA over WAN, maybe calling it RoCEv2 over WAN may make more sense to bound the scenario?
Eduard V from chat: It looks like a very typical congestion control discussion, not much different from hundreds of other CC solutions. All CCs benefit from the additional telemetry.

7. Open Technical Issues (Daniel Huang)

Gorry: What does coordination mean? Is this an RSVP-type signalling, reserving capacity for flows?
Daniel: Not RSVP, more like providing feedback (signals) from the network.
Peng Liu: We need to slim down the use cases.
Gorry: Do you have a specific use case you would like to work on?
yixinxin from chat: The congestion feedback is also determined by the flight time. How do you speed up the congestion signal with long distance transmission?

8. Open Discussion

Gorry: Have we missed a specific topic from HPWAN not discussed?
Jana: Which space are we in? WIT, Routing? What is the thrust? What is the IETF work that needs to be done?
Gorry: The reason for the BoF is to facilitate a discussion. Clearly there seems to be support for investigating this topic to improve mechanisms, and we could also consider emerging use cases.
Tim: In my view (as an R&E network operator) we want to improve our ability to support high volume data services for our Janet-connected universities and research organisations.
Zahed: As shepherd AD, the goal of the BoF is to focus on transport issues. Can we identify the transport protocol (that is IETF) that we can work on?
Jana:
Adrian: Something Daniel mentioned in his slides relates to "long distances", which translates to high e-2-e latency, so a larger MTU might not be a significant benefit. The transmission latency might also be a part of the problem scope.
Tim: [slide WLCG map] shows there's a west coast US Tier 1, so some services will have an RTT of over 100ms from CERN.
Brian: A common theme across the presentations is bulk transfer, and there is also application-layer scheduling, which is not typically part of Transport. My suggestion is to focus on the interfaces between layers to achieve this.
Tim: We also have cases where we have smaller transfers, but many of them.
Brian: Yes, but choose one to help focus, otherwise we boil the ocean.
Weiqiang: Some transport protocols that may be applicable are not IETF (such as RDMA). Can we still work on these?
Gorry: Yes, that might be possible, with the permission of whoever defines the specifications and provided the appropriate stakeholders contribute to the IETF.
Weiqiang?: Another question: maybe we need some coordination between transport and routing layers for these HPWAN services.
Gorry: Maybe, but in this BoF we try to narrow the scope to discuss transport, which includes signals to/from the network layer.
Rodney: I work on a Japanese research network called ARENA PAC. I will drop an email to the mailing list.
Jana: I think another aspect we might want to consider is multi-path. It was not a major requirement but it was mentioned. I think IBM have a technique ???, but it may be something to look at.
There do seem to be gaps, and the fact that there are multiple solutions being developed feels to me like a problem exists and some collaboration would be good.
Matt: I want to echo some points Jana mentioned. I've been working on these problems for 34 years. This BoF is focusing some attention on the challenges. The IETF is multi-threaded and there may be multiple mechanisms required. One thing we might need is support for MTU discovery. Congestion control fairness is something we have neglected.

9. Chairs: Conclusions and Next Steps

• Chairs: Do people think this is a topic where the IETF can make a contribution? (Yes: 49, No: 11, No Opinion: 18)
• Chairs: Is this something where you would contribute or review? (Yes: 36)
• Chairs: Is there a transport-related issue the IETF should work on? (Yes: 55, No: 9)
• Chairs: Can the transport work be done in existing groups (ICCRG, TSVWG, CCWG, etc.)? (Yes: 38, No: 10)