IETF 115 RTGWG Agenda

Chairs:
Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)

WG Page: http://tools.ietf.org/wg/rtgwg/
Materials: https://datatracker.ietf.org/meeting/115/session/rtgwg

* * *

15:00-16:30 - Wednesday, November 9, Session III

1. Meeting Administrivia and WG Update
Chairs (10 mins)

Stewart: We had a very productive meeting with the authors this afternoon; hopefully this draft will progress soon.

=============================================
WG document Update
=============================================

1. YANG Models for Quality of Service (QoS)
Aseem Choudhary (10 mins)
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-qos-model/

No questions asked. There are still review comments to be addressed.

1. SRv6 Path Egress Protection
Huaimo Chen (10 mins)
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-srv6-egress-protection/

Ketan: The majority of this draft is SPRING and LSR related. You mentioned that it was reviewed in LSR; how about SPRING?
Huaimo: I sent this draft to SPRING for comments a couple of years ago.
Yingzhen: Before LC we will need to get feedback from SPRING.
Jeff: There are a few documents that share WGs. We will try to get SPRING to review this.
Yingzhen: The next presentation is on a similar topic; let's have a discussion after it.

=============================================
Individual draft
=============================================

1. SRv6 Egress Protection in Multi-homed scenario
Weiqiang Cheng & Wenying Jiang (6 mins)
https://datatracker.ietf.org/doc/draft-cheng-rtgwg-srv6-multihome-egress-protection/

Darren Dukes: I haven't seen the PSD SID discussed in SPRING. Did I miss it?
Weiqiang: Not yet. We'll try to present in SPRING.
Darren: Or just send an email and discuss it.
Weiqiang: We will.
Huaimo: I'd like to explain the differences between these two drafts. My understanding is that this draft is trying to provide service protection: on the backup egress node, a backup node needs to be configured. The behavior of the SID should be mirrored or similar to the primary. At the ingress node, the backup SID can be used to achieve protection. Our draft, by contrast, uses a mirror SID and needs no configuration: we mirror the primary node and forward the packet. One focuses on service and the other is just protection.
Weiqiang: These two drafts are different and solve problems at different levels.
Ron Bonica: It is a problem that needs to be solved, but there are a few impediments. First, this needs to be brought to SPRING as well as 6MAN, since it changes the semantics of the SRH. Second, it does change the meaning of fields in the routing header, so how do you maintain compatibility?
Weiqiang: Agreed, we need comments from both SPRING and 6MAN. Regarding the second question, I think it's compatible with the SRH processing rules.
Ron: You changed the next segment and Segments Left entries.
Weiqiang: That's right. We introduced new flavors.
Ron: Finally, how does this interact with C-SID?
Weiqiang: It's compatible with the C-SID draft.
Jeff: This needs more discussion.
Ketan: Question on slide #6, about the penultimate node. It could be several hops away, so how does this work? Assume P1/A1 is the penultimate node. When P2 detects the failure, how does it know the end/overlay node behavior, since it's in the underlay?
Jeff: Please send the questions to the list.

1. SRv6 Midpoint Protection
Huaimo Chen (10 mins)
https://datatracker.ietf.org/doc/draft-chen-rtgwg-srv6-midpoint-protection/

Jeff: Have you confirmed with SPRING and 6MAN that your changes are compliant?
Huaimo: Right now only the endpoint can change the destination.
Yingzhen: It is not clear in the draft whether you are going to change the SRH or use another layer of encapsulation.
Huaimo: Not changing the SRH, only the Segments Left.
When one node fails, the previous endpoint decrements the Segments Left. It's a simple change, not violating RFC 8200.
Yingzhen: If you skip one node, you can't guarantee there is no loop.
Huaimo: This is after IGP convergence; we go along the shortest path. There should be no loop.
Jeff: IGP convergence is slow.
Huaimo: There are two stages. Before the IGP converges, the node adjacent to the failure will do TI-LFA. After convergence, every node knows about the failure, and the endpoint node will bypass the failed node.
Darren: I don't think this change has been discussed. If you're keeping the original list and just decrementing the Segments Left, you will break it. You can't do that.
Huaimo: I'll send a pointer to the discussion thread.
Jeff: Please send another email to SPRING and discuss it before we can consider adoption.
Ketan: Again, the same question as for the previous presentation. You're assuming directly connected nodes, and that is not going to be the case everywhere. I'm wondering how practical it is, or whether it only works with a hop-by-hop path.
Huaimo: There is no assumption that the endpoint is directly connected to the failed node; it can be multiple hops away.
Ketan: It's not clear in the draft.
Huaimo: After the IGP converges, it knows the failed node and hence bypasses it.
Jeff: Please discuss it in SPRING.

1. Generalized IPv6 Tunnel (GIP6)
Qiangzhou Gao (10 mins)
https://datatracker.ietf.org/doc/draft-li-rtgwg-generalized-ipv6-tunnel/

Greg Mirsky: Thank you for sharing your imagination. Are you aware of the work on MPLS actions?
Qiangzhou: No.
Greg: Please check the work in MPLS and DetNet, so as not to duplicate work there.
Qiangzhou: I don't think it's a duplicate.
Greg: Since you're not aware of the work, you can't do a comparison. I don't think what you're proposing is required.
Darren Dukes: More for the WG chairs: this draft is defining all-new encapsulations. Is this the right WG?
Jeff: Usually there are requirements; I'm waiting for feedback from the transport area. The cost of doing this work is high, while the benefits are not that great.
I understand what you're trying to do, but I'm not sure the industry really needs it.
Acee: This is an interesting exercise: you map all encap headers into an IPv6 routing header. This would have been a good research paper, but considering how long these things have evolved and been deployed, this will never see the light of day. With this scope, your encap is trying to map all cases; it's never going to happen.

From chat:
Ketan Talaulikar: Does this GIPv6 tunnel thing not belong to intarea?

1. BGP Blockchain
Mike McBride (10 mins)
https://datatracker.ietf.org/doc/draft-mcbride-rtgwg-bgp-blockchain/

David Lamparter: For blockchain, you typically have proof of work etc. The only thing I see this being useful for is proof of something, such as an address. Are you looking for something like that, or is it still too early?
Mike: Not yet, but good point.
Jeff Haas: Having this as part of the protocol control loop is probably not a good thing. I'd suggest throwing them into the fire to keep warm. BGP is about policy.
Dino: Agreed with Jeff, it's not for the protocol. It could be used for administrative things, such as proof of ownership or registries.
Jeff: We're happy to provide a place for this work, but please continue to work on it and add more meat.

From chat:
Eduard V: Smart Contracts are based on the particular Blockchain platform. Which one is proposed? Why?
Jeffrey Haas: One of my criticisms of trying to use blockchains is that even though they are distributed for the work, the chain itself is effectively centralized. The market for much of what we'd want from a competitor to PKI for BGP resources is further decentralization.

==============================================

1.
SCION Update and Q&A
Nicola Rustignoli (20 mins)

Related drafts at the moment (including some planned ones):
draft-dekater-panrg-scion-overview
draft-rustignoli-panrg-scion-components (recently discussed at panrg and updated)
draft-dekater-scion-pki (PKI specification)
Routing - Control plane specification (to be written)
Forwarding - Data Plane specification (to be written)

Jeff: Thank you for sharing this work with us.

* * *

13:00-15:00 - Thursday, November 10, Session II

====================================================================
\*\* Invited Talk, ACM SIGCOMM 2022 Best Paper Award \*\*
====================================================================

1. Software-defined network assimilation: bridging the last mile towards centralized network configuration management with NAssim
Huangxun Chen (30 mins)
https://dl.acm.org/doi/10.1145/3544216.3544244
https://amyworkspace.github.io/hxchen/files/sigcomm22-nassim.pdf

Acee: Are you saying you use the PDF as input and generate this JSON?
Huangxun: We use HTML documents.
Acee: So the initial inputs are the HTML docs you get from the web?
Huangxun: Yes.
Jeff: You're trying to work with inconsistent CLIs from different vendors; it's the wrong place to start. Look at YANG, it will at least give you some consistent behavior. It's a noble task.
Huangxun: CLI is the entry level for configuring network devices. YANG is more advanced, but it does not fully address multi-vendor networks. Each vendor has its own vendor-specific YANG. Our current project is to work on different vendor-specific YANG models. I agree this field still needs effort to address the heterogeneity.

From chat:
Tony Li: Couldn't this be done with YANG?
Amy: YANG did not fully address the semantic interoperability between multiple vendors due to vendor-specific YANGs.
We think the parsing-validating-mapping methodology could also be applied to YANG. We discuss it more in the discussion section of our paper: https://amyworkspace.github.io/hxchen/files/sigcomm22-nassim.pdf

=============================================
New Routing Architecture Proposal
=============================================

1. Routing on Service Addresses
Dirk Trossen (15 mins)
https://datatracker.ietf.org/doc/draft-trossen-rtgwg-rosa/

Daniel Huang: From your presentation, it seems the service identification/address is globally unique. Does it have to be assigned by a higher management utility? Service discovery is removed from DNS and routing, so it reduces delay.
Dirk: You may rely on domain names, so we rely on existing governance. You can also use your own namespace. Second, we put this in-path: the ingress only deals with the incoming service request, so it's faster than DNS. The performance could be much better.
Roland: Adding application info into the network may bring complexity into the network, so what kind of service/location semantics do you put in a gateway?
Dirk: The only application information is the description of the service identifier. If it is hash-based, the lookup can be fast. The announcement may need additional verification, but it's OK for that not to happen at data speed.
Jeff: Please continue the discussion on the list.

From chat:
Eduard V: DNS has so many extensions. It makes sense to analyze which ones are supported by ROSA and which are not. A comparison to DNS looks mandatory for better understanding.
Feng Yang: Does that mean the SAG needs to learn all of the services advertised?
That will put a big burden on the SAG.
Dirk Trossen 01:00:49: @Eduard fair point to outline which DNS capability we intend to 'replace'.
Dirk Trossen 01:01:53: @Feng no, the SAG utilizes the namespace-specific resolution service, e.g., the DNS; it does not learn all services available (which would indeed be a significant scalability challenge).
Louis Chan 01:04:17: Just processing the URL in a different way could be a solution too. Replacing DNS would be a huge processing load for routing entities.
Markus Stenberg 01:06:03: I don't really see what extra it brings over dns -> anycast address -> (some routing happening), but perhaps I should read the draft. E.g., adding a new AF is a pretty big hammer.
Dirk Trossen 01:07:37: @Louis at the SAR (i.e., the domain-internal elements), the binary names (encoded from the URLs) are indeed processed through a hash-based lookup, so much faster than a DNS lookup. In eBPF, that is what we do, i.e., extract the serviceID from the EH, do the hash-based lookup to determine the next hop (possibly doing a secondary step to select one of possibly more next hops) and forward. This is fast (100ks of requests per second), even in eBPF. Also, the load is smaller since the SAR only processes the local client load, rather than needing to resolve all clients of a domain.
Dirk Trossen 01:09:34: @Markus less latency due to the missing DNS lookup and the ability to realize ingress scheduling of requests, in addition to anycast type of ROSA-level routing.
Markus Stenberg 01:10:18: If those anycast addresses are static, they should be cacheable and have a long TTL etc. Is this just optimizing initial single rare requests to a bunch of seldom-used services or what?
Dirk Trossen 01:10:48: @Markus this allows, for instance, for multi-site retrieval, scheduled at the ingress, without involving any lookup and using the existing IPv6 routes to the unicast replicas.
Dirk Trossen 01:12:10: @Markus indeed, for long-lived anycast addresses, the routed mode of ROSA is essentially the same, but you could do per-request scheduling for those (anycast) services.
Darren Dukes 01:14:48: @Dirk Trossen The draft mentions LISP similarities. Is it correct to say you convert the service URL into an identifier and then make a map request equivalent based on the service identifier? However, you do this in-band by sending the service request to a special map server, called the ingress SAR, which forwards the request; the host then updates the service mapping locally when it gets a response from the receiving service.
Dirk Trossen 01:19:33: @Darren that is indeed correct. In a manner of speaking, the SAR is the mapping service, particularly in the scheduled mode where indeed one of possibly several service instance IPs is chosen, based on some possibly service-specific selection mechanism. In routed mode, the SAR forwards to another SAR, until it reaches the instance. This is different from the mapping process in LISP.
Dirk Trossen 01:24:10: Differences come in the mapping federation itself. SARs are not federated but include mapping/routing entries populated by an overlay DV protocol. Any locally unknown service address is directed to the SAG, which may either direct the packet to another ROSA domain or to the public Internet.

1. KIRA: Distributed Scalable ID-based Routing with Fast Forwarding
Roland Bless (15 mins)
https://publikationen.bibliothek.kit.edu/1000148953

Acee: If there are node IDs you don't know, how do you know how to talk to them?
Roland: You choose the closest node you know in ID space, and you can tell whether you're making progress in ID space. If you don't know a node that's closer to the target, then there's some inconsistency in the routing table or the node doesn't exist any more.
Acee: If you want to talk to someone you don't know, do you send discoveries out of all your interfaces?
Roland: Not all, only one.
Acee: What if it's the wrong way? Or partitioned?
Roland: If the ID overlay is consistent, we'll always make progress.
Acee: So the ID tells you which way to go?
Roland: Exactly.
Acee: I think I saw similar work years ago.
Roland: Yes, there is previous work.

From chat:
Eduard V 01:11:30: Is KIRA a replacement for BGP?
Tony Li 01:11:59: See bullet 1.1
Boris Khasanov 01:12:08: How does the KIRA communicate with routers?
Roland Bless 01:32:54: Eduard V said: "Is KIRA a replacement for BGP?" No, it is not. It's designed for control plane connectivity, so routing of data packets would be a different thing. One of KIRA's use cases is its use as an in-band control plane fabric for SDN.
Roland Bless 01:34:40: Boris Khasanov said: "How does the KIRA communicate with routers?" Not sure what you mean. Currently, we are using IPv6 packets and link-local addresses for communicating between KIRA routing instances. So a router would always have to implement KIRA for its control plane. Using this connectivity, you could simply use SSH, NETCONF or whatever to connect to the router in order to configure it, e.g., its data plane routing like OSPF or BGP.

=============================================
Individual draft
=============================================

1. Requirement of Fast Fault Detection for IP-based Network
Lily Zhao (10 mins)
https://datatracker.ietf.org/doc/draft-guo-ffd-requirement/00/

2. Framework of Fast Fault Detection for IP-based Networks
Haibo Wang (10 mins)
https://datatracker.ietf.org/doc/draft-wang-ffd-framework/00/

Greg Mirsky: On the requirement slide: what's your goal with this work?
Haibo: To let the network help endpoints quickly detect failures.
Greg: Is it the whole network or the device that detects the failure?
Haibo: The network detects the failure and then notifies access points.
Greg: Do you plan to do it as distributed or centralized?
Haibo: Distributed.
Greg: So each device in the network must know about the failure.
Haibo: Yes, this is the first step; later there might be optimizations.
Greg: That's already done in the IGP.
Haibo: The network synchronization part is not described in the framework draft yet.
Jeff: As a participant, I want to talk about machine learning clusters: the goal is to converge with a number of entities, not a number of seconds. The infrastructure is parallel without a single point of failure; the goal is to detect a failure ASAP and reroute in the IP network. This is commonly implemented on hosts today, with FlowBender or a variety of other techniques. If you need to notify a controller, you're in seconds and your machine learning job is dead. The requirement is not suitable for machine learning clusters.
David Black: I'm one of the original designers of NVMe over Fabrics and the NVMe over TCP transport. I'm surprised: the storage network configuration shown here is unrealistic. Active-passive configurations have typically moved to active-active these days, which means the second path is active. If there is a failure on the first path, there is an opportunity to immediately use the second path to get the failure information communicated without having to go through all the switches. That's better for NVMe because it's not relying on switch interactions. It seems to me the failure detection is being built on IP accessibility. That's not a good idea, as routing is the authority for the topology and for which IP addresses are reachable. Please don't reinvent that. The draft labels the security considerations as NA, not applicable, which might also be not acceptable. It's a great vector for DoS attacks. When a switch detects a link failure it should turn the link off, so the other end notices it pretty quickly and you don't have a problem with the two ends disagreeing on link failure.
Jeff: All storage protocols do implement their own liveness mechanisms at much lower layers than IP, and they are very fast: 2 RTTs, we're talking about nanoseconds.
Sasha: (On the slide using NVMe as an example) In this configuration, if sw1 and sw2 were connected and the hosts were using loopback addresses, then when a failure happens the switch would reroute and the hosts would remain unaware of the failure. That's what network operators would prefer. If the switches are not connected, personally I see it as a poor network design, and we shouldn't propagate new functions to the host.

From chat:
Tony Li 01:27:31: The similarities to the UPA work in LSR are not small.

=============================================
Individual draft from other WG (if time allows)
=============================================

1. SRv6 Deployment Consideration
Qiangzhou Gao (10 mins)
https://datatracker.ietf.org/doc/draft-tian-spring-srv6-deployment-consideration/

Jeff: Which of these services can't be done using SR-MPLS?
Qiangzhou: MPLS is only used in the backbone but not in the campus, while SRv6 can achieve an end-to-end connection.
Jeff: (New Network Deployment slide) Looking at the features implemented, nothing prevents using other technologies. I'll take this to the list.
Gyan: Is there any interconnect technology you're using? Is the core SRv6 and the campus different? (Voice was breaking up, please take it to the list.)

From chat:
Tony Li 01:43:34: SRv6 is required to waste more bandwidth.
Boris Khasanov 01:46:14: @Tony: "SRv6 or not SRv6, that is the question..." :)

1. Intelligent Routing Method of SR Policy
Feng Yang (10 mins)
https://datatracker.ietf.org/doc/draft-yang-sr-policy-intelligent-routing/

Jeff: Due to time constraints, please take questions to the list.

1. Signaling In-Network Computing operations (SINC)
Zhe Lou or Luigi Iannone (15 mins)
https://datatracker.ietf.org/doc/draft-zhou-sfc-sinc/

Jeff: Currently SFC doesn't accept new drafts; we're looking for the ADs' advice on next steps.

From chat:
Tony Li 02:00:14: This seems somewhat similar to MPLS MNA