IDR meeting at IETF 112 (version 00)

Chairs: Susan Hares, Jeffrey Haas, Keyur Patel
Secretary: Jie Dong

Materials: https://datatracker.ietf.org/meeting/112/session/idr/
Meetecho:
Session 1: https://meetings.conf.meetecho.com/ietf112/?group=idr&short=&item=1
Session 2: https://meetings.conf.meetecho.com/ietf112/?group=idr&short=&item=2
Collaborate on Note Taking: https://codimd.ietf.org/notes-ietf-112-idr/

First session: Wednesday Session 1, 12:00-14:00 (UTC), November 10, 2021

0. Agenda bashing and Chairs’ Slides (15 mins)

Regarding the coordination on multicast work.

(from chat)
Joel Halpern: Also, spring is not chartered to modify BGP.
Jorge Rabadan: @Jeff, draft-ietf-bess-evpn-ipvpn-ipvpn was listed on the BGP multicast work slide, but it is not multicast work… it is unicast really.
Tony Przygienda: also, BIER and BGP? well, yeah, there is BIER control part over BGP but BIER charter has protocol extensions for BIER (after initial discussion when group was chartered that was agreed upon).
Jeffrey Haas: We know our docs in the multicast cluster aren’t quite up to date and correct. But they’re what we’ve been asked to evaluate in that context. We’ll get a correct list hopefully shortly. And also, looking to build a group of expert reviewers to help us

1. BGP SR Policy Extensions to Enable IFIT [Giuseppe Fioccola] (5 mins)
https://datatracker.ietf.org/doc/html/draft-ietf-idr-sr-policy-ifit/

No questions.

2. BGP SR Policy Extensions for Template [Ka Zhang] (10 mins)
https://datatracker.ietf.org/doc/html/draft-zhang-idr-sr-policy-template/

Srihari: Is the template ID global? How to coordinate across multiple routers?
Ka Zhang: It can also be a local ID configured on a device. The device can also provide a range of template IDs for the controller to choose from.
Srihari: How to deal with template ID conflicts?
Ka Zhang: The template ID is device significant. For the same template ID, the controller can override and update its content.
Andrew Alston: How does this work when the SR policy is sent via reflectors? Does this require the template ID to be synchronized globally within the reflector domain?
Ka Zhang: Think the template ID and content only need to be understood by the headend node.
Acee Lindem: We have one level of abstraction with SR policy. We are using templates in management. Not sure about the advantage of putting this into the protocol.
Shraddha: You can configure the template and the template used by SR policy on the device.

(from chat)
Andrew Alston: Am trying to wrap my mind around how this would work in the case of reflected policy - it gets tricky
Dhruv Dhody: In PCEP we did this using an ASSOCIATION object https://www.rfc-editor.org/rfc/rfc9005.html
Ketan Talaulikar: @Dhruv - ack - it looks the same to me too. However, PCEP is 1:1 session while BGP is different (when used with RRs).
Jeffrey Haas: I’m sure the authors would welcome text for their operational considerations
Cheng Li: it would be better to discuss via email thank you @andrew
Andrew Alston: will drop that one on the list - but I do see issues with this in the case of reflection

3. BGP Extension for SR-MPLS Entropy Label Position [Yao Liu] (5 mins)
https://datatracker.ietf.org/doc/html/draft-zhou-idr-bgp-srmpls-elp/

(from chat)
Srihari Sangli: How will the advertising router know when to insert the “E” flag? Does it need to know the topology?
Yao Liu: @Srihari, the router doesn’t need to know the topo. A possible way is by policy or configuration: if configuration on the headend requires load balancing on a particular SR path, the headend node should insert the ELs based on the E-FLAG received via BGP SR Policy.

4. BGP Extensions of SR Policy for Path Protection [Yao Liu] (5 mins)
https://datatracker.ietf.org/doc/html/draft-lp-idr-sr-path-protection/

Ketan: Clarify my comment. This requires changes to the semantics of segment list in SR policy architecture.
Need to have that reviewed and updated in SPRING WG before this protocol extension work is done. Suggest to do the SR policy architecture change in SPRING first.
Yao Liu: Do you suggest to first update the SR Policy architecture?
Ketan: Yes.
Yao: There are many drafts adding attributes to SR Policy architecture.
Ketan: Can discuss that on the SPRING mail list.
Jeff Haas: Will capture this in minutes and coordinate with SPRING.

(from chat)
Dhruv Dhody: @ketan - https://datatracker.ietf.org/doc/ht…gment-routing-policy-14#section-9.3 I remember checking that when we adopted the work in the PCE WG
Ketan Talaulikar: @ Dhruv - ack. But this proposal introduces active/backup semantics between SLs in the same CP. That IMHO is a key semantic change for SL construct. I am not sure what is the driving motivation for this and if it can be perhaps addressed differently.
Dhruv Dhody: Do check the PCE draft as well, I want to check if your comments apply there as well https://www.ietf.org/archive/id/dra…f-pce-multipath-03.html#section-4.4
Ketan Talaulikar: PCE did not have CP construct until the recent draft. So there might be some subtle differences. I have some trouble mapping from PCEP to the SR Policy info module on that account and could use your help :-) … perhaps discuss on the list?
Dhruv Dhody: Yes! Let’s fix up time after IETF week!

5. Advertising P2MP Policies in BGP [Hooman Bidgoli] (10 mins)
https://datatracker.ietf.org/doc/html/draft-hb-idr-sr-p2mp-policy/

Jeffrey Zhang: There is another way to set up SR P2MP using BGP, which is not listed in your slides.

6. BGP for BIER-TE Path [Huaimo Chen] (10 mins)
https://datatracker.ietf.org/doc/html/draft-chen-idr-bier-te-path/

Jeffrey Zhang: Are you going to use one option, or use the two options for different purposes? How to use the mechanism may be discussed in BESS or BIER.
Huaimo: Both options would work, personally prefer the first one. In the end one option will be selected.
Regarding the application, are you asking for details about how to distribute the path information? It is described in the draft.
Jeffrey: Should pick one option then go through adoption. Either this draft or another one talks about the use case. The use cases are related to MVPN or EVPN, should belong to BESS.
Huaimo: It can apply to both MVPN and other use cases.
Sue: Depends on the content, what specifications it modifies. It can be reviewed by both BESS and IDR, and approved by both. The chairs will discuss this.
Jeffrey: Is this for setting up a tunnel, or to announce how the tunnel will be used? Depends on the use case.
Huaimo: This is for the BGP extensions to distribute the tunnel information.
Dhruv: Not just on this draft. All the above topics have related work in PCE. Would like to have better coordination between PCE and IDR.
Sue: The IDR and PCE chairs should get together, we have put things in the IDR wiki about the work which needs coordination.
Jeff: We had some general discussion about the BESS and IDR chairs coordination. Coordination with PCE is also needed. We would like to have good cross-WG coordination.
Jeffrey: Does the PCE WG work specifically on the PCEP protocol, or does it also work on architecture which may not use PCE?
Dhruv: The PCE architecture belongs to TEAS, PCE WG only works on PCEP protocol.
Jeffrey: There is one way to set up SR P2MP using PCE, and two BGP based proposals.
Dhruv: PCE WG only handles the PCEP extension for SR P2MP, the SR P2MP policy belongs to PIM.
Alvaro: We keep seeing this more, P2MP is happening in BESS, IDR, SPRING, PCE, PIM and BIER. ADs are talking about this. Hope we could find a consistent way for the cross-WG coordination. Maybe easier with some tools, e.g. the wiki. Multiple solutions are not bad, but we need to get them properly reviewed. Suggest to provide ideas for better coordination.
Hooman Bidgoli: I have a slide to track the drafts in related WGs.
My understanding is that it is like unicast: PIM is the lead for multicast, and the protocol extensions belong to the corresponding WGs.
Jeff: There is currently no IETF process to govern how to do this coordination. If there is experience from other SDOs, we are happy to take it. Make sure the interested parties talk to each other.

(from chat)
Alvaro Retana: Dhruv, “All of the above.” We are clearly seeing more cross-WG work that requires coordination. I don’t think there’s a single answer, but would really want to find a consistent way so that we’re not dealing with it on a case-by-case basis.
Jeffrey Haas: Effectively what we need is a “Technology area” concept. Tech can cross multiple areas and working groups. and often, they have architecture that guides the whole thing. see SR as a possible example.
Keyur Patel: i like the technology area concept. :)
Dhruv Dhody: I am thinking of adding a list of corresponding I-Ds in the PCE wiki and then asking authors to confirm on the list that there is alignment with them and highlight the differences etc

7. BGP-LS Extensions for IS-IS Flood Reflectors [Jordan Head] (10 mins)
https://datatracker.ietf.org/doc/html/draft-head-idr-bgp-ls-isis-fr/

Jeff Haas: Have discussion with LSR. We can have either separate documents in LSR and IDR, or they can be combined into the LSR document.

(from chat)
Ketan Talaulikar: @ Jordan, I would recommend adding a section in LSR draft and getting done with the BGP-LS part … reduces “work” ;-)

8. Extension of Link Bandwidth Extended Community [Wenyan Li] (10 mins)
https://datatracker.ietf.org/doc/html/draft-li-idr-link-bandwidth-ext/

Acee: I never liked the floating-point, but don’t know whether it is the time to change. The integrated services guys decided to put floating point into RSVP and we’ve been stuck with it. I’m wondering whether it is a good idea to change it. This is good for link bandwidth, but when used in other cases (unreserved, reserved, etc.), you may need fractions.
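The floating-point concern raised in this discussion can be illustrated with a short sketch. It assumes the existing link bandwidth value is carried as a 4-octet IEEE 754 single-precision number in bytes per second; the helper name and the sample rates are hypothetical and are not taken from the draft or from the meeting.

```python
import struct


def to_float32(value: float) -> float:
    """Round-trip a value through a 4-octet IEEE 754 single-precision encoding."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


# Hypothetical link rates in bytes per second (10 Gbit/s and 10 bytes/s more).
ten_gbps = 1_250_000_000
slightly_more = ten_gbps + 10

# Single precision keeps a 24-bit significand, so at this magnitude the
# spacing between representable values is 128 and both rates collapse
# onto the same encoded value.
same_as_float32 = to_float32(ten_gbps) == to_float32(slightly_more)

# A 64-bit unsigned integer keeps the two rates distinct.
same_as_uint64 = struct.pack(">Q", ten_gbps) == struct.pack(">Q", slightly_more)

print(same_as_float32)  # True
print(same_as_uint64)   # False
```

The trade-off mentioned in the discussion still applies: the integer form needs 8 octets for the value, which is more than the 4-octet floating-point form.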
Jeff Haas: Floating point has been problematic for a lot of reasons. People working on software-based controllers always use decimal values. What we were using is low-precision floating point, which can be difficult to use in some cases such as comparison. Regarding the details, you give many different types with overlapping behaviors. Two link bandwidth communities may have overlapping semantics due to using different units. We can have a discussion about whether we can do better for link bandwidth.
Acee: If we are going to switch, it is good to have a consistent way for link bandwidth at least in the Routing area. I’d prefer to use a 64-bit number with bits per second.
Luc: In BESS we also run into the issue of link bandwidth with different units, suggest normalizing to one unit.
Jeff: The problem with 64-bit is that there is limited space in BGP ext. comm.
Wenyan: Suggest to bring this to the mail list.
Jeff: Agree, and there is also discussion about transitive vs. non-transitive in the BESS case.

(from chat)
Srihari Sangli: a nit: 1K = 1024 and not 1000
Jeffrey Haas: normalizing terminology would be helpful here, yes.
Randy Bush: wow! we return to the '60s!
Tony Przygienda: one thing is sure, I doubt IEEE will improve on 754 encoding ;-) first, to improve precision @ same range we’d probably need to break Shannon ;-) and 2nd we hopefully keep to simple things like float normalization and epsilon comparisons ;-)
Luc André Burdet: you can also set ‘mbps’+1000 vs ‘gbps’+1 -> is that an equal value or not ?
Keyur Patel: @acee ack on 64 bit number
Tony Przygienda: well, we’re just inventing a 2^X base 754 encoding here. badly ;-)
Andrew Alston: I have to agree - a 64bit number is far easier to deal with
Haibo Wang: @Luc Yes, it is, but it is not suggested to do that. We have described it in the draft
Jie Dong: +1 to Jeff on the limited space in ext.
comm for 64 bits
Andrew Alston: Could we not think about an attribute + 64bit number - which may be easier to deal with - just - random thought

[5 minutes for switching]

Second session: Thursday session 3, 16:00-18:00 (UTC), November 11, 2021

0. Agenda bashing (5 mins)

1. PCEP Extension for Native IP Network [Aijun Wang] (10 mins)
https://datatracker.ietf.org/doc/html/draft-ietf-pce-pcep-extension-native-ip/

Requested cross-WG last call from IDR. (Chairs will try to set a deadline after IETF)
BGP Peer Info object contains peer AS, IP, etc. Need clear procedures in case it conflicts with local configuration (XXX JMH)
Explicit Peer Route Object (similarly check for conflict procedure - XXX JMH)
Peer Prefix Advertisement Object

Keyur: This is another draft that needs collaboration with PCE.
Jeff: We’ll try to get you a deadline for WG review the week after IETF. We’ll also add this to the wiki for cross-WG review.
Aijun: Welcome to send comments to the IDR list.

(from chat)
Dhruv Dhody: The LC will happen on the PCE WG, we will cross-post it in the IDR WG. Would be better to get comments early to avoid any late surprises :) Has updated IDR wiki for tracking: https://trac.ietf.org/trac/idr/wiki/Feedback for PCE drafts

2. BGP Flow Specification Version 2 [Donald Eastlake] (15 mins)
https://datatracker.ietf.org/doc/html/draft-hares-idr-flowspec-v2/

Slides…
No questions for Donald.
IDR will hold an interim for fsv2, tentatively scheduled for December 13, 2021.
Keyur: This is pretty serious work and it’s very comprehensive, encourage both operators and developers to take a closer look and provide comments.

3. Dynamic Capability for BGP-4 [Enke Chen] (10 mins)
https://datatracker.ietf.org/doc/html/draft-ietf-idr-dynamic-cap/
https://trac.ietf.org/trac/idr/wiki/draft-ietf-idr-dynamic-cap implementations - for implementation reports

Slides…
Supposedly multiple implementations: Support is for address families and GR
-15: small changes.
-16: multi-instance capabilities.
clarifications on single-instance capabilities; whole capability is revised. Clarified removal procedure.
Implementation considerations: generic feature. Driven by customers’ demands. Be selective on capabilities to be supported.
IANA Considerations: Updated per review. Found errors for BGP NOTIFICATION.

Jeff Haas: If we have implementations, need to do IANA early allocation to officially nail them down. Need to add an appendix covering changes; specifically version 4 to 5. Can be removed before final publication.
Acee: Should only be for capabilities that it makes sense for. What happens if you have a mismatch in implementations? People support dynamic capabilities, but they don’t support the same set of capabilities on this. Will the session get reset? Need to document this.
Enke: By negotiation. If one supports A and B, and another one supports B and C, they can only do B. Not expecting things to be reset if there’s a mismatch of negotiated capabilities.
Acee: Attempt dynamic renegotiation. If one supports A, B, C, the other guy supports C but does not support C dynamically, it could not be renegotiated.
Enke: You have to list the capabilities you want to support for dynamics. We’re not skipping the comparison vs. the originally advertised capabilities. Draft text updated that covers this detail. We only update originally advertised stuff.
Jeff: There may be cases which are not safe for renegotiation, for example: add-paths. May need a parallel capability to list the capabilities which are allowed for renegotiation.
Enke: We don’t expect vendors to support dynamics for capabilities which do not make sense. Then operationally either side can decide whether they want to enable the dynamics for a capability by configuration.
Jeff: There may be cases where after the negotiation of a feature, it may not be safe to revise it and cause session drop.
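The negotiation behaviour Enke and Acee describe above can be sketched as a set intersection. This is only an illustration of the discussion as minuted, not the procedure specified in the draft; the function name and the capability labels A, B and C are hypothetical.

```python
def dynamically_revisable(local: set[str], peer: set[str]) -> set[str]:
    """Capabilities that both speakers list as open to dynamic update."""
    return local & peer


# Enke's example: one side lists A and B, the other lists B and C.
enke_case = dynamically_revisable({"A", "B"}, {"B", "C"})

# Acee's example: one side lists A, B and C as dynamic; the peer supports C
# but does not list it as dynamic, so nothing can be renegotiated.
acee_case = dynamically_revisable({"A", "B", "C"}, set())

print(sorted(enke_case))  # ['B']
print(sorted(acee_case))  # []
```

In both cases a mismatch simply shrinks the set of capabilities that can be revised; per Enke's comment, it is not expected to reset the session.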
Jeff: If there are multiple independent implementations, please document them in the implementation report, then we can have early allocation quickly.
Keyur: (As WG member) Suggest to document and discuss in WG what are the easiest use cases of using this mechanism? E.g. new afi/safi. This will help the implementers to think about its usage.
Enke: Agree, -16 also adds text on GR. afi/safi, gr make easy sense. Will look at it. Please offer some text.

(from chat)
Srihari Sangli: Capability advertisement is about advertiser’s capability
Srihari Sangli: Acee- there is also ACK in the capability advertisement
Srihari Sangli: I mean, there is a provision for ACK bit for the capability in the message
Acee Lindem: Thanks Srihari - will read the latest.

4. Use of Streams in BGP over QUIC [Yingzhen Qu] (10 mins)
https://datatracker.ietf.org/doc/html/draft-retana-idr-bgp-quic-stream/

Slides…
Compare vs. BGP multisession. Strong motivation is to deal with sessions that have a malformed message that causes the session to drop. E.g. a flowspec malformed message shouldn’t impact ipv4/ipv6. QUIC supports multiple streams in the same connection.
[…]

Linda Dunbar: Why need BGP multisession?
Yingzhen: If all AFI/SAFIs are in one session, some error in one AFI/SAFI will cause the whole session to reset, which will impact other AFI/SAFIs.
Keyur: Due to limited time, please take the questions to the list.
Andrew Alston: If we have a problem with session reset, can we modify BGP itself to allow AFI/SAFI reset?
Yingzhen: That was what the BGP multisession draft was trying to do, support BGP multisession over TCP. The issue is you need multiple TCP connections.
Enke: Clarify on error handling. If this error is for a given afi/safi, that afi/safi can be disabled. The error handling spec (7606) can carry that forward. But currently it was not covered by the error handling spec. Now it can be achieved by coupling with dynamic capability to disable an AFI/SAFI. Dynamic caps can let us disable things temporarily.
So, dynamic caps may change this, although it may not be as clean as a completely separate session.
Yingzhen: Didn’t know you refreshed the dynamic cap draft. Will review that and assess the impact.
Enke: Also check the error handling spec (7606).
Yingzhen: Once we can support dynamic cap, we can be more agile about this. It’s an improvement.
Jeff Haas: The error handling attempts to handle the malformed NLRI case, where we lose the packet boundary, it is all in a TCP stream. QUIC may give us a proper layer 5 style session, so we can trust packet boundaries better. There can be alternatives about how to handle the errors with the packet boundary issue. Still have NLRI issues to resolve, but not toxic to the entire session.
Yingzhen: Will catch up with the comments in the chat window and discuss on the list.

5. BGP App Metadata for 5G Edge Computing Service [Linda Dunbar] (10 mins)
https://datatracker.ietf.org/doc/html/draft-dunbar-idr-5g-edge-compute-app-meta-data/

Jeffrey Zhang: Sent a mail to lsr, idr, 6man about the prior discussion in Spring about the use case. There are two problems to solve: pick which server to use, and how to stick to the same server when devices move to a different UPF. The second problem is harder to solve. Discussion in 6man covered this. Best solution to the problem is at the application layer - use an app load balancer to pick which one and how to make it sticky. There is an email thread about this.
Linda: Agree with you. Cloud providers think the app layer can do everything, but it has limitations. Fast to make a load balancer - but it’s at a specific location. Example is deploy xxx in edge datacenters they can’t really add a load balancer quickly - takes months. Since we’re the underlay provider, we have a lot of information that makes for a better balanced service. Trying to put this option on the table.
Jeffrey Zhang: Solving which server to pick is just a small part of the problem. Sticky is the harder problem.
Linda: That’s in the 6man discussion.
We didn’t get a chance to present there. The source address and the Flowid don’t move. Based on info in BGP, they can use the same tunnel to the previous egress router. This can be implemented in existing routers. Similar to TE changes when the path changes based on path cost.
Jeffrey: Will take it offline about the UE address change.
Keyur: (as WG member) Linda, can you comment on the chattiness this will bring to BGP?
Linda: Hours or days is the expected granularity, based on the timers. We can put that into the draft.

6. Revised Error Handling for BGP Messages [Haibo Wang] (10 mins)
https://datatracker.ietf.org/doc/html/draft-wang-idr-bgp-error-enhance/

The purpose is to reduce the impact of errors on existing BGP services.

Enke Chen: Have sent comments to the list. NLRI parsing requires semantic validation. The revised error handling (7606) introduces treat-as-withdraw to keep the good routes and throw away the bad routes; it requires parsing all the prefixes. Stale routes are a big problem. Other issues you list are semantic issues. Lots of issues to relax validation. For some things like aggregate, unnecessary to make it more restrictive. Originator id, bug. How much harm? Nexthop could be interesting, room for improvement. E.g. interface address configuration issues - receiving your own address. Possibly separate WG document on nh validation. … enhanced error handling spec?
Haibo: Lots of comments. For MP_REACH_NLRI, may do best effort in NLRI parsing. E.g. for EVPN, may keep type-1 routes and discard other types of routes.
Enke: You have to be sure about the whole NLRI field, that all of the prefixes are parsed; can’t partially do it. BGP is incremental. Better to keep bad state in the face of such corruption. Parsing partially has risks.
Haibo: Will reply to other comments on the list.
Jeff: Some of the proposals are small changes to 7606 for some edge cases, may be OK to loosen. Easier verification of nexthop is also worth discussing.
(WG chair hat on) The discussion about what to do with bad NLRI is the difficult part to revise. There may be some room for revision after 7606 is done: when there is only one NLRI per update, no packing, you can identify the PDU boundary. There is still information lost, we may leave the session up as broken. This used to terrify people and is not suitable for the Internet. In data centers it may make more sense to proceed in a partially broken state. So it can be discussed for the single NLRI case and for some address families, not like the Internet. Like we did with LLGR, which is not for the Internet.
Haibo: Agree we need to identify the PDU boundary. Want to try our best to maintain the BGP session stability and reduce the impact.
Keyur: (as WG member) Agree with Enke and Jeff’s point. There is no definite way to say that after the nexthop length error the prefixes are not corrupted, even if the update message is carrying one NLRI only. The best way is to reset the session. Will send comments to the list. Thanks for the proposal.

7. BGP Flow Specification for DetNet Flow Mapping [Quan Xiong] (10 mins)

Jeff: Thanks for presenting this work also in IDR. The feature you’re proposing needs to be implemented in flowspec v2. Flowspec v1 cannot do it. Please consider contributing to the flowspec v2 work. Donald Eastlake would likely be glad to have you work with him and also has background in L2VPN flowspec work.

[5 minutes for switching]