Minutes IETF109: rtgwg

Meeting Minutes Routing Area Working Group (rtgwg) WG
Date and time 2020-11-17 05:00
Title Minutes IETF109: rtgwg
State Active
Last updated 2020-11-24


Chairs:   Chris Bowers               
          Jeff Tantsura
Secretary: Yingzhen Qu

Date: Tuesday, 17 Nov 2020 
Time: 12:00 – 14:00 ICT

1   12:00   10m Administrivia and WG update             Jeff/Chris  

Stephane:     Regarding the ti-lfa draft, we have reduced the number of 
              authors to be compliant with IETF rules. I think it is 
              ready for WG LC. Can we add it to the LC queue?
Jeff:         we'll handle your request and review your update.

2   12:10   15m Multilevel configuration 
Dean Bogdanovic 

Tony Li:      how do you see it with the hierarchical architecture in 
              NETCONF or openconfig?
Dean:         This is the area that requires lots of work. I don't have 
              a good answer yet.
Jeff:         This is a large topic, and it's going to affect the models 
              that we've been working on for years.

3   12:25   10m SRv6 Midpoint Protection 
Xuesong Geng

Parag Kaneriya: how do you differentiate the condition the route is not 
              present vs. link down? IPv6 prefix not advertised, or not 
              link down?
Xuesong:      On slide 2, there are three stages. When the IGP has 
              already converged, and we can't find the route anymore, 
              some upstream nodes will do the proxy forwarding. If there 
              is no IGP convergence yet, node B, which is adjacent to 
              the failed node, will know that something went wrong and 
              will do the proxy forwarding.
Parag:        what about the function supposed to be executed at node E?
Xuesong:      you mean the function should be limited to some nodes?
Parag:        in SRv6, some function may be executed at node E, maybe 
              accounting or rewrite?
Xuesong:      so you're asking what if there are some functions ended 
              on node E. node B can only do the forwarding function.
Parag:        can we use ti-lfa to avoid this situation?
Xuesong:      there are cases discussed in the document about types of 
              B, we have different cases. There are different scenarios.
Parag:        The idea would be to use backup path to reach E via ti-lfa.
Xuesong:      actually it skipped failed node E.
Jeff T:       we're not converging. you're asking to use ti-lfa. please 
              take your question to the list.
Stewart:      So I'm afraid my question was very similar. When you set 
              up segment routing path you set that path up for a reason, 
              and that reason made me want traffic to go to a particular 
              node for a particular reason, or you want to protect other 
              paths from being overloaded by that traffic. So with all 
              repairs and alternatives, I get really worried that you're 
              going to bust all the original creating of the SR path. 
              And I would imagine the only node that could actually 
              reroute a path is the head end node, which has the whole 
              network at which it can to choose from, as opposed to a 
              node in the middle, which could be busting someone else's 
Xuesong:      Actually, I think that is a very good question. I can 
              respond to it in two parts. The first is that if some 
              node cannot be bypassed, we have some mechanisms that are 
              already referenced in the document. Maybe some node, like 
              node E, performs the firewall function for the whole 
              path, so it can't be bypassed. We can, for example, 
              extend the IGP to advertise "I cannot be bypassed, so do 
              not bypass me". This is the first solution. The second is 
              that the motivation for this solution is just to give a 
              method to supplement the ti-lfa mechanisms. Maybe 
              something goes wrong at the endpoint, and we can have 
              some methods to bypass the failed node and go to the next 
              endpoint; at least the packets will not be lost, and the 
              flow will not be interrupted. So that is the motivation 
              here. That is my response, so I think the mechanism 
              itself is reasonable and it can solve some problems here.
Stewart:      If you can show you're not going to bust something else, 
              then fine, but as I already mentioned it's quite a 
              complicated problem, making sure you aren't going to bust 
              some carefully engineered traffic plan.
Zhenbin Li:   To protect the function at E is a different scenario. 
              Second, the confusion is to mix with ti-lfa, this is for 
              SR traffic engineering, when E failed, we need B to proxy 
              forward the traffic. This can be explained in the mailing 
              list for details.
Jeff T:       The motivation is not clear in the draft. It's difficult 
              to understand, so not good for WG adoption. I have a 
              question about the experimental track. I mean you don't 
              define any new extension, you kind of define possible 
              behavior. So, informational would probably be a better 
              track than experimental. 
Xuesong:      Actually there are some new behavior defined for repair 
              nodes. For example, to do the proxy forwarding we define 
              some SRv6 functions. So there will be some new functions 
              defined in this document.
Jeff T:       I'm looking forward to reading new updated version of the 
Bruno:        it's about terminating a segment. let's have some 
              discussions in SPRING. We've discussed with Jeff, there
              are multiple documents about terminating a TE or a tunnel, 
              and what do you do with a VPN etc.
Jeff T:       please post your questions to the list.

From chat:
Ketan Talaulikar
@ Parag - this proposal is somewhat similar to draft-ietf-spring-node-protection-for-sr-te-paths?
@ Stewart - this was exactly the discussion that happened around the 
adoption of draft-ietf-spring-node-protection-for-sr-te-paths

4   12:35   10m Egress Protection for Segment Routing Networks 
Shraddha Hegde  

Stephane Litkowski: Could you elaborate more on how you are allocating 
              the service labels from the SID? 
Shraddha:     yes. that's right.
Stephane:     so you can't make it work without repair label allocation.
Shraddha:     that's too much overhead.
Stephane:     you may want to clarify it in the draft.
Shraddha:     yes, I will.
Peter:        I have two comments here. First of all, similar to what 
              Stephane was asking, I can see this working with 
              preallocation. If we go to the very next hop, it becomes 
              way more complex to synchronize the sids or labels between 
              these, especially if you have multiple PEs, not just two 
              but even more. And the second comment is, well, it looks 
              more like a deployment. It's not really standardizing any 
              protocol extension, and people always ask whether these 
              things should really be published as a draft. So, that's 
              my comment.
Shraddha:     Yes, there are other solutions floating around. This is 
              one way to deploy it. we consider it as good informational 
              document. we got some feedback, allocating the labels 
              statically is a little complicated so we are also working 
              to see how we can automate that. And that will most likely 
              require protocol extensions.
Peter:        Absolutely. That's definitely something to look at. 
              Because, as I said, it can become very complicated. Sounds 
              easy if you have two PEs and use the same label or sid. 
              If you do an allocation, other than third VRF, it can 
              really get more complicated
Greg:         have you considered double failures?
Shraddha:     With multiple failures, there's always a possibility that 
              traffic can get dropped or can undergo micro loops. That's 
              applicable to ti-lfa in general as well, and there is 
              nothing special or different for egress protection. 
Jeff T:       FRR never claims to protect double or triple failures, 
              it's a formal statement. Please send your question to the 
Xuesong:      I have a concern about deployment complexity because of 
              the requirement for static assignment of labels. The 
              second is that SRm6 is still under discussion, so it may 
              not be appropriate to have a reference to it here.
Jeff T:       points taken.

5   12:45   15m Dynamic-Anycast in Compute First Networking use cases and Problem Statement
Architecture of Dynamic-Anycast in Compute First Networking 
Yizhou Li   

Jeff T: we're not going to take questions here. Please send your questions to Yizhou.

6   13:00   15m The Problem Statement for Precise Transport Networking 
The Requirements for Precise Transport Networking 
Daniel Huang    

7   13:15   15m BFD for Multi-point Networks and VRRP Use Case 
BFD on Multi-Chassis Link Aggregation Group (MC-LAG) Interfaces 
Greg Mirsky 

8   13:30   20m Extension of Transport Aware Mobility in Data Network 
Kausik Majumdar 

Dhruv Dhody:  we were discussing this on chat. Flow spec would be a much 
              better choice, it will provide you much more granularity 
              for steering and we already have a mechanism both in BGP 
              and PCEP to do generic flow specification so why did you 
              not use that and instead define a new 5g metadata sub-TLV?
Kausik:       you mean like BGP SR policy case?
Dhruv:        in BGP and PCEP, we have a flow specification, where you 
              can have more granular control in defining which flows 
              need to go into an SR policy or even into any PCEP-based 
              tunnel. So we have much better techniques, in my opinion. 
              Were you aware of that?
Kausik:       I'm aware of flow-spec. This is a generic approach. There 
              is already SAFI session, and we need the metadata. Either 
              do it with a SAFI session or flow spec, I don't see huge 
              advantages. But we can discuss more.
Dhruv:        let's take this offline.
Uma:          flowspec can be used, definitely, that will give much more 
              granular specification at the UPF and PE combined scenario, 
              but here the case is that the granularity is not required; 
              rather, he wants to maintain end-to-end SLA. The 
              mobility domain and internet domain, if you want security 
              or one of their 30 characteristics, you want to maintain 
              that, and they want to maintain the same characters, that's 
              used in the mobile domain which is the resource based. 
              So yeah, you can use both ways, actually, to me.
Jeff T:       so you're limited by UDP space, why don't we look up the 
              TEID in the GTP header next, it will give you a much 
              better context. 
Uma:          This is discussed in the DMM draft. I think the problem is 
              that the TEID is a scalar. It doesn't represent the 
              properties we are seeking, and it is more loaded with 5g 
              control plane characteristics so we don't want to touch 
              that, so 
              that's the reason. We had a calculation of how many slices 
              can be done in a typical network, that's in the DMM draft. 
              We found out for most practical deployments the UDP source 
              port ranges are good enough. The problem is we are limited 
              by what you can put in the packet, so we don't want to 
              enhance too many things in the data planes so that's why 
              we are limited with that UDP source port. It's kind of a 
              middle-ground approach we took. 

From chat:
Dhruv Dhody
Cant this be done by flowspec?

Stephane Litkowski
Right, this is also my question :) Flowspec will also provide the full 
granularity for steering

Jeffrey Haas
Flowspec is usually implemented as a firewall rather than destination
based lookups. Different hardware paths.
Regardless of whether you're looking at encoding stuff fancier than just 
destination based in flowspec or tunnel encaps, that's the penalty you 

Tony Przygienda
no free lunch. more complex lookup, more gates, more energy, will cost 
more. as a side note, isn't that what we have flow-id in v6 for?

Date: Wednesday, 18 Nov 2020 
Time: 14:30 – 15:30 ICT

0   14:30   5m  Administrivia           Jeff/Chris

1   14:35     10m   Protocol Assisted Protocol (PAP) 
Shuanglong Chen

Xiang Ji:     first question, how do we address IP path to run this 
              protocol? second question, for performance, instead of 
              TCP or UDP, have you considered SCTP?
Zhenbin:      The first one depends on configuration, the reachability 
              is dependent on the forwarding table. For the 2nd, we only 
              considered TCP and UDP, we will consider your suggestion.
Randy Bush:   I like the separation. I think it's architecturally 
              reasonable, but not MD5, pick something from this 
Zhenbin:      we will consider more options.
Jeff T:       I would really advise the authors to articulate who is 
              the end consumer of the data, because networking devices 
              don't usually do anything with this kind of information. 
              So, if the consumer of the data is an NMS, the real 
              question would be why not use the north-south 
              communication we already have, such as gRPC. I'd really 
              like to see better justification of doing it horizontally 
              rather than vertically. And hopefully on the list.
Zhenbin:      Thanks for your comments. We talked about the reason 
              to introduce west-east network monitoring. This should be 
              a light protocol, 
Randy Bush:   back to the discussion in IDR, the protocol has to have 
              this extract mechanism. That's what I meant when I said I 
              kind of like it architecturally.

2   14:45   20m GRASP 
Toerless Eckert

Zhenbin:      Thank you for the information. This is a very valuable 
              reference for PAP. My understanding is that GRASP is based 
              on TCP, and I have a concern about resource consumption. 
              if PAP is used to locate BGP errors, we may need full 
              meshed TCP sessions, it's very challenging. Full mesh BGP 
              peers is already a challenge for operators, and PAP is 
              only for network monitoring, that's my concern.
Toerless:     GRASP doesn't really specify the underlying protocol, it 
              can be used in TLS, QUIC, any secure transport protocols 
              and standardized in IETF. You can also use UDP or TCP, 
              just you have to consider security mechanisms. 
Brian Carpenter: The prototype code was done by me, it's not production 
              code. I looked at TCP or TLS at the moment, but I also 
              looked at what would be involved in switching to UDP. The 
              messages of course could perfectly well be wrapped up in 
              UDP packets and sent, but my impression was that the 
              overhead that TCP allegedly adds, I would have to add 
              back into the code, just to make sure that UDP messages 
              are delivered to the right place. And you would also have 
              to put in more recovery logic, because UDP doesn't do it 
              for you. So, I'm not convinced that there's an efficiency 
              argument against using a transport layer.
Toerless:     Through the years in ANIMA we've seen a lot of opinions 
              about the transport layer, so I think the prudent thing is 
              obviously to define the message protocol independent of 
              the transport and the security, as we've done in GRASP, 
              and then basically let the chips fall, or let the IETF, or 
              whatever the wisdom is, decide. I think if we had, for 
              example, more IoT people in here, they would come and say 
              that even lower overhead, on things that are based on UDP, 
              is really crucial for very low end devices. That's why 
              they did CoAP to avoid TCP, and that's based on UDP, so I 
              don't think there is an industry wide common understanding 
              of what the best transport is. And that's why we did GRASP 
              the way we did it.
Zhenbin:      Thanks for the information. We'll consider using GRASP 
              for PAP. 
Toerless:     I'd like you to provide some detailed examples, such as 
              BGP, for automation. With ANIMA, we have security 
              mechanisms, and we'd be happy to build those examples. If 
              we have this framework, we'd be happy to have routing 
              people using it.
Zhenbin:      Happy to do it.
Chris:        This seems like a pretty good way to bootstrap the use 
              cases. We could spend a year or two discussing things for 
              PAP, but instead it makes sense to try these use cases in 
              the existing code. It seems like you would get a lot of 
              benefit from a reliable transport. To begin with, you 
              wouldn't have to worry about a lot of protocol stuff if 
              you could just be sure that messages were getting 
              received. That would be a pretty good way to get this 
              moving a lot faster with solid convincing use cases.
Toerless:     I can't remember the name, but there was this one effort 
              to try to unify and build better common security for the 
              different routing protocols, and I think that ended in 
              pretty much nothing. If people think that securing the 
              routing protocols is still something that the industry 
              struggles with, then besides the other operational 
              examples given in the PAP draft, I think the securing 
              might be something that could be most easily done upfront.
Jeff T:       Thanks Toerless for the presentation. Looking forward to 
              more collaborations.

3   15:05   20m xBGP: When You Can't Wait for the IETF and Vendors   
Olivier Bonaventure

Jeff Haas:    The nice thing that is an output of this presentation is 
              that, yes, plugins can be done for allowing code to be 
              hooked externally; the Linux kernel is an excellent 
              example of that. The challenge that I think is not really 
              covered in the paper, or at least in the presentation of 
              the paper here, is that most of the headache for BGP is 
              really incremental deployment issues. So writing code 
              isn't all that bad. Getting code that can run in 
              implementations that are scattered around the internet 
              and follow BGP rules in terms of validation and such, 
              that's the main challenge. So I agree that having plugin 
              frameworks is actually a very useful thing. But I think 
              that's not necessarily the hard problem here.
Olivier:      But I think there are benefits in being able to deploy an 
              extension inside an AS, so for iBGP only. Doing it over 
              external BGP is much more difficult, but for iBGP I think 
              it makes sense and it could address some use cases for 
              network operators that have specific requirements. And it 
              could also allow network operators or network designers 
              to develop new extensions before they are discussed 
              within the IETF. And so you could get information, 
              prototypes, running code that could be discussed within 
              the IETF without having to change complex implementations 
              and to do a full implementation.
Jeff Haas:    Understood. I'll leave my comments with: I agree with you 
              that doing this is not hard. If you contain what I'd call 
              the blast radius of this problem to something that's 
              strictly internal, you're right that this is not too much 
              of a problem, but the blast radius is really the 
              discussion that's the difficult one for incremental 
              deployment. It's very common for features in development 
              to have unfortunate side effects, far away from where 
              they're at. I will cite the attribute 128 issue that many 
              people are familiar with as an example of a feature that 
              had two different versions that caused large network 
              outages. My suggestion is that perhaps part of the 
              conversation, if you're going to talk at least about BGP, 
              is whether or not BGP should be basically leveraged as a 
              side protocol for different purposes, separating the 
              inter-domain case from your local case. That's it. Thank 
              you.
Jared Mauch:  I think this is interesting, but I have a lot of concerns 
              here, specifically around deployability and usability 
              testing. If we were to do this, similar to what Jeff was 
              talking about with the attribute 128 issue, of which I'm 
              still trying to get the config out of our many network 
              devices, we have a variety of code bases that we still 
              run. I'm concerned about what happens when we deploy 
              these plugins and they have different results across 
              different vendors, and how that kind of bug ecosystem 
              interacts, because it's one thing to specify a protocol 
              and a method for transporting, but it's a whole other 
              thing to discuss what that operational impact is. I 
              think this is similar to what Jeff commented about the 
              internal use cases versus external use cases. We have a 
              lot of internal use cases where we transport BGP data and 
              signal that around our internal infrastructure. And those 
              use cases tend to not really align with what shows up in 
              the public Internet. A lot of what happens is people have 
              much more fine-grained signaling that they want to do for 
              geolocation than what you actually want to announce in 
              the routing table. As a result, I'm very concerned about 
              something like this and what happens to the operational 
              use case, when we have a lot better things. If we want to 
              signal geo feed data or something, there is the geo RFC 
              that was recently published. There was discussion last 
              week about how to actually go and potentially 
              authenticate and sign that data to provide more granular 
              information than what you actually want to publicly 
              expose in BGP. It's definitely an interesting idea but I 
              really fear the unintended side effects that this is 
              going to introduce and the instability that it'll create 
              as a result.
Olivier:      there are two different parts of the presentation and 
              there are two different elements. The first one is the 
              ability to consider a routing protocol as a kind of 
              microkernel operating system that exposes an API and that 
              allows you to extend it. This is generic, although we 
              implemented it in BGP because it was easy for us to 
              extend BGP and to do tests with BGP. We believe that it 
              can apply to any routing protocol, and that's why we 
              discussed with Jeff presenting it here, because this is 
              an idea that is generic and it would be applicable to any 
              routing protocol. I agree with you that doing that over 
              external BGP sessions and over the public internet is 
              something that would be dangerous, and that would need 
              much more care to be able to do. This is a research 
              prototype which is intended to show that you can extend a 
              routing protocol, that you can view the routing protocol 
              in a different way. And for the BGP extensions and for 
              the BGP use case, we are focused on internal BGP issues; 
              we don't consider external BGP as a possible solution 
              right now.
Rudiger Volk: I love this view and way of taking up the definition of 
              manipulation. When you add agility to complex systems you 
              introduce fragility, and you need to control it very 
              carefully. I see Jeff's points, however I note that 
              usually in industrial production systems we already have 
              some control points. In BGP, that's quite clear: the 
              policy definitions are available to operators. My 
              previous work on policy configurations did show me that 
              evolving the definitions of policy in a network is 
              something that exactly points to the control problem that 
              Jeff was mentioning. 
              BTW, this is about communities being used in Internet. I 
              see an opening with your approach to actually allow users 
              to do much nicer and better controlled stuff for extending 
              functionality far beyond what's been done so far with 
              policy language. Great power comes with great 
              responsibility, it should not be abused. Nevertheless the 
              agility is much needed.
Olivier:      let me try to answer some of your concerns. You mentioned 
              that operators do a lot with BGP communities to implement 
              their policies. Basically, BGP communities are an ad hoc 
              solution, so you have to play with route filters, with 
              access lists, with route maps and stuff like that, to be 
              able to implement the policies that operators want. On 
              the other side, by using xBGP what you have is a standard 
              programming language in our prototype, but this could be 
              another language. And when you have a standard 
              programming language it's also possible to use software 
              validation techniques where you can specify the 
              properties that the plugin has to support to be 
              acceptable to a router. So, as you see, we don't want a 
              plugin that runs forever, but we can have properties that 
              are defined formally, and we can design tools, based on 
              the progress in software verification, that will verify 
              that the plugin is behaving correctly according to the 
              appropriate properties of the underlying BGP. So that's 
              something that you can do if you have a much more 
              expressive language than the access lists and route maps 
              that we have today. And we can leverage lots of advances 
              in software verification in the last years to verify 
              that.
Eduard V:     Thanks for your presentation. I have one concern. By 
              itself it is a good idea, but unfortunately, in general 
              terms, worldwide multi-vendor interoperability is more 
              important, and it could not be traded off against 
              something else. The API should be exactly the same for 
              all implementations. On the previous slide, if you've 
              seen that prototype implementation, in one case you have 
              400 API calls and in the other 600 API calls; this is a 
              very typical alarm. A big alarm, because it looks like 
              you don't have a mature, stable API which you could 
              really demand from all vendors, and it means there will 
              be no multi-vendor interoperability. This particular 
              effort will 
              not be accepted by the market. Therefore, from my point 
              of view, you could mitigate if multi vendor review your 
              problem. If you will clearly specify and make it mandatory 
              some basic feature or basic API calls, but you shouldn't 
              be very rigid for API, which should be in a mandatory. 
              Then it's potentially possible to progress.
Olivier:      So just to answer your comments. We have a very simple 
              API, which is available on the website. The slide that 
              Jeff shows here shows the number of lines of code that we 
              had to change to be able to implement the API. So this is 
              not the number of API calls. We have about 10 different 
              functions in the API, so the API is very simple. The 
              slide shows only the number of lines of code that we had 
              to change in the BGP implementation to be able to support 
              the API, which means that in fact, the API was already 
              part of the interworking and build implementation and we 
              did not have to change that much to be able to support 
              the abstract API that I mentioned earlier with a 
              workflow.
Jeff T:       Thank you Olivier for the presentation. I'm looking 
              forward to hosting you anytime. Thank you everyone for 
              joining us tonight.

Some chat history:
Eduard V
ANIMA presentation is very good - easy to understand even for new 
people. Thanks to Toerless.

John Scudder
There are some liabilities with reliable transports, but the benefits 
usually vastly outweigh them.

Brian Carpenter for the code etc

Jeffrey Haas
audio not working
the effort was called karp

John Scudder
^ routing protocol security, that is

Jeffrey Haas
the issue with karp was bootstrapping often required getting components 
of ip working before you could get the job of security or routing done.

Joel Halpern
Arguably, the issue with karp was that folks did not see enough return 
(value) for the effort. So they stopped.

John Scudder
An interesting metric of protocol complexity. If we are to believe it, 
it's actually encouraging it only is increasing linearly.

Jeffrey Haas
karp had many issues. but it was helpful in highlighting the 
bootstrapping issues. :-)

Joel Halpern
I grant the bootstrap part was (and is) non-trivial. BRSKI has a lot of 
moving parts to get around that.

Jeffrey Haas
my point, joel.

Brian Carpenter
Yes. I waited so long for BRSKI that I added something called Quick And 
Dirty Security to the GRASP prototype.

Jeffrey Haas
Anyone care to start the betting pool for first outage from the plugins? 
RIPE had a record for a while on those for a bit

Jared Mauch
I had an unknown BGP attribute finder that I ran for awhile when there 
was the large round of leaks of them, around the timeframe that the 128 
attribute issue occurred. Getting code upgraded on devices always takes 
longer than expected.

Jeffrey Haas
new code, faster, more agile, is not the problem. not blowing up the 
internet is. :-)
that said, standardizing ways to interact with bgp pipelines is good.

Jared Mauch
I would be interested in improving the way the specs are written and 
how you do the capability signaling, etc.. vs bgp4+++++++++++++ but the 
path to transition would be difficult.

Tony Li
There are enough open source implementations so that if someone wants 
to blow up BGP, they don't need plugins.

Jeffrey Haas
The fun of plugins is that eventually you have same issue as things 
like linux kernel extensions: overlaps, ordering issues, etc.

John Scudder
quite. security through obscurity (or through gatekeepers) is a 
non-starter. Nonetheless

Jared Mauch
considering how we moved away from plugins in web browsers, etc.. i'm 
similarly concerned about going down that path in routing protocols.

Martin Horneffer
Tony: which work well enough for academia?

Jared Mauch
and that's before the security/attack profile

Tony Li
Martin: my sense of the bar for academia is very, very low.

Jeff Tantsura
within right boundaries some of this can be done, offloading best path 
to a custom algo was unthinkable just a few years ago