Minutes interim-2019-lsr-01: Thu 07:00
minutes-interim-2019-lsr-01-201905300700-01

Meeting Minutes Link State Routing (lsr) WG Snapshot
Date and time 2019-05-30 14:00
Title Minutes interim-2019-lsr-01: Thu 07:00
State Active
Last updated 2019-06-01

LSR WG Interim Meeting
May 30th, 2019
Chair: Acee Lindem
Secretary: Yingzhen Qu

Attendees:
Acee Lindem
Aijun Wang
Donald Eastlake
Henk Smit
Huaimo Chen
Jeff Tantsura
Ketan Talaulikar
Kiran Makhijani
Les Ginsberg
Lin Han
Mehmet Toy
Padma Esnault
Peter Psenak
Sarah Chen
Sriganesh Kini
Susan Hares
Tony Li
Tony Przygienda
Xufeng Liu
Yingzhen Qu
Zhenbin (Robin) Li

Chair:    Please be aware of IPR and the Note Well.


Starting from Tony's presentation.

Tony:   Arista has filed an IPR claim on dynamic flooding.

Tony:   FT bit from Huaimo's draft is added and it's straightforward. There's
        a minor question I have here for the chairs, about how to deal with
        IANA. I'm not seeing a registry for the link attributes bit.

Les:    you're talking about the ISIS link attribute bits? There is a
        registry and it's referenced in the draft.
Tony:   Okay, my apology.


Acee:   did you say you only wait for 10 ms? So, actually when you receive a
        new flooding topology, you only flood it on the new and old for a
        certain amount of time.
Tony:   yes. Let me be clear that's 10 milliseconds between adding temporary
        additions.
Huaimo: if there are hundreds of links, are you going to do temporary
        flooding on those links?
Tony:   If we have hundreds of links that have somehow fallen off the
        flooding topology and we have disconnected nodes, then yes, we're
        going to end up adding all hundred links. That would be a truly
        bizarre occurrence, because it should not happen unless those nodes
        are truly disconnected.
Acee:   slide 5 please. I read the draft yesterday, this level of detail is
        not in the draft.
Tony:   correct.
Huaimo: this algorithm is not in the draft?
Tony:   yes, not in the draft.
Acee:   hopefully it will converge unless there are some other problems in
        the IGP domain. We don't even need to standardize it, do we?
Tony:   we don't because as far as I can tell, there's no negotiation going on.
Acee:   it’s a local repair. it's good to have it for information but this
        could be modified based on experimentation.
Tony:   we’ve worked on it. Any other questions?
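
A minimal sketch of the pacing Tony describes above: temporary additions are
made one at a time, 10 milliseconds apart, and stop once the partition is
repaired. As noted in the discussion, this level of detail is not in the
draft, so everything here is an assumption for illustration: the function
name, the link labels, and the hypothetical is_repaired callback stand in for
whatever an implementation actually uses.

```python
def schedule_temporary_additions(candidate_links, is_repaired, interval_ms=10):
    """Enable temporary flooding on one candidate link per interval.

    candidate_links: ordered links that are not on the flooding topology.
    is_repaired: hypothetical callback; given the set of links enabled so
        far, returns True once the flooding topology is connected again.
    Returns a list of (offset_ms, link) pairs for the links that were added.
    """
    schedule = []
    enabled = set()
    for link in candidate_links:
        if is_repaired(enabled):
            break  # partition healed, stop adding links
        schedule.append((len(schedule) * interval_ms, link))
        enabled.add(link)
    return schedule


# Usage: the second link reconnects the hypothetical partition.
links = ["A-B", "A-C", "A-D"]
plan = schedule_temporary_additions(links, lambda enabled: "A-C" in enabled)
print(plan)  # [(0, 'A-B'), (10, 'A-C')]
```

If the nodes stay disconnected, the loop ends up adding every candidate link,
which matches the "all hundred links" case raised by Huaimo.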


Huaimo presenting.

* FT bit discussion
Acee:   Let's do the discussion now.
Tony:   there is another reason why we don’t want to do it. Consider the case
        where you've got parallel links between A and B. We may be flooding
        to each other on opposite links. If you use this mechanism, you're
        going to warn about an error, and there is no error.
Huaimo: In this case we can send a warning, then people or tools can analyze
        it.
Tony:   this generates a lot of noise. Parallel links are extremely common
        these days, especially in data center topologies.
Les:    I did respond to your email on this on the list with some reasons
        why this was not a good idea. When you get a chance, please reply to
        that. Second, Tony, I agree with you. The way the bit is defined in
        the draft, and I would hope the way that this would be used even if
        we agreed to do what Huaimo is suggesting, is that it would be edge
        oriented. In other words, you have to advertise the bit on all the
        parallel links. But how you evaluate the bit depends upon whether the
        edge is in the flooding topology or not. I think that's the only way
        it could work reasonably.
Huaimo: in cases there are no parallel links, should we do something?
Tony:   no need. If somebody wants to build a tool to look for this they can.
        You're advertising the information.
Peter:  it’s up to implementation, but no need to standardize anything.
Huaimo: so even if there might be problems, we're not going to take action to
        have a temporary fix?
Tony:   If there were real issues on the flooding topology, partition
        repair would have acted to actually repair the flooding topology.
        This adds another level of worrying about things that we already have
        a mechanism for.
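
The parallel-link concern in this discussion can be sketched as follows. This
is an illustration only, assuming a simplified model in which each node
advertises one boolean FT bit per link; the dictionary layout and function
names are hypothetical and not taken from the draft. A per-link comparison
flags both parallel links when A and B flood to each other on opposite links,
while an edge-oriented comparison of the kind Les describes sees no
disagreement.

```python
def per_link_mismatches(adverts):
    """adverts maps (advertiser, neighbor, link_id) -> FT bit (bool)."""
    links = {(min(a, b), max(a, b), link) for (a, b, link) in adverts}
    return sorted(
        (a, b, link)
        for (a, b, link) in links
        if adverts.get((a, b, link), False) != adverts.get((b, a, link), False)
    )


def per_edge_mismatches(adverts):
    """Compare whether each side marks ANY parallel link to the other."""
    on_edge = {}
    for (a, b, _link), bit in adverts.items():
        on_edge[(a, b)] = on_edge.get((a, b), False) or bit
    edges = {(min(a, b), max(a, b)) for (a, b) in on_edge}
    return sorted(
        (a, b)
        for (a, b) in edges
        if on_edge.get((a, b), False) != on_edge.get((b, a), False)
    )


# A floods to B on link 1, B floods to A on link 2 (parallel links).
adverts = {
    ("A", "B", 1): True,  ("B", "A", 1): False,
    ("A", "B", 2): False, ("B", "A", 2): True,
}
print(per_link_mismatches(adverts))  # [('A', 'B', 1), ('A', 'B', 2)]
print(per_edge_mismatches(adverts))  # []
```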


* FN bit discussion
Les:    this issue has nothing to do with dynamic flooding. If the WG decides
        to take it, it should be in a separate draft. Having said that, I'm
        not encouraging you to, for example, issue another draft to propose
        this because I don't think this is a good idea. As you've mentioned
        and as has been discussed on the list, this problem has already been
        addressed by implementations, without any protocol extension. I think
        there are some very significant issues associated with your solution.
        You're changing the state machine. You're trying to set up a
        negotiation based on partial information. I think there are a lot of
        problems with this solution, but I really would like to divorce it
        from the dynamic flooding discussion.
Huaimo: I think this is in the scope of flood reduction. When there are
        thousands of links, we only need to flood over one or two links.
Peter:  there are other issues for this problem, not just flooding. I agree
        with Les, I can confirm there are implementations that solve this
        problem. And there are other problems associated with bringing up a
        large number of adjacencies. It is completely unrelated to flooding.
Huaimo: we’re talking about bringing 1000 adjacencies up.
Peter:  if you solve this problem, then you wouldn't have that problem.
Tony:   If you solve it as a generic problem, then you bring up links a
        few at a time. And as you do that, partition repair within dynamic
        flooding would add some of those links to the flooding topology
        temporarily, creating flooding. Then you don’t have a problem.
Huaimo: this problem is from you.
Tony:   that was a better way to fix it.
Robin:  I’m confused. Tony mentioned some of it was included in the draft?
Tony:   we haven’t included the FN bit, just the FT bit.
Acee:   the scenario is for a thousand links?
Robin:  I agree that this can be solved by implementation, not protocol
        extension.

* Transfer
Les:    the draft has always had the ability to quickly and easily transition
        between enabling and disabling. I think what we didn't have in the
        draft was a very clear explanation of how this is done. We added some
        language in the latest version v2. I'd encourage you to review that.
        But the draft already has the necessary mechanisms so I don't think
        any of this is necessary.
Huaimo: in the draft, you mentioned centralized mode. After the flooding
        topology is flushed, every node transitions to normal flooding.
Les:    apologies for interrupting you but that's actually a point on your
        slides which is incorrect. The flooding topology is carried in LSPs
        or LSAs, and if those LSPs or LSAs get updated, then everybody's link
        state database gets updated in a relatively short period of time,
        we're talking a matter of seconds, and the flooding topology is then
        gone or updated. There is no separate aging of the flooding topology
        independent of the link state database. So point five is just not
        correct.
Huaimo: So the flooding topology is advertised by the leaders, so the leader
        needs to flush the LSAs when switching back to normal flooding.
Les:    that’s what the leader will do. The leader will withdraw the
        advertisement, whether that's purging an LSA or, in the case of ISIS,
        simply removing the appropriate TLVs from the LSP in which they were
        advertised, and they're gone.
Huaimo: so the leader will flush the flooding topology, and this is not wrong.
Les:    the leader will update the flooding topology as necessary. Again,
        this is described in the updated language in the draft. Again, to
        repeat, none of this is necessary. The existing TLVs or sub TLVs we
        have are sufficient. And we tried to make the procedures a bit clearer
        in the latest version of the draft. If the language needs to be
        improved, we're certainly open to improving the language, but none
        of this is needed.
Huaimo: for the centralized mode, the draft is ok. For distributed mode it
        is not, because we don’t have a way to tell each node to transfer to
        normal.
Les:    Actually we do. it’s in version 2.
Huaimo: how did you do that? because you need to transform to centralized
        mode. And then you withdraw the flood topology.
Les:    if I'm a leader, and I want to operate in distributed mode, but I want
        to disable the optimized flooding, the dynamic flooding, at this point
        I simply advertise algorithm zero and I don't advertise the
        flooding topology.
Huaimo: you only have two states. Central or distributed? You can use 0 for
        two things.
Les:    let’s take it offline. Again, I encourage you to read the updated
        draft; it does explain how this is done.
Huaimo: we need to fix it. Robert also mentioned it in the list.
Les:    I will try to clarify on the list, but we do have it.
Robin:  I think both parties agree this needs to be fixed, and Les agreed to
        clarify on the list. From my experience, we may have to do a protocol
        extension. Considering time, my suggestion is to clarify it on the
        list.
Acee:   Les, please put it up again on the list. There are different ways to
        do it. Is any algorithm number devoted to normal flooding?
Tony:   we didn’t think it’s needed because disabling dynamic flooding is
        terrible.
Acee:   we may want to disable it. So we need separate algorithms for central
        and distributed, then 0 for disable. Either way works.
Tony:   we have it covered.
Tony:   we have it covered.
Acee:   Let's move on to the next one.
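
The signalling Les describes in this exchange can be sketched as a small
decision function. This is one reading of his remarks, not text from the
draft: it assumes algorithm 0 together with an advertised flooding topology
means centralized mode, algorithm 0 with no flooding topology means normal
flooding, and any other algorithm number selects a distributed algorithm. The
function name and return strings are hypothetical.

```python
def flooding_mode(algorithm, topology_advertised):
    """Derive a node's flooding behavior from the area leader's advertisement.

    algorithm: algorithm number advertised by the area leader.
    topology_advertised: True if the leader also advertises a flooding
        topology in its LSPs/LSAs.
    """
    if algorithm != 0:
        # Non-zero: every node computes the flooding topology itself.
        return f"distributed (algorithm {algorithm})"
    if topology_advertised:
        return "centralized"
    # Algorithm 0 and nothing advertised: fall back to normal flooding.
    return "normal flooding"


print(flooding_mode(0, True))    # centralized
print(flooding_mode(0, False))   # normal flooding
print(flooding_mode(128, False)) # distributed (algorithm 128)
```

Huaimo's objection is visible in the sketch: algorithm 0 carries two meanings
that are told apart only by whether a flooding topology is present. Acee's
alternative would give centralized and distributed their own numbers and
reserve 0 for "disabled".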


* Area Leader Sub-TLV
Tony:   I'm sorry you're having trouble understanding it. But the point of
        the area leader sub TLV is very clear. It is there to carry the
        priority, and also to carry the algorithm for distributed mode. We
        should point out that the dynamic flooding sub TLV was intended so
        that nodes can indicate that they are participating in and capable of
        operating with dynamic flooding. And also we carry around the
        potential algorithms for distributed mode. All nodes in the topology,
        assuming the code's present, advertise the dynamic flooding sub TLV.
Huaimo: this is also in line with broadcast networks.
Tony:   That is exactly what we're modeling this after. Yes, every area leader
        candidate needs to advertise a priority. Some nodes, if they are short
        on RAM or short on CPU, may choose not to be the area leader; they
        would not advertise an area leader priority. It's important that they
        be able to do that. The right way to do that is to not advertise the
        area leader sub TLV.
Huaimo: you mean you have a way to advertise priority without area leader sub
        TLV now?
Tony:   the priority is in the area leader sub TLV. That is where it belongs.
        That is how a node indicates that it is willing to become area leader.
        The priority belongs in that TLV. If the node chooses to be area
        leader it has to advertise a priority.
Huaimo: that means every node will send leader sub TLV.
Tony:   selecting a distributed mode algorithm, and having them all not be
        elected except one is not a problem. That is the whole point.
Huaimo: either way can work, right?
Robin:  maybe for simplicity, maybe we should introduce the enhancement now.
Les:    this has been discussed on the list. You may want to have multiple
        area leader advertisements, so that if the current area leader fails,
        you don't have to go through a reconvergence cycle in order to elect
        a new area leader and get the flooding topology from the new area
        leader in centralized mode. So the idea that we only want one area
        leader sub TLV advertised leaves us very vulnerable.
Huaimo: The leader will be elected even though we have multiple area leader
        sub TLVs in the system. When the leader dies, a new leader will be
        elected.
Les:    that presents a significant convergence problem. And again, if you
        look at the latest version of the draft we've tried to clarify how that
        can be avoided. but it requires that there are multiple area leader
        sub TLVs that are always advertised. So apart from the fact that
        architecturally we have concerns about what you're proposing,
        operationally, it leaves us very vulnerable to a single point of
        failure, which we definitely don't want.
Huaimo: like DR operation in ISIS, we don't have any problem. Leader is
        elected dynamically. Let’s take it offline.
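
A minimal sketch of the election behavior discussed above: only nodes that
advertise the area leader sub TLV are candidates, the best candidate becomes
area leader, and the runner-up is already known if the leader fails. The
tie-break on the numerically highest system ID is an assumption made for this
illustration, as are the function name and the use of None for "sub TLV not
advertised".

```python
def rank_candidates(adverts):
    """Order area leader candidates, best first.

    adverts maps system ID -> advertised priority, or None when the node
    does not advertise the area leader sub TLV (and so is not a candidate).
    Ties are broken on the highest system ID (assumed rule).
    """
    candidates = [(prio, sid) for sid, prio in adverts.items() if prio is not None]
    return [sid for _prio, sid in sorted(candidates, reverse=True)]


adverts = {
    "0000.0000.0001": 100,
    "0000.0000.0002": 200,
    "0000.0000.0003": None,  # short on RAM or CPU, opts out
    "0000.0000.0004": 200,
}
ranking = rank_candidates(adverts)
print(ranking[0])  # 0000.0000.0004 (area leader)
print(ranking[1])  # 0000.0000.0002 (next-best candidate)
```

Because every candidate keeps advertising its sub TLV, the next-best candidate
can be read straight off the ranking when the leader disappears, which is the
property Les argues for.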


* Encoding
Tony:   In this example, have you accounted for the fact that the link between
        RN11 and RN31 is also part of the flooding topology?
Huaimo: Good question. In this case, RN11 is local node, the link between
        RN11 and RN31 is included in the topology.
Tony:   I'm still not clear. It seems to me you added an index for the link?
Huaimo: no index for link, we only have nodes. the link is represented by
        local node and remote node.
Tony:   If I understand how to encode things here, let's suppose that the
        links RN2 to RN31 and RN11 to RN31 are both part of the flooding
        topology. The way that I see you encoding this, you have RN1 as an
        adjacent node, it's going to be marked as external, and it's going
        to list RN31 as part of its adjacent nodes. Similarly RN11 is
        going to have RN31 as one of its adjacent nodes.
Huaimo: everything here is implied by the node, local node and the remote node.
Tony:   this implies to me that if you are bi-connected, that the index for
        the node has to appear twice in the block encoding.
Huaimo: no.
Tony:   If you don't do that, then how are you indicating which links to use?
Huaimo: yes, there are some duplications in some cases.
Acee:   it seems this encoding will be larger, as nodes will be listed as
        both local nodes and remote nodes. What’s the advantage?
Huaimo: slides 6-4. more efficient.
Donald: There's a problem with the nodes having to be listed multiple times
        because the links are all implicitly bidirectional.
Tony:   It's a space efficiency issue.
Donald: I think this is more compact.
Tony:   I disagree.
Acee:   you still use the indices?
Huaimo: yes.
Acee:   How can this be more compact when listing nodes multiple times? I
        didn't think much about ISIS, but in OSPF you could break it up so
        that each LSA has a part of it, and only change that. The other
        thing: let's get this into perspective. For something like an LSP or
        a router LSA, every node in the domain floods it, so it makes a
        bigger difference. As far as the flooding topology, it's only the
        area leader that's flooding it, so there's only one instance of it.
        So it's just a matter of compactness, and unless there's an order of
        magnitude difference I don't see that it's the most important.
        Correctness is the most important consideration. I don't see that for
        something that there's only one instance of. Let me ask this, do
        backup area leaders compute and flood it so it's ready to use right
        away?
Les:    that’s what we recommended in the latest version. Because in the
        event that the area leader fails, this allows you to transition to
        the new area leader much more quickly.
Acee:   that’s what we did for the network LSA in OSPF.
Huaimo: This is more efficient because of blocks.
Acee:   I don’t see why. This is actually more. The total size is more.
Huaimo: no. ..
Les:    Acee, I’d like to reinforce your point. I think the primary concern
        here is correctness, because there's only going to be a small
        number of copies of the flooding topology. What we've recommended in
        the draft is for the area leader to advertise it and the second best
        candidate to advertise it. Even if the final conclusion is that this
        encoding saves some number of bytes, the total value this adds
        when you look at the full size of the link state database is very
        modest. So to me, correctness is the dominant concern here.
Huaimo: The correctness is equal, also the complexity.
Aijun:  based on the block information, we can easily recover the flooding
        topology, but not with path info.
Tony:   paths are links in the topology.
Aijun:  block encoding is more structured.
Robin:  if we don’t use this enhancement, is there any critical issue?
Huaimo: no critical issue. This is for improvement. it’s to reduce flooding.
Tony:   It's true that we're trying to be reasonably space efficient. But as
        we've said many times, we are trying not to make things so complicated
        that things become fragile. If we were really trying to ultimately
        make everything efficient, we could actually use compression
        algorithms and run them on top of our LSPs before we flood them.
        Setting aside the patent issues, there's a question: has everybody
        got the compression algorithm compressing correctly? We try not to
        do that.
        Again, correctness is more important than efficiency.
Huaimo: regarding correctness, the methods are equal.
Robin:  to simplify the discussions, we may not want to have too many options.
        Second, if there is no critical issue, this can be for future
        discussion.
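
The two representations debated in this section can be compared with a small
sketch. It is a simplification for illustration only: nodes are plain integer
indices, a path is a list of indices whose consecutive pairs are links (as in
Tony's remark that paths are links in the topology), and a block is a local
node followed by its remote nodes (as in Huaimo's description). Neither
function reflects the actual byte layout of either proposal, and the function
names are hypothetical.

```python
def links_from_paths(paths):
    """Each consecutive pair of indices on a path is one undirected link."""
    links = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            links.add(frozenset((a, b)))
    return links


def links_from_blocks(blocks):
    """blocks maps a local node index to the remote node indices it lists."""
    return {
        frozenset((local, remote))
        for local, remotes in blocks.items()
        for remote in remotes
    }


# The same four-link flooding topology written both ways.
paths = [[0, 1, 2, 3], [1, 3]]
blocks = {0: [1], 1: [2, 3], 2: [3]}
print(links_from_paths(paths) == links_from_blocks(blocks))  # True
```

Both forms decode to the same set of links here, which is consistent with the
view in the discussion that the choice is about size and clarity rather than
correctness.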


* Backup paths
Acee:   is this local repair?
Huaimo: the iteration is local, but computing the path is global. Because
        there is a split, the database may be out of sync among some nodes.
        We add some links to make the database resync, then we converge one
        step further. But for rate limiting, those flooding topologies are
        calculated by the leaders, and it may take a long time.
Tony:   that’s incorrect. Both mechanisms need full topology information
        for repair.
Huaimo: For the backup path, we don't depend on the flood topology computed
        by the leader. as soon as we calculate backup paths, we enable them.
        For rate limiting, each node needs to check whether this is a link to
        the remote site through flooding topology computed by the leader.
Tony:   I disagree. The correct thing to do here, regardless of which
        mechanism you use to determine which temporary links to flood on, is
        to notice that as soon as you have repaired the partition, you're
        going to get new LSP information. As soon as that happens, and
        assuming centralized mode just so we're on the same page, the area
        leader is going to have to recompute the flooding topology in both
        situations.
Huaimo: no. the area leader will compute the flooding topology.
Tony:   rate limiting checks for a change, then decides that it has to
        reevaluate. At that point, it is going to see the new flooding
        topology, and proceed differently. It could conceivably add more
        links while it's waiting for the topology. But that's largely
        irrelevant because once it gets the flooding topology it's going to
        disable it.
Huaimo: we need to check based on flooding topology.
Tony:   rate limiting acts on topology change, not flooding topology.
Huaimo: so you will need to check whether a link is part of the flooding
        topology.
Tony:   after you've done a successful repair, the flooding topology is going
        to change. We're discussing the arrival of the flooding topology
        information, and some other events.
Huaimo: so depending on the flooding topology change, you iterate further,
        right?
Tony:   if necessary.
Huaimo: yes, that's the difference. The backup path is not depending on the
        flooding topology change.
Tony:   it still has to look at the flooding topology to determine if there is
        a partition, it’s completely dependent.
Robin:  is there a case where there is no backup path in some topology?
Huaimo: as long as the topology is connected, we will have a unique backup
        path. If the topology is physically split, there is no way we have a
        backup path.

Robin:  from my experience, partition is a real problem. And it’s better to
        use a backup path to fix the problem.
Acee:   how does a node know there is a partition before the area leader
        computes? You don’t know where the partition is going to be. How do
        you calculate repair paths? Are you saying you compute for every node
        you're connected to?
Huaimo: This is on demand. As soon as there is a failure, we assume there is
        a partition and compute the backup path.
Acee:   any reason this enhancement couldn't go on a 2nd draft?
Tony:   we’re trying to have one draft.
Acee:   but this is something extra.
Tony:   We need one consistent algorithm for the domain to act on for
        partition repair.
Acee:   independent of this, centralized or distributed, you will have a new
        flooding topology whether or not you try to do this backup. So the
        question is does this do anything faster? How does a guy in the middle
        of the flooding topology know that there is a partition? I'm saying
        let's just say you're doing distributed because it's easier to see the
        analogy. If you're doing this on-demand backup path, you might as
        well just compute a new flooding topology. Because everybody's going
        to converge to a new topology sooner rather than trying to do a repair
        with the existing one.
Les:    Acee, I think that's the catch-22 here. If the flooding
        topology is partitioned, you don’t know what you don’t know. You can
        only detect locally, like your neighbor is not on the flooding
        topology.
Acee:   then why is this better than temporary flooding?
Huaimo: with backup paths, it converges faster, it also uses the minimum
        number of links, and the algorithm is simple.
Acee:   are you tunneling?
Huaimo: no tunnel, because every node using the same algorithm will come up
        with the same backup path.
Sarah:  everyone has the same algorithm but not the same DB, so the backup
        paths might be different.
Huaimo: we will come up with the same unique backup path regardless of the
        database and the partition. It's guaranteed.
Acee:   I don’t think it’s simple. Nodes on the backup path will need to know.
        It’s squared.
Huaimo: No. Every node computes the backup path from A to B. ...
Acee:   Let’s take it offline. I will think more about it.
Sarah:  you seem to enable more links for temporary flooding.
Huaimo: no. The minimum number of links is used.
Sarah:  in some cases it may enable fewer links, but not in all cases.
Huaimo: no.
Tony:   you don’t know that yet, because you don't have critical information
        about everything north of the partition. You only have the database
        from before the partition. Your calculation is not correct.
Huaimo: Iteration is different. This one is more local.
Acee:   on this one, I don’t think we have convergence. I don’t see the
        advantage of this because you don’t really know the whole topology.
        If you have multiple failures, you don't know where the failures
        are. Let’s take it offline. The rate limiting is already in the
        draft.
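
To make the disagreement about "the same algorithm" concrete, here is a
minimal sketch of a deterministic path computation. It is not Huaimo's
algorithm, which the minutes do not describe; it assumes a plain
breadth-first search with neighbors visited in sorted order, and the node
names and adjacency dictionaries are hypothetical. Two nodes running it on
identical databases get identical paths; what happens when the databases
differ is the open question raised by Sarah and Tony.

```python
from collections import deque


def backup_path(adjacency, src, dst):
    """Deterministic shortest path by hop count, or None if unreachable.

    adjacency maps a node to its neighbors as seen in the local database.
    Neighbors are visited in sorted order so ties break the same way
    everywhere.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in sorted(adjacency.get(node, ())):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None


db = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
# A second database that has not learned about the B-D link.
other_db = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}
print(backup_path(db, "A", "D"))        # ['A', 'B', 'D']
print(backup_path(other_db, "A", "D"))  # ['A', 'C', 'D']
```

When the topology is physically split the function returns None, matching
Huaimo's remark that no backup path exists in that case.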


Acee’s Somewhat Biased Summary of today's Dynamic Flooding Discussion


    1. FT Consistency Check - This has been incorporated into the Dynamic
       Flooding draft but in the LSP/LSA encodings rather than hellos.

       Consensus of the room was that we should not prescribe the behavior
       when there is a disagreement on flooding topology. The check doesn't
       work when there are parallel links given that while flooding
       topology inclusion is bi-directional, nodes may use different links. 


    2. Flooding Negotiation Bit - this is meant to solve the problem of a
       node entering the domain that has many links. 

       This problem is really a generic problem independent of dynamic
       flooding. Additionally, the proposed solution negotiates flooding
       when it is really a problem of the node with all the links entering
       the IGP domain. Also, it only takes effect after the adjacencies
       are formed, which doesn't address a large part of the problem.

    3. Transfer between Flooding Reduction and Normal Flooding

       This is handled in the current draft. We can verify on the list
       that this mechanism is sufficient. Les to start an Email thread.
 
    4. Enhancement related to Area Leader Sub-TLV

       It seems that either encoding would work. However, I can't
       see that the proposed change offers any advantage. The existing
       encoding consolidates area-leader information in a single
       TLV (priority and proposed algorithm) which, at least in my
       opinion, seems to be a minor advantage. In any case, it isn't
       worth a protracted debate.


    5. Enhancement on Flooding Topology Encoding 

       I don't believe we concluded this discussion and this needs to be
       continued on the list along with some examples. I think there was
       general agreement that compactness is not the primary
       consideration here given the flooding topology is only flooded by
       the area leader and, potentially, one or more backups.
       Consequently, I view the consensus as not pursuing the compressed
       path encoding. Between the path and block, we should continue the
       discussion on the list with emphasis on clarity, consumption, and
       dynamics rather than initial byte size. A potential outcome could
       be that it doesn't really make that much difference as the two
       are roughly equivalent.
     
    6. Flooding Topology Backup Paths

      This one is somewhat of a misnomer. A heuristic was presented at the
      interim for handling partitioned flooding topologies. Huaimo
      presented another heuristic to solve this problem based on
      computation of a backup path to a router adjacent to the partition
      breakage. I believe we need to continue this discussion in
      terms of the following:

         - Behavior of routers adjacent to the partitioning
         - Behavior of other routers in the domain
         - Information required for the heuristic to fix the partition

      Both Huaimo and the dynamic flooding draft authors have thought
      about the problem infinitely more than myself. 


Thanks to everyone for attending. Special thanks to Tony and Huaimo
for leading this discussion.