LSR Working Group                                                A. Wang
Internet-Draft                                             China Telecom
Intended status: Standards Track                               G. Mishra
Expires: September 27, 2021                                 Verizon Inc.
                                                                   Z. Hu
                                                                 Y. Xiao
                                                     Huawei Technologies
                                                          March 26, 2021


                    Prefix Unreachable Announcement
            draft-wang-lsr-prefix-unreachable-annoucement-06

Abstract

   This document describes a mechanism to solve an existing issue with
   Longest Prefix Match (LPM) that arises when an operator domain is
   divided into multiple areas or levels and summarization is used.
   The issue is a fail-over problem: when a link or node down event
   occurs, an LPM component prefix is omitted from the FIB, resulting
   in black hole routing and connectivity loss.  This draft introduces
   a new control plane convergence signaling mechanism using a negative
   prefix called Prefix Unreachable Announcement (PUA), which is used
   to detect a link or node down event and to signal the RIB that the
   event has occurred, forcing immediate control plane convergence.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 27, 2021.








Wang, et al.           Expires September 27, 2021               [Page 1]


Internet-Draft                     PUA                        March 2021


Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions used in this document . . . . . . . . . . . . . .   3
   3.  Scenario Description  . . . . . . . . . . . . . . . . . . . .   3
     3.1.  Inter-Area Node Failure Scenario  . . . . . . . . . . . .   4
     3.2.  Inter-Area Links Failure Scenario . . . . . . . . . . . .   4
   4.  PUA (Prefix Unreachable Advertisement) Procedures . . . . . .   5
   5.  MPLS and SRv6 LPM based BGP Next-hop Failure Application  . .   5
   6.  Implementation Consideration  . . . . . . . . . . . . . . . .   6
   7.  Deployment Considerations . . . . . . . . . . . . . . . . . .   7
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   10. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . .   7
   11. Normative References  . . . . . . . . . . . . . . . . . . . .   8
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   9

1.  Introduction

   A critical requirement in operator network design is to limit the
   Shortest Path First (SPF) churn that occurs within a single OSPF area
   or ISIS level.  This is accomplished by sub-dividing the IGP domain
   into multiple areas, so that intra-area prefixes are flooded only
   within their own area rather than domain wide.

   OSPF and ISIS each have a default route and summary route mechanism,
   which is performed on the OSPF area border router (ABR) or ISIS
   L1-L2 node.  The OSPF summary route is advertised conditionally,
   when at least one component prefix exists within the non-zero area.
   An ISIS L1-L2 node likewise generates a summary prefix into the
   Level-2 backbone for Level-1 area prefixes, which is advertised
   conditionally when at least one component prefix exists within the
   Level-1 area.  An ISIS L1-L2 node with the attach bit set also
   generates a default route into each Level-1 area, along with the
   summary prefixes generated for other Level-1 areas.
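
   The conditional origination rule above can be sketched as follows.
   This is a minimal illustration in Python, not an implementation of
   either protocol.  The function name and the example addresses are
   assumptions made for this sketch.  It shows that the summary stays
   advertised as long as any one component prefix is still present.

```python
import ipaddress


def should_advertise_summary(summary, component_prefixes):
    """Return True when at least one component prefix lies inside the summary."""
    summary_net = ipaddress.ip_network(summary)
    for prefix in component_prefixes:
        net = ipaddress.ip_network(prefix)
        # subnet_of() requires both networks to be the same IP version.
        if net.version == summary_net.version and net.subnet_of(summary_net):
            return True
    return False


# Hypothetical area prefixes: two PE loopbacks covered by one summary.
print(should_advertise_summary("10.2.0.0/16", ["10.2.0.2/32", "10.2.0.4/32"]))
# One loopback has disappeared, yet a component prefix still exists.
print(should_advertise_summary("10.2.0.0/16", ["10.2.0.4/32"]))
# No component prefix is left inside the summary range.
print(should_advertise_summary("10.2.0.0/16", []))
```

   Because the check only asks whether some component prefix exists,
   the loss of a single component prefix does not change what the
   border node advertises.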

   Operators have historically relied on an MPLS architecture in which
   FEC binding is based on an exact match of a host route within a
   single area.  The LDP inter-area extension [RFC5283] adds the ability
   to use LPM, so the RIB match can be a summary match rather than an
   exact match of the egress PE host route when an inter-area LSP is
   instantiated.  The SRv6 routing framework uses the IPv6 data plane
   with standard IGP LPM.  When operators migrate from MPLS LSPs
   bootstrapped by host route FEC binding to the SRv6 routing framework,
   IGP LPM comes into play together with summarization.  When a link or
   node event affects a component prefix within the summary range,
   traffic is still forwarded toward the summary, resulting in black
   hole routing.

   The motivation behind this draft applies to both MPLS LPM FEC
   binding and an SRv6 BGP service overlay that uses the traditional
   unicast routing (uRIB) LPM forwarding plane, where the IGP domain has
   been divided into OSPF areas or ISIS levels and summarization is
   used.  In this scenario, where multiple ABRs exist, traffic is black
   holed when the area is partitioned or when other link or node
   failures cause a component host route to go missing within the
   summary range.  The summaries of inter-area routes propagated into
   the backbone area for flood reduction are made up of component
   prefixes.  It is these component prefixes that the PUA tracks, to
   ensure that traffic is not black holed due to a PE or ABR failure.
   The PUA mechanism ensures immediate control plane convergence, with
   ABR or PE node switchover, when an area is partitioned or an ABR has
   services down.

2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Scenario Description

   Figure 1 illustrates the topology when OSPF or ISIS is running in a
   multi-area or multi-level domain.  R0-R4 are routers in the backbone
   area.  S1-S4 and T1-T4 are internal routers in area 1 and area 2
   respectively.  R1 and R3 are area border routers (or ISIS Level 1-2
   border nodes) between area 0 and area 1.  R2 and R4 are area border
   routers between area 0 and area 2.

   S1/S4 and T2/T4 are PEs that peer with customer CEs for overlay
   VPNs.  Ps1/Ps4 are the loopback0 addresses of S1/S4, and Pt2/Pt4 are
   the loopback0 addresses of T2/T4.

     +---------------------+------+--------+-----+--------------+
     | +--+        +--+   ++-+   ++-+    +-++   + -+        +--+|
     | |S1+--------+S2+---+R1+---|R0+----+R2+---+T1+--------+T2||
     | +-++Ps1     +-++   ++-+   +--+    +-++   ++++    Pt2 +-++|
     |   |           |     |               |     ||           | |
     |   |           |     |               |     ||           | |
     | +-++Ps4     +-++   ++-+           +-++   ++++     Pt4+-++|
     | |S4+--------+S3+---+R3+-----------+R4+---+T3+--------+T4||
     | +--+        +--+   ++-+           +-++   ++-+        +--+|
     |                     |               |                    |
     |                     |               |                    |
     |         Area 1      |     Area 0    |      Area 2        |
     +---------------------+---------------+--------------------+

    Figure 1: OSPF Inter-Area Prefix Unreachable Announcement Scenario

3.1.  Inter-Area Node Failure Scenario

   If the area border routers R2/R4 perform summarization, one summary
   address that covers the prefixes of area 2 is announced to area 0 and
   area 1 instead of the detailed addresses.  When node T2 goes down,
   the BGP next hop Pt2 becomes unreachable, while the LPM summary
   prefix continues to be advertised into the backbone area.  Except
   for the border routers R2/R4, the routers within area 0 and area 1
   do not know that the BGP next hop prefix Pt2 is unreachable.
   Traffic destined to Pt2 continues to be forwarded on the LPM match
   of the summary and is dropped on the ABR or Level 1-2 border node,
   resulting in black hole routing and connectivity loss.  For a
   customer overlay VPN dual-homed to both S1/S4 and T2/T4, traffic
   cannot fail over to the alternate egress PE T4 (BGP next hop Pt4)
   because of the summarization.
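
   The effect described above can be reproduced with a small longest
   prefix match lookup.  The sketch below is illustrative only: the
   addresses assigned to Pt2 and Pt4, the summary 10.2.0.0/16, and the
   function name lpm_lookup are assumptions that do not appear in this
   draft.  It models the RIB of S1, which holds only the area 2
   summary both before and after T2 fails.

```python
import ipaddress


def lpm_lookup(rib, destination):
    """Return the (prefix, next_hop) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    if best is None:
        return None
    return (str(best[0]), best[1])


PT2 = "10.2.0.2"  # hypothetical loopback0 address of T2
PT4 = "10.2.0.4"  # hypothetical loopback0 address of T4

# RIB of S1 in area 1.  R2/R4 keep advertising the summary after T2
# fails, so this table is identical before and after the failure.
rib_s1 = {"10.2.0.0/16": "R1"}

print(lpm_lookup(rib_s1, PT2))  # still resolves through the summary
print(lpm_lookup(rib_s1, PT4))
```

   Since the lookup for Pt2 keeps succeeding, nothing on S1 indicates
   that the BGP next hop has failed, which is the gap the PUA is meant
   to close.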

3.2.  Inter-Area Links Failure Scenario

   In a link failure scenario, if the links T1-T2 and T1-T3 are both
   down, R2 is not able to reach node T2.  However, because R2 and R4
   both announce the summary, and the summary address covers the BGP
   next hop prefix Pt2, other nodes in area 0 and area 1 still send
   traffic destined to Pt2 via the border router R2, which black holes
   the traffic.

   In such a situation, the border router R2 should notify the other
   routers that it cannot reach the prefix Pt2, and the other ABR (R4)
   that can still reach Pt2 should advertise a specific route to Pt2.
   The internal routers will then select R4 as the bypass router to
   reach prefix Pt2.
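
   A short sketch of why the specific route steers traffic to R4 is
   given below.  The route list, the addresses, and the function name
   select_route are assumptions made for illustration.  With only the
   two summaries present, either ABR may be selected.  Once R4 adds a
   host route for Pt2, longest prefix match prefers it.

```python
import ipaddress


def select_route(routes, destination):
    """Pick the route whose prefix is the longest match for destination."""
    addr = ipaddress.ip_address(destination)
    matches = [r for r in routes if addr in ipaddress.ip_network(r["prefix"])]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)


PT2 = "10.2.0.2"  # hypothetical loopback0 address of T2

routes = [
    {"prefix": "10.2.0.0/16", "abr": "R2"},  # summary from R2
    {"prefix": "10.2.0.0/16", "abr": "R4"},  # summary from R4
    {"prefix": "10.2.0.2/32", "abr": "R4"},  # specific route added by R4
]

print(select_route(routes, PT2)["abr"])
```

   Other destinations in area 2 still match only the summaries, so
   the specific route affects traffic to Pt2 alone.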

4.  PUA (Prefix Unreachable Advertisement) Procedures

   [RFC7794] and [I-D.ietf-lsr-ospf-prefix-originator] each define a
   sub-TLV that announces the originator of a prefix advertised from a
   given node.  This draft uses that sub-TLV, for ISIS and OSPF
   respectively, to signal the negative prefix in the PUA when a link
   or node goes down.

   When an ABR detects that a link or node is down, it floods the PUA
   negative prefix advertisement along with the summary advertisement,
   according to the prefix-originator specification.  The ABR or ISIS
   L1-L2 border node is responsible for adding the prefix originator
   information when it receives the Router LSA from other routers in
   the same area or level.

   When the ABR or ISIS L1-L2 border node generates the summary
   advertisement based on component prefixes, it announces a new
   summary LSA or LSP that includes the information about the down
   prefix, with the prefix originator set to NULL.  The number of PUAs
   equals the number of links or nodes that are down.  The LSA or LSP
   is propagated using standard flooding procedures.
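
   The ABR behavior above might be modeled as in the following sketch.
   It is not a wire encoding.  The PrefixAdvertisement type, the
   build_advertisements function, the use of None to stand for the
   NULL prefix originator, and the choice to record the ABR itself as
   originator of the summary are all assumptions made for
   illustration.

```python
from dataclasses import dataclass
from typing import List, Optional
import ipaddress


@dataclass(frozen=True)
class PrefixAdvertisement:
    prefix: str
    originator: Optional[str]  # None models the NULL prefix originator

    @property
    def is_pua(self) -> bool:
        return self.originator is None


def build_advertisements(
    summary: str, abr_id: str, down_prefixes: List[str]
) -> List[PrefixAdvertisement]:
    """Return the summary plus one PUA per down prefix inside the summary."""
    summary_net = ipaddress.ip_network(summary)
    adverts = [PrefixAdvertisement(summary, abr_id)]
    for prefix in down_prefixes:
        net = ipaddress.ip_network(prefix)
        # Prefixes outside the summary range are not announced as PUAs here.
        if net.version == summary_net.version and net.subnet_of(summary_net):
            adverts.append(PrefixAdvertisement(prefix, None))
    return adverts


for advert in build_advertisements("10.2.0.0/16", "R2", ["10.2.0.2/32"]):
    print(advert.prefix, "PUA" if advert.is_pua else "summary")
```

   Keeping the PUA as an ordinary prefix advertisement with a
   distinguished originator value is what allows it to follow the
   standard flooding procedures.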

   If the nodes in an area receive the PUA flood from all of their
   ABRs, they start the BGP convergence process if a BGP session exists
   on the PUA prefix.  The PUA forces a fail-over that initiates an
   immediate control plane switchover to the alternate egress PE.
   Without the convergence forced by the PUA, traffic to the down
   prefix is black holed, resulting in loss of connectivity.
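
   The receiver side rule, that convergence starts only after every
   ABR of the area has announced the prefix as unreachable, can be
   sketched as below.  The PuaTracker class, its method names, and the
   bgp_next_hops set are hypothetical and serve only to illustrate the
   condition.  The sketch returns a flag instead of driving a real BGP
   process.

```python
class PuaTracker:
    """Tracks which ABRs have announced a prefix as unreachable."""

    def __init__(self, abrs, bgp_next_hops):
        self.abrs = frozenset(abrs)
        self.bgp_next_hops = frozenset(bgp_next_hops)
        self.reported = {}  # prefix -> set of ABRs that sent a PUA for it

    def receive_pua(self, prefix, abr):
        """Record a PUA; return True if BGP convergence should be started."""
        if abr not in self.abrs:
            return False  # ignore PUAs from routers that are not our ABRs
        self.reported.setdefault(prefix, set()).add(abr)
        all_abrs_reported = self.reported[prefix] == self.abrs
        # Only prefixes used as a BGP next hop trigger convergence.
        return all_abrs_reported and prefix in self.bgp_next_hops


# S1 in area 1 has ABRs R1 and R3; 10.2.0.2/32 is a hypothetical Pt2.
tracker = PuaTracker(abrs={"R1", "R3"}, bgp_next_hops={"10.2.0.2/32"})
print(tracker.receive_pua("10.2.0.2/32", "R1"))
print(tracker.receive_pua("10.2.0.2/32", "R3"))
```

   Waiting for all ABRs distinguishes a node failure from the partial
   reachability case, in which at least one ABR can still reach the
   prefix.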

   When only some of the ABRs cannot reach the failed node or link, as
   described in Section 3.2, an ABR that can still reach the PUA prefix
   should advertise a specific route to it.  The internal routers
   within other areas can then reach the PUA prefix while bypassing the
   ABRs that cannot reach it.

5.  MPLS and SRv6 LPM based BGP Next-hop Failure Application

   In an MPLS or SR-MPLS service provider core, scalability has been a
   concern for operators, who have split the IGP domain into multiple
   areas to avoid SPF churn.  Normally, MPLS FEC binding for LSP
   instantiation is based on an exact match of the egress PE Loopback0
   host route.  The LDP inter-area extension [RFC5283] adds the ability
   to use LPM, so the RIB match can be a summary match rather than an
   exact match of the egress PE host route when an inter-area LSP is
   instantiated.  The caveat that has prevented operators from using
   the [RFC5283] LDP inter-area extension is that the component
   prefixes are hidden in the summary prefix, and thus visibility of
   the BGP next-hop attribute is lost.

   When a PE is down and the [RFC5283] LDP inter-area extension LPM
   summary is used to build the inter-area LSP, the LSP remains
   partially established and traffic is black holed on the ABR
   performing the summarization.  This major gap in the [RFC5283]
   inter-area extension forces operators into the workaround of
   flooding the BGP next hops domain wide.  In a small network this is
   acceptable.  However, in a network with thousands of PEs and many
   areas, domain wide flooding is costly in terms of memory consumption
   and the computation required for RIB, FIB and LFIB label binding
   control plane state.  The ramifications of domain wide flooding of
   host routes are described in detail in Section 1.2 (Scalability) of
   [RFC5302].  Because SRv6 uses LPM, the same problem exists with SRv6
   when the IGP domain is divided into areas and summarization is
   used.

   With PUA, the negative prefix component is flooded across the
   backbone to the other areas along with the summary prefix, and is
   immediately programmed into the RIB control plane.  An MPLS LSP
   exact match or an SRv6 LPM match over the fail-over path can then be
   established to the alternate egress PE.  No disruption in traffic or
   loss of connectivity results from PUA.  Further optimizations such as
   LFA and BFD can be applied to make the data plane convergence
   hitless.  The PUA solution applies to MPLS or SR-MPLS where the LDP
   inter-area extension is used for LPM aggregate FEC, as well as to
   SRv6, where the IPv6 control plane performs an LPM match against the
   summary covering the BGP next hop.

6.  Implementation Consideration

   To balance the announcement of reachable information against the
   announcement of unreachable information, an implementation of this
   mechanism should provide a configurable threshold,
   MAX_Address_Announcement (MAA).  The ABR should then decide which
   prefixes to announce as follows:

   1.  If the number of unreachable prefixes is less than MAA, the ABR
   should advertise the summary address and the PUA.

   2.  If the number of reachable prefixes is less than MAA, the ABR
   should advertise only the detailed reachable prefixes.

   3.  If the number of reachable prefixes and the number of
   unreachable prefixes both exceed MAA, the ABR should advertise the
   summary address with MAX metric.
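
   The three rules above can be read as a simple decision function, as
   sketched below.  Two points are assumptions of this sketch rather
   than statements of the draft: the rules are evaluated in the order
   listed, so rule 1 wins when both counts are below MAA, and a count
   exactly equal to MAA is treated like one that exceeds it.  The
   function name and the returned labels are hypothetical.

```python
def abr_announcement(num_reachable, num_unreachable, maa):
    """Choose what the ABR announces for one summary range."""
    if num_unreachable < maa:
        # Rule 1: few failures, so keep the summary and add PUAs.
        return "summary_and_pua"
    if num_reachable < maa:
        # Rule 2: few survivors, so list the reachable prefixes in detail.
        return "detailed_reachable_only"
    # Rule 3: both lists are long, so fall back to the summary with
    # MAX metric.
    return "summary_with_max_metric"


print(abr_announcement(num_reachable=100, num_unreachable=2, maa=10))
print(abr_announcement(num_reachable=3, num_unreachable=50, maa=10))
print(abr_announcement(num_reachable=50, num_unreachable=50, maa=10))
```

   In each case the ABR picks the option that needs the fewest
   advertisements, which is the stated purpose of the threshold.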



7.  Deployment Considerations

   To support PUA advertisement, the ABRs should be upgraded according
   to the procedures described in Section 4.  The PEs that need to
   perform the BGP switchover described in Section 3.1 and Section 5
   should also be upgraded to act upon receipt of the PUA message.
   Other nodes within the network can ignore the PUA message if they do
   not need it or do not support it.

   As described in Section 4, the ABR advertises the PUA message once
   it detects that a link or node is down within the summary address.
   To reduce unnecessary advertisement of PUA messages, the ABRs should
   support the configuration of protected prefixes.  Based on this
   configuration, the ABR advertises the PUA message only when a
   protected prefix within the summary address (for example, the
   loopback address of a PE that runs BGP) is missing.
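
   The protected prefix filter could look like the sketch below.  The
   function name, its parameters, and the example prefixes are
   assumptions for illustration.  How protected prefixes are actually
   configured is left to the implementation.

```python
import ipaddress


def prefixes_needing_pua(summary, protected_prefixes, reachable_prefixes):
    """Return protected prefixes inside the summary that are missing."""
    summary_net = ipaddress.ip_network(summary)
    reachable = {ipaddress.ip_network(p) for p in reachable_prefixes}
    missing = []
    for prefix in protected_prefixes:
        net = ipaddress.ip_network(prefix)
        in_summary = (
            net.version == summary_net.version and net.subnet_of(summary_net)
        )
        if in_summary and net not in reachable:
            missing.append(str(net))
    return missing


# Hypothetical configuration: only the two PE loopbacks are protected.
protected = ["10.2.0.2/32", "10.2.0.4/32"]
# The T2 loopback and an unprotected link prefix have both disappeared;
# only the protected one is reported.
still_reachable = ["10.2.0.4/32", "10.2.1.0/24"]
print(prefixes_needing_pua("10.2.0.0/16", protected, still_reachable))
```

   Unprotected prefixes never appear in the result, so their loss
   generates no PUA.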

   The advertisement of a PUA message should last only for a
   configurable period, long enough for the services that run on the
   failed prefixes to converge or switch over.  If a prefix is already
   missing before the PUA mechanism takes effect, the ABR will not
   declare its absence via the PUA mechanism.
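
   A limited advertisement lifetime can be modeled with an expiry
   time per PUA, as in the sketch below.  The PuaTable class, the
   hold_time parameter, and the choice not to extend the lifetime when
   the same prefix is announced again are assumptions of this sketch.
   Time is passed in explicitly so that the example needs no real
   clock.

```python
class PuaTable:
    """Holds the PUAs an ABR is currently advertising, with expiry times."""

    def __init__(self, hold_time):
        self.hold_time = hold_time  # configurable period, in seconds
        self.expiry = {}            # prefix -> time at which the PUA stops

    def announce(self, prefix, now):
        # Assumption: announcing an already active PUA again does not
        # extend its lifetime.
        self.expiry.setdefault(prefix, now + self.hold_time)

    def active(self, now):
        """Drop expired PUAs and return the ones still advertised."""
        self.expiry = {p: t for p, t in self.expiry.items() if t > now}
        return sorted(self.expiry)


table = PuaTable(hold_time=60)
table.announce("10.2.0.2/32", now=0)
print(table.active(now=30))  # within the configured period
print(table.active(now=60))  # the period has elapsed
```

   After expiry only the summary remains, so the hold time should be
   chosen to exceed the time the affected services need to converge.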

8.  Security Considerations

   The advertisement of PUA information follows the same procedure as
   a traditional LSA.  The action based on the PUA is clearly defined
   in this document for the ABR or Level 1-2 router and for the
   receivers that run BGP.

   There are no changes to the forwarding behavior of other internal
   routers.

9.  IANA Considerations

   This document has no IANA actions.

10.  Acknowledgement

   Thanks to Peter Psenak, Les Ginsberg, Acee Lindem, Shraddha Hegde,
   Robert Raszuk, Tony Li, Jeff Tantsura, Tony Przygienda and Bruno
   Decraene for their suggestions and comments on this draft.

11.  Normative References

   [I-D.ietf-lsr-ospf-prefix-originator]
              Wang, A., Lindem, A., Dong, J., Psenak, P., and K.
              Talaulikar, "OSPF Prefix Originator Extensions", draft-
              ietf-lsr-ospf-prefix-originator-07 (work in progress),
              October 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998,
              <https://www.rfc-editor.org/info/rfc2328>.

   [RFC5283]  Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension
              for Inter-Area Label Switched Paths (LSPs)", RFC 5283,
              DOI 10.17487/RFC5283, July 2008,
              <https://www.rfc-editor.org/info/rfc5283>.

   [RFC5302]  Li, T., Smit, H., and T. Przygienda, "Domain-Wide Prefix
              Distribution with Two-Level IS-IS", RFC 5302,
              DOI 10.17487/RFC5302, October 2008,
              <https://www.rfc-editor.org/info/rfc5302>.

   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008,
              <https://www.rfc-editor.org/info/rfc5340>.

   [RFC5709]  Bhatia, M., Manral, V., Fanto, M., White, R., Barnes, M.,
              Li, T., and R. Atkinson, "OSPFv2 HMAC-SHA Cryptographic
              Authentication", RFC 5709, DOI 10.17487/RFC5709, October
              2009, <https://www.rfc-editor.org/info/rfc5709>.

   [RFC7770]  Lindem, A., Ed., Shen, N., Vasseur, JP., Aggarwal, R., and
              S. Shaffer, "Extensions to OSPF for Advertising Optional
              Router Capabilities", RFC 7770, DOI 10.17487/RFC7770,
              February 2016, <https://www.rfc-editor.org/info/rfc7770>.

   [RFC7794]  Ginsberg, L., Ed., Decraene, B., Previdi, S., Xu, X., and
              U. Chunduri, "IS-IS Prefix Attributes for Extended IPv4
              and IPv6 Reachability", RFC 7794, DOI 10.17487/RFC7794,
              March 2016, <https://www.rfc-editor.org/info/rfc7794>.

   [RFC7981]  Ginsberg, L., Previdi, S., and M. Chen, "IS-IS Extensions
              for Advertising Router Information", RFC 7981,
              DOI 10.17487/RFC7981, October 2016,
              <https://www.rfc-editor.org/info/rfc7981>.

Authors' Addresses

   Aijun Wang
   China Telecom
   Beiqijia Town, Changping District
   Beijing  102209
   China

   Email: wangaj3@chinatelecom.cn


   Gyan Mishra
   Verizon Inc.

   Email: gyan.s.mishra@verizon.com


   Zhibo Hu
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing  100095
   China

   Email: huzhibo@huawei.com


   Yaqun Xiao
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing  100095
   China

   Email: xiaoyaqun@huawei.com