Personal                                                M. Scott Corson
                                                        Ansible Systems
Internet Draft                                               A. O'Neill
                                                                     BT
                                                            G. Tsirtsis
                                                                Flarion
Document: draft-corson-fastrestore-00.txt
Category: Informational                                   November 2000


                         IP Fast Restoration
                  <draft-corson-fastrestore-00.txt>


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.



                                Abstract

   This memo describes a generalized approach for achieving IP fast
   restoration in response to link and router failures by using IP
   routing to support high network availability.  The approach relies
   on the utilization of flat, yet highly-localized routing technology
   to reduce far-reaching control message propagation, thereby limiting
   the effects of a failure and enabling sub-second restoration.  It
   then outlines how the Temporally-Ordered Routing Algorithm (TORA) (a
   Mobile Ad hoc Network (MANET) routing protocol), when suitably
   modified for use in fixed networks, can be used within this
   framework. The general approach to routing topology management
   within the fast restoration framework is immediate reaction on link
   failure (to preserve existing flows) and gradually integration of
   new network elements on activation into the domain (to further
   assure stability).
Internet Draft               Fast Restore                  August 2000



INDEX

      1. Introduction . . . . . . . . . . . . . . . . . . . . .  2
           1.1 Terminology. . . . . . . . . . . . . . . . . . .  2
      2. Motivation . . . . . . . . . . . . . . . . . . . . . .  3
      3. The Fast Restoration Framework . . . . . . . . . . . .  4
           3.1 Highly Localized Routing Protocol. . . . . . . .  5
           3.2 Link Status Assessment Protocol. . . . . . . . .  5
           3.3 Reaction to Failure. . . . . . . . . . . . . . .  6
      4. The Temporally-Ordered Routing Algorithm . . . . . . .  7
           4.1 Overview . . . . . . . . . . . . . . . . . . . .  7
           4.2 Router Activation (Route Creation) . . . . . . .  8
           4.3 Link Failure . . . . . . . . . . . . . . . . . .  9
           4.4 Link Activation. . . . . . . . . . . . . . . . .  9
           4.5 Router Failure (Route Erasure) . . . . . . . . . 10
      5. Detecting Router Failures with Single-Homed Prefixes . 10
           5.1 Neighbor Advertisement Protocol. . . . . . . . . 11
           5.2 Neighbor Reachability Protocol . . . . . . . . . 12
      6. Conclusion . . . . . . . . . . . . . . . . . . . . . . 14
      7. References . . . . . . . . . . . . . . . . . . . . . . 15


1. Introduction

   This memo describes a generalized approach for achieving IP fast
   restoration in response to link and router failures by using IP
   routing to support high network availability.  The approach relies
   on the utilization of flat, yet highly-localized routing technology
   to reduce far-reaching control message propagation, thereby limiting
   the effects of a failure and enabling sub-second restoration.  It
   then outlines how the Temporally-Ordered Routing Algorithm (TORA) (a
   Mobile Ad hoc Network (MANET) routing protocol), when suitably
   modified for use in fixed networks, can be used within this
   framework. The general approach to routing topology management
   within the fast restoration framework is an immediate reaction on
   link failure (to preserve existing flows) and gradually integration
   of new network elements on activation into the domain (to further
   assure stability).

   The draft first discusses the motivation for this draft and its
   relevance to a recent IAB report [IAB99] on routing convergence. It
   then outlines a framework of protocol mechanisms, and then gives a
   fuller description of the main mechanisms.  It then describes a
   candidate routing algorithm that can be used within this framework.
   It then additionally describes the mechanisms within the framework
   to address efficiency and localization issues associated with the
   failure of routers advertising single-homed prefixes. The draft
   concludes with a discussion of the benefits of this fast restoration
   framework.


1.1 Terminology
Internet Draft               Fast Restore                  August 2000



   Here are some new terms defined in this draft.

   RID    Router ID. A unique ID used by a routing protocol to identify
       a router.

   RPLF   Routing Protocol Link Failure. A RPLF packet is generated as
          a routing protocol's reaction to a link failure for each
          destination prefix affected by the failure

   RPDF   Routing Protocol Destination Failure. RPDF is a packet
          generated to eliminate routing to a destination that has been
          partitioned from the rest of the network

   LFDT   Link Failure Detection Timer. Failure to receive an FRLS
          packet with the neighbor heard flag set within the LFDT
          setting indicates link failure.

   LFUB   Link Failure Upper Bound. This is the upper bounds of the
   time
          required to detect failure.

   FRNA   Fast Restore Neighbor Advertisement. A router's FRNA
          packet also contains the RIDs and corresponding prefixes of
          the router's one-hop neighbors.

   FRLS   Fast Restore Link Status. A FRLS packet contains a router's
          RID and an indication of whether or not the router has
          recently heard its neighbor.

2. Motivation

   The report from the recent IAB Network Layer Workshop [IAB99] raised
   several routing issues.  Regarding routing convergence, it noted
   that "at least one backbone operator has concerns about the
   convergence time of internetwork-wide routing during a failover.
   This operator believes that current convergence times are on the
   order of half a minute, and possibly getting worse.  Others in the
   routing community did not believe that the convergence times are a
   current issue.  Some, who believe that real-time applications (e.g.
   telephony) require sub-second convergence, are concerned about the
   implications of convergence times of a half minute on such
   applications. [IAB99]"

   It is agreed that existing Internet intra-domain routing protocols
   (e.g. [RIP,OSPF]) have limited responsiveness in reacting to
   topological changes (i.e. slow convergence) and consequently have
   limitations on the size of domains they may support.  These
   limitations stem directly from the use of shortest-path routing
   based on global topological information exchange.

   For any premium service with critically tight availability
   requirements, operators are left with the use of hot standby routers
   and link level restoration to improve restoration times. Both
Internet Draft               Fast Restore                  August 2000



   mechanisms are however very expensive, and applicable mainly to
   large-scale core elements. However, these cannot be cost-effectively
   applied in edge networks such as IP enabled Radio Access Networks
   and base stations and flat enterprise networks. These will likely be
   composed of large numbers of small and necessarily cheap IP routers
   which may be interconnected by ad-hoc, non-resilient or unprotected
   link layers such as ethernet and various serial lines. The need for
   a fast restoring IGP is therefore apparent. In addition, if a cost-
   effective IP layer solution could be found which could be made
   sympathetic to traffic engineering requirements then the requirement
   for hot standby and link layer restoration could be reduced
   resulting in cost savings. The traffic engineering implications are
   not however considered in this draft.

   By relaxing the requirement for shortest path routing only slightly
   (with acceptable impact on forwarding efficiency), one can
   approximate shortest-path routing while enabling IP fast restoration
   and higher levels of domain scalability.  This can be achieved by
   utilizing adaptive routing algorithms that localize their response
   to topological changes while ensuring loop-free paths.  It is
   localization that yields the desired benefits of reduced restoration
   time and increased domain size.

   The proposed fast restoration approach therefore offers a potential
   intra-domain solution to this convergence problem in reaction to
   failure. It is presently unclear whether the fast restoration
   techniques discussed in this memo have potential applicability to
   the associated inter-domain routing problem.


3. Fast Restoration Framework

   Four general protocol mechanisms are required in the proposed fast
   restoration framework. The first two are mandatory in that they
   provide the basic fault detection and reaction mechanisms. The
   second two are optional and illustrate one way to enable the
   reaction to be tailored to whether the failure point is a link or a
   neighbor router.

   1) A highly localized routing protocol
        This is required to enable the protocol to be able to react
        extremely quickly to failures, with minimal impact on other
        routers within the domain.

   2) A link status assessment protocol
        This is required to enable a router to rapidly detect link or
        router failures.

   3) A neighbor advertisement protocol
        This is used to enable a router to learn about its local two
        hop topology including its neighbors' neighbors and the degree
        of connectivity (single/multi-homed) of subnets.
Internet Draft               Fast Restore                  August 2000



   4) A neighbor reachability protocol
        This is used to enable a router to identify whether it is the
        link or the neighbor router that has failed through
        collaboration with its two hop neighbors.


3.1 Highly Localized Routing

   The general approach to routing topology management within the fast
   restoration framework is immediate reaction on link failure (to
   preserve existing flows) and gradually integration of new network
   elements on activation into the domain (to further assure
   stability).

   The method by which the aforementioned route adjustment occurs is a
   function of the routing algorithm.  It is essential that a routing
   algorithm have the following properties to operate within this fast
   restoration framework.

   1. The protocol reaction to a link failure that does not partition
      the network be localized for all destination prefixes.

   2. The protocol reaction to a router failure that does not partition
      the network should be localized for all destination prefixes.
      (i.e. it should be treated as a link failure with respect to
      these other prefixes).

   3. The protocol reaction to a link failure that does partition the
      network be capable of detecting the partition (and erasing routes
      if desired).

   4. The protocol reaction to a router failure that does partition the
      network be capable of erasing routes for these prefixes (if
      desired). Single-homed prefixes on the failed router will by
      definition create a partition.

   The preceding description assumes that a routing protocol reacts to
   a link or router failure by sending generic Routing Protocol Link
   Failure (RPLF). Routing Protocol Destination Failure (RPDF) packets
   as generated as necessary in response to partitions.


3.2 Link Status Assessment

   The link status assessment protocol is principally concerned with
   fast detection of link failures.  Work within the IETF on a
   generalized IP level link status assessment algorithm was recently
   proposed [FLIP].  A similar capability is required here where a link
   consists of a pair of Router IDs (RIDs).  After discovery of a
   neighbor (identified by its RID) via the neighbor discovery
   protocol, the objective of a link status assessment algorithm would
   be to monitor with high resolution (in time) the bi-directional
   connectivity of a link with an adjacent neighbor router.
Internet Draft               Fast Restore                  August 2000




   It is desired that a link failure be detectable on millisecond time
   scales enabling the higher level mechanisms (to be described) to
   quickly react thereby preserving ongoing sessions. The protocol
   operates by periodically sending Fast Restore Link Status (FRLS)
   packets between adjacent interfaces to reconfirm link bi-
   directionality.  The FRLS retransmission period is set to
   LINK_PROBE_INTERVAL time units.  A FRLS packet contains a router's
   RID and an indication of whether or not the router has recently
   heard its neighbor.

   FRLS packets are initially sent indicating that the sending router
   has not heard its neighbor.  Reception of a FRLS packet at a router
   from a neighbor causes the router to set the "neighbor heard" flag
   to TRUE for ROUTER_HEARD TIME (perhaps 3.5 times
   LINK_PROBE_INTERVAL) during which time it sends FRLS packets back
   towards its neighbor indicating that it has heard its neighbor.
   Reception of any packet from its neighbor causes the router to reset
   the ROUTER_HEARD_TIMER associated with this flag setting. Reception
   of a FRLS packet with the neighbor heard flag set to TRUE causes the
   router to consider the link with its neighbor as bi-directional.  At
   that time the Link Failure Detection Timer (LFDT) is also set to
   ROUTER_HEARD_TIME.  Reception of a subsequent FRLS packet with the
   neighbor heard flag set resets the LFDT.  Failure to receive an FRLS
   packet with the neighbor heard flag set within the LFDT setting
   indicates link failure.  Also, if the router considers the link to
   be bi-directional, subsequent reception of a FRLS packet without the
   neighbor heard flag set immediately indicates link failure (i.e.
   this indicates that its neighbor no longer hears the router).  Thus
   the sum of ROUTER_HEARD_TIME plus LINK_PROBE_INTERVAL equals Link
   Failure Upper Bound (LFUB) and effectively upper bounds the time
   required to detect failure.

   Alternatively, the protocol may also rely on a link layer
   notification of link failure (e.g. loss of carrier) if available.
   Such a signal may provide a more accurate and timely indication of
   link status.

3.3 Reaction to Failure

   After the link status assessment protocol detects a link failure
   (possible router failure) at a router X with a neighbor Y, the
   following processing steps occur at router X:


   1) Router X initially drops (or buffers) data packets for
      destination routes whose next hop was over the failed link with
      neighbor Y.  This clearly includes any single-homed or multi-
      homed destination prefixes on the (potentially failed)
      neighboring router Y.

   2) Router X adjusts routes for all affected destination prefixes
      aiming to locally route around the failed link and/or router. Let
Internet Draft               Fast Restore                  August 2000



      us assume that the routing protocol's reaction to a link failure
      generates a Routing Protocol Link Failure (RPLF) packet for each
      destination prefix. Note that the RPLF activity for a particular
      destination will definitely fail in either of two cases.

   3) The first would be if the link failure had generated a partition
      in the topology towards that destination which places an obvious
      requirement that restoration requires a sufficient level of
      connectivity between nodes to guarantee the availability of an
      alternate path. Partition detection would lead to the generation
      of a Routing Protocol Destination Failure (RPDF) packet that
      would be used to eliminate the routing for that destination
      throughout the domain.

   4) The second more specific example occurs if the destination prefix
      is single-homed on a router, and that router has itself failed,
      rendering restoration around the failure futile. Therefore, for
      singly homed prefixes it may be useful to react differently to
      the failure of the router. This firstly requires a method to
      discover which routers have single-homed prefixes, a method to
      discover when such routers fail, and an alternative strategy to
      react to this type of failure. These are discussed in section 5.
      One such strategy though would be to hold back on pointless RPLF
      processing to give the router time to re-boot. This avoids the
      RPLF processing causing partition detection and subsequent RPDF
      generation, resulting in the elimination of the routing for that
      destination throughout the domain. In this case, the domain
      relies on interim ICMP unreachable messages to inform
      applications of the loss of connectivity to the destination. If
      re-boot does not occur within a fixed period then the routes to
      that destination can be then be reasonably removed from the
      domain through RPDF propagation. This is somewhat similar to the
      `graceful' strategy for avoiding routing flaps in BGP
      [BGPrestart] by suppressing route erasure on router failure in
      anticipation of fast reboot and graceful BGP restart.



4 The Temporally-Ordered Routing Algorithm

4.1 Overview

   We now map these generic packets to a specific routing protocol ---
   the Temporally-Ordered Routing Algorithm (TORA) [TORA, TORAdraft].
   In TORA, RPLF packets map to Update (UPD) packets while RPDF packets
   map to Erase (ERS) packets.  This latter packet type does not yet
   appear in [TORA, TORAdraft].

   TORA is not a shortest-path routing protocol.  It is a "link
   reversal" routing protocol that differs significantly from existing
   routing technology.  TORA is similar to distance vector protocols
   (e.g. RIP) in that a separate version of the protocol runs
   independently for each destination prefix.  Thus its storage
Internet Draft               Fast Restore                  August 2000



   requirements scale with the number of destination prefixes.  But its
   similarity ends there.  In its original usage mode, TORA is an on-
   demand protocol that builds routes reactively in response to route
   requests from a MANET IP kernel.  However, for usage in fixed
   networks it is transformed into a more traditional, proactive
   protocol.

   The key attribute of the protocol remains the same for both
   versions, and this is the protocol's aggressive, optimistic reaction
   to link failures which is critical within the MANET environment but
   is becoming equally important in fixed network scenario's.  On link
   failure, when necessary, the protocol immediately reverses the
   directions of links in the area near the failure effectively
   searching for alternative routes to the destination.  If such exist,
   the protocol typically finds them in a time and communication-
   efficient, single-pass reaction.  Otherwise (in case of partition),
   the reaction detects the partition enabling route erasure if
   desired.

   We now briefly describe how TORA would function within the fast
   restoration framework.


4.2 Router Activation (Route Creation)

   When a router boots up, it establishes new links with its neighbors.
   The router proactively initiates a localized process of height
   database synchronization with each of its new neighbors.  Its
   neighbors transmit the contents of their height databases to the new
   router.  From this and any received routing advertisements (if any),
   a router can determine which of its own prefixes are single or
   multi-homed.

   It then generates an Optimization (OPT) packet for each destination
   prefix it wishes to advertise.  If the prefix is single-homed and
   heights exist in the neighboring routers, the OPT packet stops at
   these neighbors and is not propagated further. Otherwise, the OPT
   packet floods fully (or partially) throughout a domain installing
   (or reorienting) routing entries for its destination prefix. The
   collection of routing entries forms a Directed Acyclic Graph (DAG).
   From an individual router's perspective, this means that it will
   have one or more feasible next hops for the destination, all of
   which are guaranteed to be loop-free.

   Loop-free multipath routing is achieved through the use of
   "heights".  Each router maintains a height with respect to the
   destination, and the destination has the lowest height.  A link is
   directed from a higher height to a lower height, and packets may
   only be forwarded from routers with higher heights to routers with
   lower heights.  Thus packets flow "downhill" towards the
   destination.  Each height is guaranteed to be unique and thus the
   collection of heights forms a total order.  The protocol is
Internet Draft               Fast Restore                  August 2000



   sometimes also referred to as the "Totally-Ordered Routing
   Algorithm".

   From a router's perspective, a neighbor that is higher is referred
   to as an "upstream" neighbor while a neighbor that is lower is
   referred to as a "downstream" neighbor.  A destination router always
   has only upstream neighbors, and no other router is allowed to have
   upstream neighbors without having at least one downstream neighbor.
   It should be understood that all heights and associated link
   directions are per IP destination prefix DAG.


4.3 Link Failure

   A router may lose an upstream neighbor (due to a link failure) at
   any time without requiring protocol reaction.  Similarly, it may
   lose a downstream neighbor provided it has other downstream
   neighbors, or has no downstream neighbors and no upstream neighbors,
   without requiring protocol reaction. However, if due to a link
   failure a router finds itself with at least one upstream neighbor
   without having any downstream neighbors (i.e. it just lost its last
   downstream neighbor and is a black hole), the router must adjust its
   height so that it has at least one downstream neighbor.  The manner
   in which this occurs is described in [TORA,TORAdraft].  The
   adjustment results in transmission of an UPD (RPLF) packet to the
   router's neighbors advertising its new height.

   The effect of this height adjustment is to reverse the direction of
   some or all of the router's links; hence the name "link reversal"
   algorithm.  Reception of an UPD (RPLF) packet at a neighbor may
   cause that neighbor to lose its last downstream link, which is cause
   for subsequent height modification and UPD packet transmission.
   Examination of the algorithm's reaction on mesh-like topologies
   reveals that the number of subsequent UPD (RPLF) transmissions is
   typically quite low; often only one UPD (RPLF) transmission is
   required. The resultant reaction is thus highly localized.  It is
   also strongly de-correlated with the size of the domain.


4.4 Link Activation

   When a router restarts, it re-establishes  links with its neighbors.
   The router initiates a localized process of height database
   synchronization with each of its new neighbors.  Its neighbors
   transmit the contents of their height databases to the new router.
   By default, the  router's new heights are NULL for each destination.
   (Note: A NULL height is higher than any non-NULL height, and packets
   may be forwarded from NULL to non-NULL heights.  A link between two
   routers with NULL heights for a destination is marked as
   "unassigned", and packet forwarding over such links is disallowed.)
   The re-started  router then performs its own non-NULL height
   selection with respect to the existing heights of its neighbors.
Internet Draft               Fast Restore                  August 2000



   The algorithm for height selection is not discussed here, but it
   will be described in an upcoming draft detailing modification of
   TORA for use in fixed networks.  The procedure only involves a
   router's one-hop neighbors.  The height selection procedure occurs
   slowly, and a link is gradually brought back into active usage
   within the routing domain.


4.5 Router Failure (Route Erasure)

   When a router fails, then the default behavior is that each neighbor
   individually reacts to the router failure as a link failure, sending
   UPD (RPLF) packets as described in 3.3 above.  This will restore
   paths around the failed router if sufficient physical connectivity
   exists.

   The failed router itself may have been advertising destination
   prefixes into the network and in the case of multi-homed
   subnets/prefixes the TORA reaction will result in the paths towards
   the failed routing being redirected towards the nearest alternate
   router also advertising that prefix.

   In the case of single-homed prefixes, the restoration attempt will
   clearly fail as those subnets are now partitioned from the core
   network. The collective UPD (RPLF) reaction of the routers connected
   to the failed router, followed by partition detection via the normal
   TORA reflection, would eventually clear the network of routing state
   for these single-homed destination prefixes.  This is wasteful if
   the router is simply undergoing a re-boot and therefore additional
   protocol mechanisms may be warranted as highlighted in 3.3 and
   described in section5 below. The aim of these mechanisms in TORA
   would be to avoid sending UPD (RPLF) packets for the single-homed
   destination prefixes, wait to see if the associated router recovers
   in a set time, and if not, only then use ERS (RPDF) packets to
   efficiently clear this routing state from the network.

5. Detecting Router Failures with Single-Homed Prefixes

   As has previously been discussed, it may be useful to use additional
   protocol mechanisms to be able to distinguish router failures and
   associated single-homed destinations so that futile restoration
   processing and premature route elimination can be avoided. One such
   approach is outlined below, but the necessity and some of the
   mechanics for solving this problem may be critically dependent on
   the particular IGP.

   In this approach, a bootstrapping neighbor advertisement protocol
   enables a router to learn its one-hop neighbors, as well as its
   neighbors' one-hop neighbors, called the two-hop neighbor set.  It
   works in combination with IPv6 router advertisements (or  equivalent
   ICMP router messages in IPv4 (RFC1256)) to enable a router to assess
   its routing neighbors. Once a link failure is detected, the neighbor
   reachability protocol attempts to determine whether the router at
Internet Draft               Fast Restore                  August 2000



   the other end of the failed link is still reachable, and whether it
   is advertising single-homed and/or multi-homed destination prefixes.
   With input from the preceding protocols, the highly localized
   routing protocol then reacts differently for router failures
   involving single-homed prefixes.



5.1 Neighbor Advertisement Protocol

   This protocol is principally concerned with neighbor discovery and
   protocol bootstrapping. As with OSPF, each router is identified by a
   Router ID (RID).  It also is associated with at least one IP
   destination prefix by which it is globally reachable.  Routes for
   this prefix are advertised out of all its interfaces and built by
   the routing algorithm.  It may also advertise other destination
   prefixes.  These prefixes may be single-homed (i.e. only this router
   advertises itself as the last hop for these prefixes) or multi-homed
   (i.e. they are advertised by multiple routers as being last hop
   routers).

   The neighbor advertisement protocol requires that each router
   periodically advertises its RID and destination prefix(es) in Fast
   Restore Neighbor Advertisement (FRNA) packets to its one-hop routing
   neighbors out interfaces on which routing is activated. FRNA packets
   would likely be multicast (TTL scope == 1) to a well-known multicast
   address.  The transmission period would be relatively long (on the
   order of 30 seconds).  A router's FRNA packet also contains the RIDs
   and corresponding prefixes of the router's one-hop neighbors that it
   has learned from their previous FRNA transmissions. In this way, by
   receiving its neighbors' FRNA packets, a router learns the
   identities of its neighbors' neighbors in a fashion very similar to
   the neighbor advertisement algorithm of [OLSR].  Associated with
   each link a router has with a routing neighbor is an instance of a
   link status assessment protocol (see section 2.2).  For each
   instance a router has a Link Failure Detection Timer (LFDT) for that
   link.  This is the time after which a router will declare a link to
   its neighbor down if no packets are received from its neighbor.  A
   router also computes an Link Failure Upper Bound (LFUB) on the
   overall time required to detect a failure on each link, which is
   equal to the LFDT value plus a link polling period (to be
   described). A router X also advertises in its FRNA transmissions the
   maximum computed value over all its links of these bounds, known as
   the maximum LFUB (LFUBmax(X)).

   The neighbor advertisement protocol operates in conjunction with
   IPv6 Router Advertisement (RA) messages (or an equivalent in IPv4)
   which are sent out interfaces on which routing is not activated
   (e.g. over subnets with attached hosts).  Router advertisement
   messages convey the destination prefixes being advertised by other
   routers on a given subnet.  By listening to these advertisements, a
   router can learn if other routers are advertising destination
Internet Draft               Fast Restore                  August 2000



   prefixes that itself is also advertising.  It thus determines which
   of its own last-hop prefixes are single-homed or multi-homed.

5.2 Neighbor Reachability Protocol

   We now describe the basic operation of a neighbor reachability
   protocol. The protocol is assumed to run at a router X, and is
   triggered by link failure detection on a link to router Y. After a
   short time (estimated sufficient for localized route re-
   computation), router X executes a distributed neighbor reachability
   protocol to determine reachability of the single-homed destination
   prefixes advertised by neighbor Y as follows.

   The (potentially failed) neighbor Y's maximum LFUB (LFUBmax) is
   known from its previous FRNA transmissions. With this information a
   router X creates a RPLF packet as it would if this were a link
   failure, but delays transmission of the packet for a time T somewhat
   greater than its neighbor Y's LFUBmax; thus T > LFUBmax(Y).  In the
   meantime it immediately sends a Fast Restore Link Failure (FRLF)
   packet to each of the (potentially failed) neighbor Y's one-hop
   neighbors.  Reception of a FRLF packet at such a neighbor Z
   instructs that neighbor to mark the link between the indicated
   router Y and the sender X as "failed". But the receiver Z keeps the
   sender X in its list of the neighbor Y's one-hop neighbors.  This is
   to ensure that Z will still send a FRLF packet to X if it also
   detects a link failure with Y.  Z will change its notion of Y's
   neighbors only on reception of a FRNA message from Y itself.
   Subsequent reception of a FRLS packet at router Z from neighbor Y
   indicates to the receiver Z that the link between routers X and Y
   may indeed have failed, but that router Y itself is still
   functioning.

   Let us assume that router Y has indeed failed, and that its neighbor
   X was the first neighbor to detect this failure, thus sending FRLF
   packets to router Y's previous one-hop neighbors.  Subsequently
   others of Y's previous neighbor set begin to declare link failures
   with Y, each scheduling delayed RPLF transmissions and immediately
   sending an FRLF packet to each of Y's past neighbors.  Let us also
   assume that Y's neighbor Z is the last neighbor to detect a link
   failure with router Y.  It also schedules a RPLF packet for
   transmission (suitably delayed) and sends FRLF packets to each of
   Y's previous neighbors. By this time (or very shortly thereafter) Z
   will likely to be the first of Y's past neighbors to both declare
   link failure with Y and have received FRLF messages from all of Y's
   previous neighbors (without having heard a subsequent message from
   Y).

   At that point router Z declares "router failure" of previously
   adjacent neighbor Y, and cancels its pending RPLF transmission for
   Y. The router Z then queues a Routing Protocol Destination Failure
   (RPDF) packet for each of neighbor Y's single-homed destination
   prefixes (Note that for efficiency, multiple RPDF packets may be
   aggregated and sent within a single aggregate RPDF packet).
Internet Draft               Fast Restore                  August 2000



   Assuming T (T > LFUBmax(Y)) is set appropriately at each of Y's
   previous neighbors, none of Y's previous neighbors would have yet
   transmitted their pending RPLF packets. The router also sets a
   NEIGHBOR_RESTART_TIMER to some value (possibly 30 seconds) while it
   waits for neighbor Y to reboot.  During this time packets sent to
   Y's single-homed prefixes will be undeliverable resulting in "host
   unreachable" ICMP message being returned to the sources.  Assuming
   neighbor Y reboots within this time, the queued RPDF transmission is
   cancelled.  Otherwise the queued RPDF packet is transmitted when the
   timer expires and the routes for Y's single-homed destination
   prefixes are erased from the domain.  The objective of this timer is
   to suppress route erasure in the case of fast router reboot in a
   fashion similar to that proposed in [BGPrestart].

   If the NEIGHBOR_RESTART_TIMER expires and T (T > LFUBmax(Y)) is not
   set appropriately at each of Y's previous neighbors, possibly others
   may also have queued and then issued RPDF packets prior to receiving
   an FRDF message originating from router Z.  This is not detrimental.
   RPDF packets are destination-specific and source-independent.  Their
   collective propagation results in only a single aggregated flood,
   regardless of the number of routers originating RPDF packets. An
   RPDF packet floods throughout the network immediately, efficiently
   clearing state information regarding the effected destinations. An
   RPDF packet is sent to the ALL_ROUTERS multicast address.

   For maximal efficiency the aforementioned protocol assumes a domain
   topology with the characteristic that, after any single router or
   link failure, the remaining domain remains well connected.  Thus
   router failure does not partition the remaining domain permitting
   the failed neighbors to exchange FRLF messages. This is required so
   fast restoration can be achieved.

   Nevertheless, if the network becomes partitioned the above algorithm
   need not result in undo harm provided the routing protocol is
   capable of detecting partitions. A router/link failure, in such a
   case, will result in multiple link failures being detected since the
   failed router Y's neighbors on one side of the partitioned network
   will not be able to reach Y's one hop neighbor's on the other side
   of the partitioned domain (i.e. FRLF will fail to reach some
   neighbors) to reach a more definite conclusion.  Consequently each
   such neighbor will transmit a RPLF packet for each single-homed
   destination prefix on the potentially failed router.  Each packet
   will initiate a distributed computation that performs "distributed
   partition detection" and eventually clears the domain of routing
   state for its destination.  This approach is less efficient than
   that using RPDF packets.

   The approach also assumes the FRLF messages are not frequently
   dropped.  Loss of at least one FRLF message destined for each of the
   failed destination's neighbors would ensure that no neighbor
   generates an RPDF packet.  If there are k such neighbors with each
   needing to receive all k-1 FRLF packets from the other neighbors,
   and the probability of packet dropping is p, then an upper bound on
Internet Draft               Fast Restore                  August 2000



   the probability of this undesirable, joint packet dropping event is
   p^k.  That is, each of the k neighbors must have missed at least one
   packet from some other neighbor.  If control packets are given
   reasonable priority (suitably small p), then this will not occur
   very frequently (i.e. only with probability p^k).

   If it does occur, each of these neighbors would start to
   independently transmit their queued RPLF packets (which is what this
   router failure detection mechanism intends to avoid).  This results
   in the aforementioned distributed partition detection process and
   the worst-case consequences regarding convergence in this case
   depend on the particular routing algorithm.


6. Conclusions

   The fast restoration processing occurs entirely at layer 3 and does
   not require special layer 2 considerations, although a layer 2 link
   activation/failure notification trigger may be used if available.

   The overall restoration approach is constrained only by the speed at
   which a router can create, transmit and process the localized
   routing updates.  Increases in processing and transmission speeds
   will only improve restoration performance over time.

   It is anticipated that restoration can frequently be achieved with
   little or no noticeable effect on voice over IP sessions.  Some
   degradation would likely be experienced by higher rate sessions
   given current hardware capabilities.

   The approach offers immunity to link instability (such as that
   triggered by malfunctioning interface cards).  A link that flip-
   flops between UP and DOWN will effectively be removed from the
   forwarding paths of its adjacent routers without triggering domain-
   wide routing updates or routing re-convergence computations.  Thus
   the network will not melt down due to localized link problems.

   The approach offers reasonable immunity to router instability as
   well.  A router that frequently goes UP and DOWN will not create
   domain-wide routing messages (route building followed by erasure) if
   it can quickly reboot.  If it cannot, then these messages still only
   effect its own prefixes.  With regards to the remaining network
   prefixes, the routing protocol will treat such router instability as
   a set of individually unstable links, effectively removing them from
   the forwarding paths for these prefixes and isolating the network
   from harm.

   The net effect then is that one can deploy much larger domains with
   greatly reduced risk of domain melt down while enabling fast
   restoration of IP flows when necessary.
Internet Draft               Fast Restore                  August 2000



7. References

   [BGPrestart] S. Ramachandra et al., "Graceful Restart mechanism for
   BGP", Internet-Draft, draft-ramachandra-bgp-restart-00.txt, (work in
   progress), August 2000.

   [FLIP] H.Sandick et al., "Fast LIveness Protocol (FLIP)", Internet
   Draft, draft-sandiick-flip-00.txt, February 2000.

   [IAB99] M. Kaat, "Overview of 1999 IAB Network Layer Workshop",
   Internet-Draft, draft-iab-ntwlyrws-over-03.txt, July 2000.

   [MobileIP] C.E. Perkins, ``IP Mobility Support," Internet RFC 2002,
   Oct 1996.

   [OLSR] P. Jacquet, P. Muhlethaler and A. Qayyum, "Optimized Link
   State Routing Protocol", Internet-Draft, draft-ietf-manet-olsr-
   01.txt, February 2000.

   [OSPF] J. Moy, "OSPF Version 2", Internet RFC 2328, April 1998.

   [RIP] G. Malkin, "RIP Version 2", Internet RFC 2453, November 1998.

   [TORA] V. Park, M. S. Corson, "The Temporally-Ordered Routing
   Algorithm", Proc. IEEE INFOCOM '97, Kobe, Japan, April 1997.

   [TORAdraft] V. Park, M. S. Corson, "Temporally-Ordered Routing
   Algorithm (TORA) Version 1 Functional Specification", Internet-Draft
   (work in progress), draft-ietf-manet-tora-spec-02.txt, October 1999.


Author's Addresses

      M. Scott Corson
      University of Maryland
      Ansible Systems
      (+1) 301-405-6630
      corson@isr.umd.edu
      mscorson@ix.netcom.com


      Alan O'Neill
      BT
      (+44) 1473-646459
      alan.w.oneill@bt.com


      George Tsirtsis
      Flarion Technologies
      (+44) 020 88260073
      g.tsirtsis@Flarion.com
      gtsirt@hotmail.com
Internet Draft               Fast Restore                  August 2000



                                Copyright Notice

                         Placeholder for ISOC copyright.