INTERNET-DRAFT
Bidirectional Forwarding Detection                       A.  Palanivelan
Category: HISTORIC                                         Cisco Systems
Expires:  Dec 2010                                         June 09, 2010


    Bidirectional Forwarding Detection (BFD) with Graceful Restart
                    draft-palanivelan-bfd-v2-gr-05.txt

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 10, 2010.


Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.




A.Palanivelan                                                  [Page 1]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt     June 2010


Abstract

This document proposes an extension for Bidirectional Forwarding Detection
(BFD) to support Graceful restart, in complementing Gracefulrestart support
of the underlying protocol.This shall work consistently irrespective of the
bfd  mode or protocol or the type of restart and most importantly the
vendors design and implementation.This document describes in  detail the
challenges to bfd in surviving  a graceful restart and a generic solution
to succeed.



Table of Contents

1  INTRODUCTION   .............................................  3
2  OVERVIEW     ...............................................  3
3  MOTIVATIONS   ..............................................  4
   3.1  Planned Restarts with control protocols ...............  4
   3.2  BFD Co-existing with BB configs .......................  5
4  Extensions to BFD ..........................................  5
   4.1  Version  (Vers)........................................  5
   4.2  Diagnostic  (Diag).....................................  6
   4.3  My Restart Interval....................................  6
   4.4  Your Restart Interval..................................  6
5  State Machine for BFD with GR Support.......................  6
6  Theory of operation.........................................  8
   6.1  Session Establishment and GR Timer exchange............  8
   6.2  Remote Neighbor Restart and Recovery...................  9
7  Security Considerations...................................... 11
8  IANA Considerations......................................... 11
9  References ................................................. 11
   9.1  Normative References................................... 11
   9.2  Informative References................................. 11
10 Author's address............................................ 12










A.Palanivelan                                                  [Page 2]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt     June 2010


1. Introduction

The Bidirectional  Forwarding  Detection  protocol  [BFD]  provides  a
mechanism for liveness detection of arbitrary paths  between  systems.
It is intended to provide low-overhead,  short-duration  detection  of
failures in the path between adjacent  forwarding  engines,  including
the  interfaces,  data  link(s),  and  to  the  extent  possible   the
forwarding  engines   themselves.   It   operates   independently   of
media,data protocols,and routing protocols. An additional goal  is  to
provide a single mechanism that can be  used  for  liveness  detection
over any media, at any protocol layer, with a wide range of  detection
times and overhead, to avoid a proliferation of different methods.

The extensions introduced in this draft  for  bfd  shall  aid  in  bfd
complementing the GR capabilities of protocols such as ospf and  also
in providing a  consistent  behavior  for  planned/unplanned restarts
irrespective of the underlying protocols.The intention of this document
is to provide a solution that works fine for all types of bfd
implementations.

2. Overview

The Bidirectional Forwarding Detection [BFD] specification  defines  a
protocol with simple and specific semantics. Its sole  purpose  is  to
verify connectivity between a pair of systems, for a  particular  data
protocol across a path (which may be of  any  technology,  length,  or
OSI layer). The promptness of the detection of a path failure  can  be
controlled by trading off  protocol  overhead  and  system  load  with
detection times.

BFD is assumed to be working fine without a need for any GR support in
it.But, the deployments show that the different types of implementations
in the products and their inherent mechanisms lead to issues with bfd
especially in surviving GR.It is true that prioritizing bfd would make
sure the other CPU intensive processes do not fail bfd, but this won't
be possible as there may be other higher priority processing that cant
be ignored.Example for this are the existing subscriber connections that
can't be given a lesser priority.

The extensions introduced in this draft  for  bfd  shall  aid  in  bfd
complementing the GR capabilities of protocols such as ospf  and  also
in providing a consistent behavior for planned/unplanned restarts  for
the underlying protocols.

A.Palanivelan                                                  [Page 3]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt     June 2010


3. Motivations

Though the existing drafts discuss bfd interactions with  applications
with Graceful Restart and ways of implementing in  serving  successful
GR, the drafts itself have some exceptions and caveats  applied.  This
draft in particular discusses the issues in  the  following  scenarios
and  provides  a  generic  solution  that  would  scale   for   future
applications and to provide a solution that works fine for all types
of bfd implementations.

*  Planned restart with a control protocol such  as  IS-IS,which  cannot
   signal GR.
*  BFD  co-existing  with  BB  configs

This document tries to  address  the  above  issues  in  specific  and
Graceful restart mechanism  in  general,  for  bfd.

3.1  Planned Restarts with control  protocols

The  existing  bfd  drafts suggest administratively disabling bfd prior
to the start of GR.  But, this works only for planned restarts and not
for  unplanned  restarts. This also does not work for  a  protocol  such
as  IS-IS  that  cannot signal a planned restart.

For a Planned restart where  a  control  protocol  can  signal  before
restarting, if a BFD session failure occurs during the restart, it  is
recommended in the existing document(s)  that,  such  a  planned  restart
SHOULD NOT be aborted and the session failure SHOULD NOT result  in  a
topology change  being  signaled  in  the  control  protocol.  Control
protocols that cannot signal a planned restart depend on the  recently
restarted system to signal the Graceful Restart prior to  the  control
protocol adjacency timeout.

In most cases, whether the restart is  planned  or  unplanned,  it  is
likely that the BFD session will  time  out  prior  to  the  onset  of
Graceful Restart, and a topology change SHALL be signaled.  This  type
of  implementation  shall  impact  non-stop   routing   and   non-stop
forwarding  support  using  GR-enabled  protocols  and   provides   an
opportunity to review the existing bfd  implementations  and  improve.





A.Palanivelan                                                  [Page 4]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt     June 2010


3.2 BFD Co-existing with BB configs

In  a  real  time  scenario  with Broadband configurations,it is highly
likely that the bfd sessions do not survive a Graceful restart.

Assume a router at PE that has  active  DHCP  sessions  with  a  large
number of clients (say 16k). During a  planned  restart,  it  is  also
likely that the DHCP clients request for renewal of IP address to  the
server  (restarting  router)  at  that  time.  When  the   router   is
restarting, these requests do not reach the router.  But,  when  these
requests reach the router when the router has just come  up,  it  will
treat these requests at a high priority and responds to them. When  we
have thousands of such requests to the restarting router,  the  router
shall spend a major part of its first second of uptime  in  addressing
these requests. In this scenario, a control protocol like ospfv2  that
has GR enabled [OSPF-GRACE], shall withstand the restart for the
specified  restart interval (as it will be in seconds) and it is likely
to  survive  the restart in maintaining its forwarding plane.

In the same scenario, if bfd is enabled  for  ospfv2, for  an unplanned
restart, the (bfd) neighbor router will be expecting bfd control packets
in  milliseconds interval and during the restart process, is likely  to
timeout,  also impacting the associated ospfv2 adjacency and  resulting
in  loss  of traffic.

The scenario will be the same for bfd with a protocol such  as  is-is
[IS-IS-GRACE], where the problem  is  likely  to  be  seen  even for a
planned restart.

4. Extensions to BFD

This draft introduces a new diag value to indicate that  the  neighbor
is restarting and provisions to  configure  graceful  restart  timers.

The Generic BFD Control  Packet  Format  shown  below  introduces  two
additional  sections  "My  Restart   Interval"   and   "Your   Restart
Interval".

         0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Vers |  Diag   |Sta|P|F|C|A|D|M|  Detect Mult  |    Length     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       My Discriminator                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Your Discriminator                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Desired Min TX Interval                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   Required Min RX Interval                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Required Min Echo RX Interval                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 My Restart Interval                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Your Restart Interval                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


        4.1 Version (Vers)
        The version of bfd defined by this  draft,  that  has  support
        for GR  configuration  and  a  diag  for  neighbor  restarting
        state, shall have a value of 2.

A.Palanivelan                                                  [Page 5]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt     June 2010

        4.2  Diagnostic  (Diag)

        A diagnostic code specifying the  local  system's  reason  for
        the last change in session state.  A  new  diag  value  9  for
        "Neighbor   Restarting" is introduced in this draft.

        Values are:
        0 -- No Diagnostic
        1 -- Control Detection Time Expired
        2 -- Echo Function Failed
        3 -- Neighbor Signaled Session Down
        4 -- Forwarding Plane Reset
        5 -- Path Down
        6 -- Concatenated  Path  Down
        7 -- Administratively  Down
        8 -- Reverse Concatenated Path Down
        9 -  Neighbor  Restarting
        10-31 -- Reserved for future use

        This field allows remote systems to determine the reason  that
        the previous session failed.


        4.3 My Restart Interval

        This  is  the  restart  interval,in   microseconds,   of   the
        transmitting system advertised to the remote  system.  In  the
        case of a restart (of transmitting system), the remote  system
        is expected to keep the bfd session up  for  this  duration  of
        time.This field shall have a value greater than the detection
        time.Value of 0 shall indicate to the remote system that this
        system has bfd-gr disabled.

        4.4 Your Restart Interval

        The  restart  interval,in  microseconds,  received  from   the
        corresponding remote system. In the  case  of  a  restart  (of
        remote system), the transmitting system is  expected  to  keep
        the bfd session up for this duration of time.This field shall
        have a value greater than the detection time.


5. State Machine for BFD with GR Support


   The BFD state machine is quite straightforward and explained in detail
   by [BFD].The [BFD} RFC describes different states for BFD as:
   Down, Init,Up, AdminDown.


A.Palanivelan                                                  [Page 6]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt     June 2010


   Each system communicates its session state in the State (Sta) field
   in the BFD Control packet, and that received state, in combination
   with the local session state, drives the state machine.

   Please refer [BFD] for state machine diagram and detailed explanations
   of the state transitions.

   The following diagram provides an overview of the state machine, for
   state transitions for BFD with GR support (where "Your Restart Interval"
   has a non-zero value and greater than Detection time).This document
   does not introduce any new state to BFD state machine.

   The "Your Restart Interval" shall have a value greater than the detection
   time value.If this value is zero or less than the detection time value,
   the state transitions shall completely follow bfd state machine as
   defined by [BFD].

   The notation on each arc represents the state of the remote system (as
   received in the State field in the BFD Control packet) or indicates the
   expiration of the Detection Timer.


                                          +-----+
                                          |     | INIT, UP
                                          |     v
                                   +-----------------------------+
                         +-------->|State =  UP, Diag = 0,       |
                         |         |Timer = "Detect Interval"    |<----+
                         |         |                             |     |
                         |         +-----------------------------+     |
                         |           |                 |               |
                         |           |                 |               |
                         |           |          {Neighbor Restart}     |
                         |           |                 |               |
                         |           |                 |               |
                         |INIT,UP    |                 | {Neighbor Restart
                         |           |                 |         complete}
                         |           |                 |               |
                         |           |                 v               |
                      +-------+      |   +----------------------------------+
                +-----|       |      |   | State = UP, Diag = 9,            |
                |     |       |      |   | Timer = "Your Restart interval"  |
          DOWN  |     | INIT  |      |   +----------------------------------+
                +---->|       |      |                     |
                      +-------+      |ADMIN DOWN,          |
                          |          |DOWN,                |
                          |          |TIMER                |
                          |          v          ADMIN DOWN,|
                          |       +-------+           DOWN,|
                          |       |       |          TIMER |
               ADMIN DOWN,|       | DOWN  |<---------------+
                    TIMER |       |       |
                          +------>|       |
                                  +-------+
                                    |   ^
                                    |   | UP, ADMIN DOWN, TIMER
                                    |   |
                                    +---+


   Note1: This state diagram holds for bfd with GR extension,as described
   in this document, which implies that  "your Restart Interval" has a
   value greater than the Detection time value of the established session.

   Note2: The parts of the diagram with flower braces {} indicates the GR
   Specific events on the remote neighbor(Restart/Restart complete).


A.Palanivelan                                                  [Page 7]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt    June 2010



6. Theory of Operation

The system that  has  support  for  high-availability,  when  using  a
routing protocol  that  is  GR  enabled,  shall  continue  to  forward
traffic during a restart.when bfd is enabled on such  a  protocol,  it
is expected to assist the process than disturb it.  With  current  bfd
implementations, the bfd sessions  do  not  survive  a  restart  under
different conditions.An Unplanned restart or a planned restart with  a
protocol such as IS-IS that cannot signal about restart,  are  some  of
the conditions where bfd config is set to impact  a  high-availability
situation.  Though  there  are  certain  implementations  adopted   by
various companies to make bfd survive restarts, there  is  no  uniform
method of achieving this and is  likely  to  fail  when  interop  with
routers from other companies.This draft proposes  a  standard  way  of
achieving this objective.

This draft recommends the introduction of a  new  diag  value  (9  for
"Neighbor restarting"), new version (2 for GR supported bfd)  and  two
additional sections to the bfd packet format.This design  is  expected
to provide a capability to bfd in withstanding restart  scenarios,  in
complementing the associated  protocol.This  shall  work  consistently
irrespective of the bfd mode  or  protocol  or  the  type  of  restart.


6.1. Session Establishment and GR Timer exchange

The bfd session establishment follows the procedures as described in [BFD].
if the technology described by this document were to be implemented, the
bfd control packets shall have the following field(s) with the values
given below:

The Version field (Section 4.1) shall have a value of 2,indicating the
support for GR.

A new section to  the  bfd  control packet format,"My Restart Interval"
(Section 4.3) shall have a non-zero value that is greater than the
detection time.

A new section to  the  bfd  control packet format,"Your Restart Interval"
(Section 4.4)  shall have a non-zero value that is greater than the
detection time.

The "My Restart Interval" and  "Your  Restart  The interval" shall be
used in exchanging the GR timers information between the systems.

"My  Restart Interval" is  the  time  interval in  microseconds, that
this system expects its remote system to wait for, before bringing
down its bfd session  with  this  system.

"Your  Restart  Interval" is the time interval in microseconds, specified
by  the  remote  system,  that it expects this system,  to  wait  for,
by this system before bringing  down  its  bfd session by the remote
system.


A.Palanivelan                                                  [Page 8]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt    June 2010


Once the packet exchanges  are  complete  and  the  bfd  sessions  are
up,every bfd session will have info,  about  the  time  interval,  its
remote system will wait during a Restart and also  the  time  interval
this system has  to  wait,when  the  remote  system  restarts.The  "My
Restart Interval" and the "Your Restart Interval" values can be modified
after the  session  is  up, just like the other bfd parameters, and
in this case, the packet exchanges shall sync up the restart  interval
times (My and Your) on both the sides appropriately.

The exchange of GR Specific parameters, during bfd session establishment
is indicated in the diagram below.The diagram shows only the part of
control packets, for the purpose of clarity.

          SystemA                               SystemB
              |                                    |
              |                                    |
              |----------------------------------->|
              |  {bfd.version = 2                  |
              |  bfd.MyRestartInterval  = AAAA     |
              |  bfd.YourRestartInterval = 0 }     |
              |                                    |
              |<-----------------------------------|
              |  {bfd.version = 2                  |
              |  bfd.MyRestartInterval  = BBBB     |
              |  bfd.YourRestartInterval= AAAA }   |
              |                                    |
              |----------------------------------->|
              |  {bfd.version = 2                  |
              |  bfd.MyRestartInterval  = AAAA     |
              |  bfd.YourRestartInterval= BBBB }   |
              |                                    |

The initial bfd packet exchange  between  the  system  to  remote  system
shall have the exchanged values  for  the  "My  Restart  Interval"  or
0.The "Your Restart Interval" will reflect the value received  in  "My
Restart Interval" from the corresponding remote system or  is  Zero  if
value is not set.A value of Zero for  "Your  Restart  Interval"  shall
mean that the bfd GR is disabled at the  remote  end  and  similarly  a
value of Zero for "My Restart Interval" shall  mean  that  bfd  GR  is
disabled at the transmitting system.


6.2. Remote Neighbor Restart and Recovery

When the bfd neighbors that have their bfd sessions established (with
their bfd GR timer values exchanged as described above),the following
set of operations take place, when the remote neighbor attempts a
graceful restart (For eg.with a GR enabled routing protocol like OSPFv2/
IS-IS tied with BFD).


A.Palanivelan                                                  [Page 9]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt    June 2010


Once the packet exchanges  are  complete  and  the  bfd  sessions  are
up,every bfd session will have info,  about  the  time  interval,  its
remote system will wait during a Restart and also  the  time  interval
this system has  to  wait,when  the  remote  system  restarts.

For clarity, let us revisit the bfd timers and bfd detection time as
described in [BFD].

The Detection Time (the period of time without receiving BFD packets
after which the session is determined to have failed) is not carried
explicitly in the protocol.  Rather, it is calculated independently
in each direction by the receiving system based on the negotiated
transmit interval and the detection multiplier.

This means that a bfd control packet shall be received from the remote
neighbor within the detection time.When the bfd control packet is not
received from the remote neighbor within this time, the timer expiry,
shall bring the bfd session state to down.

In the case of Graceful Restart scenario, we may end up in a situation
that the routing protocol (like ospfv2) is in graceful restart  mode
with the remote neighbor restarting, and the system not receiving bfd
control packets within the detection time, due to other CPU intensive
processes in the system.This shall be addressed if the technology
proposed by this document were adapted.

When the set of systems had their bfd sessions established , with GR
support, as described in this document,when the remote neighbor
restarts, it shall set the bfd diagnostics field to a value of 9
(Neighbor Restarting) in the control packet to its neighbor (local
system).

When the local system receives bfd control packet with diag field set to
9, the local system shall update its timer to the previously exchanged
value of "Your Restart Interval".This effectively means that the local
system shall wait for a bfd control packet till "Your Restart Interval"
instead of Detection time.This shall be the case as long as the diag
field from the remote neighbor is 9.

When the restart is complete and the remote neighbor recovers, the remote
neighbor shall set the Diagnostics field to a value of 0.The local system
on receiving bfd control packets, with diag field set to 0, understands
that the restart process for remote neighbor is complete and hence shall
revert the timer, back to detection time (by calculation) and shall expect
control packets from the neighbor within this detection time.

If the remote neighbor is not recovering in time to send a bfd control
packet within the previously communicated "Your Restart Interval", the
timer expiry, shall bring the session down.

It is important to have a meaningful values to the "Your Restart Interval"
and  "My Restart Interval" to complement the GR timers in the associated
protocol.


A.Palanivelan                                                  [Page 10]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt    June 2010


7. Security Considerations

Security considerations discussed in [BFD], [BFD-1HOP] apply to this
document.


8. IANA Considerations

If this technology were to be implemented, it would need two sections
added to the BFD generic packet format namely "My Restart Interval" and
"Your Restart Interval"  as described in Section4 of this document.

This document also defines a Diag  value  of  9  to  be  used  to
specify  "Neighbor  Restarting" in addition to the "BFD Diagnostic Codes"
defined by [BFD] and referred in Section4.2 of this document.If this
technology were to be implemented, the "BFD Diagnostic Codes" need to be
updated as:

      Value    BFD Diagnostic Code Name
      -----    ------------------------
       0       No Diagnostic
       1       Control Detection Time Expired
       2       Echo Function Failed
       3       Neighbor Signaled Session Down
       4       Forwarding Plane Reset
       5       Path Down
       6       Concatenated Path Down
       7       Administratively Down
       8       Reverse Concatenated Path Down
       9       Neighbor Restarting
       10-31   Unassigned


9. References

9.1. Normative References

   [BFD] Katz, D., and Ward, D., "Bidirectional Forwarding Detection",
         RFC 5880, June, 2010.

   [BFD-1HOP] Katz, D., and Ward, D., "BFD for IPv4 and IPv6 (Single
       Hop)", RFC 5881, June, 2010.



9.2. Informative References

   [IS-IS-GRACE] Shand, M., and Ginsberg, L., "Restart signaling for IS-
       IS", RFC 5306, October 2008.

   [OSPF-GRACE] Moy, J., et al, "Graceful OSPF Restart", RFC 3623,
       November 2003.




A.Palanivelan                                                   [Page 11]


Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt      June 2010


10. Authors' Addresses

Palanivelan A
Cisco Systems,
Bangalore,India.

Email: apvelan@cisco.com



























A.Palanivelan                                                   [Page 12]

Internet Draft         draft-palanivelan-bfd-v2-gr-05.txt      June 2010