INTERNET-DRAFT
Bidirectional Forwarding Detection A. Palanivelan
Category: HISTORIC Cisco Systems
Expires: Dec 2010 June 09, 2010
Bidirectional Forwarding Detection (BFD) with Graceful Restart
draft-palanivelan-bfd-v2-gr-05.txt
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 10, 2010.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the BSD License.
A.Palanivelan [Page 1]
Internet Draft draft-palanivelan-bfd-v2-gr-05.txt June 2010
Abstract
This document proposes an extension to Bidirectional Forwarding
Detection (BFD) to support Graceful Restart (GR), complementing the
Graceful Restart support of the underlying protocol. The extension is
intended to work consistently irrespective of the BFD mode, the
protocol, the type of restart and, most importantly, the vendor's
design and implementation. This document describes in detail the
challenges BFD faces in surviving a graceful restart and a generic
solution to overcome them.
Table of Contents
1 INTRODUCTION ............................................. 3
2 OVERVIEW ............................................... 3
3 MOTIVATIONS .............................................. 4
3.1 Planned Restarts with control protocols ............... 4
3.2 BFD Co-existing with BB configs ....................... 5
4 Extensions to BFD .......................................... 5
4.1 Version (Vers)........................................ 5
4.2 Diagnostic (Diag)..................................... 6
4.3 My Restart Interval.................................... 6
4.4 Your Restart Interval.................................. 6
5 State Machine for BFD with GR Support....................... 6
6 Theory of operation......................................... 8
6.1 Session Establishment and GR Timer exchange............ 8
6.2 Remote Neighbor Restart and Recovery................... 9
7 Security Considerations...................................... 11
8 IANA Considerations......................................... 11
9 References ................................................. 11
9.1 Normative References................................... 11
9.2 Informative References................................. 11
10 Author's address............................................ 12
1. Introduction
The Bidirectional Forwarding Detection protocol [BFD] provides a
mechanism for liveness detection of arbitrary paths between systems.
It is intended to provide low-overhead, short-duration detection of
failures in the path between adjacent forwarding engines, including
the interfaces, data link(s), and to the extent possible the
forwarding engines themselves. It operates independently of
media, data protocols, and routing protocols. An additional goal is to
provide a single mechanism that can be used for liveness detection
over any media, at any protocol layer, with a wide range of detection
times and overhead, to avoid a proliferation of different methods.
The extensions introduced in this draft help BFD complement the GR
capabilities of protocols such as OSPF and provide consistent
behavior for planned and unplanned restarts, irrespective of the
underlying protocol. The intention of this document is to provide a
solution that works for all types of BFD implementations.
2. Overview
The Bidirectional Forwarding Detection [BFD] specification defines a
protocol with simple and specific semantics. Its sole purpose is to
verify connectivity between a pair of systems, for a particular data
protocol across a path (which may be of any technology, length, or
OSI layer). The promptness of the detection of a path failure can be
controlled by trading off protocol overhead and system load with
detection times.
BFD is generally assumed to work without any GR support of its own.
Deployments show, however, that the different implementations in
products and their inherent mechanisms lead to issues with BFD,
especially in surviving GR. Prioritizing BFD would ensure that other
CPU-intensive processes do not cause BFD to fail, but this is not
always possible, as there may be other higher-priority processing
that cannot be ignored. An example is existing subscriber
connections, which cannot be given a lower priority.
The extensions introduced in this draft help BFD complement the GR
capabilities of protocols such as OSPF and provide consistent
behavior for planned and unplanned restarts of the underlying
protocols.
3. Motivations
Though the existing drafts discuss BFD interactions with applications
that use Graceful Restart and ways of implementing BFD to support a
successful GR, those drafts themselves carry some exceptions and
caveats. This draft discusses the issues in the following scenarios
and provides a generic solution that scales to future applications
and works for all types of BFD implementations.
* Planned restart with a control protocol, such as IS-IS, that cannot
signal GR.
* BFD co-existing with Broadband (BB) configurations.
This document addresses the above issues specifically and the
Graceful Restart mechanism for BFD in general.
3.1 Planned Restarts with control protocols
The existing BFD drafts suggest administratively disabling BFD prior
to the start of GR. However, this works only for planned restarts and
not for unplanned restarts. It also does not work for a protocol such
as IS-IS that cannot signal a planned restart.
For a planned restart where a control protocol can signal before
restarting, if a BFD session failure occurs during the restart, the
existing document(s) recommend that the planned restart SHOULD NOT be
aborted and the session failure SHOULD NOT result in a topology
change being signaled in the control protocol. Control protocols that
cannot signal a planned restart depend on the recently restarted
system to signal the Graceful Restart prior to the control protocol
adjacency timeout.
In most cases, whether the restart is planned or unplanned, it is
likely that the BFD session will time out prior to the onset of
Graceful Restart, and a topology change SHALL be signaled. This
behavior impacts non-stop routing and non-stop forwarding support
using GR-enabled protocols, and it provides an opportunity to review
and improve the existing BFD implementations.
3.2 BFD Co-existing with BB configs
In a real deployment with Broadband (BB) configurations, it is highly
likely that BFD sessions do not survive a Graceful Restart.
Assume a router at the PE that has active DHCP sessions with a large
number of clients (say 16k). During a planned restart, it is likely
that some DHCP clients request renewal of their IP addresses from the
server (the restarting router) at that time. While the router is
restarting, these requests do not reach it. When the requests reach
the router just after it has come up, it treats them at a high
priority and responds to them. With thousands of such requests, the
router spends a major part of its first second of uptime addressing
them. In this scenario, a control protocol like OSPFv2 that has GR
enabled [OSPF-GRACE] withstands the restart for the specified restart
interval (which is in the order of seconds) and is likely to survive
the restart while maintaining its forwarding plane.
In the same scenario, if BFD is enabled for OSPFv2, then for an
unplanned restart the BFD neighbor router expects BFD control packets
at millisecond intervals and is likely to time out during the restart
process, which also impacts the associated OSPFv2 adjacency and
results in loss of traffic.
The scenario is the same for BFD with a protocol such as IS-IS
[IS-IS-GRACE], where the problem is likely to be seen even for a
planned restart.
4. Extensions to BFD
This draft introduces a new Diag value to indicate that the neighbor
is restarting, and provisions to configure graceful restart timers.
The Generic BFD Control Packet Format shown below introduces two
additional fields, "My Restart Interval" and "Your Restart
Interval".
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Vers |  Diag   |Sta|P|F|C|A|D|M|  Detect Mult  |    Length     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       My Discriminator                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Your Discriminator                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Desired Min TX Interval                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Required Min RX Interval                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Required Min Echo RX Interval                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      My Restart Interval                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Your Restart Interval                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4.1 Version (Vers)
The version of BFD defined by this draft, which supports GR
configuration and a Diag value for the neighbor restarting
state, shall have a value of 2.
4.2 Diagnostic (Diag)
A diagnostic code specifying the local system's reason for
the last change in session state. A new diag value 9 for
"Neighbor Restarting" is introduced in this draft.
Values are:
0 -- No Diagnostic
1 -- Control Detection Time Expired
2 -- Echo Function Failed
3 -- Neighbor Signaled Session Down
4 -- Forwarding Plane Reset
5 -- Path Down
6 -- Concatenated Path Down
7 -- Administratively Down
8 -- Reverse Concatenated Path Down
9 -- Neighbor Restarting
10-31 -- Reserved for future use
This field allows remote systems to determine the reason that
the previous session failed.
4.3 My Restart Interval
This is the restart interval, in microseconds, of the
transmitting system, advertised to the remote system. In the
case of a restart (of the transmitting system), the remote
system is expected to keep the BFD session up for this duration
of time. This field shall have a value greater than the
detection time. A value of 0 shall indicate to the remote system
that this system has BFD GR disabled.
4.4 Your Restart Interval
The restart interval, in microseconds, received from the
corresponding remote system. In the case of a restart (of the
remote system), the transmitting system is expected to keep
the BFD session up for this duration of time. This field shall
have a value greater than the detection time.
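As a non-normative illustration, the extended control packet could
be encoded as sketched below in Python. The sketch assumes that the
two restart intervals are appended as 32-bit words after "Required
Min Echo RX Interval", giving a 32-octet packet, as the diagram in
Section 4 suggests; this document does not state the Length value.
The names ControlPacket, pack_control and unpack_control are
hypothetical and exist only for this example.

```python
import struct
from dataclasses import dataclass

# Assumed layout: four header octets followed by seven 32-bit words
# (the five words of [BFD] plus the two restart-interval words).
_FMT = "!4B7I"
GR_VERSION = 2
DIAG_NEIGHBOR_RESTARTING = 9


@dataclass
class ControlPacket:
    diag: int
    state: int                 # 0 AdminDown, 1 Down, 2 Init, 3 Up per [BFD]
    detect_mult: int
    my_disc: int
    your_disc: int
    desired_min_tx: int        # microseconds
    required_min_rx: int       # microseconds
    required_min_echo_rx: int  # microseconds
    my_restart: int            # microseconds, 0 means BFD GR disabled
    your_restart: int          # microseconds
    flags: int = 0             # P, F, C, A, D, M bits (low six bits)
    version: int = GR_VERSION


def pack_control(p: ControlPacket) -> bytes:
    # Vers occupies the top 3 bits of octet 0, Diag the low 5 bits.
    byte0 = ((p.version & 0x7) << 5) | (p.diag & 0x1F)
    # Sta occupies the top 2 bits of octet 1, the flag bits the rest.
    byte1 = ((p.state & 0x3) << 6) | (p.flags & 0x3F)
    return struct.pack(_FMT, byte0, byte1, p.detect_mult,
                       struct.calcsize(_FMT),
                       p.my_disc, p.your_disc, p.desired_min_tx,
                       p.required_min_rx, p.required_min_echo_rx,
                       p.my_restart, p.your_restart)


def unpack_control(data: bytes) -> ControlPacket:
    (b0, b1, mult, _length, my_d, your_d, tx, rx, echo,
     my_r, your_r) = struct.unpack(_FMT, data[:struct.calcsize(_FMT)])
    return ControlPacket(diag=b0 & 0x1F, state=b1 >> 6, detect_mult=mult,
                         my_disc=my_d, your_disc=your_d, desired_min_tx=tx,
                         required_min_rx=rx, required_min_echo_rx=echo,
                         my_restart=my_r, your_restart=your_r,
                         flags=b1 & 0x3F, version=b0 >> 5)
```

A packet built this way and passed through unpack_control yields the
same field values, which makes the sketch convenient for checking a
field layout before committing to an implementation.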
5. State Machine for BFD with GR Support
The BFD state machine is quite straightforward and is explained in
detail in [BFD]. [BFD] describes the following states for BFD:
Down, Init, Up, and AdminDown.
Each system communicates its session state in the State (Sta) field
in the BFD Control packet, and that received state, in combination
with the local session state, drives the state machine.
Please refer to [BFD] for the state machine diagram and detailed
explanations of the state transitions.
The following diagram provides an overview of the state machine and
the state transitions for BFD with GR support (where "Your Restart
Interval" is non-zero and greater than the Detection Time). This
document does not introduce any new state to the BFD state machine.
The "Your Restart Interval" shall have a value greater than the
Detection Time. If this value is zero or less than the Detection
Time, the state transitions shall completely follow the BFD state
machine as defined by [BFD].
The notation on each arc represents the state of the remote system (as
received in the State field in the BFD Control packet) or indicates the
expiration of the Detection Timer.
                                  +-----+
                                  |     | INIT, UP
                                  |     v
                      +-----------------------------+
            +-------->|State = UP, Diag = 0,        |
            |         |Timer = "Detect Interval"    |<----+
            |         |                             |     |
            |         +-----------------------------+     |
            |           |             |                   |
            |           |             |                   |
            |           |     {Neighbor Restart}          |
            |           |             |                   |
            |           |             |                   |
            |INIT,UP    |             |           {Neighbor Restart
            |           |             |            complete}
            |           |             |                   |
            |           |             v                   |
            +-------+   |  +----------------------------------+
      +-----|       |   |  | State = UP, Diag = 9,            |
      |     |       |   |  | Timer = "Your Restart interval"  |
 DOWN |     | INIT  |   |  +----------------------------------+
      +---->|       |   |                    |
            +-------+   |ADMIN DOWN,         |
            |           |DOWN,               |
            |           |TIMER               |
            |           v         ADMIN DOWN,|
            |       +-------+           DOWN,|
            |       |       |          TIMER |
 ADMIN DOWN,|       | DOWN  |<---------------+
 TIMER      |       |       |
            +------>|       |
                    +-------+
                      |   ^
                      |   | UP, ADMIN DOWN, TIMER
                      |   |
                      +---+
Note 1: This state diagram holds for BFD with the GR extension, as
described in this document, which implies that "Your Restart Interval"
has a value greater than the Detection Time of the established
session.
Note 2: The parts of the diagram in curly braces {} indicate the
GR-specific events on the remote neighbor (Restart/Restart complete).
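The arcs leaving the two Up boxes can be written down as a small
transition function. The following Python sketch is illustrative
only: the function names and the string labels are hypothetical, and
it assumes that "Diag" in the diagram refers to the Diag value last
received from the remote neighbor.

```python
DIAG_NEIGHBOR_RESTARTING = 9


def gr_applies(your_restart_us: int, detection_time_us: int) -> bool:
    # GR handling is used only when the advertised interval is non-zero
    # and greater than the Detection Time; otherwise the plain [BFD]
    # state machine applies.
    return your_restart_us != 0 and your_restart_us > detection_time_us


def next_from_up(event: str, remote_diag: int, gr_enabled: bool):
    """Arcs leaving either Up box of the diagram.

    event is the received remote state ("INIT", "UP", "DOWN",
    "ADMIN_DOWN") or "TIMER" for expiry of the running timer.
    Returns (next_state, timer), where timer names the interval to arm.
    """
    if event in ("DOWN", "ADMIN_DOWN", "TIMER"):
        return ("DOWN", None)
    if event in ("INIT", "UP"):
        if gr_enabled and remote_diag == DIAG_NEIGHBOR_RESTARTING:
            # {Neighbor Restart}: stay Up, wait "Your Restart Interval".
            return ("UP", "your_restart_interval")
        # {Neighbor Restart complete} or normal operation.
        return ("UP", "detection_time")
    raise ValueError(f"unknown event: {event}")
```

Keeping the GR check in a separate predicate mirrors the rule that a
zero or too-small "Your Restart Interval" falls back to the behavior
defined by [BFD].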
6. Theory of Operation
A system that supports high availability, when using a routing
protocol that is GR enabled, shall continue to forward traffic
during a restart. When BFD is enabled on such a protocol, it is
expected to assist the process rather than disturb it. With current
BFD implementations, BFD sessions do not survive a restart under
several conditions. An unplanned restart, or a planned restart with
a protocol such as IS-IS that cannot signal the restart, are some of
the conditions where the BFD configuration impacts high
availability. Though various vendors have adopted implementations
that make BFD survive restarts, there is no uniform method of
achieving this, and such implementations are likely to fail when
interoperating with routers from other vendors. This draft proposes
a standard way of achieving this objective.
This draft recommends the introduction of a new Diag value (9 for
"Neighbor Restarting"), a new version (2 for GR-supporting BFD) and
two additional fields in the BFD packet format. This design is
expected to give BFD the capability to withstand restart scenarios,
complementing the associated protocol. It shall work consistently
irrespective of the BFD mode, the protocol, or the type of restart.
6.1. Session Establishment and GR Timer exchange
BFD session establishment follows the procedures described in [BFD].
If the technology described by this document were to be implemented,
the BFD control packets shall have the following fields with the
values given below:
The Version field (Section 4.1) shall have a value of 2, indicating
support for GR.
A new field in the BFD control packet format, "My Restart Interval"
(Section 4.3), shall have a non-zero value that is greater than the
detection time.
A new field in the BFD control packet format, "Your Restart Interval"
(Section 4.4), shall have a non-zero value that is greater than the
detection time.
The "My Restart Interval" and "Your Restart Interval" fields shall be
used to exchange GR timer information between the systems.
"My Restart Interval" is the time interval, in microseconds, that
this system expects its remote system to wait before bringing down
its BFD session with this system.
"Your Restart Interval" is the time interval, in microseconds,
specified by the remote system, that the remote system expects this
system to wait before bringing down its BFD session with the remote
system.
Once the packet exchanges are complete and the BFD sessions are up,
every BFD session knows the time interval its remote system will
wait during a restart, and also the time interval this system has to
wait when the remote system restarts. The "My Restart Interval" and
"Your Restart Interval" values can be modified after the session is
up, just like the other BFD parameters; in this case, the packet
exchanges shall synchronize the restart intervals (My and Your) on
both sides.
The exchange of GR-specific parameters during BFD session
establishment is indicated in the diagram below. For clarity, the
diagram shows only the relevant part of the control packets.
SystemA                              SystemB
   |                                    |
   |                                    |
   |----------------------------------->|
   |  {bfd.version = 2                  |
   |   bfd.MyRestartInterval = AAAA     |
   |   bfd.YourRestartInterval = 0 }    |
   |                                    |
   |<-----------------------------------|
   |  {bfd.version = 2                  |
   |   bfd.MyRestartInterval = BBBB     |
   |   bfd.YourRestartInterval = AAAA } |
   |                                    |
   |----------------------------------->|
   |  {bfd.version = 2                  |
   |   bfd.MyRestartInterval = AAAA     |
   |   bfd.YourRestartInterval = BBBB } |
   |                                    |
In the initial BFD packet exchange between the system and the remote
system, "My Restart Interval" shall carry the locally configured
value, or 0. "Your Restart Interval" reflects the value received in
"My Restart Interval" from the corresponding remote system, or is
zero if the value is not set. A value of zero for "Your Restart
Interval" shall mean that BFD GR is disabled at the remote end, and
similarly a value of zero for "My Restart Interval" shall mean that
BFD GR is disabled at the transmitting system.
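The three-packet exchange described above can be simulated with a
few lines of Python. This is only a sketch: the Endpoint class, its
methods and the handshake helper are hypothetical names, and only
the GR-related parts of the control packet are modeled.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    my_restart: int        # locally configured, microseconds (0 = disabled)
    your_restart: int = 0  # learned from the peer, 0 until a packet arrives

    def build(self) -> dict:
        # Only the GR-related fields of the control packet are modeled.
        return {"version": 2,
                "my_restart": self.my_restart,
                "your_restart": self.your_restart}

    def receive(self, pkt: dict) -> None:
        # The peer's "My Restart Interval" becomes our "Your Restart
        # Interval" and is reflected back in the next packet we send.
        self.your_restart = pkt["my_restart"]


def handshake(a: Endpoint, b: Endpoint) -> list:
    """Replay the A->B, B->A, A->B sequence and return the packets sent."""
    trace = []
    for sender, receiver in ((a, b), (b, a), (a, b)):
        pkt = sender.build()
        receiver.receive(pkt)
        trace.append((sender.name, pkt))
    return trace
```

After the three packets, each endpoint holds the other side's value
in your_restart, matching the AAAA/BBBB pattern of the diagram.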
6.2. Remote Neighbor Restart and Recovery
When BFD neighbors have their BFD sessions established (with their
BFD GR timer values exchanged as described above), the following set
of operations takes place when the remote neighbor attempts a
graceful restart (e.g., with a GR-enabled routing protocol like
OSPFv2 or IS-IS tied to BFD).
For clarity, let us revisit the BFD timers and the BFD Detection Time
as described in [BFD].
The Detection Time (the period of time without receiving BFD packets
after which the session is determined to have failed) is not carried
explicitly in the protocol. Rather, it is calculated independently
in each direction by the receiving system based on the negotiated
transmit interval and the detection multiplier.
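As an illustration of this calculation, [BFD] defines the Detection
Time in Asynchronous mode as the Detect Mult received from the remote
system multiplied by the greater of the local Required Min RX
Interval and the last received Desired Min TX Interval. A minimal
Python sketch follows; the function and parameter names are
hypothetical, and all intervals are in microseconds.

```python
def detection_time_us(remote_detect_mult: int,
                      local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int) -> int:
    # Asynchronous mode per [BFD]: the remote Detect Mult times the
    # agreed transmit interval of the remote system.
    agreed_tx_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    return remote_detect_mult * agreed_tx_us
```

With a Detect Mult of 3 and 50 ms intervals in both directions, the
result is 150,000 microseconds, which is far shorter than a restart
interval measured in seconds.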
This means that a BFD control packet shall be received from the
remote neighbor within the detection time. When a BFD control packet
is not received from the remote neighbor within this time, the timer
expiry shall bring the BFD session state to Down.
In a Graceful Restart scenario, we may end up in a situation where
the routing protocol (like OSPFv2) is in graceful restart mode with
the remote neighbor restarting, and the system does not receive BFD
control packets within the detection time, due to other CPU-intensive
processes in the system. This would be addressed if the technology
proposed by this document were adopted.
When a set of systems has established BFD sessions with GR support,
as described in this document, and the remote neighbor restarts, the
remote neighbor shall set the BFD Diagnostic field to a value of 9
(Neighbor Restarting) in the control packets to its neighbor (the
local system).
When the local system receives a BFD control packet with the Diag
field set to 9, it shall update its timer to the previously exchanged
value of "Your Restart Interval". This effectively means that the
local system shall wait for a BFD control packet for "Your Restart
Interval" instead of the Detection Time. This shall be the case as
long as the Diag field from the remote neighbor is 9.
When the restart is complete and the remote neighbor recovers, the
remote neighbor shall set the Diagnostic field to a value of 0. The
local system, on receiving BFD control packets with the Diag field
set to 0, understands that the restart process of the remote neighbor
is complete and hence shall revert the timer to the (calculated)
Detection Time, and shall expect control packets from the neighbor
within this Detection Time.
If the remote neighbor does not recover in time to send a BFD control
packet within the previously communicated "Your Restart Interval",
the timer expiry shall bring the session down.
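The receive-side behavior described in this section can be sketched
as a small timer model. The Python below is an illustration under
stated assumptions, not a definitive implementation: the class name
GrAwareReceiver and its methods are hypothetical, time is passed in
explicitly as microseconds, and any Diag value other than 9 is
treated as ending the restart window.

```python
class GrAwareReceiver:
    """Local system's view of one established session (state Up)."""

    def __init__(self, detection_time_us: int, your_restart_us: int):
        self.detection_time_us = detection_time_us
        self.your_restart_us = your_restart_us
        self.state = "UP"
        self.timeout_us = detection_time_us
        self.last_rx_us = 0

    def on_packet(self, now_us: int, diag: int) -> None:
        if self.state != "UP":
            return
        self.last_rx_us = now_us
        gr_ok = self.your_restart_us > self.detection_time_us
        if diag == 9 and gr_ok:
            # Neighbor Restarting: wait "Your Restart Interval".
            self.timeout_us = self.your_restart_us
        else:
            # Restart complete (or GR not in use): normal Detection Time.
            self.timeout_us = self.detection_time_us

    def poll(self, now_us: int) -> str:
        # Timer expiry brings the session down in either mode.
        if self.state == "UP" and now_us - self.last_rx_us > self.timeout_us:
            self.state = "DOWN"
        return self.state
```

Because the timeout is re-selected on every received packet, the
extended interval applies only for as long as the neighbor keeps
sending Diag 9, as required above.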
It is important to assign meaningful values to "Your Restart
Interval" and "My Restart Interval" to complement the GR timers in
the associated protocol.
7. Security Considerations
The security considerations discussed in [BFD] and [BFD-1HOP] apply
to this document.
8. IANA Considerations
If this technology were to be implemented, it would need two fields
added to the BFD generic packet format, namely "My Restart Interval"
and "Your Restart Interval", as described in Section 4 of this
document.
This document also defines a Diag value of 9 to be used to specify
"Neighbor Restarting", in addition to the "BFD Diagnostic Codes"
defined by [BFD] and referred to in Section 4.2 of this document. If
this technology were to be implemented, the "BFD Diagnostic Codes"
would need to be updated as:
Value    BFD Diagnostic Code Name
-----    ------------------------
0        No Diagnostic
1        Control Detection Time Expired
2        Echo Function Failed
3        Neighbor Signaled Session Down
4        Forwarding Plane Reset
5        Path Down
6        Concatenated Path Down
7        Administratively Down
8        Reverse Concatenated Path Down
9        Neighbor Restarting
10-31    Unassigned
9. References
9.1. Normative References
[BFD] Katz, D. and D. Ward, "Bidirectional Forwarding Detection",
RFC 5880, June 2010.
[BFD-1HOP] Katz, D. and D. Ward, "BFD for IPv4 and IPv6 (Single
Hop)", RFC 5881, June 2010.
9.2. Informative References
[IS-IS-GRACE] Shand, M. and L. Ginsberg, "Restart Signaling for
IS-IS", RFC 5306, October 2008.
[OSPF-GRACE] Moy, J., et al., "Graceful OSPF Restart", RFC 3623,
November 2003.
10. Author's Address
Palanivelan A
Cisco Systems,
Bangalore, India.
Email: apvelan@cisco.com