CCAMP Working Group Jonathan P. Lang, Ed.
Internet Draft Bala Rajagopalan, Ed.
Expiration Date: March, 2004
September, 2003
Generalized MPLS Recovery Functional Specification
draft-ietf-ccamp-gmpls-recovery-functional-01.txt
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 [RFC2026].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document presents a functional description of the protocol
extensions needed to support GMPLS-based recovery (i.e. protection
and restoration). Protocol specific formats and mechanisms will be
described in companion documents.
Lang, J., Rajagopalan, B., et al [Page 1]
Internet Draft draft-ietf-ccamp-gmpls-recovery-functional-01.txt
Contributors
This document was the product of many individuals working together
in the CCAMP WG Protection and Restoration design team. The
following are the authors that contributed to this document:
Deborah Brungard (AT&T)
Rm. D1-3C22 - 200 S. Laurel Ave.
Middletown, NJ 07748, USA
E-mail: dbrungard@att.com
Sudheer Dharanikota (Consult)
E-mail: sudheer@ieee.org
Jonathan P. Lang
Email: jplang@ieee.org
Guangzhi Li (AT&T)
180 Park Avenue,
Florham Park, NJ 07932, USA
E-mail: gli@research.att.com
Eric Mannie (Consult)
E-mail: eric_mannie@hotmail.com
Dimitri Papadimitriou (Alcatel)
Francis Wellesplein, 1
B-2018 Antwerpen, Belgium
E-mail: dimitri.papadimitriou@alcatel.be
Bala Rajagopalan
Tellium
2 Crescent Place - P.O. Box 901
Oceanport, NJ 07757-0901, USA
Email: braja@tellium.com
Yakov Rekhter (Juniper)
1194 N. Mathilda Avenue
Sunnyvale, CA 94089, USA
E-mail: yakov@juniper.net
1. Introduction
A requirement for the development of a common control plane for both
optical and electronic switching equipment is that there must be
signaling, routing, and link management mechanisms that support data
plane fault recovery. In this document, the term "recovery" is
generically used to denote both protection and restoration; the
specific terms "protection" and "restoration" are only used when
differentiation is required. The subtle distinction between
protection and restoration is made based on the resource allocation
done during the recovery period (see [TERM]).
A label-switched path (LSP) may be subject to local (span), segment,
and/or end-to-end recovery. Local span protection refers to the
protection of the link (and hence all the LSPs marked as required
for span protection and routed over the link) between two
neighboring switches. Segment protection refers to the recovery of
an LSP segment (i.e., an SNC in the ITU-T terminology) between two
nodes, i.e. the boundary nodes of the segment. End-to-end protection
refers to the protection of an entire LSP from the ingress to the
egress port. The end-to-end recovery models discussed in this draft
apply to segment protection where the source and destination refer
to the protected segment rather than the entire LSP. Multiple
recovery levels may be used concurrently by a single LSP for added
resiliency; however, the interaction between levels becomes
affecting any one direction of the LSP results in both directions of
the LSP being switched to a new span, segment, or end-to-end path.
Unless otherwise stated, all references to "link" in this draft
indicate a bi-directional link (which may be realized as a pair of
unidirectional links).
Consider the control plane message flow during the establishment of
an LSP. This message flow proceeds from an initiating (or source)
node to a terminating (or destination) node, via a sequence of
intermediate nodes. A node along the LSP is said to be UPSTREAM from
another node if the former occurs first in the sequence. The latter
node is said to be DOWNSTREAM from the former node. That is, an
UPSTREAM node is closer to the initiating node than a node further
DOWNSTREAM. Unless otherwise stated, all references to UPSTREAM and
DOWNSTREAM are in terms of the control plane message flow.
The flow of the data traffic is defined from ingress (source node)
to egress (destination node). Note that for bi-directional LSPs
there are two different data plane flows, one for each direction of
the LSP. This document presents a protocol functional description to
support GMPLS-based recovery (i.e., protection and restoration).
Protocol specific formats and mechanisms will be described in
companion documents.
2. Span Protection
Consider a (working) link i between two nodes A and B. There are two
fundamental models for span protection. The first is referred to as
1+1 protection. Under this model, a dedicated link j is pre-assigned
to protect link i. LSP traffic is permanently bridged onto both
links i and j at the ingress node and the egress node selects the
signal (i.e., normal traffic) from i or j, based on a selection
function (e.g., signal quality). Under unidirectional 1+1 span
protection (Section 2.1), each node A and B acts autonomously to
select the signal from the working link (i) or the protection link
(j). Under bi-directional 1+1 span protection (Section 2.2) the two
nodes A and B coordinate the selection function such that they
select the signal from the same link, i or j.
Under the second model, a set of N working links are protected by a
set of M protection links, usually with M <= N. A failure in any of
the N working links results in traffic being switched to one of the
M protection links that is available. This is typically a three-step
process: first the data plane failure is detected at the egress node
and reported (notification), then a protection link is selected, and
finally, the LSPs on the failed link are moved to the protection
link. If reversion is supported, a fourth step is included, i.e.
return of the traffic to the working link (when the working link has
recovered from the failure). In Section 2.3, 1:1 span protection is
described. In Section 2.4, M:N span protection is described, where M
<= N.
2.1 Unidirectional 1+1 dedicated protection
Suppose a bi-directional LSP is routed over link i between two nodes
A and B. Under unidirectional 1+1 protection, a dedicated link j is
pre-assigned to protect the working link i. LSP traffic is
permanently bridged on both links at the ingress node and the egress
node selects the normal traffic from one of the links, i or j. If a
node (A or B) detects a failure of a span, it autonomously invokes a
process to receive the traffic from the protection span. Thus, it is
possible that node A selects the signal from link i in the B to A
direction of the LSP, and node B selects the signal from link j in
the A to B direction.
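The autonomous selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the specification: the Selector class, the signal_ok health map, and the link labels "i" and "j" are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Selector:
    """Receive-side selector at one end of a 1+1 protected span (hypothetical model)."""
    working: str = "i"
    protection: str = "j"
    selected: str = "i"

    def on_signal_status(self, signal_ok):
        """Pick the link to receive from, given per-link signal health.

        The node acts on its own; no message is sent to the far end.
        """
        if not signal_ok[self.selected]:
            other = self.protection if self.selected == self.working else self.working
            if signal_ok[other]:
                self.selected = other
        return self.selected


# Node A and node B run independent selectors, one per receive direction.
node_a = Selector()
node_b = Selector()

# Only the A-to-B direction of link i fails, so only B (its receiver) switches.
node_b.on_signal_status({"i": False, "j": True})
node_a.on_signal_status({"i": True, "j": True})
```

After this run the two ends select different links, which is the asymmetric outcome the text allows for unidirectional 1+1 protection.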
The following functionality is required for 1+1 unidirectional span
protection:
o Routing: A single TE link encompassing both working and
protection links should be announced with Link Protection Type
"Dedicated 1+1" along with the bandwidth parameters for the
working link. As the resources are consumed/released, the
bandwidth parameters of the TE link are adjusted accordingly.
Encoding of the Link Protection Type and bandwidth parameters
in IS-IS is specified in [GMPLS-ISIS]. Encoding of this
information in OSPF is specified in [GMPLS-OSPF].
o Signaling: The Link Protection object/TLV should be used to
request "Dedicated 1+1" link protection for that LSP. This
object/TLV is defined in [GMPLS-SIG]. If the Link Protection
object/TLV is not used, link selection is a matter of local
policy. No additional signaling is required when a fail-over
occurs.
o Link management: Both nodes must have a consistent view of the
link protection association for the spans. This can be done
using the Link Management Protocol (LMP), or if LMP is not
used, this must be configured manually.
2.2 Bi-directional 1+1 dedicated protection
Suppose an LSP is routed over link i between two nodes A and B.
Under bi-directional 1+1 protection, a dedicated link j is pre-
assigned to protect the working link i. LSP traffic is permanently
duplicated on both links and under normal conditions, the traffic
from link i is received by nodes A and B (in the appropriate
directions). A failure affecting link i results in both A and B
switching to the traffic on link j in the respective directions.
Note that some form of signaling is required to ensure that both A
and B start receiving from the protection link.
The basic steps in 1+1 bi-directional span protection are as
follows:
1. If a node (A or B) detects the failure of the working link (or
a degradation of signal quality over the working link), it
should begin receiving on the protection link and send a
switchover message reliably to the other node (B or A,
respectively). This message should indicate the identity of the
failed working link and other relevant information.
2. Upon receipt of the switchover message, a node MUST begin
receiving from the protection link and send a switchover
response message to the other node (A or B, respectively).
Since both the working/protect spans are exposed to routing &
signaling as a single link, the switchover should be
transparent to routing and signaling.
o The routing procedures are the same as in 1+1
unidirectional.
o The signaling procedures are the same as in 1+1
unidirectional.
o In addition to the procedures described in 1+1
(unidirectional), a switchover request message must be
used to signal the switchover request. This can be done
using LMP. Note that GMPLS-based mechanisms may not be
necessary when the underlying span (transport) technology
provides such a mechanism.
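The two-step exchange above can be sketched as a pair of cooperating nodes. This is an illustrative model only: the Node class and its method names are hypothetical, and direct method calls stand in for the reliable message delivery that the text requires (for example, via LMP).

```python
class Node:
    """One end of a bi-directional 1+1 protected span (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.receiving_from = "working"
        self.peer = None
        self.log = []

    def detect_failure(self, failed_link):
        # Step 1: switch the local selector, then notify the peer.
        self.receiving_from = "protection"
        self.log.append("sent switchover request")
        self.peer.on_switchover_request(failed_link)

    def on_switchover_request(self, failed_link):
        # Step 2: the receiver also selects the protection link and responds.
        self.receiving_from = "protection"
        self.log.append(f"sent switchover response for {failed_link}")
        self.peer.on_switchover_response(failed_link)

    def on_switchover_response(self, failed_link):
        self.log.append(f"switchover confirmed for {failed_link}")


a, b = Node("A"), Node("B")
a.peer, b.peer = b, a
a.detect_failure("i")  # A detects the failure of working link i
```

Both ends finish selecting the protection link, which is the coordination property that distinguishes the bi-directional case from the unidirectional one.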
2.3 Dedicated 1:1 protection with Extra Traffic
Consider two adjacent nodes A and B. Under 1:1 protection, a
dedicated link j between A and B is pre-assigned to protect working
link i. Link j may be carrying (preemptable) Extra Traffic. A
failure affecting link i results in the corresponding LSP(s) being
restored to link j. Extra Traffic being routed over link j may need
to be preempted to accommodate the LSPs that have to be restored.
Once a fault is isolated/localized, the affected LSP(s) must be
moved to the protection link. The process of moving an LSP from a
failed (working) link to a protection link must be initiated by one
of the nodes, A or B. This node is referred to as the "master". The
other node is called the "slave". The determination of the master
and the slave may be based on configured information or protocol
specific requirements.
The basic steps in dedicated 1:1 span protection (ignoring
reversion) are as follows:
1. If the master detects/localizes a link failure event, it
invokes a process to allocate the protection link to the
affected LSP(s).
2. If the slave detects a link failure event, it informs the
master of the failure using a failure indication message.
The master then invokes the same procedure as (1) to move
the LSPs to the protection link. If the protection link is
carrying Extra Traffic, the slave stops using the span for
the Extra Traffic.
3. Once the span protection procedure is invoked in the
master, it requests the slave to switch the affected LSP(s)
to the protection link. Prior to this, if the protection
link is carrying Extra Traffic, the master stops using the
span for this traffic (i.e., the traffic is dropped by the
master and not forwarded into or out of the protection
link).
4. The slave sends an acknowledgement to the master. Prior to
this, the slave stops using the link for Extra
Traffic (i.e., the traffic is dropped by the slave and not
forwarded into or out of the protection link). It then
starts sending the normal traffic on the selected
protection link.
5. When the master receives the acknowledgement, it starts
sending and receiving the normal traffic over the new link.
The switchover of the LSPs is thus completed.
From the description above, it is clear that 1:1 span protection may
require up to three signaling messages for each failed span: a
failure indication message, an LSP switchover request message, and
an LSP switchover response message. Furthermore, it may be possible
to switch multiple LSPs from the working span to the protection span
simultaneously.
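The ordering of actions in steps 1 through 5 can be sketched as a function that returns the sequence for a given detection point. This is a hedged illustration: the function name and the action labels are hypothetical and do not correspond to any protocol encoding.

```python
def one_to_one_switchover(detected_by, extra_traffic=True):
    """Return the ordered actions for a 1:1 span switchover.

    detected_by is "master" or "slave". Action strings are illustrative labels.
    """
    actions = []
    if detected_by == "slave":
        # Step 2: the slave reports the failure to the master.
        actions.append("slave: send failure indication")
    # Steps 1 and 3: the master allocates the protection link, drops any
    # Extra Traffic it carries on it, then asks the slave to switch.
    if extra_traffic:
        actions.append("master: drop extra traffic")
    actions.append("master: send switchover request")
    # Step 4: the slave drops Extra Traffic, acknowledges, and starts sending.
    if extra_traffic:
        actions.append("slave: drop extra traffic")
    actions.append("slave: send switchover response")
    actions.append("slave: send normal traffic on protection link")
    # Step 5: the master completes the switchover on acknowledgement.
    actions.append("master: send and receive normal traffic on protection link")
    return actions


SIGNALING = ("failure indication", "switchover request", "switchover response")
slave_case = one_to_one_switchover("slave")
master_case = one_to_one_switchover("master")
slave_messages = [a for a in slave_case if a.endswith(SIGNALING)]
master_messages = [a for a in master_case if a.endswith(SIGNALING)]
```

When the slave detects the failure, three signaling messages appear in the sequence; when the master detects it, the failure indication is not needed and two remain.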
o Pre-emption MUST be supported to accommodate Extra
Traffic.
o Routing: A single TE link encompassing both working and
protection links is announced with Link Protection Type
"Dedicated 1:1". If Extra Traffic is supported over the
protection link, then the bandwidth parameters for the
protection link must also be announced. The
differentiation between bandwidth for working and protect
links is made using priority mechanisms. In other words,
the network must be configured such that bandwidth at
priority X or lower is considered Extra Traffic.
If there is a failure on the working link, then the normal
traffic is switched to the protection link, preempting
Extra Traffic if necessary. The bandwidth for the
protection link must be adjusted accordingly.
o Signaling: To establish an LSP on the working link, the
Link Protection object/TLV indicating "Dedicated 1:1"
should be included in the signaling request message for
that LSP. To establish an LSP on the protection link, the
appropriate priority (indicating Extra Traffic) should be
used for that LSP. These objects/TLVs are defined in
[GMPLS-SIG]. If the Link Protection object/TLV is not
used, link selection is a matter of local policy.
o Link management: Both nodes must have a consistent view of
the link protection association for the spans. This can be
done using LMP or via manual configuration.
o When a link failure is detected at the slave, a failure
indication message must be sent to the master informing
the node of the link failure.
2.4 Shared M:N protection
Shared M:N protection is described with respect to two neighboring
nodes A and B. The scenario considered is as follows:
o At any point in time, there are two sets of links between
A and B, i.e., a working set of N (bi-directional) links
carrying traffic subject to protection and a protection
set of M (bi-directional) links. A protection link may be
carrying Extra Traffic. There is no a priori relationship
between the two sets of links, but the value of M and N
may be pre-configured. The specific links in the
protection set MAY be pre-configured to be physically
diverse to avoid the possibility that failure events
affect a large proportion of protection links (along with
working links).
o When a link in the working set is affected by a failure,
the normal traffic is diverted to a link in the protection
set, if such a link is available. Note that such a link
might be carrying more than one LSP, e.g., an OC-192 link
carrying four STS-48 LSPs.
o More than one link in the working set may be affected by
the same failure event. In this case, there may not be an
adequate number of protection links to accommodate all of
the affected traffic carried by failed working links. The
set of affected working links that are actually restored
over available protection links is then subject to
policies (e.g., based on relative priority of working
traffic). These policies are not specified in this draft.
o When normal traffic must be diverted from a failed link in
the working set to a protection link, the decision as to
which protection link is chosen is always made by one of
the nodes, A or B. This node is considered the "master"
and it is required to both apply any policies and select
specific protection links to divert working traffic. The
other node is considered the "slave". The determination of
the master and the slave may be based on configured
information, protocol specific requirements, or as a
result of running a neighbor discovery procedure.
o Failure events themselves are detected by transport layer
mechanisms if available (e.g., SONET Alarm Indication
Signal (AIS)/ Remote Defect Indication (RDI)). Since the
bi-directional links are formed by a pair of
unidirectional links, a failure in the link from A to B is
typically detected by B and a failure in the opposite
direction is detected by A. It is possible that a failure
simultaneously affects both directions of the bi-
directional link. In this case, A and B will concurrently
detect failures, in the B-to-A direction and in the A-to-B
direction, respectively.
The basic steps in M:N protection (ignoring reversion) are as
follows:
1. If the master detects a failure of a working link, it
autonomously invokes a process to allocate a protection link
to the affected traffic.
2. If the slave detects a failure of a working link, it MUST
inform the master of the failure using a failure indication
message. The master then invokes the same procedure as above
to allocate a protection link. (It is possible that the
master has itself detected the same failure, for example, a
failure simultaneously affecting both directions of a link).
3. Once the master has determined the identity of the
protection link, it indicates this to the slave and requests
the switchover of the traffic (using a "switchover request"
message). Prior to this, if the protection link is carrying
Extra Traffic, the master stops using the link for this
traffic (i.e., the traffic is dropped by the master and not
forwarded into or out of the protection link).
4. The slave sends a "switchover response" message back to the
master. Prior to this, if the selected protection link is
carrying traffic that could be preempted, the slave stops
using the link for this traffic (i.e., the traffic is
dropped by the slave and not forwarded into or out of the
protection link). It then starts sending the normal traffic
on the selected protection link.
5. When the master receives the switchover response, it starts
sending and receiving the traffic that was previously
carried on the now-failed link over the new link.
From the description above, it is clear that M:N span restoration
(involving LSP local recovery) may require up to three messages for
each working link being switched: a failure indication message, a
switchover request message and a switchover response message.
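The master's allocation step can be sketched as follows. The draft does not specify the selection policy, so the priority ordering used here is only one possible example; the function name, the priority map, and the convention that a lower number means more important traffic are all assumptions made for illustration.

```python
def allocate_protection(failed_links, free_protection_links, priority):
    """Map failed working links to free protection links, highest priority first.

    priority maps a working link to a number; a lower number means more
    important traffic (an assumed convention). Returns (assignments, unprotected).
    """
    ordered = sorted(failed_links, key=lambda link: priority[link])
    # zip stops at the shorter sequence, so at most M links are assigned.
    assignments = dict(zip(ordered, free_protection_links))
    unprotected = ordered[len(assignments):]
    return assignments, unprotected


# Three working links fail together, but only two protection links are free.
assignments, unprotected = allocate_protection(
    failed_links=["w1", "w2", "w3"],
    free_protection_links=["p1", "p2"],
    priority={"w1": 2, "w2": 0, "w3": 1},
)
```

Here the lowest-priority link w1 is left without a protection link, which matches the case in which a single failure event affects more working links than there are available protection links.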
o Pre-emption MUST be supported to accommodate Extra
Traffic.
o Routing: A single TE link encompassing both sets of
working and protect links should be announced with Link
Protection Type "Shared M:N". If Extra Traffic is
supported over the set of protection links, then the
bandwidth parameters for the set of protection links must
also be announced. The differentiation between bandwidth
for working and protect links is made using priority
mechanisms.
If there is a failure on a working link, then the affected
LSP(s) must be switched to a protection link, preempting Extra
Traffic if necessary. The bandwidth for the protection link
must be adjusted accordingly.
o Signaling: To establish an LSP on the working link, the
Link Protection object/TLV indicating "Shared M:N" should
be included in the signaling request message for that LSP.
To establish an LSP on the protection link, the
appropriate priority (indicating Extra Traffic) should be
used for that LSP. These objects/TLVs are defined in [GMPLS-
SIG]. If the Link Protection object/TLV is not used, link
selection is a matter of local policy.
o For link management, both nodes must have a consistent
view of the link protection association for the links.
This can be done using LMP or via manual configuration.
2.6 Messages
The following messages are used in local span protection procedures.
All these messages must be transmitted reliably from the message
source to the message destination.
2.6.1 Failure Indication Message
This message is sent from the slave to the master to indicate the
identities of one or more failed working links. (This message may
not be necessary when the transport plane technology itself provides
for such a notification).
The number of links included in the message would depend on the
number of failures detected within a window of time by the sending
node. A node may choose to send separate failure indication messages
in the interest of completing the recovery for a given link within
an implementation-dependent time constraint.
2.6.2 Switchover Request Message
Under bi-directional 1+1 span protection, this message is used to
coordinate the selecting function at both nodes. This message is
originated at the node that detected the failure.
Under dedicated 1:1 and shared M:N span protection, this message is
used as an LSP switchover request. This message is sent from the
master node to the slave node (reliably) to indicate that the LSP(s)
on the (failed) working link can be switched to an available
protection link. If so, the ID of the protection link as well as the
LSP labels (if necessary) must be indicated. The identifiers used
must be consistent with those used in GMPLS signaling.
A working link may carry multiple LSPs. Since the normal traffic
carried over the working link is switched to the protection link, it
may be possible for the LSPs on the working link to be mapped to the
protection link without re-signaling each individual LSP. For
example, if link bundling [BUNDLE] is used where the working and
protect links are mapped to component links, and the labels are the
same on the working and protection links, it may be possible to
change the component links without needing to re-signal each
individual LSP. Optionally, the labels may need to be explicitly
coordinated between the two nodes. In this case, the switchover
request message should carry the new label mappings.
The master may not be able to find protection links to accommodate
all failed working links. Thus, if this message is generated in
response to a Failure Indication message from the slave then the set
of failed links in the message may be a sub-set of the links
received in the Failure Indication message. Depending on time
constraints, the master may switch the normal traffic from the set
of failed links in smaller batches. Thus, a single failure
indication message may result in the master sending more than one
Switchover Request message to the same slave node.
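The relationship between one Failure Indication message and the resulting Switchover Request messages can be sketched as below. This is an illustrative sketch only: the function name, the dictionary representation of a request, and the batch_size parameter are hypothetical and say nothing about the actual message encoding.

```python
def build_switchover_requests(reported_failed, assignments, batch_size):
    """Split the recoverable links into Switchover Request batches.

    reported_failed: links listed in the Failure Indication message.
    assignments: working link -> protection link chosen by the master.
    """
    # Only links for which the master found a protection link are included,
    # so the result may cover a subset of the reported failures.
    recoverable = [link for link in reported_failed if link in assignments]
    requests = []
    for start in range(0, len(recoverable), batch_size):
        batch = recoverable[start:start + batch_size]
        requests.append({link: assignments[link] for link in batch})
    return requests


requests = build_switchover_requests(
    reported_failed=["w1", "w2", "w3", "w4"],
    assignments={"w1": "p1", "w2": "p2", "w4": "p3"},
    batch_size=2,
)
```

In this run, link w3 has no protection link and is omitted, and the remaining three links are carried in two requests to the same slave.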
2.6.3 Switchover Response Message
This message is sent from the slave to the master (reliably) to
indicate the completion (or failure) of switchover at the slave. In
this message, the slave may indicate that it cannot switch over to
the corresponding free link for some reason. The master and slave in
this case notify the user (operator) of the failed switchover. A
notification of the failure may also be used as a trigger in an end-
to-end recovery.
2.7 Preventing Unintended Connections
An unintended connection occurs when traffic from the wrong source
is delivered to a receiver. This must be prevented during protection
switching. This is primarily a concern when the protection link is
being used to carry Extra Traffic. In this case, it must be ensured
that the LSP traffic being switched from the (failed) working link
to the protection link is not delivered to the receiver of the
preempted traffic. Thus, in the message flow described above, the
master node MUST disconnect (any) preempted traffic on the selected
protection link before sending the Switchover Request. The slave
node MUST also disconnect preempted traffic before sending the
Switchover Response. In addition, the master node should start
receiving traffic for the protected LSP from the protection link.
Finally, the master node should start sending protected traffic on
the protection link upon receipt of the Switchover Response.
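The ordering constraints above can be expressed as a check over a recorded sequence of events. This is a sketch under assumptions: the event names are hypothetical labels, and the check assumes a protection link that carries Extra Traffic, so every listed event is expected to appear in the trace.

```python
def violates_ordering(trace):
    """Return the first (before, after) rule broken by the trace, or None.

    trace is an ordered list of event names (hypothetical labels).
    """
    rules = [
        ("master_disconnect_extra", "master_send_request"),
        ("slave_disconnect_extra", "slave_send_response"),
        ("master_recv_response", "master_send_protected"),
    ]
    position = {event: index for index, event in enumerate(trace)}
    for before, after in rules:
        if after in position:
            # The "after" event is only safe if "before" already happened.
            if before not in position or position[before] > position[after]:
                return (before, after)
    return None


good_trace = [
    "master_disconnect_extra",
    "master_send_request",
    "slave_disconnect_extra",
    "slave_send_response",
    "master_recv_response",
    "master_send_protected",
]
bad_trace = ["master_send_request", "master_disconnect_extra"]
```

A trace in which the master sends the Switchover Request before disconnecting the preempted traffic is reported as a violation, since that ordering is what could cause an unintended connection.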
3.0 End-to-End (Path) Protection and Restoration
End-to-end path protection and restoration refer to the recovery of
an entire LSP from the initiator to the terminator. Suppose the
primary path of an LSP is routed from the initiator (Node A) to the
terminator (Node B) through a set of intermediate nodes. In the
following subsections, we describe three previously proposed end-to-
end protection schemes and the functional steps needed to implement
them.
3.1 Unidirectional 1+1 Protection
A dedicated, resource-disjoint alternate path is pre-established to
protect the LSP. Traffic is simultaneously sent on both paths and
received from one of the functional paths by the end nodes A and B.
There is no explicit signaling involved with this mode of
protection.
3.2 Bi-directional 1+1 Protection
A dedicated, resource-disjoint alternate path is pre-established to
protect the LSP. Traffic is simultaneously sent on both paths; under
normal conditions, the traffic from the working path is received by
nodes A and B (in the appropriate directions). A failure affecting
the working path results in both A and B switching to the traffic on
the protection path in the respective directions.
Note that this requires coordination between the end nodes to switch
to the protection path.
The basic steps in bi-directional 1+1 path protection are as
follows:
o Failure detection: There are two possibilities for this.
1. A node in the working path detects a failure
event. Such a node must send a failure indication
message towards the upstream and/or downstream end
node of the LSP (node A or B). This message may be
forwarded along the working path, or routed over a
different path if the network has general routing
intelligence. Mechanisms provided by the data
transport plane may also be used for this, if
available.
2. The end nodes (A or B) detect the failure
themselves (e.g., loss of light).
o Switchover: The action when an end node detects a failure
in the working path is as follows: Start receiving from
the protection path. At the same time, send a switchover
request message to the other end node to enable switching
at the other end.
The action when an end node receives a switchover message is as
follows:
- Start receiving from the protection path. At the same
time, send a switchover response message to the other end
node.
GMPLS signaling mechanisms may be used to (reliably) signal the
switchover request. This message may be forwarded along the
protection path if no other routing intelligence is available in the
network.
3.2.1 Identifiers
LSP Identifier: A unique identifier for each LSP. The LSP Identifier
is within the scope of the Source ID and Destination ID.
Source ID: ID of the source (e.g., IP address).
Destination ID: ID of the destination (e.g., IP address).
3.2.2 Nodal Information
Each node that is on the working or protection path of an LSP must
have knowledge of the LSP identifier as well as the previous and
next nodes in the LSP. This is so that restoration-related messages
may be forwarded properly. The optical network may also have general
routing intelligence. In this case, messages may be forwarded along
paths different than that of the LSP.
The nodal information may be assembled when the working and
protection paths of the LSP are provisioned using signaling, or may
be configured when LSP provisioning does not involve signaling
(e.g., provisioning through a management system). This information
must remain until the LSP is explicitly de-provisioned.
3.2.3 End-to-End Failure Indication Message
This message is sent (reliably) by an intermediate node towards the
source of an LSP. For instance, such a node might have attempted
local span protection and failed. This message may not be necessary
if the data transport layer provides mechanisms for the notification
of LSP failure by the endpoints (i.e. if LSP endpoints are co-
located with a corresponding data (transport) maintenance/recovery
domain).
Consider a node that detects a link failure. The node must determine
the identities of all LSPs that are affected by the failure of the
link, and send an end-to-end failure indication message to the
source of each LSP. Each intermediate node receiving such a message
must forward the message to the appropriate next node such that the
message would ultimately reach the LSP source. Furthermore, if an
intermediate node is itself generating a failure indication message,
there SHOULD be a mechanism to suppress all but one source of
failure indication messages. Finally, the failure indication message
must be sent reliably from the node detecting the failure to the LSP
source. Reliability may be achieved, for example, by re-transmitting
the message until an acknowledgement is received.
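The two tasks of the detecting node, finding the affected LSPs and forwarding the indication hop by hop towards each source, can be sketched as follows. This is a minimal illustration: the function names and the representation of an LSP route as an ordered list of node names (source first) are assumptions, and retransmission until acknowledgement is not modeled.

```python
def affected_lsps(lsp_routes, failed_link):
    """Return the LSPs whose route uses failed_link (an unordered node pair)."""
    hit = []
    for lsp_id, route in lsp_routes.items():
        hops = zip(route, route[1:])
        if any({a, b} == set(failed_link) for a, b in hops):
            hit.append(lsp_id)
    return hit


def path_to_source(route, detecting_node):
    """Nodes a failure indication traverses, hop by hop, back to the source."""
    index = route.index(detecting_node)
    return route[index::-1]


lsp_routes = {
    "lsp1": ["A", "C", "D", "B"],
    "lsp2": ["A", "E", "B"],
    "lsp3": ["F", "C", "D", "G"],
}
failed = affected_lsps(lsp_routes, ("C", "D"))
route_back = path_to_source(lsp_routes["lsp1"], "D")
```

Each node on the returned path only needs the LSP identifier and its previous node to forward the message, which is the nodal information described in Section 3.2.2.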
3.2.4 End-to-End Failure Acknowledge Message
This message is sent by the source node in response to an End-to-End
failure indication message. This message is sent to the originator
of the failure indication message. The acknowledge message should be
sent for each failure indication message received. Each
intermediate node receiving the acknowledge message must forward it
towards the destination of the message.
3.2.5 End-to-End Switchover Request Message
This message is generated by the source node receiving an indication
of failure in an LSP. It is sent to the LSP destination, and it
carries the identifier of the LSP being restored. The End-to-End
Switchover Request message must be sent reliably from the source to the
destination of the LSP.
3.2.6 End-to-End Switchover Response Message
This message is sent by the destination node receiving an End-to-End
Switchover Request message towards the source of the LSP. This
message should identify the LSP being switched over. This message
must be transmitted in response to each End-to-End Switchover
Request message received.
3.3 Shared Mesh Restoration
Shared mesh restoration refers to schemes under which protection
paths for multiple LSPs share common link and node resources. Under
these schemes, the protection capacity is pre-reserved, i.e., link
capacity is allocated to protect one or more LSPs but explicit
action is required to instantiate a specific protection LSP. This
requires restoration signaling along the protection path.
Typically, the protection capacity is shared only amongst LSPs whose
working paths are physically diverse. This criterion can be enforced
when provisioning the protection path. Specifically, provisioning-
related signaling messages may carry information about the working
path to nodes along the protection path. This can be used as call
admission control to accept/reject connections along the protection
path based on the identification of the resources used for the
primary path.
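The admission check described above can be sketched as a disjointness test. This is an assumption-laden illustration: the function name is hypothetical, and working paths are represented as lists of link identifiers, whereas a real implementation might compare shared risk groups or other resource identifiers instead.

```python
def can_share(new_working_links, existing_working_paths):
    """Accept sharing only if the new working path is disjoint from the
    working paths of every LSP already using the protection capacity."""
    new_links = set(new_working_links)
    return all(new_links.isdisjoint(path) for path in existing_working_paths)


# Working paths of LSPs that already share the protection capacity at this node.
existing = [["l3"], ["l4", "l5"]]
accepted = can_share(["l1", "l2"], existing)
rejected = can_share(["l4", "l9"], existing)
```

The second request is refused because its working path uses link l4, so a single failure of l4 would cause two LSPs to claim the same protection capacity.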
Thus, shared mesh restoration is designed to protect an LSP after a
single failure event, i.e., a failure that affects the working path
of at most one LSP sharing the protection capacity. It is possible
that a protection path may not be successfully activated when
multiple, concurrent failure events occur. In this case, shared mesh
restoration capacity may be claimed for more than one failed LSP and
the protection path can be activated only for one of them (at most).
For implementing shared mesh restoration, the identifier and nodal
information related to signaling along the control path are as
defined for 1+1 protection in Sections 3.2.1 and 3.2.2. In addition,
each node must also keep (local) information needed to establish the
data plane of the protection path. This information must indicate
the local resources to be allocated, the fabric cross-connect to be
established to activate the path, etc. The precise nature of this
information would depend on the type of node and LSP (the GMPLS
signaling specification describes different types of switches
[GMPLS-SIG]). It would also depend on whether the information is
fine-grained or coarse-grained. For example, fine-grained
information would indicate pre-selection of all details pertaining
to protection path activation, such as outgoing link, labels, etc.
Coarse-grained information, on
the other hand, would allow some details to be determined during
protection path activation. For example, protection resources may be
pre-selected at the level of a TE link, while the selection of the
specific component link and label occurs during protection path
activation.
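The difference between the two granularities can be illustrated
with a small data structure. This is only a sketch: ProtectionHop,
resolve(), and the pick_free callback (which stands in for whatever
local selection logic a node uses) are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProtectionHop:
    """Hypothetical per-node record for one hop of a protection path."""

    te_link: str                          # always pre-selected
    component_link: Optional[str] = None  # None means "decide at activation"
    label: Optional[int] = None           # None means "decide at activation"

    @property
    def fine_grained(self) -> bool:
        return self.component_link is not None and self.label is not None


def resolve(hop, pick_free):
    """Return a fully specified hop at activation time.

    pick_free(te_link) is an assumed callback returning a free
    (component_link, label) pair within the given TE link.
    """
    if hop.fine_grained:
        return hop  # nothing left to decide during restoration
    component_link, label = pick_free(hop.te_link)
    return ProtectionHop(hop.te_link, component_link, label)
```

A fine-grained record passes through resolve() unchanged, while a
coarse-grained record triggers the extra selection step during the
restoration phase.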
While the coarser specification allows some flexibility in selection
of the precise resource to activate, it also brings in more
complexity in decision making and signaling during the time-critical
restoration phase. Furthermore, the procedures for the assignment of
bandwidth to protection paths must take into account the total
resources in a TE link so that single-failure survivability
requirements are satisfied.
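One way to reason about the bandwidth requirement is sketched
below. It assumes that each LSP protected over a TE link is
described by its bandwidth and by the set of resources its working
path traverses, and that a single failure affects exactly one such
resource. The function name and input format are illustrative only.

```python
from collections import defaultdict


def required_shared_bandwidth(lsps):
    """Protection bandwidth needed on one TE link for any single failure.

    lsps: iterable of (bandwidth, working_resources) pairs, one per LSP
    whose protection path uses this TE link (assumed representation).
    """
    per_failure = defaultdict(int)
    for bandwidth, working_resources in lsps:
        # If this resource fails, this LSP claims its protection bandwidth.
        for resource in set(working_resources):
            per_failure[resource] += bandwidth
    # The worst single failure determines the capacity to reserve.
    return max(per_failure.values(), default=0)
```

For example, with LSPs of bandwidth 10, 5, and 7 whose working
paths use resources {a, b}, {c}, and {b} respectively, a failure of
b is the worst case and 17 units must be reserved, compared with 22
units if the capacity were dedicated per LSP.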
3.3.1 End-to-End Failure Indication and Acknowledgement
The End-to-End failure indication and acknowledgement procedures and
messages are as defined in Sections 3.2.3 and 3.2.4.
3.3.2 End-to-End Switchover Request
This message is generated by the source node receiving an indication
of failure in an LSP. It is sent to the LSP destination along the
protection path, and it identifies the LSP being restored. If any
intermediate node is unable to establish cross-connects for the
protection path, then it is desirable that no other node in the path
establishes cross-connects for the path. This would allow shared
mesh restoration paths to be efficiently utilized.
The End-to-End Switchover Request message must be sent reliably from the
source to the destination of the LSP along the protection path.
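The desired all-or-nothing behavior can be sketched as follows.
This document does not specify how the behavior is achieved; the
sketch assumes a simple rollback, in which nodes that have already
established cross-connects release them when a downstream node
fails. The Node class and its methods are hypothetical.

```python
class Node:
    """Hypothetical node on the protection path."""

    def __init__(self, name, can_connect=True):
        self.name = name
        self.can_connect = can_connect
        self.connected = set()  # LSPs with an active cross-connect here

    def cross_connect(self, lsp_id):
        if self.can_connect:
            self.connected.add(lsp_id)
        return self.can_connect

    def release(self, lsp_id):
        self.connected.discard(lsp_id)


def activate_protection_path(nodes, lsp_id):
    """Process a switchover request hop by hop from source to destination.

    Returns True if every node established its cross-connect. On the
    first failure, cross-connects made so far are released (assumed
    rollback policy) so the shared resources remain usable by others.
    """
    done = []
    for node in nodes:
        if not node.cross_connect(lsp_id):
            for previous in reversed(done):
                previous.release(lsp_id)
            return False
        done.append(node)
    return True
```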
3.3.3 End-to-End Switchover Response
This message is sent by the destination node receiving an End-to-End
Switchover Request message towards the source of the LSP, along the
protection path. This message should identify the LSP that is being
switched over. Prior to activating the secondary bandwidth at each
hop along the path, Extra Traffic (if used) must be dropped and not
forwarded.
This message must be transmitted in response to each End-to-End
Switchover Request message received.
4. Reversion and other Administrative Procedures
Reversion refers to the process of moving an LSP back to the
original working path after a failure is cleared and the path is
repaired. Reversion applies both to local span and end-to-end path
protected LSPs. Reversion is desired for the following reasons.
First, the protection path may not be optimal as compared to the
working path from a routing and resource consumption point of view.
Second, moving an LSP to its working path allows the protection
resources to be used to protect other LSPs. Reversion has the
disadvantage of causing a second service disruption. Use of
reversion is at the option of the operator. Reversion implies that a
working path remains allocated to the LSP that was originally routed
over it even after a failure. It is important to have mechanisms
that allow reversion to be performed with minimal service disruption
to the customer. This can be achieved using a "bridge-and-switch"
approach (often referred to as make-before-break).
The basic steps involved in bridge-and-switch are:
1. The source node commences the process by "bridging" the signal
onto both the working and the protection paths (or links in the
case of span protection).
2. Once the bridging process is complete, the source node sends a
Bridge and Switch Request message to the destination,
identifying the LSP and other information necessary to perform
reversion. Upon receipt of this message, the destination
selects the signal from the working path. At the same time, it
bridges the transmitted signal onto both the working and
protection paths.
3. The destination then sends a Bridge and Switch Response message
to the source confirming the completion of the operation.
4. When the source receives this message, it switches to receive
from the working path, and stops transmitting traffic on the
protection path. The source then sends a Bridge and Switch
Completed message to the destination confirming that the LSP
has been reverted.
5. Upon receipt of this message, the destination stops
transmitting along the protection path and de-activates the LSP
along this path. The de-activation procedure should remove the
cross-connects along the protection path (and free the resources
to be used for restoring other failures).
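The five steps above can be traced with a small model of the two
end nodes. This is a sketch for illustration only: the Endpoint
class and the revert() function are hypothetical, message delivery
is assumed to be reliable and in order, and the message names are
recorded as plain strings.

```python
from dataclasses import dataclass, field

WORKING, PROTECTION = "working", "protection"


@dataclass
class Endpoint:
    """Hypothetical end node of an LSP currently on its protection path."""

    transmit: set = field(default_factory=lambda: {PROTECTION})
    receive: str = PROTECTION


def revert(source, destination):
    """Run bridge-and-switch and return the messages sent, in order."""
    sent = []
    # Step 1: the source bridges onto both paths.
    source.transmit = {WORKING, PROTECTION}
    # Step 2: the destination selects from working and bridges as well.
    sent.append("Bridge and Switch Request")
    destination.receive = WORKING
    destination.transmit = {WORKING, PROTECTION}
    # Step 3: the destination confirms.
    sent.append("Bridge and Switch Response")
    # Step 4: the source selects from working and stops using protection.
    source.receive = WORKING
    source.transmit = {WORKING}
    sent.append("Bridge and Switch Completed")
    # Step 5: the destination stops transmitting on the protection path.
    destination.transmit = {WORKING}
    return sent
```

At every point in this sequence each node is receiving from a path
on which its peer is transmitting, which is what keeps the service
disruption minimal.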
Administrative procedures other than reversion include the ability
to force a switchover (from working to protect or vice versa), and
locking out switchover, i.e., preventing an LSP from moving from
working to protect administratively. These administrative conditions
have to be supported by signaling.
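How these administrative conditions might interact with automatic
switchover is sketched below. The state names and the precedence
among them are assumptions made for the example; this document does
not define them.

```python
from enum import Enum


class AdminState(Enum):
    """Hypothetical administrative conditions for one LSP."""

    NONE = "none"
    FORCED_TO_PROTECTION = "forced-to-protection"
    FORCED_TO_WORKING = "forced-to-working"
    LOCKOUT = "lockout"  # LSP may not move from working to protection


def select_path(admin, working_failed):
    """Return the path the LSP should use: "working" or "protection"."""
    if admin in (AdminState.LOCKOUT, AdminState.FORCED_TO_WORKING):
        return "working"  # assumed: administrative state overrides failure
    if admin is AdminState.FORCED_TO_PROTECTION:
        return "protection"
    return "protection" if working_failed else "working"
```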
5. Discussion
5.1 LSP Priorities During Protection
Under span protection, a failure event could affect more than one
working link and there could be fewer protection links than the
number of failed working links. Furthermore, a working link may
contain multiple LSPs of varying priority. Under this scenario, a
decision must be made as to which working links (and therefore LSPs)
should be protected. This decision may be based on LSP priorities.
In general, a node might detect failures sequentially, i.e., all
failed working links may not be detected simultaneously. In this
case, as per the proposed signaling
procedures, LSPs on a working link may be switched over to a given
protection link, but another failure (of a working link carrying
higher priority LSPs) may be detected soon afterwards. In this case,
the new LSPs may bump the ones previously switched over to the
protection link.
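A priority-based decision of this kind can be sketched as follows.
The sketch assumes that a lower number means a higher priority and
that a working link is ranked by the highest-priority LSP it
carries; both choices, and the function name, are illustrative.

```python
def assign_protection(failed_links, protection_links):
    """Map failed working links to the available protection links.

    failed_links: dict of working link -> list of LSP priorities on it
    (lower number = higher priority, an assumption of this sketch).
    protection_links: list of available protection links.
    Working links left out of the result remain unprotected.
    """
    ranked = sorted(
        failed_links,
        key=lambda link: min(failed_links[link], default=float("inf")),
    )
    # zip() stops when the shorter sequence (usually the protection
    # links) is exhausted.
    return dict(zip(ranked, protection_links))
```

Re-running the function each time a new failure is detected models
the bumping behavior: a working link that was assigned a protection
link earlier can lose it to a later failure that carries higher
priority LSPs.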
In the case of end-to-end shared mesh restoration, priorities may be
implemented for allocating shared link resources under multiple
failure scenarios. As described in Section 3.3, more than one LSP
can claim shared resources under multiple failure scenarios. If such
resources are first allocated to a lower priority LSP, they may have
to be reclaimed and allocated to a higher priority LSP.
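Reclaiming a shared resource for a higher priority LSP can be
sketched as follows. The class and method names are hypothetical,
and the sketch again assumes that a lower number means a higher
priority.

```python
class SharedResource:
    """Hypothetical shared protection resource at one node."""

    def __init__(self):
        self.holder = None  # (lsp_id, priority) of the active LSP, if any

    def claim(self, lsp_id, priority):
        """Try to activate lsp_id on this resource.

        Returns (granted, preempted_lsp_id). A claim preempts the
        current holder only if it has strictly higher priority.
        """
        if self.holder is None:
            self.holder = (lsp_id, priority)
            return True, None
        held_id, held_priority = self.holder
        if priority < held_priority:
            self.holder = (lsp_id, priority)
            return True, held_id  # the caller must tear down held_id
        return False, None
```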
6. Authors' Addresses
Jonathan P. Lang
email: jplang@ieee.org
Bala Rajagopalan
Tellium, Inc.
2 Crescent Place
P.O. Box 901
Oceanport, NJ 07757-0901
email: braja@tellium.com
7. Intellectual Property Considerations
This section is taken from Section 10.4 of [RFC2026].
The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; neither does it represent that it
has made any effort to identify any such rights. Information on the
IETF's procedures with respect to rights in standards-track and
standards-related documentation can be found in BCP-11. Copies of
claims of rights made available for publication and any assurances
of licenses to be made available, or the result of an attempt made
to obtain a general license or permission for the use of such
proprietary rights by implementors or users of this specification
can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights which may cover technology that may be required to practice
this standard. Please address the information to the IETF Executive
Director.
8. References
8.1 Normative References
[BUNDLE] Kompella, K., Rekhter, Y. and Berger, L., "Link Bundling in
MPLS Traffic Engineering", draft-ietf-mpls-bundle-04.txt (work in
progress).
[GMPLS-ISIS] Kompella, K., Rekhter, Y., Banerjee, A. et al, "IS-IS
Extensions in Support of Generalized MPLS", draft-ietf-isis-gmpls-
extensions-16.txt (work in progress).
[GMPLS-OSPF] Kompella, K., Rekhter, Y., Banerjee, A. et al, "OSPF
Extensions in Support of Generalized MPLS", draft-ietf-ccamp-ospf-
gmpls-extensions-09.txt (work in progress).
[GMPLS-SIG] Ashwood-Smith, P., Banerjee, A., et al, "Generalized
MPLS - Signaling Functional Description," RFC 3471.
[LMP] Lang, J., ed., "Link Management Protocol (LMP) v1.0",
draft-ietf-ccamp-lmp-09.txt (work in progress).
8.2 Informative References
[RFC2026] Bradner, S., "The Internet Standards Process -- Revision
3," BCP 9, RFC 2026, October 1996.
[TERM] Mannie, E., Papadimitriou, D., ed., "Recovery (Protection
Internet Draft, draft-mannie-gmpls-recovery-terminology-02.txt,
(work in progress).
Full Copyright Statement
"Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."
This draft expires in March, 2004.