PCE W. Lu
Internet-Draft S. Kini
Intended status: Standards Track S. Narayanan
Expires: April 21, 2011 Ericsson
October 18, 2010
Relayed CSPF for Multi-Area Multi-AS PCE
draft-lu-relayed-cspf-00
Abstract
For LSPs that span across multiple areas or multiple autonomous
systems (AS), the path computation element (PCE) in each area or each
AS can cooperate and conclude the optimal results if so exist. An
upstream PCE, though incapable to carry out the path computation for
the tailend outside of its domain, can provide the history of the
computation to its downstream PCEs which will assume the computation
job till it is accomplished or relay the baton to the next runner.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 21, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Lu, et al. Expires April 21, 2011 [Page 1]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Requirements Language . . . . . . . . . . . . . . . . . . 4
1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. CSPF Seed . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Multiple Seeds . . . . . . . . . . . . . . . . . . . . . . 4
2.2. Heap Equivalence . . . . . . . . . . . . . . . . . . . . . 4
2.2.1. SPT Equivalence . . . . . . . . . . . . . . . . . . . 5
2.2.2. Seed Deposit Timing . . . . . . . . . . . . . . . . . 6
2.2.3. Seed Set Reduction . . . . . . . . . . . . . . . . . . 6
3. Multi-Area Path Computation . . . . . . . . . . . . . . . . . 6
3.1. Inter-area PCE . . . . . . . . . . . . . . . . . . . . . . 6
3.2. Initial Seed Set . . . . . . . . . . . . . . . . . . . . . 7
3.3. PCE Relay . . . . . . . . . . . . . . . . . . . . . . . . 7
3.4. Relay Content . . . . . . . . . . . . . . . . . . . . . . 8
3.5. PCE Elect . . . . . . . . . . . . . . . . . . . . . . . . 9
4. Multi-Home Tail-End . . . . . . . . . . . . . . . . . . . . . 9
4.1. Relay Timer . . . . . . . . . . . . . . . . . . . . . . . 10
5. Multi-AS Path Computation . . . . . . . . . . . . . . . . . . 11
5.1. Information Hiding . . . . . . . . . . . . . . . . . . . . 11
5.2. Transit Link . . . . . . . . . . . . . . . . . . . . . . . 12
5.3. AS Number . . . . . . . . . . . . . . . . . . . . . . . . 12
6. Other Cases . . . . . . . . . . . . . . . . . . . . . . . . . 12
6.1. Backups . . . . . . . . . . . . . . . . . . . . . . . . . 12
6.2. SRLG . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
6.3. Loose ERO . . . . . . . . . . . . . . . . . . . . . . . . 13
6.3.1. Pre-computed EROs . . . . . . . . . . . . . . . . . . 13
6.3.2. Re-Query . . . . . . . . . . . . . . . . . . . . . . . 13
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
9. Security Considerations . . . . . . . . . . . . . . . . . . . 13
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 14
10.1. Normative References . . . . . . . . . . . . . . . . . . . 14
10.2. Informative References . . . . . . . . . . . . . . . . . . 14
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 14
Lu, et al. Expires April 21, 2011 [Page 2]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
1. Introduction
The demand for multi-area multi-AS path computation ability has
evolved from the academic discussion to carrier network's feature
request.
A few solutions have been proposed and there are three major
variations in this field.
A global TE database is ideal and the simplest for serving multi-area
or multi-as path request. This idea is apparently prohibitive for
two reasons.
1. Such database may be too big and negate the purpose of having
multiple areas or ASes;
2. This violates the information hiding and confidentiality
requirement and is unacceptable by ISPs.
A crankback method is more practical as it is an exhaustive search
based mechanism and will find an LSP if it exists. The drawback
nevertheless is obvious:
1. It does not scale, for one that it often requires and wastes more
than one tryout to find a qualified LSP, and for two it is RSVP
signaling based which is by its nature poor in scaling;
2. The extra signaling messages add burdens on the existing network;
3. The path, if found, is not guaranteed optimal;
4. It requires lots of manual configurations to specify border
routers and hence is labor intensive;
5. It requires substantial RSVP changes, in both protocol and
operation.
BRPC [RFC5441] is another idea in coping with the aforementioned
problems. It assumes that the destination is known in a particular
domain and area. This assumption is not always true. Besides the
destination may be multi-homed, meaning reachable through different
areas and domains. The RFC method cannot handle this. Further more,
the RFC's procedure mandates a PCEP [RFC5440] extension which
understands the Virtual Shortest Path Tree (VSPT). This further
complicates the method. Further more, VSPT approach only address one
destination at a time.
This document describes a relayed CSPF based PCE scheme which offers
Lu, et al. Expires April 21, 2011 [Page 3]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
a solution with optimality, scalability, and simplicity.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
1.2. Acronyms
CSPF - Constrained Shortest Path First
PCE - Path Computation Element
PCReq - Path Computation Request
PCRep - Path Computation Reply
AS - Autonomous System
LSP - Label Switched Path
RSVP - Resource ReserVation Protocol
PCEP - PCE to PCE Communication Protocol
SPT - Shortest Path Tree
VSPT - Virtual Shortest Path Tree
2. CSPF Seed
Call the node initially deposited into the heap seed. CSPF, or more
generically SPF, is a seed based algorithm. The entire SPT is built
upon this seed.
2.1. Multiple Seeds
It is not necessary, however, that the heap can have only one seed.
In fact, most of the time during the SPF expansion, the heap contains
many nodes which can be perceived as seeds for further expansion.
2.2. Heap Equivalence
An SPF heap possesses following properties:
Lu, et al. Expires April 21, 2011 [Page 4]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
1. A heap with one initial seed is equivalent to that with multiple
intermediate seeds in any SPF stages for the destinations that
have not yet been reached.
2. The deposit time of seeds is insensitive to destinations that
have not yet been reached, provided that the seeds carry correct
attributes values such as cost and nexthop.
3. The multiple seeds in property 1 can further be reduced to those
that constitute a set of nodes besides which the destinations are
not viable.
2.2.1. SPT Equivalence
The property 1 is not difficult to comprehend. During normal SPF
cycles, the heap will change and the path tree will grow. At any SPF
cycle, the path tree records reachability to certain destinations.
This is important to generic SPF applications such as IGP routing
protocols. With IGP, either OSPF [RFC2328] or IS-IS
[RFC1195][ISO.10589.1992], all destinations must be included, whether
they come out of the heap early or late. None of those can be
neglected.
Nevertheless in CSPF, since only the targeted destinations matter,
non-relevant records can be thrown away. The early path tree records
are insignificant and can be disregarded. Therefore for the selected
destinations, the expanded heap with multiple seeds is equivalent
with the heap at the initial stage.
Figure 1 gives a simple illustration of this concept. With normal
SPF computation, the Headend node "H" is used as an initial seed to
the SPF heap. The final shortest path tree (SPT) will provide
reachability to nodes "A", "B" and the Tailend node "T". If one uses
nodes "A" and "B" as deposit seeds, he will conclude a SPT with the
same reachability for "T". The difference between the two SPTs is
the reachabilities from "H" to nodes "A" and "B".
A
/ \
H T
\ /
B
Figure 1: SPT Equivalence
Lu, et al. Expires April 21, 2011 [Page 5]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
2.2.2. Seed Deposit Timing
The second property indicates that the seed deposit time does not
change its SPT contribution for destinations that have not yet been
reached. Figure 2 shows this concept in detail.
---A---
/ \
H--B---C--T
Figure 2: Seed Deposit Timing
For destination "T", using seed set "A" and "B", or "A" and "C" will
produce identical results.
2.2.3. Seed Set Reduction
The third property allows us to remove the seeds that will not
contribute to the path to the destinations. The statement is the key
point that this proposal is based upon.
As shown in Figure 3, for the reachability to "T", the initial seed
"H" can be replaced with "A and B", per property 1. The two seeds
can further be replaced with "A, C and D", per property 2. Now since
"D" will not contribute to the path to "T", the seed set "A, C, D"
can further be reduced to "A and C", which is the property 3.
---A---
/ \
H--B---C--T
\
--D--E
Figure 3: Seed Set Reduction
These properties are the base of multi-area multi-AS path computation
method described in this document.
3. Multi-Area Path Computation
3.1. Inter-area PCE
Per IGP nature, the Tailend in a different area are not visible to
the computing PCE if the PCE only knows the TE database of the
Headend area. There exist options for multi-area TE database or even
Lu, et al. Expires April 21, 2011 [Page 6]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
a global multi-AS TE-database. But that will have scalability and
confidentiality consequences. And this option is beyond of the scope
of this document. Nevertheless, using the method of this document,
the cross-area path can be achieved with no need of global TE
database, nor much manual intervene.
3.2. Initial Seed Set
Area border routers, or BN (Border Nodes) for short, lie in the
necessary paths to the destination in next area, or areas beyond.
They are natural choices of the initial seed set.
3.3. PCE Relay
The path computation proceeds as a relayed CSPF job, one per each
area. Take Figure 4 for example, assuming routers "A" and "C" are
BNs. The column line divides the topology into two areas, area
"west" on the left, and "east" on the right.
west : east
----A--
/ : \
H--B--C---T
:
Figure 4: PCE Relay
Following steps describe the procedure:
a. PCE in area "west", or PCE-west for short, computes paths to "A"
and "C";
b. PCE-west sends PCReq to PCE of "east" for relayed CSPF. The path
information to "A" and "C" are encoded in the same PCReq message.
The path information contains path attributes such as cost,
bandwidth, admin-group, hop-count, etc.
c. PCE-east uses the path information to BNs "A" and "C" and
deposits them to its SPF heap. It then starts its SPF, and
concludes the path computation to "T". The path to "T" if found,
will be the segment in area "east".
d. PCE-east then sends PCRep back to PCE-west with its path segment
information.
e. PCE-west maps the path segment to one of BNs. It then stitches
the segment to the one in area "west" and forms a complete path.
Lu, et al. Expires April 21, 2011 [Page 7]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
If the destination is not in east, but further in an area eastward,
PCE-east will relay the PCReq to a PCE further downstream, which will
either find the target or continue to relay the job. The stitching
will proceed likewise, but in a reversed order.
3.4. Relay Content
Besides standard PCReq format, an extension to PCEP protocol is
needed to allow encoding of seed information. Following table
describes the minimum data fields:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Len | Node-ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node-ID (Cont) | Sub-Type | Sub-Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seg-ID | Cost |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Cost (Cont) | Hops | Sub-Type | ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sub-Type #1 |
| |
// //
| Sub-Type #M |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: Relay Content
TLV:
Type - 1 byte, value TBD (RELAYED-SEED)
Len - 1 byte, actual length in value
Node-ID - 4 bytes, BN's node ID
This TLV MAY have zero or more occurrences
Sub-TLV:
Sub-Type - 1 byte, value TBD (SEGMENT)
Lu, et al. Expires April 21, 2011 [Page 8]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
Sub-Len - 1 byte, value 6
Seg-ID - Segment number, 1 byte, a BN can have multiple
segment paths to it. This is to accommodate additive
constraints;
Cost - 4 bytes, starting cost
Hops - 1 byte, starting hop count
This TLV MAY have one or more occurrences
There is no need to carry admin-group or bandwidth information.
These can be learned from the standard path request information
fields.
3.5. PCE Elect
An area PCE has to know all BNs which connect to another area. This
can be achieved with the help from the IGP protocols and their TE
extensions. The method itself however is beyond the scope of this
document.
Once the list of BNs is known, the area PCE can send PCReq to PCEs in
other areas. For a particular downstream area, only one PCE is
necessary. Among eligible PCEs, an election mechanism may be
necessary to avoid the waste and contention of computation resources.
The elect PCE can be any router or a dedicated PCE. It does not have
to be the transit router.
Any tie-breaker algorithm can be used in such election. One simple
method is to choose the one with the highest (or lowest) router ID.
The election is a local decision of the requesting PCE. It can be
proprietary and does not require standardization.
4. Multi-Home Tail-End
A destination may be viable through multiple areas. This happens
typically in dual-homed or multi-homed destination cases. To
maximize the path availability and optimality, the PCReq SHOULD be
sent to each neighboring area and to each area's PCE elect.
As shown in Figure 6, the Headend in area A may reach out through its
borders with area B and area C. The PCReq message it sends is
different per recipient PCE's service area, either B or C. For PCE of
area B, the message encodes the seeds of BNs between area A and B.
Lu, et al. Expires April 21, 2011 [Page 9]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
Likewise the PCReq for PCE of area C lists seeds of BNs between A and
C.
Both PCEs may find paths to the tailend, and send PCRep back to the
Headend in A. To accommodate the race condition to two possible
paths, the requester needs to implement a timer to allow reasonable
wait time to collect all possible PCRep, details in the next section.
The sending of multiple PCReq is not limited to the headend. It can
happen to any intermediate PCEs as far as they have multiple exit
borders to other areas.
----- ----- ------
| || B || |
| ||-----|| |
| A | | D |
| ||-----|| |
| || C || |
----- ----- ------
Figure 6: Multi-Homed Tailend
4.1. Relay Timer
A PCE should always send PCRep back to its requester whether it finds
the path successfully or not. However the PCRep may take time and
may be lost at all. For this reason it is advisable for the
requester to instantiate a relay timer so that it will not wait
indefinitely if the PCRep never comes. The timer also facilitates it
to collect and compare all returning PCRep. Figure 7 is a sample
pseudo code of the timer logic.
Lu, et al. Expires April 21, 2011 [Page 10]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
Sending PCReq:
For (PCE_Elect of each border area) {
send_PCReq(PCE_Elect, path_Req_Info, seeds_of_BNs);
add_to_PCReq_Pending_List(PCE_Elect);
if (relay_timer==NULL) {
create_relay_timer();
}
}
Receiving PCRep:
PCE_Elect = lookup(PCRep's source address);
path = retrieve_from(PCRep, PCE_Elect);
best_path = better_path(best_path, path);
remove_from_PCReq_Pending_List(PCE_Elect);
if (Pending_List==NULL) {
Cancel_timer(relay_timer);
if (self==Headend) {
terminate;
} else {
send_PCRep(best_path, upstream_PCE);
}
}
Relay Timer Expires:
cleanup_Pending_List();
if (Pending_List==NULL) {
Cancel_timer(relay_timer);
if (self==Headend) {
terminate;
} else {
send_PCRep(best_path, upstream_PCE);
}
}
Figure 7: Relay Timer Pseudo Code
5. Multi-AS Path Computation
For path computation across different autonomous systems, the methods
for multi-area in section 3 mostly apply as well except a few special
handlings described below.
5.1. Information Hiding
When a PCE sends a PCRep to the requester which is in a different AS,
it does not send explicit hop-by-hop EROs. Instead it sends a loose
ERO with path level characters such as cost metric. This ensures the
Lu, et al. Expires April 21, 2011 [Page 11]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
information hiding from different ASes while the End-to-End path can
still be established.
5.2. Transit Link
Unlike the multi-area setup where an area border router sits over
both areas, two AS border routes need a transit link to connect them
together.
----- Transit -----
|AS1|---------|AS2|
----- Link -----
Figure 8: Transit Link
As shown in Figure 8, the transit link needs to be considered for an
ASBR when it is to send PCReq to its peer ASBR. The link's
characters such as metric, hop count, bandwidth, MUST be taken into
account of the seed value. The PCReq relay logic remains the same.
5.3. AS Number
The BN election concept does not apply in cross AS BNs. There is no
need to carry the AS number into the TE database. The PCReq and the
corresponding seeds SHOULD be sent to each viable AS peer node.
6. Other Cases
6.1. Backups
Backup LSPs, or bypass LSPs, or pass re-optimization, or any path
computation that requires the knowledge of an existing LSP MUST have
the information of that LSP passed along with the PCReq and seeds.
In case of multi-AS path computation, the upstream LSR MUST hide the
path segment detail from the downstream LSR.
6.2. SRLG
The SRLG handling should be no different than that of single AS
single area path computation. As long as the SRLG information is
available, through for example GMPLS, each relayed PCE should compute
correct PCE segment, and the end-to-end path should meet SRLG
requirement.
Lu, et al. Expires April 21, 2011 [Page 12]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
6.3. Loose ERO
In multi-AS case, since the explicit EROs are not passed back, the AS
border router has to recover the path segment that it promised
upstream LSR when the LSP request reaches it. Following two
approaches can be used for this purpose.
6.3.1. Pre-computed EROs
The LSR remembers and makes a record of the path segment. When RSVP
request comes into the ASBR LSR, it already has resolved EROs stored
locally. This is quick but requires RSVP implementation change.
6.3.2. Re-Query
The LSR does not need to remember the explicit path segment. It is
kept stateless. When requested, the ASBR LSR will query its own PCE,
because it does not remember the previous query and forget its
response. The result nevertheless should be the same, provided the
topology has no changes since.
If the answer is different or no longer available, the network has
dramatically changed. The whole path computation process needs redo.
And this has to be done anyway regardless of the method.
7. Acknowledgements
TBD
8. IANA Considerations
This document defines the following additional TLVs to the PCEP
protocol:
+------+------------------+---------------+
| Type | Name | Source |
+------+------------------+---------------+
| TBD | RELAYED-SEED | This document |
| TBD | SEGMENT(Sub-TLV) | This document |
+------+------------------+---------------+
9. Security Considerations
There are no specific security considerations within the scope of
this document.
Lu, et al. Expires April 21, 2011 [Page 13]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
10.2. Informative References
[ISO.10589.1992]
International Organization for Standardization,
"Intermediate system to intermediate system intra-domain-
routing routine information exchange protocol for use in
conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)",
ISO Standard 10589, 1992.
[RFC1195] Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
dual environments", RFC 1195, December 1990.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.
[RFC5440] Vasseur, JP. and JL. Le Roux, "Path Computation Element
(PCE) Communication Protocol (PCEP)", RFC 5440,
March 2009.
[RFC5441] Vasseur, JP., Zhang, R., Bitar, N., and JL. Le Roux, "A
Backward-Recursive PCE-Based Computation (BRPC) Procedure
to Compute Shortest Constrained Inter-Domain Traffic
Engineering Label Switched Paths", RFC 5441, April 2009.
Authors' Addresses
Wenhu Lu
Ericsson
300 Holger Way
San Jose, California 95134
USA
Phone: 408 750-5436
Email: wenhu.lu@ericsson.com
Lu, et al. Expires April 21, 2011 [Page 14]
Internet-Draft Relayed CSPF for Multi-Area Multi-AS PCE October 2010
Sriganesh Kini
Ericsson
300 Holger Way
San Jose, California 95134
USA
Phone: 408 750-5210
Email: sriganesh.kini@ericsson.com
Srikanth Narayanan
Ericsson
300 Holger Way
San Jose, California 95134
USA
Phone: 408 750-8567
Email: srikanth.narayanan@ericsson.com
Lu, et al. Expires April 21, 2011 [Page 15]