Network Working Group Tony Li
INTERNET DRAFT Juniper Networks
February 1999
Domain-wide Prefix Distribution with Multi-Level IS-IS
<draft-ietf-isis-domain-wide-00.txt>
Status
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
1.0 Abstract
This document describes extensions to the IS-IS protocol to support
optimal routing within a multi-level domain. The IS-IS protocol is
specified in ISO 10589 [1], with extensions for supporting IPv4
specified in RFC 1195 [2].
This document extends the semantics presented in RFC 1195 so that a
routing domain running with both Level 1 and Level 2 Intermediate
Systems (IS) [routers] can distribute IP prefixes between Level 1 and
Level 2 and vice versa. This distribution requires certain
restrictions to ensure that persistent forwarding loops do not form.
The goal of this domain-wide prefix distribution is to increase the
granularity of the routing information within the domain.
2.0 Introduction
An IS-IS routing domain (a.k.a., an autonomous system running IS-IS)
can be partitioned into multiple level 1 (L1) areas, and a level 2
(L2) connected subset of the topology that interconnects all of the
L1 areas. Within each L1 area, all routers exchange link state
information. L2 routers also exchange L2 link state information to
compute routes between areas.
RFC 1195 [2] defines the Type, Length and Value (TLV) tuples that are
used to transport IPv4 routing information in IS-IS. RFC 1195 also
specifies the semantics and procedures for interactions between
levels. Specifically, routers in a L1 area will exchange information
within the L1 area. For IP destinations not found in the prefixes in
the L1 database, the L1 router should forward packets to the nearest
router that is in both L1 and L2 (i.e., an L1L2 router) with the
'attach' bit set in its L1 Link State Protocol Data Unit (LSP).
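The default forwarding behavior described above can be sketched as
follows. This is only an illustration under stated assumptions: the
topology, the router names (R1, R2, A, B), the metrics, and the helper
names shortest_distances and default_exit are hypothetical and do not
come from RFC 1195.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra over an adjacency dict {node: {neighbor: metric}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, metric in graph.get(node, {}).items():
            nd = d + metric
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def default_exit(graph, source, attached):
    """Return the nearest router with the attach bit set, or None."""
    dist = shortest_distances(graph, source)
    candidates = [(dist[r], r) for r in attached if r in dist]
    return min(candidates)[1] if candidates else None


# Hypothetical L1 area: R1 and R2 are L1 routers, A and B are L1L2
# routers that set the attach bit in their L1 LSPs.
l1_area = {
    "R1": {"R2": 10, "A": 30},
    "R2": {"R1": 10, "B": 10},
    "A": {"R1": 30},
    "B": {"R2": 10},
}

nearest = default_exit(l1_area, "R1", attached={"A", "B"})
print(nearest)
```

In this sketch R1 reaches B at cost 20 and A at cost 30, so packets to
destinations outside the L1 database are sent toward B.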
Also per RFC 1195, an L1L2 router should be manually configured with
a set of prefixes that summarize the IP prefixes found in that L1
area. These summaries are injected into L2. RFC 1195 specifies no
further interactions between L1 and L2 for IPv4 prefixes.
2.1 Motivations for domain-wide prefix distribution
The mechanisms specified in RFC 1195 are appropriate in many
situations, and lead to excellent scalability properties. However,
in certain circumstances, the domain administrator may wish to
sacrifice some amount of scalability and distribute more specific
information than is described by RFC 1195. This section discusses
the various reasons why the domain administrator may wish to make
such a tradeoff.
One major reason for distributing more prefix information is to
improve the quality of the resulting routes. A well-known property of
prefix summarization or any abstraction mechanism is that it
necessarily results in a loss of information. This loss of
information in turn results in the computation of a route based upon
less information, which will frequently result in routes that are not
optimal.
A simple example can serve to demonstrate this adequately. Suppose
that a L1 area has two L1L2 routers that both advertise a single
summary of all prefixes within the L1 area. To reach a destination
inside the L1 area, any other L2 router is going to compute the
shortest path to one of the two L1L2 routers for that area. Suppose,
for example, that both of the L1L2 routers are equidistant from the
L2 source, and that the L2 source arbitrarily selects one L1L2
router. This router may not be the optimal router when viewed from
the L1 topology. In fact, the path from the selected L1L2 router to
the destination router may traverse the L1L2 router that was not
selected. If more detailed topological information or more detailed
metric information were available to the L2 source router, it could
compute a better route.
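The example can be made concrete with a few numbers. All values below
are hypothetical and chosen only for illustration: S is the L2 source,
X and Y are the two L1L2 routers, D is the destination inside the L1
area, and the L1 path from X to D passes through Y.

```python
# Hypothetical metrics, chosen only to illustrate the example.
l2_cost = {"X": 10, "Y": 10}         # cost from S to each L1L2 router
summary_metric = {"X": 10, "Y": 10}  # metric advertised with the summary
l1_cost_to_d = {"X": 10, "Y": 5}     # true L1 cost to D (X goes via Y)

# What S can compute when it only sees the summary.
seen_with_summary = {r: l2_cost[r] + summary_metric[r] for r in l2_cost}

# What S could compute with per-prefix metric information.
seen_with_detail = {r: l2_cost[r] + l1_cost_to_d[r] for r in l2_cost}

tie = seen_with_summary["X"] == seen_with_summary["Y"]
best_with_detail = min(seen_with_detail, key=seen_with_detail.get)

print(tie, best_with_detail, seen_with_detail)
```

With only the summary, S sees a tie and may pick X, giving a true
end-to-end cost of 20. With the more specific information, S can see
that the path through Y costs 15 and select it.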
This situation is symmetric in that an L1 router has no information
about prefixes in L2 or within a different L1 area. When an L1 router
uses the nearest L1L2 router, that L1L2 router is effectively
injecting a default route without metric information into the L1
area. The route computation that the L1 router performs is similarly
suboptimal.
Besides the optimality of the routes computed, there is another
significant driver for the domain-wide distribution of prefix
information. That driver is the current practice of using the IGP
(IS-IS) metric as part of the BGP Multi-Exit Discriminator (MED).
The value in the MED is advertised to other domains and is used to
inform other domains of the optimal entry point into the current
domain. Current practice is to take the IS-IS metric and insert it
as the MED value. This tends to cause external traffic to enter the
domain at the point closest to the exit router. Note that the
receiving domain may, based upon policy, choose to ignore the MED
that is advertised. However, current practice is to distribute the
IGP metric in this way in order to optimize routing wherever
possible. This is possible in current networks that consist of only a
single area, but becomes problematic if hierarchy is to be introduced
into the network. This is again because the loss of end-to-end
metric information means that the MED value will not reflect the true
distance across the advertising domain. Full distribution of prefix
information within the domain would alleviate this problem as it
would allow accurate computation of the IS-IS metric across the
domain, resulting in an accurate value presented in the MED.
2.2 Scalability
The disadvantage of performing the domain-wide prefix distribution
described above is that it has an impact on the scalability of IS-IS.
Areas within IS-IS help scalability in that LSPs are contained within
a single area. This limits the size of the link state database, which
in turn limits the complexity of the shortest path computation.
Further, the summarization of the prefix information aids scalability
in that the abstraction of the prefix information reduces the number
of data items to be transported and the number of routes to be
computed.
It should be noted quite strongly that the distribution of prefixes
on a domain-wide basis impacts the scalability of IS-IS in the second
respect. It will increase the number of prefixes throughout the
domain. This will result in increased memory consumption,
transmission requirements and computation requirements throughout the
domain.
It must also be noted that the domain-wide distribution of prefixes
has no effect whatsoever on the first aspect of scalability, namely
the existence of areas and the limitation of the distribution of the
link state database.
Thus, the net result is that the introduction of areas with
domain-wide prefix distribution into a formerly flat, single-area
network is a clear benefit to the scalability of that network.
However, it is a
compromise and does not provide the maximum scalability available
with IS-IS. Domains that choose to make use of this facility should
be aware of the tradeoff that they are making between scalability and
optimality and provision and monitor their networks accordingly.
Normal provisioning guidelines that would apply to a fully
hierarchical deployment of IS-IS will not apply to this type of
configuration.
4.0 New semantics for external type metrics
RFC 1195 defines two TLVs for carrying IP prefixes. TLV 128 is
defined to carry 'internal' prefixes and TLV 130 is defined to carry
'external' prefixes. The original intent of RFC 1195 was to carry
intra-domain routes within the internal prefix TLV and inter-domain
routes or intra-domain routes from alternate IGPs in an external
prefix TLV. Interestingly, TLV type 130 is not documented to exist
in Level 1 LSPs.
In addition to this distinction, RFC 1195 provides for a bit in each
of these TLVs that distinguishes between an internal metric type and
an external metric type. Similarly, the clear intent was that the
internal metric type should reflect a total metric that is the sum of
the metrics to the advertising router plus the metric to the prefix.
Further, for an external metric type, the total metric should simply
be the metric advertised to the prefix, not including the total
metric necessary to reach the exit router. Prefixes with internal
metrics are always preferred over external metrics, regardless of the
value of the metrics.
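The metric handling described in the preceding paragraph can be
sketched as follows. The Candidate type, its field names, and the
example values are hypothetical and are used only to illustrate the
comparison. This is not a description of any particular
implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    prefix: str
    advertised_metric: int      # metric carried with the prefix
    external_metric_type: bool  # True if marked as external metric type
    cost_to_advertiser: int     # SPF cost to the advertising router


def total_metric(c):
    """Internal: cost to advertiser plus advertised metric.
    External: advertised metric only."""
    if c.external_metric_type:
        return c.advertised_metric
    return c.cost_to_advertiser + c.advertised_metric


def preference_key(c):
    # False sorts before True, so internal metric types always win;
    # the total metric only breaks ties within the same metric type.
    return (c.external_metric_type, total_metric(c))


def best(candidates):
    return min(candidates, key=preference_key)


internal = Candidate("192.0.2.0/24", 20, False, 50)
external = Candidate("192.0.2.0/24", 5, True, 50)

chosen = best([external, internal])
print(total_metric(internal), total_metric(external))
print(chosen.external_metric_type)
```

Here the internal candidate has a total metric of 70 and the external
candidate a total metric of 5, yet the internal candidate is selected
because the metric type is compared before the metric value.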
It should be noted that the combination of an internal prefix with an
external metric type is not obviously useful, and is not well defined
by RFC 1195.
It should also be noted that as of this writing, the author knows of
no deployed implementations that make use of either the external
prefix or the external metric type. The implication is that this
proposal is free to redefine the semantics of the external metric
type without conflict.
An essential property when redistributing prefixes between levels is
to ensure that no persistent loops form in the distribution of
information (i.e., a routing loop), as this would lead to the
indefinite propagation of the information, even in the event that the
information was no longer originated by some system in the domain.
Further, a routing loop is likely to form a forwarding loop, where
actual traffic traverses the network in a cycle in the topology.
Forwarding loops are known to consume large amounts of resources and
are to be avoided.
4.1 Proposed semantics for the external metric type
To provide the above properties, this proposal defines the following
semantics.
1) Only internal metric type prefixes are redistributed from L1 into
L2, and these will be marked as an external metric type when
advertised into L2.
2) All prefixes can be redistributed from L2 into L1 but again will
be marked as an external metric type when advertised into L1.
3) Within L1, a route to a prefix with an internal metric type is
preferred over a route to the same prefix with an external metric
type, regardless of the comparison of the metrics.
Based on these rules, we first observe that this proposal is free
from routing loops. No prefix can be redistributed from L2 to L1 and
back into L2, because the route is marked with external metric type
in L1 and by rule 1 cannot be redistributed into L2. Similarly, a
prefix redistributed from L1 to L2 and back into the original L1 area
will not be used while an L1 internal metric type prefix is
available. There is the possibility of a transient routing loop in
this situation when the original prefix is withdrawn and the external
prefix is selected. However, all link state protocols are subject to
transient routing loops, so this is no worse than the status quo.
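The three rules and the loop-freedom argument can be exercised with a
small sketch. The Prefix type and the function names l1_to_l2,
l2_to_l1 and prefer_in_l1 are hypothetical. Metrics are left unchanged
during redistribution to keep the sketch short, which a real system
would not do.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Prefix:
    prefix: str
    metric: int
    external_metric_type: bool = False


def l1_to_l2(prefixes):
    """Rule 1: only internal metric type prefixes are redistributed
    into L2, and they are marked external there."""
    return [replace(p, external_metric_type=True)
            for p in prefixes if not p.external_metric_type]


def l2_to_l1(prefixes):
    """Rule 2: all prefixes may be redistributed into L1, and they
    are marked external there."""
    return [replace(p, external_metric_type=True) for p in prefixes]


def prefer_in_l1(candidates):
    """Rule 3: internal metric type wins regardless of metric value."""
    return min(candidates, key=lambda p: (p.external_metric_type, p.metric))


# L2 -> L1 -> L2: nothing is eligible to re-enter L2.
from_l2 = [Prefix("198.51.100.0/24", 20)]
back_into_l2 = l1_to_l2(l2_to_l1(from_l2))

# L1 -> L2 -> L1: the copy that comes back is external and loses to
# the original internal prefix while that prefix is still present.
native = Prefix("192.0.2.0/24", 40)
returned = l2_to_l1(l1_to_l2([native]))[0]
selected = prefer_in_l1([returned, native])

print(back_into_l2, selected is native)
```

The first check shows that a prefix cannot travel from L2 to L1 and
back into L2. The second shows that a prefix returning to its original
L1 area is not used while the internal metric type prefix is available.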
Note that this proposal is not radically different from the current
semantics for RFC 1195: internal metric types are always preferred
over externals, so rule (3) is an extension that allows external
metric types in internal prefix TLVs. It does not introduce a new
comparison between internal and external metric types.
4.2 Transition issues
Because no implementations currently make use of the external metric
type, the deployment of prefixes with an external metric type is
somewhat problematic. There is the possibility that the new type of
advertisement may result in software instability in systems that do
not deal with even the original semantics correctly. Further, there
is a danger that haphazard deployment of systems supporting this
proposal and legacy systems would have an unfortunate interaction.
It is recommended, for any L1 area that is to perform the mutual
redistribution described in this proposal, that the L1L2 systems be
updated first. If these systems operate correctly, this is
sufficient to ensure that there are no persistent routing loops.
5.0 Comparisons with other proposals
There are two other proposals currently being discussed which are
similar to this proposal in nature. This section discusses each of
these proposals and their relationship to this proposal.
5.1 Creation of new TLVs
In [3], a new TLV is proposed to transport IP prefix information.
Because this is a new TLV, it is somewhat harder to deploy, requiring
that all systems understand the new TLV before it can become
effective. For this reason, this proposal provides an alternative
that can be deployed sooner. There is no effective semantic
difference between the two proposals. In [3], a bit is defined to
mark a prefix as 'up' or 'down'. This is essentially the same
semantics as is proposed here.
5.2 Usage of external prefixes
An alternate proposal, [4], also uses effectively the same semantics,
but encodes the information somewhat differently. Prefixes that
would be marked with the external metric type would be instead
encoded as external prefixes. This forces the usage of a separate
TLV, resulting in a few extra bytes of overhead. This is not a
significant difference. The primary differences are syntactic, and
the addition of the external prefix TLV to the L1 LSP. The latter is
a clear omission in RFC 1195 and should have been in the original
RFC.
6.0 Security Considerations
This document raises no new security issues for IS-IS.
7.0 Acknowledgments
The author would like to thank Henk Smit for his comments on this
work.
8.0 References
[1] ISO 10589, "Intermediate System to Intermediate System Intra-
Domain Routeing Exchange Protocol for use in Conjunction with the
Protocol for Providing the Connectionless-mode Network Service (ISO
8473)" [Also republished as RFC 1142]
[2] RFC 1195, "Use of OSI IS-IS for routing in TCP/IP and dual
environments", R.W. Callon, Dec. 1990
[3] Smit, H., Li, T. "IS-IS extensions for Traffic Engineering",
draft-ietf-isis-traffic-00.txt, work in progress
[4] Patel, A., Przygienda, T., "L1/L2 Optimal IS-IS Routing",
draft-ietf-isis-l1l2-00.txt, work in progress
9.0 Author's Address
Tony Li
Juniper Networks, Inc.
385 Ravendale Dr.
Mountain View, CA 94043
Email: tli@juniper.net
Fax: +1 650 526 8001
Voice: +1 650 526 8006