Inter-Domain Routing                                     P. Marques, Ed.
Internet-Draft                                                  R. White
Intended status: Standards Track                     Cisco Systems, Inc.
Expires: June 24, 2011                                 December 21, 2010
Topology-based aggregation
draft-marques-idr-aggregate-00
Abstract
This document defines a mechanism which allows more-specific IP
address prefixes to be aggregated when they are topologically
equivalent to, or less preferable than, a less-specific advertisement.
It is designed to allow multi-homed sites to use "Provider
Aggregatable" (PA) addresses and obtain both redundancy and local
traffic optimizations when using multiple service providers.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 24, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
Marques & White Expires June 24, 2011 [Page 1]
Internet-Draft Topology-based aggregation December 2010
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Topology-based aggregation
3. BGP AGGREGATE_INFO attribute
4. BGP extension deployment
5. Path selection criteria
6. Network deployment
7. Acknowledgements
8. Contributors
9. IANA Considerations
10. Security Considerations
11. References
  11.1. Normative References
  11.2. Informative References
Authors' Addresses
1. Introduction
With the existing inter-domain routing functionality as defined by
RFC 4271 [RFC4271], multi-homed sites feel compelled to advertise
their individual prefixes to the entire Internet in order to achieve
the desired reliability and traffic-engineering behavior.
Multi-homed sites typically advertise "Provider Independent" (PI)
prefixes. An alternative approach would be for "Provider
Aggregatable" (PA) space to be used along with a set of procedures
that allow for route advertisements to be aggregated. This option
must retain the functionality that is provided today by PI
advertisements.
One assumption made here is that renumbering of a multi-homed site is
economically feasible given the increased usage of dynamic host
configuration protocols and/or network address translation.
This document is being written at a time when IP addresses are
becoming scarce. It is difficult to predict whether Internet address
allocation and assignment policies will drift towards the use of PI
space in order to achieve more efficient allocation, or whether
scarcity will make it harder to obtain PI space.
In the latter case, this document defines an approach that would
allow multi-homed sites to use PA addresses without running into the
address space filtering rules that may be in place to limit the
growth of the Internet routing table.
In order to meet the requirements stated above for multi-homed site
routing, the following is proposed:
o The routing advertisement must be taken out of "Provider
Aggregatable" (PA) space.
o The routing advertisement must be leaked through one or more
alternate providers, other than the one owning the PA space.
o These more-specific route advertisements shall be automatically
aggregated, depending on the network topology.
o If the multi-homed site becomes disconnected from the owner of the
address space, it must be possible to unsuppress the more-specific
advertisement.
In order to provide topology-dependent aggregation, this document
defines a new BGP path attribute, AGGREGATE_INFO, which defines a BGP
prefix as being a more specific of a given aggregate prefix. A BGP
speaker that receives such a prefix MUST compare the received prefix
with the specified aggregate, if present in its Loc-RIB. The
standard path selection algorithm is applied between the paths of the
more-specific prefix and the best-path of the aggregate. If the
best-path of the aggregate is preferable, the more-specific prefix
should be considered as "Inactive". It SHOULD NOT be further
re-advertised into External BGP sessions. It MAY be re-advertised into
Internal BGP sessions, if the path-selection criteria between the
aggregate and more-specific justify it.
Conceptually, the aggregate prefix conveys implicit path information
that applies to the delegated more-specifics. Path selection occurs
between the explicit paths that are present in the routing system and
these implicit paths represented by the aggregates.
The AGGREGATE_INFO attribute contains an operational status field.
This field is used to indicate the status of the connectivity between
the multi-homed site and the provider owning the aggregate. It can
be used in a situation of failure in which the customer becomes
detached from the service provider originating the PA aggregate.
When the operational status denotes connectivity failure, this will
result in the more-specific being unsuppressed and attracting traffic
through the failover paths. The operational status is used
explicitly in order to inform downstreams that the more-specific is
temporary and will be removed from the routing system once
connectivity is restored.
The operational status field uses three colors: green, yellow and
red. Green means full connectivity. Red means no connectivity.
Yellow informs the routing system that, while the site itself has no
direct connectivity to the primary provider, it believes that there
is sufficient redundant connectivity in the network that its prefix
is still reachable through that provider.
2. Topology-based aggregation
The intent of this extension is to achieve the same semantics as
"Provider Independent" (PI) advertisements, while removing the more
specifics from the BGP routing table in locations of the network
where the aggregate provides equal or better service to the IP
destination prefix in question.
           +------+
           | AS 10|
           +------+
            /    \
           /      \
    +------+      +------+
    | AS 1 |      | AS 2 |
    +------+      +------+
       |    \    /    |
       |     \  /     |
       |      \/      |
       |      /\      |
       |     /  \     |
       |    /    \    |
    +------+      +------+
    | AS 3 |      | AS 4 |
    +------+      +------+
           \      /
            \    /
           +------+
           | AS 20|
           +------+

           Figure 1
Figure 1 contains an example of the usage of the BGP AGGREGATE_INFO
attribute. AS 10 in the example above has been delegated the prefix
"10.0.1/24" by AS 1. Using this extension, it will advertise the
prefix into AS 2, which will likely prefer a customer route over a
peer route to AS 1. When AS 2 re-advertises the more-specific "10.0.1/24"
to its peers, AS 3 and 4 in this example, the peers will compare the
more-specific to the "10.0/16" aggregate received from AS 1.
Typically AS 3 will prefer the aggregate (as-path: "1", length 1)
over the more-specific (as-path: "2 10", length: 2). When this is
the case, the more-specific will be suppressed and no longer
propagated in the network. If, for any reason, AS 1 becomes
disconnected from AS 3, the more-specific route to "10.0.1/24" will
become active again, achieving the required failover protection.
From a traffic-engineering perspective, the more-specific is selected
in locations in the network where AS 10 is topologically closer than
AS 1.
In the example described above, the aggregate route may have a
shorter as-path than the equivalent PI prefix that is in use
currently. A PI prefix that is injected by the customer AS (AS 10)
would be advertised to AS 3 with an as-path of "1 10". In order to
provide multi-homed sites with functionality equivalent to what is
available to them using PI space, the AGGREGATE_INFO BGP attribute
allows the originator to specify an AS_PATH attribute to be appended
to the path contained in the aggregate route. This allows the
customer AS (AS 10) to indicate to AS 3 that the attribute comparison
should be performed between the explicitly advertised more-specific
with as-path "2 10" and an implicit more-specific path with an as-
path of "1 10". This implicit path is derived from the aggregate
prefix.
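The comparison made by AS 3 in Figure 1 can be sketched as follows. This is only an illustration of the as-path arithmetic described above, not an implementation of BGP path selection; the function names are hypothetical and as-paths are modelled as plain lists of AS numbers.

```python
def implicit_as_path(aggregate_as_path, appended_segment=()):
    """AS_PATH of the implicit path: the aggregate's best-path AS_PATH
    followed by the AS numbers carried inside AGGREGATE_INFO (if any)."""
    return list(aggregate_as_path) + list(appended_segment)


def implicit_is_shorter(explicit_as_path, implicit_path):
    """True if the implicit (aggregate-derived) path has a strictly
    shorter AS_PATH than the explicitly advertised more-specific."""
    return len(implicit_path) < len(explicit_as_path)


# As seen by AS 3: the more-specific arrives via AS 2, the aggregate via AS 1.
explicit = [2, 10]
without_append = implicit_as_path([1])        # aggregate as-path "1"
with_append = implicit_as_path([1], [10])     # implicit as-path "1 10"
```

Without the appended segment the aggregate wins on as-path length alone. With the segment "10" appended, both paths have length 2 and the decision falls through to the later steps of path selection.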
3. BGP AGGREGATE_INFO attribute
The BGP AGGREGATE_INFO attribute is a well-known, transitive
attribute with Type Code 129. It contains a list of one or more
aggregate target elements. Each aggregate target contains a
mandatory part, with the operational status field followed by a route
prefix. It may be followed by additional BGP path attributes that
apply to the specified aggregate target prefix.
The operational status is encoded as a 1-octet field with the
following values:
+-------+--------+-----------------------------------------------+
| Value | Color | Description |
+-------+--------+-----------------------------------------------+
| 0 | Red | No connectivity between customer and provider |
| 1 | Yellow | Direct connectivity unavailable |
| 2 | Green | Connectivity fully operational |
+-------+--------+-----------------------------------------------+
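The table above maps naturally onto an enumeration. The sketch below is one possible representation; the class and function names are illustrative and not part of this specification. How an implementation should react to an unassigned value is not specified here, so the sketch simply raises an error.

```python
from enum import IntEnum


class OpStatus(IntEnum):
    """Operational status values from the table above."""
    RED = 0      # no connectivity between customer and provider
    YELLOW = 1   # direct connectivity unavailable
    GREEN = 2    # connectivity fully operational


def parse_op_status(octet: int) -> OpStatus:
    """Decode the 1-octet field; raises ValueError for unassigned values."""
    return OpStatus(octet)
```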
The prefix is encoded as a 2 byte AFI [RFC1700] value, followed by a
variable length prefix encoded as a 1 byte prefix-length in bits and
the prefix itself padded to a byte boundary. This is the same
encoding used for NLRI in BGP UPDATE messages.
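A minimal sketch of this prefix encoding, using the Python standard library, is given below. The function name is hypothetical. The AFI values 1 (IPv4) and 2 (IPv6) are the assigned address family numbers. Prefixes are written in full dotted notation ("10.0.0.0/16") because the library does not accept the abbreviated form used in this document.

```python
import ipaddress

AFI_IPV4 = 1
AFI_IPV6 = 2


def encode_target_prefix(prefix: str) -> bytes:
    """Encode a prefix as: 2-byte AFI, 1-byte length in bits,
    then the prefix truncated to the minimum number of whole bytes."""
    net = ipaddress.ip_network(prefix, strict=True)
    afi = AFI_IPV4 if net.version == 4 else AFI_IPV6
    nbytes = (net.prefixlen + 7) // 8
    return (
        afi.to_bytes(2, "big")
        + bytes([net.prefixlen])
        + net.network_address.packed[:nbytes]
    )
```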
The prefix contained in the AGGREGATE_INFO attribute SHOULD be a
less-specific prefix containing all the NLRI specified in the BGP
UPDATE message that includes this attribute.
Following the route prefix, the encoding allows for one or more BGP
path attributes using the encoding specified in the BGP [RFC4271]
protocol specification. An implementation MAY choose to include an
AS_PATH attribute in this optional element.
When an AS_PATH attribute is contained inside an AGGREGATE_INFO
attribute, the path segments that it contains shall be appended to
the AS_PATH of the implicit path represented by the aggregate prefix.
This implicit path is then compared with the best path of the NLRI
prefix(es) included in the UPDATE message containing this attribute.
Example encoding for prefix 10.0/16, as-path "10":
   Attr Flags = 0x40, Attr Code = 0x81, Attr Length = 0x0d
      OpStatus = 0x02, AFI = 0x00 0x01, Prefix Length = 0x10,
      Prefix Data = 0x0a 0x00
      Attr Flags = 0x40, Attr Code = 0x02, Attr Length = 0x04,
      Data = 0x02 0x01 0x00 0x0a
In the example given above, an AS_PATH segment of "10" in the
aggregate-info attribute and an aggregate path with an AS_PATH of "1"
would result in an as-path of "1 10", of length 2.
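The byte layout of a single aggregate target can be sketched as below. This is an illustration under several assumptions: the helper names are hypothetical, AS numbers are encoded in 2 octets as in the example, only one target is encoded, and the attribute length octet is computed from the encoded value rather than hard-coded.

```python
ATTR_FLAGS = 0x40           # flags octet used in the example
AGGREGATE_INFO_CODE = 0x81  # type code 129
AS_PATH_CODE = 0x02
AS_SEQUENCE = 0x02          # AS_PATH segment type


def encode_as_path_attr(asns):
    """Nested AS_PATH attribute with a single AS_SEQUENCE segment."""
    segment = bytes([AS_SEQUENCE, len(asns)])
    segment += b"".join(asn.to_bytes(2, "big") for asn in asns)
    return bytes([ATTR_FLAGS, AS_PATH_CODE, len(segment)]) + segment


def encode_aggregate_info(op_status, afi, prefix_len, prefix_data, nested=b""):
    """One aggregate target: OpStatus, AFI, prefix, optional nested attributes."""
    value = bytes([op_status]) + afi.to_bytes(2, "big")
    value += bytes([prefix_len]) + prefix_data + nested
    return bytes([ATTR_FLAGS, AGGREGATE_INFO_CODE, len(value)]) + value


# Target 10.0/16, status Green (2), nested as-path "10".
encoded = encode_aggregate_info(2, 1, 16, bytes([0x0A, 0x00]),
                                encode_as_path_attr([10]))
```

Calling `encoded.hex()` shows the octets in the same order as the fields listed in the example.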
When multiple aggregate target prefixes are present in an
AGGREGATE_INFO attribute, the most specific (longest) target prefix
present in the Loc-RIB is used to generate the implicit path used in
path selection. Multiple targets can be used when prefix assignment
and delegation happens at more than one level.
As an example, a provider X may have a /16 out of which it delegates
a specific /22 block to Y. Y then allocates a /24 to a specific
multi-homed customer Z. If Y itself is using aggregation, its prefix
may be suppressed. Were Z to originate a route with a single
aggregation target (/22), that prefix would not be aggregated in
regions of the network where the /22 had itself been aggregated.
For this mechanism to behave as expected, one would have to ensure
that if Y's prefix has been suppressed then Z's has also been
suppressed. Otherwise, if Z's prefix is present, its aggregation
target of Y will be ignored.
Since this condition cannot be guaranteed, the protocol allows the
originator of the more-specific prefix (Z) to include multiple
aggregation targets (Y and X) in its route advertisement. Whenever Y
is present in the Loc-RIB of a BGP speaker, Y is used as the source of
the implicit aggregation path. Otherwise X is used, if present.
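The target selection described above can be sketched as follows. The function name is hypothetical and the Loc-RIB is modelled simply as a set of prefixes. The concrete prefixes chosen for X and Y are examples only.

```python
import ipaddress


def select_aggregation_target(targets, loc_rib_prefixes):
    """Return the longest target prefix that is present in the Loc-RIB,
    or None if no listed target is present."""
    present = [
        net for net in map(ipaddress.ip_network, targets)
        if net in loc_rib_prefixes
    ]
    return max(present, key=lambda net: net.prefixlen, default=None)


# Z lists both Y's /22 and X's /16 as aggregation targets.
x = ipaddress.ip_network("10.0.0.0/16")
y = ipaddress.ip_network("10.0.4.0/22")
z_targets = ["10.0.4.0/22", "10.0.0.0/16"]
```

Where Y's /22 has itself been suppressed, only X's /16 is found and it becomes the source of the implicit path.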
The choice of explicitly listing the aggregation targets rather than
automatically deriving the parent is designed to avoid situations in
which the less-specific is being artificially generated such as, for
instance, the default route.
4. BGP extension deployment
BGP speakers that support the extensions described in this document
SHALL use the Capability Advertisement [RFC5492] BGP extension to
advertise that support to their BGP peers.
Compliant implementations should advertise the BGP Capability Code
TBD. The capability data should contain a 1-byte value which is
interpreted as the version of this specification. It should contain
the value 1.
When a BGP route is placed in the Adj-RIB-Out for a given external BGP
peer and the peer in question does not support this capability, a
path in the Loc-RIB that contains the AGGREGATE_INFO attribute should
be suppressed. If a previous path was advertised to this peer, that
path shall be withdrawn.
If the peer in question is an internal BGP peer which doesn't support
this capability an implementation MAY choose to replace this
attribute with the NO_EXPORT [RFC1997] BGP community attribute,
rather than suppress the path.
This mechanism ensures that a path that originated with an
AGGREGATE_INFO attribute is not used by a router without being
compared to the respective aggregate. This is intended to facilitate
the incremental deployment of this functionality.
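The per-peer decision described in this section can be summarised in a short sketch. The function name, the returned strings, and the `replace_with_no_export` policy switch are illustrative assumptions. The NO_EXPORT value is the well-known community defined in [RFC1997].

```python
NO_EXPORT = 0xFFFFFF01  # well-known community from RFC 1997


def outbound_action(has_aggregate_info, peer_supports_capability,
                    peer_is_internal, replace_with_no_export=False):
    """Decide how a Loc-RIB path is handled towards one peer."""
    if not has_aggregate_info or peer_supports_capability:
        return "advertise"
    if peer_is_internal and replace_with_no_export:
        # Optional behaviour: strip AGGREGATE_INFO, attach NO_EXPORT.
        return "advertise-with-no-export"
    # Suppress the path; withdraw it if it was previously advertised.
    return "suppress"
```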
5. Path selection criteria
A BGP implementation shall run its path selection algorithm
unmodified between all the paths for a given prefix. If the selected
best-path contains the BGP AGGREGATE_INFO attribute, this path shall
be compared with the best-path of the aggregate prefix indicated by
the attribute in question.
The AGGREGATE_INFO attribute represents an implicit path for the
more-specific prefix (the NLRI containing that attribute). The BGP
path attributes of this implicit path are the attributes of the
best-path of the aggregate prefix. If the AGGREGATE_INFO contains an
optional AS_PATH attribute, the AS_PATH segments in that attribute
shall be appended to the AS_PATH of the aggregate prefix best-path
before comparison.
When the Operational Status of the specified aggregate target is
"Red" the corresponding implicit path is considered to be
unreachable. When the Operational Status is "Yellow" the originating
AS of the aggregate target prefix MUST treat the implicit path as
unreachable also and use the more-specific. Autonomous-systems
further downstream MAY choose whether to ignore or use the
aggregation information.
The "Yellow" state represents that the originator of the prefix
believes that there is a path between the primary and backup
providers for the site such that this path always prefers the more-
specific advertisement. This is often the case if both providers
have a direct peering relationship.
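The effect of the Operational Status on the implicit path can be sketched as below. The function and parameter names are hypothetical. The `downstream_uses_yellow` flag stands for the local choice that this document leaves to autonomous systems further downstream.

```python
RED, YELLOW, GREEN = 0, 1, 2


def implicit_path_usable(status, is_aggregate_origin_as,
                         downstream_uses_yellow=True):
    """Whether the implicit path of an aggregate target takes part
    in the comparison against the more-specific."""
    if status == RED:
        return False              # implicit path is unreachable
    if status == YELLOW:
        if is_aggregate_origin_as:
            return False          # originating AS uses the more-specific
        return downstream_uses_yellow
    return True                   # GREEN: full connectivity
```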
When comparing the more-specific path with its implicit path
(represented by the aggregate), the following changes to the standard
path selection algorithm should be taken into account:
o The Origin attributes of both paths are not comparable. This is
step b) in the path selection algorithm and should be bypassed.
o If the paths in question are equal up to step d) of the path
selection algorithm, and both paths are EBGP paths, the less-specific
(aggregate) should be preferred. This replaces the step in path
selection where the oldest EBGP path is preferred [RFC5004].
o If both paths are iBGP paths, the less-specific (aggregate) should
be preferred in the case where the paths are equal up to the
router-id comparison step of path selection.
When the aggregate path is considered to be preferable over the more-
specific, the more-specific should be considered inactive and should
not be installed in the FIB or subsequently advertised to other
peers.
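The modified comparison can be sketched as follows. This is a simplified illustration, not a complete decision process: the `Path` type and its fields are hypothetical, LOCAL_PREF is compared first as the degree of preference, and MED values are assumed to be comparable between the two paths. The step letters in the comments refer to the tie-breaking steps of [RFC4271]. The implicit path is expected to carry the aggregate's attributes with any AS_PATH segments from AGGREGATE_INFO already appended.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    local_pref: int
    as_path_len: int
    med: int
    is_ebgp: bool
    igp_cost: int


def more_specific_is_active(more_specific: Path, implicit: Path) -> bool:
    """True if the explicit more-specific wins over the implicit path."""
    if more_specific.local_pref != implicit.local_pref:
        return more_specific.local_pref > implicit.local_pref
    # Step a): shorter AS_PATH wins.
    if more_specific.as_path_len != implicit.as_path_len:
        return more_specific.as_path_len < implicit.as_path_len
    # Step b) (Origin) is bypassed: the values are not comparable.
    # Step c): lower MED wins (assumed comparable in this sketch).
    if more_specific.med != implicit.med:
        return more_specific.med < implicit.med
    # Step d): an EBGP path wins over an IBGP path.
    if more_specific.is_ebgp != implicit.is_ebgp:
        return more_specific.is_ebgp
    if more_specific.is_ebgp:
        return False  # both EBGP and still equal: prefer the aggregate
    # Both IBGP: step e), lower IGP cost to the next hop wins.
    if more_specific.igp_cost != implicit.igp_cost:
        return more_specific.igp_cost < implicit.igp_cost
    return False      # equal up to router-id comparison: prefer the aggregate
```

When the function returns False, the more-specific is treated as inactive: it is neither installed in the FIB nor advertised further.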
6. Network deployment
The objective of this document is to provide multi-homed sites with
resilience to failures and limited traffic-engineering capabilities
without the need to resort to PI advertisements.
Instead of using a PI prefix, a multi-homed site can choose to
address its network with a PA prefix from one service provider, which
it then advertises through a secondary provider. Alternatively, it
may choose to dual address its hosts and/or NAT appliances.
In order for a multi-homed site to achieve the required resilience it
should be allowed by other service providers to inject the more-
specifics that have been delegated to it with the BGP AGGREGATE_INFO
attribute.
The AGGREGATE_INFO attribute should only be added to a BGP path by
the originator of the route advertisement. This rule is intended to
ensure that there aren't instances of the same BGP path information
flowing through the Internet routing system with and without the
specified attribute.
In order to maintain the loop-free properties of BGP, one must ensure
that suppressing a more-specific does not cause traffic to be
forwarded in a loop.
For a loop to occur, the following conditions would be necessary:
o A transit AS (X) prefers the more-specific route.
o Another AS (Y) receives both aggregate and more-specific from X
and prefers the former.
o Y is in the transit path for the more-specific.
The last condition cannot occur since Y, by definition, prefers the
aggregate path and will not advertise the more-specific.
7. Acknowledgements
There have been several prior proposals to reduce the routing
information used in multi-homing scenarios, for instance using BGP
communities [I-D.white-bounded-longest-match] and AS hops
[I-D.ietf-idr-as-hopcount].
The current document builds upon the previous work and proposes the
use of standard BGP path selection using both implicit and explicit
paths in order to limit information to parts of the network where it
is useful.
8. Contributors
Central parts of the protocol operation were defined by Robert
Raszuk and Keyur Patel. Russ White, Enke Chen, Dave Meyer and Vince
Fuller provided essential input in the early stages of the proposal.
9. IANA Considerations
This memo requests IANA to allocate a BGP attribute type code value,
for the BGP aggregate-info attribute defined herein. It also
requests IANA to allocate a Capability Code according to the
procedures defined in RFC 5492 [RFC5492].
10. Security Considerations
The BGP aggregate-info attribute in itself doesn't create a new
security threat. This attribute can only lead to the route being
suppressed.
The presence of more-specifics in the routing system makes a stronger
case for the usefulness of performing origin authentication of route
advertisements.
11. References
11.1. Normative References
[RFC1700] Reynolds, J. and J. Postel, "Assigned Numbers", RFC 1700,
October 1994.
[RFC1997] Chandrasekeran, R., Traina, P., and T. Li, "BGP
Communities Attribute", RFC 1997, August 1996.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC5004] Chen, E. and S. Sangli, "Avoid BGP Best Path Transitions
from One External to Another", RFC 5004, September 2007.
[RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement
with BGP-4", RFC 5492, February 2009.
11.2. Informative References
[I-D.ietf-idr-as-hopcount]
Li, T., "The AS_HOPCOUNT Path Attribute",
draft-ietf-idr-as-hopcount-00 (work in progress),
December 2005.
[I-D.white-bounded-longest-match]
Hares, S., "Bounding Longer Routes to Remove TE",
draft-white-bounded-longest-match-02 (work in progress),
July 2008.
Authors' Addresses
Pedro Marques (editor)
Cisco Systems, Inc.
170 W. Tasman Dr.
San Jose, CA 94040
US
Phone: +1 408 853 1193
Email: roque@cisco.com
Russ White
Cisco Systems, Inc.
7025 Kit Creek Road
Research Triangle Park, NC 27709
US
Email: riw@cisco.com