Internet Engineering Task Force Olivier Bonaventure
INTERNET DRAFT FUNDP
Stefaan De Cnodder
Alcatel
Jeffrey Haas
NextHop
Russ White
cisco
July, 2001
Expires January, 2002
Controlling the redistribution of BGP routes
<draft-bonaventure-bgp-redistribution-01.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document proposes the redistribution extended community. This
new well-known extended community allows a router to influence how a
specific route should be redistributed towards a specified set of
eBGP speakers. The redistribution community can indicate that
a specific route should not be announced to a set of eBGP speakers,
should only be announced to a set of eBGP speakers, or should be
prepended n times when announced to a set of eBGP speakers.
1 Introduction
Bonaventure/De Cnodder/Haas/White [Page 1]
draft-bonaventure-bgp-redistribution-01.txt July 2001
In today's commercial Internet, many ISPs need to have some control
on their interdomain traffic. In the outgoing direction, this control
can be obtained by configuring the BGP routers of the ISP to favor
some routes over others by using the LOCAL-PREF attribute. However,
due to the asymmetry of Internet traffic, most ISPs mainly need to
control their incoming traffic.
+---------------+
|               |
|     AS22      |
|               |
+---------------+
       ||
+---------------+               +---------------+
|  13.0.0.0/8   |               |     AS21      |
|  12.0.0.0/8   |===============|               |
|     AS20      |               +---------------+
+---------------+
       ||
+---------------+
|               |
|     AS10      |
|               |
+---------------+
Figure 1: Simple interdomain topology
In the incoming direction, the only way for an ISP to influence the
traffic flow is to control the redistribution of its routes. Several
methods exist and are used in practice [Hal97]. In this case, the ISP
needs to influence the redistribution and the selection of its own
routes by remote
ISPs. Since the default configuration of many BGP routers is to
select the route with the smallest AS path length, a common technique
is to artificially increase the length of the AS path for some
announced routes. For example, in figure 1, if AS20 wanted to
indicate that it prefers to receive its traffic towards subnet
13.0.0.0/8 through its link with AS22, then it would announce this
prefix as usual on this link to AS22 and announce the same prefix with
the AS20:AS20:AS20:AS20 path to AS21 and AS10. If AS10 and AS21 rely
only on the AS path length to select the best BGP route, they will
prefer the shorter route received via AS22. This requires a manual
configuration of the BGP routers, but path prepending is used very
often on the Internet according to [Hus01]. In some cases, the
configuration burden can be reduced by using the BGP communities
attribute.
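The effect of path prepending on a selection process based only on AS
path length can be sketched as follows. This is an illustrative
sketch, not part of the proposal: the function names and the route
records are hypothetical, and the remote AS is assumed to hear one
announcement directly from AS20 and one through AS22.

```python
def prepend(as_path, asn, times):
    """Return a new AS path with `asn` placed in front `times` times."""
    return [asn] * times + list(as_path)


def best_by_path_length(candidates):
    """Pick the route with the shortest AS path (first one wins a tie)."""
    return min(candidates, key=lambda route: len(route["as_path"]))


# AS20 originates 13.0.0.0/8, so its unmodified path is just [20].
plain = [20]
padded = prepend(plain, 20, 3)  # the AS20:AS20:AS20:AS20 path

# Hypothetical view of a remote AS that receives both announcements.
routes = [
    {"via": "AS20", "as_path": padded},
    {"via": "AS22", "as_path": prepend(plain, 22, 1)},
]
best = best_by_path_length(routes)
```

Under these assumptions the padded path has length four and the path
through AS22 has length two, so the route through AS22 is selected.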
Recently, several large ISPs have gone one step further by defining
BGP communities that allow their customers to influence the
redistribution of their routes. For example, in figure 1, AS20 could
configure its BGP routers to always prepend AS20 four times when they
announce via eBGP a route received from one of AS20's customers with
a special community attribute. For this, AS20 needs to publish the
specific BGP communities that it supports and its customers need to
configure their router appropriately. If AS20 needs to define a new
BGP community or change an existing one, it must inform all its
customers, who will then have to update the configuration of their
routers. A quick survey of the RIPE database in May 2001 revealed
that the utilization of BGP community attributes to control outbound
routes is becoming more and more frequent. Several utilizations of
the BGP community attributes are interesting to mention.
- More than twenty different AS define their own BGP community
attributes to allow their customers/peers to indicate that a
particular route should not be propagated towards a specific AS,
towards the routers attached to a specific IX, or towards AS
within a given geographical area (e.g. a European AS could want
to prohibit a route from being announced to US peers).
- More than twenty different AS define their own BGP community
attributes to allow their peers or customers to indicate that an
announced route should be prepended when announced towards a
specific AS, IX or set of AS.
- Five AS define their own BGP community attribute to indicate
that a given route should only be redistributed towards a
specified AS.
From this survey, it is clear that this utilization of the BGP
communities attribute occurs in today's Internet. However, asking
each AS to select its own values for the BGP communities and
documenting these values in the RIPE database is not very efficient
because it forces the BGP routers to be configured manually based on
information found in the RIPE database or in peering agreements.
Given the growing utilization of the BGP community attribute to
support such facilities, we propose in this document a new type of
well-known BGP extended community. By using well-known BGP extended
communities with a precise syntax, we support most of the current
utilizations of the BGP communities without relying unnecessarily on
manual configuration of the BGP routers. We believe that reducing the
manual configuration of these routers would be very useful for the
stability and the performance of the global Internet.
2 Controlled redistribution of BGP routes
This document defines a method to allow a BGP speaker to influence
how its peers will redistribute its own routes. For this, the BGP
speaker may define for each announced route a redistribution policy
that controls how this route will be redistributed. This is done by
defining a set of allowed or requested operations and a list of BGP
speakers. The list of BGP speakers can be specified by indicating
either the BGP speakers that are covered by the redistribution policy
or those that are not covered by this policy. The current version of
this document supports the following operations :
- the attached route should not be announced to the BGP speakers
covered by the policy
- the attached route should only be announced to the BGP speakers
covered by the policy
- the attached route should be announced with the NO_EXPORT attribute
to the BGP speakers covered by the policy
- the attached route should be prepended n times when announced to
the BGP speakers covered by the policy
The redistribution policies are encoded in a special type of
extended communities attribute called the redistribution community.
If a redistribution policy applies to a long list of BGP speakers,
then it will be encoded in several redistribution communities.
2.1 The redistribution community
The extended communities attribute is defined in [RTR01]. This
attribute allows a BGP router to attach a set of extended communities
to an UPDATE message. Each extended community value is encoded as an
eight-octet quantity with a two-octet type field and a six-octet
value field. Several types of extended community values are
defined in [RTR01]. This document proposes a new well-known
extended community : the redistribution community.
The redistribution community is composed of a two-octet type field
and a six-octet value field. The two-octet type field is encoded
as follows. The high order octet indicates that this is a
redistribution community. It is encoded as defined in [RTR01]. The
high order bit and the transitive bit of the first octet are set
to one and the 6 lower order bits of this octet are TBD_IANA.
The second octet of the type field indicates the redistribution
policy to apply to the specified BGP speakers for the attached
route. This octet is encoded as follows :
- The high and the second order bits (Bit7 and Bit6) are reserved
- Bit5 is the Dist_List flag. When set to 1, it indicates that the
redistribution policy modifies the redistribution of the route
to the specified BGP speakers (either by inserting NO_EXPORT or by
path prepending) but not the redistribution to the non-specified
BGP speakers. Otherwise, the redistribution policy prohibits the
redistribution to the specified BGP speakers and may modify the
route redistributed to the non-specified BGP speakers.
- Bit4 is the Include/Exclude bit. When set to 1, it means that the
redistribution policy applies to the listed BGP speakers. Other-
wise, the redistribution policy only applies to the BGP speakers
that are not listed.
- Bit3 is the No_Export bit. If set to 1, it means that the NO_EXPORT
community should be inserted when announcing the attached route to
the BGP speakers covered by the redistribution policy.
- Bits2-0 are the Prepend bits. Their value indicates how many times
the AS number of the announcing router should be prepended when
announcing the attached route to the BGP speakers covered by the
redistribution policy. A value of 0 indicates that no prepending
should occur.
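The bit layout of the second octet of the type field can be sketched
as below. The Policy tuple and the two function names are hypothetical
helpers introduced for illustration. The sketch also assumes that the
reserved bits (Bit7 and Bit6) are sent as zero and ignored on
reception, which this document does not state.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    dist_list: bool   # Bit5
    include: bool     # Bit4, the Include/Exclude bit
    no_export: bool   # Bit3
    prepend: int      # Bits2-0


def encode_policy(policy):
    """Build the policy octet; the reserved bits are left at zero."""
    if not 0 <= policy.prepend <= 7:
        raise ValueError("Prepend must fit in three bits (0-7)")
    return (
        (int(policy.dist_list) << 5)
        | (int(policy.include) << 4)
        | (int(policy.no_export) << 3)
        | policy.prepend
    )


def decode_policy(octet):
    """Split a policy octet into its flags, ignoring the reserved bits."""
    return Policy(
        dist_list=bool(octet & 0x20),
        include=bool(octet & 0x10),
        no_export=bool(octet & 0x08),
        prepend=octet & 0x07,
    )
```

For instance, Dist_List=1, Include/Exclude=1, No_Export=0 and
Prepend=5 yield the octet 0x35 with this layout.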
The 6 octets value field of the redistribution community indicates
to which BGP speakers the redistribution policy applies. It is
encoded as follows :
- The high order octet indicates the type of the BGP speakers field.
- The five low order octets are the value of this field.
This document defines four types of BGP speakers fields (values
0x01-0x04). Value 0x00 is reserved and values 0x05-0x7f are to be
assigned by IANA. Values larger than 0x7f are vendor specific.
- The BGP speakers field contains a two-octet AS number (Speakers
Type 0x01)
- The BGP speakers field contains two two-octet AS numbers (Speakers
Type 0x02)
- The BGP speakers field contains a CIDR prefix/length pair (Speakers
Type 0x03)
- The BGP speakers field contains a four-octet AS number (Speakers
Type 0x04)
The BGP speakers field shall be encoded as follows. If this field
contains a two octet AS number, the AS number shall be placed in
the two high order octets. The three low order octets shall be set
to zero upon transmission and ignored upon reception. If the BGP
speakers field contains two two-octet AS numbers, the first AS
number should be placed in the two high order octets. The second AS
number should be placed in the next two octets and the last octet
shall be set to zero upon transmission and ignored upon reception.
If the BGP speakers field contains a four octet AS number, the AS
number shall be placed in the four high order octets. The low order
octet shall be set to zero upon transmission and ignored upon
reception. If the BGP speakers field contains a CIDR prefix/length
pair, the IP prefix shall be placed in the four high order octets
and the low order octet will contain the prefix length.
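The encoding rules above for the six-octet value field can be
sketched as follows. The constant and function names are hypothetical.
The sketch assumes network byte order and an IPv4 prefix, which is
what a four-octet prefix field implies.

```python
import ipaddress
import struct

# Speakers Type values defined in this document.
SPEAKERS_AS2 = 0x01
SPEAKERS_AS2_PAIR = 0x02
SPEAKERS_PREFIX = 0x03
SPEAKERS_AS4 = 0x04


def encode_as2(asn):
    """One two-octet AS number followed by three zero octets."""
    return struct.pack("!BH3x", SPEAKERS_AS2, asn)


def encode_as2_pair(first, second):
    """Two two-octet AS numbers followed by one zero octet."""
    return struct.pack("!BHHx", SPEAKERS_AS2_PAIR, first, second)


def encode_prefix(cidr):
    """A four-octet IPv4 prefix followed by the prefix length."""
    net = ipaddress.IPv4Network(cidr)
    return struct.pack(
        "!B4sB", SPEAKERS_PREFIX, net.network_address.packed, net.prefixlen
    )


def encode_as4(asn):
    """One four-octet AS number followed by one zero octet."""
    return struct.pack("!BIx", SPEAKERS_AS4, asn)
```

Each function returns exactly six octets: the Speakers Type octet
followed by the five-octet BGP speakers field.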
3 Operations
A router may, depending on its policy, add any redistribution com-
munities to a route originated by itself or received from another
BGP speaker with iBGP or eBGP. In practice, only the originator of
the route should insert the redistribution community as it is an
attempt of the route originator to do some form of inter-domain
traffic engineering. The redistribution communities defined in
this document are only used when a route is redistributed to an
eBGP peer. They do not affect the redistribution of routes via iBGP.
When a router receives a route with redistribution communities, it
should apply the operations specified by these communities when
redistributing the route to eBGP peers. A router should remove the
received redistribution communities when redistributing the route
to eBGP peers. It may however add its own redistribution communi-
ties to this route before redistributing it.
Two redistribution communities are said to be applicable to the
same redistribution policy when their two high order octets are
equal.
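The grouping rule above can be sketched as follows, assuming each
redistribution community is available as an eight-octet byte string.
The function name is hypothetical.

```python
from collections import defaultdict


def group_by_policy(communities):
    """Group eight-octet communities by their two-octet type field.

    Returns a dict that maps each type field to the list of six-octet
    value fields carried by the communities sharing that type field.
    """
    groups = defaultdict(list)
    for community in communities:
        if len(community) != 8:
            raise ValueError("an extended community is eight octets long")
        groups[bytes(community[:2])].append(bytes(community[2:]))
    return dict(groups)
```

All value fields found under one key together form the list of BGP
speakers for that redistribution policy.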
A router should apply the policies defined by the redistribution
communities to the routes that it has selected for advertisement
from its Adj-RIB-OUT based on its own policy. A route that contains
redistribution policies should be processed as follows. All redis-
tribution communities that correspond to the same redistribution
policy should be processed together by considering the type field
of the redistribution communities and the list of BGP speakers that
are covered by this policy. The pseudo-code below clarifies the
operation :
/* Extract from redistribution communities the following information
   for each redistribution policy */
/* Dist_List       : Bit5 */
/* Include_Exclude : Bit4 */
/* No_Export       : Bit3 */
/* Prepend         : Bits2-0 */
/* BGP_Speakers    : List of AS numbers and CIDR prefixes covered by
                     this redistribution policy */
if (Dist_List == 1)
{
    if ( ((Include_Exclude == 1) AND (Peer isin BGP_Speakers))
         OR
         ((Include_Exclude == 0) AND not(Peer isin BGP_Speakers)) )
    {
        /* route can be announced to eBGP peer */
        if (No_Export == 1)
            /* insert NO_EXPORT community */
        if (Prepend > 0)
            /* Prepend own AS number Prepend times */
    }
    else
    {
        /* The route can be announced as usual to this peer */
    }
}
else /* Dist_List == 0 */
{
    if ( ((Include_Exclude == 1) AND (Peer isin BGP_Speakers))
         OR
         ((Include_Exclude == 0) AND not(Peer isin BGP_Speakers)) )
    {
        /* The route cannot be announced to this peer */
    }
    else
    {
        /* route can be announced to eBGP peer */
        if (No_Export == 1)
            /* insert NO_EXPORT community */
        if (Prepend > 0)
            /* Prepend own AS number Prepend times */
    }
}
Figure 2: Processing of the redistribution communities
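The decision procedure of Figure 2 can also be written as a small
executable sketch. The names Policy, covered and apply_policy are
hypothetical. The test "Peer isin BGP_Speakers" is assumed to have
been evaluated already and is passed in as the boolean peer_listed.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    dist_list: bool   # Bit5
    include: bool     # Bit4, the Include/Exclude bit
    no_export: bool   # Bit3
    prepend: int      # Bits2-0


def covered(policy, peer_listed):
    """True when the Include/Exclude test of Figure 2 selects the peer."""
    return peer_listed if policy.include else not peer_listed


def apply_policy(policy, peer_listed):
    """Return None if the route must not be announced to the peer.

    Otherwise return the modifications to apply on announcement.
    """
    modified = {"no_export": policy.no_export, "prepend": policy.prepend}
    unchanged = {"no_export": False, "prepend": 0}
    if policy.dist_list:
        # Selected peers get the modified route, the others the usual one.
        return modified if covered(policy, peer_listed) else unchanged
    # Dist_List == 0: selected peers get nothing, and the others get
    # the route with the requested modifications.
    return None if covered(policy, peer_listed) else modified
```

A caller would evaluate apply_policy once per eBGP peer and per
redistribution policy attached to the route.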
As some operators do not wish to allow interdomain traffic
engineering on their networks contrary to local policy, an imple-
mentation should provide a mechanism to ignore these communities.
For implementation purposes, the two-octet AS version of the
BGP_Speakers field may be problematic for interim implementations,
since it does not easily allow an extended communities implementation
to simply add handling for its particular AS. That is, an
implementation can easily match an unknown community on an exact
basis, whereas the 2-octet version requires applying a mask and
checking both components.
It should be noted that given the flexibility of the defined redis-
tribution communities, it is possible to define two conflicting
redistribution communities (e.g. one indicating that this route
should not be announced to ASx and the other indicating that this
route should only be announced to ASx). Such cases should be
avoided by the operators. If such problems occur, an implementa-
tion may apply any of the conflicting redistribution communities
and ignore the others. In this case, it would be useful to log the
error.
4 IANA considerations
This document requests the attribution of a new BGP extended commu-
nities type field from IANA.
5 Security considerations
Both the communities and the extended communities attributes have the
potential to introduce additional security concerns into BGP.
Traditional implementations allow third parties to modify the
(extended) communities attached to a route. By appending communities,
a third party may therefore bias the reachability of the network in
question, according to the semantics of those communities. The
redistribution extended community mechanism further allows someone to
maliciously deny reachability to ASes by proxy.
When utilized by the route originator, the redistribution extended
community may possibly be used to mitigate DDoS attacks by denying
an attacking AS reachability to the network in question. This
assumes the AS in question is using default-free policy and no
supernets of the network in question are present in the global
routing table.
6 Conclusion
This document has proposed the new redistribution community. By
using the defined redistribution communities, a BGP router can
influence the redistribution of a given route by its peers. The
proposed redistribution community is intended to replace the current
widespread utilization of locally defined BGP communities, which
relies heavily on manual router configuration.
The redistribution community proposed by this document could also
be useful for inter-provider VPNs such as those described in
[RRB^+01].
Acknowledgements
This work was partially funded by the European Commission, within
the ATRIUM IST project.
References
[Hal97] B. Halabi. Internet Routing Architectures. Cisco Press,
1997.
[Hus01] G. Huston. AS1221 BGP table statistics. available from
http://www.telstra.net/ops/bgp/, 2001.
[ISO93] ISO/IEC, Protocol for Exchange of Inter-domain Routeing
information among Intermediate Systems to Support Forwarding of ISO
8473 PDUs, ISO/IEC 10747:1993
[RRB^+01] E. Rosen, Y. Rekhter, T. Bogovic, R. Vaidyanathan, S.
Brannon, M. Morrow, M. Carugi, C. Chase, L. Fang, T. Wo Chung, J.
De Clercq, E. Dean, P. Hitchin, A. Smith, M. Leelanivas, D.
Marshall, L. Martini, V. Srinivasan, and A. Vedrenne. BGP/MPLS VPNs.
Internet draft draft-rosen-rfc2547bis-03.txt, work in progress,
February 2001.
[RTR01] S. Ramachandra, D. Tappan, and Y. Rekhter. BGP extended
communities attribute. Internet draft, draft-ramachandra-bgp-ext-
communities-08.txt, work in progress, January 2001.
Authors' Addresses
Olivier Bonaventure
Infonet group (FUNDP)
Rue Grandgagnage 21, B-5000 Namur, Belgium
Email: Olivier.Bonaventure@info.fundp.ac.be
URL : http://www.infonet.fundp.ac.be
Stefaan De Cnodder
Alcatel
Carrier Internetworking Division
Francis Wellesplein 1
B-2018 Antwerp, Belgium
Email: stefaan.de_cnodder@alcatel.be
Jeffrey Haas
NextHop Technologies
517 Williams
Ann Arbor, MI 48103-4943
Phone: +1 734 936 2095
Fax: +1 734 615-3241
Email: jhaas@nexthop.com
Russ White
Cisco Systems
Email: ruwhite@cisco.com
Appendix 1 Examples
+---------------+     +-------+    +-------+
|               |     |       |    |       |
|     AS22      |=====|AS50   |====| AS40  |
|               |     |       |    |       |
+---------------+     +-------+    +-------+
       ||                              ||
+---------------+               +---------------+
|               |               |     AS1       |
|               |===============|               |
|     AS20      |               +---------------+
+---------------+                      ||
       ||                              ||
+---------------+               +---------------+
|               |               |               |
|    AS10      R|------IX-------|R     AS30     |
|               |  1.2.3.0/24   |               |
+---------------+               +---------------+
Figure 3: Simple interdomain topology
To better understand the usefulness and the flexibility of the pro-
posed redistribution communities, it is useful to consider a few
examples. Assume the simple interdomain topology shown in figure 3.
If AS30 wanted to offer to AS1 a limited transit service to reach
only the ASes connected at the IX, then it could simply attach to the
routes received from AS1 a redistribution community like :
- Dist_List=0
- Include/Exclude=0
- NO_EXPORT=0
- Prepend=0
- Value = 1.2.3.0/24
With this redistribution community, the routes received from AS1
will only be announced to the eBGP speakers that are part of the
1.2.3.0/24 subnet.
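The effect of this community on individual peers can be checked with
a short sketch. The function name and the two peer addresses are
hypothetical and serve only as an illustration. The membership test
stands for "Peer isin BGP_Speakers" with a Speakers Type 0x03 value.

```python
import ipaddress

# Speakers Type 0x03 value taken from the example: the IX subnet.
IX_SUBNET = ipaddress.ip_network("1.2.3.0/24")


def announced_to(peer_address):
    """Dist_List=0 and Include/Exclude=0: the route is withheld from
    every speaker that is not listed, so only peers whose address lies
    inside the prefix receive it."""
    return ipaddress.ip_address(peer_address) in IX_SUBNET


# Hypothetical eBGP session addresses, for illustration only.
ix_peer = "1.2.3.10"        # a router attached to the IX
private_peer = "192.0.2.1"  # a peer reached over a private link
```

Here announced_to(ix_peer) is true and announced_to(private_peer) is
false, which matches the limited transit service described above.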
Assume now that AS20 agrees to provide a limited transit service.
For this, AS20 wants to advertise the routes received from AS1 to all
its cheap peers except its transit upstreams (e.g. AS2 and AS3 -
not shown in the figure). In this case, AS20 would insert the fol-
lowing redistribution community to the routes received from AS1 :
- Dist_List=0
- Include/Exclude=1
- NO_EXPORT=0
- Prepend=0
- Value = AS2, AS3
AS20 could also want to provide a kind of "backup" service. For
example, it would announce to its transit upstreams the routes
received from AS1 as low quality routes. In this case, AS20 would
insert the following redistribution community to the routes
received from AS1 :
- Dist_List=1
- Include/Exclude=1
- NO_EXPORT=0
- Prepend=5
- Value = AS2,AS3
If AS20 had three transit providers, AS2, AS3 and AS4, then it would
need to use two redistribution communities to encode this redistri-
bution policy.
Redistribution community 1
- Dist_List=1
- Include/Exclude=1
- NO_EXPORT=0
- Prepend=5
- Value = AS2,AS3
Redistribution community 2
- Dist_List=1
- Include/Exclude=1
- NO_EXPORT=0
- Prepend=5
- Value = AS4
These two redistribution communities would be processed together
since they apply to the same redistribution policy.
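How the list AS2, AS3, AS4 could be spread over two communities is
sketched below. The function names are hypothetical. The sketch
assumes that a pair of AS numbers is carried with Speakers Type 0x02
and a single leftover AS number with Speakers Type 0x01, which the
example implies but does not state.

```python
import struct


def policy_octet(dist_list, include, no_export, prepend):
    """Second octet of the type field, with the reserved bits at zero."""
    return (dist_list << 5) | (include << 4) | (no_export << 3) | (prepend & 0x07)


def speakers_values(as_numbers):
    """Pack two-octet AS numbers into six-octet value fields.

    Two AS numbers share one field (Speakers Type 0x02); an odd
    leftover is sent alone (Speakers Type 0x01).
    """
    values = []
    for i in range(0, len(as_numbers), 2):
        chunk = as_numbers[i:i + 2]
        if len(chunk) == 2:
            values.append(struct.pack("!BHHx", 0x02, chunk[0], chunk[1]))
        else:
            values.append(struct.pack("!BH3x", 0x01, chunk[0]))
    return values


octet = policy_octet(1, 1, 0, 5)      # same policy octet in both communities
fields = speakers_values([2, 3, 4])   # one field for AS2,AS3 and one for AS4
```

Both communities carry the same policy octet, so a receiver that
groups communities by type field treats them as one policy covering
AS2, AS3 and AS4.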
Assume that AS1 receives a lot of traffic from AS22 and AS10. For
traffic engineering purposes, AS1 would like to utilize its link
with AS40 for the traffic coming from AS22 and its link with AS20
for the traffic received from AS10. In this case, AS1 cannot simply
prepend its own AS number on the link to AS20 since then the traf-
fic from AS10 will be received through AS30. To control the traffic
received from AS22, AS1 would insert the following redistribution
community to its routes sent to AS20 :
- Dist_List=1
- Include/Exclude=1
- NO_EXPORT=0
- Prepend=3
- Value = AS22
Similarly, to control the traffic received from AS10, AS1 would
insert the following redistribution community to its routes sent to
AS30 :
- Dist_List=1
- Include/Exclude=1
- NO_EXPORT=0
- Prepend=1
- Value = AS10
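The outcome of these two communities can be traced on the AS paths
seen by AS22 and AS10 in figure 3. This is a sketch under two
assumptions that the document leaves open: every AS prepends its own
number once on a normal eBGP announcement, and the Prepend value asks
for that many additional copies. The function name is hypothetical.

```python
def export_path(as_path, own_as, extra_prepend=0):
    """eBGP export: own AS once as usual, plus the requested extra copies."""
    return [own_as] * (1 + extra_prepend) + list(as_path)


origin = [1]  # AS path of the route as announced by AS1

# Paths heard by AS22: from AS20 (Prepend=3 applies) and from AS50.
as22_via_as20 = export_path(origin, 20, extra_prepend=3)
as22_via_as50 = export_path(export_path(origin, 40), 50)

# Paths heard by AS10: from AS20 (unchanged) and from AS30 (Prepend=1).
as10_via_as20 = export_path(origin, 20)
as10_via_as30 = export_path(origin, 30, extra_prepend=1)
```

With path length as the only criterion, AS22 prefers the path through
AS50 and AS40, while AS10 prefers the path through AS20, which is the
split that AS1 wanted.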