INTERNET-DRAFT D. Meyer (Editor)
Category Informational
Expires: July 2004 January 2004
Operational Concerns and Considerations for Routing Protocol
Design -- Risk, Interference, and Fit (RIFT)
<draft-ietf-grow-rift-00.txt>
Status of this Document
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119].
This document is a product of the RIFT Design Team. Comments should
be addressed to the authors, or the mailing list at
grow@lists.uoregon.edu.
Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved.
Meyer, et. al. [Page 1]
INTERNET-DRAFT Expires: July 2004 January 2004
Abstract
The Risk, Interference, and Fit (RIFT) design team was formed to
document the concerns and considerations surrounding the use of
Internet routing protocols for functions not directly related to
routing of IP packets within the Internet and IP networks. This
document is the output of that activity.
Table of Contents
1. Introduction
2. Scope of this Work
3. Problem Statement
3.1. Risk, Interference, and Application Fit (RIFT)
3.1.1. Risk: Software Engineering
3.1.2. Interference: Protocol Specification/Dynamic Behavior
3.1.3. Application Fit: Distribution Topology
4. Definitions
4.1. Reachability Information
4.2. Layer 3 Routing Information
4.3. Auxiliary (non-routing) Information
4.4. Address Family Identifier (AFI)
4.5. Subsequent Address Family Identifier (SAFI)
4.6. Network Layer Reachability
4.7. Application
4.8. Routing Protocol
4.9. Fate Sharing
5. Architectural Models
5.1. General Purpose Transport Infrastructure (GPT) Model
5.2. Special Purpose Transport Infrastructure (SPT) Model
6. Analyzing Risk and Interference
6.1. Risk: Code Impact, and Resource Sharing
6.1.1. Code Impact
6.1.2. Resource Sharing
6.1.2.1. Resource Sharing and Operating System Level Issues
6.2. Interference
7. GPT and SPT Models: Risk and Interference
7.1. Risk
7.1.1. Code Impact
7.1.2. Resource Sharing
7.1.3. Multisession BGP
7.2. Interference
7.2.1. Multisession BGP
8. Application Fit
8.1. RFC 2547 Style VPNs
8.2. VPWS
8.3. VPLS
9. Operational Implications
10. Other Models
11. Conclusions and Recommendations
12. Intellectual Property
13. Design Team
14. Acknowledgments
15. Security Considerations
16. IANA Considerations
17. References
17.1. Normative References
17.2. Informative References
18. Editor's Address
19. Full Copyright Statement
1. Introduction
The stability of the global Internet routing system has been the
subject of much research (see e.g., [RVBIB]) and discussion on
various IETF mailing lists [IETFOL]. Much of the research into the
routing system has centered around the analysis of the dynamics and
stability of the Border Gateway Protocol Version 4 [BGP] (hereafter
referred to as BGP).
However, while the theoretical properties of BGP remain a topic of
great interest, a more recent discussion has focused on the effects of
adding new types of Network Layer Reachability Information (NLRI)
to BGP. In particular, the advent of two BGP attributes,
Multiprotocol Reachable NLRI (MP_REACH_NLRI) and Multiprotocol
Unreachable NLRI (MP_UNREACH_NLRI) [RFC2858], has made it possible
to encode and transport a wide variety of features and their
associated signaling using the BGP transport infrastructure. Examples
include IPv6 [RFC2460], flow specification rules [FLOW], IP
VPNs [RFC2547BIS], Virtual Private LAN Services [VPLS], Virtual
Private Wire Service [VPWS], and auto-discovery mechanisms for VPNs
in general [BGPVPN].
This document outlines the concerns and issues surrounding the use of
the BGP infrastructure as a generic feature and signaling transport.
However, similar concerns apply to the Interior Gateway Protocols
(IGPs) in common use (e.g., ISIS [RFC1142] or OSPF [RFC2328]).
The rest of this document is organized as follows: Section 2 outlines
the scope of this work, Section 3 introduces the problem statement
that is the focus of this document, Section 4 provides definitions,
and Section 5 outlines the main architectural models that are
discussed. The remaining sections discuss the implications of
those models.
2. Scope of this Work
It is the intention of the RIFT design team that this document serve
as a guide for both protocol designers and network operators. The
goal is to outline the implications associated with employing
existing routing protocols to enable additional feature sets and
functionality, as contrasted with designing new mechanisms to carry
those feature sets and functionalities.
The issues, concerns and considerations discussed in this document
focus on the implications for BGP [BGP,RFC1771]. It is important to
note that similar issues will arise when considering generalizations
to the information that the IGPs carry.
3. Problem Statement
The advent of the MP_REACH_NLRI and MP_UNREACH_NLRI attributes,
combined with the resulting generalization of the BGP infrastructure,
has created the opportunity to use BGP to transport a wide variety
of data types and their associated signaling. The combination of a
BGP data type and its associated signaling is frequently called an
"application"; example applications include the IPv4 and IPv6
[RFC2460] routing systems, flow specification rules [FLOW],
auto-discovery mechanisms for Layer 3 VPNs [BGPVPN], virtual private
LAN services [VPLS], and virtual private wire service [VPWS].
More recently, the discussion in the IETF community has focused on
the use of BGP as a generalized feature transport infrastructure
[IETFOL]. The debate has recently intensified due to the emergence of
a new class of application that uses the BGP infrastructure to
distribute information that is not directly related to inter-domain
routing. Examples of such applications include the use of the BGP
transport infrastructure to provide auto-discovery for IP VPNs
[RFC2547BIS], the virtual private LAN services mentioned above [VPLS]
and VPNs in general [BGPVPN].
3.1. Risk, Interference, and Application Fit (RIFT)
As mentioned above, much of the debate surrounding these new uses of
the BGP transport infrastructure has focused on the potential
tradeoffs between the stability of the Internet routing system, as
affected by the deployment of new applications, and the desire on the
part of service providers to deploy these new applications rapidly
and to reduce operational cost by re-using existing protocols.
These tradeoffs have at times been described in terms of risk,
interference, and application fit. Risk models the software
engineering impact of new applications on a generic implementation,
while interference models the impact of new applications on protocol
definition and behavior. Finally, application fit models the
similarity between an application's data and signaling requirements
and a specific distribution algorithm. Each is described below.
3.1.1. Risk: Software Engineering
Risk attempts to assess the robustness tradeoffs inherent in the
addition of new applications to a given implementation. That is, risk
models the impact of generic software engineering issues on a given
implementation. These issues include the impact of new applications
on existing implementations and on the fate sharing properties of
those implementations.
A second aspect of risk lies in the trade-off of extending an
existing protocol versus designing, implementing, and deploying a new
protocol.
3.1.2. Interference: Protocol Specification/Dynamic Behavior
Interference models the potential for a new application to adversely
affect the operation of an existing implementation at the protocol
level by inadvertently introducing a detrimental dependency of some
kind. That is, an application is said to "interfere" with an existing
application if, by virtue of the application's protocol extension(s),
one or more fundamental properties of the protocol's operation are
detrimentally altered. For example, could a new state introduce an
unanticipated deadlock? Could the extension destabilize the
distributed behavior of the protocol? Or might the protocol simply
run out of available attributes or bits (as happened, for example,
with RADIUS [RFC2138])?
3.1.3. Application Fit: Distribution Topology
Application fit refers to how closely the requirements of the data to
be distributed match the underlying capabilities of a distribution
mechanism. For example, it is clearly inefficient to broadcast data
to all peers that is only required between two peers, just as it is
inefficient to unicast (replicate) data that is required by all peers
when a single broadcast would do.
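The mismatch described above can be made concrete with a small
counting sketch. It is illustrative only: the function name and the
simple cost model (one transmission per broadcast, one transmission
per unicast receiver) are assumptions made for this example and do
not describe any particular protocol.

```python
def delivery_costs(total_peers: int, interested_peers: int) -> dict:
    """Compare one broadcast with per-receiver unicast for a single update.

    total_peers includes the sender; interested_peers is the number of
    receivers that actually need the data (hypothetical cost model).
    """
    if total_peers < 2:
        raise ValueError("need a sender and at least one receiver")
    receivers = total_peers - 1
    if not 0 <= interested_peers <= receivers:
        raise ValueError("interested_peers must be in 0..total_peers-1")
    return {
        # A broadcast is sent once but reaches every receiver.
        "broadcast_transmissions": 1,
        "broadcast_unwanted_deliveries": receivers - interested_peers,
        # Unicast replicates the data once per interested receiver.
        "unicast_transmissions": interested_peers,
    }
```

With 50 peers and one interested receiver, the broadcast produces 48
unwanted deliveries; with 49 interested receivers, unicast requires 49
transmissions where one broadcast would suffice.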
4. Definitions
4.1. Reachability Information
Reachability information refers to information describing some part
of a network, along with how one can reach it, and perhaps also
containing attributes of the implied path to the network locale.
Typically, this information pertains to IP routing information; an
example of non-IP reachability is VPLS information [VPLS].
4.2. Layer 3 Routing Information
Layer 3 routing information represents either link state information
or network reachability information. Link state information
represents Layer 3 adjacencies and topology. Link state routing
protocols, such as OSPF [RFC2328] and ISIS [RFC1142], flood link
state information throughout an IGP domain, so that each
participating router maintains an identical copy of a database that
is computed to reflect the complete Layer 3 topology.
Layer 3 reachability information expressed as an IP address prefix
represents the set of destinations (systems) whose IP addresses are
contained in the IP address prefix. Distance/path vector routing
protocols, such as BGP, distribute Layer 3 reachability information
among routing domains.
Routers use both types of Layer 3 routing information (link state and
reachability) to produce IP forwarding tables. That is, for purposes
of this discussion, "routing information" relates to the Layer 3
inter-domain routing data traditionally carried by BGP.
Finally, if one defines routing information as "information used to
forward packets" and combines this with the above definition of
reachability information, then information such as that described in
[FLOW] can be considered routing information, since it attempts to
add a level of granularity to how an 'aggregate' is defined. That
is, [FLOW] is intended to complement the existing routing
information, and the flow information is dependent on IPv4 unicast
reachability advertised by the same neighbor.
4.3. Auxiliary (non-routing) Information
Auxiliary information is any information exchanged by routers
that is neither Layer 3 routing information nor reachability
information. IS-IS hostname TLVs are an example of auxiliary
information [RFC1142].
4.4. Address Family Identifier (AFI)
An Address Family contains addresses that share common structure and
semantics. An Address Family Identifier (AFI) uniquely identifies
each address family. Several routing protocol messages contain a
field that represents the AFI. The AFI identifies the address type
used by another data item contained in that message. The Routing
Information Protocol (RIP) [RFC2453], Distance Vector Multicast
Routing Protocol (DVMRP) [RFC1075], and BGP all employ the AFI field.
For example, the BGP MP_REACH_NLRI and MP_UNREACH_NLRI attributes
contain an AFI field. These BGP attributes also contain a NLRI field
that enumerates reachable or unreachable subnetworks corresponding to
the associated address family. The AFI field indicates the address
type by which reachable subnetworks are identified. When BGP is used
to distribute Layer 3 routing information, AFIs can indicate the
following address types: IPv4, IPv6, VPNv4 [RFC2547BIS]. When BGP is
used to distribute auxiliary information, AFIs can indicate other
address families.
4.5. Subsequent Address Family Identifier (SAFI)
A Subsequent Address Family Identifier (SAFI) is part of the BGP
MP_REACH_NLRI and MP_UNREACH_NLRI attributes. These BGP attributes
also contain a NLRI field that enumerates reachable or unreachable
subnetworks. The SAFI augments the AFI, carrying additional
information regarding networks enumerated in the NLRI field.
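As an illustration of how the AFI and SAFI fields lead the attribute,
the following sketch decodes the start of an MP_REACH_NLRI attribute
value. It assumes the field order given in [RFC2858] (a 2-octet AFI,
a 1-octet SAFI, a 1-octet next hop length, then the next hop) and
leaves everything after the next hop unparsed. The function name and
the two name tables are hypothetical; the tables list only a few
well-known code points.

```python
import struct

# Small illustrative subset of assigned values, not a complete registry.
AFI_NAMES = {1: "IPv4", 2: "IPv6"}
SAFI_NAMES = {1: "unicast", 2: "multicast", 128: "MPLS-labeled VPN"}


def parse_mp_reach_prefix(value: bytes):
    """Return (afi, safi, next_hop, remainder) from an attribute value.

    Only the leading fields are decoded; 'remainder' holds the octets
    that follow the next hop, left uninterpreted in this sketch.
    """
    if len(value) < 4:
        raise ValueError("attribute value too short for AFI/SAFI header")
    # Network byte order: 2-octet AFI, 1-octet SAFI, 1-octet next hop length.
    afi, safi, nh_len = struct.unpack_from("!HBB", value, 0)
    if len(value) < 4 + nh_len:
        raise ValueError("next hop length exceeds attribute value")
    next_hop = value[4:4 + nh_len]
    remainder = value[4 + nh_len:]
    return afi, safi, next_hop, remainder


def describe(afi: int, safi: int) -> str:
    """Render an AFI/SAFI pair using the illustrative name tables."""
    return "{}/{}".format(AFI_NAMES.get(afi, "AFI %d" % afi),
                          SAFI_NAMES.get(safi, "SAFI %d" % safi))
```

A receiver can use the decoded (AFI, SAFI) pair to decide which
application should interpret the remaining octets.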
4.6. Network Layer Reachability
Network Layer Reachability Information (NLRI) is the data described
by the AFI/SAFI fields [AFI,SAFI]. While these concepts were
originally described for protocols such as DVMRP [RFC1075], the bulk
of the generalization of the NLRI described in this document derives
from the introduction of the MP_REACH_NLRI and MP_UNREACH_NLRI
attributes to BGP [RFC2858].
4.7. Application
The term application is used in this document to refer to the
combination of a BGP data type and any signaling data that is carried
by BGP in support of the service the data type carries. The data type
is typically described in an AFI/SAFI, while the actual data is
frequently contained in both NLRI and BGP community attributes
[RFC1997].
4.8. Routing Protocol
A routing protocol is composed of two basic components: a data
distribution algorithm and a decision algorithm. A router typically
obtains Layer 3 routing information via its data distribution
algorithm, and it uses this information to produce an IP forwarding
table (by applying the protocol's decision algorithm to the received
routing data). Note that it is the use of BGP's data distribution
algorithm that is the focus of this document. However, when judging
application fit, one may also consider whether the decision
algorithms suit the application.
4.9. Fate Sharing
The fate sharing principle for end to end network protocols was first
enunciated by Dave Clark [CLARK]. As applied to software systems,
fate sharing refers to the sharing of common resources among a group
of applications. In our case, the particular "fate" of most interest
is the ability of one application, call it application A, to cause an
application with which it is fate sharing, call it application B, to
experience one or more faults due to faults in application A. Fate
sharing can exist at many levels, including between modules on a
system, between routing protocols, between sessions of a routing
protocol such as BGP, or between applications within a routing
protocol.
5. Architectural Models
In this section, we consider two architectural models that are
motivated by the salient questions considered in this document, namely:
(i). Does the BGP distribution protocol suit a particular
application (i.e., does an application fit the BGP
distribution protocol)?
(ii). What are the effects on the global routing system (if
any) of carrying that application using the BGP distribution
protocol?
These questions must be analyzed in terms of the cost of protocol and
code development, as well as in terms of the operational expense that
may be incurred by utilizing (or not utilizing) the mechanisms
already present in BGP.
Two models, describing alternate viewpoints, are examined in the
following sections.
5.1. General Purpose Transport Infrastructure (GPT) Model
The GPT model treats the BGP data distribution infrastructure as a
generic application transport mechanism. As such, it focuses on
application fit, and assumes that the tradeoffs, both in terms of
risk and interference, can be managed in an efficient manner. As a
result, the GPT model frames these issues not in terms of whether the
application and signaling data that need to be distributed are part
of some particular class (routing, in this case), but rather in terms
of whether the distribution requirements of that data are similar
enough to the distribution mechanisms of BGP. In those cases where
distribution requirements are sufficiently similar, BGP can be a
logical candidate for a transport infrastructure. Note that this is
not because of the nature of the information distributed, but rather
due to the similarity in the transport requirements. There are of
course other operational considerations that make BGP a logical
candidate, including its close to ubiquitous deployment in the
Internet (as well as in intranets), its policy capabilities, and
operator comfort levels with the technology.
5.2. Special Purpose Transport Infrastructure (SPT) Model
The SPT model, on the other hand, models the BGP infrastructure as a
special purpose transport designed specifically to transport inter-
domain routing information. As such, it is more sensitive to risk and
interference than to application fit.
There are two basic arguments supporting the SPT model: The first is
based on the perceived risk profile involved in adding new
applications to the BGP transport infrastructure or new features to
existing BGP applications. The concern here is that changes to BGP
implementations will cause software quality to degrade, and hence
destabilize the global routing system. This position is based upon
well understood software engineering principles, and is strengthened
by long-standing experience that there is a direct correlation
between software features and software stability [MULLER1999]. This
concern is augmented by the fact that in many cases, the existence of
the code for these features, even if unused, can also cause
destabilization in the routing system, since in many cases software
faults cannot be isolated.
A second concern is based on interference arguments, notably that the
increase in complexity of BGP due to the number of data types that it
carries can also potentially destabilize the global routing system.
This concern rests on several observations, including the fact that
the interaction of BGP dynamics and current deployment practices is
poorly understood, and that the addition of non-routing data types
may adversely affect convergence and other scaling properties of the
global routing system.
6. Analyzing Risk and Interference
One way to frame the tradeoffs involved in a model's risk profile is
in terms of the software engineering issues surrounding where an
implementation might demultiplex among applications. The important
point here is that an implementation's choice of demultiplexing point
directly affects the implementation's risk profile due to its effects
on existing code, and on the system resources it requires to be
shared among those applications.
6.1. Risk: Code Impact, and Resource Sharing
For purposes of this discussion, then, we consider the risk profile
of the SPT and GPT models with respect to their application
demultiplexing point. The GPT model typically provides a single point
for demultiplexing all applications (i.e., the AFI/SAFI). On the
other hand, the SPT model provides an application demultiplexing
point above BGP (typically at the TCP port level). That is, in the
GPT model, applications typically share a common transport session,
while the SPT model generally envisions one or more applications per
transport session (see section 7.1.3 for a discussion of the impact
of multisession BGP [MULTISESSION,SOFTNOTIFY] on this taxonomy).
Finally, note that these models can have very different risk profiles
with respect to code impact and resource sharing. Some of the
questions relating to risk assessment are considered below.
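The two demultiplexing points can be sketched as two small dispatch
tables. This is a minimal illustration, not a description of any real
BGP implementation: the class names, method names, and handler
signature are hypothetical. The only difference modeled here is the
demultiplexing key, (AFI, SAFI) within one session for GPT versus TCP
port (one session per application) for SPT.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[bytes], None]


class GptSession:
    """GPT-style: one transport session, demultiplexed by (AFI, SAFI)."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[int, int], Handler] = {}

    def register(self, afi: int, safi: int, handler: Handler) -> None:
        # Adding an application only adds a registry entry; the delivery
        # code below the demultiplexing point is unchanged.
        self._handlers[(afi, safi)] = handler

    def deliver(self, afi: int, safi: int, payload: bytes) -> None:
        # All applications share this session and its resources.
        self._handlers[(afi, safi)](payload)


class SptHost:
    """SPT-style: one transport session per application, keyed by port."""

    def __init__(self) -> None:
        self._sessions: Dict[int, Handler] = {}

    def listen(self, port: int, handler: Handler) -> None:
        # Each application gets its own session and demultiplexing point.
        self._sessions[port] = handler

    def deliver(self, port: int, payload: bytes) -> None:
        self._sessions[port](payload)
```

In the GptSession sketch every application shares one object, which is
where the resource sharing and fate sharing questions below arise; in
the SptHost sketch each application could in principle be stopped or
restarted independently of the others.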
6.1.1. Code Impact
In this section, we outline the high-level questions one might ask in
assessing the difference in risk between the GPT model and the SPT
model based on their effect on an existing code base.
o Does the code below the demultiplexing point need to be
changed when a new application is added?
o Does the code in existing applications have to be changed when
a new application is added (that is, to what extent are the
applications decoupled)?
o Can the code in separate applications be developed, tested,
released, debugged and packaged independently from other
applications?
o Is there significant code below the demultiplexing point that
can be shared among all applications?
6.1.2. Resource Sharing
In this section, we outline the high-level questions one might ask in
assessing the difference in risk between the GPT model and the SPT model
with respect to the requirements and properties of the system
resource sharing they require. In particular:
o Do applications have to compete for socket buffers, and hence
have the potential to block or starve each other (at the TCP
port level)?
o Do applications have to compete for possible protocol-level
transport-related buffers and queues, and hence have the
potential to starve or block each other at the protocol
send/receive level?
o Do applications have to compete for a possible per-connection
processing time budget, hence have the potential to starve
each other at the intra-process scheduling level?
6.1.2.1. Resource Sharing and Operating System Level Issues
In this section, we outline the high-level questions one might ask in
assessing the difference in risk between the GPT model and the SPT
model based on their effect on resource sharing at the operating
system level. In particular:
o Do applications share a common scheduling context? That is,
do applications have to compete for per-process scheduling
budgets?
o What is the degree of fate sharing between applications?
6.2. Interference
Interference models the potential for an application to affect the
behavior of an existing application or applications. For example, in
the case of the Internet routing system, one might ask if a certain
application "interferes" with IPv4 Unicast routing by affecting some
aspect of its protocol operation (e.g., convergence time).
Interference in the Internet routing system has its roots in the
observation that the routing system itself can be described as highly
self-dissimilar, with extremely different scales and levels of
abstraction. Complex systems with this property are susceptible to
"coupling", which RFC 3439 [RFC3439] defines as follows:
The Coupling Principle states that as things get larger, they
often exhibit increased interdependence between components.
COROLLARY: The more events that simultaneously occur, the larger
the likelihood that two or more will interact. This phenomenon
has also been termed "unforeseen feature interaction"
[WILLINGER2002].
That is, interference, if and where it occurs, has its roots in
complexity and is frequently the result of application coupling.
7. GPT and SPT Models: Risk and Interference
In this section, we analyze the risk and interference profiles of the
SPT and GPT models.
7.1. Risk
As mentioned above, risk models the robustness tradeoffs around
generic software architecture and engineering associated with
protocol implementations, including the impact on existing protocol
implementations, and on the fate sharing properties of those
implementations. In the following sections we consider these
components of risk for both the GPT and SPT models.
7.1.1. Code Impact
In this section, we outline the answers to the questions posed above.
o Does the code below the demultiplexing point need to be
changed when a new application is added?
In theory, such code changes are unlikely to be required in
the SPT model, as the SPT model envisions that a new
application will have a new demultiplexing point (port).
The GPT model does not by definition require new code below
the demultiplexing point either. Specifically, it should in
theory be possible to isolate code below the demultiplexing
point with suitable abstraction and constructs such as
AFI/SAFI API registries.
o Does the code in existing applications have to be changed when
a new application is added (that is, to what extent are the
applications decoupled)?
The SPT model envisions application independence with respect to
demultiplexing point. As such, it is unlikely to require such
changes. However, it is important to note that good software
engineering practices encourage code reuse and construction of
general purpose libraries. As a result, if applications share
libraries and/or other code, the practical independence
decreases, and consequently risk increases. The same analysis
can be made for the GPT model, since in this case we are already
demultiplexing on the AFI/SAFI fields.
o Can the code in separate applications be developed, tested,
released, debugged and packaged independently from other
applications?
While this is theoretically possible in the SPT model (and
possibly more difficult in the GPT model), practice and
experience have shown that achieving this type of independence is
difficult in either model.
7.1.2. Resource Sharing
In this section, we address the questions raised above to assess the
difference in risk between the GPT model and the SPT model based on
their effect on resource sharing.
o Do applications have to compete for socket buffers, and hence
have the potential to block or starve each other (at the TCP
port level)?
The SPT model does not require applications to compete for
socket level resources. It should also be possible to achieve
this type of application independence in the GPT model with
multisession BGP.
o Do applications have to compete for possible protocol-level
transport-related buffers and queues, and hence have the
potential to starve or block each other at the protocol
send/receive level?
Again, while the SPT model does not require competition for
transport-level resources, it should be possible to achieve
similar behavior with multisession BGP.
o Do applications have to compete for a possible per-connection
processing time budget, hence have the potential to starve
each other at the intra-process scheduling level?
Applications written to the SPT model should not require
this type of resource competition. It should also be possible to
reduce this type of resource competition with multisession BGP.
o Do applications have to compete for resources within the
network (e.g., bandwidth) when the protocol session spans
multiple hops?
Neither the SPT model nor the GPT model (again, with
multisession BGP) should require competition for network
resources in this case.
7.1.3. Multisession BGP
Suppose that one makes the simplifying assumption that a GPT
implementation's risk profile is dominated by the probability that an
error in one AFI/SAFI stream will cause some subset of the other
AFI/SAFI streams to malfunction (e.g., reset). In this case, risk
might be characterized as a function of the model and the number of
AFI/SAFI carried. Given this simplification, the risk profile looks
loosely like
Risk = f(Model, |{AFI,SAFI}|)
where
f:{GPT, SPT} X |{AFI, SAFI}| -> N
Note that we assume that
f(SPT,n) = O(f(GPT,n))
where
O(f) = {g:N->R | there exist c > 0 and n0 such that
g(n) <= c*f(n) for all n >= n0}
That is, the SPT risk profile is bounded above by the GPT risk
profile, up to a constant factor. Clearly, the existence of such an
upper bound is an integral aspect of any argument favoring the SPT
model.
Note that for the SPT model, we can think of the number of AFI/SAFI
that a single session carries as a small constant, call it k. k will
typically be small (close to 1), since by definition the SPT model
envisions a small number of AFI/SAFI per session (e.g., for AFI/SAFI
IPv4/unicast and IPv6/unicast, k = 2).
When formulated in this way, one can see that one objective of
multisession BGP is to find a value, call it g, such that
f(GPT, g) ~ f(SPT,k), for small values of k (i.e., k close to 1)
where
A(n) ~ B(k) ==> A(n) = B(k) + h(n), h(n) >= 0
That is, A(n) approaches B(k).
In this case, g is the size of the multisession AFI/SAFI grouping,
and for small values of g, multisession BGP can have a risk profile
that looks very much like the SPT risk profile. In particular, for g
= 1, both models would have similar risk profiles. Of course, there
are many other components of risk that are not considered by
this analysis, such as collateral issues resulting from the existence
of faulty shared code, operating system process and memory structure,
etc.
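The effect of the grouping size g can be illustrated with a toy
probability model. The model is an assumption introduced here for
illustration and is not part of the analysis above: each AFI/SAFI
stream is taken to fault independently with probability p during some
interval, and any fault is taken to reset every stream in the same
session. The function name is hypothetical.

```python
def collateral_reset_probability(p: float, group_size: int) -> float:
    """Probability that a given stream is reset by a fault in another
    stream sharing its session (toy model: independent faults with
    probability p, any fault resets the whole session)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # The stream is unaffected only if none of the other
    # (group_size - 1) streams in its session faults.
    return 1.0 - (1.0 - p) ** (group_size - 1)
```

Under these assumptions the collateral risk is zero for g = 1 and
grows with g, which matches the intuition that small multisession
groupings approach the SPT risk profile.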
7.2. Interference
Interference concerns stem from the possibility that application
coupling can lead to the destabilization of the Internet routing
system in unanticipated and unexpected ways. In this section we
consider interference properties of the GPT and SPT models.
7.2.1. Multisession BGP
Multisession BGP also seeks to reduce the interference profile of the
GPT model by eliminating one potential source of interference,
namely, the potential interference due to the presence of multiple
AFI/SAFIs in a single BGP session. Following the analysis presented
in section 7.1.3, we can see that for small groupings (described as
small values of g in section 7.1.3), the interference profiles of
both models converge.
8. Application Fit
In the following sub-sections, application fit is examined from the
perspective of analyzing the data distribution needs of three
representative classes of application, namely:
RFC 2547 Style VPNs
VPWS
VPLS
8.1. RFC 2547 Style VPNs
First, it is useful to review the distribution mechanisms available
in BGP, in particular, in i-BGP. i-BGP has been described loosely as
a broadcast mechanism since an i-BGP speaker sends information to all
its peers. This is typically achieved by means of one or more route
reflectors; a more direct but less scalable means is for each i-BGP
speaker to have a BGP session with each i-BGP peer.
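The scaling difference between the two arrangements can be made
concrete with a small calculation. The sketch below counts sessions
for a full mesh and for a single route reflector; the function names
are hypothetical, and the single, non-redundant route reflector is a
simplifying assumption (deployments commonly use more than one).

```python
def full_mesh_sessions(n: int) -> int:
    """Sessions needed when each of n i-BGP speakers peers with every other."""
    return n * (n - 1) // 2


def single_reflector_sessions(n: int) -> int:
    """Sessions needed when n - 1 clients each peer with one route reflector."""
    return max(n - 1, 0)


if __name__ == "__main__":
    for n in (10, 100):
        print(n, full_mesh_sessions(n), single_reflector_sessions(n))
```

For 100 speakers this gives 4950 sessions in a full mesh against 99
with one route reflector.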
However, it is more accurate to characterize i-BGP as a constrained
broadcast mechanism. This is because the use of communities in
conjunction with import and export policies allows an i-BGP speaker
to effectively limit its communication to a subset of the full set of
i-BGP peers; the efficiency of constrained broadcast can be improved
by techniques such as described in [ORF] and [RTCONST].
There are five classes of information that need to be distributed for
RFC 2547 style VPNs:
(a). Membership (auto-discovery)
(b). Prefixes
(c). Labels
(d). BGP nexthop, and
(e). Path selection attributes
The first of these, membership or auto-discovery, must be sent to all
peers, as a BGP speaker does not know a priori which of its peers are
members of a given VPN. Membership of a given VPN is recognized by
the use of certain extended communities called Route Targets. BGP is
clearly eminently well-suited for this mode of distribution.
The next three of these constitute the reachability information.
They say what part of a given VPN (b) is reachable, and how it is to
be reached (c and d). The final piece of information is used for
selection if there are multiple paths to a given prefix of a VPN, as
in the case of multi-homing. All of these pieces of information need
only be distributed to members of the VPN, i.e., they require a
constrained broadcast mechanism. BGP is reasonably well-suited for
this mode of distribution using import and export NLRI filtering.
The addition of the mechanism in [RTCONST] makes BGP even better
suited to this.
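The import side of this constrained broadcast can be sketched as a
filter on Route Targets. The following is a minimal illustration, not
an implementation of [RFC2547BIS]: the VpnRoute record, its field
names, and the import_routes function are hypothetical, and Route
Targets are represented as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    prefix: str               # (b) reachable part of the VPN
    label: int                # (c) label
    next_hop: str             # (d) BGP nexthop
    route_targets: frozenset  # Route Target extended communities


def import_routes(routes, import_targets):
    """Keep only routes carrying at least one Route Target we import."""
    wanted = frozenset(import_targets)
    return [r for r in routes if r.route_targets & wanted]
```

A speaker that imports only one Route Target discards every route
that does not carry it, so the broadcast is received but not
retained. A mechanism such as [RTCONST] moves this filtering toward
the sender.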
The encoding of this information as defined in [RFC2547BIS] puts all
of this information in a single NLRI. This seems to imply that a
broadcast mechanism has to be used for the distribution of RFC 2547
VPN information. However, the combination of [RTCONST] and [RFC2918]
allows BGP to distribute this information correctly yet efficiently.
Finally, it is useful to observe that standard BGP path selection
mechanisms (local pref, MED, AS path length, etc.) can be applied to
the information in (e).
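A heavily simplified sketch of how these attributes might be compared
for a multi-homed VPN prefix follows. It assumes only three
tie-breakers, in the order listed above; the Path record and the
function names are hypothetical. The actual BGP decision process has
further steps, and MED is normally compared only among paths received
from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path: Tuple[int, ...]
    med: int


def preference_key(path: Path):
    """Higher local pref wins, then shorter AS path, then lower MED."""
    return (-path.local_pref, len(path.as_path), path.med)


def best_path(paths):
    """Return the most preferred path under the simplified ordering."""
    return min(paths, key=preference_key)
```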
The conclusion is that BGP is quite well-suited to this application,
and, with the addition of mechanisms such as [RTCONST] and [RFC2918],
the fit is even closer.
8.2. VPWS
A VPN based on a Virtual Private Wire Service [VPWS] connects a
number of sites by virtual wires (or pseudo-wires). The information
needed to create such a VPN comprises:
(a). Membership (auto-discovery)
(b). VPN site identification
(c). Labels
(d). BGP nexthop
(e). Path selection attributes, and
(f). Per-wire information
The analysis of the first five items is exactly as for RFC 2547 VPNs,
with the slight change that the definition of a 'part of a VPN' is no
longer an IP prefix, but is a VPN site identifier, which can be
viewed as the VPWS prefix. The distribution requirements and the fit
with BGP distribution mechanisms are identical to those for RFC 2547
VPNs.
The one major change is the potential for 'per-wire' attributes, such
as bandwidth for a given site-to-site connection. This information
should be distributed on a point-to-point basis. BGP mechanisms are
not efficient for point-to-point distribution. However, it is an
open question whether such 'per-wire' attributes really need to be
exchanged, as evidenced by the fact that LDP signaling for pseudo-
wires [MARTINI] has not defined any such attributes. If per-wire
information is indeed not necessary, BGP distribution mechanisms are
as well-suited for VPWS VPNs as for RFC 2547 VPNs.
Note that existing BGP path selection mechanisms can be used as is
for VPWS, and can prove useful for multi-homed sites.
8.3. VPLS
A VPLS connects a number of sites by an emulated LAN segment. The
information needed to create a VPLS consists of:
(a). Membership (auto-discovery)
(b). VPLS site identification
(c). Labels
(d). BGP nexthop, and
(e). Path selection attributes
The notion of 'VPLS site identification' is analogous to a VPN site
identifier for VPWS. The analysis of the distribution needs of these
five items is exactly as for RFC 2547 VPNs, and the conclusion is
that BGP is reasonably well-suited for this application, and with the
addition of [RTCONST] and [RFC2918], the fit is even better.
Note that existing BGP path selection mechanisms can be used as is
for VPLS, and can prove useful for multi-homed sites.
9. Operational Implications
10. Other Models
11. Conclusions and Recommendations
12. Intellectual Property
The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; neither does it represent that it
has made any effort to identify any such rights. Information on the
IETF's procedures with respect to rights in standards-track and
standards-related documentation can be found in BCP-11 [RFC2028].
Copies of claims of rights made available for publication and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementors or users of this
specification can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights which may cover technology that may be required to practice
this standard. Please address the information to the IETF Executive
Director.
13. Design Team
The design team that produced this document consisted of Daniel
Awduche (awduche@awduche.com), Ron Bonica (Ronald.P.Bonica@mci.com),
Hank Kilmer (hank@rem.com), Kireeti Kompella (kireeti@juniper.net),
Chris Lewis (chrlewis@cisco.com), Danny McPherson (danny@tcb.net),
David Meyer (dmm@1-4-5.net) and Peter Whiting
(pwhiting@vericenter.com).
14. Acknowledgments
David Ball, Peter Gutierrez, Susan Harris, Pedro Marques, Eric Rosen,
Pekka Savola, and Mark Townsley have all made many insightful
comments on earlier versions of this document.
15. Security Considerations
This document specifies neither a protocol nor an operational
practice, and as such, it creates no new security considerations.
16. IANA Considerations
This document creates no new requirements on IANA namespaces
[RFC2434].
17. References
17.1. Normative References
[AFI] http://www.iana.org/assignments/address-family-numbers
[BGP] Rekhter, Y., T. Li, and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", draft-ietf-idr-bgp4-23.txt.
Work in progress.
[BGPVPN] Ould-Brahim, H., E. Rosen, and Y. Rekhter, "Using
BGP as an Auto-Discovery Mechanism for
Provider-provisioned VPNs",
draft-ietf-l3vpn-bgpvpn-auto-00.txt. Work in
progress.
[CLARK] Clark, D., "Design Philosophy of the DARPA Internet
Protocols", Computer Communication Review, volume
25, number 1, January 1995. ISSN # 0146-4833.
[EXTCOMM] Sangli, S., D. Tappan, and Y. Rekhter, "BGP
Extended Communities Attribute",
draft-ietf-idr-bgp-ext-communities-06.txt. Work
in progress.
[FLOW] Marques, P., et al., "Dissemination of flow
specification rules",
draft-marques-idr-flow-spec-00.txt. Work in
progress.
[L2TPv3] Lau, J., M. Townsley and I. Goyret (Editors),
"Layer Two Tunneling Protocol (Version
3)", draft-ietf-l2tpext-l2tp-base-11.txt. Work in
progress.
[MARTINI] Martini, L., E. Rosen, and T. Smith, "Pseudowire
Setup and Maintenance using LDP",
draft-ietf-pwe3-control-protocol-05.txt. Work in
progress.
[MULLER1999] Muller, R., et al., "Control System Reliability
Requires Careful Software Installation
Procedures", International Conference on
Accelerator and Large Experimental
Physics Systems, 1999, Trieste, Italy.
[MULTISESSION] Scudder, J. and C. Appanna, "Multisession BGP",
draft-scudder-bgp-multisession-00.txt. Work in
progress.
[ORF] Chen, E., and Rekhter, Y., "Cooperative Route
Filtering Capability for BGP-4",
draft-ietf-idr-route-filter-09.txt. Work in
progress.
[RTCONST] Bonica, R., et al., "Constrained VPN route
distribution",
draft-marques-ppvpn-rt-constrain-01.txt. Work in
progress.
[SOFTNOTIFY] Nalawade, G., K. Patel, J. Scudder, and D. Ward,
"BGPv4 Soft-Notification Message",
draft-nalawade-bgp-soft-notify-00.txt. Work in
progress.
[RFC1075] Waitzman, D., C. Partridge, and S. Deering,
"Distance Vector Multicast Routing Protocol", RFC
1075, November, 1988.
[RFC1142] Oran, D. Editor, "OSI IS-IS Intra-domain Routing
Protocol", RFC 1142, February, 1990.
[RFC1771] Rekhter, Y., and T. Li, "A Border Gateway
Protocol 4 (BGP-4)", RFC 1771, March 1995.
[RFC1958] Carpenter, B., Editor, "Architectural principles of
the Internet", RFC 1958, June 1996.
[RFC1997] Chandra, R., P. Traina, and T. Li, "BGP
Communities Attribute", RFC 1997, August, 1996.
[RFC2138] Rigney, C., et al., "Remote Authentication Dial
In User Service (RADIUS)", RFC 2138, April, 1997.
[RFC2328] Moy, J., "OSPF Version 2", RFC 2328, April, 1998.
[RFC2453] Malkin, G., "RIP Version 2", RFC 2453, November,
1998.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol,
Version 6 (IPv6) Specification", RFC 2460,
December, 1998.
[RFC2547BIS] Rosen, E., et al., "BGP/MPLS IP VPNs",
draft-ietf-l3vpn-rfc2547bis-00.txt. Work in
progress.
[RFC2858] Bates, T., et al., "Multiprotocol Extensions
for BGP-4", RFC 2858, June 2000.
[RFC2918] Chen, E., "Route Refresh Capability for BGP-4",
RFC 2918, September 2000.
[RFC3036] Andersson, L., et al., "LDP Specification", RFC
3036, January 2001.
[RFC3439] Bush, R. and D. Meyer, "Some Internet
Architectural Guidelines and Philosophy", RFC
3439, December, 2002.
[SAFI] http://www.iana.org/assignments/safi-namespace
[VPLS] Kompella, K., et al., "Virtual Private LAN
Service", draft-ietf-l2vpn-vpls-bgp-02.txt.
Work in progress.
[VPWS] Kompella, K., et al., "Layer 2 VPNs Over Tunnels",
draft-kompella-ppvpn-l2vpn-04.txt. Work in
progress.
17.2. Informative References
[IETFOL] https://www1.ietf.org/mailman/listinfo/routing-discussion
[RFC2119] Bradner, S., "Key words for use in RFCs to
Indicate Requirement Levels", RFC 2119, March,
1997.
[RFC2026] Bradner, S., "The Internet Standards Process --
Revision 3", RFC 2026/BCP 9, October, 1996.
[RFC2028] Hovey, R. and S. Bradner, "The Organizations
Involved in the IETF Standards Process", RFC
2028/BCP 11, October, 1996.
[RFC2434] Narten, T., and H. Alvestrand, "Guidelines for
Writing an IANA Considerations Section in RFCs",
RFC 2434/BCP 26, October 1998.
[RVBIB] http://www.routeviews.org/papers
[WILLINGER2002] Willinger, W., and J. Doyle, "Robustness and the
Internet: Design and evolution", 2002.
18. Editor's Address
David Meyer
Email: dmm@1-4-5.net
19. Full Copyright Statement
Copyright (C) The Internet Society (2004). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.