Network Working Group Yiqun Cai
Internet Draft Mike McBride
Expiration Date: April 2007 Chris Hall
Maria Napierala
October 2006
Multicast VPN Deployment Recommendations
draft-ycai-mboned-mvpn-deploy-00.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
This document is an Internet-Draft and is in full conformance with
all provisions of RFC 3978/3979.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Cai, McBride, et al. [Page 1]
Internet Draft draft-ycai-mboned-mvpn-deploy-00.txt October 2006
Abstract
Multicast VPN based on early standards has been in operation in
production networks for several years now. This document describes
some of the experience gained from implementation and deployment and
as such is informational only.
Table of Contents
1 Introduction ....................................... 3
2 Implementation ..................................... 3
2.1 RPF ................................................ 3
2.2 MTU ................................................ 4
2.3 Load Balancing ..................................... 4
2.4 MTRACE ............................................. 4
3 Operational Experience ............................. 5
3.1 Multicast VPN Design Considerations ................ 5
3.2 PIM Modes For MI-PMSI .............................. 5
3.2.1 PIM-SSM for MI-PMSI ................................ 6
3.2.2 PIM-SM for MI-PMSI ................................. 6
3.3 PIM Modes For S-PMSI ............................... 6
3.4 CE to PE PIM Modes ................................. 7
3.5 Timer Alignment .................................... 7
3.6 MDT SAFI ........................................... 7
3.7 Addressing ......................................... 8
3.8 Filtering .......................................... 8
3.9 Scalability ........................................ 8
3.10 MPLS and IP ........................................ 9
3.11 QOS ................................................ 9
4 Security Considerations ............................ 9
5 IANA Considerations ................................ 9
6 Acknowledgments .................................... 9
7 Normative References ............................... 10
8 Informative References ............................. 10
9 Authors' Addresses ................................. 10
10 Full Copyright Statement ........................... 11
11 Intellectual Property .............................. 11
1. Introduction
Multicast support for L3 VPNs based on RFC2547 [2547bis] was first
presented at the San Diego IETF in 2000. It remained an individual
submission and was not included in the charter of the L3VPN (formerly
PPVPN) working group until the San Diego IETF in 2004. During that
time, the drafts, known as "rosen-draft" [ROSEN-8], continued to
evolve. Several vendors provided implementations based on the drafts,
and service providers started deploying the solution in production
networks. Limited interoperability testing has also been done.
Since the working group officially accepted the challenge to define a
solution or solutions to support multicast, several proposals have
been suggested. They are now captured in [MVPN] which forms a base
for future standards work.
The unique history of multicast support in L3VPN, in which
implementation and deployment started well before the IETF adopted
the work, has caused some confusion. This is only natural with any
pre-standard work.
In this document, we describe some of the lessons learned from
implementing and deploying MVPN. We hope it will benefit
implementors as well as network operators looking to deploy MVPN
services.
2. Implementation
As of this writing, there are two known implementations: IOS from
Cisco and JunOS from Juniper Networks. Contact these vendors for
implementation details beyond what is provided in this draft. The
following sections describe common MVPN deployment considerations.
2.1. RPF
[MVPN], as well as the early "rosen-drafts", specifies that the
source address of any PIM packet a PE router generates over the
MI-PMSI (or MDT tunnel) be the same as the BGP next hop of the
updates that PE router originates for the multicast traffic sources
in its site. However, it was discovered that one implementation did
not do so, which caused interoperability problems. The symptom is
that a PE router cannot resolve the RPF neighbour towards a source
connected to a remote PE router. Recent OS versions are expected to
interoperate.
2.2. MTU
When GRE encapsulation is used in the core, 24 bytes are added to the
IP packets generated in the VPNs. Due to the lack of a path MTU
discovery mechanism for multicast, a PE router may have to fragment
the incoming packets.
The best practice is to fragment the packets before performing any
GRE encapsulation. This spares the egress PE routers from
reassembling the fragments and leaves reassembly to the end systems.
This does not work if the "DF" bit is set in the original packet,
since such a packet cannot be fragmented and will be dropped.
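The arithmetic above can be sketched as follows. This is a minimal
illustration and not any vendor's implementation. The 24-byte
overhead is the figure given above; the 1500-byte core MTU and the
function names are assumptions made only for the example.

```python
GRE_OVERHEAD = 24  # bytes added by GRE encapsulation, as stated in the text


def max_inner_packet(core_mtu: int) -> int:
    """Largest customer packet that still fits in the core once encapsulated."""
    return core_mtu - GRE_OVERHEAD


def pre_encap_action(packet_len: int, df_set: bool, core_mtu: int = 1500) -> str:
    """Decide what the ingress PE does with a customer packet before GRE.

    core_mtu=1500 is an assumed value for illustration only.
    """
    if packet_len <= max_inner_packet(core_mtu):
        return "encapsulate"
    if df_set:
        # The packet may not be fragmented, so it is dropped.
        return "drop"
    # Fragment first, so that the end systems reassemble, not the egress PE.
    return "fragment-then-encapsulate"
```

With the assumed 1500-byte core MTU, a customer packet of up to 1476
bytes can be encapsulated without fragmentation.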
2.3. Load Balancing
Some vendors implement a feature called "EIBGP load balancing", which
installs multiple routes learned from both EBGP and IBGP in the VRF
unicast routing table. When this is enabled on all PE routers,
multicast RPF may be affected if the RPF procedure also supports load
balancing.
The best practice is to make sure that the multicast RPF procedure
selects only EBGP paths when both EBGP and IBGP paths are present.
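The selection rule above can be sketched as follows. The Path type,
the "ebgp"/"ibgp" labels and the tie-break among equal candidates are
hypothetical and chosen only for illustration; real implementations
have their own data structures and tie-break rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str
    learned_via: str  # "ebgp" or "ibgp" (hypothetical labels)


def select_rpf_path(paths: List[Path]) -> Optional[Path]:
    """Pick one RPF path, using only EBGP paths whenever any EBGP path exists."""
    if not paths:
        return None
    ebgp = [p for p in paths if p.learned_via == "ebgp"]
    candidates = ebgp if ebgp else paths
    # Assumed deterministic tie-break: lowest next-hop string.
    return min(candidates, key=lambda p: p.next_hop)
```

The point of the sketch is only that IBGP paths are ignored for RPF
as soon as an EBGP path is installed for the same prefix.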
2.4. MTRACE
MTRACE is a tool that allows a network operator to obtain multicast
routing information from routers, and to explore a path to the source
of the traffic or the RP.
Since there is no security mechanism embedded in the protocol, some
service providers expressed concern that the MTRACE packet has to
traverse the PE routers in order to obtain the full information.
At this moment, vendors have their own mechanisms to remove, or hide,
certain fields in the MTRACE packets in order to satisfy the needs of
their customers. A better mechanism needs to be defined for use of
the protocol in MVPN.
3. Operational Experience
3.1. Multicast VPN Design Considerations
When deploying a multicast VPN service, providers try to optimize
multicast traffic distribution and delays while reducing the amount
of state. The following considerations have given MVPN providers
direction in their MVPN deployment:
+ Core multicast routing states should typically be kept to a
minimum
+ MVPN packet delays should typically be the same as unicast
traffic
+ Data should typically be sent only to PEs with interested
receivers
3.2. PIM Modes For MI-PMSI
In [ROSEN-8], the "MI-PMSI" is also known as the default MDT, which
is used to build an overlay network connecting all PE routers
attached to the same MVPN.
Service providers have implemented MI-PMSIs instantiated with both
PIM-SM and PIM-SSM in production networks. When PIM-SSM is used, BGP
based auto-discovery based on [ROSEN-8] has also been deployed. The
majority of current default MDT deployments use PIM-SM with a
statically assigned Anycast RP and MSDP. A dynamic RP discovery
protocol, such as BSR, could also be used.
The decision to deploy either PIM-SM or PIM-SSM is based on the
following concerns:
+ the number of multicast routing states
+ the overhead of managing the RP if PIM-SM is used
+ the difference of forwarding delay between shared tree and source
trees
3.2.1. PIM-SSM for MI-PMSI
Optimal MVPN forwarding is most easily achievable when there is a
single multicast tree per MVPN per PE. Such trees are naturally built
with PIM-SSM since it permits the PE to directly join a source tree
for an MDT. With PIM-SSM, no Rendezvous Points are required. With
SSM, however, all PEs on an MVPN tree need to maintain source state.
Each PE participating in an MVPN is a source. Unless VPN customers
locate their multicast sources within a constrained set of sites, SSM
may become a scalability concern in the service provider's network.
Aggregating multiple VPNs into a single multicast tree might be
necessary to reduce state.
3.2.2. PIM-SM for MI-PMSI
One solution to minimize the amount of multicast state in an MVPN
environment is to configure PIM-SM to stay on the shared tree or to
configure bi-directional (BIDIR) PIM. With shared trees, multicast
state scalability is no longer a function of the number of PEs but
rather of the number of VPNs.
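The difference in default MDT state between source trees and shared
trees can be estimated with a rough sketch. It assumes a core router
that sits on every tree, one default MDT group per MVPN, and no Data
MDTs; the function names and the example figures are hypothetical.

```python
from typing import Dict


def ssm_state_upper_bound(pes_per_vpn: Dict[str, int]) -> int:
    """One (S,G) entry per PE per MVPN, since every PE is a source."""
    return sum(pes_per_vpn.values())


def shared_tree_state(pes_per_vpn: Dict[str, int]) -> int:
    """One shared-tree entry per MVPN, independent of the number of PEs."""
    return len(pes_per_vpn)


# Hypothetical example: three MVPNs with 100, 20 and 5 PEs.
example = {"vpn-a": 100, "vpn-b": 20, "vpn-c": 5}
ssm_entries = ssm_state_upper_bound(example)
shared_entries = shared_tree_state(example)
```

In this example the source-tree estimate is 125 entries against 3
for shared trees, which is the trade-off the two sections describe.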
The scale benefit of shared trees comes at the cost of less efficient
multicast distribution. MVPN providers use Data MDTs as defined in
[ROSEN-8] to achieve bandwidth optimality. MVPN providers may address
the sub-optimality of shared tree forwarding by deploying an RP at
the best location for each VPN. Such an assignment would be based on
the VPN source locations, which may be difficult to maintain.
3.3. PIM Modes For S-PMSI
In [ROSEN-8], the "S-PMSI" is also known as the data MDT.
Data MDTs have also been deployed by service providers. Both PIM-SM
and PIM-SSM are used. As of this writing, the switch from MI-PMSI to
S-PMSI is based on traffic rate, which is what implementations
support today. The majority of data MDT deployments today use SSM
since the source address is included in the PIM Hello packet sent
from the source PE, i.e., no overlay signalling is necessary.
MVPN providers deploy Data MDTs (S-PMSI) to achieve optimal bandwidth
usage, especially when SSM is deployed as well. S-PMSIs are
optimized for active sources and receivers and are triggered per
(S,G) for a subset of the (S,G) states of a given VPN. Since Data
MDTs are triggered by (S,G) states in a VPN, they could increase the
amount of multicast state in an MVPN network.
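A rate-based switch from the MI-PMSI to an S-PMSI can be sketched as
follows. The threshold value, its unit (kbps) and the strict
"greater than" comparison are assumptions for illustration; actual
implementations define their own configuration and measurement.

```python
from typing import Dict, List, Tuple

SG = Tuple[str, str]  # (customer source, customer group)


def flows_for_data_mdt(rates_kbps: Dict[SG, float], threshold_kbps: float) -> List[SG]:
    """Return the (S,G) flows whose measured rate exceeds the threshold."""
    return sorted(sg for sg, rate in rates_kbps.items() if rate > threshold_kbps)
```

Each flow returned would be moved to its own data MDT, so the length
of the list is also a rough count of the extra state created in the
core.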
3.4. CE to PE PIM Modes
The PIM protocols deployed within the customer VPN are independent of
the PIM protocols in use within the provider core. Customers can
choose to deploy PIM-DM, PIM-SM, Bidir, or SSM. With SM or Bidir,
customers may choose to deploy the RP on either a PE or a CE router.
It is recommended to have a CE router serve as the RP to avoid
additional burden on a PE. We have, however, seen RPs deployed on PEs
as well as on CE routers. If a customer desires a managed RP, they
may consider having the service provider manage their CE and have it
serve as the RP. To avoid managing an RP altogether, SSM should be
deployed. Deploying PIM-DM is not recommended, and at least one
implementation does not switch to Data MDTs (S-PMSI) upon receipt of
customer PIM-DM traffic.
3.5. Timer Alignment
When PIM-SM is used for both MI-PMSI and S-PMSI, some interesting
observations were made.
For example, when BSR is used in the service provider network to
discover RPs, it takes more than 3 minutes to detect the failure of
an RP if the default timers are used. During this window, PIM Hellos
originated by C-PIM instances will be dropped, which causes PIM
adjacencies to be torn down. But since the default PIM Hello timer is
30 seconds, the C-PIM instance on a PE router detects an outage much
faster than the P-PIM instance on the same PE router. This is also a
factor to be considered when choosing the protocol for RP redundancy.
One option, when using BSR, is to use it only for RP discovery and
then utilize Anycast-RP for RP redundancy.
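The timer mismatch can be made concrete with a small sketch. The
30-second Hello timer and the 3-minute detection time come from the
text above (the latter is treated as a lower bound). The 3.5x
holdtime multiplier is the PIM-SM default and is not stated in this
document; the names are hypothetical.

```python
HELLO_PERIOD_S = 30                      # default PIM Hello timer, per the text
HELLO_HOLDTIME_S = 3.5 * HELLO_PERIOD_S  # PIM-SM default holdtime: 105 seconds
RP_FAILURE_DETECT_S = 180                # "more than 3 minutes", used as a lower bound


def c_pim_times_out_first(hello_holdtime_s: float = HELLO_HOLDTIME_S,
                          rp_detect_s: float = RP_FAILURE_DETECT_S) -> bool:
    """True if C-PIM neighbours expire before the core detects the RP failure."""
    return hello_holdtime_s < rp_detect_s
```

With the defaults, C-PIM adjacencies expire after 105 seconds, well
before the core has detected the RP failure. A faster RP redundancy
mechanism reverses the comparison.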
3.6. MDT SAFI
Prior to [MDT SAFI], the PE BGP VPNv4 prefix update was sent using an
extended community using RD type 2. With the introduction of [MDT
SAFI], the update is sent with RD type 0. For backward compatibility,
BGP allows sending RD type 2 updates to peers unable to understand
the new MDT SAFI. It is our experience, and recommendation, that
customers run either all routers with the MDT address family or all
routers with the pre-MDT-SAFI encoding, to prevent any BGP update
conflicts.
3.7. Addressing
It has become general practice to use the 239/8 administratively
scoped address space when assigning group addresses to MVPNs. This
helps to prevent leaking VPN traffic outside the MVPN core and helps
keep customer data private. When SSM is used, 239.232/16 addressing
is the common practice, within the administratively scoped space of
RFC 2365, Administratively Scoped IP Multicast. Operators typically
deploy an addressing tool to manage their addresses.
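A simple check of the kind an addressing tool might perform is
sketched below. The two prefixes are those named above; the function
name and the returned labels are hypothetical.

```python
import ipaddress

ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")
SSM_PRACTICE = ipaddress.ip_network("239.232.0.0/16")


def classify_mdt_group(group: str) -> str:
    """Classify an MDT group address against the ranges named in the text."""
    addr = ipaddress.ip_address(group)
    if addr in SSM_PRACTICE:
        return "admin-scoped-ssm"
    if addr in ADMIN_SCOPED:
        return "admin-scoped"
    return "outside-239/8"
```

A provisioning workflow could reject any MDT group for which the
result is "outside-239/8".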
3.8. Filtering
It may be necessary to modify existing filters to permit GRE and UDP
port 3232 so that default and data MDT group traffic can pass.
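The filter rule can be expressed as a small predicate. GRE is IP
protocol 47 and UDP is IP protocol 17. Matching on the UDP
destination port is an assumption, since the text above names only
the port number; the function name is hypothetical.

```python
from typing import Optional

IPPROTO_GRE = 47          # IP protocol number for GRE
IPPROTO_UDP = 17          # IP protocol number for UDP
MDT_UDP_PORT = 3232       # UDP port named in the text


def must_permit(ip_proto: int, udp_dst_port: Optional[int] = None) -> bool:
    """True if a filter needs to let this packet through for MVPN to work."""
    if ip_proto == IPPROTO_GRE:
        return True
    return ip_proto == IPPROTO_UDP and udp_dst_port == MDT_UDP_PORT
```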
3.9. Scalability
MVPN defines the use of PIM across the default MDT. The number of PIM
Hellos and join/prune messages increases with the number of PEs
participating in that default MDT. There have been no scaling issues
in the current deployments of MVPN. Currently, MVPN deployments
consist of up to a few hundred sites per MVPN. The number of PEs
participating in a default MDT continues to increase as customers
extend the multicast group participation to additional VPN sites.
There are unicast VPN customers with several thousand sites. These
sites are gradually becoming multicast enabled.
At some level of scaling of the default MDT, PIM Hellos and J/P
messages may become a scaling issue. The scaling point at which these
messages become a real operational problem is not clear. Empirical
field data shows they do not affect the broad range of MVPN
deployments today. [ROSEN-8] is scalable as specified across a wide
range of deployments. Some analysis is needed to clarify at what
operational level PIM messages do become a problem. The L3VPN WG has
gathered requirements information in [MORIN]. A benchmarking draft
[DRY] has been submitted to the BMWG to provide consistent MVPN test
methodology. The PIM WG is evaluating methods to decrease PIM
messages when this becomes of operational value.
Increasing the Hello timer and increasing the periodic join/prune
timer may help in MVPN scaling. Doing so, however, may affect join
and leave latency in times when control messages are lost. OAM, to
verify the health of the data and control paths, would also be
affected if the Hello timer were increased or removed altogether.
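The effect of the Hello timer on control traffic can be estimated
with a rough sketch. It counts only Hellos received by one PE on one
default MDT and ignores join/prune messages; the function name and
the example figures are hypothetical.

```python
def hellos_received_per_second(num_pes: int, hello_period_s: float = 30.0) -> float:
    """Hellos one PE receives per second from its peers on one default MDT."""
    if num_pes < 1:
        raise ValueError("num_pes must be at least 1")
    if hello_period_s <= 0:
        raise ValueError("hello_period_s must be positive")
    return (num_pes - 1) / hello_period_s
```

For example, with 301 PEs on a default MDT, a PE receives 10 Hellos
per second at the default 30-second timer and 5 per second if the
timer is doubled. A PE attached to several MVPNs sees this load once
per MVPN.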
3.10. MPLS and IP
Though the majority of MVPN deployments are over an MPLS core, there
have been deployments in both MPLS and IP cores. We have seen L2TPv3
tunneling used successfully for transporting MVPN GRE across an IP
core within a VRF.
3.11. QOS
MVPN deployments that use QOS apply the same QOS mechanisms to the
MVPN GRE header that they apply to their other data traffic. VPN
customers may want to separate the queuing of multicast data from
unicast data. Service providers are extending their QOS portfolio to
support more classes of service to allow for better separation of
multicast and unicast traffic. Enhanced QOS mechanisms support
applications that have short bursts but require bounded delay (such
as video streaming). Since multicast (UDP) traffic might not be
subject to the same drop behavior as TCP traffic, QOS profiles
support Weighted Random Early Detection (WRED) treatment.
4. Security Considerations
This document has no known security implications.
5. IANA Considerations
This document creates no new requirements on IANA namespaces.
6. Acknowledgments
We'd like to thank Dino Farinacci, Yuji Kamite and Hitoshi Fukuda for
their feedback on this draft.
7. Normative References
[2547bis] "BGP/MPLS VPNs", Rosen, Rekhter, et. al., September 2003,
draft-ietf-l3vpn-rfc2547bis-01.txt
[MVPN] "Multicast in MPLS/BGP IP VPNs", Rosen, Aggarwal, May 2005,
draft-ietf-l3vpn-2547bis-mcast-00.txt
8. Informative References
[ROSEN-8] E. Rosen, Y. Cai, I. Wijnands, "Multicast in MPLS/BGP IP
VPNs", draft-rosen-vpn-mcast-08.txt
[MVPN-PIM] R. Aggarwal, A. Lohiya, T. Pusateri, Y. Rekhter, "Base
Specification for Multicast in MPLS/BGP VPNs", draft-raggarwa-l3vpn-
2547-mvpn-00.txt
[RAGGARWA-MCAST] R. Aggarwal, et al., "Multicast in BGP/MPLS VPNs
and VPLS", draft-raggarwa-l3vpn-mvpn-vpls-mcast-01.txt
[RP-MVPN] S. Yasukawa, et. al., "BGP/MPLS IP Multicast VPNs", draft-
yasukawa-l3vpn-p2mp-mcast-00.txt
[MDT SAFI] G. Nalawade, et. al., "MDT SAFI", draft-nalawade-idr-mdt-
safi-02.txt
[MORIN] T. Morin, "Requirements for Multicast in L3 Provider-
Provisioned VPNs", draft-ietf-l3vpn-ppvpn-mcast-reqts-09.txt
[DRY] S. Dry, "Multicast VPN Scalability Benchmarking", draft-sdry-
bmwg-mvpnscale-00.txt
9. Authors' Addresses
Yiqun Cai
ycai@cisco.com
Mike McBride
mmcbride@cisco.com
Chris Hall
chall@sprint.net
Maria Napierala
mnapierala@att.com
10. Full Copyright Statement
Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
11. Intellectual Property
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.