Network Working Group Yiqun Cai
Internet Draft Mike McBride
Intended Status: Informational Cisco
Expires: August 2008
Chris Hall
Sprint
Maria Napierala
AT&T
Wim Henderickx
Alcatel-Lucent
February 2008
PIM Based MVPN Deployment Recommendations
draft-ycai-mboned-mvpn-pim-deploy-02.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Cai, et al. [Page 1]
Internet Draft draft-ycai-mboned-mvpn-pim-deploy-02.txt February 2008
Copyright Notice
Copyright (C) The IETF Trust (2008).
Abstract
Multicast VPN, based on pre-standard drafts, has been in operation in
production networks for many years. This document describes some of
the practices and experiences gained from implementation and
deployment of MVPN using PIM with GRE tunnels. It is informational
only.
Table of Contents
1 Introduction
2 Implementation
2.1 RPF
2.2 MTU
2.3 EIBGP Load Balancing
2.4 MTRACE
3 Operational Experience
3.1 Multicast VPN Design Considerations
3.2 PIM Modes For MI-PMSI
3.2.1 PIM-SSM for MI-PMSI
3.2.2 ASM for MI-PMSI
3.3 PIM Modes For S-PMSI
3.4 CE to PE PIM Modes
3.5 Timer Alignment
3.6 Addressing
3.7 Filtering
3.8 Scalability
3.9 QOS
4 Security Considerations
5 IANA Considerations
6 Acknowledgments
7 Normative References
8 Informative References
9 Authors' Addresses
10 Full Copyright Statement
11 Intellectual Property
1. Introduction
Multicast support for L3VPN based on RFC2547 [2547bis] was first
presented at the San Diego IETF in 2000. It was not included in the
charter of the L3VPN (formerly PPVPN) working group until the San
Diego IETF in 2004; until then it remained an individual submission.
The MVPN draft continued to evolve as pre-standards work. Several
vendors provided implementations based on this solution. Service
providers began deploying this MVPN solution in production networks.
Since the working group officially accepted the challenge to define a
solution or solutions to support multicast, several proposals have
been made. They are now captured in [MVPN], which forms a base for
future standards work.
This document provides MVPN deployment experience based solely on the
original PIM and GRE based MVPN solution. This solution is now
outlined as one of the options in the standards track [MVPN]
document.
In this document, we describe some of the lessons learned from
implementing and deploying MVPN. We hope it will benefit
implementors as well as network operators looking to deploy MVPN
services. Throughout the document, where the term "MVPN" is used, the
reference is to the original MVPN deployment based upon PIM and GRE
tunnels.
2. Implementation
There are three known MVPN implementations: IOS from Cisco, JunOS
from Juniper Networks and TiMOS from Alcatel-Lucent. Contact these
vendors for implementation details beyond what is provided in this
draft. The following sections describe common MVPN deployment
considerations.
2.1. RPF
[MVPN] specifies that the source address of any PIM packets that a PE
router generates over the MDT tunnel must be the same as the BGP
nexthop for updates originated by the PE router for all multicast
traffic sources existing in the site. Otherwise, a PE router will not
resolve the RPF neighbour towards the source connected to a remote PE
router.
A PE needs to have a particular IP address which it uses in both the
IP source address field of the PIM packet and the next hop field of
the BGP updates. If this requirement is overlooked, RPF
determination may fail. This has caused interoperability problems in
the past and implementors should be careful about it in the future.
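The requirement can be illustrated with a minimal Python sketch. The
function name and all addresses are hypothetical and are not taken
from any implementation; the sketch only shows why the RPF lookup
fails when the PIM source address on the MDT tunnel differs from the
BGP next hop.

```python
from ipaddress import ip_address


def resolve_rpf_neighbor(bgp_next_hop, mdt_pim_neighbors):
    """Return the PIM neighbor on the MDT tunnel whose address equals the
    BGP next hop of the route towards the source, or None if none matches."""
    next_hop = ip_address(bgp_next_hop)
    for neighbor in mdt_pim_neighbors:
        if ip_address(neighbor) == next_hop:
            return neighbor
    return None


# Hypothetical case 1: the remote PE uses 192.0.2.1 both as BGP next hop
# and as the source address of its PIM packets on the MDT tunnel.
rpf_ok = resolve_rpf_neighbor("192.0.2.1", ["192.0.2.1", "192.0.2.2"])

# Hypothetical case 2: the remote PE sources its PIM packets from a
# different address (192.0.2.9), so no neighbor matches the next hop.
rpf_bad = resolve_rpf_neighbor("192.0.2.1", ["192.0.2.9", "192.0.2.2"])
```

In the second case the lookup returns None, which corresponds to the
RPF failure described above.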
2.2. MTU
When GRE encapsulation is used in the core, 24 bytes are added to the
IP packets generated in the VPNs. Due to the lack of a path MTU
discovery mechanism for multicast, a PE router may have to fragment
the incoming packets.
The best practice is to fragment the packets before performing any
GRE encapsulation. This spares the egress PE routers from
reassembling the fragments, and leaves that for the end-systems.
This approach does not work if the "DF" bit is set in the original
packet, since the packet will be dropped. It is best to ensure that
the backbone does not have any links whose MTU is only 1500 bytes, so
that 1500 byte customer packets still fit after encapsulation.
It is further recommended to read Section 5.1 of [Worster] for a more
detailed look at preventing fragmentation and reassembly.
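The decision an ingress PE faces can be sketched as follows. This is
an illustration only: the function name and return strings are
hypothetical, and the only value taken from the text is the 24 byte
GRE overhead.

```python
GRE_OVERHEAD = 24  # bytes added by GRE encapsulation, as stated above


def handle_customer_packet(packet_len, df_set, core_mtu):
    """Decide what an ingress PE does with a customer IP packet.

    packet_len: length of the customer packet in bytes
    df_set:     True if the DF bit is set in the customer packet
    core_mtu:   smallest MTU on the core path, in bytes (assumed known)
    """
    if packet_len + GRE_OVERHEAD <= core_mtu:
        return "encapsulate"
    if df_set:
        # The packet cannot be fragmented, so it is dropped.
        return "drop"
    # Fragment first so that the end system, not the egress PE,
    # reassembles the fragments.
    return "fragment-then-encapsulate"


small_core = handle_customer_packet(1500, df_set=False, core_mtu=1500)
small_core_df = handle_customer_packet(1500, df_set=True, core_mtu=1500)
large_core = handle_customer_packet(1500, df_set=False, core_mtu=1524)
```

With a core MTU of at least 1524 bytes, a 1500 byte customer packet
is encapsulated without fragmentation regardless of the DF bit.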
2.3. EIBGP Load Balancing
External and Internal Border Gateway Protocol (EIBGP) load sharing is
an enhancement to BGP that enables load sharing over parallel links
between CE and PE routers. EIBGP enables service providers to share
customer traffic loads over parallel paths within an MPLS core
network.
When EIBGP load balancing is enabled on all PE routers, we have seen
that the multicast RPF check inside the VRF may be affected if the
path towards the source resolves via iBGP. The best practice is to
ensure that, when both eBGP and iBGP routes are present, multicast
RPF selects eBGP paths only.
Example:
                       +---+
                ------ |PE1|
          +--+         +---+
source -- |CE|
          +--+         +---+
                ------ |PE2|
                       +---+
With EIBGP configured, PE1 will have two paths towards the source,
one directly via the CE using eBGP, and one via PE2 using iBGP.
By default PIM will pick the neighbor with the highest IP address. If
this happens to be PE2, the RPF check will fail as it will use the
global table.
A workaround is either to configure a static mroute on PE1 pointing
towards the CE router, or to make sure that the CE has a higher IP
address than PE2. The better solution is additional logic in the RPF
code so that, where EIBGP is used, the eBGP link is preferred.
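The difference between the default behaviour and the preferred logic
can be sketched as below. The function, the path tuples and the
addresses are hypothetical; real RPF code in any of the
implementations will differ.

```python
from ipaddress import ip_address


def select_rpf_path(paths, prefer_ebgp=True):
    """Pick the RPF path from a list of (neighbor_address, kind) tuples,
    where kind is "ebgp" or "ibgp".

    Without the preference, the highest neighbor address wins. With it,
    iBGP paths are ignored whenever at least one eBGP path exists.
    """
    candidates = paths
    if prefer_ebgp:
        ebgp_paths = [p for p in paths if p[1] == "ebgp"]
        if ebgp_paths:
            candidates = ebgp_paths
    return max(candidates, key=lambda p: ip_address(p[0]))


# Hypothetical view from PE1: the CE (eBGP) has a lower address than
# PE2 (iBGP).
paths_on_pe1 = [("10.1.1.2", "ebgp"), ("10.255.0.2", "ibgp")]

default_choice = select_rpf_path(paths_on_pe1, prefer_ebgp=False)
preferred_choice = select_rpf_path(paths_on_pe1)
```

Here the default choice is the iBGP path via PE2, which is the
failure case described above, while the eBGP preference selects the
CE.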
2.4. MTRACE
MTRACE is a tool that allows a network operator to obtain multicast
routing information from routers and to explore a path to the source
of the traffic or the RP.
Since there is no security mechanism embedded in MTRACE, some service
providers express concern when the mtrace packet has to traverse the
PE routers in order to obtain the full information.
Vendors have their own mechanisms to remove, or hide, certain fields
in the MTRACE packets in order to satisfy the needs of their
customers. We need to define a better mechanism for MTRACE in an MVPN
environment.
3. Operational Experience
3.1. Multicast VPN Design Considerations
When deploying a multicast VPN service, providers try to optimize
multicast traffic distribution and delays while reducing the amount
of state. The following considerations have given MVPN providers
direction in their MVPN deployment:
+ Core multicast routing states should typically be kept to a
minimum
+ MVPN packet delays should typically be the same as unicast
traffic
+ Data should typically be sent only to PEs with interested
receivers
3.2. PIM Modes For MI-PMSI
The MI-PMSI is used to build an overlay network connecting all PE
routers attaching to the same MVPN.
Service providers have implemented MI-PMSIs instantiated with PIM-SM
and with PIM-SSM in production networks. The majority of MI-PMSI
deployments use PIM-SM with statically assigned Anycast RP using
MSDP, but a dynamic RP discovery protocol, such as BSR, is also used.
The decision to deploy either PIM-SM or PIM-SSM is based on the
following concerns:
+ the number of multicast routing states
+ the overhead of managing the RP if PIM-SM is used
+ the difference of forwarding delay between shared tree and source
trees
3.2.1. PIM-SSM for MI-PMSI
Optimal MVPN forwarding is most easily achievable when there is a
single multicast tree per MVPN per PE. Such trees are naturally built
with PIM-SSM since it permits the PE to directly join a source tree
for an MDT. With PIM-SSM, no Rendezvous Points are required. With
SSM, however, all PEs on an MVPN tree need to maintain source state.
Each PE participating in an MVPN is a source. Unless VPN
customers locate their multicast sources within a constrained set of
sites, SSM may become a scalability concern in the service provider's
network.
3.2.2. ASM for MI-PMSI
One solution to minimize the amount of multicast state in an MVPN
environment is to configure PIM-SM or BIDIR PIM to stay on the shared
tree. With shared trees, multicast state scalability is no longer a
function of the number of PEs but rather of the number of VPNs.
The scale benefit of shared trees comes at the cost of less efficient
multicast distribution. MVPN providers use the MI-PMSI to achieve
bandwidth optimality. MVPN providers may address the sub-optimality
of shared tree forwarding by deploying an RP at the best location for
each VPN. Such an assignment would be based on the VPN source
locations, something which may be difficult to maintain.
3.3. PIM Modes For S-PMSI
The S-PMSI has also been widely deployed by service providers. While
both PIM-SM and PIM-SSM are used, PIM-SSM is the more widely
deployed, and recommended, S-PMSI tree building model. The majority
of S-PMSI deployments today are using SSM since the source address is
included in the PIM Hello packet sent from the source PE.
MVPN providers deploy the S-PMSI to achieve optimal bandwidth usage,
especially when SSM is deployed as well. S-PMSIs are optimized for
active sources and receivers and triggered per (S,G) for a subset of
(S,G) of a given VPN. Since S-PMSIs are triggered by (S,G) states in
a VPN, they could increase the amount of multicast states in an MVPN
network.
The decision to switch from MI-PMSI to S-PMSI is always made by the
ingress PE based upon the traffic load exceeding a configurable
threshold.
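The switchover decision can be sketched as a simple comparison on the
ingress PE. The function name, the rate unit and the example
threshold are assumptions made for illustration; the text only states
that the threshold is configurable.

```python
def select_pmsi(rate_kbps, threshold_kbps):
    """Return the tunnel an ingress PE would use for one C-(S,G) flow.

    rate_kbps:      measured traffic rate of the flow (hypothetical unit)
    threshold_kbps: operator-configured switchover threshold
    """
    if rate_kbps > threshold_kbps:
        return "S-PMSI"
    return "MI-PMSI"


# Hypothetical threshold of 1000 kbit/s.
low_rate_flow = select_pmsi(rate_kbps=200, threshold_kbps=1000)
high_rate_flow = select_pmsi(rate_kbps=5000, threshold_kbps=1000)
```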
3.4. CE to PE PIM Modes
The PIM protocols that are deployed within the customer VPN are
independent of the PIM Protocols in use within the Provider core.
Customers can choose to deploy PIM-DM, PIM-SM, Bidir, or SSM.
With SM or Bidir, customers may choose to deploy the RP on either a
PE or CE router. It is generally recommended to have a CE router
serve as the RP. This is done primarily to avoid an increase in
customer/provider interaction on matters such as the integration of
the PE/RP into the customer chosen RP discovery mechanism and to
avoid any additional burden on a busy PE router. While RP deployment
is most commonly performed on CE routers, we have seen RPs deployed
successfully on PE as well as CE routers.
If a customer desires to have a provider managed RP, they should
consider requesting the service provider manage a CE and have it
serve as the RP. To avoid managing an RP altogether, SSM should be
deployed.
3.5. Timer Alignment
Some interesting observations have been made when PIM-SM is used in
an SP's core MVPN environment. When BSR, for example, is used in the
service provider network to discover RPs for the provider tunnel, it
takes more than 3 minutes to detect the failure of the RP if the
default timer is used. During this window, PIM Hellos originated by
C-PIM instances will be dropped, which causes PIM adjacencies to be
torn down. Because the default PIM Hello timer is 30 seconds, the
C-PIM instance on a PE router detects an outage much faster than the
P-PIM instance on the same PE router.
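The mismatch can be made concrete with a short calculation. The 30
second Hello period and the 3 minute figure come from the text above;
the 3.5 multiplier is the default Hello holdtime from [Fenner]. The
180 second value is only a lower bound, since the text says "more
than 3 minutes", and the function name is hypothetical.

```python
HELLO_PERIOD = 30                      # seconds, default PIM Hello timer
HELLO_HOLDTIME = 3.5 * HELLO_PERIOD    # default holdtime per [Fenner]
RP_FAILURE_DETECTION_MIN = 180         # seconds, lower bound from the text


def adjacency_lost_before_rp_failover(hello_holdtime, rp_detection):
    """True if C-PIM adjacencies over the tunnel expire before the
    provider network has even detected the RP failure."""
    return hello_holdtime < rp_detection


c_pim_times_out_first = adjacency_lost_before_rp_failover(
    HELLO_HOLDTIME, RP_FAILURE_DETECTION_MIN
)
```

With the defaults, C-PIM neighbors expire after 105 seconds, well
inside the window during which the provider tunnel is still broken.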
This is a factor to be considered when choosing the protocol for RP
redundancy and fast failover. One option, for fast failover, is to
use BSR only for RP discovery and then utilize Anycast-RP for RP
redundancy.
3.6. Addressing
It has become general practice to use 239/8 private address space
when assigning address space to MVPNs. This helps to prevent VPN
traffic from being sent outside the MVPN core. When SSM is used,
239.232/16 addressing is the common practice according to [Meyer],
"Administratively Scoped IP Multicast". Operators typically deploy an
addressing tool to manage their addresses.
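A check that an addressing tool might perform is sketched below using
the Python standard library. The function name and the returned
labels are hypothetical; the two prefixes are the ones named above.

```python
from ipaddress import ip_address, ip_network

MVPN_SCOPED_RANGE = ip_network("239.0.0.0/8")    # 239/8, as above
MVPN_SSM_RANGE = ip_network("239.232.0.0/16")    # 239.232/16, as above


def classify_group(group):
    """Classify a provider-side group address against the two ranges."""
    addr = ip_address(group)
    if addr in MVPN_SSM_RANGE:
        return "mvpn-ssm"
    if addr in MVPN_SCOPED_RANGE:
        return "mvpn-scoped"
    return "outside"


ssm_group = classify_group("239.232.0.1")
asm_group = classify_group("239.1.1.1")
other_group = classify_group("232.1.1.1")
```

The more specific 239.232/16 range is tested first because it lies
inside 239/8.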
This addressing practice can also be used to prevent non-VPN traffic,
originating outside the SP boundaries, from entering a VPN.
The reader should also reference section 11.5.4 of the [MVPN] draft
entitled "Avoiding Conflict with Internet Multicast".
3.7. Filtering
Filtering at the SP boundaries is needed to prevent VPN security
violations. It may be necessary to modify these deployed filters to
permit GRE and possibly UDP port 3232. UDP port 3232 is the UDP port
used for the S-PMSI Join messages.
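The additional permit rule can be sketched as a predicate. This is
not a complete boundary filter: a real filter would also match on
addresses and interfaces, and the function name is hypothetical. The
IP protocol numbers for GRE (47) and UDP (17) are the IANA assigned
values.

```python
GRE_PROTOCOL = 47          # IANA IP protocol number for GRE
UDP_PROTOCOL = 17          # IANA IP protocol number for UDP
S_PMSI_JOIN_PORT = 3232    # UDP port for S-PMSI Join messages, as above


def mvpn_filter_permits(ip_protocol, dst_port=None):
    """True if a packet matches the MVPN exceptions described above."""
    if ip_protocol == GRE_PROTOCOL:
        return True
    return ip_protocol == UDP_PROTOCOL and dst_port == S_PMSI_JOIN_PORT


gre_permitted = mvpn_filter_permits(GRE_PROTOCOL)
join_permitted = mvpn_filter_permits(UDP_PROTOCOL, dst_port=3232)
other_udp_permitted = mvpn_filter_permits(UDP_PROTOCOL, dst_port=53)
```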
3.8. Scalability
PIM retransmission overhead on a given MI-PMSI increases in linear
proportion to any increase in the number of PEs that join the MI-
PMSI. The overhead also increases in linear proportion with an
increase in the number of J/P messages received from the CEs.
There have been no scaling issues with current deployments of MVPN.
Current MVPN deployments consist of up to a few hundred sites per
MVPN. The number of PEs participating in an MI-PMSI continues to
increase as customers extend the multicast group participation to
additional VPN sites. There are unicast VPN customers with several
thousand sites. These sites are gradually becoming multicast enabled.
The number of J/P messages received from CEs will also increase over
time.
At some level of scaling of the MI-PMSI, PIM Hellos and J/P messages
will become a scaling issue. The scaling point at which these
messages become a real operational problem is not clear. Empirical
field data shows they do not affect the broad range of MVPN
deployments today. MVPN is scalable as specified across a wide range
of deployments.
Some analysis is needed to clarify at what operational level PIM
messages do become a problem. The L3VPN WG has gathered requirements
information in [MORIN]. A benchmarking draft [DRY] has been submitted
to the BMWG to provide consistent MVPN test methodology. The PIM WG
is evaluating methods to decrease PIM messages when this becomes of
operational value. Extensions to PIM such as PIM J/P Acks and TCP
based approaches are being evaluated by the working group.
Increasing the Hello timer and increasing the periodic join/prune
timer may also help in future MVPN scaling. Doing so, however, may
affect join and leave latency in times when control messages are
lost. OAM, to verify the health of the data and control paths, would
also be affected if the Hello timer were increased or removed
altogether.
The following analysis compares different PIM modes and their
resulting MVPN state using the following values:

Number of P interfaces            5
Number of PE P-PIM interfaces     2
Number of PE C-PIM interfaces     1
Number of PE                     20
Number of M-VPN                 100
S-PMSI/VPN                        2

PIM adjacencies C-PIM           100
PIM adjacencies P-PIM          1900
PIM adjacencies PE                2
PIM adjacencies Total          2002

             MIPMSI/PE  SPMSI/PE MDT/PE MIPMSI/RP(P) SPMSI/RP(P) MDT/P

             PIM SSM    PIM SSM  PIM SSM
(S,G) state  2000       4000     6000   2000         4000        6000
(*,G) state  0          0        0      0            0           0
total state  2000       4000     6000   2000         4000        6000
MIPMSI nbrs  1900                1900   NA           NA          NA
MIPMSI       100                 100    NA           NA          NA
Inband MDT   3800                3800   NA           NA          NA
Outband MDT  200                 200    NA           NA          NA

             PIM SM     PIM SSM  MIPMSI:SM
                                 SPMSI:SSM
(S,G) state  2000       4000     6000   2000         4000        6000
(*,G) state  100        0        100    100          0           100
total state  2100       4000     6100   2100         4000        6100
MIPMSI nbrs  1900                1900   NA           NA          NA
MIPMSI       100                 100    NA           NA          NA
Inband MDT   3800                3800   NA           NA          NA
Outband MDT  200                 200    NA           NA          NA

             SM(no spt) SSM      MIPMSI:SM(no spt)
                                 SPMSI:SSM
(S,G) state  100        4000     4100   2000         4000        6000
(*,G) state  100        0        100    100          0           100
total state  200        4000     4200   2100         4000        6100
MIPMSI nbrs  1900                1900   NA           NA          NA
MIPMSI       100                 100    NA           NA          NA
Inband MDT   3800                3800   NA           NA          NA
Outband MDT  200                 200    NA           NA          NA
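The adjacency totals and the per-PE state counts in the analysis
above can be reproduced with the short sketch below. The formulas are
an inference from the listed inputs and results, not something the
analysis states: they assume one (S,G) per PE per MVPN for an SSM or
SPT-switched MI-PMSI, one (*,G) per MVPN when an RP is used, and
S-PMSIs that always use SSM. Function and parameter names are
hypothetical.

```python
def pim_adjacencies(num_pe, num_vpn, c_pim_ifs, p_pim_ifs):
    """Adjacency counts on one PE, under the assumptions stated above."""
    c_pim = c_pim_ifs * num_vpn        # one CE neighbor per VPN
    tunnel = (num_pe - 1) * num_vpn    # every other PE, once per MVPN
    core = p_pim_ifs                   # one neighbor per core interface
    return {"C-PIM": c_pim, "P-PIM": tunnel, "PE": core,
            "Total": c_pim + tunnel + core}


def pe_state(num_pe, num_vpn, spmsi_per_vpn, mi_pmsi_mode):
    """Provider-side multicast state on one PE (the MDT/PE column).

    mi_pmsi_mode is "ssm", "sm" or "sm-no-spt".
    """
    spmsi_sg = num_pe * num_vpn * spmsi_per_vpn
    if mi_pmsi_mode == "ssm":
        mi_sg, star_g = num_pe * num_vpn, 0
    elif mi_pmsi_mode == "sm":
        mi_sg, star_g = num_pe * num_vpn, num_vpn
    elif mi_pmsi_mode == "sm-no-spt":
        mi_sg, star_g = num_vpn, num_vpn
    else:
        raise ValueError("unknown mode: " + mi_pmsi_mode)
    sg = mi_sg + spmsi_sg
    return {"(S,G)": sg, "(*,G)": star_g, "total": sg + star_g}


adjacencies = pim_adjacencies(num_pe=20, num_vpn=100, c_pim_ifs=1, p_pim_ifs=2)
state_ssm = pe_state(20, 100, 2, "ssm")
state_sm = pe_state(20, 100, 2, "sm")
state_no_spt = pe_state(20, 100, 2, "sm-no-spt")
```

Under these assumptions the results match the PIM adjacency figures
and the MDT/PE column for all three mode combinations.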
3.9. QOS
MVPN deployments that use QOS typically apply the same QOS
mechanisms to the MVPN GRE header that they would use for their
other data traffic. VPN customers may want to separate the queuing of
multicast data from unicast data. Service Providers are extending
their QOS portfolio to support more classes of service to allow for
better separation of multicast and unicast traffic. Enhanced QOS
mechanisms support applications with short bursts but which require
bounded delay (such as video streaming). Since multicast (UDP)
traffic might not be subject to the same drop behavior as TCP
traffic, QOS profiles support Weighted Random Early Detection (WRED)
treatment.
4. Security Considerations
The use of GRE encapsulations and IP Multicast has certain security
implications. As discussed in [Farinacci], security in a network
using GRE should be relatively similar to security in a normal IPv4
network. And Section 6 of [Fenner] clearly outlines the various
security concerns related to PIM and how to use IPsec to secure the
protocol.
5. IANA Considerations
This document does not require any action on the part of IANA.
6. Acknowledgments
We'd like to thank Dino Farinacci, Yuji Kamite, Hitoshi Fukuda and
Eric Rosen for their feedback on this draft.
7. Normative References
[2547bis] Rosen, Rekhter, et al., "BGP/MPLS VPNs", RFC 4364,
February 2006.
[MVPN] Rosen, Aggarwal, "Multicast in MPLS/BGP IP VPNs",
draft-ietf-l3vpn-2547bis-mcast-05.txt, July 2007.
8. Informative References
[MORIN] T. Morin, "Requirements for Multicast in L3
Provider-Provisioned VPNs", RFC 4834.
[DRY] S. Dry, "Multicast VPN Scalability Benchmarking",
draft-sdry-bmwg-mvpnscale-02.txt.
[Meyer] D. Meyer, "Administratively Scoped IP Multicast", RFC 2365.
[Farinacci] D. Farinacci, "Generic Routing Encapsulation (GRE)",
RFC 2784.
[Fenner] B. Fenner, "Protocol Independent Multicast - Sparse Mode
(PIM-SM)", RFC 4601.
[Worster] T. Worster, "Encapsulating MPLS in IP or Generic Routing
Encapsulation (GRE)", RFC 4023.
9. Authors' Addresses
Yiqun Cai
ycai@cisco.com
Mike McBride
mmcbride@cisco.com
Chris Hall
chall@sprint.net
Maria Napierala
mnapierala@att.com
Wim Henderickx
wim.henderickx@alcatel-lucent.be
10. Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
11. Intellectual Property
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.