Network Working Group                                          Yiqun Cai
Internet Draft                                              Mike McBride
Intended Status: Informational                                     Cisco
Expires: August 2008
                                                              Chris Hall
                                                                  Sprint

                                                         Maria Napierala
                                                                    AT&T

                                                          Wim Henderickx
                                                          Alcatel-Lucent

                                                           February 2008


               PIM Based MVPN Deployment Recommendations


                draft-ycai-mboned-mvpn-pim-deploy-02.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.








Cai, et al.                                                     [Page 1]


Internet Draft  draft-ycai-mboned-mvpn-pim-deploy-02.txt   February 2008


Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   Multicast VPN, based on pre-standard drafts, has been in operation in
   production networks for many years.  This document describes some of
   the practices and experiences gained from implementation and
   deployment of MVPN using PIM with GRE tunnels. It is informational
   only.



Table of Contents

    1          Introduction  .......................................   3
    2          Implementation  .....................................   3
    2.1        RPF  ................................................   3
    2.2        MTU  ................................................   4
    2.3        EIBGP Load Balancing  ...............................   4
    2.4        MTRACE  .............................................   5
    3          Operational Experience  .............................   5
    3.1        Multicast VPN Design Considerations  ................   5
    3.2        PIM Modes For MI-PMSI  ..............................   6
    3.2.1      PIM-SSM for MI-PMSI  ................................   6
    3.2.2      ASM for MI-PMSI  ....................................   6
    3.3        PIM Modes For S-PMSI  ...............................   7
    3.4        CE to PE PIM Modes  .................................   7
    3.5        Timer Alignment  ....................................   8
    3.6        Addressing  .........................................   8
    3.7        Filtering  ..........................................   8
    3.8        Scalability  ........................................   9
    3.9        QOS  ................................................  11
    4          Security Considerations  ............................  11
    5          IANA Considerations  ................................  11
    6          Acknowledgments  ....................................  11
    7          Normative References  ...............................  12
    8          Informative References  .............................  12
    9          Authors' Addresses  .................................  12
   10          Full Copyright Statement  ...........................  13
   11          Intellectual Property  ..............................  13


1. Introduction

   Multicast support for L3VPN based on RFC2547 [2547bis] was first
   presented at the San Diego IETF in 2000.  It was not included in the
   charter of the L3VPN (formerly PPVPN) working group until the San
   Diego IETF in 2004, and until then it remained an individual
   submission.  The MVPN draft continued to evolve as pre-standards
   work.  Several vendors provided implementations based on this
   solution, and service providers began deploying it in production
   networks.

   Since the working group officially accepted the challenge to define a
   solution or solutions to support multicast, several proposals have
   been made.  They are now captured in [MVPN], which forms a base for
   future standards work.

   This document provides MVPN deployment experience based solely on the
   original PIM and GRE based MVPN solution. This solution is now
   outlined as one of the options in the standards track [MVPN]
   document.

   In this document, we describe some of the lessons learned from
   implementing and deploying MVPN.  We hope it will benefit
   implementors as well as network operators looking to deploy MVPN
   services. Throughout the document, where the term "MVPN" is used, the
   reference is to the original MVPN deployment based upon PIM and GRE
   tunnels.


2. Implementation

   There are three known MVPN implementations: IOS from Cisco, JunOS
   from Juniper Networks and TimOS from Alcatel-Lucent. Contact these
   vendors for implementation details beyond what is provided in this
   draft. The following sections describe common MVPN deployment
   considerations.


2.1. RPF

   [MVPN] specifies that the source address of any PIM packets that a PE
   router generates over the MDT tunnel must be the same as the BGP
   nexthop for updates originated by the PE router for all multicast
   traffic sources existing in the site. Otherwise, a PE router will not
   resolve the RPF neighbour towards the source connected to a remote PE
   router.

   A PE therefore needs a single IP address that it uses in both the IP
   source address field of its PIM packets and the next hop field of
   its BGP updates.  If this requirement is overlooked, RPF
   determination may fail.  This has caused interoperability problems
   in the past, and implementors should take care to meet it.
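
   The address-matching requirement can be illustrated with a minimal
   sketch.  It is not taken from any implementation: the helper name
   resolve_rpf_neighbor and the example addresses are assumptions made
   for illustration, and a real RPF lookup involves more than a string
   comparison.

```python
from typing import Iterable, Optional


def resolve_rpf_neighbor(bgp_next_hop: str,
                         mdt_pim_neighbors: Iterable[str]) -> Optional[str]:
    """Return the PIM neighbor on the MDT tunnel whose address equals the
    BGP next hop of the route towards the source, or None if none matches.

    Hypothetical helper: addresses are plain dotted-quad strings.
    """
    for neighbor in mdt_pim_neighbors:
        if neighbor == bgp_next_hop:
            return neighbor
    return None


# Remote PE uses 192.0.2.1 as both BGP next hop and PIM source address:
# the RPF neighbor resolves.
consistent = resolve_rpf_neighbor("192.0.2.1", ["192.0.2.1", "192.0.2.2"])

# Remote PE advertises next hop 192.0.2.1 but sources PIM packets from
# 198.51.100.1: no neighbor matches, so RPF resolution fails.
mismatched = resolve_rpf_neighbor("192.0.2.1", ["198.51.100.1", "192.0.2.2"])
```

   In the second case the lookup returns nothing, which corresponds to
   the RPF failure described above.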


2.2. MTU

   When GRE encapsulation is used in the core, 24 bytes are added to the
   IP packets generated in the VPNs.  Due to the lack of a path MTU
   discovery mechanism for multicast, a PE router may have to fragment
   the incoming packets.

   The best practice is to fragment the packets before performing any
   GRE encapsulation.  This spares the egress PE routers from
   reassembling the fragments and leaves reassembly to the end systems.
   This does not work if the "DF" bit is set in the original packet,
   since the packet will then be dropped.  It is best to ensure that
   the backbone does not have any links with a 1500 byte MTU, so that a
   full-sized 1500 byte customer packet still fits after the 24 bytes
   of encapsulation are added.

   It is further recommended to read Section 5.1 of [Worster] for a more
   detailed look at preventing fragmentation and reassembly.
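
   The MTU arithmetic can be sketched as follows.  The 24 byte overhead
   is the figure given in this section; the function names and the
   example MTU values are assumptions made for illustration.

```python
GRE_OVERHEAD_BYTES = 24  # encapsulation overhead stated in this section


def encapsulated_size(customer_packet_bytes: int) -> int:
    """Size of a customer packet after GRE encapsulation in the core."""
    return customer_packet_bytes + GRE_OVERHEAD_BYTES


def fits_without_fragmentation(customer_packet_bytes: int,
                               core_link_mtu: int) -> bool:
    """True if the encapsulated packet fits the core link MTU."""
    return encapsulated_size(customer_packet_bytes) <= core_link_mtu


def max_customer_packet(core_link_mtu: int) -> int:
    """Largest customer packet that crosses the link unfragmented."""
    return core_link_mtu - GRE_OVERHEAD_BYTES


# A full-sized 1500 byte customer packet needs a core MTU of 1524 bytes.
required_core_mtu = encapsulated_size(1500)
```

   On a core link limited to a 1500 byte MTU, only customer packets of
   up to 1476 bytes pass without fragmentation.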


2.3. EIBGP Load Balancing

   External and Internal Border Gateway Protocol (eiBGP) load sharing is
   an enhancement to BGP that enables load sharing over parallel links
   between CE and PE routers. EIBGP enables service providers to share
   customer traffic loads over parallel paths within an MPLS core
   network.

   When EIBGP load balancing is enabled on all PE routers, we have seen
   that the multicast RPF check inside the VRF may be affected if the
   path towards the source resolves via iBGP. The best practice is to
   ensure that, when both eBGP and iBGP routes are present, multicast
   RPF selects eBGP paths only.

   Example:
                            +---+
                     ------ |PE1|
                +--+        +---+
      source -- |CE|
                +--+        +---+
                     ------ |PE2|
                            +---+

   With EIBGP configured, PE1 will have two paths towards the source,
   one directly via the CE using eBGP, and one via PE2 using iBGP.

   By default PIM will pick the neighbor with the highest IP address. If
   this happens to be PE2, the RPF check will fail as it will use the
   global table.

   A workaround is either to configure a static mroute on PE1 pointing
   towards the CE router, or to make sure that the CE has a higher IP
   address than PE2.  The better solution is additional logic in the
   RPF code so that, where EIBGP is used, the eBGP link is preferred.
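
   The difference between the default selection and the eBGP-preferring
   logic can be sketched as below.  This is an illustration only, not
   vendor RPF code: the RpfCandidate type, the function names and the
   addresses are assumptions.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List


@dataclass(frozen=True)
class RpfCandidate:
    neighbor: IPv4Address
    learned_via: str  # "ebgp" or "ibgp"


def default_rpf_choice(candidates: List[RpfCandidate]) -> RpfCandidate:
    """Default behaviour described above: highest neighbor address wins."""
    return max(candidates, key=lambda c: c.neighbor)


def ebgp_preferred_rpf_choice(candidates: List[RpfCandidate]) -> RpfCandidate:
    """Consider only eBGP paths when any exist, otherwise fall back to
    the default choice among all candidates."""
    ebgp_only = [c for c in candidates if c.learned_via == "ebgp"]
    return default_rpf_choice(ebgp_only or candidates)


# PE1's two paths towards the source in the example topology.
ce = RpfCandidate(IPv4Address("10.0.0.1"), "ebgp")
pe2 = RpfCandidate(IPv4Address("10.0.0.9"), "ibgp")
```

   With these example addresses the default choice is PE2, because it
   has the higher address, while the eBGP-preferring choice is the CE.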



2.4. MTRACE

   MTRACE is a tool that allows a network operator to obtain multicast
   routing information from routers and to explore a path to the source
   of the traffic or the RP.

   Since there is no security mechanism embedded in MTRACE, some service
   providers express concern when the mtrace packet has to traverse the
   PE routers in order to obtain the full information.

   Vendors have their own mechanisms to remove, or hide, certain fields
   in the MTRACE packets in order to satisfy the needs of their
   customers. We need to define a better mechanism for MTRACE in an MVPN
   environment.



3. Operational Experience

3.1. Multicast VPN Design Considerations

   When deploying a multicast VPN service, providers try to optimize
   multicast traffic distribution and delays while reducing the amount
   of state. The following considerations have given MVPN providers
   direction in their MVPN deployment:

     + Core multicast routing states should typically be kept to a
       minimum

     + MVPN packet delays should typically be the same as unicast
       traffic

     + Data should typically be sent only to PEs with interested
       receivers


3.2. PIM Modes For MI-PMSI

   The MI-PMSI is used to build an overlay network connecting all PE
   routers attaching to the same MVPN.

   Service providers have implemented PIM-SM and PIM-SSM instantiated
   MI-PMSI in production networks.  The majority of MI-PMSI deployments
   use PIM-SM with statically assigned Anycast RP with MSDP, but a
   dynamic RP discovery protocol, such as BSR, is also in use.

   The decision to deploy either PIM-SM or PIM-SSM is based on the
   following concerns:

     + the number of multicast routing states

     + the overhead of managing the RP if PIM-SM is used

     + the difference of forwarding delay between shared tree and source
       trees



3.2.1. PIM-SSM for MI-PMSI

   Optimal MVPN forwarding is most easily achievable when there is a
   single multicast tree per MVPN per PE. Such trees are naturally built
   with PIM-SSM since it permits the PE to directly join a source tree
   for an MDT. With PIM-SSM, no Rendezvous Points are required. With
   SSM, however, all PEs on an MVPN tree need to maintain source state.
   Each PE, which is participating in MVPN, is a source. Unless VPN
   customers locate their multicast sources within a constrained set of
   sites, SSM may become a scalability concern in the service
   provider's network.



3.2.2. ASM for MI-PMSI

   One solution to minimize the amount of multicast state in an MVPN
   environment is to configure PIM-SM or BIDIR PIM to stay on the shared
   tree. With shared trees, multicast state scalability is no longer a
   function of the number of PEs but rather of the number of VPNs.

   The scale benefit of shared trees comes at the cost of less efficient
   multicast distribution. MVPN providers use the MI-PMSI to achieve
   bandwidth optimality. MVPN providers may address the sub-optimality
   of shared tree forwarding by deploying an RP at the best location for
   each VPN. Such an assignment would be based on the VPN source
   locations, something which may be difficult to maintain.
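
   The scaling contrast between source trees and shared trees can be
   sketched with a simple count.  The model is an assumption made for
   illustration: it counts MI-PMSI state only, assumes every PE
   participates in every MVPN, and uses hypothetical function names.

```python
def ssm_mi_pmsi_state(num_vpns: int, num_pes: int) -> int:
    """(S,G) entries when every PE is a source for every MVPN."""
    return num_vpns * num_pes


def shared_tree_mi_pmsi_state(num_vpns: int) -> int:
    """(*,G) entries when traffic stays on one shared tree per MVPN."""
    return num_vpns


# Source-tree state grows with the PE count; shared-tree state does not.
growth = [
    (num_pes, ssm_mi_pmsi_state(100, num_pes), shared_tree_mi_pmsi_state(100))
    for num_pes in (20, 100, 500)
]
```

   For 100 MVPNs, going from 20 to 500 PEs multiplies the source-tree
   count by 25 while the shared-tree count stays at 100.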



3.3. PIM Modes For S-PMSI

   The S-PMSI has also been widely deployed by service providers. While
   both PIM-SM and PIM-SSM are used, PIM-SSM is the more widely
   deployed, and recommended, S-PMSI tree building model. The majority
   of S-PMSI deployments today are using SSM since the source address is
   included in the PIM Hello packet sent from the source PE.

   MVPN providers deploy the S-PMSI to achieve optimal bandwidth usage,
   especially when SSM is deployed as well. S-PMSIs are optimized for
   active sources and receivers and triggered per (S,G) for a subset of
   (S,G) of a given VPN. Since S-PMSIs are triggered by (S,G) states in
   a VPN, they could increase the amount of multicast states in an MVPN
   network.

   The decision to switch from MI-PMSI to S-PMSI is always made by the
   ingress PE based upon the traffic load exceeding a configurable
   threshold.
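
   The switchover decision can be sketched as a per-(S,G) comparison
   against the configured threshold.  The function name, the rate unit
   and the example flows are assumptions made for illustration; actual
   implementations measure and configure this in their own ways.

```python
from typing import Dict, List, Tuple

SourceGroup = Tuple[str, str]  # (source, group) inside the customer VPN


def flows_exceeding_threshold(rates_kbps: Dict[SourceGroup, float],
                              threshold_kbps: float) -> List[SourceGroup]:
    """Return the (S,G) flows an ingress PE would move to an S-PMSI,
    that is, those whose measured rate exceeds the threshold."""
    return sorted(sg for sg, rate in rates_kbps.items()
                  if rate > threshold_kbps)


# Hypothetical rates measured by the ingress PE for one VPN.
observed = {
    ("10.1.1.1", "232.1.1.1"): 4000.0,
    ("10.1.1.2", "232.1.1.2"): 12.5,
}
to_s_pmsi = flows_exceeding_threshold(observed, threshold_kbps=1000.0)
```

   Flows at or below the threshold stay on the MI-PMSI.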



3.4. CE to PE PIM Modes

   The PIM protocols that are deployed within the customer VPN are
   independent of the PIM Protocols in use within the Provider core.
   Customers can choose to deploy PIM-DM, PIM-SM, Bidir, or SSM.

   With SM or Bidir, customers may choose to deploy the RP on either a
   PE or CE router. It is generally recommended to have a CE router
   serve as the RP. This is done primarily to avoid an increase in
   customer/provider interaction on matters such as the integration of
   the PE/RP into the customer chosen RP discovery mechanism and to
   avoid any additional burden on a busy PE router. While RP deployment
   is most commonly performed on CE routers, we have seen RPs deployed
   successfully on PE as well as CE routers.

   If a customer desires to have a provider managed RP, they should
   consider requesting the service provider manage a CE and have it
   serve as the RP. To avoid managing an RP altogether, SSM should be
   deployed.


3.5. Timer Alignment

   Some interesting observations have been made where PIM-SM is used in
   an SP's core MVPN environment.  When BSR, for example, is used in
   the service provider network to discover RPs for the provider
   tunnel, it takes more than 3 minutes to detect the failure of the RP
   if the default timer is used.  During that window, PIM Hellos
   originated by C-PIM instances will be dropped, which causes PIM
   adjacencies to be torn down.  Since the default PIM Hello timer is
   30 seconds, the C-PIM instance on a PE router detects an outage much
   faster than the P-PIM instance on the same PE router.

   This is a factor to be considered when choosing the protocol for RP
   redundancy and fast failover. One option, for fast failover, is to
   use BSR only for RP discovery and then utilize Anycast-RP for RP
   redundancy.
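
   The timer mismatch can be made concrete with a small calculation.
   The 30 second Hello timer and the 3 minute figure come from the text
   above, the latter used here as a lower bound.  The 3.5 multiplier is
   the default Hello holdtime factor from [Fenner].  The function names
   are assumptions made for illustration.

```python
HELLO_PERIOD_S = 30            # default PIM Hello timer quoted above
HELLO_HOLDTIME_FACTOR = 3.5    # default holdtime multiplier in RFC 4601
RP_FAILURE_DETECTION_S = 180   # "more than 3 minutes", used as lower bound


def c_pim_neighbor_timeout(hello_period_s: float = HELLO_PERIOD_S) -> float:
    """Seconds without Hellos after which a C-PIM neighbor expires."""
    return hello_period_s * HELLO_HOLDTIME_FACTOR


def adjacency_lost_before_rp_failover(
        rp_detection_s: float = RP_FAILURE_DETECTION_S,
        hello_period_s: float = HELLO_PERIOD_S) -> bool:
    """True if C-PIM adjacencies time out before the P-PIM instance has
    even detected the RP failure."""
    return c_pim_neighbor_timeout(hello_period_s) < rp_detection_s
```

   With the defaults, C-PIM neighbors expire after 105 seconds, well
   inside the RP failure detection window.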



3.6. Addressing

   It has become general practice to use 239/8 private address space
   when assigning address space to MVPNs. This helps to prevent VPN
   traffic from being sent outside the MVPN core. When SSM is used,
   239.232/16 addressing is the common practice according to [Meyer],
   "Administratively Scoped IP Multicast". Operators typically deploy
   an addressing tool to manage their addresses.

   This addressing practice can also be used to prevent non-VPN traffic,
   originating outside the SP boundaries, from entering a VPN.

   The reader should also refer to Section 11.5.4 of the [MVPN] draft,
   entitled "Avoiding Conflict with Internet Multicast".
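
   An addressing tool could enforce this practice with a range check
   such as the one sketched below.  The two prefixes are the ones named
   in this section; the function name and the idea of a per-group check
   are assumptions made for illustration.

```python
from ipaddress import IPv4Address, IPv4Network

MVPN_GROUP_RANGE = IPv4Network("239.0.0.0/8")       # general practice
MVPN_SSM_GROUP_RANGE = IPv4Network("239.232.0.0/16")  # when SSM is used


def is_valid_mdt_group(group: str, ssm: bool = False) -> bool:
    """Check a candidate provider group address against the ranges
    named in this section."""
    allowed = MVPN_SSM_GROUP_RANGE if ssm else MVPN_GROUP_RANGE
    return IPv4Address(group) in allowed
```

   For example, 239.232.0.5 passes the SSM check, while 239.1.1.1 only
   passes the general 239/8 check.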


3.7. Filtering

   Filtering at the SP boundaries is needed to prevent VPN security
   violations. It may be necessary to modify these deployed filters to
   permit GRE and possibly UDP port 3232, which is used for the S-PMSI
   Join messages.
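
   The exceptions mentioned above can be expressed as a simple
   predicate.  IP protocol 47 is the assigned number for GRE and 17 is
   UDP.  The function name is an assumption, and where such a rule is
   applied depends on the deployment.

```python
from typing import Optional

IP_PROTO_UDP = 17
IP_PROTO_GRE = 47
S_PMSI_JOIN_UDP_PORT = 3232  # port named in this section


def mvpn_exception_permits(ip_protocol: int,
                           udp_dst_port: Optional[int] = None) -> bool:
    """True for the MVPN traffic a filter may need to let through:
    GRE, or UDP destined to the S-PMSI Join port."""
    if ip_protocol == IP_PROTO_GRE:
        return True
    return (ip_protocol == IP_PROTO_UDP
            and udp_dst_port == S_PMSI_JOIN_UDP_PORT)
```

   Everything else remains subject to the existing boundary filters.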












3.8. Scalability

   PIM retransmission overhead on a given MI-PMSI increases in linear
   proportion to any increase in the number of PEs that join the MI-
   PMSI.  The overhead also increases in linear proportion with an
   increase in the number of J/P messages received from the CEs.

   There have been no scaling issues with current deployments of MVPN.
   Current MVPN deployments consist of up to a few hundred sites per
   MVPN. The number of PEs participating in an MI-PMSI continues to
   increase as customers extend the multicast group participation to
   additional VPN sites. There are unicast VPN customers with several
   thousand sites. These sites are gradually becoming multicast enabled.
   The number of J/P messages received from CEs will also increase over
   time.

   At some level of scaling of the MI-PMSI, PIM Hellos and J/P messages
   will become a scaling issue. The scaling point at which these
   messages become a real operational problem is not clear. Empirical
   field data shows they do not affect the broad range of MVPN
   deployments today. MVPN is scalable as specified across a wide range
   of deployments.

   Some analysis is needed to clarify at what operational level PIM
   messages do become a problem. The L3VPN WG has gathered requirements
   information in [MORIN]. A benchmarking draft [DRY] has been submitted
   to the BMWG to provide consistent MVPN test methodology. The PIM WG
   is evaluating methods to decrease PIM messages when this becomes of
   operational value. Extensions to PIM such as PIM J/P Acks and TCP
   based approaches are being evaluated by the working group.

   Increasing the Hello timer and increasing the periodic join/prune
   timer may also help in future MVPN scaling. Doing so, however, may
   affect join and leave latency in times when control messages are
   lost. OAM, to verify the health of the data and control paths, would
   also be affected if the Hello timer were increased or removed
   altogether.

   The following analysis compares different PIM modes and their
   resulting MVPN state using the following values:

   Number of P interfaces         5

   Number of PE P-PIM interfaces  2

   Number of PE C-PIM interfaces  1

   Number of PE                  20

   Number of M-VPN              100

   S-PMSI/VPN                     2

   PIM adjacencies C-PIM        100

   PIM adjacencies P-PIM       1900

   PIM adjacencies PE             2

   PIM adjacencies Total       2002


              MIPMSI/PE SPMSI/PE   MDT/PE MIPMSI/RP(P) SPMSI/RP(P) MDT/P

                PIM SSM  PIM SSM  PIM SSM

   (S,G) state     2000     4000     6000     2000       4000       6000
   (*,G) state        0        0        0        0          0          0
   total state     2000     4000     6000     2000       4000       6000
   MIPMSI nbrs     1900              1900       NA         NA         NA
   MIPMSI           100               100       NA         NA         NA
   Inband MDT               3800     3800       NA         NA         NA
   Outband MDT               200      200       NA         NA         NA

                 PIM SM  PIM SSM  MIPMSI:SM
                                  SPMSI:SSM

   (S,G) state     2000     4000     6000     2000       4000       6000
   (*,G) state      100        0      100      100          0        100
   total state     2100     4000     6100     2100       4000       6100
   MIPMSI nbrs     1900              1900       NA         NA         NA
   MIPMSI           100               100       NA         NA         NA
   Inband MDT               3800     3800       NA         NA         NA
   Outband MDT               200      200       NA         NA         NA


                SM(no spt)   SSM  MIPMSI:SM(no spt)
                                  SPMSI:SSM

   (S,G) state      100     4000     4100     2000       4000       6000
   (*,G) state      100        0      100      100          0        100
   total state      200     4000     4200     2100       4000       6100
   MIPMSI nbrs     1900              1900       NA         NA         NA
   MIPMSI           100               100       NA         NA         NA
   Inband MDT               3800     3800       NA         NA         NA
   Outband MDT               200      200       NA         NA         NA
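
   Several of the figures above can be reproduced from the input
   values.  The formulas below are inferred from the numbers and are
   not stated in this document, so they should be read as one plausible
   interpretation.  The class and function names are assumptions made
   for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Inputs:
    num_pe: int = 20
    num_vpn: int = 100
    spmsi_per_vpn: int = 2
    pe_c_pim_interfaces: int = 1
    pe_p_pim_interfaces: int = 2


def pim_adjacencies(i: Inputs) -> Dict[str, int]:
    """Per-PE PIM adjacency counts (inferred formulas)."""
    c_pim = i.num_vpn * i.pe_c_pim_interfaces   # one CE-facing per VPN
    p_pim = i.num_vpn * (i.num_pe - 1)          # remote PEs on each MI-PMSI
    pe = i.pe_p_pim_interfaces                  # core-facing interfaces
    return {"c_pim": c_pim, "p_pim": p_pim, "pe": pe,
            "total": c_pim + p_pim + pe}


def ssm_state_per_pe(i: Inputs) -> Dict[str, int]:
    """(S,G) state on a PE when both PMSI types use PIM-SSM."""
    mi_pmsi = i.num_vpn * i.num_pe
    s_pmsi = i.spmsi_per_vpn * i.num_vpn * i.num_pe
    return {"mi_pmsi": mi_pmsi, "s_pmsi": s_pmsi,
            "total": mi_pmsi + s_pmsi}


adjacencies = pim_adjacencies(Inputs())
state = ssm_state_per_pe(Inputs())
```

   With the default inputs this yields 100, 1900 and 2 adjacencies for
   a total of 2002, and 2000, 4000 and 6000 (S,G) entries, matching the
   PIM SSM rows of the first table.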






3.9. QOS

   MVPN deployments that use QOS typically apply the same QOS
   mechanisms to the MVPN GRE header that they use for their other data
   traffic. VPN customers may want to separate the queuing of multicast
   data from unicast data. Service Providers are extending their QOS
   portfolio to support more classes of service to allow for better
   separation of multicast and unicast traffic. Enhanced QOS mechanisms
   support applications that have short bursts but require bounded
   delay (such as video streaming). Since multicast (UDP) traffic might
   not be subject to the same drop behavior as TCP traffic, QOS
   profiles support Weighted Random Early Detection (WRED) treatment.




4. Security Considerations

   The use of GRE encapsulations and IP Multicast has certain security
   implications. As discussed in [Farinacci], security in a network
   using GRE should be relatively similar to security in a normal IPv4
   network.  Section 6 of [Fenner] outlines the various security
   concerns related to PIM and how to use IPsec to secure the protocol.




5. IANA Considerations

   This document does not require any action on the part of IANA.



6. Acknowledgments

   We'd like to thank Dino Farinacci, Yuji Kamite, Hitoshi Fukuda and
   Eric Rosen for their feedback on this draft.


7. Normative References

   [2547bis] "BGP/MPLS VPNs", Rosen, Rekhter, et al., February 2006,
   RFC 4364

   [MVPN] "Multicast in MPLS/BGP IP VPNs", Rosen, Aggarwal, July 2007,
   draft-ietf-l3vpn-2547bis-mcast-05.txt


8. Informative References

   [MORIN] T. Morin, "Requirements for Multicast in L3 Provider-
   Provisioned VPNs", RFC 4834.

   [DRY] S. Dry, "Multicast VPN Scalability Benchmarking", draft-sdry-
   bmwg-mvpnscale-02.txt.

   [Meyer] D. Meyer, "Administratively Scoped IP Multicast", RFC 2365.

   [Farinacci] D. Farinacci, "Generic Routing Encapsulation (GRE)", RFC
   2784.

   [Fenner] B. Fenner, "Protocol Independent Multicast - Sparse Mode
   (PIM-SM)", RFC 4601.

   [Worster] T. Worster, "Encapsulating MPLS in IP or Generic Routing
   Encapsulation (GRE)", RFC 4023.


9. Authors' Addresses

   Yiqun Cai
   ycai@cisco.com

   Mike McBride
   mmcbride@cisco.com

   Chris Hall
   chall@sprint.net

   Maria Napierala
   mnapierala@att.com

   Wim Henderickx
   wim.henderickx@alcatel-lucent.be


10. Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.




11. Intellectual Property

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at ietf-
   ipr@ietf.org.



