Network Working Group                             Eric C. Rosen (Editor)
Internet Draft                                               Arjen Boers
Intended Status: Proposed Standard                             Yiqun Cai
Expires: January 6, 2010                               IJsbrand Wijnands
                                                     Cisco Systems, Inc.

                                                            July 6, 2009


            MVPN: Optimized use of PIM, Wild Card Selectors,
S-PMSI Join Extensions, Bidirectional Tunnels, Extranets, Hub and Spoke

                  draft-rosen-l3vpn-mvpn-mspmsi-04.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Copyright and License Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.





Rosen, et al.                                                   [Page 1]


Internet Draft    draft-rosen-l3vpn-mvpn-mspmsi-04.txt         July 2009


Abstract

   Specifications for a number of important topics were arbitrarily
   omitted from the initial MVPN specifications, so that those
   specifications could be "frozen" and advanced.  The current document
   provides some of the missing specifications.  The topics covered are:
   (a) using Wild Card selectors to bind multicast data streams to
   tunnels, (b) using Multipoint-to-Multipoint Label Switched Paths as
   tunnels, (c) binding bidirectional customer multicast data streams to
   specific tunnels, (d) running PIM (i.e., sending and receiving
   multicast control traffic) over a set of tunnels that are created
   only if needed to carry multicast data traffic, (e) extranets, (f)
   support for anycast sources, and (g) support for "hub and spoke"
   VPNs.

Table of Contents

 1          Specification of requirements  .........................   4
 2          Introduction  ..........................................   4
 2.1        Topics Covered  ........................................   4
 2.2        Terminology  ...........................................   6
 3          S-PMSI Join Extensions  ................................   6
 3.1        mLDP P2MP P-Tunnels  ...................................   6
 3.2        IPv6 (S,G) with GRE P-tunnels  .........................   7
 3.3        Multiple S-PMSI Joins per Datagram  ....................   8
 4          Wild Cards: S-PMSI A-D Routes & S-PMSI Join Messages  ..   8
 5          Binding (C-*,C-G) to a Unidirectional P-Tunnel  ........   9
 6          S-PMSI Procedures for Using Bidirectional P-Tunnels  ...  10
 6.1        Bidirectional P-Tunnels  ...............................  10
 6.1.1      MP2MP LSPs  ............................................  10
 6.1.2      BIDIR-PIM  .............................................  11
 6.2        General Procedures: MS-PMSIs  ..........................  11
 6.3        Use of Multiple Bidirectional P-tunnels  ...............  12
 6.3.1      Binding (C-S,C-G)  .....................................  12
 6.3.2      Binding (C-*,C-G) Flows from Unidirectional C-trees  ...  13
 6.3.3      Binding (C-*,C-G) Flows from Bidirectional C-trees  ....  13
 6.3.4      Binding (C-*,C-*)  .....................................  14
 6.3.5      Default Tunnel Identifier for MP2MP LSPs  ..............  16
 6.4        Single Bidirectional P-Tunnel  .........................  16
 6.5        Other Methods of Instantiating an MS-PMSI  .............  17
 7          PIM over MS-PMSI  ......................................  17
 8          Extranets using PIM as the MVPN Control Plane  .........  19
 8.1        Default PMSI  ..........................................  20
 8.2        Red method  ............................................  20
 8.2.1      Control Plane RPF Check  ...............................  21
 8.2.2      Data Plane RPF Check  ..................................  21
 8.3        Blue method  ...........................................  21
 8.4        Binding Specific Extranet C-Flows to S-PMSIs  ..........  22
 8.5        Two VRFs on One PE  ....................................  22
 9          Supporting Anycast Sources with PIM Control Plane  .....  23
10          Hub and Spoke MVPNs  ...................................  24
10.1        Unicast Hub and Spoke VPNs  ............................  24
10.2        Multicast Hub and Spoke VPNs  ..........................  26
11          IANA Considerations  ...................................  28
12          Security Considerations  ...............................  29
13          Acknowledgments  .......................................  29
14          Authors' Addresses  ....................................  29
15          Normative References  ..................................  30
16          Informative References  ................................  30






1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


2. Introduction

   The documents [MVPN] and [MVPN-BGP] contain specifications for a
   large number of MVPN topics.  However, a number of important topics
   have been declared to be "out of scope" of those documents.  This
   document provides the specifications for some of those topics.  This
   document is not expected to be read as a stand-alone document;
   terminology from [MVPN] is used freely and knowledge of [MVPN] and
   [MVPN-BGP] is presupposed.

   Any necessary procedures not explicitly specified here are as in
   [MVPN] and/or [MVPN-BGP].


2.1. Topics Covered

   The topics covered in this document are the following:

     - The use of Wild Card Selectors in S-PMSI A-D routes and S-PMSI
       Join Messages.

       As specified in [MVPN] and [MVPN-BGP], one can use an S-PMSI A-D
       route or an S-PMSI Join Message to assign a particular
       C-multicast flow, identified as (C-S,C-G), to a particular
       S-PMSI.  The Wild Card Selectors specified in this document
       provide additional functionality:

         * One can send an S-PMSI A-D route or S-PMSI Join Message whose
           semantics are "assign all the traffic traveling the (C-*,C-G)
           tree to this S-PMSI".

         * One can send an S-PMSI A-D route or S-PMSI Join Message whose
           semantics are "use this S-PMSI as the default method for
           carrying any (C-S,C-G) or (C-*,C-G) traffic that isn't
           assigned to a different S-PMSI".  That is, it allows for the
           use of S-PMSIs as the default PMSIs for carrying data
           traffic.

     - S-PMSI Join Extensions for IPv6 and MPLS

     - MS-PMSI: A new kind of PMSI instantiated by a bidirectional
       P-tunnel (e.g., a Multipoint-to-Multipoint Label Switched Path
       (MP2MP LSP)) or a BIDIR-PIM tree with GRE encapsulation.

       A new kind of PMSI is defined, the MS-PMSI.  An S-PMSI is defined
       in [MVPN] to have a single PE as its transmitter.  An MS-PMSI is
       a set of S-PMSIs with the following property: If PE1 can transmit
       on the MS-PMSI, and PE2 can receive on the MS-PMSI, then PE2 can
       transmit on the MS-PMSI and PE1 can receive on the MS-PMSI.  The
       MS-PMSI thus has the "multidirectional" property of an MI-PMSI,
       but the "selective" property of an S-PMSI; transmissions on the
       MS-PMSI may not reach all PEs of a given VPN, but the set of PEs
       belonging to the MS-PMSI can use it to send and receive data
       to/from each other.

       The most efficient way to instantiate an MS-PMSI is with a single
       bidirectional P-tunnel.  This allows one to create P-tunnels
       which contain only a subset of the PEs attached to a given VPN,
       but which can be used by any member of that subset to transmit to
       the other members of the subset.  MS-PMSIs are advertised using
       S-PMSI A-D routes or S-PMSI Join messages.

     - PIM over MS-PMSI.

       [MVPN] specifies how to run PIM [PIM] as the multicast routing
       protocol of a particular MVPN, by running it over an MI-PMSI for
       that MVPN.  This document specifies how to run PIM over an
       MS-PMSI.  When PIM is run over an MI-PMSI,
       there may need to be P-tunnels that only carry PIM messages, but
       do not carry multicast data.  However, when PIM is run over an
       MS-PMSI, there is never any need to create a P-tunnel just for
       control messages; the only P-tunnels needed are those which carry
       multicast data.

     - MVPN Extranets with PIM Control Plane.

       In an MVPN "extranet", the transmitter of a multicast traffic
       flow is in a different VPN than the receivers.  Additional
       procedures are defined to determine how the traffic is associated
       with a particular MI-PMSI or MS-PMSI, and how the RPF checks are
       done.

     - Support for Anycast Sources, using a PIM Control Plane

     - Support for "Hub and Spoke" VPNs, using a PIM Control Plane


2.2. Terminology

   In the following, we will sometimes talk of a PE receiving traffic
   from a PMSI and then discarding it.  If PIM is being used as the
   multicast control protocol between PEs, this always implies that the
   discarded traffic will not be seen by PIM on the receiving PE.

   In the following, we will sometimes speak of an S-PMSI A-D route
   being "ignored".  When we say the route is "ignored", we do not mean
   that its normal BGP processing is not done, but that the route is
   not considered when determining which P-tunnel to use when sending
   multicast data, and that the MPLS label values it conveys are not
   used.  We will generally use "ignore" in quotes to indicate this
   meaning.


3. S-PMSI Join Extensions

3.1. mLDP P2MP P-Tunnels

   The S-PMSI Join message is defined in section 7.4.2.2 of [MVPN].  In
   this specification, we define the "type 2" and "type 3" S-PMSI Joins,
   which are used when the S-PMSI tunnel is a P2MP LSP created by mLDP,
   and the tunnel is to carry C-flows of, respectively, IPv4 or IPv6
   multicast traffic.

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     Type      |            Length             |   Reserved    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           C-Source
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.......
       |                           C-Group
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.......
       |                           FEC Element
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.......
       |    Padding
       +-+-+-+-+-+-+-+.......

   Type (8 bits):

     - 2 if C-Source and C-Group are IPv4 addresses,

     - 3 if C-Source and C-Group are IPv6 addresses.

   Length (16 bits): the total number of octets in the Type, Length,
   Reserved and Value fields combined, rounded up to the next multiple
   of 4, encoded as an unsigned binary integer.

   Reserved (8 bits):  This field SHOULD be zero when transmitted, and
   MUST be ignored when received.

   C-Source: address of the traffic source in the VPN

     - for type 2, a 32-bit IPv4 address

     - for type 3, a 128-bit IPv6 address

   C-Group: address of the traffic destination in the VPN

     - for type 2, a 32-bit IPv4 address

     - for type 3, a 128-bit IPv6 address

   FEC Element: this variable length field is a P2MP FEC element,
   encoded as a TLV as specified in [MLDP].

   Padding: 0-3 bytes, as needed for 32-bit alignment.  The padding
   bytes SHOULD be zero on transmission and MUST be ignored on
   reception.
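
   The field layout above can be sketched in code.  The following is a
   minimal illustration, not a reference implementation: the function
   name encode_spmsi_join is hypothetical, the FEC Element is treated as
   an opaque byte string already encoded as specified in [MLDP], and the
   Length field is assumed to cover the padded message.

```python
import ipaddress
import struct


def encode_spmsi_join(c_source: str, c_group: str, fec_element: bytes) -> bytes:
    """Encode a type 2 (IPv4) or type 3 (IPv6) S-PMSI Join (sketch)."""
    src = ipaddress.ip_address(c_source)
    grp = ipaddress.ip_address(c_group)
    if src.version != grp.version:
        raise ValueError("C-Source and C-Group must share an address family")
    msg_type = 2 if src.version == 4 else 3

    # Value = C-Source, C-Group, then the (opaque) P2MP FEC Element.
    value = src.packed + grp.packed + fec_element

    # Type (1 octet) + Length (2) + Reserved (1) + Value, rounded up to
    # the next multiple of 4 octets.
    unpadded = 4 + len(value)
    length = (unpadded + 3) & ~3
    padding = bytes(length - unpadded)  # 0-3 zero octets

    # Network byte order; Reserved is sent as zero.
    header = struct.pack("!BHB", msg_type, length, 0)
    return header + value + padding
```

   For example, IPv4 addresses with a 7-octet FEC Element give 19
   unpadded octets, so the message carries one padding octet and a
   Length of 20.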


3.2. IPv6 (S,G) with GRE P-tunnels

   [MVPN] defines the S-PMSI Join type (type 1) used when assigning IPv4
   (S,G) to a GRE P-tunnel.  When assigning IPv6 (S,G) to a GRE
   P-tunnel, S-PMSI Join type 4 is used, and the C-Source and C-Group
   are IPv6 addresses.


3.3. Multiple S-PMSI Joins per Datagram

   A single UDP datagram MAY carry multiple S-PMSI Join Messages, as
   many as can fit entirely within it.  If there are multiple S-PMSI
   Joins in a UDP datagram, they MUST be of the same S-PMSI Join type.
   The end of the last S-PMSI Join (as determined by the S-PMSI Join
   length field) MUST coincide with the end of the UDP datagram, as
   determined by the UDP length field.  When processing a received UDP
   datagram that contains one or more S-PMSI Joins, a router MUST be
   able to process all the S-PMSI Joins that fit into the datagram.
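
   A receiver-side sketch of these rules follows.  It assumes that each
   message begins with the 8-bit Type, 16-bit Length, 8-bit Reserved
   header of section 3.1 and that Length covers the whole (padded)
   message.  The function name split_spmsi_joins is hypothetical, and
   raising an exception on a malformed datagram is a choice made for
   this illustration; this document does not specify the error
   handling.

```python
import struct
from typing import List


def split_spmsi_joins(payload: bytes) -> List[bytes]:
    """Split a UDP payload into the S-PMSI Joins it carries (sketch)."""
    joins: List[bytes] = []
    offset = 0
    first_type = None
    while offset < len(payload):
        if len(payload) - offset < 4:
            raise ValueError("truncated S-PMSI Join header")
        msg_type, length, _reserved = struct.unpack_from("!BHB", payload, offset)
        # The last Join must end exactly where the datagram ends.
        if length < 4 or offset + length > len(payload):
            raise ValueError("S-PMSI Join does not fit within the datagram")
        # All Joins in one datagram must be of the same type.
        if first_type is None:
            first_type = msg_type
        elif msg_type != first_type:
            raise ValueError("mixed S-PMSI Join types in one datagram")
        joins.append(payload[offset:offset + length])
        offset += length
    return joins
```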



4. Wild Cards: S-PMSI A-D Routes & S-PMSI Join Messages

   As specified in [MVPN] and [MVPN-BGP], one can use an S-PMSI A-D
   route or an S-PMSI Join Message to assign a particular C-multicast
   flow, identified as (C-S,C-G), to a particular S-PMSI.

   However, [MVPN-BGP] does not specify any means of encoding wild cards
   ("*", in multicast terminology) in the Source or Group fields.
   Similarly, [MVPN] does not specify any means of encoding wild cards
   in the C-Source or C-Group fields of the S-PMSI Join messages.

   This omission makes it difficult to provide optimized multicast
   routing for customers that use ASM ("Any Source Multicast")
   multicasts, in which flows may be traveling along "shared" C-trees.
   We use the term "shared C-trees" to refer both to the
   unidirectional "RPT trees" used in sparse mode, and to the
   bidirectional trees used in BIDIR-PIM [BIDIR-PIM].

   When a customer is using ASM multicast, it is useful to be able to
   select the set of flows that are traveling along a shared C-tree, and
   to bind that entire set of flows to a specified P-tunnel.
   Conceptually, we would like to have a way to express that we want
   (C-*,C-G) traffic bound to the specified P-tunnel.

   Another useful feature would be a way of using an S-PMSI A-D route to
   say "by default, all multicast traffic (within a given VPN) that has
   not been bound to any other P-tunnel is bound to the specified
   P-tunnel".  To do this, we need to have a way to express that we want
   (C-*, C-*) traffic bound to the P-tunnel.

   This specification therefore establishes the following conventions:

     - In an S-PMSI A-D route, the use of a zero length source or group
       field is to be interpreted as specifying a wild card value for
       the respective field. A single wild card represents all Multicast
       Source or Multicast Group values of all address families; there
       is no need to use a different wild card for IPv4 addresses than
       is used for IPv6 addresses.

     - In an S-PMSI Join message, the use of an all-zero C-Source or
       C-Group field is to be interpreted as specifying a wild card
       value for the respective field.  A wild card represents all
       C-Source or C-group values of a particular address family (IPv4
       or IPv6), as specified by the S-PMSI Join message type.

   When wildcards are used, the following two combinations MUST be
   supported:

     - (C-*,C-G): Source Wildcard, Group specified.

     - (C-*,C-*): Source Wildcard, Group Wildcard.

   This specification does not provide support for the combination of a
   specified source and a group wildcard.  A received S-PMSI A-D route
   or S-PMSI Join message specifying this combination will be "ignored".
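
   The two encoding conventions and the supported combinations can be
   summarized in a short sketch.  The helper names below (adroute_field,
   join_field, classify) and the use of None to stand for "*" are
   assumptions made for illustration only.

```python
from typing import Optional

WILDCARD = None  # stands for "*" in this sketch


def adroute_field(field: bytes) -> Optional[bytes]:
    """S-PMSI A-D route: a zero-length source or group field is a wild card."""
    return WILDCARD if len(field) == 0 else field


def join_field(field: bytes) -> Optional[bytes]:
    """S-PMSI Join: an all-zero C-Source or C-Group field is a wild card."""
    return WILDCARD if not any(field) else field


def classify(source: Optional[bytes], group: Optional[bytes]) -> str:
    """Name the selector, or report that the advertisement is "ignored"."""
    if source is WILDCARD and group is WILDCARD:
        return "(C-*,C-*)"
    if source is WILDCARD:
        return "(C-*,C-G)"
    if group is WILDCARD:
        # Specified source with a group wild card is not supported.
        return "ignored"
    return "(C-S,C-G)"
```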


5. Binding (C-*,C-G) to a Unidirectional P-Tunnel

   Consider an S-PMSI A-D Route whose NLRI specifies (C-*,C-G), and that
   contains a PTA that specifies a unidirectional P-tunnel.  The
   P-tunnel may be a P2MP LSP, or it may be a unidirectional PIM-created
   multicast distribution tree specified either as P-(*,G) or as
   P-(S,G).

   Alternatively, consider an S-PMSI Join message whose C-Source and
   C-Group fields specify (C-*,C-G), and that specifies a unidirectional
   P-tunnel (either a P2MP LSP or a unidirectional PIM-created multicast
   distribution tree).

   If C-G is known to be an SSM group address, the S-PMSI A-D route or
   S-PMSI Join message is "ignored".

   Otherwise, the semantics are the following: the originator of the
   S-PMSI A-D route or S-PMSI Join message is saying that if it
   receives, over a VRF interface, any traffic that is traveling on the
   (C-*,C-G) shared tree, it will transmit such traffic on the specified
   P-tunnel.  Any PE interested in receiving such traffic from the
   originator MUST join that P-tunnel.

   (A PE receiving (C-S,C-G) multicast traffic can always tell whether
   that traffic is traveling on a (C-*,C-G) shared tree by consulting
   its C-PIM state.  Similarly, each PE in an MVPN, by virtue of running
   C-PIM, knows whether it is interested in receiving traffic from the
   (C-*,C-G) tree.)
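
   As an illustration of the SSM check, the sketch below treats C-G as
   "known to be an SSM group address" when it falls in the default SSM
   ranges (232/8 for IPv4, FF3x::/32 for IPv6).  Relying only on the
   default ranges is an assumption of this example; a PE may learn of
   SSM ranges by other means, such as configuration.  The function
   names are hypothetical.

```python
import ipaddress

# Default SSM ranges: 232.0.0.0/8 and FF3x::/32 for every scope x.
# Using only these defaults is an assumption of this sketch.
SSM_RANGES = [ipaddress.ip_network("232.0.0.0/8")] + [
    ipaddress.ip_network(f"ff3{scope:x}::/32") for scope in range(16)
]


def is_ssm_group(c_group: str) -> bool:
    """True if c_group lies in one of the default SSM ranges."""
    addr = ipaddress.ip_address(c_group)
    return any(addr.version == net.version and addr in net
               for net in SSM_RANGES)


def wildcard_source_binding(c_group: str) -> str:
    """Treatment of a (C-*,C-G) binding to a unidirectional P-tunnel."""
    # A (C-*,C-G) binding for an SSM group is "ignored"; otherwise it
    # applies to traffic traveling on the (C-*,C-G) shared tree.
    return "ignored" if is_ssm_group(c_group) else "honored"
```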


6. S-PMSI Procedures for Using Bidirectional P-Tunnels

6.1. Bidirectional P-Tunnels

   This document specifies the use of two kinds of bidirectional
   P-tunnels: (a) MP2MP LSPs created using mLDP, and (b) BIDIR-PIM
   P-tunnels using GRE encapsulation.

   Whenever n PEs belong to a bidirectional P-tunnel, exactly one of
   them is considered to be the "root" of the P-tunnel.  How the root is
   identified depends on the particular technology of the P-tunnel.  A
   bidirectional P-tunnel is advertised only by its root.


6.1.1. MP2MP LSPs

   If the P-tunnel is an MP2MP LSP, the root is explicitly identified in
   the mLDP messages used to construct and join the P-tunnel [MLDP].
   That is, in order for a PE to join an MP2MP LSP, the PE must know the
   root of the LSP.

   An MP2MP LSP may be advertised in the PTA of an S-PMSI A-D route, or
   in the FEC Element field of an S-PMSI Join message.

   In either case, the MP2MP LSP is identified by a "FEC element" that
   contains the IP address of the "root", followed by an "opaque value"
   that identifies the MP2MP LSP uniquely in the context of the root's
   IP address.  This opaque value may be configured or autogenerated,
   and within an MVPN, there is no need for different roots to use the
   same opaque value.  When PIM is used as the PE-PE control protocol,
   the root IP address MUST be the same IP address the root uses for
   sending and receiving PIM control messages.

   Whether the MP2MP LSP is advertised in the PTA of an S-PMSI A-D
   route, or in the FEC element field of an S-PMSI Join message, the
   advertisement MUST be originated by the PE that is the root (as
   specified in the "FEC element") of the MP2MP LSP.  Any such
   advertisement that is not originated by the root MUST be "ignored".
   If the "ignored" advertisement is an S-PMSI A-D route, any MPLS label
   specified in its PTA MUST be ignored, and any PE Distinguisher Labels
   specified in the route MUST be ignored.


6.1.2. BIDIR-PIM

   Each BIDIR-PIM tree is identified by a unique P-group address.  The
   P-group address for a BIDIR-PIM P-tunnel must be configured at the PE
   that is to be the root of the P-tunnel. Associated with each such
   P-group address is a "Rendezvous Point Address" (RPA).  Every PE that
   needs to join a particular BIDIR-PIM P-tunnel must be able to
   determine the RPA that corresponds to the P-tunnel's P-group address.
   This may be known through configuration, or by some automated means
   of RPA discovery.  The RPA for a given P-group MUST uniquely identify
   the PE that is to be the root of the BIDIR-PIM tunnel.

   A BIDIR-PIM P-tunnel may be advertised in the PTA of an S-PMSI A-D
   route, or in the P-group field of an S-PMSI Join message.  In either
   case, the advertisement MUST be originated by the root of the
   BIDIR-PIM tunnel.  Any advertisement that is not originated by the
   root MUST be "ignored".  If the "ignored" advertisement is an S-PMSI
   A-D route, any MPLS label specified in its PTA MUST be ignored, and
   any PE Distinguisher Labels specified in the route MUST be ignored.
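
   The origination rule shared by sections 6.1.1 and 6.1.2 can be
   sketched as a filter over received advertisements.  The record type
   and its field names are hypothetical.  The "root" field stands for
   the PE identified as root, whether taken from the FEC element of an
   MP2MP LSP or determined from the RPA of a BIDIR-PIM P-tunnel.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BidirAdvertisement:
    originator: str  # PE that originated the S-PMSI A-D route or Join
    root: str        # PE identified as the root of the P-tunnel
    tunnel_id: str   # illustrative handle for the P-tunnel


def usable_tunnels(advs: List[BidirAdvertisement]) -> List[str]:
    """Return the P-tunnels that may be considered for tunnel selection.

    An advertisement not originated by the root is "ignored": it is left
    out of P-tunnel selection and any labels it conveys are not used.
    """
    return [adv.tunnel_id for adv in advs if adv.originator == adv.root]
```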


6.2. General Procedures: MS-PMSIs

   According to the definition of S-PMSI in [MVPN], only a single PE can
   transmit onto a given S-PMSI.  Note though that a single
   bidirectional P-tunnel containing n PEs can be used to instantiate n
   S-PMSIs, each of which has a different PE as its transmitter -- each
   PE can use the tunnel to transmit data to the other n-1 PEs.
   Therefore when a bidirectional P-tunnel is specified in an S-PMSI
   Join message or in the PTA of an S-PMSI A-D route, we consider the
   S-PMSI Join message or S-PMSI A-D route to be implicitly advertising
   a number of S-PMSIs: one for the PE that is advertising the P-tunnel,
   and one for each other PE that joins the P-tunnel.  We will call the
   latter S-PMSIs the "implicitly advertised reverse S-PMSIs" (or just
   "reverse S-PMSIs").

   When a bidirectional P-tunnel is specified in an S-PMSI Join message
   or in the PTA of an S-PMSI A-D route, we will use the term "MS-PMSI"
   to refer to the set of S-PMSIs (including the reverse S-PMSIs) that
   are thereby (explicitly or implicitly) advertised.
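
   To make the counting concrete, the following sketch enumerates the
   S-PMSIs that make up an MS-PMSI: one for the advertising PE, plus one
   reverse S-PMSI for each other PE that joins the P-tunnel.  The
   function name and the use of plain strings as PE identifiers are
   assumptions for illustration.

```python
from typing import Dict, Iterable, List


def implied_s_pmsis(advertiser: str, joined: Iterable[str]) -> Dict[str, List[str]]:
    """Map each transmitter on the MS-PMSI to the PEs it can reach."""
    members = [advertiser]
    for pe in joined:
        if pe not in members:
            members.append(pe)
    # With n member PEs there are n S-PMSIs, each reaching the other n-1.
    return {tx: [rx for rx in members if rx != tx] for tx in members}
```

   With PE1 as the advertiser and PE2 and PE3 joined, the result has
   three entries, each listing the two other PEs as receivers.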

   If the PTA in the S-PMSI A-D route contains an MPLS label, then any
   PE that, as a result of having received that route, transmits a
   packet onto the MS-PMSI will first push that label onto the packet's
   label stack.  The interpretation of that label when the packet is
   received is as specified in [MVPN] and [MVPN-BGP].  The use of this
   label allows multiple VPNs to share a single bidirectional P-tunnel.

   When MS-PMSIs are used to provide MVPN support (as detailed in
   subsequent sections), it is in general necessary to have more than
   one MS-PMSI per MVPN.  There are two methods for using bidirectional
   P-tunnels to instantiate MS-PMSIs.  In one method, a single
   bidirectional P-tunnel is used to instantiate all the MS-PMSIs of the
   MVPN.  In the other method, multiple bidirectional P-tunnels are
   used.  These two methods are considered separately.  Which method is
   in use is a matter of provisioning.



6.3. Use of Multiple Bidirectional P-tunnels

   In this method, each PE attached to a given MVPN is potentially the
   root of a distinct bidirectional P-tunnel.  Each such PE may
   advertise an MS-PMSI for which the originating PE is the root.  We
   will sometimes refer to each such MS-PMSI as a "partition", and to
   the PE that advertised it as the root of the MS-PMSI or the root of
   the partition.  This notion is useful both in supporting BIDIR-PIM
   C-multicast traffic and in running PIM over an MS-PMSI.  Details are
   given in later sections.

   The procedures that follow presuppose that when a packet is received
   from a bidirectional P-tunnel, it can be associated with one or more
   VRFs, and processed in the context of that VRF or VRFs.  If the
   bidirectional P-tunnel was advertised in an S-PMSI Join message or in
   the PTA of an S-PMSI A-D route that did not specify an MPLS label,
   then all packets received from the P-tunnel are associated with the
   same set of VRFs.  If the bidirectional P-tunnel was advertised in
   the PTA of an S-PMSI A-D route, and the PTA does specify an MPLS
   label, then received packets will carry a label that must be
   processed in order to determine the context.  If the P-tunnel is a
   MP2MP LSP, this label appears below the label that identifies the LSP
   itself.


6.3.1. Binding (C-S,C-G)

   When PE1 advertises an S-PMSI A-D route that binds a (C-S,C-G) flow
   to a bidirectional P-tunnel, or when PE1 sends an S-PMSI Join message
   that binds a (C-S,C-G) flow to a bidirectional P-tunnel, the
   semantics are as follows.  PE1 is stating that any (C-S,C-G) traffic
   that it needs to transmit to other PEs will be transmitted on the
   specified P-tunnel.  Any other PE that needs to receive such traffic
   from PE1 (i.e., any other PE that needs to receive (C-S,C-G) traffic
   and which has selected PE1 as the upstream PE for C-S) MUST join that
   P-tunnel.

   If a PE has joined the P-tunnel, but does not need to receive the
   (C-S,C-G) traffic, or if it needs to receive (C-S,C-G) traffic but
   has not selected PE1 as the upstream PE for C-S, then the PE MUST
   discard any such received traffic.  Please note that if PIM is being
   used as the multicast control protocol, any traffic that is discarded
   will not be seen by PIM, and hence will not cause the generation of
   Assert messages.


6.3.2. Binding (C-*,C-G) Flows from Unidirectional C-trees

   When PE1 advertises an S-PMSI A-D route or sends an S-PMSI Join
   message that binds (C-*,C-G) to a bidirectional P-tunnel, where C-G
   is not an SSM group, and the (C-*,C-G) traffic is traveling on a
   unidirectional shared C-tree, the semantics are as follows.  PE1 is
   stating that any traffic to C-G that is traveling the shared C-tree
   and which PE1 needs to transmit to other PEs will be transmitted on
   the specified P-tunnel.  Any other PE that needs to receive such
   traffic from PE1 (i.e., any other PE that needs to receive (C-*,C-G)
   traffic and which has selected PE1 as the upstream PE for the C-RP
   corresponding to the C-G group) MUST join that P-tunnel.

   If a PE has joined the P-tunnel, but does not need to receive the
   (C-*,C-G) traffic, or if it needs to receive (C-*,C-G) traffic but
   has not selected PE1 as the upstream PE for the C-RP that corresponds
   to C-G, then the PE MUST discard any such received traffic.  Please
   note that if PIM is being used as the multicast control protocol,
   traffic that is discarded will not be seen by PIM.


6.3.3. Binding (C-*,C-G) Flows from Bidirectional C-trees

   When PE1 advertises an S-PMSI A-D route or sends an S-PMSI Join
   message that binds (C-*,C-G) to a bidirectional P-tunnel, where C-G
   is not an SSM group, and the (C-*,C-G) traffic is traveling on a
   bidirectional shared C-tree, the semantics are as follows:

     - PE1 is stating that any traffic to C-G that it (PE1) needs to
       send downstream will be sent on the specified P-tunnel

     - Any other PE that is interested in receiving (C-*,C-G) traffic
       MUST join the specified P-tunnel

     - Any other PE, say PE2, that (a) has traffic to C-G to send
       upstream and (b) has selected PE1 as its upstream PE for the
       C-RPA corresponding to C-G, MUST join the specified P-tunnel, and
       MUST send such traffic on the specified P-tunnel.  (I.e., such
       traffic is bound to the MS-PMSI instantiated by the bidirectional
       P-tunnel that is rooted at PE1.)

     - If a PE, say PE3, has joined the specified P-tunnel, but does not
       need to receive the (C-*,C-G) traffic, or has not selected PE1 as
       the upstream PE for the C-RPA corresponding to C-G, then PE3 MUST
       NOT send any (C-*,C-G) traffic on that P-tunnel, and MUST discard
       any (C-*,C-G) traffic it received on that P-tunnel.

   These procedures implement, for S-PMSIs, the "partitioning" scheme
   described in section 11.2 of [MVPN], with each MS-PMSI being a
   "partition".

   The specification given so far requires an S-PMSI A-D route or an
   S-PMSI Join message to be sent for each (C-*,C-G) that is using a
   bidirectional C-tree.  A more efficient method is given in the next
   section.


6.3.4. Binding (C-*,C-*)

   When PE1 advertises an S-PMSI A-D route or sends an S-PMSI Join
   message that binds (C-*,C-*) to a specified bidirectional P-tunnel of
   which PE1 is the root, the semantics are that the bidirectional
   P-tunnel is to be used to carry C-multicast traffic in the following
   sets of cases:

      1. If PE1 has (C-S,C-G) traffic that is traveling on a
         source-specific C-tree, and PE1 needs to transmit that data to
         one or more other PEs, and PE1 has not bound (C-S,C-G) or
         (C-*,C-G) to a different P-tunnel, then the (C-S,C-G) traffic
         is sent by PE1 on the specified bidirectional P-tunnel.

      2. If PE1 has (C-*,C-G) traffic that is traveling on a
         unidirectional shared C-tree, and PE1 needs to transmit that
         data to one or more other PEs, and PE1 has not bound (C-*,C-G)
         to a different P-tunnel, then the (C-*,C-G) traffic is sent by
         PE1 on the specified bidirectional P-tunnel.

      3. If PE1 has (C-*,C-G) traffic that is traveling on a
         bidirectional shared C-tree, and PE1 needs to transmit that
         data to one or more other PEs, and PE1 has not bound (C-*,C-G)
         to a different P-tunnel, then the (C-*,C-G) traffic is sent by
         PE1 on the specified bidirectional P-tunnel.

      4. Consider some other PE, PE2, that has received the S-PMSI A-D
         route or S-PMSI Join message from PE1.  If PE2 has (C-*,C-G)
         traffic that is traveling on a bidirectional shared C-tree, and
         PE2 needs to transmit that traffic UPSTREAM, and PE2 has
         selected PE1 as the upstream PE for the C-RPA corresponding to
         C-G, and PE1 has not bound (C-*,C-G) to any other P-tunnel,
         then the (C-*,C-G) traffic is sent by PE2 on the specified
         bidirectional P-tunnel.

      5. If a PE receives traffic from a particular MS-PMSI, and the
         traffic is traveling a unidirectional (C-*,C-G) or (C-S,C-G)
         tree, and the root of the MS-PMSI is not the PE's selected
         upstream PE for the (C-*,C-G) or (C-S,C-G), the PE MUST discard
         the traffic.

      6. If a PE receives traffic from a particular MS-PMSI, and the
         traffic is traveling a bidirectional (C-*,C-G) tree, and the
         PE's selected upstream PE for the C-RPA corresponding to C-G is
         not the root of the MS-PMSI, then the PE MUST discard the
         traffic.
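
   As an illustration only, the cases above can be sketched as two
   small decision functions.  Nothing in this sketch is normative.  The
   names Flow, tunnel_for_transmit and accept_from_ms_pmsi, the use of
   strings to name PEs and P-tunnels, and the choice to try a (C-S,C-G)
   binding before a (C-*,C-G) binding are all assumptions made for the
   example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

WILDCARD = "*"
Selector = Tuple[str, str]  # (C-S or "*", C-G or "*")


@dataclass(frozen=True)
class Flow:
    source: str                  # C-S, or "*" for a shared C-tree
    group: str                   # C-G
    bidirectional: bool = False  # True if the C-tree is bidirectional


def tunnel_for_transmit(flow: Flow,
                        bindings: Dict[Selector, str]) -> Optional[str]:
    """Cases 1-3: pick the P-tunnel on which the root PE sends a flow.

    `bindings` maps selectors to P-tunnel names.  A more specific
    binding takes the flow away from the (C-*,C-*) tunnel.  Trying
    (C-S,C-G) before (C-*,C-G) is an assumption of this sketch.
    """
    for selector in ((flow.source, flow.group), (WILDCARD, flow.group)):
        if selector in bindings:
            return bindings[selector]
    # No more specific binding: fall back to the double wildcard.
    return bindings.get((WILDCARD, WILDCARD))


def accept_from_ms_pmsi(ms_pmsi_root: str,
                        selected_upstream_pe: str) -> bool:
    """Cases 5-6: keep traffic only if it arrived on the MS-PMSI
    rooted at the receiver's selected upstream PE.

    For a unidirectional C-tree, selected_upstream_pe is the upstream
    PE chosen for (C-*,C-G) or (C-S,C-G).  For a bidirectional C-tree
    it is the upstream PE chosen for the C-RPA corresponding to C-G.
    """
    return ms_pmsi_root == selected_upstream_pe


bindings = {("*", "*"): "default-tunnel",
            ("192.0.2.1", "232.1.1.1"): "spmsi-tunnel"}
print(tunnel_for_transmit(Flow("192.0.2.9", "232.1.1.1"), bindings))
```

   In this sketch, a flow with no more specific binding falls through
   to the tunnel bound to the double wildcard, and received traffic is
   discarded whenever the root of the MS-PMSI is not the PE that the
   receiver selected.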

   With respect to traffic traveling a bidirectional C-tree, these
   procedures implement, for S-PMSIs, the "partitioning" scheme
   described in section 11.2 of [MVPN], without the need to send an
   S-PMSI A-D route for each (C-*,C-G) that is using a bidirectional
   C-tree.  Each PE becomes the root of an MS-PMSI, and binds the double
   wildcard selector to it.  The MS-PMSIs serve as the "partitions".
   The MS-PMSI rooted at PE1 becomes the default MS-PMSI for all traffic
   that PE1 needs to send downstream to other PEs.  It also becomes the
   default MS-PMSI for all traffic that other PEs need to send
   upstream, as long as those other PEs have selected PE1 as the
   upstream PE for the C-RPA corresponding to that traffic.

   Note that other PEs SHOULD NOT join the specified bidirectional
   P-tunnel unless they have a need to send or receive data over it.  A
   PE knows when it needs to receive data by virtue of having certain
   multicast state in its C-PIM instance.  With regard to multicast data
   traveling on a bidirectional (C-*,C-G) tree, a PE may not know
   whether it has to send data until such data actually arrives over a
   VRF interface; the PE may be on a "sender-only" branch.  However, the
   PE in this case would have to know, through provisioning or some
   automatic procedure such as "Bootstrap Routing Protocol for PIM"
   (BSR) [BSR], the set of C-RPAs that are being used to support
   (C-*,C-G) traffic.  For each C-RPA, the PE could join the
   bidirectional P-tunnel advertised by its selected upstream PE for
   that C-RPA.  Alternatively the PE could defer joining the P-tunnel
   until it actually has data to send.


6.3.5. Default Tunnel Identifier for MP2MP LSPs

   To identify an MP2MP LSP, the S-PMSI Join message or the PMSI Tunnel
   Attribute of an S-PMSI A-D route contains an MP2MP FEC Element [mLDP]
   in its "Tunnel Identifier" field.  This contains the IP address of
   the PE at the root of the LSP, as well as an "opaque value" which is
   unique at that PE.  Each PMSI Tunnel is associated at its root PE
   with a particular VRF, and each VRF in a given PE has a unique
   default RD.  Therefore one way to uniquely identify an MP2MP LSP is
   to use an MP2MP FEC Element whose Opaque Value length is 8 and whose
   Opaque Value is the default RD of the associated VRF.  This
   method of assigning a Tunnel Identifier MUST be the default method
   for any PMSI Tunnel which is bound to (C-*,C-*) traffic.  Other
   methods MAY be available as well.

   Note that if aggregation of multiple VPNs onto a single default
   MS-PMSI is not being supported, this method of assigning the Tunnel
   Identifier allows each PE to algorithmically determine the Tunnel
   Identifier that has been assigned by a particular upstream PE.  A PE
   decides to join a particular MS-PMSI because it has chosen that
   MS-PMSI's root as the upstream PE for a particular VPN-IP address.
   The RD of that VPN-IP address is the contents of the Opaque Value
   field of the corresponding MS-PMSI.
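
   The following sketch shows how a PE might derive the default Tunnel
   Identifier algorithmically.  It assumes a type 0 RD as defined in
   [RFC4364] (2-octet Type field, 2-octet AS number, 4-octet assigned
   number).  The function names and the (root address, opaque value)
   tuple are hypothetical.  The tuple only stands in for the MP2MP FEC
   Element and is not its wire encoding.

```python
import struct
from typing import Tuple


def encode_rd_type0(asn: int, assigned: int) -> bytes:
    """Encode a type 0 RD: Type=0 (2 octets), AS number (2 octets),
    assigned number (4 octets), all in network byte order."""
    return struct.pack("!HHI", 0, asn, assigned)


def default_tunnel_identifier(root_pe: str,
                              default_rd: bytes) -> Tuple[str, bytes]:
    """Return (root address, opaque value) for the default MS-PMSI.

    The opaque value is the 8-octet default RD of the VRF at the root.
    """
    if len(default_rd) != 8:
        raise ValueError("an RD is always 8 octets long")
    return (root_pe, default_rd)


rd = encode_rd_type0(65000, 100)
print(default_tunnel_identifier("192.0.2.1", rd))
```

   A downstream PE that has selected 192.0.2.1 as the upstream PE for a
   VPN-IP address can compute the same tuple from the RD of that
   address, without waiting for further signaling.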


6.4. Single Bidirectional P-Tunnel

   When a single bidirectional P-tunnel is used for a given VPN (rather
   than multiple bidirectional P-tunnels), the PE at the root of the
   P-tunnel MUST advertise it in the PTA of an S-PMSI A-D route.  The
   PE that is at the root of the P-tunnel MUST include a "PE
   Distinguisher Labels" attribute either in its I-PMSI A-D route or in
   the S-PMSI A-D route containing the PTA that identifies the P-tunnel.
   The PE MUST use the attribute to bind an upstream-assigned MPLS label
   to the IP address of each other PE that attaches to the same MVPN (as
   determined by the RTs of the A-D route).  That is, the PE at the root
   of the P-tunnel assigns a distinct label to each of the other PEs
   attaching to the same MVPN.  This set of PEs is learned via the
   reception of I-PMSI A-D routes.

   The procedures for using a single bidirectional P-tunnel differ from
   the procedures for using multiple bidirectional P-tunnels only in the
   following way.  Let PE1 be the root of the P-tunnel.  When a packet
   that is traveling on a unidirectional C-tree is transmitted on the
   P-tunnel by a particular PE, say PE2, PE2 must push on the packet's
   label stack the label that PE1 assigned to PE2 via the procedure
   above.  When a packet that is traveling on a bidirectional C-tree is
   transmitted on the P-tunnel by PE2, PE2 must push on the packet's
   label stack the label that PE1 assigned to PE3, where PE3 is the
   upstream PE that PE2 has selected for the C-RPA corresponding to C-G.

   For unidirectional flows, this allows the transmitter to be
   identified, and for bidirectional flows, this allows the partition to
   be identified.  Packets received from the wrong upstream PE or from
   the wrong partition MUST be discarded.  (In effect, this is a case of
   tunnel hierarchy, where the PE Distinguisher Labels represent a set
   of MP2MP LSPs, each of which instantiates an MS-PMSI, but those LSPs
   are all tunneled through a single bidirectional P-tunnel.)

   If the PTA identifying the bidirectional P-tunnel contains an MPLS
   label, then that label shall appear in the label stack immediately
   preceding the label specified in the PE Distinguisher Labels
   attribute.
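
   The label selection described in this section can be sketched as
   follows.  This is an illustration under stated assumptions: the
   function names are hypothetical, PEs are named by strings, and
   pe_labels stands for the bindings that the root PE advertised in its
   PE Distinguisher Labels attribute.

```python
from typing import Dict, Optional


def label_to_push(sender: str, bidirectional: bool,
                  pe_labels: Dict[str, int],
                  upstream_pe_for_rpa: Optional[str] = None) -> int:
    """Label that `sender` pushes before transmitting on the tunnel.

    Unidirectional C-tree: the label the root assigned to the sender.
    Bidirectional C-tree: the label the root assigned to the upstream
    PE that the sender selected for the C-RPA of the group.
    """
    if bidirectional:
        if upstream_pe_for_rpa is None:
            raise ValueError("bidirectional flows need an upstream PE")
        return pe_labels[upstream_pe_for_rpa]
    return pe_labels[sender]


def accept(label: int, pe_labels: Dict[str, int],
           expected_pe: str) -> bool:
    """Receiver check: keep the packet only if its label is the one
    bound to the PE the receiver expects (the selected upstream PE for
    a unidirectional flow, or the partition's PE for a bidirectional
    flow)."""
    return pe_labels.get(expected_pe) == label


labels = {"PE2": 1002, "PE3": 1003}
print(label_to_push("PE2", True, labels, upstream_pe_for_rpa="PE3"))
```

   With these example bindings, PE2 pushes 1002 for unidirectional
   traffic and 1003 for bidirectional traffic whose C-RPA is reached
   through PE3.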


6.5. Other Methods of Instantiating an MS-PMSI

   Strictly speaking, what is required to instantiate an MS-PMSI is not
   that the P-tunnels be bidirectional, but that they provide an
   any-to-any multicast service for some subset of the PEs in the MVPN.
   One could, for instance, instantiate an MS-PMSI as a PIM sparse mode
   group. In this case, the PTA of the S-PMSI A-D routes would identify
   a "PIM-SM Tree".  Every PE would have to advertise a PIM-SM tree with
   a distinct group address, and the PE advertising a given
   group address would be considered to be the "root" of the
   corresponding MS-PMSI.

   Generally speaking, this is not an efficient method of instantiating
   an MS-PMSI.  However, it can be useful in certain circumstances, such
   as the "hub and spoke" MVPN discussed in section 10.2.


7. PIM over MS-PMSI

   [MVPN] provides two alternative means of distributing C-multicast
   routing information:  PIM or BGP.  Procedures for running PIM over
   MI-PMSI are specified in that document.  However, a number of
   efficiencies can be obtained by running PIM instead over an MS-PMSI,
   instantiated as a set of MP2MP LSPs.  The procedures for this are as
   follows.

   Each PE that attaches to a given MVPN MUST originate an Intra-AS
   I-PMSI A-D route that does NOT contain a PTA.  Each such PE MUST also
   originate an S-PMSI A-D route whose PTA is a bidirectional P-tunnel
   rooted at the originating PE.  This S-PMSI A-D route MUST bind the
   LSP to the "double wildcard" (*,*).  The use of these bidirectional
   P-tunnels for sending and receiving data traffic is as specified in
   the previous section.  In effect, each PE in the MVPN has advertised
   an MS-PMSI for which it is the root.

   If PE1 needs to direct a PIM Join/Prune message to PE2, PE1 MUST join
   PE2's MS-PMSI by joining the P-tunnel advertised in PE2's
   corresponding S-PMSI A-D route.  The PIM J/P messages MUST be sent
   over that MS-PMSI.

   If PE1 does not need to direct a PIM Join/Prune message to PE2, then
   PE1 SHOULD NOT join the P-tunnel advertised in PE2's S-PMSI A-D
   route, as PE1 will not be receiving any multicast data on that LSP.

   Any PE that sends a PIM Join/Prune message on a given P-tunnel is
   automatically considered to be a PIM adjacency of every PE that
   receives the message on that P-tunnel.  This implies that any PE
   receiving the LSP MUST accept a PIM Join/Prune message on that
   P-tunnel from any other PE, even if the PE that transmitted the
   Join/Prune messages has not previously transmitted a PIM Hello.  That
   is, the "adjacency relationship" does not depend on the reception of
   PIM Hellos.
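
   A minimal sketch of this adjacency rule is given below.  The class
   and method names are hypothetical, and PEs and P-tunnels are named
   by strings.  The only point illustrated is that a Join/Prune is
   accepted, and its sender recorded as an adjacency, without any
   dependence on a previously received Hello.

```python
from typing import Dict, Set


class TunnelAdjacencies:
    """Per-P-tunnel PIM adjacencies learned without Hellos."""

    def __init__(self) -> None:
        self._neighbors: Dict[str, Set[str]] = {}

    def on_join_prune(self, p_tunnel: str, sender: str) -> bool:
        """Record the sender as an adjacency on the P-tunnel.

        Returns True because the message is accepted whether or not a
        Hello was seen from the sender beforehand.
        """
        self._neighbors.setdefault(p_tunnel, set()).add(sender)
        return True

    def neighbors(self, p_tunnel: str) -> Set[str]:
        """Return a copy of the adjacencies known on the P-tunnel."""
        return set(self._neighbors.get(p_tunnel, set()))


adj = TunnelAdjacencies()
adj.on_join_prune("lsp-rooted-at-PE2", "PE1")
print(adj.neighbors("lsp-rooted-at-PE2"))
```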

   PIM Hellos may still be useful for OAM purposes.  Any PIM Hellos that
   PE1 sends MUST be sent on the P-tunnel advertised in PE1's S-PMSI A-D
   route above.

   Standard PIM procedures are used, except for:

     - The above change in the adjacency maintenance procedures.

     - Changes in the "RPF determination" or "RPF checking" procedures
       as may be defined in [MVPN] or in subsequent sections of this
       document (such as section 8.2).

   Note that the data handling procedures of the previous section will
   prevent PIM from ever seeing any packets that come from the wrong
   transmitter or that are in the wrong partition; when such packets are
   received they are discarded, rather than being passed to PIM's state
   machinery.  As a result, such packets do not cause Asserts to be
   generated.  Other standard PIM procedures, such as Join Suppression
   and Prune Override may come into play, however.

   By running PIM over MS-PMSI instead of over MI-PMSI, one completely
   avoids the need to have PEs join P-tunnels that would carry only
   control messages.  A PE need not ever join a particular P-tunnel
   unless it either has data to send on it, or needs to receive data on
   it.

   It is also possible to run PIM over MS-PMSI when a single
   bidirectional P-tunnel is used.  In that case, the PE at the root of
   the P-tunnel MUST include a PE Distinguisher Labels attribute in its
   S-PMSI A-D route, and must assign a label to each of the other PEs
   that attach to the same MVPN.  (This set is auto-discovered through
   the I-PMSI A-D routes.)  When sending a PIM J/P packet, one must push
   onto its label stack the label identifying the PE to which the J/P
   packet is being directed.  When receiving a PIM J/P packet, a PE
   discards any that are not carrying the PE distinguisher label that
   has been bound to its own IP address.

   All other MVPN-specific PIM procedures are as specified in [MVPN].


8. Extranets using PIM as the MVPN Control Plane

   Suppose there are two VPNs.  VPN1 consists of a set of VRFs, each of
   which has been configured with RT1 as its export and import Route
   Target.  VPN2 consists of a set of VRFs, each of which has been
   configured with RT2 as its export and import Route Target.  For
   convenience, we will use the term "blue" instead of "RT1" and the
   term "red" instead of "RT2".  Thus we will call VPN1 the "blue VPN"
   and VPN2 the "red VPN".  Similarly, the blue VPN consists of a number
   of "blue sites" containing "blue systems"; these sites are attached
   to PEs via VRF interfaces that are associated with "blue VRFs".

   We want to create an MVPN extranet in which blue receivers can join
   multicast groups whose sources and/or RPs are red.

   The first step is to ensure that the blue VRFs (or the subset of blue
   VRFs whose attached sites are allowed to receive multicasts from red
   sources) import routes to the red sources.  This is done as follows:

     - The red VRFs are configured so that the subset of red routes that
       are to be part of the extranet are exported with a second RT
       value (call it RT3), as well as with RT2.  For convenience, we
       will call RT3 "violet".

     - The blue VRFs are configured so that they import violet routes as
       well as blue routes.

   There are two different methods of providing the extranets, which
   we shall call the "red method" and the "blue method".  (Remember
   that the red VPN contains the transmitter, and the blue VPN contains
   the receivers.)

   This document assumes that in the case of non-SSM extranet multicast
   groups, the mapping between a group address and an RP is
   pre-configured in the PEs.

   This document does not provide support for bidirectional C-trees in
   extranets.


8.1. Default PMSI

   Some of the procedures subsequently specified in this section are
   largely independent of whether PIM is used with (a) an MI-PMSI or (b)
   with an MS-PMSI that has been bound to the double wildcard.  We will
   use the term "default PMSI" as a general term to mean either (a) or
   (b), depending upon which technique is actually being used in a given
   network.


8.2. Red method

   In the "red method", extranet multicasts are carried by default in
   the default PMSI of the red VPN, which we will of course call the
   "red PMSI".

   To use this method, blue VRFs must be configured to import "red"
   I-PMSI A-D routes and red S-PMSI A-D routes.  If MI-PMSIs are being
   used, the blue VRFs must immediately join the P-tunnels specified in
   the red I-PMSI A-D routes.  If MS-PMSIs are being used, a blue VRF
   need not join the MS-PMSI P-tunnel rooted at a particular PE unless a
   PIM Join needs to be sent to that PE.

   The PIM C-instance associated with a blue VRF will treat the red and
   blue default PMSIs as two different PIM interfaces.

   The blue VRFs must also be configured to "associate" violet unicast
   routes with the red default PMSI.  What this means is that the red
   default PMSI will be considered to be the RPF interface for the
   violet unicast routes.  The RPF interface for the blue unicast routes
   remains, as usual, the blue default PMSI.

   All that remains to be specified is how the control plane and data
   plane RPF checks are done.  Apart from these MVPN-specific procedures
   for the RPF check, ordinary PIM procedures are used.


8.2.1. Control Plane RPF Check

   Suppose a PE receives a PIM Join(S,G) from a CE, over a VRF interface
   that is associated with a blue VRF.  The PE does the RPF check for S
   by looking up S in the blue VRF.  If the route matching S is a blue
   route (i.e., carries the blue RT but not the violet RT), then a Join
   is sent over the blue default PMSI.  However, if the route matching S
   is a violet route (i.e., carries the violet RT), a Join is sent over
   the red default PMSI.

   If the PE receives a PIM Join(*,G) from a CE, the RPF check is done
   against the address of the corresponding RP; otherwise the procedure
   is the same.
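
   The control plane check can be sketched as below.  This is an
   illustration only.  The function name is hypothetical, RTs are
   represented by the color names used in this section, and the VRF
   lookup is simplified to an exact match on the address, whereas a
   real VRF would perform a longest-prefix match.

```python
from typing import Dict, FrozenSet, Optional

BLUE, VIOLET = "blue", "violet"


def pmsi_for_join(rpf_target: str,
                  blue_vrf_routes: Dict[str, FrozenSet[str]]
                  ) -> Optional[str]:
    """Default PMSI on which a blue VRF sends a Join.

    rpf_target is S for Join(S,G), or the RP address for Join(*,G).
    blue_vrf_routes maps an address to the RTs carried by the matching
    route (exact match only, a simplification of this sketch).
    """
    rts = blue_vrf_routes.get(rpf_target)
    if rts is None:
        return None                  # no route, so no Join is sent
    if VIOLET in rts:
        return "red default PMSI"    # violet route
    if BLUE in rts:
        return "blue default PMSI"   # blue but not violet
    return None


routes = {"10.1.1.1": frozenset({"blue"}),
          "10.2.2.2": frozenset({"red", "violet"})}
print(pmsi_for_join("10.2.2.2", routes))
```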


8.2.2. Data Plane RPF Check

   Suppose a red default PMSI has been associated with a blue VRF, as
   specified above, and an (S,G) multicast data packet is received from
   the red default PMSI.  Then S is looked up in the (blue) VRF.  If it
   matches a violet route, the packet is forwarded normally.  However,
   if it matches a blue route, the packet is discarded as having failed
   the RPF check.

   This prevents the blue sites from receiving packets from red
   transmitters, except in the case where routes to those transmitters
   have been explicitly imported into the blue VRF.
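
   A sketch of the data plane check follows.  It combines the rule of
   this section with the association made in section 8.2 (violet routes
   have the red default PMSI as their RPF interface, blue routes have
   the blue default PMSI).  The function names are hypothetical, and
   the lookup is again simplified to an exact match on the source
   address.

```python
from typing import Dict, FrozenSet, Optional


def rpf_interface(source: str,
                  blue_vrf_routes: Dict[str, FrozenSet[str]]
                  ) -> Optional[str]:
    """Expected arrival PMSI ("red" or "blue") for a source, as seen
    from a blue VRF.  Exact-match lookup is a simplification."""
    rts = blue_vrf_routes.get(source)
    if rts is None:
        return None
    return "red" if "violet" in rts else "blue"


def passes_data_plane_rpf(source: str, arrival_pmsi: str,
                          blue_vrf_routes: Dict[str, FrozenSet[str]]
                          ) -> bool:
    """True if an (S,G) packet arriving on arrival_pmsi is forwarded,
    False if it is discarded as having failed the RPF check."""
    expected = rpf_interface(source, blue_vrf_routes)
    return expected is not None and expected == arrival_pmsi


routes = {"10.1.1.1": frozenset({"blue"}),
          "10.2.2.2": frozenset({"red", "violet"})}
print(passes_data_plane_rpf("10.1.1.1", "red", routes))
```

   In the example, a packet whose source matches a blue route but which
   arrives from the red default PMSI fails the check.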


8.3. Blue method

   In the "blue method", extranet multicasts are carried by default in
   the default PMSI of the blue VPN.

   In the blue method, the red VRFs must be configured to import "blue"
   I-PMSI and S-PMSI A-D routes.  If MI-PMSIs are being used, the
   P-tunnels specified therein must be joined immediately.  If MS-PMSIs
   are being used, the P-tunnels need not be joined unless and until it
   is necessary to send a PIM Join to the root of the P-tunnel.

   The PIM C-instance associated with a red VRF will treat the red
   default PMSI and the blue default PMSI as two different PIM
   interfaces.

   PIM Joins from blue receivers are then received at the red VRF over
   the blue PMSI, whereas PIM Joins from red receivers are received at
   the red VRF over the red PMSI.  As a result, PIM may add one or the
   other or both PMSIs to a particular multicast tree's olist.

   In this method, the blue VRFs are associated with only one default
   PMSI, so the RPF check for both blue and violet sources (and RPs)
   always resolves to that PMSI.  Hence the special RPF check procedures
   of the red method are not necessary.  However, a PE with a red VRF
   may need to transmit multicast traffic on more than one MI-PMSI.

   Note that since the data plane RPF check of section 8.2.2 is not
   needed, one does not really need a "violet" RT value.  Rather, one
   may simply configure certain routes from the red VRF to be exported
   with both the red and the blue RTs.


8.4. Binding Specific Extranet C-Flows to S-PMSIs

   If the procedure of [MVPN] section 7.4.2 is used, the S-PMSI Join
   message MUST be sent on whatever default PMSI or default PMSIs are
   used to carry the C-flow identified in the message.

   If the procedure of [MVPN] section 7.4.1 is used, then procedures
   differ slightly depending upon whether the red method or the blue
   method is in use.

   If the red method is in use, and if a C-flow whose target source is
   exported from a red VRF is bound to an S-PMSI, then the S-PMSI A-D
   route that specifies the binding must carry both the red RT and the
   violet RT.  Blue VRFs must be configured to import the violet S-PMSI
   A-D routes.

   If the blue method is in use, and if a C-flow whose target source is
   exported from a red VRF is bound to an S-PMSI, then the S-PMSI A-D
   route that specifies the binding:

     - must carry the red RT if the C-flow has any receivers on the red
       default PMSI, and

     - must carry the blue RT if the C-flow has any receivers on the
       blue default PMSI.
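
   The RT selection rules for the two methods can be summarized in a
   short sketch.  The function name and its parameters are hypothetical,
   and RTs are represented by the color names used in this section.

```python
from typing import Set


def spmsi_route_targets(method: str,
                        receivers_on_red_pmsi: bool = False,
                        receivers_on_blue_pmsi: bool = False) -> Set[str]:
    """RTs carried by an S-PMSI A-D route that binds a C-flow whose
    source is exported from a red VRF."""
    if method == "red":
        # Red method: always both the red RT and the violet RT.
        return {"red", "violet"}
    if method == "blue":
        # Blue method: one RT per default PMSI that has receivers.
        rts: Set[str] = set()
        if receivers_on_red_pmsi:
            rts.add("red")
        if receivers_on_blue_pmsi:
            rts.add("blue")
        return rts
    raise ValueError("method must be 'red' or 'blue'")


print(sorted(spmsi_route_targets("blue", receivers_on_blue_pmsi=True)))
```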


8.5. Two VRFs on One PE

   It is possible that a red VRF and a blue VRF will exist on the same
   PE.  Then by the above procedures, one of these VRFs will need to
   join a PMSI that it can use for sending control packets to and
   receiving data packets from the other.  However, the protocol used to
   construct the P-tunnels instantiating the PMSI may not provide a
   mechanism by which a given PE can join a P-tunnel of which it is the
   root.  In this case, the PE implementation MUST support a local
   function whereby a given VRF, say VRF1, can "join" a P-tunnel whose
   root is another VRF, say VRF2, on the same PE.  The PE MUST also
   support a local function whereby packets can be transmitted from one
   VRF to another just as if the VRFs had been on separate PEs.


9. Supporting Anycast Sources with PIM Control Plane

   Suppose that some customer site contains router C-R1 and some other
   customer site in the same VPN contains router C-R2.  Suppose also
   that each sends a PIM Join(C-S,C-G) message towards C-S.  Ordinarily,
   the result will be to create a single C-tree whose root is C-S and
   whose leaves include C-R1 and C-R2.

   However, in some deployment scenarios, C-S may be an anycast address
   that belongs to two or more different sources, say C-S1 and C-S2.
   Let's suppose that these two sources attach to the VPN backbone
   through two different PEs, and let's further suppose that C-S1 is
   "close" to C-R1, and C-S2 is "close" to C-R2.  Then even though both
   C-R1 and C-R2 send Join(S,G) messages, what is really desired is to
   create two C-trees, one rooted at C-S1 (with C-R1 as a leaf) and one
   rooted at C-S2 (with C-R2 as a leaf).

   If the data traffic traveling along both C-trees is carried on a
   single MI-PMSI, it is important that a (C-S,C-G) data packet is
   forwarded towards C-R1 only if the packet is actually traveling on
   the C-tree rooted at C-S1, and not on the C-tree rooted at C-S2.

   To ensure this, if a particular MVPN is providing anycast service,
   its PEs MUST use the procedure described in section 9.1.1 of [MVPN],
   and MUST NOT use the procedures described in sections 9.1.2 and 9.1.3
   of [MVPN].

   This also enables the use of C-RPs that have anycast addresses.

   Furthermore, if anycast source support is provided for a particular
   multicast group C-G, all PEs MUST execute the procedure described in
   section 4.2.1 of [PIM], and MUST act as if SwitchToSPTDesired(S,G)
   (defined in [PIM] section 4.2.1) is true when the first (S,G) packet
   (from any PE) is received.  (This procedure MUST be executed by each
   PE even if the PE is not the "last hop" of the C-tree.)  This will
   ensure that each PE receives and forwards (C-S,C-G) traffic from the
   appropriate source C-tree, even if the PE has received only
   Join(C-*,C-G) messages but not Join(C-S,C-G) messages from its
   directly attached CEs.


10. Hub and Spoke MVPNs

   The Layer 3 Virtual Private Network (L3VPN) technology of [RFC4364]
   generally provides an "any-to-any" network service, where any system
   at one site of a VPN can send traffic to and receive traffic from a
   system at any other site.  Or more precisely, nothing in the
   procedures governing the distribution of routing information in the
   VPN prevents any-to-any communication.

   In some deployments, however, it has been convenient to distinguish
   between two kinds of VPN site, the "hub site" and the "spoke sites".
   In this section, we first describe how the "hub and spoke"
   configuration affects the distribution of unicast routing.  We then
   specify a means of providing multicast VPN service in the hub and
   spoke configuration.


10.1. Unicast Hub and Spoke VPNs

   In a unicast hub and spoke VPN:

     - any system in a hub site can send traffic to and receive traffic
       from any other system in a hub site;

     - any system in a hub site can send traffic to and receive traffic
       from any system in a spoke site;

     - any system in a spoke site can send traffic to and receive
       traffic from any system in a hub site;

     - a system in one spoke site cannot send traffic to and cannot
       receive traffic from a system in a different spoke site.

   Using the technology of [RFC4364], it is possible to create this sort
   of "hub and spoke" VPN by suitably restricting the flow of routing
   information among the sites.  One way to construct a hub and spoke
   VPN is as follows:

     - Within a given VPN, every site is denoted as either a hub site or
       a spoke site.

     - On a given PE, every spoke site is attached to a distinct VRF
       (i.e., all interfaces of that VRF lead to the same spoke site).
       We will call these "Spoke VRFs".

     - On a given PE, any number of hub sites can be attached to a
       single "Hub VRF".

     - Each Hub VRF is configured with an export-RT that we shall call
       "Hub_Route", and with a pair of import-RTs, one of which is
       "Hub_Route", and the other of which we shall call "Spoke_Route".
       (Of course, each hub and spoke VPN has its unique Hub_Route RT
       and its unique Spoke_Route RT.)

     - Each Spoke VRF is configured with export-RT "Spoke_Route" and
       import-RT "Hub_Route".

   With this configuration, the Spoke VRFs will contain only routes to
   systems at hub sites, whereas the Hub VRFs will contain routes to
   systems at both hub and spoke sites.  Even if two spoke sites attach
   to the same PE, they cannot communicate directly, because they are
   associated with different VRFs, and their respective VRFs do not
   import each other's routes.  (There are implementation techniques
   that can eliminate the need to configure a separate VRF for each
   spoke site on a PE, but these are out of scope of this document.)
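
   The effect of this RT configuration can be checked with a small
   model.  This is a sketch only: the Vrf class, the imported_prefixes
   function and the example prefixes are hypothetical, and BGP route
   distribution is reduced to matching each VRF's export RT against the
   import RTs of the others.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Vrf:
    name: str
    export_rt: str
    import_rts: Set[str]
    local_prefixes: List[str] = field(default_factory=list)


def imported_prefixes(vrf: Vrf, all_vrfs: List[Vrf]) -> Set[str]:
    """Prefixes that `vrf` learns from the other VRFs: those whose
    export RT is one of this VRF's import RTs."""
    return {prefix
            for other in all_vrfs
            if other is not vrf and other.export_rt in vrf.import_rts
            for prefix in other.local_prefixes}


hub = Vrf("hub", "Hub_Route", {"Hub_Route", "Spoke_Route"},
          ["10.0.0.0/24"])
spoke1 = Vrf("spoke1", "Spoke_Route", {"Hub_Route"}, ["10.1.0.0/24"])
spoke2 = Vrf("spoke2", "Spoke_Route", {"Hub_Route"}, ["10.2.0.0/24"])
vrfs = [hub, spoke1, spoke2]

print(sorted(imported_prefixes(spoke1, vrfs)))
print(sorted(imported_prefixes(hub, vrfs)))
```

   In this model the Spoke VRFs learn only the hub prefix, while the Hub
   VRF learns the prefixes of both spoke sites.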

   There are several different variations on this theme.  For example,
   in a particular VPN, spoke-to-spoke communication may be allowed, but
   only if the spoke-to-spoke traffic first enters a hub site.  Some
   system at the hub site would be responsible for "turning the traffic
   around", i.e., sending it back to the VPN backbone for delivery to
   the target spoke site.  This can be useful if the "turnaround
   system" at the hub site performs some sort of inspection of the
   spoke-to-spoke traffic and then applies authorization policies of
   some sort.  To provide this sort of Hub and Spoke VPN:

     - The total set of routes exported by the Hub VRFs must include
       routes that "summarize" all the routes exported by the Spoke
       VRFs.  For example, one or more Hub VRFs may export a default
       route.  In the Hub VRFs, each of these summary routes will have
       one of the VRF interfaces as its next hop interface.

     - When such a summary route is exported as a VPN-IP route, it MUST
       be advertised with a label for which the Next Hop Label
       Forwarding Entry (see section 3.10 of [RFC3031]) specifies one of
       the VRF interfaces as the next hop interface.

   In this scenario, if a PE receives traffic from a spoke site, and the
   IP destination address of that traffic is a system in another spoke
   site, the traffic will be tunneled to a PE that attaches to a hub,
   and then sent over one of the Hub VRF's "VRF interfaces", i.e., sent
   to a Hub CE router.  The Hub PE, when it receives the tunneled
   packet, does not look up the packet's IP destination address in the
   Hub VRF, but rather forwards based on the MPLS label.  If the Hub CE
   decides (possibly after inspecting the packet and authorizing the
   transmission) to "turn the packet around", sending it back to the PE,
   the PE will look up the IP destination address in the Hub VRF, find
   that it matches one of the routes imported from a spoke VRF, and
   tunnel the packet to the PE that attaches to the corresponding spoke
   site.

   Note that setting up a hub and spoke VPN is just a matter of proper
   configuration.  There are no protocol differences between a Hub and
   Spoke VPN and any other kind of RFC 4364 VPN.


10.2. Multicast Hub and Spoke VPNs

   Sometimes it is necessary to support multicast service over a Hub and
   Spoke VPN.  In this scenario, it is generally desired to provide an
   MVPN service with the following properties:

     - A receiver at a hub site may receive multicast traffic from a
       transmitter at a spoke site (including the case where the RP is
       at a spoke site)

     - A receiver at a spoke site may receive multicast traffic from a
       transmitter at a hub site (including the case where the RP is at
       a hub site)

     - A receiver at a spoke site must not be allowed to join a shared
       tree (i.e., a (C-*,C-G) tree) whose root (i.e., the RP) is at a
       different spoke site.

     - A receiver at a spoke site must not be allowed to receive
       multicast traffic from a transmitter at a different spoke site,
       except possibly in the case where the traffic traverses a hub
       site on its path from one spoke site to the other.

   This type of MVPN service can be provided by using a variation of the
   "PIM over MS-PMSI" model described in section 7.  In this model, each
   PE advertises an MS-PMSI for each VRF.  If these advertisements are
   made using BGP S-PMSI A-D routes, the A-D route originating at a Hub
   VRF carries the "Hub_Route" RT; an A-D route originating at a spoke
   VRF carries the "Spoke_Route" RT.  That is, the S-PMSI A-D routes
   originating at a given VRF carry the same RT as the unicast routes
   originating at that VRF.

   To support Hub and Spoke functionality, the MS-PMSIs originating at
   the spoke VRFs may all specify the same P-tunnel identifier.
   Similarly, the MS-PMSIs originating at the hub VRFs may all specify
   the same P-tunnel identifier, but this must be a different P-tunnel
   identifier than the one specified for the MS-PMSIs originating from
   the spoke VRFs.  In this case, it is convenient to speak of the Hub
   and Spoke infrastructure as consisting of two MS-PMSIs, a
   "spoke-rooted" MS-PMSI and a "hub-rooted" MS-PMSI.

   As discussed in section 6.5, it is possible to instantiate an MS-PMSI
   as a set of PIM-SM trees.  This means of instantiation can be useful
   in Hub and Spoke scenarios when GRE/PIM tunneling is used.  In this
   case, for a given VPN, there MAY be a single sparse mode group
   address associated with the MS-PMSIs rooted at the spoke VRFs, and a
   second sparse mode group address associated with the MS-PMSIs rooted
   at the hub VRFs.  The result is the creation of two distinct sets of
   P-tunnels for the VPN, one set used to carry data traffic from spoke
   sites to hub sites (and PIM control traffic in the opposite
   direction), and the other set used to carry data traffic from hub
   sites to spoke sites (and PIM control traffic in the opposite
   direction).

   Suppose that a spoke VRF and a hub VRF are on the same PE, and that
   an MS-PMSI advertisement exported by one of those VRFs is imported by
   the other.  The PE implementation MUST support a local function
   whereby the importing VRF can "join" the MS-PMSI exported by the
   other VRF, and MUST support a local function whereby packets
   transmitted from one VRF onto the MS-PMSI are received by the other
   VRF (if and only if the latter VRF has joined the MS-PMSI exported by
   the former).

   Since spoke VRFs do not import each other's S-PMSI A-D routes, and do
   not import each other's unicast routes, and since there is no
   MI-PMSI, there is no way for a C-Join to be transmitted directly from
   one spoke VRF to another.  If a CE at a spoke site sends a Join(S,G)
   to its PE, the PE will forward it on the hub-rooted MS-PMSI
   advertised by the hub site that is the BGP next hop for S; no spoke
   VRF can receive PIM control packets on that MS-PMSI.

   In this scheme, each hub VRF joins two MS-PMSIs, the spoke-rooted
   MS-PMSI and the hub-rooted MS-PMSI.  Normal PIM procedures would see
   these as two PIM interfaces.  If a hub VRF at PE1 receives a
   Join(S,G) from the hub-rooted MS-PMSI, where S is at a spoke site,
   normal PIM/MVPN procedures would cause PE1 to send a Join(S,G) over
   the spoke-rooted PMSI towards a PE that attaches to S's site.  If
   these procedures are followed, a receiver at a spoke site could get
   multicast data from a different spoke site; the data would get
   "turned around" at a PE that attaches to a hub site.  Since this
   violates the requirements as stated above, a PE providing Hub and
   Spoke MVPN service MUST NOT send a Join message on one MS-PMSI as a
   result of having received a Join message over another.  As a result,
   data traffic received by a hub PE on one of these MS-PMSIs will never
   get forwarded by that PE onto the other MS-PMSI.
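
   The restriction above amounts to one extra check before a Join is
   sent upstream.  The following is a sketch under the assumption that
   each PIM interface of the hub VRF is tagged as being an MS-PMSI or
   not; the Interface type and the interface names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    name: str
    is_ms_pmsi: bool


def may_send_join(received_on: Interface, upstream: Interface) -> bool:
    """Return False if a Join received on one MS-PMSI would be sent on another.

    Joins that arrive from, or are sent to, a PE-CE interface are not
    affected.  The case where both interfaces are the same MS-PMSI is
    left to normal PIM procedures and is not restricted here.
    """
    crosses_ms_pmsis = (received_on.is_ms_pmsi and upstream.is_ms_pmsi
                        and received_on != upstream)
    return not crosses_ms_pmsis


hub_rooted = Interface("hub-rooted", is_ms_pmsi=True)
spoke_rooted = Interface("spoke-rooted", is_ms_pmsi=True)
ce_link = Interface("pe-ce", is_ms_pmsi=False)

print(may_send_join(hub_rooted, spoke_rooted))  # suppressed turnaround
print(may_send_join(hub_rooted, ce_link))       # towards the hub site
print(may_send_join(ce_link, spoke_rooted))     # from the hub site
```

   Because no Join crosses from one MS-PMSI to the other, no multicast
   forwarding state is created that would carry data between them.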

   Note that this does not completely prevent a receiver in a spoke site
   from being able to receive multicast data from a transmitter in a
   different spoke site. For example, suppose:

     - A receiver R1 at a spoke site, Site1, joins a (C-*,C-G) tree,

     - The RP for (C-*,C-G) is at a hub site,

     - A system S2 at a different spoke site, Site2, transmits multicast
       traffic to group C-G,

     - The hub site containing the RP is multiply connected to the SP
       backbone,

     - The best path from R1 to the RP enters the RP's hub site via a
       particular PE-CE link, link1,

     - The best path from S2 to the RP enters the RP's hub site via a
       different PE-CE link, link2.

   In this case, it is possible for multicast data traffic to travel
   from S2 to link2 to the RP to link1 to R1.  If this is not desirable,
   the customer must ensure that transmitters at spoke sites do not send
   data to C-G addresses for which the RP is at a hub site.

   The procedures described in this section are compatible with the
   procedures of section 9.


11. IANA Considerations

   [MVPN] creates an IANA registry for the "S-PMSI Join Message Type
   Field". This document requires three new values:

     - The value 2 should be registered, and its description should read
       "mLDP P2MP S-PMSI for IPv4 traffic (unaggregated)".

     - The value 3 should be registered, and its description should read
       "mLDP P2MP S-PMSI for IPv6 traffic (unaggregated)".

     - The value 4 should be registered, and its description should read
       "GRE S-PMSI for IPv6 traffic (unaggregated)".
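
   For implementers, the three requested values could be captured as
   constants along the following lines.  This is only a sketch; the
   enumeration and function names are hypothetical, and values defined
   elsewhere (for example in [MVPN]) are deliberately not listed.

```python
from enum import IntEnum


class SPmsiJoinType(IntEnum):
    """S-PMSI Join Message Type values requested by this document."""
    MLDP_P2MP_IPV4 = 2
    MLDP_P2MP_IPV6 = 3
    GRE_IPV6 = 4


DESCRIPTIONS = {
    SPmsiJoinType.MLDP_P2MP_IPV4:
        "mLDP P2MP S-PMSI for IPv4 traffic (unaggregated)",
    SPmsiJoinType.MLDP_P2MP_IPV6:
        "mLDP P2MP S-PMSI for IPv6 traffic (unaggregated)",
    SPmsiJoinType.GRE_IPV6:
        "GRE S-PMSI for IPv6 traffic (unaggregated)",
}


def describe(value: int) -> str:
    """Return the registry description for a type value, if requested here."""
    try:
        return DESCRIPTIONS[SPmsiJoinType(value)]
    except ValueError:
        return "not requested by this document"


for value in (1, 2, 3, 4):
    print(value, describe(value))
```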







12. Security Considerations

   There are no additional security considerations beyond those of
   [MVPN] and [MVPN-BGP].


13. Acknowledgments

   Rajesh Sharma contributed significantly to sections 9 and 10. We also
   thank Karthik Subramanian, DP Ayadevara, and Rayen Mohanty.


14. Authors' Addresses

   Arjen Boers
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: aboers@cisco.com



   Yiqun Cai
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ycai@cisco.com



   Eric C. Rosen
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough, MA, 01719
   E-mail: erosen@cisco.com



   IJsbrand Wijnands
   Cisco Systems, Inc.
   De kleetlaan 6a Diegem 1831
   Belgium
   E-mail: ice@cisco.com


15. Normative References

   [BIDIR-PIM] "Bidirectional Protocol Independent Multicast", Handley,
   Kouvelas, Speakman, Vicisano, RFC 5015, October 2007

   [MLDP] "Label Distribution Protocol Extensions for
   Point-to-Multipoint and Multipoint-to-Multipoint Label Switched
   Paths", Minei, Kompella, Wijnands, Thomas,
   draft-ietf-mpls-ldp-p2mp-06.txt, April 2009

   [MVPN] "Multicast in MPLS/BGP IP VPNs", Rosen, Aggarwal, et al.,
   draft-ietf-l3vpn-2547bis-mcast-08.txt, March 2009

   [MVPN-BGP] "BGP Encodings and Procedures for Multicast in MPLS/BGP IP
   VPNs", Aggarwal, Rosen, Morin, Rekhter, Kodeboniya,
   draft-ietf-l3vpn-2547bis-mcast-bgp-07.txt, April 2009

   [PIM] "Protocol Independent Multicast - Sparse Mode (PIM-SM):
   Protocol Specification (Revised)", Fenner, Handley, Holbrook,
   Kouvelas, RFC 4601, August 2006

   [RFC2119] "Key words for use in RFCs to Indicate Requirement
   Levels", Bradner, RFC 2119, March 1997

   [RFC3031] "MPLS Architecture", Rosen, Viswanathan, Callon, RFC 3031,
   January 2001

   [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, RFC 4364, February 2006



16. Informative References

   [BSR] "Bootstrap Router (BSR) Mechanism for PIM", Bhaskar, et al.,
   RFC 5059, January 2008