Network Working Group                                           T. Morin
Internet-Draft                              France Telecom - Orange Labs
Intended status: Experimental                                 Y. Rekhter
Expires: May 7, 2009                                         R. Aggarwal
                                                        Juniper Networks
                                                           W. Henderickx
                                                                P. Muley
                                                          Alcatel-Lucent
                                                        November 3, 2008


                  Multicast VPN fast upstream failover
                draft-morin-l3vpn-mvpn-fast-failover-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 7, 2009.

Abstract

   This document defines multicast VPN extensions and procedures that
   allow fast failover for upstream failures, by allowing downstream PEs
   to take into account the status of Provider-Tunnels (P-tunnels) when
   selecting the upstream PE for a VPN multicast flow, and extending BGP
   mVPN routing so that a C-multicast route can be advertised toward a
   standby upstream PE.




Morin, et al.              Expires May 7, 2009                  [Page 1]


Internet-Draft         mVPN fast upstream failover         November 2008


Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].


Table of Contents

   1.  Introduction
   2.  Terminology
   3.  UMH Selection based on tunnel status
     3.1.  Determining the status of a tunnel
       3.1.1.  mVPN tunnel root tracking
       3.1.2.  PE-P Upstream link status
       3.1.3.  P2MP RSVP-TE tunnels
       3.1.4.  Leaf-initiated P-tunnels
       3.1.5.  P2MP LSP OAM
       3.1.6.  (S,G) counter information
   4.  Standby C-multicast route
     4.1.  Downstream PE behavior
     4.2.  Upstream PE behavior
     4.3.  Reachability determination
   5.  Hot leaf standby
   6.  Duplicate packets
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1. Normative References
     10.2. Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements



















1.  Introduction

   In the context of multicast in BGP/MPLS VPNs, it is desirable to
   provide mechanisms allowing fast recovery of connectivity on
   different types of failures.  This document addresses failures of
   elements in the provider network that are upstream of PEs connected
   to VPN sites with receivers.

   The next two sections describe two independent mechanisms that
   provide different levels of resiliency and different failure
   coverage:

   o  Section 3 describes local procedures allowing an egress PE (a PE
      connected to a receiver site) to take into account the status of
      P-Tunnels to determine the Upstream Multicast Hop (UMH) for a
      given (C-S, C-G).

   o  Section 4 describes protocol extensions that can speed up failover
      by not requiring any multicast VPN routing message exchange at
      recovery time.

   In addition, Section 5 describes a "hot leaf standby" mechanism that
   uses a combination of these two mechanisms.


2.  Terminology

   The terminology used in this document is the terminology defined in
   [I-D.ietf-l3vpn-2547bis-mcast] and
   [I-D.ietf-l3vpn-2547bis-mcast-bgp].


3.  UMH Selection based on tunnel status

   Current multicast VPN specifications [I-D.ietf-l3vpn-2547bis-mcast],
   section 5.1, describe the procedures used by a multicast VPN
   downstream PE to determine the upstream multicast hop (UMH) for a
   given (C-S,C-G).

   The procedure described here is an OPTIONAL procedure in which a
   downstream PE takes into account the status of the P-tunnels rooted
   at each possible upstream PE when deciding whether to include that
   PE in the list of candidate UMHs for a given (C-S,C-G) state.  The
   result is that, if a P-tunnel is "down" (see Section 3.1), the PE
   that is the root of the P-tunnel is not considered for UMH
   selection, and the downstream PE fails over to the upstream PE that
   is next in the list of candidates.



   More precisely, UMH determination for a given (C-S,C-G) will consider
   the UMH candidates in the following order:

   o  first, the UMH candidates that advertise a PMSI bound to a tunnel
      that is "up", and, *if* the "allowed-SPMSI-only" configuration
      flag is set (see below), the UMH candidates that do not advertise
      any I-PMSI or S-PMSI applicable to the given (C-S,C-G)

   o  second, the UMH candidates that advertise a PMSI bound to a tunnel
      that is "down"; these are used as a last resort, to ensure a
      graceful fallback to the basic mVPN UMH selection procedures in
      the hypothetical case where a false negative occurs when
      determining the status of all tunnels

   The "allowed-SPMSI-only" configuration flag mentioned above MUST be
   provided.  It is needed to allow an upstream PE to use a policy
   where no I-PMSI is advertised for a given VRF and only S-PMSIs are
   used, the S-PMSI advertisement possibly being made only after the
   upstream PE receives a C-multicast route for (C-S, C-G)/(C-*, C-G)
   to be carried over the advertised S-PMSI.

   For a given downstream PE and a given VRF, the P-tunnel corresponding
   to a given upstream PE for a given (C-S,C-G) state is the S-PMSI
   tunnel advertised by that upstream PE for this (C-S,C-G) and imported
   into that VRF or, if there is no such S-PMSI, the I-PMSI tunnel
   advertised by that PE and imported into that VRF.
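
   The ordering above can be illustrated with a minimal Python sketch.
   It is not a normative algorithm: the Candidate class, its field
   names, and the function names are hypothetical, and the input list is
   assumed to be already sorted by the basic mVPN UMH selection rules.
   The sketch also assumes that a candidate advertising no applicable
   PMSI is left out when "allowed-SPMSI-only" is not set, a case this
   section does not address.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One upstream PE (hypothetical model, not a protocol structure)."""
    pe: str
    # None means no S-PMSI applies to this (C-S,C-G).
    spmsi_tunnel_up: Optional[bool] = None
    # None means no I-PMSI from this PE was imported into the VRF.
    ipmsi_tunnel_up: Optional[bool] = None


def tunnel_status(c: Candidate) -> Optional[bool]:
    """Status of the P-tunnel for this candidate: S-PMSI first, else I-PMSI."""
    if c.spmsi_tunnel_up is not None:
        return c.spmsi_tunnel_up
    return c.ipmsi_tunnel_up


def order_candidates(cands: List[Candidate],
                     allowed_spmsi_only: bool) -> List[Candidate]:
    """Candidates whose tunnel is "up" first, then those whose tunnel is "down"."""
    first: List[Candidate] = []
    last: List[Candidate] = []
    for c in cands:
        status = tunnel_status(c)
        if status is True:
            first.append(c)
        elif status is False:
            last.append(c)  # last resort, in case of a false negative
        elif allowed_spmsi_only:
            first.append(c)  # no PMSI advertised yet, but the policy allows it
        # else: assumption of this sketch, the candidate is not listed
    return first + last
```

   The UMH is then the first element of the returned list, so a
   candidate whose tunnel is "down" is only chosen when no candidate in
   the first group exists.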

3.1.  Determining the status of a tunnel

   Different factors can be considered to determine the "status" of a
   P-tunnel; they are described in the following sub-sections.  The
   procedure proposed here does not require all downstream PEs to apply
   the same rules to define the status of a P-tunnel (see Section 6),
   and some of these rules may produce different results on different
   downstream PEs.  Thus, what is called the "status" of a P-tunnel in
   this section is not a characteristic of the tunnel itself, but the
   status of the tunnel *as seen from a particular downstream PE*.

   Depending on the criteria used to determine the status of a P-tunnel,
   there may be an interaction with other resiliency mechanisms used for
   the P-tunnel itself, and the UMH update may happen immediately or may
   need to be delayed.  Each case is covered in the corresponding
   sub-section below.






3.1.1.  mVPN tunnel root tracking

   A condition to consider that the status of a P-tunnel is up is that
   the root of the tunnel, as determined in the PMSI tunnel attribute,
   is reachable through unicast routing tables.  In this case the
   downstream PE can immediately update its UMH when the reachability
   condition changes.

   This is similar to BGP next-hop tracking for VPNv4 routes, except
   that the address considered is not the BGP next-hop address, but the
   root address in the PMSI tunnel attribute.

   If BGP next-hop tracking is done for VPNv4 routes, and the root
   address of a given tunnel happens to be the same as the next-hop
   address in the BGP autodiscovery route advertising the tunnel, then
   this mechanism may be omitted for this tunnel, as it will not bring
   any specific benefit.
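
   As an illustration only, the root tracking check can be sketched in
   Python as a lookup of the tunnel root address in a list of unicast
   prefixes.  The function names and the flat prefix list are
   assumptions of this sketch; a real implementation would query the
   router's unicast routing table, and whether a default route counts as
   reachability is a local policy choice that the sketch does not
   address.

```python
import ipaddress
from typing import Iterable


def root_reachable(root: str, unicast_prefixes: Iterable[str]) -> bool:
    """True if at least one unicast prefix covers the tunnel root address."""
    addr = ipaddress.ip_address(root)
    for prefix in unicast_prefixes:
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            return True
    return False


def needs_root_tracking(root: str, ad_route_next_hop: str,
                        next_hop_tracking_enabled: bool) -> bool:
    """Root tracking may be omitted when BGP next-hop tracking already
    watches the same address."""
    same_address = (ipaddress.ip_address(root)
                    == ipaddress.ip_address(ad_route_next_hop))
    return not (next_hop_tracking_enabled and same_address)
```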

3.1.2.  PE-P Upstream link status

   A condition to consider a tunnel status as up can be that the
   last-hop link of the P-tunnel is up.

   In that case, if the PE can determine that there is no fast
   restoration mechanism (such as MPLS FRR [RFC4090]) in place for the
   P-tunnel, it can update the UMH immediately.  Otherwise, it should
   wait before updating the UMH, to let the P-tunnel restoration
   mechanisms take effect.  A configurable timer MUST be provided for
   this purpose, and it is recommended to provide a reasonable default
   value for this timer.
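
   The timer logic can be sketched as follows.  This is a minimal
   Python illustration, not a specified procedure: the function names
   are hypothetical and the 0.5 second default is an arbitrary
   placeholder, since this document does not give a default value.

```python
from typing import Optional

DEFAULT_HOLD_TIME = 0.5  # seconds; illustrative only, not from this document


def umh_update_delay(link_down: bool, fast_restoration_in_place: bool,
                     hold_time: float = DEFAULT_HOLD_TIME) -> Optional[float]:
    """Delay before updating the UMH, or None if the link is not down."""
    if not link_down:
        return None
    if not fast_restoration_in_place:
        return 0.0  # nothing will repair the P-tunnel: update immediately
    return hold_time  # give the restoration mechanism time to act


def should_update_umh(down_since: float, now: float,
                      restored: bool, delay: float) -> bool:
    """Once the delay has elapsed, update only if the tunnel is still down."""
    return (not restored) and (now - down_since) >= delay
```

   The same wait-then-recheck pattern applies to the timers required in
   Section 3.1.5 and Section 3.1.6.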

3.1.3.  P2MP RSVP-TE tunnels

   For P-Tunnels of type P2MP MPLS-TE, the status of the P-Tunnel is
   considered up if one or more of the P2MP RSVP-TE LSPs, identified by
   the P-Tunnel Attribute, are in up state.  The determination of
   whether a P2MP RSVP-TE LSP is in up state requires Path and Resv
   state for the LSP and is based on procedures in [RFC4875].  In this
   case the downstream PE can immediately update its UMH when the
   reachability condition changes.

   When signaling state for a P2MP TE LSP is removed (e.g. if the
   ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE
   LSP changes state from up to down as determined by procedures in
   [RFC4875], the status of the corresponding P-Tunnel SHOULD be
   re-evaluated.  If the P-Tunnel transitions from up to down state, the
   upstream PE that is the ingress of the P-Tunnel SHOULD NOT be
   considered a valid UMH.



3.1.4.  Leaf-initiated P-tunnels

   A PE can be removed from the UMH candidate list for a given (S,G) if
   the P-tunnel for this (S,G) (the I-PMSI or S-PMSI tunnel, as
   applicable) is leaf-triggered (PIM, mLDP), but for some reason
   internal to the protocol the upstream one-hop branch of the tunnel
   from P to PE cannot be built.  In this case the downstream PE can
   immediately update its UMH when the reachability condition changes.

3.1.5.  P2MP LSP OAM

   When a P2MP connectivity verification mechanism such as
   [I-D.katz-ward-bfd-multipoint], used in conjunction with the
   bootstrapping mechanisms described in [I-D.ietf-mpls-mcast-cv], has
   been set up for a tunnel, the result of the connectivity verification
   can be used to define the status of the tree.

   If a MultipointHead session has been established on a P2MP MPLS LSP
   so that BFD packets are periodically sent from the root toward
   leaves, a condition to consider the status of the corresponding
   tunnel as up is that the BFD SessionState is Up.

   When such a procedure is used in a context where fast restoration
   mechanisms are used for the P-tunnels, downstream PEs should be
   configured to wait before updating the UMH, to let the P-tunnel
   restoration mechanisms take effect.  A configurable timer MUST be
   provided for this purpose, and it is recommended to provide a
   reasonable default value for this timer.

3.1.6.  (S,G) counter information

   In cases where the downstream node can be configured so that the
   maximum inter-packet time is known for all the multicast flows mapped
   on a P-tunnel, the local per-(C-S,C-G) traffic counter information
   for traffic received on this P-tunnel can be used to determine the
   status of the P-tunnel.

   When such a procedure is used in a context where fast restoration
   mechanisms are used for the P-tunnels, downstream PEs should be
   configured to wait before updating the UMH, to let the P-tunnel
   restoration mechanisms take effect.  A configurable timer MUST be
   provided for this purpose, and it is recommended to provide a
   reasonable default value for this timer.

   This method can be applicable for instance when a (S,G) flow is
   mapped on an S-PMSI.

   In cases where this mechanism is used in conjunction with hot leaf
   standby (Section 5), no prior knowledge of the rate of the multicast
   streams is required; downstream PEs can compare reception on the two
   P-tunnels to determine when one of them is down.
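
   Both variants of the counter-based check can be sketched in Python
   as below.  The function names, the use of timestamps, and the
   per-interval counter deltas are assumptions of this sketch; this
   document does not specify how counters are sampled.

```python
from typing import Optional


def flow_alive(last_rx: float, now: float, max_inter_packet_time: float) -> bool:
    """The tunnel is seen as "up" for this flow if a packet arrived within
    the configured maximum inter-packet time."""
    return (now - last_rx) <= max_inter_packet_time


def silent_tunnel(primary_delta: int, standby_delta: int) -> Optional[str]:
    """Hot leaf standby variant: compare the increase of the per-(C-S,C-G)
    packet counters of the two P-tunnels over the same interval."""
    if primary_delta == 0 and standby_delta > 0:
        return "primary"
    if standby_delta == 0 and primary_delta > 0:
        return "standby"
    # Both tunnels receive traffic, or the source is idle: no conclusion.
    return None
```

   The comparison variant needs no configured rate because the other
   tunnel serves as the reference: silence on both tunnels is treated as
   an idle source rather than as a failure.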


4.  Standby C-multicast route

   The procedures described below are limited to the case where the site
   that contains C-S is connected to exactly two PEs.  The procedures
   require all the PEs of that mVPN to follow the single forwarder PE
   selection, as specified in [I-D.ietf-l3vpn-2547bis-mcast].  The
   procedures assume that if a site of a given mVPN that contains C-S is
   dual-homed to two PEs, then all the other sites of that mVPN have
   two VPN-IPv4 routes to C-S, each with its own RD.

   As long as C-S is reachable via both PEs, a given downstream PE will
   select one of the PEs connected to C-S as its Upstream PE with
   respect to C-S.  We will refer to this PE as the "Primary Upstream
   PE" and to the other PE connected to C-S as the "Standby Upstream
   PE".  Note that if the connectivity to C-S through the Primary
   Upstream PE becomes unavailable, then the downstream PE will select
   the Standby Upstream PE as its Upstream PE with respect to C-S.

   For readability, in the following sub-sections, the procedures are
   described for BGP C-multicast Source Tree Join routes, but they apply
   equally to BGP C-multicast Shared Tree Join routes failover for the
   case where the customer RP is dual-homed (substitute "C-RP" for
   "C-S").

4.1.  Downstream PE behavior

   When a (downstream) PE connected to some site of an mVPN needs to
   send a C-multicast route (C-S, C-G), then, following the procedures
   specified in Section "Originating C-multicast routes by a PE" of
   [I-D.ietf-l3vpn-2547bis-mcast-bgp], the PE sends the C-multicast
   route with an RT that identifies the Upstream PE selected by the PE
   originating the route.  As long as C-S is reachable via the Primary
   Upstream PE, the Upstream PE is the Primary Upstream PE.  If C-S is
   reachable only via the Standby Upstream PE, then the Upstream PE is
   the Standby Upstream PE.

   If C-S is reachable via both the Primary and the Standby Upstream PE,
   then in addition to sending the C-multicast route with an RT that
   identifies the Primary Upstream PE, the PE also originates and sends
   a C-multicast route with an RT that identifies the Standby Upstream
   PE.  This route is formed so that it carries the semantic of being a
   'standby' C-multicast route (to be completed in a further revision).



   If at some later point the local PE determines that C-S is no longer
   reachable through the Primary Upstream PE, the Standby Upstream PE
   becomes the Upstream PE, and the local PE re-sends the C-multicast
   route with an RT that identifies the Standby Upstream PE, except
   that the route no longer carries the Standby PE BGP Community.  This
   results in replacing the old route with a new route, the only
   difference between the two being the presence or absence of the
   Standby PE BGP Community.
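
   The downstream PE behavior can be summarized by the following Python
   sketch.  It is illustrative only: the CMulticastRoute class and the
   function name are hypothetical, and the Standby PE BGP Community is
   modeled as a simple boolean because its encoding is not yet defined
   in this revision.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CMulticastRoute:
    """Hypothetical model of a BGP C-multicast Source Tree Join route."""
    c_s: str
    c_g: str
    route_target: str      # identifies the PE the route is sent toward
    standby: bool = False  # stands for the Standby PE BGP Community


def routes_to_advertise(c_s: str, c_g: str,
                        primary_rt: str, standby_rt: str,
                        via_primary: bool, via_standby: bool
                        ) -> List[CMulticastRoute]:
    """Routes a downstream PE originates for (C-S, C-G)."""
    routes: List[CMulticastRoute] = []
    if via_primary:
        routes.append(CMulticastRoute(c_s, c_g, primary_rt))
        if via_standby:
            # Extra route toward the Standby Upstream PE, marked as standby.
            routes.append(CMulticastRoute(c_s, c_g, standby_rt, standby=True))
    elif via_standby:
        # The Standby Upstream PE is now the Upstream PE: same route,
        # but without the standby marking.
        routes.append(CMulticastRoute(c_s, c_g, standby_rt))
    return routes
```

   When reachability through the Primary Upstream PE is lost, calling
   the function again yields a single route toward the Standby Upstream
   PE that differs from the earlier standby route only by the absence
   of the marking, which matches the replacement described above.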

4.2.  Upstream PE behavior

   When a PE receives a C-multicast route for a particular (C-S, C-G),
   and the RT carried in the route results in importing the route into a
   particular VRF on the PE, if the route carries the Standby PE BGP
   Community, then the PE performs as follows:

      when the PE determines that C-S is not reachable through some
      other PE, the PE SHOULD install VRF PIM state corresponding to
      this BGP C-multicast route (the result will be that a PIM Join
      message will be sent to the CE towards C-S, and that the PE will
      receive (C-S,C-G) traffic), and the PE SHOULD forward (C-S, C-G)
      traffic received by the PE to other PEs through a P-tunnel rooted
      at the PE.

   Furthermore, irrespective of whether C-S carried in that route is
   reachable through some other PE:

   a) based on local policy, as soon as the PE receives this BGP
      C-multicast route, the PE MAY install VRF PIM state corresponding
      to this BGP Source Tree Join route (the result will be that Join
      messages will be sent to the CE toward C-S, and that the PE will
      receive (C-S,C-G) traffic)

   b) based on local policy, as soon as the PE receives this BGP
      C-multicast route, the PE MAY forward (C-S, C-G) traffic to other
      PEs through a P-tunnel independently of the reachability of C-S
      through some other PE. [note that this implies also doing (a)]

   Doing neither (a) nor (b) for a given (C-S,C-G) is called "cold root
   standby".

   Doing (a) but not (b) for a given (C-S,C-G) is called "mild root
   standby".

   Doing (b) (which implies also doing (a)) for a given (C-S,C-G) is
   called "hot root standby".
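
   The three modes can be tabulated with a short Python sketch.  The
   enumeration and function names are hypothetical, and the sketch
   treats the SHOULD-level actions taken when C-S is not reachable
   through another PE as always performed, which is a simplification.

```python
from enum import Enum
from typing import Tuple


class RootStandby(Enum):
    COLD = "cold"  # neither (a) nor (b)
    MILD = "mild"  # (a) only
    HOT = "hot"    # (b), which implies (a)


def standby_actions(mode: RootStandby,
                    reachable_via_other_pe: bool) -> Tuple[bool, bool]:
    """Return (install_vrf_pim_state, forward_on_p_tunnel) for a PE that
    holds a C-multicast route carrying the Standby PE BGP Community."""
    if not reachable_via_other_pe:
        # C-S is no longer reachable through the other PE: take over.
        return (True, True)
    if mode is RootStandby.HOT:
        return (True, True)
    if mode is RootStandby.MILD:
        return (True, False)
    return (False, False)
```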





4.3.  Reachability determination

   The standby PE can use the following information to determine that
   C-S can or cannot be reached through the primary PE:

   o  presence/absence of a VPNv4 route toward C-S

   o  supposing that the standby PE is an egress of the tunnel rooted at
      the Primary PE, the standby PE can determine the reachability of
      C-S through the Primary PE based on the status of this tunnel,
      determined using the same criteria as those described in
      Section 3.1 (without using the UMH selection procedures of
      Section 3)

   o  other mechanisms MAY be used


5.  Hot leaf standby

   The mechanisms defined in the two previous sections can be used
   together as follows.

   The principle is that, for a given VRF (or possibly only for a given
   (C-S,C-G)):

   o  downstream PEs advertise a Standby BGP C-multicast route (based on
      Section 4)

   o  upstream PEs use the optional "hot root standby" behavior and thus
      will forward traffic for a given multicast state as soon as they
      have either a (primary) BGP C-multicast route or a Standby BGP
      C-multicast route for that state (or both)

   o  downstream PEs accept traffic from the primary or standby tunnel,
      based on the status of the tunnel (based on Section 3)

   Other combinations of the mechanisms proposed in Section 4 and
   Section 3 are for further study.
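
   On the downstream PE, the combination amounts to choosing one of the
   two tunnels based on status and discarding traffic from the other
   one.  A minimal Python sketch follows; the function names and the
   "primary"/"standby" labels are hypothetical, and keeping the primary
   tunnel when both are seen as "down" is an assumption modeled on the
   last-resort rule of Section 3.

```python
def select_tunnel(primary_up: bool, standby_up: bool) -> str:
    """Tunnel from which the downstream PE accepts traffic."""
    if primary_up:
        return "primary"
    if standby_up:
        return "standby"
    # Both seen as "down": fall back to the primary (assumption).
    return "primary"


def forward_to_ce(arrival_tunnel: str,
                  primary_up: bool, standby_up: bool) -> bool:
    """Forward a packet to CEs only if it arrived on the selected tunnel,
    so that the copy sent by the other upstream PE is dropped."""
    return arrival_tunnel == select_tunnel(primary_up, standby_up)
```

   Because both upstream PEs forward the flow in this mode, the second
   function is what prevents duplicate delivery to CEs.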


6.  Duplicate packets

   Multicast VPN specifications [I-D.ietf-l3vpn-2547bis-mcast] require
   that a PE only forward to CEs the packets coming from the expected
   upstream PE (Section 9.1).

   We draw the reader's attention to the fact that compliance with this
   part of the multicast VPN specifications is especially important
   when two distinct upstream PEs are liable to forward the same
   traffic on P-tunnels at the same time in steady state.  This will be
   the case when "hot root standby" mode is used (Section 4), and can
   also be the case if the procedures of Section 3 are used and (a) the
   rules determining the status of a tree are not the same on two
   distinct downstream PEs or (b) the rules determining the status of a
   tree depend on conditions local to a PE (e.g. the PE-P upstream link
   being up).


7.  IANA Considerations

   Allocation is expected from IANA for the BGP "Standby PE" community.
   (TBC)

   [Note to RFC Editor: this section may be removed on publication as an
   RFC.]


8.  Security Considerations


9.  Acknowledgements

   TBC.


10.  References

10.1.  Normative References

   [I-D.ietf-l3vpn-2547bis-mcast]
              Aggarwal, R., Bandi, S., Cai, Y., Morin, T., Rekhter, Y.,
              Rosen, E., Wijnands, I., and S. Yasukawa, "Multicast in
              MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-07 (work
              in progress), July 2008.

   [I-D.ietf-l3vpn-2547bis-mcast-bgp]
              Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", draft-ietf-l3vpn-2547bis-mcast-bgp-05 (work in
              progress), June 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4875]  Aggarwal, R., Papadimitriou, D., and S. Yasukawa,
              "Extensions to Resource Reservation Protocol - Traffic
              Engineering (RSVP-TE) for Point-to-Multipoint TE Label
              Switched Paths (LSPs)", RFC 4875, May 2007.

10.2.  Informative References

   [I-D.ietf-mpls-mcast-cv]
              Swallow, G., "Connectivity Verification for Multicast
              Label Switched Paths", draft-ietf-mpls-mcast-cv-00 (work
              in progress), April 2007.

   [I-D.katz-ward-bfd-multipoint]
              Katz, D. and D. Ward, "BFD for Multipoint Networks",
              draft-katz-ward-bfd-multipoint-01 (work in progress),
              January 2008.

   [RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
              Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              May 2005.


Authors' Addresses

   Thomas Morin
   France Telecom - Orange Labs
   2, avenue Pierre Marzin
   Lannion  22307
   France

   Email: thomas.morin@orange-ftgroup.com


   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA  94089
   U.S.A.

   Email: yakov@juniper.net


   Rahul Aggarwal
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA  94089
   U.S.A.

   Email: rahul@juniper.net




   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   Antwerp  2018
   Belgium

   Email: wim.henderickx@alcatel-lucent.com


   Praveen Muley
   Alcatel-Lucent
   701 East Middlefield Rd
   Mountain View, CA  94043
   U.S.A.

   Email: praveen.muley@alcatel-lucent.com



































Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.










