Last Call Review of draft-ietf-bess-mvpn-fast-failover-11

Request Review of draft-ietf-bess-mvpn-fast-failover
Requested rev. no specific revision (document currently at 15)
Type Last Call Review
Team Security Area Directorate (secdir)
Deadline 2020-10-19
Requested 2020-10-05
Authors Thomas Morin, Robert Kebler, Greg Mirsky
Draft last updated 2020-10-23
Completed reviews Rtgdir Last Call review of -11 by Adrian Farrel (diff)
Secdir Last Call review of -11 by Daniel Migault (diff)
Secdir Telechat review of -13 by Daniel Migault (diff)
Assignment Reviewer Daniel Migault 
State Completed
Review review-ietf-bess-mvpn-fast-failover-11-secdir-lc-migault-2020-10-23
Reviewed rev. 11 (document currently at 15)
Review result Has Nits
Review completed: 2020-10-23



I reviewed this document as part of the Security Directorate's ongoing effort to
review all IETF documents being processed by the IESG.  These comments were
written primarily for the benefit of the Security Area Directors.  Document
authors, document editors, and WG chairs should treat these comments just like
any other IETF Last Call comments.  Please note also that my expertise in BGP is
limited, so feel free to take these comments with a pinch of salt.

Review Results: Has Nits

Please find my comments below. 


                  Multicast VPN Fast Upstream Failover


   This document defines multicast VPN extensions and procedures that
   allow fast failover for upstream failures, by allowing downstream PEs
   to take into account the status of Provider-Tunnels (P-tunnels) when
   selecting the Upstream PE for a VPN multicast flow, and extending BGP
   MVPN routing so that a C-multicast route can be advertised toward a
   Standby Upstream PE.

Though it might be just a nit, if MVPN
stands for multicast VPN, it would help
to introduce the acronym in the first
sentence. This would later make the
correlation with BGP MVPN clearer.


1.  Introduction

   In the context of multicast in BGP/MPLS VPNs, it is desirable to
   provide mechanisms allowing fast recovery of connectivity on
   different types of failures.  This document addresses failures of
   elements in the provider network that are upstream of PEs connected
   to VPN sites with receivers.

Well, I am not familiar with either BGP
or MPLS. It seems that "BGP/MPLS IP VPNs"
and "MPLS/BGP IP VPNs" are both used. I
am wondering if there is a distinction
between the two and a preferred way to
designate these VPNs.  My understanding
is that the VPN-IPv4 characterizes the
VPN while MPLS is used by the backbone
for the transport.  Since the PEs are
connected to the backbone, the VPN-IPv4
needs to be labeled.


   Section 3 describes local procedures allowing an egress PE (a PE
   connected to a receiver site) to take into account the status of
   P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
   (C-S, C-G).  This method does not provide a "fast failover" solution
I understand the limitation is due to
BGP convergence. 

   when used alone, but can be used together with the mechanism
   described in Section 4 for a "fast failover" solution.

   Section 4 describes protocol extensions that can speed up failover by
   not requiring any multicast VPN routing message exchange at recovery

   Moreover, section 5 describes a "hot leaf standby" mechanism, that
   uses a combination of these two mechanisms.  This approach has
   similarities with the solution described in [RFC7431] to improve
   failover times when PIM routing is used in a network given some
   topology and metric constraints.


3.1.1.  mVPN Tunnel Root Tracking

   A condition to consider that the status of a P-tunnel is up is that
   the root of the tunnel, as determined in the x-PMSI Tunnel attribute,
   is reachable through unicast routing tables.  In this case, the
   downstream PE can immediately update its UMH when the reachability
   condition changes.

   That is similar to BGP next-hop tracking for VPN routes, except that
   the address considered is not the BGP next-hop address, but the root
   address in the x-PMSI Tunnel attribute.

   If BGP next-hop tracking is done for VPN routes and the root address
   of a given tunnel happens to be the same as the next-hop address in
   the BGP A-D Route advertising the tunnel, then checking, in unicast
   routing tables, whether the tunnel root is reachable, will be
   unnecessary duplication and thus will not bring any specific benefit.
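
To check my reading of the quoted text, here is a minimal sketch in Python.
It is only illustrative: the function names, the list of unicast prefixes,
and the next-hop-tracking flag are my own assumptions and do not come from
the draft.

```python
import ipaddress


def root_reachable(root, unicast_prefixes):
    """True if the tunnel root address is covered by a unicast route."""
    addr = ipaddress.ip_address(root)
    return any(addr in ipaddress.ip_network(p) for p in unicast_prefixes)


def needs_separate_root_check(root, bgp_next_hop, next_hop_tracking):
    """The root check only adds information when BGP next-hop tracking
    does not already watch the very same address."""
    same = ipaddress.ip_address(root) == ipaddress.ip_address(bgp_next_hop)
    return not (next_hop_tracking and same)
```

With next-hop tracking enabled and the root equal to the next hop, the
second function returns False, which is the "unnecessary duplication" case.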

It seems to me that the x-PMSI address
designates a different interface than
the one used by the tunnel itself. If
that is correct, such a mechanism seems
to assume that a device that is up on
one interface will be up on the other
interfaces. I have the impression that a
configuration change in a PE may end up
in the P-tunnel being down, while the PE
is still reachable through the x-PMSI
Tunnel attribute. If that is a possible
scenario, the current mechanism may not
be more efficient than those of
standard BGP.

Similarly, it is assumed that the tunnel
is either up or down, and that it is
considered down whenever it is not up.
I am not convinced that these are the
only two states. Typically, services
under DDoS may be down for a small
amount of time. While this affects the
network, there is not always a clear cut
between the PE being up or down.


3.1.6.  BFD Discriminator Attribute

   P-tunnel status may be derived from the status of a multipoint BFD
   session [RFC8562] whose discriminator is advertised along with an
   x-PMSI A-D Route.

   This document defines the format and ways of using a new BGP
   attribute called the "BFD Discriminator".  It is an optional
   transitive BGP attribute.  In Section 7.2, IANA is requested to
   allocate the codepoint value (TBA2).  The format of this attribute is
   shown in Figure 1.

I feel that the sentence "In Section ...
TBA2)." should be removed.


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      |    BFD Mode   |                  Reserved                     |
      |                       BFD Discriminator                       |
      ~                         Optional TLVs                         ~

            Figure 1: Format of the BFD Discriminator Attribute


      BFD Mode field is the one octet long.  This specification defines
      the P2MP BFD Session as value 1 Section 7.2.

      Reserved field is three octets long, and the value MUST be zeroed
      on transmission and ignored on receipt.

      BFD Discriminator field is four octets long.

Morin, et al.             Expires April 5, 2021                 [Page 7]
Internet-Draft         mVPN Fast Upstream Failover          October 2020

      Optional TLVs is the optional variable-length field that MAY be
      used in the BFD Discriminator attribute for future extensions.
      TLVs MAY be included in a sequential or nested manner.  To allow
      for TLV nesting, it is advised to define a new TLV as a variable-
      length object.  Figure 2 presents the Optional TLV format TLV that
      consists of:

      *  one octet-long field of TLV 's Type value (Section 7.3)

      *  one octet-long field of the length of the Value field in octets

      *  variable length Value field.

      The length of a TLV MUST be multiple of four octets.
I am wondering why the constraint on the
length is not mentioned in the paragraph
associated with the field - as opposed to
a separate paragraph.
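
To illustrate how I read the format, here is a minimal sketch in Python that
packs and parses the attribute value. It is not taken from the draft. I
assume that the four-octet rule applies to the whole TLV (Type + Length +
Value), which is exactly the point I find unclear. Only sequential TLVs are
handled, and the TLV type 9 used in the usage note is hypothetical.

```python
import struct

P2MP_BFD_SESSION = 1  # BFD Mode value defined by the draft (Section 7.2)
FIXED_LEN = 8         # 1 octet mode + 3 octets reserved + 4 octets discriminator


def pack_bfd_discriminator(mode, discriminator, tlvs=()):
    """Build the attribute value; reserved octets are sent as zero."""
    out = struct.pack("!B3xI", mode, discriminator)
    for tlv_type, value in tlvs:
        # Assumption: the 4-octet rule covers Type + Length + Value.
        if (2 + len(value)) % 4 != 0:
            raise ValueError("TLV length must be a multiple of four octets")
        out += struct.pack("!BB", tlv_type, len(value)) + value
    return out


def parse_bfd_discriminator(data):
    """Return (mode, discriminator, [(tlv_type, value), ...])."""
    if len(data) < FIXED_LEN:
        raise ValueError("attribute shorter than the fixed part")
    # "3x" skips the reserved octets: they are ignored on receipt.
    mode, discriminator = struct.unpack_from("!B3xI", data, 0)
    tlvs = []
    offset = FIXED_LEN
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        end = offset + 2 + length
        if end > len(data):
            raise ValueError("truncated TLV value")
        if (2 + length) % 4 != 0:
            raise ValueError("TLV length must be a multiple of four octets")
        tlvs.append((tlv_type, data[offset + 2:end]))
        offset = end
    return mode, discriminator, tlvs
```

For example, pack_bfd_discriminator(P2MP_BFD_SESSION, 0x01020304,
[(9, b"\xaa\xbb")]) yields 12 octets that parse_bfd_discriminator() turns
back into the same mode, discriminator, and TLV list. If the rule were meant
to apply to the Value field only, the two checks would change, which is why
stating the constraint next to the Length field would help.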



8.  Security Considerations

   This document describes procedures based on [RFC6513] and [RFC6514]
   and hence shares the security considerations respectively represented
   in these specifications.

   This document uses p2mp BFD, as defined in [RFC8562], which, in turn,
   is based on [RFC5880].  Security considerations relevant to each
   protocol are discussed in the respective protocol specifications.  An
   implementation that supports this specification MUST use a mechanism
   to control the maximum number of p2mp BFD sessions that can be active
   at the same time.

At a high level - or at least in my
interpretation of it - the document
proposes a mechanism based on BFD to
detect faults in the path.  Upon fault
detection, a fail-over operation is
instructed using BGP. This procedure is
expected to perform a faster fail-over
than traditional BGP convergence on
maintaining routing tables. Once the
fail-over has been performed, BFD
confirms the new path is "legitimate"
and works.

It seems correct to me that the current
protocol relies on BGP / BFD security.
That said, having BFD authentication
based on MD5 or SHA1 may suggest that
stronger primitives be recommended.
While this does not concern the current
document, it seems to me that the
information might be relayed to routing

What remains unclear to me - and I
assume this might be due to my lack of
expertise in the routing area - is the
impact associated with performing a fail-over
both on 1) the data plane and 2) the
standard BGP way to establish routing

Regarding the data plane, I am wondering
if fail-over results in a loss of
packets - I suppose, for example, that
at least the packets in the process of
being forwarded might be lost. I believe
that providing details on this may be
good.

If there are any impacts, I would also
like to understand in which cases the
decision to perform a fail-over
operation may result in more harm than
the event that has been over-interpreted.
A hypothetical scenario could be that
the non-reception of a BFD packet is
interpreted as a PE being down while
this is not correct and the PE is simply
under stress. A "too fast" fail-over
mechanism may over-interpret the event
and perform a fail-over. If such things
could happen, an attacker could leverage
a micro event to trigger network
operations that are not negligible.
Another way to see that is that an
attacker might not have direct access to
the control plane, but could use the
data plane to generate stress and, in a
sense, control the fail-over. It seems
to me that some text might be welcome to
prevent such cases from happening. This
could be guidance for declaring a tunnel
down, for example.
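
As an illustration of the kind of guidance I have in mind, here is a minimal
sketch in Python. It is my own and not something the draft specifies. A
tunnel is declared down only after several consecutive detection intervals
without a BFD packet, in the spirit of the Detect Mult field of RFC 5880. The
class name, the default of 3, and the immediate recovery on the next received
packet are assumptions.

```python
class TunnelStatus:
    """Declare a P-tunnel down only after `detect_mult` consecutive
    detection intervals without a BFD packet (hypothetical policy)."""

    def __init__(self, detect_mult=3):
        if detect_mult < 1:
            raise ValueError("detect_mult must be at least 1")
        self.detect_mult = detect_mult
        self.missed = 0
        self.up = True

    def interval_elapsed(self, packet_received):
        """Record one detection interval and return the resulting status."""
        if packet_received:
            # Simplification: a single packet brings the tunnel back up.
            self.missed = 0
            self.up = True
        else:
            self.missed += 1
            if self.missed >= self.detect_mult:
                self.up = False
        return self.up
```

With detect_mult=3, a single lost packet leaves the tunnel up, so a short
burst of stress on the data plane does not by itself trigger a fail-over.
The trade-off is a longer detection time, and text on how to choose this
balance is what I would find useful.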

Similarly, it would be good to add some
text regarding the interference with
the non-fast fail-over performed by
standard BGP. Typically, my impression
is that the fast fail-over mechanism is
a local decision, whereas BGP
convergence is more global. As a result,
even given more time, these two
mechanisms may come to different
outcomes. One example to illustrate my
point could be the following. Note that
this is only illustrative, and I let you
find and pick one that is more
appropriate.  I am thinking of a case
where a standby PE is shared among
multiple PEs - supposing this situation
could occur.  Typically, PE_1 and PE_2
are shared by PE_a, ..., PE_z. In case
PE_a and PE_b are down, we expect PE_a
to switch to PE_1 and PE_b to switch to
PE_2. It seems to me that BGP would end
up in such a situation, while a local
decision may end up with both PE_a and
PE_b switching to PE_1.
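
The difference between the two outcomes can be sketched as follows in
Python. This is only a toy model of my example: both selection rules are
assumptions of mine, and I do not claim that either BGP or the draft's
procedures behave exactly this way.

```python
def local_choice(pes, standbys):
    """Each PE decides alone and picks the first standby it knows."""
    return {pe: standbys[0] for pe in pes}


def coordinated_choice(pes, standbys):
    """A globally coordinated decision spreads PEs over the standbys."""
    return {pe: standbys[i % len(standbys)] for i, pe in enumerate(pes)}
```

With pes = ["PE_a", "PE_b"] and standbys = ["PE_1", "PE_2"], the local rule
maps both PEs to PE_1, while the coordinated rule maps PE_a to PE_1 and PE_b
to PE_2.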