Last Call Review of draft-ietf-bess-evpn-optimized-ir-09
review-ietf-bess-evpn-optimized-ir-09-genart-lc-mishra-2021-10-02-00

Request Review of draft-ietf-bess-evpn-optimized-ir
Requested revision No specific revision (document currently at 12)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2021-09-07
Requested 2021-08-24
Authors Jorge Rabadan , Senthil Sathappan , Wen Lin , Mukul Katiyar , Ali Sajassi
I-D last updated 2021-10-02
Completed reviews Rtgdir Early review of -09 by Julien Meuric
Tsvart Last Call review of -08 by Michael Tüxen
Secdir Last Call review of -09 by Derek Atkins
Genart Last Call review of -09 by Gyan Mishra
Opsdir Last Call review of -09 by Tim Chown
Intdir Telechat review of -09 by Pascal Thubert
Assignment Reviewer Gyan Mishra
State Completed
Request Last Call review on draft-ietf-bess-evpn-optimized-ir by General Area Review Team (Gen-ART) Assigned
Posted at https://mailarchive.ietf.org/arch/msg/gen-art/5qRIthY7Sp4wDjDSrxDSm_yY5rU
Reviewed revision 09 (document currently at 12)
Result Ready w/issues
Completed 2021-10-02
I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document: draft-ietf-bess-evpn-optimized-ir-??
Reviewer: Gyan Mishra
Review Date: 2021-10-02
IETF LC End Date: 2021-09-07
IESG Telechat date: Not scheduled for a telechat

Summary:
I am the Gen-ART reviewer for this draft, and I am reviewing it as a BESS WG
member who is familiar with EVPN technology, with the issues that exist with
IR, and with the need for an optimized IR solution for BUM replication.  This
draft clearly defines the problem to be solved with IR BUM replication and the
proposed EVPN Optimized IR solution, which is technically sound.  My comments,
considerations and recommendations relate to rewriting some of the technical
verbiage to help improve the draft.  The draft is well written and clearly
describes the problem with the EVPN IR PTA and how the Optimized IR solution
with AR replication and RT-11 can be used to provide an optimized Selective
P-Tree, so that not all PEs have to receive the BUM traffic as they do today
with the RT-3 I-PMSI.  This draft provides an EVPN procedure optimization for
the IR PTA RT-3 X-PMSI that utilizes the new RT-11 Leaf A-D route introduced in
Jeffrey Zhang’s EVPN BUM procedure update draft,
“draft-ietf-bess-evpn-bum-procedure-updates-10”, which uses the RFC 6513
Leaf A-D route to create a new Selective tree Leaf A-D route for optimized EVPN
BUM procedures for inter-AS segmentation for any PTA P-Tree being instantiated,
including IR.
  Leaf Auto-Discovery (A-D) routes [RFC6513]: For explicit leaf
  tracking purpose.

The Leaf A-D concept comes from the RFC 6514 Leaf A-D route.  It is used for
multicast in VPLS (RFC 7117 Section 8.3, bottom of page 33) for optimized
selective and inclusive P-Tree X-PMSI tunnels with or without inter-AS
segmentation, and in “draft-ietf-bess-evpn-bum-procedure-updates-10” for P-Tree
multicast.  Both specifications use the RFC 7524 Section 4 Inter-Area P2MP
Segmented Next-Hop Extended Community (S-NH-EC), which is utilized for tunnel
segmentation for seamless MPLS MVPN multicast, together with the setting of the
“Leaf Information Required” (L) flag in the PTA.  This is now used in the EVPN
BUM procedure updates in “draft-ietf-bess-evpn-bum-procedure-updates-10”
Section 6.3, and it is now also used in this EVPN IR optimization draft for the
Assisted Replication function in RT-11, with the caveat that S-NH-EC is not
used.  That is a change from RFC 7524 and should be reflected in the verbiage.

RFC 7524 S-NH-EC Section 4
4.  Inter-Area P2MP Segmented Next-Hop Extended Community

   This document defines a new Transitive IPv4-Address-Specific Extended
   Community Sub-Type: "Inter-Area P2MP Next-Hop".  This document also
   defines a new BGP Transitive IPv6-Address-Specific Extended Community
   Sub-Type: "Inter-Area P2MP Next-Hop".

   A PE, an ABR, or an ASBR constructs the Inter-Area P2MP Segmented
   Next-Hop Extended Community as follows:

   -  The Global Administrator field MUST be set to an IP address of the
      PE, ABR, or ASBR that originates or advertises the route carrying
      the P2MP Next-Hop Extended Community.  For example this address
      may be the loopback address or the PE, ABR, or ASBR that
      advertises the route.

   -  The Local Administrator field MUST be set to 0.

      If the Global Administrator field is an IPv4 address, the
      IPv4-Address-Specific Extended Community is used; if the Global
      Administrator field is an IPv6 address, the IPv6-Address-Specific
      Extended Community is used.

      The detailed usage of these Extended Communities is described in
      the following sections.

Excerpt on the use of the RFC 7524 S-NH-EC from the BUM procedure update draft
Section 6.3; the same verbiage is used in this EVPN IR optimization draft,
Section 4, page 9:
6.3.  Use of S-NH-EC

   [RFC7524] specifies the use of S-NH-EC because it does not allow ABRs
   to change the BGP next hop when they re-advertise I/S-PMSI A-D routes
   to downstream areas.  That is only to be consistent with the MVPN
   Inter-AS I-PMSI A-D routes, whose next hop must not be changed when
   they're re-advertised by the segmenting ABRs for reasons specific to
   MVPN.  For EVPN, it is perfectly fine to change the next hop when
   RBRs re-advertise the I/S-PMSI A-D routes, instead of relying on S-
   NH-EC.  As a result, this document specifies that RBRs change the BGP
   next hop when they re-advertise I/S-PMSI A-D routes and do not use S-
   NH-EC.  If a downstream PE/RBR needs to originate Leaf A-D routes, it
   constructs an IP-based Route Target Extended Community by placing the
   IP address carried in the Next Hop of the received I/S-PMSI A-D route
   in the Global Administrator field of the Community, with the Local
   Administrator field of this Community set to 0 and setting the
   Extended Communities attribute of the Leaf A-D route to that
   Community.

RFC 7117 Excerpt Section 8.3 bottom:
       The PE constructs an IP-address-specific RT by placing the IP
       address carried in the Next Hop field of the received S-PMSI A-D
       route in the Global Administrator field of the Community, with
       the Local Administrator field of this Community set to 0 and
       setting the Extended Communities attribute of the leaf A-D route
       to that Community.

This draft EVPN IR Optimization Section 4 page 9
The AR-LEAF constructs an IP-address-specific route-target as
         indicated in [I-D.ietf-bess-evpn-bum-procedure-updates], by
         placing the IP address carried in the Next-Hop field of the
         received Replicator-AR route in the Global Administrator field
         of the Community, with the Local Administrator field of this
         Community set to 0.  Note that the same IP-address-specific
         import route-target is auto-configured by the AR-REPLICATOR
         that sent the Replicator-AR, in order to control the acceptance
         of the Leaf A-D routes.
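
The route-target construction quoted above can be sketched in a few lines.
This is only an illustration, not text from the draft: it assumes the RFC 4360
encoding of a transitive IPv4-Address-Specific Extended Community carrying a
Route Target (type 0x01, sub-type 0x02), covers the IPv4 case only, and the
function name and the example address are hypothetical.

```python
import ipaddress
import struct

# Assumption: RFC 4360 layout for a transitive IPv4-Address-Specific
# Extended Community used as a Route Target (type 0x01, sub-type 0x02).
TYPE_IPV4_ADDRESS_SPECIFIC = 0x01
SUBTYPE_ROUTE_TARGET = 0x02


def build_leaf_ad_route_target(replicator_next_hop: str) -> bytes:
    """Return the 8-octet route target an AR-LEAF would attach.

    The Global Administrator field carries the Next-Hop of the received
    Replicator-AR route; the Local Administrator field is set to 0.
    """
    global_admin = ipaddress.IPv4Address(replicator_next_hop).packed
    local_admin = 0
    return struct.pack(
        "!BB4sH",
        TYPE_IPV4_ADDRESS_SPECIFIC,
        SUBTYPE_ROUTE_TARGET,
        global_admin,
        local_admin,
    )


# Hypothetical AR-IP taken from the documentation address range.
route_target = build_leaf_ad_route_target("192.0.2.1")
print(route_target.hex())  # 0102c00002010000
```

The AR-REPLICATOR would auto-configure the same 8 octets as an import
route-target, which is how acceptance of the Leaf A-D routes is controlled.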

The RFC 6514 Leaf A-D route is used in the EVPN RT-11 procedures to build the
selective tree optimization through a new Assisted Replication (AR) procedure,
which is the EVPN IR optimization in this draft.

The confusing part about this draft is that it mentions both NVO3 and MPLS
PTAs.  In general, NVO3 overlay encapsulations are used in Data Centers,
typically with an IP-based underlay; however, the RFC 7432 MPLS EVPN procedures
apply to both DC and Core and to any underlay: IP, MPLS or SR.  As written,
this draft applies to an NVO3 overlay IR procedure optimization used in Data
Centers.  However, the Data Center underlay, as well as the Core, can be MPLS
or IP based, and both can have an NVO3 overlay, although the Data Center is
generally where the NVO3 VTEP tunnel endpoints reside and the Core carries the
EVPN control plane between DCs.  RFC 7432 MPLS/IP EVPN supports both IP and
MPLS underlays, with the IP underlay supporting the IR PTA only and the MPLS
underlay supporting all PTAs for the RT-3 I-PMSI inclusive tree.

Here are a few scenarios for the authors to think about, showing where the
EVPN IR replication optimization solution could be utilized.  The point I would
like to make is that, for BUM, a multicast P2MP mLDP or RSVP-TE PTA is always
the most preferred method for both Core and DC scenarios, and there are only
certain scenarios in which multicast would not be preferred.  As NVO3 and MPLS
can be used in both DC and Core scenarios, I will mention both, as both pertain
to this draft.  If MPLS is used in the DC or Core, then the DC or Core could be
“PIM” free and “BGP” free in the underlay, and the mLDP or RSVP-TE PTA options
could be utilized as the optimal BUM solution.  If MPLS is used in the DC or
Core, then the DC or Core could also be “BGP” free but with PIM enabled in the
core for PIM Rosen MDT (RFC 6037).  In the above use cases the IR optimization
would not be the optimal or preferred solution.

This IR optimization could be utilized only if IP is used in the DC or Core, in
which case the MVPN PTA options are not possible because MVPN is only utilized
with an MPLS underlay, and “PIM” is not desired in the underlay.  In that use
case, where the underlay is IP only and “PIM” is not desired, this IR
optimization would be the most desired solution for BUM.  The caveat is that,
because the MVPN procedures of RFC 6513 and RFC 6514 are used with an MPLS
underlay for the PTA, the only viable PTA in this case is IR, as all the other
PTAs have an MPLS underlay dependency.  So, in summary, if MPLS exists then
there are many viable X-PMSI PTA options for both DC and Core for EVPN NVO3
BUM, and IR would not be the desired option; IR is desired only in the unique
case of an IP underlay where “PIM” is not desired.

I believe the IR optimization AR replication solution can be used with an MPLS
underlay as well, as there could be a use case where, even though other X-PMSI
PTAs are available, it is desired to use IR because PTAs that use MPLS-based
multicast are not desired.  In those cases the IR optimization could be used
for both DC and Core, and could apply to the Core “non-NVO3” use case of EVPN
PE-CE AC MLAG All-Active Multi-home.

This solution breaks up the BUM 3-tuple (Broadcast, Unknown Unicast, Multicast)
into BM (Broadcast/Multicast) and keeps Unknown Unicast separate, treated the
same as known unicast.  MPLS EVPN has a ubiquitous framework and thus
ubiquitous use cases: it can be used for DC or Core and with any underlay (IP,
MPLS or SR).  Its two primary use cases are the NVO3 encapsulation overlay for
DC multi-tenant environments and the NG L2 VPN PE-CE L2 AC advancement that
addresses the VPLS/H-VPLS gaps with NG MPLS L2 VPN “EVPN” E-LINE, E-LAN and
E-TREE.  This IR optimization draft should therefore apply to any EVPN use case
and not be limited to NVO3.

On BUM, and why the draft separates Unknown Unicast handling from Broadcast and
Multicast “BM” traffic within the BUM 3-tuple (Broadcast, Unknown Unicast,
Multicast):

With regard to BUM Broadcast and Unknown Unicast: with Proxy ARP / Proxy ND,
when an ARP request is sent as an all-Fs broadcast, the first ARP packet goes
out and the Type 2 route changes from unknown MAC/IP to MAC when the ARP
request is sent; when the reply is received, the MAC/IP state is created.
After that point no further ARPs are sent for the device.  Most implementations
have an ARP/ND refresh to keep the MAC/IP state current and to purge old
entries, saving MAC VRF URIB state as a tradeoff, so ARP traffic is constant
and does not necessarily stop even with Proxy ARP.  The alternative, if the
ARP/ND refresh did not occur, is to maintain a larger MAC VRF, which is worse,
as you do not want to hit the ceiling on the MAC VRF.

So the draft states that Broadcast is greatly reduced by the Proxy ARP / Proxy
ND capability and that Unknown Unicast is greatly reduced in virtualized NVO3
networks where MAC/IP is learned in the control plane.  Even with Proxy ARP/ND,
as stated above, the first ARP packet is flooded as an all-Fs broadcast until
control plane MAC learning generates the Type 2 MAC-IP route.  However, since
most implementations track the MAC-IP control plane state with a refresh timer
to age out and purge old entries, the all-Fs ARP broadcast ends up being sent
more often than just once.

Unknown unicast is a situation where the switch does not have the MAC address
in its CAM table or, in the EVPN scenario, the MAC/IP does not exist on a leaf
within the fabric.  In an L2 switch environment, “out of sync” bridge tables
leading to Unknown Unicast can occur when the first hop routing protocol is
salt-and-peppered (even/odd) such that only the Active router has the MAC and
the Standby router does not.  In the EVPN All-Active Multi-home MHD/MHN MLAG
scenario for host endpoint connections, both leafs are active, so there is
never an out of sync situation where one leaf has the MAC and the other does
not.  Also, EVPN backup path and aliasing, uniform load balancing over MLAG,
and local bias may take care of Unknown Unicast, making it nil or very rare in
an EVPN NVO3 environment.  BUM Broadcast ARP/ND traffic would definitely exist
even with Proxy ARP / Proxy ND, and it can be quite substantial due to
refresh/purge timers.

Is the reason for treating Unknown Unicast differently, broken out from “BM”,
that none exists in an NVO3 environment?  With regard to EVPN IR optimization
for BUM traffic, which this draft addresses, draft-ietf-bess-evpn-igmp-mld-proxy
defines a new SMET A-D RT-6 route for IR optimization for BUM, which is
equivalent to this draft's Leaf A-D route but unsolicited and untargeted.  This
draft must normatively mention draft-ietf-bess-evpn-igmp-mld-proxy as an
alternative solution for BUM IR optimization and explain why this solution
should be utilized for BUM IR optimization over the SMET RT-6 style
optimization.  Also, how does this draft's RT-11 selective tree AR replication
solution interoperate with the draft-ietf-bess-evpn-igmp-mld-proxy SMET route?
Is that possible, or do you have to implement one or the other?

Major issues:

None

Minor issues:

Abstract
OLD TXT
   Network Virtualization Overlay (NVO) networks using EVPN as control
   plane may use Ingress Replication (IR) or PIM (Protocol Independent
   Multicast) based trees to convey the overlay Broadcast, Unknown
   unicast and Multicast (BUM) traffic.  PIM provides an efficient
   solution to avoid sending multiple copies of the same packet over the
   same physical link, however it may not always be deployed in the NVO
   core network.  IR avoids the dependency on PIM in the NVO network
   core.  While IR provides a simple multicast transport, some NVO
   networks with demanding multicast applications require a more
   efficient solution without PIM in the core.  This document describes
   a solution to optimize the efficiency of IR in NVO networks.

NEW TXT
Network Virtualization Overlay (NVO) networks and BGP MPLS-based L2 VPN
E-LINE, E-LAN and E-TREE flavor Ethernet VPNs in Service Provider Core and Data
Center networks using EVPN as the control plane may use any available PMSI
Tunnel Attribute (PTA), such as Ingress Replication (IR) RFC 7988, PIM
(Protocol Independent Multicast) MDT SAFI RFC 6037, mLDP P2MP/MP2MP RFC 6388 or
RSVP-TE P2MP RFC 4875 based P-Trees, to replicate the overlay Broadcast,
Unknown unicast and Multicast (BUM) traffic.  Multicast-based PTA tunnel types
provide an efficient solution to avoid sending multiple copies of the same
packet over the same physical link.  However, not all PTA tunnel types may be
available: for example, in a Data Center with an IP-based underlay where native
PIM is not desirable, or with an MPLS-based underlay with a “BGP” and “PIM”
free core where the operator is migrating to Segment Routing, is in the process
of eliminating LDP, and does not desire an RSVP-TE P2MP PTA.  In these use
cases, the only option available is to use IR.  While IR provides a simple
multicast transport, a Service Provider Core migrating to Segment Routing, or
Data Center NVO networks with an IP-based underlay and demanding multicast
applications, require a more efficient solution than IR.  This document
describes a solution to optimize the efficiency of IR in a Service Provider
Core in transition to Segment Routing or a Data Center NVO network with an
IP-based underlay.

Introduction
OLD TXT
   Ethernet Virtual Private Networks (EVPN) may be used as the control
   plane for a Network Virtualization Overlay (NVO) network.  Network
   Virtualization Edge (NVE) devices and Provider Edges (PEs) that are
   part of the same EVPN Instance (EVI) use Ingress Replication (IR) or
   PIM-based trees to transport the tenant's Broadcast, Unknown unicast
   and Multicast (BUM) traffic.  In NVO networks where PIM-based trees
   cannot be used, IR is the only option.  Examples of these situations
   are NVO networks where the core nodes don't support PIM or the
   network operator does not want to run PIM in the core.

   In some use-cases, the amount of replication for BUM (Broadcast, Unknown
   Unicast, Multicast) traffic is kept under control on the NVEs due to the
   following fairly common assumptions:

   a.  Broadcast is greatly reduced due to the proxy ARP (Address
       Resolution Protocol) and proxy ND (Neighbor Discovery)
       capabilities supported by EVPN on the NVEs.  Some NVEs can even
       provide Dynamic Host Configuration Protocol (DHCP) server
       functions for the attached Tenant Systems (TS) reducing the
       broadcast even further.

   b.  Unknown unicast traffic is greatly reduced in virtualized NVO
       networks where all the MAC and IP addresses are learned in the
       control plane.

   c.  Multicast applications are not used.

   If the above assumptions are true for a given NVO network, then IR
   provides a simple solution for multi-destination traffic.  However,
   the statement c) above is not always true and multicast applications
   are required in many use-cases.

   When the multicast sources are attached to NVEs residing in
   hypervisors or low-performance-replication TORs (Top Of Rack
   switches), the ingress replication of a large amount of multicast
   traffic to a significant number of remote NVEs/PEs can seriously
   degrade the performance of the NVE and impact the application.

NEW TXT
Service Provider Core and Data Center networks may use Ethernet Virtual Private
Networks (EVPN) as the control plane for a Network Virtualization Overlay (NVO)
network with an IP-based underlay or for BGP MPLS-based L2 VPN E-LINE, E-LAN
and E-TREE flavor Ethernet VPNs.  Network Virtualization Edge (NVE) devices and
Provider Edges (PEs) that are part of the same EVPN Instance (EVI) can use
Ingress Replication (IR) or any available MPLS-based PTA for P-Tree
instantiation to transport the tenant's Broadcast, Unknown unicast and
Multicast (BUM) traffic.  In Service Provider Core or Data Center NVO networks
where MPLS-based PTAs are not available, such as a Service Provider core
migrating to Segment Routing where LDP is being eliminated and RSVP-TE P2MP is
not desirable, or a Data Center network with an IP-based underlay where native
PIM is not desirable, IR is the only option.  Examples of these situations are
NVO networks where the core nodes don't support MPLS-based PTAs with a
dependency on mLDP and where both native PIM and RSVP-TE P2MP LSM are not
desirable.

   In some use-cases, the amount of replication for BUM traffic is kept
   under control on the NVEs due to the following fairly common
   assumptions:

   a.  Broadcast is moderately reduced due to the proxy ARP (Address
       Resolution Protocol) and proxy ND (Neighbor Discovery)
       capabilities supported by EVPN on the NVEs, together with the
       Selective IR tunnel optimization defined in
       draft-ietf-bess-evpn-igmp-mld-proxy.  Some NVEs can even
       provide Dynamic Host Configuration Protocol (DHCP) server
       functions for the attached Tenant Systems (TS) reducing the
       broadcast even further. During the Proxy ARP/ND process the first ARP
       packet is still sent as an all-Fs broadcast, resulting in a Type 2
       change from Unknown MAC-IP route to MAC-IP route: when the ARP/ND
       request is sent and the reply is received, the MAC VRF MAC-IP state is
       created.  Proxy ARP/ND then suppresses or proxies all ARP/ND sent by
       the local hosts. However, due to the ARP/ND refresh requirements to
       keep the MAC-IP state current and purge old entries, saving MAC VRF
       URIB state as a tradeoff, there may be additional ARP/ND packets sent
       for each MAC VRF MAC-IP entry. The IGMP-MLD proxy Selective IR tunnel
       optimization draft improves the performance of IR using the SMET route
       and may be used in conjunction with this draft. Even though Proxy
       ARP/ND suppression techniques are utilized, the refresh/purge must be
       implemented to age out old entries and control the MAC VRF size, so
       the broadcast traffic is only moderately reduced, and thus RFC 7432
       EVPN IR for BUM is not a viable solution without the IR optimization
       solution defined in this draft and/or
       draft-ietf-bess-evpn-igmp-mld-proxy.

***Please investigate whether both EVPN IR optimizations can be used together
and what the caveats are, or, if they cannot be used together, why not.***  The
main point that should be mentioned here is that Broadcast traffic is reduced,
but there is still a considerable amount of broadcast traffic that needs to be
optimized.

   b.  Unknown unicast traffic is eliminated in virtualized NVO
       networks, because all the MAC and IP addresses are learned in the
       control plane, for the All-Active Multi-Home LAG scenario and reduced
       for the Single-Active Multi-Home EVPN scenario. Unknown unicast is a
       situation where the packet has the IP and MAC, but the switch is
       missing the MAC entry. This occurs when the Layer 2 switch BD tables
       become unsynchronized due to salt-and-pepper placement of the first hop
       router redundancy active router VLAN between L2 switches, resulting in
       Unknown unicast.  In an EVPN scenario with All-Active Multi-Home the
       MAC-IP remains synchronized with ESI auto-discovery; however, with
       Single-Active Multi-Home the MAC-IP may not be synchronized, resulting
       in Unknown unicast. As a result, there is minimal to no Unknown Unicast
       in an NVO network.

   c.  Multicast applications are not used.

   If the above assumptions are true for a given NVO network, then IR
   provides a simple solution for multi-destination traffic.  However,
   the statement c) above is not always true and multicast applications
   are required in many use-cases.

   When the multicast sources are attached to NVEs residing in
   hypervisors or low-performance-replication TORs (Top Of Rack
   switches), the ingress replication of a large amount of multicast
   traffic to a significant number of remote NVEs/PEs can seriously
   degrade the performance of the NVE and impact the application.

The draft should mention the reason why BM (Broadcast and Multicast) is treated
differently from Unknown Unicast by this solution.  My answer is that Unknown
Unicast is minimal to none and so does not need the optimization.
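
As a rough illustration of the multicast replication concern in the
Introduction text above, the sketch below compares the copies a source NVE
sends under Regular-IR (one per remote NVE/PE in the BD) with a simplified
Assisted Replication case in which the source sends a single copy to one
AR-REPLICATOR.  The function names, the 100-node BD and the 10 Mbps stream are
assumptions chosen for illustration only; they do not come from the draft.

```python
def ingress_copies_regular_ir(num_nves_in_bd: int) -> int:
    """Copies the source NVE sends: one per remote NVE/PE in the BD."""
    return max(num_nves_in_bd - 1, 0)


def ingress_copies_assisted(num_nves_in_bd: int) -> int:
    """Simplified AR case: a single copy goes to one AR-REPLICATOR."""
    return 1 if num_nves_in_bd > 1 else 0


def uplink_load_mbps(stream_mbps: float, copies: int) -> float:
    """Traffic the source NVE places on its uplink for one stream."""
    return stream_mbps * copies


# Hypothetical numbers: 100 NVEs in the BD and one 10 Mbps multicast stream.
nves = 100
stream = 10.0
print(uplink_load_mbps(stream, ingress_copies_regular_ir(nves)))  # 990.0
print(uplink_load_mbps(stream, ingress_copies_assisted(nves)))    # 10.0
```

The replication work does not disappear in the second case; it moves from the
low-performance source NVE to the replicator.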

Terminology section:

OLD TXT
-  Regular-IR: Refers to Regular Ingress Replication, where the
      source NVE/PE sends a copy to each remote NVE/PE part of the BD.

-  IR-IP: IP address used for Ingress Replication as in [RFC7432].

-  AR-IP: IP address owned by the AR-REPLICATOR and used to
      differentiate the ingress traffic that must follow the AR
      procedures.

NEW TXT
-  Regular-IR: an EVPN RT-3 (Route Type 3) Regular Ingress Replication, where
the source NVE/PE sends a copy to each remote NVE/PE part of the BD.

-  IR-IP: PTA Tunnel endpoint identifier which carries the unicast tunnel
endpoint (Loopback) IP address of the Non-AR-Replicator local PE used for
Ingress Replication as defined in RFC 6514.

-  AR-IP: PTA Tunnel endpoint identifier which carries the unicast tunnel
endpoint (loopback) IP address of the AR-REPLICATOR local PE as defined in RFC
6514 and used to differentiate the ingress traffic that must follow the AR
procedures.

I updated the definitions to reflect what the AR-IP and IR-IP basically are:
the PMSI Tunnel Attribute (PTA) termination endpoint ID, with AR-IP for the AR
node and IR-IP for the non-AR node.

RFC 7432 Section 11.2 references the RFC 6514 PMSI Tunnel attribute, which must
contain the identity of the tree.

RFC 7432 Section 11.2
11.2.  P-Tunnel Identification

   In order to identify the P-tunnel used for sending broadcast, unknown
   unicast, or multicast traffic, the Inclusive Multicast Ethernet Tag
   route MUST carry a Provider Multicast Service Interface (PMSI) Tunnel
   attribute as specified in [RFC6514].

   + If the PE that originates the advertisement uses ingress
     replication for the P-tunnel for EVPN, the route MUST include the
     PMSI Tunnel attribute with the Tunnel Type set to Ingress
     Replication and the Tunnel Identifier set to a routable address of
     the PE.

RFC 6514 Section 5
   When the Tunnel Type is set to Ingress Replication, the Tunnel
   Identifier carries the unicast tunnel endpoint IP address of the
   local PE that is to be this PE's receiving endpoint address for the
   tunnel.

Section 3 Solution Requirements

OLD TXT
   a.  It provides an IR optimization for BM (Broadcast and Multicast)
       traffic without the need for PIM, while preserving the packet
       order for unicast applications, i.e., known and unknown unicast
       traffic should follow the same path.  This optimization is
       required in low-performance NVEs.

NEW TXT
   a.  It provides an IR optimization for BM (Broadcast and Multicast)
       traffic without the need for PTA’s with MPLS or PIM based dependencies,
       while preserving the packet order for unicast applications, i.e., known
       and unknown unicast traffic should follow the same path.  This
       optimization is required in low-performance NVEs.

How does the IR optimization preserve unicast ordering?
Normal unicast traffic is not BUM and thus would not use the EVPN IR
optimization AR mechanism.

Section 4: RT-3 is being extended to support optimized-IR (a new Type 3), so
that is part of the capability exchange.

4.  EVPN BGP Attributes for optimized-IR

   This solution extends the [RFC7432] Inclusive Multicast Ethernet Tag
   routes and attributes so that an NVE/PE can signal its optimized-IR
   capabilities.

RFC 7432 Section 7.3
7.3.  Inclusive Multicast Ethernet Tag Route

   An Inclusive Multicast Ethernet Tag route type specific EVPN NLRI
   consists of the following:

               +---------------------------------------+
               |  RD (8 octets)                        |
               +---------------------------------------+
               |  Ethernet Tag ID (4 octets)           |
               +---------------------------------------+
               |  IP Address Length (1 octet)          |
               +---------------------------------------+
               |  Originating Router's IP Address      |
               |          (4 or 16 octets)             |
               +---------------------------------------+
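
For readers less familiar with the encoding, the route-type-specific part of
the NLRI shown above can be sketched as follows.  This is a minimal sketch
under stated assumptions: the IP Address Length is taken to be expressed in
bits (32 or 128), the Route Type and Length octets that precede this structure
in the EVPN NLRI are omitted, and the function name, RD value and addresses
are hypothetical.

```python
import ipaddress
import struct


def build_imet_nlri(rd: bytes, ethernet_tag_id: int, originator_ip: str) -> bytes:
    """Pack the route-type-specific fields of an RT-3 (IMET) route."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    address = ipaddress.ip_address(originator_ip).packed  # 4 or 16 octets
    # Assumption: the length field counts bits, so 32 for IPv4, 128 for IPv6.
    ip_length_bits = len(address) * 8
    return rd + struct.pack("!IB", ethernet_tag_id, ip_length_bits) + address


# Hypothetical Type 1 RD (IP address : assigned number) = 192.0.2.1:100.
rd = struct.pack("!H4sH", 1, ipaddress.IPv4Address("192.0.2.1").packed, 100)
nlri = build_imet_nlri(rd, ethernet_tag_id=0, originator_ip="192.0.2.1")
print(len(nlri), nlri.hex())
```

With an IPv4 originator the result is 17 octets: 8 for the RD, 4 for the
Ethernet Tag ID, 1 for the length and 4 for the address.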

Please also reference RFC 6514 Section 5, shown below:

5.  PMSI Tunnel Attribute

   This document defines and uses a new BGP attribute called the
   "P-Multicast Service Interface Tunnel (PMSI Tunnel) attribute".  This
   is an optional transitive BGP attribute.  The format of this
   attribute is defined as follows:

      +---------------------------------+
      |  Flags (1 octet)                |
      +---------------------------------+
      |  Tunnel Type (1 octets)         |
      +---------------------------------+
      |  MPLS Label (3 octets)          |
      +---------------------------------+
      |  Tunnel Identifier (variable)   |
      +---------------------------------+
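
The attribute layout above can be sketched the same way.  This is an
illustrative sketch only: it assumes the RFC 6514 convention that the label
value occupies the high-order 20 bits of the 3-octet MPLS Label field, it uses
the Ingress Replication tunnel type 0x06 cited in the draft text, it does not
model the AR-specific flag bits, and the function name, label and address are
hypothetical.

```python
import ipaddress
import struct

TUNNEL_TYPE_INGRESS_REPLICATION = 0x06  # value cited in the draft text for IR


def build_pta(flags: int, tunnel_type: int, label: int, tunnel_id: bytes) -> bytes:
    """Pack Flags, Tunnel Type, MPLS Label and Tunnel Identifier."""
    if not 0 <= label < 2**20:
        raise ValueError("MPLS label must fit in 20 bits")
    # Assumption: label sits in the high-order 20 bits of the 3 octets.
    label_field = (label << 4).to_bytes(3, "big")
    return struct.pack("!BB", flags, tunnel_type) + label_field + tunnel_id


# Hypothetical IR-IP used as the Tunnel Identifier, with label 100.
ir_ip = ipaddress.IPv4Address("192.0.2.1").packed
pta = build_pta(0, TUNNEL_TYPE_INGRESS_REPLICATION, 100, ir_ip)
print(pta.hex())  # 0006000640c0000201
```

For Ingress Replication the Tunnel Identifier is simply the receiving
endpoint address, which is what this review refers to as the IR-IP or AR-IP.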

Section 4 top of page 8

As described in the summary section of this review, this section should
reference RFC 7524 Section 4, which is referenced by
“draft-ietf-bess-evpn-bum-procedure-updates” Section 6.3 (S-NH-EC), and also
the text used by RFC 7117 Section 8.3.  It should also describe that, per
“draft-ietf-bess-evpn-bum-procedure-updates”, for EVPN the S-NH-EC in the
Leaf A-D routes is not necessary in the response to the Replicator-AR route
(RT-3).  This should be included in the verbiage.

I updated some normative language – please check
OLD TXT
   In this document, the above RT-3 and PTA can be used in two different
   modes for the same BD:

   -  Regular-IR route: in this route, Originating Router's IP Address,
      Tunnel Type (0x06), MPLS Label and Tunnel Identifier MUST be used
      as described in [RFC7432] when Ingress Replication is in use.  The
      NVE/PE that advertises the route will set the Next-Hop to an IP
      address that we denominate IR-IP in this document.  When
      advertised by an AR-LEAF node, the Regular-IR route SHOULD be
      advertised with type T= AR-LEAF.

   -  Replicator-AR route: this route is used by the AR-REPLICATOR to
      advertise its AR capabilities, with the fields set as follows:

      o  Originating Router's IP Address MUST be set to an IP address of
         the PE that should be common to all the EVIs on the PE (usually
         this is the PE's loopback address).  The Tunnel Identifier and
         Next-Hop SHOULD be set to the same IP address as the
         Originating Router's IP address when the NVE/PE originates the
         route.  The Next-Hop address is referred to as the AR-IP and
         SHOULD be different than the IR-IP for a given PE/NVE.

      o  Tunnel Type = Assisted-Replication Tunnel.  Section 11 provides
         the allocated type value.

      o  T (AR role type) = 01 (AR-REPLICATOR).

      o  L (Leaf Information Required) = 0 (for non-selective AR) or 1
         (for selective AR).

   In addition, this document also uses the Leaf A-D route (RT-11)
   defined in [I-D.ietf-bess-evpn-bum-procedure-updates] in case the
   selective AR mode is used.  The Leaf A-D route MAY be used by the AR-
   LEAF in response to a Replicator-AR route (with the L flag set) to
   advertise its desire to receive the BM traffic from a specific AR-
   REPLICATOR.  It is only used for selective AR and its fields are set
   as follows:

      o  Originating Router's IP Address is set to the advertising PE's
         IP address (same IP used by the AR-LEAF in regular-IR routes).
         The Next-Hop address is set to the IR-IP.

      o  Route Key is the "Route Type Specific" NLRI of the Replicator-
         AR route for which this Leaf A-D route is generated.

      o  The AR-LEAF constructs an IP-address-specific route-target as
         indicated in [I-D.ietf-bess-evpn-bum-procedure-updates], by
         placing the IP address carried in the Next-Hop field of the
         received Replicator-AR route in the Global Administrator field
         of the Community, with the Local Administrator field of this
         Community set to 0.  Note that the same IP-address-specific
         import route-target is auto-configured by the AR-REPLICATOR
         that sent the Replicator-AR, in order to control the acceptance
         of the Leaf A-D routes.

      o  The Leaf A-D route MUST include the PMSI Tunnel attribute with
         the Tunnel Type set to AR, type set to AR-LEAF and the Tunnel
         Identifier set to the IP of the advertising AR-LEAF.  The PMSI
         Tunnel attribute MUST carry a downstream-assigned MPLS label or
         VNI that is used by the AR-REPLICATOR to send traffic to the
         AR-LEAF.

   Each AR-enabled node MUST understand and process the AR type field in
   the PTA (Flags field) of the routes, and MUST signal the
   corresponding type (1 or 2) according to its administrative choice.

NEW TXT
When the PTA builds the PMSI tunnel per RFC 6514, I changed the term IR-IP to
PTA-ID to make it easier for the reader, as the source / destination of the
PMSI tunnel termination endpoints is the PMSI Tunnel Attribute (PTA) Tunnel
Identifier.

**start of the new txt**
   In this document, the above RT-3 and PTA can be used in two different
   modes for the same BD:

   -  Regular-IR route: This route is the regular RT-3 I-PMSI

      Originating Router's Unicast IP Address, called the IR-IP, MUST be set to
      the PMSI Tunnel Identifier for the PTA Tunnel Type (0x06) used for IR, as
      described in RFC 6514, when Ingress Replication is used.  The NVE/PE that
      advertises the route will set the Next-Hop to the remote tunnel endpoint
      PMSI Tunnel Identifier IR-IP as defined in RFC 6514.  When advertised by
      an AR-LEAF node, the Regular-IR route MUST be advertised with type T =
      AR-LEAF.

      o  Tunnel Type = Assisted-Replication Tunnel.  Section 11 provides
         the allocated type value.

      o  T (AR role type) = 10 (AR-LEAF).

      o  L (Leaf Information Required) = 0 (for non-selective AR) or 1
         (for selective AR).  The Regular-IR route is used only for the
         Non-Selective P-Tree.

   -  Replicator-AR route: This route is used by the AR-REPLICATOR to
      advertise its AR capabilities, with the fields set as follows:

      o  Originating Router's Unicast IP Address, called the AR-IP, MUST be set
         to the PMSI Tunnel Identifier for the PTA Tunnel Type (0x06), which is
         the IP address of the PE that should be common to all the EVIs on the
         PE as defined in RFC 6514.  The Tunnel Identifier and Next-Hop MUST be
         set to the same IP address as the Originating Router's IP address PTA
         Tunnel ID when the NVE/PE originates the route, as described in RFC
         6514.  The Next-Hop address of the Replicator-AR route as seen on the
         AR-LEAF is referred to as the AR-IP; it MUST be unique and cannot be
         the same as the IR-IP for a given PE/NVE.

      o  Tunnel Type = Assisted-Replication Tunnel.  Section 11 provides
         the allocated type value.

      o  T (AR role type) = 01 (AR-REPLICATOR).

      o  L (Leaf Information Required) = 1 (for non-selective AR=0) or
         (for selective AR=1).  The Replicator-AR route is used only for the
         Selective P-Tree.

   In addition, this document also uses the Leaf A-D route (RT-11)
   defined in [I-D.ietf-bess-evpn-bum-procedure-updates] in case the
   selective AR mode is used.  draft-ietf-bess-evpn-bum-procedure-updates
   updates the EVPN BUM procedures for EVPN multicast optimized selective
   trees.  It introduces three new route types, RT-9 Per-Region I-PMSI A-D,
   RT-10 S-PMSI A-D and RT-11 Leaf A-D, which are used for Selective P-Tree
   PTA inter-AS segmentation optimizations.  It reuses the RFC 7117 selective
   tree optimization procedure, in which a Leaf A-D route is signaled to
   instantiate the inter-AS P-Tree, taken from the intra-AS and inter-AS VPLS
   multicast I/S-PMSI A-D and Leaf A-D solution.  The AR-REPLICATOR now also
   leverages this procedure, using RT-11 to build the selective tree IR
   optimization for BUM traffic.

   Section 6 of bess-evpn-bum-procedure-updates defines the RT-11 Leaf A-D
   route selective tree optimization concept from RFC 7117, sent in response
   to an I-PMSI route, and the RFC 7524 Inter-Area P2MP Segmented Next-Hop
   Extended Community (S-NH-EC), which is used for inter-AS P2MP segmented LSP
   stitching.  RFC 7524 Section 6 states that the ABRs are required to keep
   the next hop unchanged when re-advertising an I/S-PMSI A-D route; this only
   needs to hold for MVPN Inter-AS I-PMSI A-D routes, whose next hop MUST be
   unchanged.  In EVPN, the next hop can be changed on inter-AS
   re-advertisement of an I/S-PMSI A-D route, so EVPN does not need to rely on
   the S-NH-EC.

   The Leaf A-D route MAY be used by the AR-LEAF in response to a
   Replicator-AR route (with the L flag set) to advertise its desire to
   receive the BM traffic from a specific AR-REPLICATOR.  It is only used
   for selective AR and its fields are set as follows:

      o  Originating Router's IP Address is set to the advertising PE's
         IP address (same IP used by the AR-LEAF in regular-IR routes).
         The Next-Hop address is set to the IR-IP.

      o  Route Key is the "Route Type Specific" NLRI of the Replicator-
         AR route for which this Leaf A-D route is generated.

      o  The AR-LEAF constructs an IP-address-specific route-target as
         indicated in [I-D.ietf-bess-evpn-bum-procedure-updates], by
         placing the IP address carried in the Next-Hop field of the
         received Replicator-AR route in the Global Administrator field
         of the Community, with the Local Administrator field of this
         Community set to 0.  Note that the same IP-address-specific
         import route-target is auto-configured by the AR-REPLICATOR
         that sent the Replicator-AR, in order to control the acceptance
         of the Leaf A-D routes.

      o  The Leaf A-D route MUST include the PMSI Tunnel attribute with
         the Tunnel Type set to AR, type set to AR-LEAF and the Tunnel
         Identifier set to the IP of the advertising AR-LEAF.  The PMSI
         Tunnel attribute MUST carry a downstream-assigned MPLS label or
         VNI that is used by the AR-REPLICATOR to send traffic to the
         AR-LEAF.
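
The route-target construction in the third bullet above can be sketched as
follows. This is a minimal illustration, not text from the draft: it assumes an
IPv4 next hop and the IPv4-Address-Specific Extended Community layout of RFC
4360 (type 0x01, sub-type 0x02 for Route Target). The function name and the
sample address are hypothetical.

```python
import ipaddress
import struct

# Assumption: RFC 4360 IPv4-Address-Specific Extended Community encoding,
# 1-byte type, 1-byte sub-type, 4-byte Global Administrator (IPv4 address),
# 2-byte Local Administrator.
EXT_COMM_TYPE_IPV4_SPECIFIC = 0x01
EXT_COMM_SUBTYPE_ROUTE_TARGET = 0x02


def ip_specific_route_target(replicator_next_hop: str) -> bytes:
    """Build the 8-byte route-target an AR-LEAF would attach to a Leaf A-D route.

    The Global Administrator carries the Next-Hop of the received
    Replicator-AR route; the Local Administrator is set to 0.
    """
    global_admin = ipaddress.IPv4Address(replicator_next_hop).packed
    local_admin = 0
    return struct.pack(
        "!BB4sH",
        EXT_COMM_TYPE_IPV4_SPECIFIC,
        EXT_COMM_SUBTYPE_ROUTE_TARGET,
        global_admin,
        local_admin,
    )


# Hypothetical AR-IP taken from a received Replicator-AR route.
rt = ip_specific_route_target("192.0.2.1")
```

The AR-REPLICATOR would auto-configure the same 8-byte value as an import
route-target from its own AR-IP, so only Leaf A-D routes addressed to it are
accepted. An IPv6 next hop would need a different community format and is not
covered by this sketch.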

   Each AR-enabled node MUST understand and process the AR type field in
   the PTA (Flags field) of the routes, and MUST signal the
   corresponding type (1 or 2) according to its administrative choice.
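
To illustrate what "type (1 or 2)" refers to, the following sketch packs and
unpacks the T and L fields of the PTA Flags octet. The T values (01 =
AR-REPLICATOR, 10 = AR-LEAF) come from the text above, and the L flag is the
low-order bit as in RFC 6514. The bit position chosen for T (a 2-bit field
shifted left by 3) is an assumption made for illustration only; the draft's
Flags figure is the authority. All names here are hypothetical.

```python
from enum import IntEnum


class ARType(IntEnum):
    AR_REPLICATOR = 0b01  # type 1
    AR_LEAF = 0b10        # type 2


# Assumption: T occupies two bits starting 3 bits above the low-order bit.
T_SHIFT = 3
T_MASK = 0b11
# L (Leaf Information Required) is the low-order bit of the Flags octet.
L_MASK = 0x01


def encode_flags(ar_type: int, leaf_info_required: bool) -> int:
    """Return the Flags octet carrying the AR type and the L flag."""
    return ((ar_type & T_MASK) << T_SHIFT) | (L_MASK if leaf_info_required else 0)


def decode_flags(flags: int) -> tuple[int, bool]:
    """Return (T value, L flag) from a Flags octet; other bits are ignored."""
    return (flags >> T_SHIFT) & T_MASK, bool(flags & L_MASK)


# Replicator-AR route for selective AR: T = 01, L = 1.
replicator_selective = encode_flags(ARType.AR_REPLICATOR, True)
# Regular-IR route advertised by an AR-LEAF: T = 10, L = 0.
leaf_regular = encode_flags(ARType.AR_LEAF, False)
```

A receiver that decodes a T value other than the two listed here would need
rules from the draft itself; this sketch simply returns the raw value.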

**There are a few different flags and new flags defined in the PTA. Please be
specific as to the type 1 & 2 flags.**

***Implementation considerations section: this is important, and it should
also detail how backwards compatibility works.*** As RT-3 introduces a mode and
RT-11 is new in this draft, which devices need to be upgraded, and do all of
them need to be upgraded to support the solution? ***Implementation section:
please list any vendor implementations thus far.*** Also mention any issues
found with any implementations, and any operators that have deployed an
implementation.

Nits/editorial comments:

Normative references should be added, per the rewritten text provided in the
Minor issues section, for the following:

 RFC 7524 Inter-Area P2MP Segmented LSPs, RFC 7117 Multicast VPLS,
 draft-ietf-bess-evpn-igmp-mld-proxy, RFC 6388 mLDP, RFC 6037 MDT SAFI, RFC
 4875 P2MP TE

Informative references should be added for the MVPN procedures and related
encapsulations: RFC 6513 MVPN, RFC 7988 Ingress Replication, RFC 7348 VXLAN,
RFC 8926 GENEVE