L2VPN Workgroup                                               J. Rabadan
Internet Draft                                             W. Henderickx
                                                         S. Palislamovic
Intended status: Standards Track                          Alcatel-Lucent

                                                                F. Balus
                                                          Nuage Networks

                                                                A. Isaac
                                                               Bloomberg


Expires: April 24, 2014                                 October 21, 2013



                    IP Prefix Advertisement in E-VPN
            draft-rabadan-l2vpn-evpn-prefix-advertisement-01


Abstract

   E-VPN provides a flexible control plane that allows intra-subnet
   connectivity in an IP/MPLS and/or an NVO-based network. In Data
   Centers, there is also a need for a dynamic and efficient inter-
   subnet connectivity across Tenant Systems and End Devices that can be
   physical or virtual and may not support their own routing protocols.
   This document defines a new E-VPN route type for the advertisement of
   IP Prefixes and explains some use-case examples where this new route-
   type is used.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt



Rabadan et al.           Expires April 24, 2014                 [Page 1]


Internet-Draft         E-VPN Prefix Advertisement       October 21, 2013


   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 24, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2. Introduction and problem statement  . . . . . . . . . . . . . .  3
     2.1 Inter-subnet connectivity requirements in Data Centers . . .  3
     2.2 The requirement for advertising IP prefixes in E-VPN . . . .  6
     2.3 The requirement for a new E-VPN route type . . . . . . . . .  7
   3. The BGP E-VPN IP Prefix route . . . . . . . . . . . . . . . . .  9
     3.1 IP Prefix Route encoding . . . . . . . . . . . . . . . . . .  9
   4. Benefits of using the E-VPN IP Prefix route . . . . . . . . . . 11
   5. IP Prefix next-hop use-cases  . . . . . . . . . . . . . . . . . 12
     5.1 TS IP address next-hop use-case  . . . . . . . . . . . . . . 12
     5.2 Floating IP next-hop use-case  . . . . . . . . . . . . . . . 15
     5.3 IRB IP next-hop use-case . . . . . . . . . . . . . . . . . . 16
     5.4 ESI next-hop ("Bump in the wire") use-case . . . . . . . . . 18
   6. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 20
   7. Conventions used in this document . . . . . . . . . . . . . . . 21
   8. Security Considerations . . . . . . . . . . . . . . . . . . . . 21
   9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 21
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
     10.1 Normative References  . . . . . . . . . . . . . . . . . . . 21
     10.2 Informative References  . . . . . . . . . . . . . . . . . . 21
   11. Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 21
   12. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 21

1. Terminology

   GW IP: Gateway IP Address

   IPL: IP address length

   IRB: Integrated Routing and Bridging interface

   ML: MAC address length

   NVE: Network Virtualization Edge

   TS: Tenant System

   VA: Virtual Appliance

   Overlay next-hop: object used in the IP Prefix route, as described in
   this document. It can be an IP address in the tenant space or an ESI,
   and identifies the next-hop to be used in IP lookups for a given IP
   Prefix at the routing context importing the route.

   Underlay next-hop: IP address sent by BGP along with any E-VPN route,
   i.e. BGP next-hop. It identifies the NVE sending the route and it is
   used at the receiving NVE as the VXLAN destination VTEP or NVGRE
   destination end-point.

2. Introduction and problem statement

   Inter-subnet connectivity is required within the Data Center, so IP
   Prefixes must be advertised in the control plane. This section
   explains why IP-VPN [RFC4364] procedures are not recommended for such
   advertisements and why the existing E-VPN MAC route type does not
   meet the Data Center requirements for the advertisement of IP
   Prefixes. For these reasons a new E-VPN route type is proposed.

   Section 2.1 describes the inter-subnet connectivity requirements in
   Data Centers. Sections 2.2 and 2.3 explain why neither IP-VPN nor the
   existing E-VPN route types meet the requirements for IP Prefix
   advertisements. Once the need for a new E-VPN route type is
   justified, sections 3 and 5 describe this route type and how it is
   used in some specific use cases.

2.1 Inter-subnet connectivity requirements in Data Centers

   [E-VPN] is used as the control plane for a Network Virtualization
   Overlay (NVO3) solution in Data Centers (DC), where Network
   Virtualization Edge (NVE) devices can be located in Hypervisors or
   TORs, as described in [E-VPN-OVERLAYS].

   If we use the term Tenant System (TS) to designate a physical or
   virtual system identified by MAC and IP addresses, and connected to
   an E-VPN instance, the following considerations apply:

   o The Tenant Systems may be Virtual Machines (VMs) that generate
     traffic from their own MAC and IP.

   o The Tenant Systems may be Virtual Appliance entities (VAs) that
     forward traffic to/from IP addresses of different End Devices
     sitting behind them.

        o These VAs can be firewalls, load balancers, NAT devices, other
          appliances or virtual gateways with virtual routing instances.

        o These VAs do not have their own routing protocols and hence
          rely on the E-VPN NVEs to advertise the routes on their
          behalf.

        o In all these cases, the VA will forward traffic to the Data
          Center using its own source MAC but the source IP will be the
          one associated with the End Device sitting behind it or a
          translated IP address (part of a public NAT pool) if the VA is
          performing NAT.

        o Note that the same IP address could exist behind two of these
          TS. One example of this would be certain appliance resiliency
          mechanisms, where a virtual IP or floating IP can be owned by
          one of the two VAs running the resiliency protocol (the master
          VA). VRRP is one particular example of this. Another example
          is multi-homed subnets, i.e. the same subnet is connected to
          two VAs.

        o Although these VAs provide IP connectivity to VMs and subnets
          behind them, they do not always have their own IP interface
          connected to the E-VPN NVE, e.g. layer-2 firewalls are
          examples of VAs not supporting IP interfaces.

   The following figure illustrates some of the examples described
   above.

                       NVE1
                    +--------+
           TS1(VM)--|(EVI-10)|---------+
             IP1/M1 +--------+         |               DGW1
                                  +---------+    +-------------+
                                  |         |----|(EVI-10)     |
     SN1---+           NVE2       |         |    |    IRB1\    |
           |        +--------+    |         |    |        (VRF)|---+
     SN2---TS2(VA)--|(EVI-10)|----|         |    +-------------+  _|_
           | IP2/M2 +--------+    |  VXLAN/ |                    (   )
     IP4---+  <-+                 |  nvGRE  |         DGW2      ( WAN )
                |                 |         |    +-------------+ (___)
             vIP23 (floating)     |         |----|(EVI-10)     |   |
                |                 +---------+    |    IRB2\    |   |
     SN1---+  <-+      NVE3         |  |  |      |        (VRF)|---+
           | IP3/M3 +--------+      |  |  |      +-------------+
     SN3---TS3(VA)--|(EVI-10)|------+  |  |
           |        +--------+         |  |
     IP5---+                           |  |
                                       |  |
                    NVE4               |  |      NVE5            +--SN5
              +---------------------+  |  |    +--------+        |
     IP6------|(EVI-1)              |  |  +----|(EVI-10)|--TS4(VA)--SN6
              |       \   IRB3      |  |       +--------+        |
              |       (VRF)-(EVI-10)|--+                ESI4     +--SN7
              |       /             |
          |---|(EVI-2)              |
       SN4|   +---------------------+


                    Figure 1 DC inter-subnet use-cases

   Where:

   NVE1, NVE2, NVE3, NVE4, NVE5, DGW1 and DGW2 share the same E-VPN for
   a particular tenant. EVI-10 is the corresponding E-VPN instance on
   each element, and all the hosts connected to that instance belong to
   the same IP subnet. The hosts connected to E-VPN 10 are listed below:

        o TS1 is a VM that generates/receives traffic from/to IP1, where
          IP1 belongs to the E-VPN 10 subnet.

        o TS2 and TS3 are Virtual Appliances (VA) that generate/receive
          traffic from/to the subnets and hosts sitting behind them
          (SN1, SN2, SN3, IP4 and IP5). Their IP addresses (IP2 and IP3)
          belong to the E-VPN subnet and they can also generate/receive
          traffic. When these VAs receive packets destined to their own
          MAC addresses (M2 and M3) they will route the packets to the
          proper subnet or host. These VAs do not support routing
          protocols to advertise the subnets connected to them and can
          move to a different server and NVE when the Cloud Management
          System decides to do so. These VAs may also support redundancy
          mechanisms for some subnets, similar to VRRP, where a floating
          IP is owned by the master VA and only the master VA forwards
          traffic to a given subnet. E.g.: vIP23 in figure 1 is a
          floating IP that can be owned by TS2 or TS3 depending on who
          the master is. Only the master will forward traffic to SN1.

        o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3
          have their own IP addresses that belong to the E-VPN 10 subnet
          too. These IRB interfaces connect the E-VPN 10 subnet to
          Virtual Routing and Forwarding (VRF) instances that can route
          the traffic to other connected subnets for the same tenant
          (within the DC or at the other end of the WAN).

        o TS4 is a layer-2 VA that provides connectivity to subnets SN5,
          SN6 and SN7, but does not have an IP address itself in the E-
          VPN 10. TS4 is connected to a physical port on NVE5 assigned
          to Ethernet Segment Identifier 4.

   All the above DC use cases require inter-subnet forwarding and
   therefore the individual host routes and subnets MUST be advertised:

   a) From the NVEs (since VAs and VMs do not run routing protocols) and
   b) Associated with an overlay next-hop that can be a VA IP address, a
   floating IP address, an IRB IP address or an ESI.


2.2 The requirement for advertising IP prefixes in E-VPN

   In all the inter-subnet connectivity cases discussed in section 2.1
   there is a need to advertise IP prefixes. The advertisement of such
   prefixes must meet certain requirements, specific to NVO-based Data
   Centers:

        o The data plane in NVO-based Data Centers is not based on IP
          over a GRE or MPLS tunnel as required by [RFC4364], but
          Ethernet over an IP tunnel, such as VXLAN or NVGRE.

        o The IP prefixes in the DC must be advertised with a
          flexibility that does not exist in IP-VPNs today. For
          instance:

            a) The advertised overlay next-hop for a given IP prefix can
            be an IRB IP address (see section 5.3), a floating IP
            address (see section 5.2) or even an ESI (see section 5.4).

            b) As stated by [E-VPN-OVERLAYS], VXLAN or NVGRE virtual
            identifiers can have a global or a local scope. The
            implementation MUST support the flexibility to advertise IP
            Prefixes associated to a global identifier (32-bit value
            encoded in the E-VPN Ethernet Tag ID) or a locally
            significant identifier (20-bit value encoded in the MPLS
            label field). At the moment, [RFC4364] can only advertise
            Prefixes associated to a locally significant identifier
            (MPLS label).

            c) Since an NVE can potentially advertise many Prefixes with
            different overlay next-hops and different VXLAN/NVGRE
            identifiers, it is highly desirable to be able to advertise
            those prefixes with their corresponding overlay next-hop and
            VXLAN/NVGRE identifier within the same NLRI, for a better
            BGP update packing. [RFC4364] does not have the capability
            of advertising a flexible overlay next-hop together with a
            prefix in the same NLRI.

        o IP prefixes must be advertised by NVE devices that have no VRF
          instances defined and no capability to process IP-VPN
          prefixes. These NVE devices just support E-VPN and advertise
          IP Prefixes on behalf of some connected Tenant Systems. In
          other words, any attempt to solve this problem by simply using
          [RFC4364] routes requires that every E-VPN deployment be
          accompanied by a concurrent IP-VPN topology, which is not
          possible in most cases.

        o Finally, Data Center providers want to use a single BGP
          Address Family/Subsequent Address Family (AFI/SAFI) for the
          advertisement of addresses within the Data Center, i.e. BGP
          E-VPN only, as opposed to using E-VPN and IP-VPN in a
          concurrent topology. This minimizes the control plane overhead
          in TORs and Hypervisors and simplifies the operations.

   E-VPN is extended - as described in this document - to advertise IP
   prefixes with the flexibility required by the current and future Data
   Center applications.

2.3 The requirement for a new E-VPN route type

   [E-VPN] defines a MAC route (or route type 2) where a MAC address can
   be advertised together with an IP address length (IPL) and IP address
   (IP). While a variable IPL might be used to indicate the presence of
   an IP prefix in a route type 2, there are several specific use cases
   in which using this route type to deliver IP Prefixes is not
   suitable.

   One example of such use cases is the "floating IP" example described
   in section 2.1. In this example we need to decouple the advertisement
   of the prefixes from the advertisement of the floating IP (vIP23 in
   figure 1) and the MAC associated with it; otherwise the solution
   becomes highly inefficient and does not scale.

   E.g.: if we are advertising 1k prefixes from M2 (using route type 2)
   and the floating IP owner changes from M2 to M3, we would need to
   withdraw 1k routes from M2 and re-advertise 1k routes from M3.
   However if we use a separate route type, we can advertise the 1k
   routes associated to the floating IP address (vIP23) and only one
   route type 2 for advertising the ownership of the floating IP, i.e.
   vIP23 and M2 in the route type 2. When the floating IP owner changes
   from M2 to M3, a single route type 2 withdraw/update is required to
   indicate the change. The remote DGW will not change any of the 1k
   prefixes associated to vIP23, but will only update the ARP resolution
   entry for vIP23 (now pointing at M3).
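
   The update counts in this example can be reproduced with a short
   Python sketch. The function names are illustrative assumptions and
   are not defined by this document; the sketch only counts route
   changes and does not model BGP itself.

```python
# Illustrative sketch only: counts the route changes triggered when the
# floating IP owner changes from M2 to M3. All names are hypothetical.

def changes_with_mac_routes(num_prefixes: int) -> int:
    """Prefixes carried in route type 2, tied to the owner's MAC.

    Every prefix is withdrawn from the old owner (M2) and
    re-advertised from the new owner (M3).
    """
    withdrawals = num_prefixes
    advertisements = num_prefixes
    return withdrawals + advertisements


def changes_with_prefix_routes(num_prefixes: int) -> int:
    """Prefixes carried in a separate route type, next-hop = floating IP.

    Only the single route type 2 that binds the floating IP to a MAC
    changes; the prefix routes themselves are untouched, so the result
    does not depend on num_prefixes.
    """
    return 1
```

   With 1k prefixes, changes_with_mac_routes(1000) returns 2000, while
   changes_with_prefix_routes(1000) returns 1.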

   Other reasons to decouple the IP Prefix advertisement from the MAC
   route are listed below:

        o Clean identification, operation and troubleshooting of IP
          Prefixes, not subject to interpretation and independent of the
          IPL and the IP value. E.g.: An IP address for ARP resolution
          must always be clearly distinguished from a /32 IP Prefix, and
          a default IP route 0.0.0.0/0 must always be easily and clearly
          distinguished from the absence of IP information.

        o MAC address information must not be compared by BGP when
          selecting between two IP Prefix routes. If IP Prefixes are
          advertised using MAC routes, the MAC information is always
          present and part of the route key.

        o IP Prefix routes must not be subject to MAC route procedures
          such as MAC Mobility or aliasing. Prefixes advertised from two
          different ESIs do not mean mobility; MACs advertised from two
          different ESIs do mean mobility. Similarly load balancing for
          IP prefixes is achieved through IP mechanisms such as ECMP,
          and not through MAC route mechanisms such as aliasing.

        o NVEs that do not require processing IP Prefixes must have an
          easy way to identify an update with an IP Prefix and ignore
          it, rather than processing the MAC route only to find out
          later that it carries a Prefix that must be ignored.

   The following sections describe how E-VPN is extended with a new
   route type for the advertisement of prefixes and how this route is
   used to address the current and future inter-subnet connectivity
   requirements existing in the Data Center.

3. The BGP E-VPN IP Prefix route

   The current BGP E-VPN NLRI as defined in [E-VPN] is shown below:

    +-----------------------------------+
    |    Route Type (1 octet)           |
    +-----------------------------------+
    |     Length (1 octet)              |
    +-----------------------------------+
    | Route Type specific (variable)    |
    +-----------------------------------+

   Where the route type field can contain one of the following specific
   values:

   + 1 - Ethernet Auto-Discovery (A-D) route

   + 2 - MAC advertisement route

   + 3 - Inclusive Multicast Route

   + 4 - Ethernet Segment Route

   This document defines an additional route type that will be used for
   the advertisement of IP Prefixes:

   + 5 - IP Prefix Route

   The support for this new route type is OPTIONAL.

   By using a separate route type for IP prefix advertisements, there is
   a clean separation of functions between route types, i.e. route type
   2 or MAC Advertisement route will be used for MAC and ARP resolution
   advertisement, whereas route type 5 or IP Prefix route will be used
   for the advertisement of prefixes. Since this new route type is
   OPTIONAL, an implementation not supporting it will easily ignore the
   route, based on the route type value.
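
   As a rough illustration of how a receiver could skip route types it
   does not support, the following Python sketch walks a buffer of E-VPN
   NLRIs using the 1-octet Route Type and 1-octet Length fields shown
   above. The helper names (split_nlri, supported_routes) are
   assumptions made for this example and are not defined by [E-VPN] or
   by this document.

```python
from enum import IntEnum
from typing import Iterable, List, Tuple


class EvpnRouteType(IntEnum):
    ETHERNET_AD = 1          # Ethernet Auto-Discovery (A-D) route
    MAC_ADVERTISEMENT = 2    # MAC advertisement route
    INCLUSIVE_MULTICAST = 3  # Inclusive Multicast Route
    ETHERNET_SEGMENT = 4     # Ethernet Segment Route
    IP_PREFIX = 5            # IP Prefix Route (this document)


def split_nlri(data: bytes) -> List[Tuple[int, bytes]]:
    """Split a buffer into (route_type, route_type_specific) pairs."""
    routes = []
    offset = 0
    while offset + 2 <= len(data):
        route_type = data[offset]
        length = data[offset + 1]
        body = data[offset + 2:offset + 2 + length]
        if len(body) != length:
            raise ValueError("truncated NLRI")
        routes.append((route_type, body))
        offset += 2 + length
    if offset != len(data):
        raise ValueError("trailing octets after last NLRI")
    return routes


def supported_routes(
    data: bytes, supported: Iterable[EvpnRouteType]
) -> List[Tuple[EvpnRouteType, bytes]]:
    """Keep only the routes whose type is supported; ignore the rest.

    The route-type-specific part of an ignored route is never parsed.
    """
    wanted = set(supported)
    return [
        (EvpnRouteType(route_type), body)
        for route_type, body in split_nlri(data)
        if route_type in wanted
    ]
```

   An implementation without route type 5 support would pass only the
   first four route types as the supported set, so that any IP Prefix
   route in the buffer is dropped on the route type value alone.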

   The detailed encoding of this route and associated procedures are
   described in the following sections.

3.1 IP Prefix Route encoding

   An IP Prefix advertisement route type specific E-VPN NLRI consists of
   the following fields:

    +---------------------------------------+
    |      RD   (8 octets)                  |
    +---------------------------------------+
    |Ethernet Segment Identifier (10 octets)|
    +---------------------------------------+
    |  Ethernet Tag ID (4 octets)           |
    +---------------------------------------+
    |  IP Address Length (1 octet)          |
    +---------------------------------------+
    |  IP Address (4 or 16 octets)          |
    +---------------------------------------+
    |     GW IP Address (4 or 16 octets)    |
    +---------------------------------------+
    |        MPLS Label (3 octets)          |
    +---------------------------------------+

   Where:

        o RD, Ethernet Tag ID and MPLS Label fields will be used as
          defined in [E-VPN] and [E-VPN-OVERLAYS].

        o The Ethernet Segment Identifier will be a non-zero 10-byte
          identifier if the ESI is used as an overlay next-hop. It will
          be zero otherwise.

        o The IP address length can be set to a value between 0 and 32
          (bits) for IPv4 and between 0 and 128 for IPv6.

        o The IP address will be a 32 or 128-bit field (IPv4 or IPv6).

        o The GW IP (Gateway IP Address) will be a 32 or 128-bit field
          (IPv4 or IPv6), and will encode the overlay IP next-hop for
          the IP Prefixes. The GW IP field can be zero if it is not used
          as an overlay next-hop.

        o The total route length will indicate the type of prefix (IPv4
          or IPv6) and the type of GW IP address (IPv4 or IPv6). Note
          that the IP Address + the GW IP should have a length of either
          64 or 256 bits, but never 160 bits (IPv4 and IPv6 mixed values
          are not allowed).

   The Eth-Tag ID, IP address length and IP address will be part of the
   route key used by BGP to compare routes. The rest of the fields will
   be out of the route key.
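
   The field layout above can be exercised with a minimal Python
   sketch. It is not a reference implementation: the class and function
   names are assumptions, the MPLS Label field is handled as a raw
   24-bit value, and the derivation of the address family from the
   total length (34 or 58 octets) simply follows the field sizes listed
   in this section.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Union

ROUTE_TYPE_IP_PREFIX = 5
IpAddr = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


@dataclass
class IpPrefixRoute:
    rd: bytes        # Route Distinguisher, 8 octets
    esi: bytes       # Ethernet Segment Identifier, 10 octets
    eth_tag: int     # Ethernet Tag ID, 4 octets
    prefix_len: int  # IP Address Length, in bits
    prefix: IpAddr   # IP Address, 4 or 16 octets
    gw_ip: IpAddr    # GW IP Address, 4 or 16 octets
    label: int       # MPLS Label field, kept as a raw 24-bit value

    def encode(self) -> bytes:
        if len(self.rd) != 8 or len(self.esi) != 10:
            raise ValueError("RD must be 8 octets and ESI 10 octets")
        if self.prefix.version != self.gw_ip.version:
            raise ValueError("mixed IPv4/IPv6 values are not allowed")
        if not 0 <= self.prefix_len <= self.prefix.max_prefixlen:
            raise ValueError("IP address length out of range")
        body = (
            self.rd
            + self.esi
            + struct.pack("!IB", self.eth_tag, self.prefix_len)
            + self.prefix.packed
            + self.gw_ip.packed
            + self.label.to_bytes(3, "big")
        )
        # Route Type (1 octet) and Length (1 octet) precede the body.
        return bytes([ROUTE_TYPE_IP_PREFIX, len(body)]) + body


def decode(nlri: bytes) -> IpPrefixRoute:
    route_type, length, body = nlri[0], nlri[1], nlri[2:]
    if route_type != ROUTE_TYPE_IP_PREFIX or length != len(body):
        raise ValueError("not a well-formed IP Prefix route")
    # 26 fixed octets plus two addresses of equal size: 34 or 58 octets.
    if length == 34:
        size, addr = 4, ipaddress.IPv4Address
    elif length == 58:
        size, addr = 16, ipaddress.IPv6Address
    else:
        raise ValueError("length matches neither IPv4 nor IPv6 encoding")
    eth_tag, prefix_len = struct.unpack("!IB", body[18:23])
    return IpPrefixRoute(
        rd=body[0:8],
        esi=body[8:18],
        eth_tag=eth_tag,
        prefix_len=prefix_len,
        prefix=addr(body[23:23 + size]),
        gw_ip=addr(body[23 + size:23 + 2 * size]),
        label=int.from_bytes(body[-3:], "big"),
    )
```

   Rejecting a mixed IPv4/IPv6 pair at encode time means a 160-bit
   address combination can never be produced, which keeps the total
   length unambiguous for the decoder.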

   The route will contain a single overlay next-hop, i.e. if the ESI
   field is zero, the GW IP field will not, and vice versa. The
   following table shows the different inter-subnet use-cases described
   in this document and the corresponding coding of the overlay next-hop
   in the route-type 5.

   +----------------------------+----------------------------------+
   | Overlay next-hop use-case  | Field in the route-type 5        |
   +----------------------------+----------------------------------+
   | TS IP address              | GW IP Address                    |
   | Floating IP address        | GW IP Address                    |
   | IRB IP address             | GW IP Address                    |
   | "Bump in the wire"         | ESI                              |
   +----------------------------+----------------------------------+
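
   The single overlay next-hop rule can be expressed as a small check.
   The Python sketch below is only one possible reading of that rule;
   the function name and the returned labels are assumptions made for
   illustration.

```python
import ipaddress
from typing import Tuple, Union

ZERO_ESI = bytes(10)
IpAddr = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def overlay_next_hop(
    esi: bytes, gw_ip: IpAddr
) -> Tuple[str, Union[bytes, IpAddr]]:
    """Return the field that carries the overlay next-hop and its value.

    Exactly one of the ESI and GW IP fields is expected to be non-zero.
    """
    if len(esi) != 10:
        raise ValueError("ESI must be 10 octets")
    esi_used = esi != ZERO_ESI
    gw_used = int(gw_ip) != 0
    if esi_used == gw_used:
        raise ValueError("exactly one of ESI and GW IP must be non-zero")
    return ("ESI", esi) if esi_used else ("GW IP", gw_ip)
```

   The TS IP, floating IP and IRB IP use-cases all take the "GW IP"
   branch; only the "Bump in the wire" use-case takes the "ESI" branch.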

4. Benefits of using the E-VPN IP Prefix route

   This section clarifies the different functions accomplished by the E-
   VPN route-type 2 and route-type 5 routes, and provides a list of
   benefits derived from using a separate route type for the
   advertisement of IP Prefixes in E-VPN.

   [E-VPN] describes the content of the BGP E-VPN route type 2 specific
   NLRI, i.e. MAC Advertisement Route, where the IP address length (IPL)
   and IP address (IP) of a specific advertised MAC are encoded. The
   subject of the MAC advertisement route is the MAC address (M) and MAC
   address length (ML) encoded in the route. The MAC mobility and other
   complex procedures are defined around that MAC address. The IP
   address information carries the host IP address required for the ARP
   resolution of the MAC.

   The BGP E-VPN route type 5 defined in this document, i.e. IP Prefix
   Advertisement route, decouples the advertisement of IP prefixes from
   the advertisement of any MAC address related to it. This brings some
   major benefits to NVO-based networks where inter-subnet forwarding is
   required. Some of those benefits are:

   a) Upon receiving a route type 2 or type 5, an egress NVE can easily
      distinguish MACs and IPs for ARP resolution from IP Prefixes. E.g.
      an IP prefix with IPL=32 being advertised from two different
      ingress NVEs (as route type 5) can be identified as such and be
      imported in the designated routing context as two ECMP routes, as
      opposed to two ARP entries competing for the same IP.

   b) Similarly, upon receiving a route, an egress NVE not supporting
      processing IP Prefixes can easily ignore the update, based on the
      route type.

   c) A MAC route includes the ML, M, IPL and IP in the route key that
      is used by BGP to compare routes, whereas for IP Prefix routes,
      only IPL and IP (as well as Ethernet Tag ID) are part of the route
      key. Advertised IP Prefixes are imported into the designated
      routing context, where there is no MAC information associated to
      IP routes. In the example illustrated in figure 1, subnet SN1
      should be advertised by NVE2 and NVE3 and interpreted by DGW1 as
      the same route coming from two different next-hops, regardless of
      the MAC address associated to TS2 or TS3. This is easily
      accomplished in the route type 5 by including only the IP
      information in the route key.

   d) By decoupling the MAC from the IP Prefix advertisement procedures,
      we can leave the IP prefix advertisements out of the MAC mobility
      procedures defined in [E-VPN] for MACs. In addition, this allows
      us to have an indirection mechanism for IP prefixes advertised
      from a MAC/IP that can move between hypervisors. E.g. if there are
      1,000 prefixes sitting behind TS2 (figure 1), NVE2 will advertise
      all those prefixes in type 5 routes associated with the next-hop
      IP2. Should TS2 move to a different NVE, a single MAC
      advertisement route withdraw for the M2/IP2 route from NVE2 will
      invalidate the 1,000 prefixes, as opposed to having to wait for
      each individual prefix to be withdrawn. This is easily
      accomplished by using IP Prefix routes that are not tied to a MAC
      address, and by using a separate MAC route to advertise the
      location and resolution of the overlay next-hop to a MAC address.
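
   The effect of the route key described in item c) can be illustrated
   with a short Python sketch. The tuple types and the
   group_next_hops() helper are assumptions made for this example, the
   symbolic values (SN1, M2, IP2, ...) stand for the addresses in
   figure 1, and an Ethernet Tag ID of 0 is an arbitrary choice.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, NamedTuple, Set, Tuple


class MacRouteKey(NamedTuple):
    """Key fields of a MAC route as listed in item c): ML, M, IPL, IP."""
    mac_len: int
    mac: str
    ip_len: int
    ip: str


class PrefixRouteKey(NamedTuple):
    """Key fields of an IP Prefix route: Ethernet Tag ID, IPL, IP."""
    eth_tag: int
    ip_len: int
    ip: str


def group_next_hops(
    routes: Iterable[Tuple[Hashable, str]]
) -> Dict[Hashable, Set[str]]:
    """Group received next-hops by route key."""
    table: Dict[Hashable, Set[str]] = defaultdict(set)
    for key, next_hop in routes:
        table[key].add(next_hop)
    return dict(table)


# SN1/24 advertised by NVE2 (behind TS2) and NVE3 (behind TS3).
as_prefix_routes = [
    (PrefixRouteKey(0, 24, "SN1"), "IP2"),
    (PrefixRouteKey(0, 24, "SN1"), "IP3"),
]
# The same information forced into MAC-route keys.
as_mac_routes = [
    (MacRouteKey(48, "M2", 24, "SN1"), "IP2"),
    (MacRouteKey(48, "M3", 24, "SN1"), "IP3"),
]
```

   Grouping as_prefix_routes yields a single entry with two next-hops
   (a candidate ECMP set), whereas grouping as_mac_routes yields two
   unrelated entries because M2 and M3 are part of the key.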

5. IP Prefix next-hop use-cases

   The IP Prefix route can use a GW IP or an ESI as an overlay next-hop.
   This section describes some use-cases for both next-hop types.

5.1 TS IP address next-hop use-case

   The following figure illustrates an example of inter-subnet
   forwarding for subnets sitting behind Virtual Appliances (on TS2 and
   TS3).

     SN1---+           NVE2                            DGW1
           |        +--------+    +---------+    +-------------+
     SN2---TS2(VA)--|(EVI-10)|----|         |----|(EVI-10)     |
           | IP2/M2 +--------+    |         |    |    IRB1\    |
     IP4---+                      |         |    |        (VRF)|---+
                                  |         |    +-------------+  _|_
                                  |  VXLAN/ |                    (   )
                                  |  nvGRE  |         DGW2      ( WAN )
     SN1---+           NVE3       |         |    +-------------+ (___)
           | IP3/M3 +--------+    |         |----|(EVI-10)     |   |
     SN3---TS3(VA)--|(EVI-10)|----|         |    |    IRB2\    |   |
           |        +--------+    +---------+    |        (VRF)|---+
     IP5---+                                     +-------------+

                  Figure 2 TS IP address use-case

   An example of inter-subnet forwarding between subnet SN1/24 and a
   subnet located in the WAN is described below. NVE2, NVE3, DGW1 and
   DGW2 are running BGP E-VPN. TS2 and TS3 do not support routing
   protocols; they only have a static route to forward the traffic to
   the WAN.

   (1) NVE2 advertises the following BGP routes on behalf of TS2:

        o Route type 2 (MAC route) containing: ML=48, M=M2, IPL=32,
          IP=IP2

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=0, GW IP address=IP2

   (2) NVE3 advertises the following BGP routes on behalf of TS3:

        o Route type 2 (MAC route) containing: ML=48, M=M3, IPL=32,
          IP=IP3

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=0, GW IP address=IP3

   (3) DGW1 and DGW2 import both received routes based on the RT:

        o Based on the EVI-10 route-target in DGW1 and DGW2, the MAC
          route is imported and M2 is added to the EVI-10 MAC FIB along
          with its corresponding tunnel information. For the VXLAN use
          case, the VTEP will be derived from the MAC route BGP next-hop
          (underlay next-hop) and VNI from the Ethernet Tag or MPLS
          fields (see [E-VPN-OVERLAYS]). IP2 - M2 is added to the ARP
          table.

        o Based on the EVI-10 route-target in DGW1 and DGW2, the IP
          Prefix route is also imported and SN1/24 is added to the
          designated routing context with next-hop IP2 pointing at the
          local EVI-10. Should ECMP be enabled in the routing context,
          SN1/24 would also be added to the routing table with next-hop
          IP3.

   (4) When DGW1 receives a packet from the WAN with destination IPx,
   where IPx belongs to SN1/24:

        o A destination IP lookup is performed on the DGW1 VRF routing
          table and next-hop=IP2 is found. The tunnel information to
          encapsulate the packet will be derived from the route-type 2
          (MAC route) received for M2/IP2.

        o IP2 is resolved to M2 in the ARP table, and M2 is resolved to
          the tunnel information given by the MAC FIB (remote VTEP and
          VNI for the VXLAN case).

        o The IP packet destined to IPx is encapsulated with:

             . Source inner MAC = IRB1 MAC

             . Destination inner MAC = M2

             . Tunnel information provided by the MAC FIB (VNI, VTEP IPs
               and MACs for the VXLAN case)

   (5) When the packet arrives at NVE2:

        o Based on the tunnel information (VNI for the VXLAN case), the
          EVI-10 context is identified for a MAC lookup.

        o The encapsulation is stripped off and, based on a MAC lookup
          (assuming MAC forwarding on the egress NVE), the packet is
          forwarded to TS2, where it will be properly routed.

   (6) Should TS2 move from NVE2 to NVE3, MAC Mobility procedures will
   be applied to the MAC route IP2/M2, as defined in [E-VPN]. Route
   type 5 prefixes are not subject to MAC Mobility procedures, hence
   TS2 mobility causes no changes in the DGW VRF routing table, i.e.
   all the prefixes will still point at IP2 as next-hop. SN1/24, for
   example, still points at next-hop IP2 in the routing table, but IP2
   is simply resolved to a different tunnel, based on the outcome of
   the MAC Mobility procedures for the MAC route IP2/M2.

   Note that in the opposite direction, TS2 will send traffic based on
   its static-route next-hop information (IRB1 and/or IRB2), and regular
   E-VPN procedures will be applied.
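
   The recursive resolution described in steps (3) to (6) can be
   sketched as follows. This is a minimal illustration and not part of
   the specification: the dictionary-based tables, the Tunnel type, the
   resolve() helper, the VTEP names and the VNI value are all
   assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tunnel:
    vtep: str  # remote VTEP, derived from the MAC route BGP next-hop
    vni: int   # derived from the Ethernet Tag or MPLS fields


# Hypothetical DGW1 state after importing the routes from NVE2.
vrf = {"SN1/24": "IP2"}                        # route type 5, GW IP = IP2
arp = {"IP2": "M2"}                            # route type 2, IP2/M2
mac_fib = {"M2": Tunnel(vtep="NVE2", vni=10)}  # route type 2, M2


def resolve(prefix):
    """Resolve prefix -> GW IP -> MAC -> tunnel, as in step (4)."""
    gw_ip = vrf[prefix]
    mac = arp[gw_ip]
    return gw_ip, mac, mac_fib[mac]


before = resolve("SN1/24")

# Step (6): TS2 moves to NVE3. Only the MAC route for IP2/M2 changes;
# the VRF entry for SN1/24 is left untouched.
mac_fib["M2"] = Tunnel(vtep="NVE3", vni=10)
after = resolve("SN1/24")

print(before)
print(after)
```

   In this sketch the VRF entry is never rewritten; only the MAC FIB
   entry that IP2 resolves to is updated after the move.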

5.2 Floating IP next-hop use-case

   Sometimes Tenant Systems (TS) work in active/standby mode, where an
   upstream floating IP, owned by the active TS, is used as the next-
   hop to reach the subnets behind them. This redundancy mode, already
   introduced in sections 2.1 and 2.3, is illustrated in Figure 3.

                       NVE2                           DGW1
                    +--------+    +---------+    +-------------+
       +---TS2(VA)--|(EVI-10)|----|         |----|(EVI-10)     |
       |     IP2/M2 +--------+    |         |    |    IRB1\    |
       |      <-+                 |         |    |        (VRF)|---+
       |        |                 |         |    +-------------+  _|_
      SN1    vIP23 (floating)     |  VXLAN/ |                    (   )
       |        |                 |  nvGRE  |         DGW2      ( WAN )
       |      <-+      NVE3       |         |    +-------------+ (___)
       |     IP3/M3 +--------+    |         |----|(EVI-10)     |   |
       +---TS3(VA)--|(EVI-10)|----|         |    |    IRB2\    |   |
                    +--------+    +---------+    |        (VRF)|---+
                                                 +-------------+
                  Figure 3 Floating IP next-hop for redundant TS

   In this example, assuming TS2 is the active TS and owns the floating
   IP23 (shown as vIP23 in Figure 3):

   (1) NVE2 advertises the following BGP routes for TS2:

        o Route type 2 (MAC route) containing: ML=48, M=M2, IPL=32,
          IP=IP23

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=0, GW IP address=IP23

   (2) NVE3 advertises the following BGP routes for TS3:

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=0, GW IP address=IP23

   (3) DGW1 and DGW2 import both received routes based on the RT:

        o M2 is added to the EVI-10 MAC FIB along with its corresponding
          tunnel information. For the VXLAN use case, the VTEP will be
          derived from the MAC route BGP next-hop and VNI from the
          Ethernet Tag or MPLS fields (see [E-VPN-OVERLAYS]). IP23 - M2
          is added to the ARP table.

        o SN1/24 is added to the designated routing context in DGW1 and
          DGW2 with next-hop IP23 pointing at the local EVI-10.

   (4) When DGW1 receives a packet from the WAN with destination IPx,
   where IPx belongs to SN1/24:

        o A destination IP lookup is performed on the DGW1 VRF routing
          table and next-hop=IP23 is found. The tunnel information to
          encapsulate the packet will be derived from the route-type 2
          (MAC route) received for M2/IP23.

        o IP23 is resolved to M2 in the ARP table, and M2 is resolved to
          the tunnel information given by the MAC FIB (remote VTEP and
          VNI for the VXLAN case).

        o The IP packet destined to IPx is encapsulated with:

             . Source inner MAC = IRB1 MAC

             . Destination inner MAC = M2

             . Tunnel information provided by the MAC FIB (VNI, VTEP IPs
               and MACs for the VXLAN case)

   (5) When the packet arrives at NVE2:

        o Based on the tunnel information (VNI for the VXLAN case), the
          EVI-10 context is identified for a MAC lookup.


        o The encapsulation is stripped off and, based on a MAC lookup
          (assuming MAC forwarding on the egress NVE), the packet is
          forwarded to TS2, where it will be properly routed.

   (6) When the redundancy protocol running between TS2 and TS3 appoints
   TS3 as the new active TS for SN1, TS3 will now own the floating IP23
   and will signal this new ownership (GARP message or similar). Upon
   receiving the new owner's notification, NVE3 will issue a route type
   2 for M3-IP23. DGW1 and DGW2 will update their ARP tables with the
   new MAC resolving the floating IP. No changes are carried out in the
   VRF routing table.

   In the DGW1/2 BGP RIB, there will be two route type 5 routes for SN1
   (from NVE2 and NVE3) but only the one with the same BGP next-hop as
   the IP23 route type 2 BGP next-hop will be valid.
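
   The validity rule above can be sketched as follows. This is an
   illustration only: the tuple layouts and the valid_prefix_routes()
   helper are assumptions, and BGP next-hops are represented simply by
   the NVE names.

```python
from typing import NamedTuple


class MacRoute(NamedTuple):      # route type 2
    mac: str
    ip: str
    bgp_next_hop: str


class PrefixRoute(NamedTuple):   # route type 5
    prefix: str
    gw_ip: str
    bgp_next_hop: str


def valid_prefix_routes(prefix_routes, mac_routes):
    """Keep the type 5 routes whose BGP next-hop matches the BGP
    next-hop of a type 2 route advertising the GW IP address."""
    owners = {(m.ip, m.bgp_next_hop) for m in mac_routes}
    return [p for p in prefix_routes
            if (p.gw_ip, p.bgp_next_hop) in owners]


rib5 = [PrefixRoute("SN1/24", "IP23", "NVE2"),
        PrefixRoute("SN1/24", "IP23", "NVE3")]

# TS2 is active: NVE2 advertises M2/IP23.
active_ts2 = valid_prefix_routes(rib5, [MacRoute("M2", "IP23", "NVE2")])

# After the switchover NVE3 advertises M3/IP23. Assumption: the
# M2/IP23 route from NVE2 is no longer present in the RIB.
active_ts3 = valid_prefix_routes(rib5, [MacRoute("M3", "IP23", "NVE3")])

print(active_ts2)
print(active_ts3)
```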


5.3 IRB IP next-hop use-case

   In some other cases, the NVEs and DGWs will have just IRB interfaces
   as hosts in the E-VPN instance. Figure 4 illustrates an example.

                            NVE1
          +---------------------+                    DGW1
    IP1---|(EVI-1)              |               +-------------+
          |       \   IRB3      |  +---------+  |(EVI-10)     |
          |       (VRF)-(EVI-10)|--|         |--|    IRB1\    |
          |       /             |  |         |  |        (VRF)|---+
        |-|(EVI-2)              |  |         |  +-------------+  _|_
     SN1| +---------------------+  |         |                  (   )
        | +---------------------+  |  VXLAN/ |       DGW2      ( WAN )
        |-|(EVI-2)              |  |  nvGRE  |  +-------------+ (___)
          |       \   IRB4      |  |         |  |(EVI-10)     |   |
          |       (VRF)-(EVI-10)|--|         |--|    IRB2\    |   |
          |       /             |  +---------+  |        (VRF)|---+
    SN2---|(EVI-3)              |               +-------------+
          +---------------------+
                            NVE2

            Figure 4 IRB IP next-hop use-case

   In this case:

   (1) NVE1 advertises the following BGP routes for SN1 resolution:

        o Route type 2 (MAC route) containing: ML=48, M=IRB3-MAC,
          IPL=32, IP=IRB3-IP

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=0, GW IP address=IRB3-IP

   (2) NVE2 advertises the following BGP routes for SN1 resolution:

        o Route type 2 (MAC route) containing: ML=48, M=IRB4-MAC,
          IPL=32, IP=IRB4-IP

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=0, GW IP address=IRB4-IP

   (3) DGW1 and DGW2 import both received routes based on the RT:

        o IRB3-MAC and IRB4-MAC are added to the EVI-10 MAC FIB along
          with their corresponding tunnel information. For the VXLAN use
          case, the VTEP will be derived from the MAC route BGP next-hop
          and VNI from the Ethernet Tag or MPLS fields (see [E-VPN-
          OVERLAYS]). IRB3-MAC - IRB3-IP and IRB4-MAC - IRB4-IP are
          added to the ARP table.

        o SN1/24 is added to the designated routing context in DGW1 and
          DGW2 with next-hop IRB3-IP (and/or IRB4-IP) pointing at the
          local EVI-10.

   Forwarding procedures similar to the ones described in the previous
   use-cases are followed.
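
   As an illustration of the "and/or" case, where SN1/24 is installed
   with both IRB3-IP and IRB4-IP as next-hops, a DGW could select one
   next-hop per flow and then resolve it as in the previous use-cases.
   The sketch below is an assumption-laden example: the table layout,
   the CRC32-based flow hash and the sample 5-tuple are not defined by
   this document.

```python
import zlib

# Hypothetical DGW state after importing the routes from NVE1 and NVE2.
mac_fib = {"IRB3-MAC": "NVE1", "IRB4-MAC": "NVE2"}    # MAC -> remote VTEP
arp = {"IRB3-IP": "IRB3-MAC", "IRB4-IP": "IRB4-MAC"}  # IP -> MAC
vrf = {"SN1/24": ["IRB3-IP", "IRB4-IP"]}              # ECMP next-hop set


def pick_next_hop(prefix, flow):
    """Pick one next-hop per flow (CRC32 is just an example hash)."""
    next_hops = vrf[prefix]
    index = zlib.crc32(repr(flow).encode()) % len(next_hops)
    return next_hops[index]


def forward(prefix, flow):
    """Resolve the chosen next-hop IP to a MAC and a remote VTEP."""
    gw_ip = pick_next_hop(prefix, flow)
    mac = arp[gw_ip]
    return gw_ip, mac, mac_fib[mac]


flow = ("IPw", "IPx", 6, 40000, 80)  # made-up 5-tuple
result = forward("SN1/24", flow)
print(result)
```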

5.4 ESI next-hop ("Bump in the wire") use-case

   The following figure illustrates an example of inter-subnet
   forwarding for a subnet route that uses an ESI as an overlay next-
   hop. In this use-case, TS2 and TS3 are layer-2 VA devices without any
   IP address that can be included as an overlay next-hop in the GW IP
   field of the IP Prefix route.

                      NVE2                           DGW1
                  +--------+    +---------+    +-------------+
     +---TS2(VA)--|(EVI-10)|----|         |----|(EVI-10)     |
     |      ESI23 +--------+    |         |    |    IRB1\    |
     |        +                 |         |    |        (VRF)|---+
     |        |                 |         |    +-------------+  _|_
    SN1       |                 |  VXLAN/ |                    (   )
     |        |                 |  nvGRE  |         DGW2      ( WAN )
     |        +      NVE3       |         |    +-------------+ (___)
     |      ESI23 +--------+    |         |----|(EVI-10)     |   |
     +---TS3(VA)--|(EVI-10)|----|         |    |    IRB2\    |   |
                  +--------+    +---------+    |        (VRF)|---+
                                               +-------------+

                  Figure 5 ESI next-hop use-case

   Since neither TS2 nor TS3 can run a routing protocol and neither has
   an IP address assigned, an ESI, i.e. ESI23, will be provisioned on
   the attachment ports of NVE2 and NVE3. This model supports VA
   redundancy in a similar way to the one described in section 5.2 for
   the floating IP next-hop use-case, only using the E-VPN A-D route
   instead of the MAC advertisement route to advertise the location of
   the overlay next-hop. The procedure is explained below:

   (1) NVE2 advertises the following BGP routes for TS2:

        o Route type 1 (A-D route for EVI-10) containing: ESI=ESI23 and
          the corresponding tunnel information (Ethernet Tag and/or MPLS
          label). Assuming the ESI is active on NVE2, NVE2 will
          advertise this route.

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=ESI23, GW IP address=0.

   (2) NVE3 advertises the following BGP routes for TS3:

        o Route type 1 (A-D route for EVI-10) containing: ESI=ESI23 and
          the corresponding tunnel information (Ethernet Tag and/or MPLS
          label). NVE3 will advertise this route assuming the ESI is
          active on NVE3. Note that if the resiliency mechanism for TS2
          and TS3 is in active-active mode, both NVE2 and NVE3 will send
          the A-D route. Otherwise, that is, if the resiliency is
          active-standby, only the NVE owning the active ESI will
          advertise the A-D route for ESI23.

        o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
          ESI=ESI23, GW IP address=0.

   (3) DGW1 and DGW2 import the received routes based on the RT:

        o The tunnel information to get to ESI23 is installed in DGW1
          and DGW2. For the VXLAN use case, the VTEP will be derived
          from the A-D route BGP next-hop and VNI from the Ethernet Tag
          or MPLS fields (see [E-VPN-OVERLAYS]).

        o SN1/24 is added to the designated routing context in DGW1 and
          DGW2 with next-hop ESI23 pointing at the local EVI-10.

   (4) When DGW1 receives a packet from the WAN with destination IPx,
   where IPx belongs to SN1/24:

        o A destination IP lookup is performed on the DGW1 VRF routing
          table and next-hop=ESI23 is found. The tunnel information to
          encapsulate the packet will be derived from the route-type 1
          (A-D route) received for ESI23.

        o The IP packet destined to IPx is encapsulated with:

             . Source inner MAC = IRB1 MAC

             . Destination inner MAC = M2 (this MAC will be looked up in
               the EVI-10 FDB using the ESI23 as the key for the
               lookup).

             . Tunnel information provided by the A-D route for ESI23
               (VNI, VTEP IP and MACs for the VXLAN case).

   (5) When the packet arrives at NVE2:

        o Based on the tunnel information (VNI for the VXLAN case), the
          EVI-10 context is identified for a MAC lookup (assuming MAC
          disposition model).

        o The encapsulation is stripped off and, based on a MAC lookup
          (assuming MAC forwarding on the egress NVE), the packet is
          forwarded to TS2, where it will be properly forwarded.

   (6) If the redundancy protocol running between TS2 and TS3 follows an
   active/standby model and there is a failure, appointing TS3 as the
   new active TS for SN1, TS3 will now own the connectivity to SN1 and
   will signal this new ownership (GARP message or similar). Upon
   receiving the new owner's notification, NVE3 will issue a route type
   1 for ESI23, whereas NVE2 will withdraw its A-D route for ESI23.
   DGW1 and DGW2 will update their tunnel information to resolve ESI23.
   No changes are carried out in the VRF routing table.

   In the DGW1/2 BGP RIB, there will be two route type 5 routes for SN1
   (from NVE2 and NVE3) but only the one with the same BGP next-hop as
   the ESI23 route type 1 BGP next-hop will be valid.
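
   The ESI-based resolution and the active/standby switchover in step
   (6) can be sketched as follows. This is only an illustration: the
   table layout and the advertise_ad(), withdraw_ad() and resolve()
   helpers are assumptions, and tunnel information is reduced to the
   name of the advertising NVE.

```python
# Hypothetical DGW1 state for EVI-10.
vrf = {"SN1/24": "ESI23"}   # route type 5: ESI = ESI23, GW IP = 0
ad_routes = {}              # ESI -> set of NVEs advertising route type 1


def advertise_ad(esi, nve):
    """Record an A-D route received from the given NVE."""
    ad_routes.setdefault(esi, set()).add(nve)


def withdraw_ad(esi, nve):
    """Remove the A-D route previously received from the given NVE."""
    ad_routes.get(esi, set()).discard(nve)


def resolve(prefix):
    """Return the NVEs through which the ESI next-hop is reachable."""
    return sorted(ad_routes.get(vrf[prefix], set()))


advertise_ad("ESI23", "NVE2")   # TS2 is active
before = resolve("SN1/24")

advertise_ad("ESI23", "NVE3")   # step (6): TS3 becomes active
withdraw_ad("ESI23", "NVE2")
after = resolve("SN1/24")

print(before, after)
```

   As in the floating IP use-case, the VRF entry for SN1/24 stays the
   same in this sketch; only the tunnel resolution for ESI23 changes.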


6. Conclusions

   A new E-VPN route type 5 for the advertisement of IP Prefixes is
   proposed in this document. This new route type will have a
   differentiated role from the route type 2, i.e. MAC advertisement
   route, and will address all the inter-subnet connectivity scenarios
   which are required in the Data Center, where the overlay next-hop can
   be an IP address or an ESI. As discussed throughout the document,
   IP-VPN cannot be used in an NVO-based DC to advertise IP Prefixes and
   the existing E-VPN route type 2 does not meet the requirements for
   all the DC use cases; therefore, a new E-VPN route type is required.

   This new E-VPN route type 5 decouples the IP Prefix advertisements
   from the MAC route advertisements in E-VPN, hence:

   a) Allows the clean and clear announcement of IPv4 or IPv6 prefixes
      in an NLRI with no MAC addresses in the route key, so that only IP
      information is used in BGP route comparisons.

   b) Since the route type is different from the MAC advertisement
      route, the advertisement of prefixes will be excluded from all the
      procedures defined for the advertisement of VM MACs, e.g. MAC
      Mobility or aliasing. As a result of that, the current E-VPN
      procedures do not need to be modified.

   c) Allows a flexible implementation where the prefix can be linked to
      different types of next-hops: MAC address, IP address, IRB IP
      address, ESI, etc. and these MAC or IP addresses do not need to
      reside in the advertising NVE.

   d) An E-VPN implementation not requiring IP Prefixes can simply
      discard them by looking at the route type value.
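
   Point d) can be illustrated with a short sketch. The dictionary
   representation of a route and the accept() helper are assumptions
   made for the example; only the route type values 2 and 5 come from
   this document.

```python
IP_PREFIX_ROUTE_TYPE = 5  # route type proposed in this document


def accept(routes, supports_ip_prefix):
    """Drop type 5 routes when IP Prefix routes are not required."""
    return [r for r in routes
            if supports_ip_prefix or r["type"] != IP_PREFIX_ROUTE_TYPE]


received = [{"type": 2, "mac": "M2", "ip": "IP2"},
            {"type": 5, "prefix": "SN1/24", "gw_ip": "IP2"}]

kept = accept(received, supports_ip_prefix=False)
print(kept)
```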


7. Conventions used in this document

      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
      in this document are to be interpreted as described in RFC 2119
      [RFC2119].

8. Security Considerations


9. IANA Considerations


10. References

10.1 Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
      Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
      Networks (VPNs)", RFC 4364, February 2006.


10.2 Informative References

   [E-VPN] Sajassi et al., "BGP MPLS Based Ethernet VPN", draft-ietf-
      l2vpn-evpn-03.txt, work in progress, February 2013.

   [E-VPN-OVERLAYS] Sajassi-Drake et al., "A Network Virtualization
      Overlay Solution using E-VPN", draft-sd-l2vpn-evpn-overlay-01.txt,
      work in progress, February 2013.

11. Acknowledgments

      The authors would like to thank Mukul Katiyar and Senthil
      Sathappan for their valuable feedback and contributions.

12. Authors' Addresses

      Jorge Rabadan
      Alcatel-Lucent
      777 E. Middlefield Road
      Mountain View, CA 94043 USA
      Email: jorge.rabadan@alcatel-lucent.com

      Wim Henderickx
      Alcatel-Lucent
      Email: wim.henderickx@alcatel-lucent.com

      Florin Balus
      Nuage Networks
      Email: florin@nuagenetworks.net

      Aldrin Isaac
      Bloomberg
      Email: aisaac71@bloomberg.net

      Senad Palislamovic
      Alcatel-Lucent
      Email: senad.palislamovic@alcatel-lucent.com
