INTERNET-DRAFT                                               A. Ghanwani
Intended Status: Informational                                      Dell
Expires: August 12, 2014                                       L. Dunbar
                                                               V. Bannai
                                                             R. Krishnan
                                                       February 13, 2014

                Multicast Issues in Networks Using NVO3

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect

Ghanwani                Expires August 12, 2014                 [Page 1]

INTERNET DRAFT          Multicast Issues in NVO3       February 13, 2014

   to this document.


Abstract

   This memo discusses issues with supporting multicast traffic in a
   network that uses Network Virtualization using Overlays over Layer 3
   (NVO3). It describes the various mechanisms that may be used for
   multicast and discusses some of the considerations with supporting
   multicast applications in networks that use NVO3.

Table of Contents

   1. Introduction
   2. Multicast mechanisms in networks that use NVO3
     2.1 No multicast support
     2.2 Replication at the source NVE
     2.3 Replication at a multicast service node
     2.4 IP multicast in the underlay
     2.5 Other schemes
   3. Simultaneous use of more than one mechanism
   4. IP multicast applications in the overlay
   5. Summary
   6. Security Considerations
   7. IANA Considerations
   8. References
     8.1  Normative References
     8.2  Informative References
   Authors' Addresses

1. Introduction

   Network virtualization using Overlays over Layer 3 (NVO3) is a
   technology that is used to address issues that arise in building
   large, multitenant data centers that make extensive use of server
   virtualization [PS].

   This document is focused specifically on the problem of supporting
   multicast in networks that use NVO3.  Because of the requirement of
   multi-destination delivery, multicast traffic poses some unique
   challenges.

   The reader is assumed to be familiar with the terminology as defined
   in the NVO3 Framework document [FW].

2. Multicast mechanisms in networks that use NVO3

   In NVO3 environments, traffic between NVEs is transported using a
   tunnel encapsulation such as VXLAN [VXLAN], NVGRE [NVGRE], or STT
   [STT].

   Besides the need to support the Address Resolution Protocol (ARP) and
   Neighbor Discovery (ND), there are several applications that require
   the support of multicast and/or broadcast in data centers [DC-MC].
   With NVO3, there are many possible ways that multicast may be handled
   in such networks.  We discuss some of the attributes of the following
   four methods, but other methods are also possible.

      1. No multicast support.
      2. Replication at the source NVE.
      3. Replication at a multicast service node.
      4. IP multicast in the underlay.

   These mechanisms are briefly mentioned in the NVO3 Framework [FW]
   document.  This document fills in more detail about how each of
   these mechanisms works and discusses the issues and tradeoffs of
   each.

2.1 No multicast support

   In this scenario, there is no support whatsoever for multicast
   traffic when using the overlay.  This can only work if the following
   conditions are met:

      1. All application traffic is unicast.  In other words, there are
         no multicast applications in the network, and the only
         multi-destination traffic is due to ARP/ND and to flooding of
         frames with an unknown MAC destination address.

      2. A network virtualization authority (NVA) is used by the NVE
         to determine the MAC address-to-NVE mapping and to determine
         the MAC address-to-IP address bindings.  In other words,
         there is no data plane learning, and address resolution
         requests via ARP/ND that are issued by the VMs must be
         resolved by the NVE that they are attached to.

   With this approach, certain multicast/broadcast applications such as
   DHCP can be supported by use of a helper function in the NVE.

   The main issue that needs to be addressed with this mechanism is the
   handling of hosts for which a mapping does not already exist in the
   NVA.  This issue can be particularly challenging if such end systems
   are reachable through more than one NVE.
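
   The address resolution behavior described above can be sketched as
   follows.  This is only an illustration, not a definitive design:
   the names NvaCache, Binding, and handle_arp_request are
   hypothetical, and the sketch assumes the NVA has already pushed its
   bindings to the NVE.  A result of None corresponds to the
   problematic case of a host for which no mapping exists.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Binding:
    """One NVA-supplied binding for a tenant system (hypothetical type)."""
    mac: str      # MAC address of the tenant system
    nve_ip: str   # underlay IP address of the NVE it is attached to


class NvaCache:
    """Hypothetical NVE-local copy of NVA state, keyed by (VN id, tenant IP)."""

    def __init__(self) -> None:
        self._bindings: Dict[Tuple[int, str], Binding] = {}

    def install(self, vn_id: int, tenant_ip: str, binding: Binding) -> None:
        self._bindings[(vn_id, tenant_ip)] = binding

    def resolve(self, vn_id: int, tenant_ip: str) -> Optional[Binding]:
        return self._bindings.get((vn_id, tenant_ip))


def handle_arp_request(cache: NvaCache, vn_id: int,
                       target_ip: str) -> Optional[str]:
    """Return the MAC address for a locally generated ARP reply.

    Returns None when the NVA has no mapping.  The request cannot be
    flooded in that case because the overlay offers no multicast
    or broadcast service.
    """
    binding = cache.resolve(vn_id, target_ip)
    return binding.mac if binding is not None else None
```

   Keying the cache on the virtual network identifier as well as the
   tenant IP address keeps overlapping tenant address spaces apart.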

2.2 Replication at the source NVE

   With this method, the overlay attempts to provide a multicast service
   without requiring any specific support from the underlay, other than
   that of a unicast service.  A multicast or broadcast transmission is
   achieved by replicating the packet at the source NVE, and making
   copies, one for each destination NVE that the multicast packet must
   be sent to.

   For this mechanism to work, the source NVE must know, a priori, the
   IP addresses of all destination NVEs that need to receive the packet.
   For example, in the case of an ARP broadcast or an ND multicast, the
   source NVE must know the IP addresses of all the remote NVEs where
   there are members of the tenant subnet in question.

   The obvious drawback of this method is that multiple copies of the
   same packet will traverse any links that are common to the paths to
   the destination NVEs.  If, for example, a tenant subnet is spread
   across 50 remote NVEs, the packet would have to be replicated 50
   times at the source NVE.  This also places a burden on the
   forwarding performance of the NVE, especially if it is implemented
   in software.

   Note that this method is similar to what was used in VPLS [VPLS]
   prior to extensive support of MPLS multicast [MPLS-MC].
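
   A minimal sketch of replication at the source NVE is shown below.
   It assumes the source NVE already holds, per virtual network, the
   set of NVE addresses with members of the tenant subnet.  The
   function name and the vn_members table are hypothetical and are
   used only for illustration.

```python
from typing import Dict, List, Set, Tuple


def replicate_at_source(inner_frame: bytes, vn_id: int, source_nve: str,
                        vn_members: Dict[int, Set[str]]
                        ) -> List[Tuple[str, str, bytes]]:
    """Build one unicast-encapsulated copy per remote NVE.

    Each result is (outer source IP, outer destination IP, payload).
    The source NVE itself is excluded from the destination list.
    """
    remotes = sorted(vn_members.get(vn_id, set()) - {source_nve})
    return [(source_nve, dst, inner_frame) for dst in remotes]
```

   The length of the returned list is the number of copies that leave
   the source NVE, which is the replication cost discussed above.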

2.3 Replication at a multicast service node

   With this method, all multicast packets would be sent using a unicast
   tunnel encapsulation to a multicast service node.  The multicast
   service node, in turn, would create multiple copies of the packet and
   would deliver a copy, using a unicast tunnel encapsulation, to each
   of the NVEs that are part of the multicast group for which the packet

   is intended.

   This mechanism is similar to that used by the ATM Forum's LAN
   Emulation (LANE) specification [LANE].

   Unlike the method described in Section 2.2, there is no performance
   impact at the ingress NVE, nor are there any issues with multiple
   copies of the same packet from the source NVE to the multicast
   service node.  However, there remain issues with multiple copies of
   the same packet on links that are common to the paths from the
   multicast service node to each of the egress NVEs.  Additional issues
   that are introduced with this method include the availability of the
   multicast service node, methods to scale the services offered by the
   multicast service node, and the sub-optimality of the delivery paths.

   Finally, the IP address of the source NVE must be preserved in packet
   copies created at the multicast service node if data plane learning
   is in use.  This could create problems if IP source address reverse
   path forwarding (RPF) checks are in use.
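
   The following sketch illustrates replication at a multicast service
   node, including the choice of outer source address.  TunnelPacket
   and replicate_at_service_node are hypothetical names, and the
   preserve_source flag is an assumption made for illustration.  With
   the flag set, copies leave the service node carrying the ingress
   NVE's address as the outer source, which is the case in which RPF
   checks in the underlay could become a problem.

```python
from typing import Iterable, List, NamedTuple


class TunnelPacket(NamedTuple):
    outer_src: str   # outer (underlay) source IP address
    outer_dst: str   # outer (underlay) destination IP address
    payload: bytes   # encapsulated tenant frame


def replicate_at_service_node(pkt: TunnelPacket, service_node_ip: str,
                              member_nves: Iterable[str],
                              preserve_source: bool = True
                              ) -> List[TunnelPacket]:
    """Create one unicast copy per member NVE, skipping the ingress NVE."""
    copies = []
    for nve in sorted(set(member_nves)):
        if nve == pkt.outer_src:
            continue  # never send the packet back to the NVE it came from
        # Keep the ingress NVE address so egress NVEs can learn from it,
        # or rewrite it to the service node's own address.
        src = pkt.outer_src if preserve_source else service_node_ip
        copies.append(TunnelPacket(src, nve, pkt.payload))
    return copies
```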

2.4 IP multicast in the underlay

   In this method, the underlay supports IP multicast and the ingress
   NVE encapsulates the packet with the appropriate IP multicast address
   in the tunnel encapsulation header for delivery to the desired set of
   NVEs.  The protocol in the underlay could be any variant of Protocol
   Independent Multicast (PIM).  The NVE would be required to
   participate in the underlay as a host using IGMP/MLD in order for the
   underlay to learn about the groups that the NVE participates in.

   This method avoids the packet replication issues of the methods
   described in Sections 2.2 and 2.3.

   With PIM Sparse Mode (PIM-SM), the number of flows required would be
   (n*g), where n is the number of source NVEs that source packets for
   the group, and g is the number of groups.  Bidirectional PIM (BIDIR-
   PIM) would offer better scalability with the number of flows required
   being g.
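
   The flow counts above can be worked through with a small sketch.
   The function names are hypothetical, and the figures in the usage
   note are assumed values chosen only to show the difference in
   scale.

```python
def pim_sm_flows(source_nves_per_group: int, groups: int) -> int:
    """Flows needed with PIM-SM: n source NVEs for each of g groups."""
    return source_nves_per_group * groups


def bidir_pim_flows(groups: int) -> int:
    """Flows needed with BIDIR-PIM: one shared tree per group."""
    return groups
```

   For instance, with an assumed 50 source NVEs per group and 1000
   groups, pim_sm_flows(50, 1000) gives 50000 flows, while
   bidir_pim_flows(1000) gives 1000.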

   In the absence of any additional mechanism, e.g. using an NVA for
   address resolution, for optimal delivery, there would have to be a
   separate group for each tenant, plus a separate group for each
   multicast address (used for multicast applications) within a tenant.
   Additional considerations are that only the lower 23 bits of an IPv4
   multicast address (or the lower 32 bits of an IPv6 multicast address)
   are mapped to the outer MAC address, so if there is equipment that
   prunes multicasts at Layer 2, there will be some aliasing.  Finally, a
   mechanism to efficiently provision such addresses for each group

   would be required.
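
   The aliasing mentioned above can be seen in a short sketch of the
   standard IPv4 multicast address to Ethernet MAC address mapping
   (the 01:00:5e prefix followed by the lower 23 bits of the group
   address).  The function name is hypothetical and only the IPv4 case
   is shown.

```python
import ipaddress


def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # only the lower 23 bits survive
    mac = (0x01005E << 24) | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}"
                    for shift in range(40, -8, -8))
```

   Two underlay groups such as 239.1.1.1 and 239.129.1.1 differ only
   in a bit that is not carried into the MAC address, so equipment
   that prunes on the MAC address alone cannot tell them apart.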

   There are additional optimizations which are possible, but they come
   with their own restrictions.  For example, a set of tenants may be
   restricted to some subset of NVEs and they could all share the same
   outer IP multicast group address.  This, however, introduces the
   problem of sub-optimal delivery: even if a particular tenant within
   the set of tenants has no presence on one of the NVEs on which
   another tenant does, the former's multicast packets would still be
   delivered to that NVE.  It also introduces an additional network
   management burden to optimize which tenants should share a group
   (based on the NVEs they have in common).  This somewhat dilutes the
   value proposition of NVO3, which is to completely decouple the
   overlay and physical network design, allowing complete freedom of
   placement of VMs anywhere within the data center.
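
   The sub-optimal delivery that results from sharing one outer group
   among several tenants can be quantified with a sketch such as the
   following.  The function name and the tenant-to-NVE table are
   hypothetical.

```python
from typing import Dict, Set


def unwanted_deliveries(tenant_nves: Dict[str, Set[str]]
                        ) -> Dict[str, Set[str]]:
    """For tenants sharing one outer group, list needless receivers.

    The shared group reaches the union of all tenants' NVEs, so each
    tenant's packets also arrive at NVEs where it has no presence.
    """
    group_nves: Set[str] = set().union(*tenant_nves.values())
    return {tenant: group_nves - nves
            for tenant, nves in tenant_nves.items()}
```

   An empty set for every tenant means the tenants occupy exactly the
   same NVEs, which is the only case in which sharing the group costs
   nothing in delivery efficiency.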

2.5 Other schemes

   There are still other mechanisms that may be used that attempt to
   combine some of the advantages of the above methods by offering
   multiple replication points, each with a limited degree of
   replication [EDGE-REP].  Such schemes offer a trade-off between the
   amount of replication at an intermediate node (router) versus
   performing all of the replication at the source NVE or all of the
   replication at a multicast service node.

3. Simultaneous use of more than one mechanism

   While the mechanisms discussed in the previous section have been
   discussed individually, it is possible for implementations to rely on
   more than one of these.  For example, the method of Section 2.1 could
   be used to minimize ARP/ND traffic while, at the same time, multicast
   applications are supported by one, or a combination of, the other
   methods.  For small multicast groups, the methods of source NVE
   replication or the use of a multicast service node may be attractive,
   while for larger multicast groups, the use of multicast in the
   underlay may be preferable.
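
   One way an implementation might choose among the mechanisms is
   sketched below.  This is purely illustrative: the function name,
   the availability flags, and especially the small_group_limit
   threshold are assumptions, and nothing in this document prescribes
   a particular value or policy.

```python
from enum import Enum


class Mechanism(Enum):
    SOURCE_REPLICATION = "Section 2.2"
    SERVICE_NODE = "Section 2.3"
    UNDERLAY_MULTICAST = "Section 2.4"


def choose_mechanism(receiver_nves: int,
                     underlay_multicast_available: bool,
                     service_node_available: bool,
                     small_group_limit: int = 8) -> Mechanism:
    """Pick a delivery mechanism from the group size (hypothetical policy)."""
    if receiver_nves > small_group_limit:
        # Large groups: prefer multicast in the underlay when it exists,
        # otherwise offload replication to a service node if there is one.
        if underlay_multicast_available:
            return Mechanism.UNDERLAY_MULTICAST
        if service_node_available:
            return Mechanism.SERVICE_NODE
    # Small groups, or no alternative: replicate at the source NVE.
    return Mechanism.SOURCE_REPLICATION
```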

4. IP multicast applications in the overlay

   When IP multicast is implemented in the overlay (i.e., the tenant
   traffic is IP multicast), there are a few issues that need to be
   addressed.

   First, in all cases where L2 virtual network interfaces (VNIs) are
   present, the NVE would need to support IGMP/MLD snooping in order to
   prevent delivery of packets to tenant systems that are not interested
   in receiving them.

   Second is the issue of how the groups are set up and mapped to
   tunnels in the underlay.  This can be accomplished entirely by an NVA
   if the mechanisms described in Section 2.2 or Section 2.3 are used,
   with the NVE just participating in snooping of IGMP/MLD messages from
   the tenant systems.  If the method of Section 2.4 is used, then a
   mechanism must be provided for mapping the tenant IP multicast
   address to an IP multicast address for use in the underlay, and the
   NVE would be required to translate the information from the snooped
   IGMP/MLD messages from the tenant systems into corresponding requests
   for the underlay.
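
   The mapping and translation steps for the method of Section 2.4
   might look like the following sketch.  It is not a definitive
   design: UnderlayGroupMapper, SnoopingNve, and the idea of
   allocating underlay groups from a pre-provisioned pool are all
   assumptions made for illustration.  The returned address stands for
   the underlay group that the NVE would join or leave as a host.

```python
from typing import Dict, List, Optional, Set, Tuple

Key = Tuple[int, str]  # (virtual network id, tenant multicast group)


class UnderlayGroupMapper:
    """Hypothetical allocator of underlay groups from a provisioned pool."""

    def __init__(self, pool: List[str]) -> None:
        self._free = list(pool)
        self._map: Dict[Key, str] = {}

    def lookup_or_allocate(self, vn_id: int, tenant_group: str) -> str:
        key = (vn_id, tenant_group)
        if key not in self._map:
            if not self._free:
                raise RuntimeError("underlay group pool exhausted")
            self._map[key] = self._free.pop(0)
        return self._map[key]


class SnoopingNve:
    """Tracks snooped tenant memberships and derives underlay requests."""

    def __init__(self, mapper: UnderlayGroupMapper) -> None:
        self._mapper = mapper
        self._receivers: Dict[Key, Set[str]] = {}

    def on_tenant_join(self, vn_id: int, tenant_group: str,
                       port: str) -> Optional[str]:
        """Return the underlay group to join on the first local receiver."""
        ports = self._receivers.setdefault((vn_id, tenant_group), set())
        first = not ports
        ports.add(port)
        if first:
            return self._mapper.lookup_or_allocate(vn_id, tenant_group)
        return None

    def on_tenant_leave(self, vn_id: int, tenant_group: str,
                        port: str) -> Optional[str]:
        """Return the underlay group to leave when the last receiver goes."""
        key = (vn_id, tenant_group)
        if key not in self._receivers:
            return None
        self._receivers[key].discard(port)
        if not self._receivers[key]:
            del self._receivers[key]
            return self._mapper.lookup_or_allocate(vn_id, tenant_group)
        return None
```

   Keying on the virtual network identifier as well as the tenant
   group lets two tenants use the same tenant multicast address while
   still being carried on different underlay groups.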

   Third, when using the scheme described in Section 2.3, it may be
   useful to have the multicast service node support the IGMP querier
   function.

   Fourth, if the IP multicast traffic is contained within a single
   virtual network (VN), then the schemes described herein are
   sufficient.  If, on the other hand, the IP multicast traffic needs to
   traverse VNs, then the routing mechanisms at the NVE need to offer IP
   multicast forwarding.  Once again, depending on how the groups are
   set up -- whether by an NVA or some other entity -- the forwarding
   tables at the NVE that has L3 virtual network interfaces (VNIs) would
   need to be set up by that entity.

5. Summary

   This document has identified various mechanisms for supporting
   multicast in networks that use NVO3.  It highlights the basics of
   each mechanism and some of the issues with them.  As solutions are
   developed, the protocols will need to take these mechanisms into
   account, including how they may co-exist.  This document also
   highlights some of the requirements for supporting multicast
   applications in an NVO3 network.

6. Security Considerations

   This is an informational document, and as such, does not introduce
   any new security considerations beyond what may be present in
   proposed solutions.

7. IANA Considerations

   This draft does not have any IANA considerations.

8. References

8.1  Normative References

      [PS]      Narten, T. et al., "Problem statement: Overlays
                for network virtualization", work in progress,
                July 2013.

      [FW]      Lasserre, M. et al., "Framework for DC network
                virtualization", work in progress, January 2014.

8.2  Informative References

      [VXLAN]   Mahalingam, M. et al., "VXLAN: A framework for
                overlaying virtualized Layer 2 networks over Layer 3
                networks," work in progress.

      [NVGRE]   Sridharan, M. et al., "NVGRE: Network virtualization
                using Generic Routing Encapsulation," work in progress.

      [STT]     Davie, B. and Gross, J., "A stateless transport
                tunneling protocol for network virtualization,"
                work in progress.

      [DC-MC]   McBride, M. and Lui, H., "Multicast in the data
                center overview," work in progress.

      [VPLS]    Lasserre, M., and Kompella, V. (Eds), "Virtual Private
                LAN Service (VPLS) using Label Distribution Protocol
                (LDP) signaling," RFC 4762, January 2007.

      [MPLS-MC] Aggarwal, R. et al., "Multicast in VPLS," work in
                progress.

      [LANE]    "LAN emulation over ATM," The ATM Forum,
                af-lane-0021.000, January 1995.

      [EDGE-REP]
                Marques, P. et al., "Edge multicast replication for
                BGP IP VPNs," work in progress, June 2012.

Authors' Addresses

   Anoop Ghanwani
   Email: anoop@alumni.duke.edu

   Linda Dunbar
   Email: ldunbar@huawei.com

   Vinay Bannai
   Email: vbannai@paypal.com

   Ram Krishnan
   Email: ramk@brocade.com
