Network working group                                             X. Xu
Internet Draft                                                   Huawei
Category: Standard Track                                        H. Shah
                                                             Ciena Corp
                                                                L. Yong
                                                                 Huawei
                                                                 Y. Fan
                                                          China Telecom

Expires: January 2013                                     July 13, 2012


               Virtual Private LAN Service (VPLS) Using IS-IS

                        draft-xu-l2vpn-vpls-isis-04

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 13, 2013.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of




Xu, et al.                 Expires January 13, 2013               [Page 1]


Internet-Draft                VPLS Using IS-IS                July 2012

   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Abstract

   This document describes a light-weight Virtual Private LAN Service
   (VPLS), referred to as IS-IS VPLS, which uses IS-IS for auto-
   discovery and signaling. IS-IS VPLS is intended to be used as a
   scalable cloud data center network solution.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

Table of Contents


   1. Introduction ................................................ 3
   2. Terminology ................................................. 5
   3. Control Plane ............................................... 6
      3.1. VPLS Info TLV .......................................... 6
      3.2. Auto-discovery ......................................... 7
      3.3. Signaling .............................................. 8
   4. Data Plane .................................................. 8
      4.1. Data Encapsulation and Forwarding ...................... 8
         4.1.1. Unicast ........................................... 8
         4.1.2. Multicast/Broadcast  .............................. 8
      4.2. MAC Address Learning ................................... 9
         4.2.1. Data-plane based MAC Learning ..................... 9
         4.2.2. Control-plane based MAC Learning ................. 10
   5. ARP Broadcast Reduction .................................... 10
   6. Security Considerations .................................... 10
   7. IANA Considerations ........................................ 10
   8. Acknowledgements ........................................... 11
   9. References ................................................. 11
      9.1. Normative References .................................. 11
      9.2. Informative References ................................ 11
   10. Authors' Addresses ........................................ 12








Xu, et al.                 Expires January 13, 2013               [Page 2]


Internet-Draft                VPLS Using IS-IS                July 2012


1. Introduction

   For leveraging the economics of scale, today's cloud data centers
   tend to contain tens to hundreds of thousands of servers. Moreover
   distributed computing and server virtualization are increasingly
   adopted in today's cloud data centers. These factors lead to
   significant scaling, performance, and operational challenges for
   cloud data center networks. Therefore, to design a scalable and
   sustainable cloud data center network solution, the following
   requirements must be taken into account:

   1) LAN Extension

      To achieve service agility and business continuance, Virtual
      Machine (VM) migration and High-Available (HA) cluster are
      commonly used in today's cloud data centers. These two
      applications introduce special requirements on data center
      networks. For instance, to allow a VM to be freely migrated from
      one server to another within data centers while retaining its IP
      address and MAC address, the LAN where the VM is located needs to
      be extend across multiple racks or pods within data centers. In
      addition, as some HA cluster applications rely on link-local
      multicast for cluster member discovery and heartbeat, cluster
      member servers are usually required to reside within the same
      Layer2 domain. As a result, LAN extension becomes a fundamental
      requirement for cloud data center networks.

   2) VPN Instance Space Scalability

      In modern cloud data centers, tens of thousands of tenants (e.g.,
      enterprises or governments who consume computing resources such
      as Infrastructure-as-a-Service (IaaS) offered by cloud service
      providers), could be hosted over a shared network infrastructure.
      For security and performance isolation considerations, these
      tenants SHOULD be isolated from one another. Hence, cloud data
      center networks SHOULD provide a large enough VPN instance space
      for tenant isolation.

   3) Forwarding Table Scalability

      In a highly virtualized cloud data center environment, it's not
      uncommon that millions of VMs are contained over a common network
      infrastructure. Therefore, the forwarding table of each network
      device within cloud data center networks SHOULD be scalable
      enough so as to keep up with that scale of VMs.




Xu, et al.                 Expires January 13, 2013               [Page 3]


Internet-Draft                VPLS Using IS-IS                July 2012

   4) Bandwidth Utilization Maximization

      In modern cloud data centers, distributed computing is driving the
      server-to-sever traffic to become the dominating traffic compared
      to the client-to-server traffic. To meet the growing capacity
      demands for server-to-server connectivity, shortest path
      forwarding and Equal Cost Multi-Path (ECMP) have already been the
      basic capabilities of cloud data center networks.

   5) ARP/Unknown Unicast Flood Suppression

      It's well-known that the flooding of ARP broadcast and unknown
      unicast traffic within large Layer2 networks will lead to
      performance impact on both networks and hosts. As the Layer2
      domain is extended across multiple racks or pods within data
      centers, the above problem will become even worse. As such, how
      to suppress the flooding of ARP broadcast and unknown unicast
      traffic within cloud data centers becomes increasingly desirable.

   6) Flexibility for Tradeoffs between Bandwidth and State

      It's possible that there would be a great difference between
      tenants within a cloud data center, in terms of VM numbers or
      multicast/broadcast traffic volume. For example, some tenants may
      have seldom VMs while others may have a lot of VMs, or some
      tenants may have a high volume of multicast/broadcast traffic
      while others may have little or even no multicast/broadcast
      traffic. As such, there is no "one size fits all" VPN
      multicast/broadcast delivery procedure for these tenants. Hence,
      cloud data center networks SHOULD support both the ingress
      replication procedure and the multicast tree procedure for
      delivering VPN multicast/broadcast traffic, so as to allow for an
      effective tradeoff between bandwidth usage and state maintenance
      on a per tenant basis according to the particular conditions of
      each tenant.

   7) Simplified Provisioning and Operation

      It's not surprising that a single cloud data center has thousands
      of physical switches (e.g., ToR switches). However, a network of
      such scale usually implies a big challenge for data center
      operators. Therefore, how to simplify and even automate network
      provisioning and operation becomes significantly important for
      cloud data center networks.

   8) Reuse Existing Operation Experiences




Xu, et al.                 Expires January 13, 2013               [Page 4]


Internet-Draft                VPLS Using IS-IS                July 2012

      IP, as a proven routing technology, has already been used in most
      today's data centers. Furthermore, those service providers who
      are planning to build cloud data centers have years of experience
      in operating MPLS-based L2VPN and/or L3VPN services which can be
      transported over MPLS or IP-enabled Packet Switching Networks
      (PSNs). To allow data center operators to reuse their existing
      network operation experiences, cloud data center network
      solutions SHOULD reuse existing technologies and protocols where
      appropriate, rather than reinventing the wheel. That's why there
      are increasing interests from the industrial community on how to
      adopt or adapt the existing L2VPN and/or L3VPN technologies for
      cloud data center networks.

   Although the existing VPLS solutions [RFC4761, RFC4762] can meet
   most of the above requirements, there are still spaces for
   improvement. For instance, the existing VPLS solutions require
   establishing full-mesh of pseudo-wires (PWs) between PE routers,
   which implies a significant scaling challenge in the cloud data
   center environment, especially when imagining configuring thousands
   of ToR switches as PE routers and provisioning tens of thousands of
   VPLS instances on them; secondly, the ingress replication procedure
   used for delivering multicast and broadcast traffic in existing VPLS
   solutions is not optimal from the bandwidth utilization perspective;
   thirdly, existing VPLS solutions require running one or more
   separate protocols besides IGP within data centers for VPLS protocol
   (e.g., LDP and/or BGP), which results in a certain complexity in
   network management and operation, especially when considering
   configuring BGP and VPLS parameters on thousands of ToR switches and
   configuring thousands of BGP peers on aggregation or core switches.

   Hence, this document describes a light-weight VPLS solution,
   referred to as IS-IS VPLS, which uses IS-IS [IS-IS][RFC1195] for
   VPLS protocol. IS-IS VPLS retains almost all advantages of existing
   VPLS solutions (e.g., split-horizon forwarding) while overcoming
   their shortages as mentioned above. For example, there is no need
   for full-mesh PWs between PE routers in IS-IS VPLS. Furthermore, it
   allows data center operators to flexibly make tradeoffs between
   bandwidth and state on a per tenant basis. Finally, an already
   deployed IGP within cloud data centers (i.e., IS-IS), rather than a
   dedicated protocol(s), is used for providing VPLS services.

2. Terminology

   This memo makes use of the terms defined in [RFC4664], [VPLS-MCAST],
   [RFC4761] and [RFC4762].





Xu, et al.                 Expires January 13, 2013               [Page 5]


Internet-Draft                VPLS Using IS-IS                July 2012

3. Control Plane

   There are two primary functions of the VPLS control plane: auto-
   discovery and signaling. In IS-IS VPLS, these two functions are
   accomplished by using a single extended IS-IS TLV, referred to as
   VPLS Info TLV (see section 3.1). By propagating such VPLS Info TLVs
   that contain VPLS information within data center networks, PE
   routers automatically discover which other PE routers are part of a
   given VPLS instance and their assigned VPLS labels for that VPLS
   instance. Furthermore, according to the ISIS specification defined
   in [IS-IS] and [RFC1195], IS-IS routers would ignore unknown TLVs in
   the LSP and pass them on to other neighbors unchanged. Therefore, P
   routers don't need processing the VPLS info TLV, but instead
   synchronizing the Link State PDUs (LSP) containing such TLV with
   their adjacent IS-IS neighbors as normal. In addition, to overcome
   the 255-byte TLV limit, IS-IS allows the interpretation of multiple
   TLVs of a given type to be considered additive rather than mutually
   exclusive (see section 6.4 in [RFC5311] for more details), therefore
   there is no scaling issue in using IS-IS for propagating a huge
   amount of VPLS information.

   3.1. VPLS Info TLV

          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         +-+-+-+-+-+-+-+-+
         |Type=VPLS Info |
         +-+-+-+-+-+-+-+-+
         |    Length     |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                                                               |
         |                 PE's IPv4 or IPv6 Address                     |
         |                         (128 bits)                            |
         |                                                               |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |    Resv (12 bits)     |            VPLS ID (20 bits)          |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |    Resv (12 bits)     |          VPLS Label (20 bits)         |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         :                                                               :
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |    Resv (12 bits)     |            VPLS ID (20 bits)          |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |    Resv (12 bits)     |          VPLS Label (20 bits)         |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+



Xu, et al.                 Expires January 13, 2013               [Page 6]


Internet-Draft                VPLS Using IS-IS                July 2012

       Type

           Type code for VPLS Info TLV: TBD.

       Length

           Total number of bytes contained in the value field.

       PE's IPv4 or IPv6 Address

           This 128-bit field is filled with one of the originating PE
           router's IPv4 or IPv6 addresses which are reachable across
           the IP backbone. The address filled in this field SHOULD be
           used as a tunnel destination address by remote PE routers
           when these PE routers acting as ingress PE routers want to
           tunnel a customer Ethernet frame to such PE router. If the
           IP address is IPv4, the last four octets of this field are
           filled with the IPv4 address while the remaining part is set
           to zero. In other words, it is filled with an IPv4-mapped
           IPv6 address.

       VPLS ID

           This field is filled with a 20-bit globally unique VPLS ID
           for a particular attached VPLS instance. In case that a
           larger VPLS ID space is required, the leftmost 12-bit
           reserved field could be used together with the VPLS ID field
           as an extended VPLS ID field. That is to say, the whole 32
           bits are filled with a 32-bit long extended VPLS ID value.

       VPLS Label

           This field is filled with a 20-bit MPLS label corresponding
           to the VPLS instance which is identified by the VPLS ID.

   3.2. Auto-discovery

   In IS-IS VPLS, each PE router could automatically discover which
   other PE routers are part of a given VPLS instance that is
   identified by the globally unique VPLS ID. This allows each PE
   router to be configured with only the identities of the attached
   VPLS instances, but not identities of all the other PE routers
   belonging to these VPLS instances.







Xu, et al.                 Expires January 13, 2013               [Page 7]


Internet-Draft                VPLS Using IS-IS                July 2012

   3.3. Signaling

   In IS-IS VPLS, a PE router would assign the same VPLS label for a
   given VPLS instance to any other PE router. As such, this VPLS label
   is only used for identifying a particular VPLS instance, rather than
   identifying both a particular VPLS instance and the corresponding
   ingress PE router as a PW label does.

4. Data Plane

   4.1. Data Encapsulation and Forwarding

   Since the VPLS label in IS-IS VPLS is only used for identifying a
   particular VPLS instance, in the data-plane based MAC learning case
   (see section 4.2.1), IP-based tunnel (e.g., GRE (Generic Routing
   Encapsulation)/IP [RFC4023] or UDP [MPLS-in-UDP]) is RECOMMENDED to
   be used as the PE-to-PE tunnel technology. As such, during the MAC
   learning process, egress PE router could easily determine the
   ingress PE router of the received VPLS packet from the tunnel source
   address of that packet. Note that, in the control-plane based MAC
   learning case (see section 4.2.2), there is no special requirement
   for PE-to-PE tunnel technology in comparison with existing VPLS
   solutions. The following sub-sections are based on an assumption
   that IP tunnels are used between PE routers.

    4.1.1. Unicast

   For known unicast, MAC-in-MPLS-in-IP encapsulation [RFC4448] is used.
   For unknown unicast, the encapsulation and forwarding procedures are
   the same as that for multicast/broadcast described in the following
   section.

    4.1.2. Multicast/Broadcast

   There are two major modes for delivering multicast and broadcast in
   IS-IS VPLS: ingress replication mode and P-Multicast tree mode. P-
   Multicast tree mode further includes two sub-options: non-
   aggregative P-Multicast tree mode where one P-Multicast distribution
   tree in the IP backbone is exclusively used by a single VPLS
   instance, and aggregative P-Multicast tree mode in which one P-
   Multicast tree is shared by more than one VPLS instance. The
   corresponding encapsulation for each mode is described in the
   following sub-sections.







Xu, et al.                 Expires January 13, 2013               [Page 8]


Internet-Draft                VPLS Using IS-IS                July 2012

   4.1.2.1. Ingress Replication Mode

   In the ingress replication mode, an ingress PE router forward the
   received customer multicast/broadcast frames towards remote PE
   routers in separate tunnels. Hence, the encapsulation in this mode
   has no difference from that for unicast.

   4.1.2.2. Non-aggregative P-Multicast Tree Mode

   In the non-aggregative P-Multicast tree mode, MAC-in-IP
   encapsulation is used directly since the destination IP address
   (i.e., multicast address) contained in the IP-based tunnel header is
   enough for egress PE routers to determine which VPLS instance the
   received VPLS packet belongs to.

   4.1.2.3. Aggregative P-Multicast Tree Mode

   For the aggregative P-Multicast tree mode, MAC-in-MPLS-in-IP
   encapsulation SHOULD be used. Furthermore, the MPLS label here
   SHOULD be treated as an upstream-assigned label. For example, assume
   a PE router has assigned a local label L for a given VPLS instance
   and advertised that VPLS information using the VPLS Info TLV before,
   when this PE router wants to send a multicast VPLS packet of that
   VPLS instance through the corresponding aggregative P-Multicast tree,
   label L as an upstream-assigned label will be contained in that VPLS
   packet.

   4.2. MAC Address Learning

   MAC addresses of local CE hosts would still be learnt by PE routers
   as normal bridges.

   As for learning MAC addresses of remote CE hosts, IS-IS VPLS
   provides two options: data-plane based MAC learning and control-
   plane based MAC learning. If unknown unicast flood suppression is
   required even at the cost of consuming more forwarding table
   resources, the control-plane based MAC learning option could be
   considered. Otherwise, the data-plane based MAC learning option
   SHOULD be preferred.

    4.2.1. Data-plane based MAC Learning

   Upon receiving an VPLS packet from a remote PE router, the MPLS
   label contained in the packet (or the tunnel destination IP address
   in the non-aggregative P-Multicast tree case) is used to determine
   the particular VPLS instance that the packet belongs to, while the




Xu, et al.                 Expires January 13, 2013               [Page 9]


Internet-Draft                VPLS Using IS-IS                July 2012

   tunnel source IP address is used to tell from which ingress PE
   router the packet was sent.

    4.2.2. Control-plane based MAC Learning

   In IS-IS VPLS, MAC addresses of remote CE hosts can also be learnt
   on the control plane by using the MAC-Reachability TLV defined in
   [RFC6165].

   Upon learning the MAC addresses of their local CE hosts, PE routers
   would immediately advertise these MAC addresses to other PE routers
   of the same VPN instance by using the MAC-Reachability TLV defined
   in [RFC6165]. One or more MAC-Reachability TLVs are carried in a LSP
   which in turn is encapsulated with an Ethernet header. The source
   MAC address is the originating PE router's MAC address whereas the
   destination MAC address is a to-be-defined multicast MAC address
   specifically identifying IS-IS VPLS PE routers. Such LSPs are
   forwarded towards remote PE routers as customer Ethernet frames by
   ingress PE routers. Egress PE routers receiving the above packets
   SHOULD intercept them and accordingly process them. IP address of
   the PE router originating these MAC routes could be derived either
   from the "IP Interface Address" field contained in the corresponding
   LSPs (Note that the IP address here SHOULD be identical with that
   contained in the VPLS Info TLV) or from the tunnel source IP address
   of the VPLS packet containing such MAC routes.

   Since these LSPs are fully transparent to P routers, there is no
   impact on the control plane of P routers. More details about the
   control-plane based MAC learning procedure are for further study.

5. ARP Broadcast Reduction

   To suppress ARP broadcast flood within a given VPLS instance, ARP
   cache mechanism can be enabled on PE routers. For more details about
   ARP cache mechanism, please refer to [ARP-Reduction]

6. Security Considerations

   This document doesn't introduce additional security risk to IS-IS
   and VPLS, nor does it provide any additional security feature for
   IS-IS and VPLS.

7. IANA Considerations

   The IS-IS TLV type code for VPLS Info TLV is required to be defined
   by IANA.




Xu, et al.                 Expires January 13, 2013              [Page 10]


Internet-Draft                VPLS Using IS-IS                July 2012

8. Acknowledgements

   Thanks to Tony Li, Peter Ashwood-Smith, Phil Bedard, Kris Price,
   Shahram Davari, Adrian Farrel, Giles Heron and Christian Jacquenet
   for their valuable comments on this proposal.

9. References

   9.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   9.2. Informative References

   [IS-IS] ISO/IEC 10589, "Intermediate System to Intermediate System
             Intra-Domain Routing Exchange Protocol for use in
             Conjunction with the Protocol for Providing the
             Connectionless-mode Network Service (ISO 8473)", 2005.

   [RFC1195] Callon, R., "Use of OSI IS-IS for Routing in TCP/IP and
             Dual Environments", RFC 1195 1990.

   [RFC5311] McPherson, D., Ginsberg, L., Previdi, S., and M. Shand,
             "Simplified Extension of Link State PDU (LSP) Space for
             IS-IS", RFC 5311, 2009.

   [RFC4448]  Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
             "Encapsulation Methods for Transport of Ethernet over MPLS
             Networks", RFC 4448, April 2006.

   [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
             MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
             4023, March 2005.

   [MPLS-in-UDP] X. Xu, et al., "Encapsulating MPLS in UDP", draft-xu-
             mpls-in-udp-01.txt (work in progress), May 2012.

   [RFC6165] A. Banerjee., D. Ward, "Extensions to IS-IS for Layer-2
             Systems", RFC 6165, February 2011.

   [VPLS-MCAST] R. Aggarwal., Y. Kamite., L. Fang, "Multicast in VPLS",
             draft-ietf-l2vpn-vpls-mcast-08.txt (work in progress),
             October 2010.






Xu, et al.                 Expires January 13, 2013              [Page 11]


Internet-Draft                VPLS Using IS-IS                July 2012

   [ARP-Reduction] H. Shah., A. Ghanwani., and N. Bitar, "ARP Broadcast
             Reduction for Large Data Centers",
             draft-shah-armd-arp-reduction-02.txt (work in progress),
             October 2011.

   [RFC5331] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS Upstream Label
             Assignment and Context-Specific Label Space", RFC 5331,
             August 2008.

   [RFC4664] Andersson, L. and Rosen, E. (Editors),"Framework for Layer
             2 Virtual Private Networks (L2VPNs)", RFC 4664, Sept 2006.

   [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
             (VPLS) Using BGP for Auto-Discovery and Signaling", RFC
             4761, January 2007.

   [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service
             (VPLS) Using Label Distribution Protocol (LDP) Signaling",
             RFC 4762, January 2007.

10. Authors' Addresses

   Xiaohu Xu
   Huawei Technologies,
   Beijing, China
   Email: xuxiaohu@huawei.com

   Himanshu Shah
   Ciena Corp
   Email: hshah@ciena.com

   Lucy Yong
   Huawei USA
   1700 Alma Dr. Suite 500
   Plano, TX  75075, US
   Email: lucyyong@huawei.com

   Yongbing Fan
   China Telecom
   Guangzhou, China.
   Phone: +86 20 38639121
   Email: fanyb@gsta.com








Xu, et al.                 Expires January 13, 2013              [Page 12]