Network Working Group                                             X. Xu
Internet Draft                                      Huawei Technologies
Category: Standards Track
Expires: February 2011                                  August 24, 2010


        Virtual Subnet: A Scalable Data Center Network Architecture

                        draft-xu-virtual-subnet-02


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on February 24, 2011.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.






Xu                   Expires February 24, 2011               [Page 1]


Internet-Draft               Virtual Subnet                 August 2010

Abstract

   This document proposes a scalable data center network architecture
   which, as an alternative to the Spanning Tree Protocol Bridge
   network, uses a Layer 3 routing infrastructure to provide scalable
   virtual Layer 2 network connectivity services.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

Table of Contents


   1. Problem Statement............................................3
   2. Terminology..................................................3
   3. Design Goals.................................................3
   4. Architecture Description.....................................4
      4.1. Unicast.................................................4
         4.1.1. Communications within a Service Domain.............4
         4.1.2. Communications between Service Domains.............6
      4.2. Multicast/Broadcast.....................................7
      4.3. Host Discovery..........................................9
      4.4. ARP Proxy..............................................10
      4.5. DHCP Relay Agent.......................................11
   5. Conclusions.................................................11
   6. Limitations.................................................12
   7. Future work.................................................12
   8. Security Considerations.....................................12
   9. IANA Considerations.........................................12
   10. Acknowledgements...........................................12
   11. References.................................................12
      11.1. Normative References..................................12
      11.2. Informative References................................12
   Authors' Addresses.............................................13

1. Problem Statement

   With the growing popularity of cloud services, today's data centers
   keep increasing in scale. In addition, virtual machine migration
   technology, which allows a virtual machine to migrate to any
   physical server while keeping the same IP address, is becoming
   increasingly prevalent as a means of achieving service agility in
   data centers. As a result, large Layer 2 networks are needed for
   server-to-server connectivity. Meanwhile, due to the high volume of
   traffic exchanged between servers, these Layer 2 networks SHOULD
   provide enough capacity for server-to-server interconnections.

   Unfortunately, today's data center network architecture, which
   relies on Spanning Tree Protocol (STP) Bridge technology, cannot
   address the challenges facing large-scale data centers, namely the
   large number of servers and the high bandwidth demand for
   server-to-server interconnections. First, STP computes only a
   single forwarding tree for all connected servers and cannot support
   multi-path routing, e.g., Equal Cost Multi-Path (ECMP); hence it
   cannot make full use of the available network resources to provide
   enough bandwidth between servers. Second, since STP Bridge
   forwarding depends on flat MAC addresses, the scalability of the
   forwarding table becomes a serious issue, especially as an already
   large Layer 2 network grows even larger. Third, the impact of
   broadcast storms on network performance becomes much more serious
   and unpredictable in continually growing large-scale STP Bridge
   networks.

2. Terminology

   This memo makes use of the terms defined in [RFC4364], [MVPN],
   [RFC2236] and [RFC2131]. Terms specific to this document are
   defined below:

      - Service Domain: A group of servers which are dedicated to a
      given service and are usually located in a separate IP subnet.

3. Design Goals

   To overcome the limitations of the STP Bridge networks, in this
   document we propose a new network architecture for data centers
   called Virtual Subnet (VS), which aims to meet the following design
   objectives:

      - Bandwidth Utilization Maximization

   To provide enough bandwidth between servers, the server-to-server
   traffic SHOULD always be delivered through the shortest path while
   achieving load-balancing by using multi-path routing.

      - Layer 2 Connectivity

   To be backward compatible with the applications currently running
   in data centers (e.g., virtual machine migration), the servers of a
   given service domain SHOULD be connected as if they were on a Local
   Area Network (LAN) or an IP subnet.

      - Domain Isolation

   Due to considerations of performance isolation and security, servers
   belonging to different service domains SHOULD be isolated just as if
   they were located on dedicated Virtual LANs (VLAN) or IP subnets.

      - Forwarding Table Scalability

   To accommodate tens to hundreds of thousands of servers within a
   single data center network, the forwarding tables of all forwarding
   devices (e.g., routers or switches) SHOULD be scalable enough.

      - Broadcast Storm Suppression

   To reduce the impact of broadcast storms imposed on the network
   performance, broadcast domains SHOULD be limited to their smallest
   scope.

4. Architecture Description

   VS uses BGP/MPLS VPN technology [RFC4364] with some extensions,
   together with other proven technologies such as ARP proxy
   [RFC925][RFC1027], to build a scalable large IP subnet across the
   MPLS/IP backbone of the data center network. As a result, VS can be
   deployed today as a scalable data center network.

   The following sections describe VS in detail.

   4.1. Unicast

   4.1.1. Communications within a Service Domain

   As shown in Figure 1, BGP/MPLS VPN technology with some extensions,
   as an alternative to the STP Bridge, is deployed in a data center
   network. To achieve service domain isolation, each service domain
   is mapped to a distinct VPN, and the servers of a given service
   domain, as Customer Edge (CE) hosts, are attached to the Provider
   Edge (PE) routers of the corresponding VPN, either directly or
   through one or more Ethernet switches. In addition, to build a
   large IP subnet across the MPLS/IP backbone, different sites of a
   particular VPN are associated with an identical IP subnet. That is
   to say, each PE attached to a given VPN is configured with a
   distinct IP address of that same IP subnet on the corresponding
   Virtual Routing and Forwarding (VRF) attachment circuits. Each PE
   automatically creates connected host routes for each attached VRF
   according to the Address Resolution Protocol (ARP) table of the
   corresponding VPN. Instead of exchanging the route for the
   configured IP subnet, the PEs belonging to a given VPN exchange
   connected host routes among themselves via BGP. In addition, ARP
   proxy is enabled on the PEs for each attached VPN; thus, upon
   receiving from a local CE host an ARP request for a remote CE host,
   the PE, acting as an ARP proxy, returns one of its own MAC
   addresses in the corresponding ARP reply.

                          +--------------------+
    +-----------------+   |                    |   +------------------+
    |VPN_A:10/8       |   |                    |   |VPN_A:10/8        |
    |                 |   |                    |   |                  |
    |    +------+    ++---+-+                +-+---++    +------+     |
    |    |Host A+----+ PE-1 |                | PE-2 +----+Host B|     |
    |    +------+    ++-+-+-+                +-+-+-++    +------+     |
    |   10.1.1.1/8    | | |  IP/MPLS Backbone  | | |    10.1.1.2/8    |
    +-----------------+ | |                    | | +------------------+
                        | +--------------------+ |
                        |                        |
                        |                        |
                        V                        V
    +-------+------------+--------+     +-------+------------+--------+
    |VRF ID |Destination |Next Hop|     |VRF ID |Destination |Next Hop|
    +-------+------------+--------+     +-------+------------+--------+
    | VPN_A |10.1.1.1/32 |  Local |     | VPN_A |10.1.1.2/32 |  Local |
    +-------+------------+--------+     +-------+------------+--------+
    | VPN_A |10.1.1.2/32 |  PE-2  |     | VPN_A |10.1.1.1/32 |  PE-1  |
    +-------+------------+--------+     +-------+------------+--------+

               Figure 1: Intra-domain Communication Example
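
   The derivation of connected host routes from a PE's ARP table can
   be sketched as follows. This is only a minimal illustration under
   stated assumptions: the names ArpEntry, HostRoute and
   connected_host_routes, as well as the MAC address, are hypothetical
   and are not defined by this document.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network


@dataclass(frozen=True)
class ArpEntry:
    vrf: str            # VRF (VPN) of the attachment circuit
    ip: IPv4Address     # IP address of the local CE host
    mac: str            # MAC address of the local CE host (illustrative)


@dataclass(frozen=True)
class HostRoute:
    vrf: str
    prefix: IPv4Network  # always a /32 host route
    next_hop: str        # "Local" for connected host routes


def connected_host_routes(arp_table):
    """Create one /32 connected host route per ARP entry."""
    return [
        HostRoute(e.vrf, IPv4Network(f"{e.ip}/32"), "Local")
        for e in arp_table
    ]


# ARP table of PE-1 in Figure 1: host A is the only local CE host.
pe1_arp = [ArpEntry("VPN_A", IPv4Address("10.1.1.1"), "00:00:5e:00:53:01")]
pe1_routes = connected_host_routes(pe1_arp)
```

   The resulting routes correspond to the "Local" rows of the tables
   in Figure 1; these are the routes a PE would then advertise to the
   other PEs of the same VPN via BGP.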

   Now host A broadcasts an ARP request for host B before communicating
   with B. Upon receipt of this ARP request, PE-1 looks up the
   associated VRF to find the host route for B. If a route is found
   and it was learnt from a remote PE, PE-1, acting as an ARP proxy,
   returns one of its own MAC addresses in the response to that ARP
   request. Otherwise, PE-1 SHOULD NOT send an ARP reply. After
   obtaining the ARP reply from PE-1, A sends an IP packet to B with
   the destination MAC address set to PE-1's MAC address. Upon
   receiving this packet, PE-1, acting as an ingress PE, tunnels the
   packet towards PE-2, which in turn, as an egress PE, forwards the
   packet to B.
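
   The ARP proxy decision made by PE-1 in this walk-through can be
   sketched as follows. This is a simplified illustration, not a
   normative procedure; the dictionary-based VRF, the function name
   arp_proxy_reply and the MAC address are assumptions made for this
   example.

```python
from ipaddress import IPv4Address

LOCAL = "Local"

# VPN_A VRF of PE-1 in Figure 1: host address -> next hop.
PE1_VRF_VPN_A = {
    IPv4Address("10.1.1.1"): LOCAL,   # host A, attached locally
    IPv4Address("10.1.1.2"): "PE-2",  # host B, learnt via BGP from PE-2
}

PE1_MAC = "00:00:5e:00:53:a1"  # illustrative MAC address of PE-1


def arp_proxy_reply(vrf, target_ip, own_mac):
    """Return the MAC to place in the ARP reply, or None for no reply."""
    next_hop = vrf.get(target_ip)
    if next_hop is None or next_hop == LOCAL:
        # Unknown host, or a local host that answers for itself.
        return None
    # Host route learnt from a remote PE: answer with the PE's own MAC.
    return own_mac
```

   With the Figure 1 table, a request for 10.1.1.2 is answered with
   PE-1's own MAC address, while requests for 10.1.1.1 (a local host)
   or for an unknown address yield no reply.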

   4.1.2. Communications between Service Domains

   For servers in different VPNs (i.e., service domains) to communicate
   with each other, these VPNs SHOULD NOT be configured with any
   overlapping addresses, and each VPN SHOULD be configured with a
   default route towards the corresponding default gateway (i.e., a CE
   router).

     +-------+------------+--------+   +-------+------------+--------+
     |VRF_ID |Destination |Next Hop|   |VRF_ID |Destination |Next Hop|
     +-------+------------+--------+   +-------+------------+--------+
     | VPN_A |10.1.1.2/32 |  PE-1  |   | VPN_B |20.1.1.2/32 |  PE-2  |
     +-------+------------+--------+   +-------+------------+--------+
     | VPN_A |10.1.1.1/32 | Local  |   | VPN_B |20.1.1.1/32 | Local  |
     +-------+------------+--------+   +-------+------------+--------+
     | VPN_A | 0.0.0.0/0  |10.1.1.1|   | VPN_B | 0.0.0.0/0  |20.1.1.1|
     +-------+------------+--------+   +-------+------------+--------+
             ^                                              ^
             |            +--------------------+            |
             |            |     IP Network     |            |
             |            +----+-----------+---+            |
             |             +---+--+    +---+--+             |
             |             | GW-1 |    | GW-2 |             |
             |             +---+--+    +--+---+             |
             |VPN A:10.1.1.1/8 |          |VPN B:20.1.1.1/8 |
             |                 |          |                 |
             +-------------+---+--+    +--+---+-------------+
                         +-+ PE-3 +----+ PE-4 +-+
    +-----------------+  | +------+    +------+ |  +------------------+
    |VPN A:10/8       |  |                      |  |VPN_B:20/8        |
    |                 |  |                      |  |                  |
    |    +------+    ++--+--+                +--+--++    +------+     |
    |    |Host A+----+ PE-1 |                | PE-2 +----+Host B|     |
    |    +------+    ++-++--+                +--++-++    +------+     |
    |    10.1.1.2/8   | ||   IP/MPLS Backbone   || |    20.1.1.2/8    |
    +-----------------+ ||                      || +------------------+
                        |+----------------------+|
                        |                        |
                        V                        V
    +-------+------------+--------+   +-------+------------+--------+
    |VRF ID |Destination |Next Hop|   |VRF ID |Destination |Next Hop|
    +-------+------------+--------+   +-------+------------+--------+
    | VPN_A |10.1.1.2/32 |  Local |   | VPN_B |20.1.1.2/32 |  Local |
    +-------+------------+--------+   +-------+------------+--------+
    | VPN_A |10.1.1.1/32 |  PE-3  |   | VPN_B |20.1.1.1/32 |  PE-4  |
    +-------+------------+--------+   +-------+------------+--------+
    | VPN_A | 0.0.0.0/0  |  PE-3  |   | VPN_B | 0.0.0.0/0  |  PE-4  |
    +-------+------------+--------+   +-------+------------+--------+

               Figure 2: Inter-domain Communication Example

   As shown in Figure 2, PE-1 and PE-3 are attached to one VPN (i.e.,
   VPN A) while PE-2 and PE-4 are attached to another VPN (i.e., VPN B).
   Host A and its default gateway router (i.e., GW-1) are connected to
   PE-1 and PE-3, respectively. PE-3 is configured with a default route
   for VPN A and this default route is advertised to other PEs.
   Similarly, host B and its default gateway router (i.e., GW-2) are
   connected to PE-2 and PE-4, respectively. PE-4 is configured with a
   default route for VPN B and this default route is advertised to
   other PEs. Now A sends an ARP request for its default gateway (i.e.,
   10.1.1.1) before communicating with B. Upon receiving this ARP
   request, PE-1 looks up the associated VRF to find the host route
   for the default gateway. If a route is found and it was learnt from
   a remote PE, PE-1, acting as an ARP proxy, returns one of its own
   MAC addresses in the ARP reply. After obtaining the ARP reply, A
   sends an IP packet for B with the destination MAC address set to
   PE-1's MAC address. Upon receiving this packet, PE-1, as an ingress
   PE, tunnels it towards PE-3 according to the best-match route for
   that packet (i.e., the default route) in the associated VRF. PE-3,
   as an egress PE, in turn forwards this packet towards the default
   gateway router (i.e., GW-1). Once the packet, having traveled
   through an IP network, arrives at the default gateway router for B
   (i.e., GW-2), GW-2 forwards the packet to PE-4 with the destination
   MAC address set to PE-4's MAC address, provided that it has learnt
   an ARP entry for B from PE-4. Otherwise, GW-2 SHOULD first
   broadcast an ARP request for B. Upon receiving the packet, PE-4, as
   an ingress PE, tunnels it towards PE-2, which in turn forwards it
   towards B.
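
   The best-match (longest-prefix) lookup that PE-1 performs in this
   example can be sketched as follows. The list-based VRF and the
   function name best_match are assumptions made for illustration; the
   prefixes and next hops are those of the VPN_A table of PE-1 in
   Figure 2.

```python
from ipaddress import ip_address, ip_network

# VPN_A VRF of PE-1 in Figure 2: (prefix, next hop).
PE1_VPN_A = [
    (ip_network("10.1.1.2/32"), "Local"),
    (ip_network("10.1.1.1/32"), "PE-3"),
    (ip_network("0.0.0.0/0"), "PE-3"),
]


def best_match(vrf, destination):
    """Return the next hop of the longest matching prefix, or None."""
    dest = ip_address(destination)
    matches = [(net, nh) for net, nh in vrf if dest in net]
    if not matches:
        return None
    # The entry with the longest prefix length wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]
```

   A packet from host A to host B (20.1.1.2) matches only the default
   route and is therefore tunneled towards PE-3, whereas a packet for
   10.1.1.2 matches the local host route.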

   4.2. Multicast/Broadcast

   The MVPN technology [MVPN], especially the Protocol Independent
   Multicast (PIM) tree option with some extensions, is partially
   reused here to support link-local multicast between servers of a
   given service domain (i.e., VPN). That is to say, the customer
   multicast group addresses of a given VPN are mapped 1:1 or n:1 to
   the provider multicast groups dedicated to that VPN when
   transporting the customer multicast traffic across the backbone. For
   broadcast, a dedicated provider multicast group is reserved for
   carrying broadcast traffic across the IP/MPLS backbone. In other
   words, customer broadcast is processed on PEs as a special customer
   multicast group. Unless otherwise mentioned, the term "customer
   multicast" covers both customer multicast and customer broadcast.
   All PEs attached to a given VPN SHOULD maintain identical mappings
   from customer multicast group addresses to provider multicast group
   addresses. To isolate the customer multicast traffic of different
   VPNs traveling through the backbone, different VPNs SHOULD be
   assigned distinct, non-overlapping provider multicast group address
   ranges.
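
   An operator could verify the non-overlap requirement with a check
   along the following lines. This is only a sketch: the per-VPN
   ranges shown (239.1.1.0/24 and 239.1.2.0/24) and the function name
   overlapping_vpns are hypothetical and are not prescribed by this
   document.

```python
from ipaddress import ip_network
from itertools import combinations

# Hypothetical per-VPN provider multicast group address ranges.
PROVIDER_RANGES = {
    "VPN_A": ip_network("239.1.1.0/24"),
    "VPN_B": ip_network("239.1.2.0/24"),
}


def overlapping_vpns(ranges):
    """Return the pairs of VPNs whose provider group ranges overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(ranges.items(), 2)
        if net_a.overlaps(net_b)
    ]
```

   An empty result indicates that every VPN has been assigned a
   provider multicast group address range of its own.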

                          +--------------------+
    +-----------------+   |                    |   +------------------+
    |VPN_A:10/8       |   |                    |   |VPN_A:10/8        |
    |                 |   |                    |   |                  |
    |    +------+  E0++---+-+                +-+---++    +------+     |
    |    |Host A+----+ PE-1 |                | PE-2 +----+Host B|     |
    |    +------+    ++-+-+-+                +-+---++    +------+     |
    |    10.1.1.1/8   | | |  IP/MPLS Backbone  |   |    10.1.1.2/8    |
    +-----------------+ | |                    |   +------------------+
                        | +--------------------+
                        |
                        |
                        V
   +-------+---------------+----------+-------+--------+
   |VRF ID |  Customer G   |Provider G| To PE | From PE|
   +-------+---------------+----------+-------+--------+
   | VPN_A |  224.1.1.1/32 | 239.1.1.1| True  |  True  |
   +-------+---------------+----------+-------+--------+
   | VPN_A |  224.0.0.0/4  | 239.1.1.2| True  |  True  |
   +-------+---------------+----------+-------+--------+
   | VPN_A |255.255.255.255| 239.1.1.3| True  |  True  |
   +-------+---------------+----------+-------+--------+

       Figure 3: Link-local Multicast/Broadcast Communication Example

   A multicast forwarding entry can be configured manually by the
   network operator or generated dynamically according to the Internet
   Group Management Protocol (IGMP) Membership Report/Leave messages
   received from CE hosts or remote PEs. Ingress PEs forward customer
   multicast packets to the other PEs (i.e., egress PEs) of the same
   VPN via a provider multicast distribution tree, according to the
   best-match multicast forwarding entry of the associated VRF, if
   the "To PE" field of that entry is set to True. Otherwise (i.e.,
   that field is set to False), ingress PEs are not allowed to forward
   the customer multicast packets to remote egress PEs. Egress PEs
   forward customer multicast packets received from the provider
   multicast distribution tree to CE hosts via VRF attachment
   circuits, according to the best-match multicast forwarding entry of
   the associated VRF, if the "From PE" field of that entry is set to
   True. Otherwise (i.e., that field is set to False), egress PEs are
   not allowed to forward the customer multicast packets to CE hosts.

   For IGMP messages to be conveyed successfully across the IP/MPLS
   backbone, multicast forwarding entries for special multicast groups,
   including the all-routers group (224.0.0.2) and the all-systems
   group (224.0.0.1), SHOULD be configured in the corresponding VRF in
   advance. In addition, according to the IGMP specification
   [RFC2236], Group-Specific Query messages are sent to the group
   being queried and Membership Report messages are sent to the group
   being reported. Upon receiving these packets from CE hosts, the PE
   SHOULD convey them over the provider multicast distribution tree
   dedicated to the all-systems group (224.0.0.1) of the given VRF. To
   avoid IGMP Membership Report suppression, Membership Report
   messages received from PEs or CE hosts SHOULD NOT be forwarded to
   CE hosts. As an alternative to conveying IGMP Report/Leave messages
   through the provider multicast distribution tree, customer
   multicast routing information can also be exchanged among PEs by
   using the approaches defined in [MVPN-BGP].

   As shown in Figure 3, upon receiving a multicast/broadcast packet
   from a CE (e.g., host A), PE-1 encapsulates it as follows. If the
   packet is destined for 224.1.1.1, PE-1 encapsulates it in a
   provider multicast packet with a destination IP address of
   239.1.1.1. If it is destined for an IP multicast address other than
   224.1.1.1, PE-1 encapsulates it in a provider multicast packet with
   a destination IP address of 239.1.1.2. If it is a broadcast packet,
   PE-1 encapsulates it in a provider multicast packet with a
   destination IP address of 239.1.1.3, which is dedicated to
   conveying broadcast packets of that VPN.
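
   The selection of the provider multicast group on an ingress PE,
   including the check of the "To PE" field, can be sketched as
   follows. The entries are those of Figure 3; the class McastEntry
   and the function provider_group are names assumed for this
   illustration only.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_address, ip_network


@dataclass(frozen=True)
class McastEntry:
    customer: IPv4Network  # customer group prefix ("Customer G")
    provider: str          # provider group address ("Provider G")
    to_pe: bool = True     # "To PE" field
    from_pe: bool = True   # "From PE" field


# VPN_A multicast forwarding entries of PE-1 in Figure 3.
PE1_VPN_A = [
    McastEntry(ip_network("224.1.1.1/32"), "239.1.1.1"),
    McastEntry(ip_network("224.0.0.0/4"), "239.1.1.2"),
    McastEntry(ip_network("255.255.255.255/32"), "239.1.1.3"),
]


def provider_group(entries, customer_dst):
    """Ingress PE: pick the provider group for a customer packet.

    Returns None when no entry matches, or when the best-match entry
    has its "To PE" field set to False.
    """
    dst = ip_address(customer_dst)
    matches = [e for e in entries if dst in e.customer]
    if not matches:
        return None
    best = max(matches, key=lambda e: e.customer.prefixlen)
    return best.provider if best.to_pe else None
```

   An egress PE would perform the same best-match lookup but test the
   "From PE" field before delivering the packet to local CE hosts.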

   The customer multicast forwarding entries, whether configured
   manually or learnt automatically from the IGMP Membership Reports
   sent by local CEs, automatically trigger PEs to join the
   corresponding provider multicast groups in the MPLS/IP backbone.
   For example, when PE-2 receives an IGMP Membership Report for a
   given customer multicast group (e.g., 224.1.1.1) from a local CE
   (e.g., host B), it SHOULD automatically join the provider multicast
   group (i.e., 239.1.1.1) corresponding to that customer multicast
   group.
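
   The join behaviour described above can be sketched as follows. The
   class PeMulticastState and its method are hypothetical, and the
   fallback to 239.1.1.2 for groups without a specific mapping is an
   assumption derived from the 224.0.0.0/4 entry of Figure 3.

```python
# Customer group -> provider group mapping for VPN_A (from Figure 3).
VPN_A_MAPPING = {"224.1.1.1": "239.1.1.1"}
DEFAULT_PROVIDER_GROUP = "239.1.1.2"  # 224.0.0.0/4 entry in Figure 3


class PeMulticastState:
    """Tracks which provider groups a PE has joined in the backbone."""

    def __init__(self, mapping, default_group):
        self.mapping = mapping
        self.default_group = default_group
        self.joined = set()

    def on_membership_report(self, customer_group):
        """Handle an IGMP Membership Report from a local CE host.

        Returns the provider group and whether a new join towards the
        backbone would be triggered by this report.
        """
        provider = self.mapping.get(customer_group, self.default_group)
        newly_joined = provider not in self.joined
        self.joined.add(provider)
        return provider, newly_joined
```

   In this sketch only the first report for a given provider group
   triggers a join; later reports for the same group find the PE
   already joined.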

   4.3. Host Discovery

   To discover all local CE hosts, a PE SHOULD perform an ARP scan at
   least once after rebooting. For example, it broadcasts an ARP
   request for each IP address within the subnet of each attached VPN
   (including the network and broadcast addresses). Alternatively, it
   could broadcast an ARP request for the limited broadcast address
   (i.e., 255.255.255.255); upon receipt of such an ARP request, any
   host SHOULD respond with an ARP reply containing its IP and MAC
   addresses. After a round of ARP scanning, the PE will have
   discovered all local CE hosts and cached the corresponding ARP
   entries in its ARP table. After that, the PE could periodically
   send unicast ARP requests to each already-learnt local CE host so
   as to keep the corresponding ARP entry from expiring. This is also
   useful for checking whether a given CE host with known IP and MAC
   addresses is still present on the subnet. Unicast ARP requests have
   the advantage of being quieter than broadcast ones because they are
   not received by all hosts on the subnet.

   When receiving a gratuitous ARP from a local host, the PE SHOULD
   cache it in the ARP table immediately if no ARP entry for that host
   exists yet. Otherwise, the PE SHOULD just update the corresponding
   ARP entry in its ARP table. Most operating systems generate a
   gratuitous ARP request when the host boots up, when the host's
   network interface or link comes up, or when an address assigned to
   the interface changes. In the rare cases where a host does not
   generate a gratuitous ARP, the PE would have to perform ARP scans
   periodically, although this has side effects on network
   performance.
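
   The two discovery mechanisms above, the ARP scan and the handling
   of gratuitous ARPs, can be sketched as follows. The function names,
   the small example subnet 192.0.2.0/30 and the MAC addresses are
   assumptions made for illustration; an actual PE would send and
   receive real ARP packets rather than manipulate a dictionary.

```python
from ipaddress import ip_network


def arp_scan_targets(subnet):
    """List every address of the subnet, network and broadcast included."""
    return [str(addr) for addr in ip_network(subnet)]


def on_gratuitous_arp(arp_table, ip, mac):
    """Create or refresh the ARP entry; report whether it was new."""
    is_new = ip not in arp_table
    arp_table[ip] = mac
    return is_new
```

   Note that scanning a large subnet address by address produces one
   broadcast ARP request per address, which is one reason why the
   periodic scan is treated as a fallback only.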

   When a given PE receives from a remote PE a host route for one of
   its local CE hosts, it SHOULD immediately send an ARP request for
   that local CE host to check whether the CE host is still connected
   locally. If an ARP reply is received within a short time (as in the
   host multi-homing scenario), the PE just needs to update the ARP
   entry for that local host as normal. Otherwise (as in the virtual
   machine migration scenario), the PE SHOULD delete the ARP entry
   corresponding to that host from its ARP table. Meanwhile, the PE
   SHOULD send a gratuitous ARP on behalf of that host, with the
   sender hardware address set to one of its own MAC addresses, in
   order to update the ARP entries for that host cached on other local
   hosts. As a result, subsequent packets destined for that host will
   be sent towards the PE by the other local CE hosts.
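
   The handling of a host that may have moved can be sketched as
   follows. This is a simplified model under stated assumptions: the
   function on_remote_host_route and its "probe" callback (which
   stands in for sending an ARP request and waiting a short time for
   a reply) are hypothetical, and the returned action list exists
   only to make the behaviour visible.

```python
def on_remote_host_route(arp_table, host_ip, own_mac, probe):
    """Handle a BGP host route received for a host cached as local.

    `probe` is a caller-supplied function that sends an ARP request
    and returns the replying MAC address, or None on timeout.
    Returns the list of actions taken, for illustration.
    """
    if host_ip not in arp_table:
        return []  # not cached as a local host; nothing to do
    reply_mac = probe(host_ip)
    if reply_mac is not None:
        # Host is still attached locally (e.g., multi-homing).
        arp_table[host_ip] = reply_mac
        return ["refresh"]
    # Host has left (e.g., virtual machine migration).
    del arp_table[host_ip]
    # Gratuitous ARP on behalf of the host, carrying the PE's own MAC.
    return ["delete", ("gratuitous_arp", host_ip, own_mac)]
```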

   4.4. ARP Proxy

   The PE, acting as an ARP proxy, SHOULD only respond to ARP requests
   for remote hosts which have been learnt via BGP from other PEs.
   That is to say, the ARP proxy SHOULD NOT respond to ARP requests
   for local hosts. Otherwise, if the ARP reply from the ARP proxy
   overrides that from the requested host, the packets destined for
   that local host would be unnecessarily relayed by the PE.

   When the Virtual Router Redundancy Protocol (VRRP) [RFC2338] is
   enabled together with ARP proxy, only the VRRP master acts as an
   ARP proxy, and it SHOULD return the VRRP virtual MAC address in the
   ARP reply.
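
   As an illustration, the MAC address returned by the ARP proxy when
   VRRP is in use can be derived as sketched below. The virtual MAC
   format 00-00-5E-00-01-{VRID} is the one given in [RFC2338]; the
   function names are assumptions made for this example.

```python
def vrrp_virtual_mac(vrid):
    """Return the VRRP virtual router MAC address for a VRID (1-255)."""
    if vrid not in range(1, 256):
        raise ValueError("VRID must be in the range 1-255")
    return f"00-00-5E-00-01-{vrid:02X}"


def proxy_reply_mac(is_vrrp_master, vrid):
    """Only the VRRP master answers, using the virtual MAC address."""
    return vrrp_virtual_mac(vrid) if is_vrrp_master else None
```

   Because every router of the VRRP group derives the same virtual
   MAC address, hosts need not refresh their ARP caches when the
   master role moves to another PE.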

   4.5. DHCP Relay Agent

   To prevent Dynamic Host Configuration Protocol (DHCP) [RFC2131]
   broadcast messages from being flooded through the whole data center
   network, the DHCP Relay Agent function can be enabled on PEs. In
   this way, the DHCP broadcast messages from DHCP clients (i.e., local
   CE hosts) are transformed into DHCP unicast messages by the DHCP
   Relay Agents (i.e., PEs) and then forwarded to the DHCP servers in
   unicast.
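
   The broadcast-to-unicast transformation performed by a relay agent
   can be sketched as follows. This is a highly simplified model: the
   DhcpMessage class carries only the fields needed here, the
   addresses 10.1.1.254 and 192.0.2.10 used in the usage note are
   hypothetical, and filling in 'giaddr' and incrementing 'hops'
   reflect general relay agent behaviour rather than anything
   specified by this document.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DhcpMessage:
    giaddr: str = "0.0.0.0"        # relay agent IP address field
    hops: int = 0                  # number of relay hops so far
    dst: str = "255.255.255.255"   # IP destination of the message


def relay_to_server(msg, relay_ip, server_ip):
    """Turn a client broadcast into a unicast towards the DHCP server."""
    # Record the relay's address only if no earlier relay has done so.
    giaddr = relay_ip if msg.giaddr == "0.0.0.0" else msg.giaddr
    return replace(msg, giaddr=giaddr, hops=msg.hops + 1, dst=server_ip)
```

   For example, relay_to_server(DhcpMessage(), "10.1.1.254",
   "192.0.2.10") yields a message addressed to the server in unicast,
   so the client's broadcast never leaves the local attachment
   circuit.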

5. Conclusions

   By using Layer 3 routing in the backbone of the data center network
   in place of STP Bridge forwarding, the traffic between any two
   servers is forwarded along the shortest path between them. In
   addition, ECMP can easily be achieved in Layer 3 routing networks.
   Thus, the total bandwidth of the data center network is utilized to
   the maximum extent.

   By reusing the BGP/MPLS VPN technology to exchange host routes of a
   given VPN among PEs, the servers of that VPN are allowed to
   communicate with each other just as if they were located on a
   single subnet.

   Due to the tunnels used in BGP/MPLS VPNs, the forwarding tables of
   P routers only need to hold the reachability information of tunnel
   endpoints (i.e., PEs). Meanwhile, the forwarding tables of PE
   routers can also scale well by distributing VPNs among different
   PEs. That is to say, thanks to the Outbound Route Filtering (ORF)
   capability of BGP, a given PE router only needs to hold the routing
   tables of those VPNs to which it is attached. Thus, the forwarding
   table scalability issues of data center networks are largely
   alleviated.

   By enabling the ARP proxy function on PEs, the ARP broadcast
   messages from local CE hosts are blocked on the attached PEs. Thus,
   ARP broadcast messages are not flooded through the whole data
   center network. In addition, by enabling the DHCP Relay Agent
   function on PEs, the DHCP broadcast messages from DHCP clients
   (i.e., local CE hosts) are transformed into unicast messages by the
   DHCP Relay Agents and then forwarded to the DHCP servers. Thus,
   broadcast storms in data center networks are largely suppressed.

6. Limitations

   Since the data center network architecture described in this
   document partially reuses the BGP/MPLS VPN technology to construct a
   large-scale IP subnet, rather than a real LAN, non-IP traffic cannot
   be supported in this architecture. However, we believe that IP is
   the dominant communication protocol in today's data center networks
   and that non-IP legacy applications will disappear from data center
   networks over time.

7. Future work

   IPv6-based data center networks will be considered as part of
   future work.

8. Security Considerations

   TBD.

9. IANA Considerations

   There is no requirement for IANA.

10. Acknowledgements

   Thanks to Dino Farinacci for his valuable comments.

11. References

11.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

11.2. Informative References

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
             Networks (VPNs)", RFC 4364, February 2006.

   [MVPN] Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP VPNs",
             draft-ietf-l3vpn-2547bis-mcast-10.txt (work in progress),
             January 2010.

   [MVPN-BGP] Aggarwal, R., Rosen, E., Morin, T., Rekhter, Y., and C.
             Kodeboniya, "BGP Encodings for Multicast in MPLS/BGP IP
             VPNs", draft-ietf-l3vpn-2547bis-mcast-bgp-08.txt,
             September 2009.

   [RFC925] Postel, J., "Multi-LAN Address Resolution", RFC 925, USC
             Information Sciences Institute, October 1984.

   [RFC1027] Carl-Mitchell, S. and J. Quarterman, "Using ARP to
             Implement Transparent Subnet Gateways", RFC 1027, October
             1987.

   [RFC2338] Knight, S., et al., "Virtual Router Redundancy Protocol",
             RFC 2338, April 1998.

   [RFC2131] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131,
             March 1997.

   [RFC2236] Fenner, W., "Internet Group Management Protocol, Version
             2", RFC 2236, November 1997.

Authors' Addresses

   Xiaohu Xu
   Huawei Technologies,
   No.3 Xinxi Rd., Shang-Di Information Industry Base,
   Hai-Dian District, Beijing 100085, P.R. China
   Phone: +86 10 82836073
   Email: xuxh@huawei.com