Network Working Group                                     Youval Nachum
Internet Draft                                              Tal Mizrahi
Intended status: Informational                          Ilan Yerushalmi
Expires: September 2012                                         Marvell
                                                          March 4, 2012
Scaling the Address Resolution Protocol for Large Data Centers
(SARP)
draft-nachum-sarp-00.txt
Abstract
This document provides a recommended architecture and network
operation named SARP. SARP is based on fast proxies that
significantly reduce broadcast domains and ARP/ND broadcast
transmissions. SARP supports smooth and fast virtual machine (VM)
mobility without any modification to the VM, while keeping the
connection up and running efficiently. SARP targets massively scaled
data centers with a large number of VMs that use the ARP and ND
protocols.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 4, 2012.
Nachum, et al. Expires September 4, 2012 [Page 1]
Internet-Draft Informational March 2012
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. SARP Motivation
1.2. SARP Overview
1.3. SARP Deployment Options
2. Abbreviations Used in this Document
3. SARP Description
3.1. Control Plane: ARP/ND
3.1.1. ARP/ND Request for a Local VM
3.1.2. ARP/ND Request for a Remote VM
3.2. Data Plane: Packet Transmission
3.2.1. Local Packet Transmission
3.2.2. Packet Transmission Between Sites
3.3. VM Local Migration
3.4. VM Migration from One Site to Another
3.4.1. ARP/ND Table of Mobile VMs
3.5. Multicast and Broadcast
3.6. Non IP packet
3.7. ARP caching
3.8. SARP Interaction with Overlay networks
4. Security Considerations
5. IANA Considerations
6. References
6.1. Normative References
6.2. Informative References
7. Acknowledgments
1. Introduction
1.1. SARP Motivation
SARP provides operational recommendations that mitigate the
performance degradation caused by the data center architecture. SARP
can be used in large data centers with a large number of VMs, where
VMs migrate from one system to another while keeping their network
connections up and running. Data center operators are required to
allow the VMs to keep their IP and MAC identity while migrating
between systems. The direct outcome of having VMs keep their
respective IP and MAC identities is that Layer 2 broadcast domains
scale up, and protocols such as [ARP] and [ND] degrade network
performance. SARP addresses a scaling problem that is also discussed
in [ARMD].
1.2. SARP Overview
SARP uses fast proxies that break down the large Layer 2 broadcast
domains into small segments. The SARP proxies are located at the
boundaries where the local Layer 2 infrastructure connects to its
Layer 2 cloud. Figure 1 depicts an example of two remote data centers
that are managed as a single flat Layer 2 domain. SARP proxies are
implemented at the edge devices connecting the data center to the
transport network. The direct outcome is a significant reduction of
broadcast domains and ARP/ND transmissions. The large L2 broadcast
domains are bounded by the SARP proxies. ARP/ND transmissions are
reduced due to the limited broadcast domains and the use of ARP/ND
proxies and caching.
SARP proxies enable fast migration of VMs between clouds and data
centers, keeping their connections up and running while the mobile
VMs retain their IP and MAC addresses.
*-------------------*
| |
+-------| TRANSPORT |-------+
| | | |
| *-------------------* |
| |
*-----------------* *----------------*
| SARP Proxies | | SARP Proxies |
*-----------------* *----------------*
| | | |
*-------* *-------* *-------* *-------*
| Agg | | Agg | | Agg | | Agg |
*-------* *-------* *-------* *-------*
|
*----------*
|Hypervisor|
*----------*
|
*--------*
|Virtual |
|Machine |
*--------*
(West Site) (East Site)
Figure 1 SARP Networking Architecture Example.
SARP distributes the Layer 2 Forwarding Information Base (FIB) from
the edge devices (functioning as SARP proxies) to the VMs. By doing
so, it significantly reduces table sizes on the edge devices. The
source VM maintains the mapping of its destination VMs to the
destination site/cloud in the ARP table. The destination VM IP is
translated to the destination MAC address of the SARP proxy at the
destination site. The SARP proxies maintain Layer 2 FIB entries only
for local VMs and remote edge devices.
SARP proxies support fast VM migration with a minimal transition
phase. When a SARP proxy detects or is informed of a VM migration, it
can update all its peers and trigger a fast update.
SARP seamlessly supports Layer 2 network virtualization services over
the overlay network and significantly reduces their complexity in
terms of table size and performance. The overlay networks are only
required to map MAC addresses of the SARP proxies to the correct
tunnel.
1.3. SARP Deployment Options
SARP deployment is tightly coupled with the data center architecture.
SARP proxies are located at the point where the Layer 2
infrastructure connects to its Layer 2 cloud using overlay networks.
SARP proxies can be located at the data center edge (as Figure 1
depicts), data center core, or data center aggregation. SARP can also
be implemented by the hypervisor (as Figure 2 depicts).
To simplify the description, we will focus on data centers that are
managed as a single flat Layer 2 network, where SARP proxies are
located at the boundary where the data center connects to the
transport network (as Figure 1 depicts).
*-------------------*
| |
+-------| TRANSPORT |-------+
| | | |
| *-------------------* |
| |
*-----------------* *----------------*
| Edge Device | | Edge Device |
*-----------------* *----------------*
| |
*-----------------* *----------------*
| Core | | Core |
*-----------------* *----------------*
| | | |
*-------* *-------* *-------* *-------*
| Agg | | Agg | | Agg | | Agg |
*-------* *-------* *-------* *-------*
|
*----------*
|Hypervisor|
*----------*
(West Site) (East Site)
Figure 2 SARP deployment options.
2. Abbreviations Used in this Document
ARP: Address Resolution Protocol
FIB: Forwarding Information Base
IP-D: IP address of the destination virtual machine
IP-S: IP address of the source virtual machine
MAC-D: MAC address of the destination virtual machine
MAC-E: MAC address of the East Proxy SARP Device
MAC-S: MAC address of the source virtual machine
MAC-W: MAC address of the West Proxy SARP Device
ND: Neighbor Discovery
SARP Proxy: A component that participates in the SARP protocol.
VM: Virtual Machine
3. SARP Description
3.1. Control Plane: ARP/ND
This section describes the ARP/ND procedure scenarios. In the first
scenario, VMs share the same site. In the second scenario, the source
VM is local and the destination VM is located at the remote site.
In all scenarios, the VMs (source and destination) share the same L2
broadcast domain.
3.1.1. ARP/ND Request for a Local VM
When the source and destination VMs are located at the same site, the
Address Resolution process is as described in [ARP]. When the VM
sends an ARP request to learn the IP to MAC mapping of another local
VM, it receives a reply from the other local VM with the IP-D to
MAC-D mapping.
3.1.2. ARP/ND Request for a Remote VM
When the source and destination VMs are located at different sites,
the Address Resolution process is as follows.
In our example, the source VM is located at the west site and the
destination VM is located at the east site.
When the source VM sends an ARP/ND request to find out the IP to MAC
mapping of a remote VM, the request is propagated throughout the
Layer 2 broadcast domain in all sites, including the east site.
The destination VM responds to the ARP/ND request and transmits an
ARP/ND reply carrying the IP-D to MAC-D mapping.
The east SARP proxy functions as the ARP proxy of its local VMs. The
east SARP proxy modifies the ARP reply message to carry the IP-D to
MAC-E mapping and forwards the modified ARP reply message to all the
SARP proxies.
The west SARP proxy forwards the modified ARP reply message to the
source VM.
The west SARP proxy can also function as an ARP cache of the remote
VMs. By doing so, it significantly reduces the volume of ARP/ND
transmissions over the network.
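
The reply handling described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
ArpReply and SarpProxy names, their fields, and the dictionary-based
cache are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class ArpReply:
    """Hypothetical, simplified ARP reply (not a wire format)."""
    sender_ip: str   # IP address being resolved, e.g. IP-D
    sender_mac: str  # MAC address advertised for that IP
    target_ip: str   # IP address of the requester, e.g. IP-S
    target_mac: str  # MAC address of the requester, e.g. MAC-S


class SarpProxy:
    """Hypothetical model of one SARP proxy."""

    def __init__(self, proxy_mac: str) -> None:
        self.proxy_mac = proxy_mac
        # Assumed cache layout: remote VM IP -> MAC of the remote proxy.
        self.remote_cache: Dict[str, str] = {}

    def rewrite_local_reply(self, reply: ArpReply) -> ArpReply:
        """Destination-site role: advertise the proxy MAC, not the VM MAC."""
        return replace(reply, sender_mac=self.proxy_mac)

    def learn_remote_reply(self, reply: ArpReply) -> ArpReply:
        """Source-site role: cache the mapping, forward the reply as is."""
        self.remote_cache[reply.sender_ip] = reply.sender_mac
        return reply

    def lookup(self, ip: str) -> Optional[str]:
        """Return the cached proxy MAC for a remote IP, if any."""
        return self.remote_cache.get(ip)


east = SarpProxy("MAC-E")
west = SarpProxy("MAC-W")
original = ArpReply("IP-D", "MAC-D", "IP-S", "MAC-S")
delivered = west.learn_remote_reply(east.rewrite_local_reply(original))
```

In this sketch the source VM ends up with the (IP-D to MAC-E) mapping,
and the west proxy could answer later requests for IP-D from its cache
instead of letting them cross the transport network.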
3.2. Data Plane: Packet Transmission
3.2.1. Local Packet Transmission
When a VM transmits packets to a destination VM that is located at
the same site, there is no change in the data plane. The packets are
sent from (IP-S, MAC-S) to (IP-D, MAC-D).
3.2.2. Packet Transmission Between Sites
Packets that are sent between sites traverse the SARP proxy of both
sites. In our example, all packets sent from the VM located at the
west site to the destination VM located at the east site traverse the
west SARP proxy and the east SARP proxy.
The source VM follows its ARP table and sends packets with (IP-D,
MAC-E) as the destination addresses and (IP-S, MAC-S) as the source
addresses.
The west SARP proxy replaces the packet's source MAC address with its
own MAC address (MAC-W), leaves the destination MAC address (MAC-E)
unchanged, and forwards the packet to the east SARP proxy.
When the east SARP proxy receives the packet, it replaces the
destination MAC address with MAC-D based on the packet's destination
IP address (i.e., IP-D), but it does not change the source MAC
address.
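
The two rewrite steps above can be illustrated with a short Python
sketch. The Frame type and both function names are hypothetical, and
the table of local VMs is assumed to be a simple IP to MAC dictionary.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Frame:
    """Hypothetical view of the address fields of one packet."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def west_proxy_forward(frame: Frame, west_mac: str) -> Frame:
    """Source-site proxy: set the source MAC to its own, keep the rest."""
    return replace(frame, src_mac=west_mac)


def east_proxy_forward(frame: Frame, local_vms: Dict[str, str]) -> Frame:
    """Destination-site proxy: pick the destination MAC by destination IP.

    The source MAC is left untouched. A KeyError is raised when the
    destination IP is not a known local VM (an assumption of this sketch).
    """
    return replace(frame, dst_mac=local_vms[frame.dst_ip])


sent = Frame(src_mac="MAC-S", dst_mac="MAC-E", src_ip="IP-S", dst_ip="IP-D")
on_transport = west_proxy_forward(sent, "MAC-W")
delivered = east_proxy_forward(on_transport, {"IP-D": "MAC-D"})
```

Under these assumptions the packet leaves the source VM as (MAC-S,
MAC-E), crosses the transport network as (MAC-W, MAC-E), and reaches
the destination VM as (MAC-W, MAC-D). The IP addresses never change.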
3.3. VM Local Migration
When a VM migrates locally within its site, the SARP protocol is not
required to perform any action. VM migration is resolved entirely by
the Layer 2 mechanisms.
3.4. VM Migration from One Site to Another
VM migration from one site to another is done seamlessly, without
any changes to VM addressing at any level, while keeping VM
connections up and running.
In our example, the VM migrates from the west site to the east site.
VM migration affects VMs and networking elements differently,
depending on their location:
- Origin site (west site)
- Destination site (east site)
- Other sites
Origin site:
The origin site is the site where the VM started its connections
before the migration, the west site in our example.
All VMs at the west site that have an ARP entry of IP-D in their ARP
table have the (IP-D to MAC-D) mapping. The ARP mapping is updated by
aging or by a gratuitous ARP message that is sent by the new
hypervisor of the migrating VM and modified by the SARP proxy of the
east site to carry the (IP-D to MAC-E) mapping. Until ARP tables are
updated, the source VMs at the west site continue sending packets to
MAC-D. Switches at the west site are still configured with the old
location of MAC-D. This can be resolved by MAC table aging or by
redirecting the packets to the SARP proxy of the west site.
Destination Site:
The destination site is the site to which the VM migrated, the east
site in our example.
All VMs at the east site that have an ARP entry of IP-D in their ARP
table have the (IP-D to MAC-W) mapping. The ARP mapping is updated by
aging or by a gratuitous ARP message sent by the hypervisor with the
(IP-D to MAC-D) mapping. Until ARP tables are updated, the source VMs
at the east site continue to send packets to MAC-W. This can be
resolved by redirecting the packets from the SARP proxy of the east
site to the migrated VM by updating the destination MAC of the
packets to MAC-D.
Other Sites:
All VMs at the other sites that have an ARP entry of IP-D in their
ARP table have the (IP-D to MAC-W) mapping. The ARP mapping is
updated by aging or by a gratuitous ARP message that is sent by the
new hypervisor of the migrated VM and modified by the SARP proxy of
the east site to carry the (IP-D to MAC-E) mapping. Until ARP tables
are updated, the source VMs at the other sites continue sending
packets to MAC-W. This can be resolved by redirecting the packets
from the SARP proxy of the west site to the SARP proxy of the east
site by updating the destination MAC of the packets to MAC-E.
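
The redirection of packets that still carry a stale destination MAC
can be sketched as follows. This is a minimal illustration, assuming
that each proxy keeps one table of local VMs and one table of VMs that
have moved away. The ProxyState type, its field names, and the
redirect function are hypothetical and not part of SARP as specified
here.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class Frame:
    """Hypothetical view of the destination fields of one packet."""
    dst_mac: str
    dst_ip: str


@dataclass
class ProxyState:
    """Assumed per-proxy state used only for this sketch."""
    # IP -> MAC of a VM currently located at this proxy's site.
    local_vms: Dict[str, str] = field(default_factory=dict)
    # IP -> MAC of the proxy at the site the VM has migrated to.
    moved_vms: Dict[str, str] = field(default_factory=dict)


def redirect(frame: Frame, state: ProxyState) -> Frame:
    """Rewrite a stale destination MAC based on the destination IP."""
    if frame.dst_ip in state.local_vms:
        # The VM now lives here: deliver to its own MAC.
        return replace(frame, dst_mac=state.local_vms[frame.dst_ip])
    if frame.dst_ip in state.moved_vms:
        # The VM left this site: hand over to the proxy of its new site.
        return replace(frame, dst_mac=state.moved_vms[frame.dst_ip])
    return frame  # Nothing known about this IP: leave the frame as is.


east_state = ProxyState(local_vms={"IP-D": "MAC-D"})
west_state = ProxyState(moved_vms={"IP-D": "MAC-E"})
stale = Frame(dst_mac="MAC-W", dst_ip="IP-D")
at_east = redirect(stale, east_state)
at_west = redirect(stale, west_state)
```

Keying the decision on the destination IP rather than on the stale MAC
lets the same function cover the destination site case (rewrite to
MAC-D) and the other sites case (rewrite to MAC-E).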
3.4.1. ARP/ND Table of Mobile VMs
The ARP table of a mobile VM migrating from the west site to the
east site contains entries for VMs located at the following sites:
- Origin site (west site)
- Destination site (east site)
- Other sites
The IP to MAC mapping of VMs located at the other sites is unaffected
by the migration.
The IP to MAC mapping of VMs located at the east site can be kept
with no change until the ARP aging time since they are mapped to
MAC-E. All traffic from the migrated VM to VMs located at the east
site traverses the SARP proxy of the east site. This can be mitigated
by an ARP advertisement sent by the SARP proxy of the east site or by
the hypervisor.
The IP to MAC mapping of VMs located at the west site can be kept
with no change until the ARP entries age out. All MAC addresses of
the VMs located at the west site are unknown at the east site. All
unknown traffic from the VM is intercepted by the SARP proxy of the
east site and forwarded to the SARP proxy of the west site (just for
the ARP aging time). This can be resolved earlier by the east SARP
proxy. Upon receiving unknown packets, it can update the migrating VM
with the new IP to MAC mapping by sending a modified gratuitous ARP
with the (IP-D to MAC-W) mapping.
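
One possible reading of the early-resolution step is sketched below.
It assumes that the east SARP proxy knows the set of MAC addresses
present at its own site and the MAC address of the origin-site proxy.
The GratuitousArp type, its deliver_to field (a modeling shortcut,
since a gratuitous ARP is normally broadcast), and the function name
are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical view of the address fields of one packet."""
    src_ip: str
    dst_ip: str
    dst_mac: str


@dataclass(frozen=True)
class GratuitousArp:
    """Hypothetical update message built by the proxy."""
    ip: str          # IP address whose mapping is refreshed
    mac: str         # MAC address the receiver should use from now on
    deliver_to: str  # IP address of the VM that should learn the mapping


def handle_frame(
    frame: Frame, known_local_macs: Set[str], origin_proxy_mac: str
) -> Tuple[Frame, Optional[GratuitousArp]]:
    """Forward a frame; on an unknown destination MAC also build an update."""
    if frame.dst_mac in known_local_macs:
        return frame, None  # Normal local delivery, nothing to correct.
    # Unknown MAC: send the frame to the origin-site proxy and prepare
    # an update so that the sender stops using the stale MAC address.
    forwarded = replace(frame, dst_mac=origin_proxy_mac)
    update = GratuitousArp(frame.dst_ip, origin_proxy_mac, frame.src_ip)
    return forwarded, update


stale = Frame(src_ip="IP-M", dst_ip="IP-X", dst_mac="MAC-X")
forwarded, update = handle_frame(stale, {"MAC-A", "MAC-B"}, "MAC-W")
```

Here IP-M stands for the migrated VM and IP-X for one of its former
neighbors at the west site. Both labels are placeholders introduced
for this example only.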
Note that overlay networks providing the Layer 2 network
virtualization services configure their Edge Device MAC aging timers
to be greater than the ARP request interval.
3.5. Multicast and Broadcast
To be added in a future version of this document.
3.6. Non IP packet
To be added in a future version of this document.
3.7. ARP caching
To be added in a future version of this document.
3.8. SARP Interaction with Overlay networks
SARP interaction with overlay networks providing L2 network
virtualization (such as IP, VPLS, OTV, NVGRE and VXLAN) is efficient
and scalable.
The mapping of SARP to overlay networks is straightforward. The VM
maps the destination IP to the MAC address of the SARP proxy. The
overlay network maps the proxy MAC to its correct tunnel. SARP
significantly scales down the complexity of the overlay networks and
transport networks by reducing the size of the mapping tables to the
number of SARP proxies.
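
The table-size argument can be made concrete with a small Python
sketch. It assumes exactly one SARP proxy MAC per site and assumes
that, without SARP, the overlay edge must map every remote VM MAC to a
tunnel. The site names, VM labels, and the overlay_entries function
are hypothetical.

```python
from typing import Dict, List


def overlay_entries(
    sites: Dict[str, List[str]], local_site: str, use_sarp: bool
) -> int:
    """Count MAC-to-tunnel entries needed at the edge of local_site."""
    remote = {name: vms for name, vms in sites.items() if name != local_site}
    if use_sarp:
        # One entry per remote SARP proxy (assumed: one proxy per site).
        return len(remote)
    # Assumed baseline: one entry per remote VM MAC address.
    return sum(len(vms) for vms in remote.values())


topology = {
    "west": ["vm-w1", "vm-w2", "vm-w3"],
    "east": ["vm-e1", "vm-e2", "vm-e3", "vm-e4"],
    "north": ["vm-n1", "vm-n2"],
}
with_sarp = overlay_entries(topology, "west", use_sarp=True)
without_sarp = overlay_entries(topology, "west", use_sarp=False)
```

In this toy topology the west edge needs two entries with SARP and six
without it. The first number depends only on the number of remote
sites, while the second grows with the number of remote VMs.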
4. Security Considerations
Security considerations will be added in a future version of this
document.
5. IANA Considerations
There are no IANA actions required by this document.
RFC Editor: please delete this section before publication.
6. References
6.1. Normative References
[ARP] Plummer, D., "An Ethernet Address Resolution Protocol",
RFC 826, November 1982.
[ND] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
"Neighbor Discovery for IP version 6 (IPv6)", RFC
4861, September 2007.
6.2. Informative References
[ARMD] Narten, T., Karir, M., and I. Foo, "Problem Statement for
ARMD", draft-ietf-armd-problem-statement (work in progress),
February 2012.
7. Acknowledgments
This document was prepared using 2-Word-v2.0.template.dot.
Authors' Addresses
Youval Nachum
Marvell
6 Hamada St.
Yokneam, 20692 Israel
Email: youvaln@marvell.com
Tal Mizrahi
Marvell
6 Hamada St.
Yokneam, 20692 Israel
Email: talmi@marvell.com
Ilan Yerushalmi
Marvell
6 Hamada St.
Yokneam, 20692 Israel
Email: yilan@marvell.com