ARMD BOF                                                       M. Karir
Internet Draft                                                  J. Rees
Intended status: Informational Track                 Merit Network Inc.
Expires: January 2012

                                                          July 10, 2011


                       Address Resolution Statistics
                    draft-karir-armd-statistics-01.txt


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 10, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the BSD License.



Karir                  Expires January 10, 2012                [Page 1]


Internet-Draft             ARMD Statistics                July 10, 2011


Abstract

   As large scale data centers continue to grow with an ever-increasing
   number of virtual and physical servers there is a need to re-
   evaluate performance at the network edge.  Performance is often
   critical for large scale data center scale applications and it is
   important to minimize any unnecessary latency or load in order to
   streamline the operation of services at such large scales.  To
   extract maximum performance from these applications it is important
   to optimize and tune all the layers in the data center stack.  One
   critical area that requires particular attention is the link-layer
   address resolution protocol that maps an IP address with the
   specific hardware address at the edge of the network.

   The goal of this document is to characterize this problem space in
   detail in order to better understand the scale of the problem as
   well as to identify particular scenarios where address resolution
   might have greater adverse impact on performance.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 0.

Table of Contents

   1. Introduction...................................................3
   2. Terminology....................................................3
   3. Factors That Might Impact ARP/ND Performance...................4
      3.1. Number of Hosts...........................................4
      3.2. Traffic Patterns..........................................4
      3.3. Network Events............................................4
      3.4. Address Resolution Implementations........................4
      3.5. Layer 2 Network Topology..................................5
   4. Experiments and Measurements...................................5
      4.1. Experiment Architecture...................................5
      4.2. Impact of Number of Hosts.................................8
      4.3. Impact of Traffic Patterns................................8
      4.4. Impact of Network Events..................................9
      4.5. Implementation Issues....................................10
      4.6. Experiment Limitations...................................10
   5. Emulating Address Resolution Behavior.........................11
   6. Conclusion and Recommendation.................................11
   7. Manageability Considerations..................................11
   8. Security Considerations.......................................11
   9. IANA Considerations...........................................12


Karir                  Expires January 10, 2012                [Page 2]


Internet-Draft             ARMD Statistics                July 10, 2011


   10. Acknowledgments..............................................12
   11. References...................................................12
   Authors' Addresses...............................................12
   Intellectual Property Statement..................................13
   Disclaimer of Validity...........................................13



1. Introduction

   Data centers are a key part of delivering Internet scale
   applications.  Performance at such large scales is critical as even
   a few milliseconds or microseconds of additional latency can result
   in loss of customer traffic.  Data center design and network
   architecture is a key part of the overall service delivery plan.
   This includes not only determining the scale of physical and virtual
   servers but also optimizations to the entire data center stack
   including in particular the layer 3 and layer 2 architectures.
   One aspect of data center design that has received some close
   attention is link-layer address resolution protocols such as Address
   Resolution Protocol (ARP - IPv4) and Neighbor Discovery (ND - IPv6).
   The goal of these protocols is to map an IP address of a destination
   node with the hardware address of the network interface for that
   node.  This address resolution occurs at the edge of the network.
   In general, both ARP and ND are query/response protocols.
   In order to maximize performance it is important to understand the
   behavior of these protocols at large scales.  In particular, we need
   to understand what the performance implications of these protocols
   might be in terms of the number of additional messages that they
   generate as well the resulting load on devices on the network that
   must then process these messages.

2. Terminology

   ARP:     Address Resolution Protocol

   ND:      Neighbor Discovery

   ToR:      Top of Rack Switch

   VM:      Virtual Machines






Karir                  Expires January 10, 2012                [Page 3]


Internet-Draft             ARMD Statistics                July 10, 2011


3. Factors That Might Impact ARP/ND Performance

3.1. Number of Hosts

   Every host on the network that attempts to send/receive traffic will
   produce some base level of ARP/ND traffic.  The overall amount of
   ARP/ND traffic on the network will vary with the number of hosts.
   In the case of ARP, all address resolution request messages are
   broadcast and these will be received and processed by all nodes on
   the network. In the case of ND, address resolution messages are sent
   via multicast and therefore may have a lower overall impact on the
   network even though the number of messages exchanged is the same.

3.2. Traffic Patterns

   The traffic pattern can have a significant impact on the level of
   ARP/ND traffic in the network.  Therefore we would expect ARP/ND
   traffic pattern to vary significantly based on the data center
   design as well as the application mix.  The traffic mix determines
   how many other nodes a given node needs to communicate with and how
   frequently.  Both of these directly influence address discovery
   traffic on the network.

3.3. Network Events

   Several specific network events can have a significant impact on
   ARP/ND traffic.  One example of such an event is machine failure.
   If a host that is frequently accessed fails, it could result in much
   higher ARP/ND traffic as other hosts in the network continue to try
   to reach it by repeatedly sending out additional address resolution
   messages.  Another example is Virtual Machine migration.  If a VM is
   migrated to a system on a different switch, VLAN, or even
   geographically different data center, it can cause a significant
   shift in overall traffic patterns as well as ARP/ND traffic.
   Another particularly well-known network event that causes address
   resolution traffic spikes is a network scan.  In a network scan, one
   or more hosts internal or external to the edge network attempt to
   connect to a large number of internal hosts in a very short period
   of time.  This results in a sudden increase in the amount of address
   resolution traffic in the network.

3.4. Address Resolution Implementations

   As with any other protocol, the activity of address resolution
   protocols such as ARP/ND can vary significantly with specific
   implementations as well as the default settings for various protocol
   parameters.  ARP cache timeout is a common parameter that has a


Karir                  Expires January 10, 2012                [Page 4]


Internet-Draft             ARMD Statistics                July 10, 2011


   direct impact on the amount of address resolution traffic.  Older
   versions of Microsoft Windows would use a default value of 2 minutes
   for this parameter, however Windows Vista and Windows 2008
   implementations changed this to be a random value between 15 seconds
   and 45 seconds.  This parameter defaults to 60 seconds for Linux and
   20 minutes for FreeBSD.  The default value for Cisco routers and
   switches is 4 hours.  For ND, one relevant parameter is the prefix
   stale time, which determines when old entries can be aged out.  This
   value is 30 days for Cisco, and 60 seconds for Linux. The overall
   address resolution traffic in a data center will vary based on the
   mix of various ARP implementations that are present.

3.5. Layer 2 Network Topology

   The layer 2 network topology within a data center can also influence
   the impact of various address resolution protocols.  While ARP
   traffic is broadcast and must be processed by all nodes within that
   broadcast domain, a well designed layer 2 topology can limit the
   size of the broadcast domain and the amount of address resolution
   traffic.  ND traffic on the other hand is multicast and might
   potentially increase the load on the directly connected layer 2
   switch if the traffic pattern spans across broadcast domains.

4. Experiments and Measurements

4.1. Experiment Architecture

   In an attempt to quantify address resolution issues in a data center
   environment we have run experiments in our own data center, which is
   used for production services.  We were able to leverage unused
   capacity for our experiments.  The data center topology is fairly
   simple.  There are a pair of redundant access switches which pass
   traffic to and from the data center.  These switches connect to the
   top of the rack switches which in turn connect to blade switches in
   our Dell blade chassis.  The entire hardware platform is managed via
   VMware's vCloud Director.  In total we have access to 8 blades of
   resources on a single chassis, which is roughly 3TB of disk, 200GB
   of RAM and 100GHz of CPU.  The network available to us is a /22
   network block of IPv4 space and a /64 of IPv6 address space in a
   flat topology.

   Using this resource pool we create a 500-node testbed based on
   Centos 5.5.  We use custom command and control software that allows
   us to control these nodes for our experiments.  This allows us to
   issue commands to all nodes to start/stop services and traffic
   generation scripts.  We also use a custom traffic generator agent in



Karir                  Expires January 10, 2012                [Page 5]


Internet-Draft             ARMD Statistics                July 10, 2011


   order to generate both internal and external traffic via wget
   commands to various hosts.

   The command and control software uses UDP broadcast messages for
   communication so that no additional address resolution messages are
   generated that might affect our measurements.  Each of the 500 nodes
   is given a list of other nodes that it must contact at the beginning
   of an experiment.  This is used to affect the traffic patterns for a
   given experiment.  In addition each experiment determines traffic
   rate by specifying the inter-communication delay between attempts to
   contact other nodes.  The shorter the duration the more the traffic
   that will be generated.  The nodes all run dual IPv4/IPv6 stacks.

   A packet tap attached to a monitor port on the access switch allows
   us to monitor the arrival rate of ARP and ND requests and replies.
   We also monitor the CPU load on the access switch at two-second
   intervals via SNMP queries [STUDY].

   Figure 1. shows our experimental setup.






























Karir                  Expires January 10, 2012                [Page 6]


Internet-Draft             ARMD Statistics                July 10, 2011




                        External                External
                           |                        |
                           |                        |
                           |                        |
                           |                        |
                       +---+---------+    +---------+---+
    +------------+     | Data_Agg_1  |    |  Data_Agg_2 |
    |   Packet   |_____|    Cisco    |    |    Cisco    |
    |    Tap     |     |  Catalyst   |    |  Catalyst   |
    +------------+     |    4900M    |    |    4900M    |
                       +---+----+---++    +---+---+--+--+
                           |    |    \        |   |  |
                           |    |     \       |   |  |
                          /      \     \      |   |  |_______
                         /        \     \     |   |_______   |
                        /          \     \____|___________|_ |
       ________________/            \_________|__________ | ||
      |                                       |          || ||
+-----|-------------Dell Enclosure 1----------|--------+ .. ..
|+----+-----+  +----------+  +----------+  +----------+| .. ..
||  Cisco   |__|  Cisco   |__|  Cisco   |__|  Cisco   || .. ..
|| Catalyst |  | Catalyst |  | Catalyst |  | Catalyst ||
||   3130   |  |   3130   |  |   3130   |  |   3130   ||
|+-++++++++-+  +-++++++++-+  +-++++++++-+  +-++++++++-+|
|  ||||||||      ||||||||      ||||||||      ||||||||  |
|1-+||||||+-8  1-+||||||+-8  1-+||||||+-8  1-+||||||+-8|
| 2-+||||+-7    2-+||||+-7    2-+||||+-7    2-+||||+-7 |
|  3-+||+-6      3-+||+-6      3-+||+-6      3-+||+-6  | .. ..
|   4-++-5        4-++-5        4-++-5        4-++-5   | .. ..
+------------------------------------------------------+ .. ..
                        +------+_________________________|| ||
                        | En.2 |__________________________| ||
                        +------+                            ||
                        +------+____________________________||
                        | En.3 |_____________________________|
                        +------+






Karir                  Expires January 10, 2012                [Page 7]


Internet-Draft             ARMD Statistics                July 10, 2011



4.2. Impact of Number of Hosts

   One of the most simple experiments is to determine the overall
   baseline load that is generated on a given network segment when a
   varying number of hosts are active.  While the absolute numbers
   might vary on a large number of factors, what we are interested in
   here is how the traffic scales as different numbers of hosts are
   brought online given all other factors being held constant.  Our
   experiment therefore simply changes the number of active hosts in
   our experiment setup from one run to the next and we measure address
   resolution traffic on the network. The number of hosts is increased
   from 100 to 500 in steps of 100.  The results indicate that address
   resolution traffic scales in a linear fashion with the number of
   hosts in the network. This linear scaling applies both to ARP as
   well as ND traffic though raw ARP traffic rate was considerably
   higher than ND traffic rate.  For our parameters the rate varied
   from 100 to 250pps of ARP traffic and from 25pps to 200pps for ND
   traffic.  There is a clear spike in CPU load on the access switch in
   the beginning of each experiment, which can reach almost 40 percent.
   We were not able to discern any increase in this spike across
   experiments.

4.3. Impact of Traffic Patterns

   Traffic patterns can have a significant impact on the amount of
   address resolution traffic in the network.  In order to study this
   in detail we constructed two distinct experiments, the first of
   which simply increased the rate at which nodes were attempting to
   communicate with each other, while the second experiment controlled
   the number of active versus inactive nodes in the traffic exchange
   matrix.

   The first experiment uses all 500 nodes in our experiment and
   increases the traffic load for each run by reducing the wait time
   between communication events.  The wait time is reduced from 50
   seconds to 1 second over a series of 6 runs by roughly halving the
   duration for each run.  All other parameters remain the same across
   experiment runs.  Therefore the only factor we are varying is the
   total number of nodes a single node will attempt to communicate
   within a given interval of time.  Once again we observe a linear
   scaling in ARP traffic volumes ranging from 200pps for the slowest
   experiment to almost 1800pps for the most aggressive experiment.
   The linear trend also holds for ND traffic, which increases from
   50pps to 1400pps across different runs.




Karir                  Expires January 10, 2012                [Page 8]


Internet-Draft             ARMD Statistics                July 10, 2011


   The goal of the second experiment is to determine the impact of
   active versus inactive hosts in the network.  An inactive host in
   this context means one for which an IP address has been assigned,
   but there is nothing at that address so that ARP requests and all
   other packets are ignored.  All 500 hosts are involved in traffic
   initiation.  The pool of targets for this traffic starts out being
   the same 500 hosts that are initiating.  In subsequent runs we vary
   the ratio of active to inactive target hosts, from 500/0 to 400/100
   in steps of 100.  This experiment showed roughly a 60% increase
   (220-360 pps) in traffic for the IPv4 (ARP) case and about an 80%
   increase (160-290 pps) for the IPv6 case.

   In a slight variation on the second experiment all 500 nodes attempt
   to contact all other hosts plus an additional varying number of
   inactive hosts in steps of 100 up to a maximum of 400.  In this
   experiment we see a slight linear increase as the total number of
   nodes in the traffic matrix increases for both ARP and ND.

   We ran these experiments for IPv4 only, IPv6 only, and simultaneous
   IPv4 and IPv6.  ARP and ND traffic seemed to be independent of each
   other.  That is, the ARP and ND traffic rates and switch CPU load
   depend on the presented traffic load, not on the presence of other
   traffic on the network.

   One final experiment attempted to determine what the maximum
   additional load of ARP/ND traffic might be in our setup.  For this
   purpose we configured our experiment to use all 500 nodes to
   communicate with all 500 other nodes one at a time as fast as
   possible.  We were able to observe ARP traffic peak of up to 4000pps
   and a maximum CPU load of 65% on the access switch.

4.4. Impact of Network Events

   Network scanning is commonly understood to cause significant address
   resolution activity on the edge of the network.  Using our
   experimental setup we attempted to repeatedly scan our network both
   from the outside as well as within.  In each case we were able to
   generate ARP traffic spikes of up to 1400pps and ND traffic spikes
   of 1000pps.  These are also accompanied by a corresponding spike in
   CPU load at the access switch.

   Node failures in a network also have the ability to significantly
   impact address resolution traffic.  This effect depends on the
   particular traffic patter and the number of other hosts that are
   attempting to communicate with the failed node.  All nodes will
   repeatedly attempt to perform address resolution for the failed node
   and this can lead to significant increase in ARP/ND traffic.  We are


Karir                  Expires January 10, 2012                [Page 9]


Internet-Draft             ARMD Statistics                July 10, 2011


   able to show this via a simple experiment that creates 400 active
   nodes which all attempt to communicate with nodes in a separate
   group of 80 nodes.  For each experiment run we then shutdown hosts
   in the target group of 80 nodes in batches of 10 each.  We are able
   to demonstrate that ARP traffic actually increases in this scenario
   from an overall rate of 200pps to 300pps.

   Another network event that might result in significant changes in
   address resolution traffic is the migration of VMs in a data center.
   We attempted to replicate this scenario in our somewhat limited
   environment by placing one of our 8 blades in maintenance mode,
   which forced all 36 VMs on that blade to migrate to other blades.
   However, as our entire experimental infrastructure is located within
   a single rack we do not notice any changes in ARP traffic during
   this event.

   Many hypervisors remove the problem of virtual machine migration by
   assigning a MAC address to a VM, and then a kernel switching module
   handles all address resolution, accepting and sending packets for
   all the MAC addresses of its virtual machines through a determined
   host interface. In other words, the hypervisor responds to the
   appropriate traffic for the VMs it contains. It behaves as a router
   for the Layer 2 traffic it is exposed to.

4.5. Implementation Issues

   Protocol implementations and default parameter values can also have
   a significant impact on the behavior of address resolution traffic
   in the network.  Parameters such as cache timeout values in
   particular determine when cached entries are removed or need to be
   accessed to ensure they are not stale.  Though these parameters are
   unlikely to be modified the variation in these for different systems
   can impact ARP/ND traffic when different systems are present on a
   given network in varying numbers.  Our experimental setup did not
   explore this issue of mixed environments or sensitivity of ARP/ND
   traffic to the various protocols parameters.

4.6. Experiment Limitations

   Our experimental environment though fairly typical in the hardware
   and software aspects probably only represents a very limited small
   data center configuration.  It is difficult to thoroughly instrument
   very large environments and even smaller experimental environments
   in a lab might not be very representative.  We believe our
   architecture is fairly representative and provides us with useful
   insights regarding the scale and trends of address resolution
   traffic in a data center.


Karir                  Expires January 10, 2012               [Page 10]


Internet-Draft             ARMD Statistics                July 10, 2011


   One very significant limitation that we came across in our
   experiments was the problems of using all 500 nodes in a high load
   scenario.  When all 500 nodes were active simultaneously our
   architecture would run into a bottleneck while accessing disk
   storage.  This limitation also prevents us from attempting to scale
   our experiments for more than 500 nodes.  This also limited us in
   what experiments we could run at the maximum possible load.

   Our experimental testbed shared infrastructure, including network
   access switches, with production equipment.  This limited our
   ability to stress the network to failure, and our ability to try
   changes in switch configuration.

5. Scaling Up: Emulating Address Resolution Behavior on Larger Scales

   Based on the data collected from our experiments we have built an
   ARP/ND traffic emulator that has the ability to generate varying
   amounts of address resolution traffic on a network with varying
   address ranges.  This gives us the ability to scale beyond 500 VM
   nodes in our experiments.  Our software emulator can be used to
   directly test the impact of such traffic on nodes and switches in
   the network at much larger scales.

   Preliminary results show a good match between the testbed and the
   emulator for both traffic rates and switch load over a wide range of
   presented traffic load.  We have calibrated the emulator from the
   testbed data and will use the emulator to run experiments at scales
   that would otherwise be impractical in the real network available to
   us.

6. Conclusion and Recommendation

   In this document we have described some of our experiments in
   determining the actual amount of address resolution traffic on the
   network under a variety of conditions for a simple small data center
   topology.  We are able to show that ARP/ND traffic scales linearly
   with the number of hosts in the network as well as the traffic
   interconnection matrix.  In addition we also study the impact of
   network events such as scanning, machine failure and VM migrations
   on address resolution traffic.  We were able to show that even in a
   small data center with only 8 blades and 500 virtual hosts, ARP/ND
   traffic can reach rates of thousands of packets per second, and
   switch CPU loads can reach 65% or more.

   We are able to utilize the data from our experiments to build a
   software based ARP/ND traffic emulation engine that has the ability
   to generate address resolution traffic at even larger scales.  The


Karir                  Expires January 10, 2012               [Page 11]


Internet-Draft             ARMD Statistics                July 10, 2011


   goal of this emulation engine is to allow us to study the impact of
   this traffic on the network for large data centers.

7. Manageability Considerations

   This document does not add additional manageability considerations.

8. Security Considerations

   This document has no additional requirement for security.

9. IANA Considerations

   None.

10. Acknowledgments

   We want to acknowledge the following people for their valuable
   discussions related to this draft: Igor Gashinsky, Kyle Creyts,
   Warren Kumari.

   This document was prepared using 2-Word-v2.0.template.dot.

11. References

   [ARP]    D.C. Plummer, "An Ethernet address resolution protocol."
             RFC826, Nov 1982.

   [ND]     T. Narten, E. Nordmark, W. Simpson, H. Soliman, "Neighbor
             Discovery for IP version 6 (IPv6)." RFC4861, Sept 2007.

   [STUDY]  Rees, J., Karir, M., "ARP Traffic Study." MANOG52, June
             2011. URL
             http://www.nanog.org/meetings/nanog52/presentations/Tuesda
             y/Karir-4-ARP-Study-Merit Network.pdf


Authors' Addresses

   Manish Karir
   Merit Network Inc.
   1000 Oakbrook Dr, Suite 200
   Ann Arbor, MI 48104, USA
   Phone: 734-527-5750
   Email: mkarir@merit.edu




Karir                  Expires January 10, 2012               [Page 12]


Internet-Draft             ARMD Statistics                July 10, 2011


   Jim Rees
   Merit Network Inc.
   100 Oakbrook Dr, Suite 200
   Ann Arbor, MI 48104, USA
   Phone: 734-527-5751
   Email: rees@merit.edu



Intellectual Property Statement

   The IETF Trust takes no position regarding the validity or scope of
   any Intellectual Property Rights or other rights that might be
   claimed to pertain to the implementation or use of the technology
   described in any IETF Document or the extent to which any license
   under such rights might or might not be available; nor does it
   represent that it has made any independent effort to identify any
   such rights.

   Copies of Intellectual Property disclosures made to the IETF
   Secretariat and any assurances of licenses to be made available, or
   the result of an attempt made to obtain a general license or
   permission for the use of such proprietary rights by implementers or
   users of this specification can be obtained from the IETF on-line
   IPR repository at http://www.ietf.org/ipr

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   any standard or specification contained in an IETF Document. Please
   address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

   All IETF Documents and the information contained therein are
   provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION
   HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY,
   THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.


Karir                  Expires January 10, 2012               [Page 13]