Network working group                                             X. Xu
Internet Draft                                                   Huawei
Category: BCP                                              M. Boucadair
Expires: March 2010                                      France Telecom
                                                     September 25, 2009


    Redundancy and Load Balancing Framework for Stateful Network Address
                             Translators (NAT)

                  draft-xu-behave-stateful-nat-standby-01


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 25, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.






Xu & Boucadair         Expires March 25, 2010                 [Page 1]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT



Abstract

   This document defines a framework for ensuring redundancy and/or
   load balancing for stateful Network Address Translators (NAT),
   including NAT44, NAT46 and NAT64.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

Table of Contents


   1. Introduction.................................................3
   2. Terminology..................................................3
   3. Reference Architecture.......................................4
   4. Redundancy Mechanisms........................................5
      4.1. Cold Standby Mechanism..................................6
      4.2. Hot Standby Mechanism...................................8
   5. Load Balancing Mechanisms....................................9
   6. Election Protocol Considerations.............................9
   7. State Synchronization Protocol Considerations...............10
   8. Security Considerations.....................................10
   9. IANA Considerations.........................................10
   10. Acknowledgments............................................11
   11. References.................................................11
      11.1. Normative References..................................11
      11.2. Informative References................................11
   Authors' Addresses.............................................12















Xu & Boucadair         Expires March 25, 2010                 [Page 2]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT


1. Introduction

   Network Address Translation (NAT) has been used as an efficient way
   to share the same IPv4 address among several hosts. Recently, due to
   IPv4 address shortage, several proposals have been elaborated to
   rely on Large Scale NAT (LSN) (also denoted as Carrier Grade NAT
   (CGN), e.g., [NAT444], [DS-Lite] and [NAT64]) as means to optimize
   the address multiplicative factor [Shortage]. In such models, CGN
   function (which may be embedded in a router or be deployed in
   standalone devices) is activated within large-scale networks, such
   as ISP networks or enterprise ones, where a huge amount of customers
   are located. These customers within large-scale networks may
   experience service degradation due to the presence of the single
   point of failure. Therefore, redundancy and/or load-balancing
   capabilities are strongly desired for these LSN/CGN devices in order
   to provide highly available services to customers. Failure detection
   and repair time must be therefore shortened.

   This document describes a framework of redundancy and/or load
   balancing for stateful NAT including: NAT46, NAT64 and NAT44.
   Stateless NAT is out of the scope of this memo. Unless mentioned,
   NAT and LSN/CGN terms throughout this document, pertain to stateful
   NAT and stateful LSN/CGN. Except dealing with the exceptional
   failures (e.g., power outage, OS crash-down or link failure etc.),
   the redundancy mechanism described in this document can also be used
   for planned maintenance operations (i.e., graceful shutdown of the
   primary NAT due to maintenance needs).

2. Terminology

   This memo makes use of the terms defined in [RFC2663]. Below are
   provided terms specific to this document:

   - LSN (Large Scale NAT)/CGN (Carrier Grade NAT): a NAT device placed
   within a large-scale network (e.g., ISP network, enterprise network,
   or campus network). These devices may be placed at the boundary
   between the large-scale private network and the public Internet,
   between a private network and a large-scale public network or
   between two heterogeneous (i.e., IPv4 and IPv6) IP realms.

   - LSN/CGN internal address realm (internal realm for short): a realm
   where the communication initiators (e.g., a client in the context of
   client/server application) are located. For NAT44, the internal
   realm refers to the private networks, as opposed to the IPv4
   Internet. For NAT64, the internal realm means IPv6 network or IPv6



Xu & Boucadair         Expires March 25, 2010                 [Page 3]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   Internet. For NAT46, the internal realm refers to IPv4 network or
   IPv4 Internet. Accordingly, the hosts located in the internal realm
   are called internal hosts, and the addresses used in the internal
   realm are called internal addresses. In the context of DS-lite
   architecture, the internal address realm is assumed to be private
   IPv4 addresses even if the transport mode used to convey exchanged
   traffic is IPv6. A DS-lite CGN device is a NAT44 device which
   requires IPv6 capabilities and IPv6-specific information to perform
   its NAT operation.

   - LSN/CGN external address realm (external realm for short): a realm
   where the communication responders (e.g., a server in the
   client/server application) are located. For NAT44, the external
   realm refers to the IPv4 Internet. For NAT64, the external realm
   means the IPv4 Internet or IPv4 network. For NAT46, the external
   realm refers to the IPv6 Internet or IPv6 network. Accordingly, the
   hosts located in the external realm are called external hosts, and
   the addresses used in the external realm are called external
   addresses.

   - Internal address pool: an address pool used for assigning internal
   addresses to represent the external hosts in the internal realm.
   Note that this address pool is specific to NAT46 and NAT64. For
   NAT46, the IPv4 address pool used for assigning internal IPv4
   addresses to represent external IPv6 hosts is the internal address
   pool. For NAT64, the prefix64 used for synthesizing internal IPv6
   addresses to represent external IPv4 hosts in the internal realm
   could be looked as a special internal address pool.

   - External address pool: an address pool used for assigning external
   addresses for the internal hosts. For NAT44 and NAT64, the IPv4
   address pool is the external address pool. For NAT46, the prefix64
   could be looked as a special external IPv6 address pool from which
   synthesized IPv6 addresses are assigned to internal IPv4 hosts.

   - CPE (Customer Premises Equipment): A device which is used to
   interconnect the customer premise with the service provider's
   network.

   - Prefix64: an IPv6 prefix used for synthesizing IPv6 addresses for
   the IPv4 hosts. See [Format] for more details.

3. Reference Architecture

   In a typical operational scenario, as illustrated in Figure 1, two
   NAT devices are deployed for redundancy and/or load balancing



Xu & Boucadair         Expires March 25, 2010                 [Page 4]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   purposes. Hence, we describe the corresponding mechanisms based on
   this scenario. Note that these mechanisms are also suitable in the
   scenarios in which more than two NAT devices are used.

        +-------------------------+     +-----------------------+
        |                         |     |                       |
        |                       +-+-----+-+                     |
        |                       |  NAT-A  |                     |
   +----+-------------+         +-+-----+-+    +-------------+  |
   |   Internal Host  |           |     |      |External Host|  |
   +----+-------------+           |     |      +-------------+  |
        |                       +-+-----+-+                     |
        |                       |  NAT-B  |                     |
        |    Internal realm     +-+-----+-+    External realm   |
        |                         |     |                       |
        +-------------------------+     +-----------------------+

              Figure 1. General Scenario of Dual NAT Routers



   Due to the fact that the redundancy and load-balancing mechanisms
   for NAT44, NAT46 and NAT64 are almost the same except for the routes
   towards the external realm advertised into the internal realm by the
   NAT devices or outsourced to a router, e.g., a route to the prefix64
   in the case of NAT64, a route to the IPv4 Internet (in the context
   of [NAT444]) or the tunnel concentrator (in the context of [DS-Lite])
   in the case of NAT44, and a route to the IPv4 address pool in NAT46,
   we describe these mechanisms in general.

4. Redundancy Mechanisms

   The fundamental principle of NAT redundancy is to make two or more
   NAT devices function as a redundancy group, and select one as the
   Primary NAT and the other(s) as the Backup NAT through a dedicated
   election procedure (see Section 6) or manual configuration. In the
   nominal regime, datagrams exchanged between hosts in the internal
   realm and the external realm are handled by the Primary NAT. Once
   the Primary NAT is out of service (means to detect and to notify
   this failure to the redundancy group should be activated), the
   Backup NAT with the highest priority (if several backup NATs are
   deployed) takes over and is consequently elected (or selected) to be
   responsible for handling received traffic. This Backup NAT is then
   identified as new Primary NAT. Once the former Primary NAT became
   operational, it could either preempt the role of Primary NAT or not.




Xu & Boucadair         Expires March 25, 2010                 [Page 5]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   This should be part of the policies to be configured by the
   administrative entity managing a NAT redundancy group.

   To ensure a coherent behavior in case of NAT failure, this document
   assumes that both Primary and Backup NAT devices are managed by the
   same administrative entity. Thus, consistent configuration policies
   should be enforced in all involved nodes. Note that the election
   process must be deterministic and does not lead to fuzzy behavior as
   far as the election of new Primary NAT is concerned. Moreover, to
   enhance the service availability the time to detect a failure and
   the handover between the Primary and Backup NAT must be shortened.

   Two redundancy mechanisms are described hereafter: the cold or the
   hot standby mechanism:

       The goal of the cold standby mechanism is just to keep the NAT
        failover transparent to the communicating internal hosts;

       In contrast, the purpose of the hot standby mechanism is to
        maintain established sessions continuously during the NAT
        failover.

   The following sub-sections provide more information about these two
   modes.

4.1. Cold Standby Mechanism

   To implement cold standby mode, the internal addresses used to
   represent the external hosts in the local realm should be retained
   despite the NAT failover. The following assesses how this
   requirement is met in each NAT flavor:

       In the context of NAT44, the external hosts' internal addresses
        (i.e., the addresses used to represent the external hosts in
        the internal realm) are the same as their external addresses.
        Therefore, the above requirement is met naturally.

       In a NAT64 context, NAT devices belonging to a redundancy group
        should be configured with an identical IPv6 prefix prefix64.

       As for NAT46, NAT devices in a redundancy group should be
        configured with an identical IPv4 address pool and a subset of
        translation state information should be synchronized among
        these NAT devices through a dedicated state synchronization
        protocol [NAT-Sync]. This is to ensure the Backup NAT, once
        selected as the current Primary NAT, to assign the



Xu & Boucadair         Expires March 25, 2010                 [Page 6]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

        communicating IPv6 hosts the same IPv4 addresses as those
        assigned by the previous Primary NAT device.

   Each NAT device in a NAT redundancy group is configured with a
   different external address pool. A route to that external pool is
   then announced into the external realm by the NAT device or
   outsourced to another router.

       In the cases of NAT44 and NAT64: NAT devices are configured
        with different external IPv4 address pools (i.e., addresses to
        represent the external hosts in the internal realm) without any
        overlap. Otherwise, the same address or address/port pair,
        which was assigned to some internal host by the previous
        Primary NAT, may be occasionally assigned to a different
        internal host by the current Primary NAT, and this will cause
        some confusion. In addition, by using different external
        address pools on each NAT device, the outgoing and returning
        datagrams of a given session are ensured to always traverse the
        same NAT device (i.e., the primary NAT device) in normal cases
        except the NAT failover happens.

       In the case of NAT46, the issue occurred in NAT44 and NAT64
        cases will not happen when using the same external IPv6 address
        pool (i.e., the IPv6 prefix prefix64) due to the stateless
        address translation for the internal hosts. Hence each NAT
        device can be configured with either the same prefix64 or not.
        The case where different prefix64 is configured on distinct NAT
        devices is called as the cold standby, as opposed to the hot
        standby in which the same IPv6 prefix is used.

   In order to make IP datagrams, destined to the external realm,
   always traverse via the Primary NAT, the Primary NAT must announce
   into the internal realm a route towards the external realm. In case
   the Primary NAT and the Backup one are specified manually, the
   Backup NAT (or associated router) should announce into the internal
   realm a route towards the external realm to prepare for the failover.
   However, in order to ensure the route advertised by the Primary NAT
   (or by associated router), rather than that advertised by the Backup
   NAT, is selected as the best by the routers in the internal realm
   despite topology changes, the route advertised by the Backup NAT
   should be set at a higher enough cost or larger granularity (for
   example, the Backup NAT announces a route to 10.0.0.0/8, while the
   Primary NAT announces two more specific routes to 10.0.0.0/9 and
   10.128.0.0/9 respectively).





Xu & Boucadair         Expires March 25, 2010                 [Page 7]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   Once the connections to the external realm are lost, the route
   towards the external realm previously announced should be withdrawn.
   When the Primary NAT fails, datagrams destined to the external realm
   will pass through the Backup NAT. If the Primary NAT and the Backup
   NAT are automatically elected through a dedicated election process,
   the Backup NAT would be elected as a new Primary NAT when the old
   Primary one fails, so it is not necessary for the Backup NAT to make
   the above route announcements.

4.2. Hot Standby Mechanism

   To preserve the established sessions during the failover, in
   addition to keeping the internal addresses for the external hosts
   unchanged, the external addresses for the internal hosts should also
   be kept unchanged.

   How to meet the first requirement will not be re-iterated since it
   is similar to the cold standby mechanism (See previous sub-section).

   To meet the second requirement, NAT devices in a redundancy group
   should be configured with an identical external address pool and
   they should assign the same external address and port for the same
   internal host. In the case of NAT46, NAT devices should be
   configured with an identical prefix64. For NAT44 and NAT64, in
   addition to having the NAT devices configured with identical IPv4
   address pools, the translation state on the Primary NAT device
   should be synchronized to the Backup NAT device(s) in a timely
   fashion.

   The Primary NAT (or its associated router) announces into the
   internal realm a route towards the external realm and announces into
   the external realm a route towards the external address pool. If the
   Primary NAT and the Backup NAT are specified manually, the Backup
   NAT device (or its associated router) should also announce those
   routes, but with higher enough cost or larger granularity. Once the
   connection to either the external realm or the internal realm is
   lost, the above routes should be withdrawn (either by the primary
   NAT device itself or by a third party). When the Primary NAT fails,
   the datagrams towards the external realm will pass through the
   Backup NAT device. If the Primary NAT and the Backup are
   automatically elected through a dedicated election procedure, the
   Backup NAT would be elected as a new Primary NAT when the old
   Primary NAT device fails. Consequently, it is not necessary for the
   Backup NAT to make the above route announcement.





Xu & Boucadair         Expires March 25, 2010                 [Page 8]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

5. Load Balancing Mechanisms

   Based on the above redundancy mechanisms, one can further realize
   load balancing among a group of NAT devices. The basic idea is to
   create two redundancy groups (e.g., group A and group B) on these
   NAT devices, make one device as the Primary NAT for group A and the
   Backup NAT for group B, while make the other as the Primary NAT for
   group B and the Backup NAT for group A. Taking NAT64 as an example,
   NAT devices are configured with two IPv6 prefixes prefix64s (e.g.,
   prefix64-A and prefix64-B) corresponding to two different redundancy
   groups (e.g., group A and group B) separately, and one device is
   designated as the Primary NAT for group A and the Backup NAT for
   group B, while the other as the Backup NAT for group A and the
   Primary NAT for group B. Therefore, the IPv6 datagrams towards the
   IPv4 external realm are balanced among these NAT devices according
   to their destination addresses with different prefix64 prefixes.

   For load balancing together with cold standby, each NAT device could
   either use the same external address pool or different external
   address pools corresponding to these redundancy groups. However, in
   the case of NAT64, in order to easily determine which prefix64
   should be used for synthesizing IPv6 address of a given IPv4 host in
   the return direction, it would be better to assign different address
   pools for different redundancy groups. In this way, the prefix64 can
   be easily determined according to the destination IPv4 address in
   the return packets sent from the IPv4 host. Besides, the external
   address pools on one NAT device shouldn't have any overlap with
   those of the other NAT device. Otherwise, the same address or
   address/port pair could be assigned occasionally to different
   internal hosts. In contrast, for load balancing together with hot
   standby, different external address pools should be configured for
   these redundancy groups. Otherwise, the return packets towards the
   internal realm may be forwarded to a wrong NAT device.

6. Election Protocol Considerations

   An election process and associated protocol(s) is used to
   automatically elect one NAT device among a NAT redundancy group as
   the Primary NAT device and the others as Backup NAT devices. Once
   the Primary NAT fails, the Backup device with the highest priority
   should take over the Primary NAT role after a short delay. The
   election protocol is also used to track the connectivity to the
   external realm and the internal realm. Once connections to the
   external realm or the internal realm lost, the NAT device is not
   qualified to be the Primary NAT and it will withdraw the route
   towards the external realm announced previously. In the case of hot



Xu & Boucadair         Expires March 25, 2010                 [Page 9]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   standby, it should also withdraw the route towards the external
   address pool.

   As an implementation example, VRRP [RFC2338] can be used as the
   automatic election protocol. In addition, an interface tracking
   mechanism can also be used to adjust the priority to influence the
   election results.

   If two NAT devices are directly connected via an Ethernet network,
   VRRP can run directly on the Ethernet interfaces. Otherwise, some
   extra configuration or protocol changes need to be implemented. One
   option is to create conditions for VRRP to run among these devices.
   For example, to create a VPLS [RFC4761][RFC4762] instance and enable
   IP functions and run VRRP on those VLAN interfaces which are bound
   to that VPLS instance. If enabling IP on those interfaces is not
   supported, the following trick to realize the same goal, but at a
   cost of consuming two physical interfaces on each NAT router: create
   a VPLS instance among a set of NAT devices, and on each of them one
   Ethernet interface is bound to that VPLS instance, and another IP-
   enabled Ethernet interface is locally connected with that interface.
   Then VRRP can run on those IP enabled Ethernet interfaces which are
   all connected to that VPLS instance. Another option is to enhance
   VRRP so that VRRP neighbors can be configured manually and VRRP
   messages can be exchanged directly between two neighbors in a
   unicast fashion.

   VRRP is only an implementation example of the election process.
   Other protocols may be used to manage the roles of Primary and
   Backup.

7. State Synchronization Protocol Considerations

   [NAT-Sync] defines a candidate solution to NAT state synchronization
   by using Server Cache Synchronization Protocol (SCSP) [RFC2334]. For
   more information about the proposed solution, the reader is invited
   to refer to [NAT-Sync]

8. Security Considerations

   TBD.

9. IANA Considerations

   There are no IANA considerations for this document.





Xu & Boucadair         Expires March 25, 2010                [Page 10]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

10. Acknowledgments

   The author would like to thank Dan Wing and Dave Thaler for their
   insightful comments and reviews, and thank Dacheng Zhang and Xuewei
   Wang for their valuable editorial reviews.

11. References

11.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

11.2. Informative References

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
             Address NAT (Traditional NAT)", RFC 3022, January 2001.

   [RFC2663] Srisuresh, P. and M. Holdrege, "IP Network Address
             NAT (NAT) Terminology and Considerations", RFC
             2663, August 1999.

   [RFC2766] Tsirtsis, G. and P. Srisuresh, "Network Address
             Translation - Protocol Translation (NAT-PT)",
             RFC 2766, February 2000.

   [RFC4966] Aoun, C. and E. Davies, "Reasons to Move the Network
             Address NAT - Protocol NAT (NAT-PT) to Historic Status",
             RFC 4966, July 2007.

   [RFC2338] Knight, S., et. al., "Virtual Router Redundancy Protocol",
             RFC2338, April 1998.

   [RFC2334] Luciani, J., Armitage, G., Halpern, J., and N. Doraswamy,
             "Server Cache Synchronization Protocol (SCSP)", RFC 2334,
             April 1998.

   [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
             (VPLS) Using BGP for Auto-Discovery and Signaling",RFC
             4761, January 2007.

   [RFC4762] Lasserre, M. and Kompella, V. (Editors), "Virtual Private
             LAN Service (VPLS) Using Label Distribution Protocol (LDP)
             Signaling", RFC 4762, January 2007.





Xu & Boucadair         Expires March 25, 2010                [Page 11]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   [NAT64] Bagnulo, M., Matthews, P., and I. Beijnum, "NAT64: Network
             Address and Protocol Translation from IPv6 Clients to
             IPv4 Servers", draft-ietf-behave-v6v4-xlate-stateful-01
             (work in progress), July 2009.

   [NAT444] Shirasaki, Y., Miyakawa, S., Nakagawa, A., Yamaguchi, J.,
             and H. Ashida, "NAT444 with ISP Shared Address",
             draft-shirasaki-nat444-isp-shared-addr-00 (work in
             progress), October 2008.

   [DS-Lite] Durand, A., "Dual-stack lite broadband deployments post
             IPv4 exhaustion", draft-ietf-softwire-dual-stack-lite-01
             (work in progress), July 2009.

   [Format] Huitema, C., Bao, C., Bagnulo, M., Boucadair, M., Li, X.,
             "Framework for IPv4/IPv6 Translation", draft-ietf-behave-
             address-format-00.txt (work in progress), August, 2009.

   [Framework] Baker, F., Li,X., Bao,C., Yin,K., "Framework for
             IPv4/IPv6 Translation", draft-ietf-behave-v6v4-framework-
             01 (work in progress), September 2009.

   [LSN] Nishitani,T., Miyakawa, S., Nakagawa, A., Ashida,H., "Common
             Functions of Large Scale NAT (NAT)", draft-nishitani-cgn-
             01 (work in progress), November 2008.

   [Shortage] Levis, P., Bouacadair, M., Grimault, J-L., Villefranque,
             A., "IPv4 Address Shortage: Needs and Open Issues", draft-
             levis-behave-ipv4-shortage-framework-02 (work in progress),
             June 2009.

   [NAT-Sync] Chen, D., Xu, X., Halpern, J., Boucadair, M., "NAT State
             Synchronization Using SCSP", draft-xu-behave-nat-state-
             sync-00 (work in progress), September, 2009

Authors' Addresses

   Xiaohu Xu
   Huawei Technologies,
   No.3 Xinxi Rd., Shang-Di Information Industry Base,
   Hai-Dian District, Beijing 100085, P.R. China
   Phone: +86 10 82836073
   Email: xuxh@huawei.com


   Mohamed Boucadair



Xu & Boucadair         Expires March 25, 2010                [Page 12]


Internet-Draft        Redundancy and Load Balancing     September 2009
                        Framework for Stateful NAT

   France Telecom
   3, av Francois Chateau
   Rennes 35000
   France
   Email: mohamed.boucadair@orange-ftgroup.com











































Xu & Boucadair         Expires March 25, 2010                [Page 13]