Network Working Group                                          M. Tuexen
INTERNET DRAFT                                                Siemens AG
                                                                  Q. Xie
                                                              R. Stewart
                                                                M. Shore
                                                                  L. Ong
                                                    Point Reyes Networks
                                                             J. Loughney
                                                             M. Stillman
Expires September 1, 2002                                  March 1, 2002

                Architecture for Reliable Server Pooling

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of [RFC2026].

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at

The list of Internet-Draft Shadow Directories can be accessed at


The goal is to develop an architecture and protocols for the management
and operation of server pools supporting highly reliable applications,
and for client access mechanisms to a server pool.

A proposed architecture is presented and illustrated by examples.

Tuexen et al.                                                   [Page 1]

Internet Draft  Architecture for Reliable Server Pooling      March 2002

1.  Introduction

1.1.  Overview

The Internet is always on. Many users expect services to be always
available; many businesses depend upon connectivity 24 hours a day, 7
days a week, 365 days a year. To meet this expectation, many proprietary
and operating system dependent solutions have been developed to provide
highly reliable and highly available servers.

This document defines a proposed architecture, which can be used to
provide highly available services. The way this is achieved is by using
servers grouped into pools.  Therefore, if a client wants to access a
server pool, it will be able to use any of the servers in the server
pool taking into account the server pool policy.

Highly available services also put the same high reliability
requirements upon the transport layer protocol beneath RSerPool - it
must provide strong survivability in the face of network component
failures.

Supporting real time applications is another main focus of RSerPool
which leads to requirements on the processing time needed.

Scalability is another important requirement.

1.2.  Terminology

This document uses the following terms:

     Operation scope:
          The part of the network visible to pool users by a specific
          instance of the reliable server pooling protocols.

     Pool (or server pool):
          A collection of servers providing the same application
          functionality.

     Pool handle (or pool name):
          A logical pointer to a pool. Each server pool will be
          identifiable in the operation scope of the system by a unique
          pool handle or "name".

     Pool element:
          A server entity having registered to a pool.

     Pool user:
          A server pool user.

     Pool element handle (or endpoint handle):
          A logical pointer to a particular pool element in a pool,
          consisting of the name of the pool and a destination transport
          address of the pool element.

     Name space:
          A cohesive structure of pool names and relations that may be
          queried by an internal or external agent.

     Name server:
          Entity which is responsible for managing and maintaining the
          name space within the RSerPool operation scope.
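
As a rough illustration of how these terms relate to each other, the
following Python sketch models them as plain data types. The class and
field names, and the "host:port" address notation, are illustrative
assumptions and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PoolElementHandle:
    """Pool name plus one destination transport address of a pool element."""
    pool_handle: str   # the pool's name
    address: str       # hypothetical "host:port" notation


@dataclass
class NameSpace:
    """Flat mapping from pool handle to the pool elements registered under it."""
    pools: dict = field(default_factory=dict)

    def add(self, pe: PoolElementHandle) -> None:
        self.pools.setdefault(pe.pool_handle, set()).add(pe)

    def members(self, pool_handle: str) -> set:
        # Return a copy so callers cannot modify the name space by accident.
        return set(self.pools.get(pool_handle, set()))


ns = NameSpace()
ns.add(PoolElementHandle("File Transfer Pool", "192.0.2.1:2021"))
ns.add(PoolElementHandle("File Transfer Pool", "192.0.2.2:2021"))
```

A name server would own one such name space per operation scope; a pool
user only ever holds pool handles and the pool element handles returned
to it.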

1.3.  Abbreviations

     ASAP: Aggregate Server Access Protocol

     ENRP: Endpoint Name Resolution Protocol

     PE:   Pool element

     PU:   Pool user

     SCTP: Stream Control Transmission Protocol

     TCP:  Transmission Control Protocol

2.  Reliable Server Pooling Architecture

In this section, we discuss what a typical reliable server pool
architecture may look like.

2.1.  RSerPool Functional Components

There are three classes of entities in the RSerPool architecture:

     -    Pool Elements (PEs).

     -    Name Servers, also called ENRP Servers.

     -    Pool Users (PUs).

A server pool is defined as a set of one or more servers providing the
same application functionality. These servers are called Pool Elements
(PEs). Multiple PEs may be used in a server pool for fault tolerance or
load sharing, for example.

Each server pool will be identifiable by a unique name which is simply
an ASCII string, called the pool handle. To fulfill the performance
requirements given in [RFC3237], these names are not valid in the whole
Internet but only in smaller parts of it, each called an operation
scope. Also, the name space is flat.

The second class of entities in the RSerPool architecture is the class
of so-called name servers. These name servers can resolve a pool
handle to a list of transport layer end-point addresses of the PEs of
the server pool identified by the handle.

     Editor's note: Should we talk about UDP, TCP, SCTP specifically?

In each operation scope there must be at least one name server. Most
likely there will be more than one. All these servers have complete
knowledge about all server pools in the operation scope. The name
servers use a protocol called Endpoint Name Resolution Protocol (ENRP)
to communicate with each other to make sure that all have the same
information about the server pools.
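
The requirement that all name servers hold the same information can be
pictured with the following Python sketch. It is only a toy model under
strong assumptions: every name server is directly connected to every
other one, no update is ever lost, and a direct method call stands in
for an ENRP message. The class and method names are hypothetical.

```python
class NameServer:
    """Toy name server that mirrors every update to its peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []    # other name servers in the same operation scope
        self.pools = {}    # pool handle -> set of PE transport addresses

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def _apply(self, pool_handle, address, present):
        members = self.pools.setdefault(pool_handle, set())
        if present:
            members.add(address)
        else:
            members.discard(address)

    def update(self, pool_handle, address, present=True):
        """Apply a change locally, then push it to every peer."""
        self._apply(pool_handle, address, present)
        for peer in self.peers:
            peer._apply(pool_handle, address, present)


ns1, ns2 = NameServer("ns1"), NameServer("ns2")
ns1.connect(ns2)
ns1.update("File Transfer Pool", "192.0.2.1:2021")
ns2.update("File Transfer Pool", "192.0.2.2:2021")
```

After both updates, either server can answer a query for "File Transfer
Pool" with the same two addresses, regardless of which server received
the original registration.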

A client being served by a PE of a server pool is called a Pool User
(PU). This is the third class of entities in the RSerPool architecture.

If a PU wants to be served by a PE of a particular server pool, it must
know the pool handle of that server pool. The PU then uses the Aggregate
Server Access Protocol (ASAP) to query for transport layer addresses of
PEs belonging to the server pool identified by the pool handle.

[RFC3237] also requires that the name servers should not resolve a pool
handle to a transport layer address of a PE which is not in operation.
Therefore each PE is supervised by one specific name server, called the
home ENRP server of that PE. If it detects that the PE is out of
service, it informs all other name servers by using ENRP.

ASAP is also used by a server to join or leave a server pool. It
registers or deregisters itself by communicating with a name server,
which will normally be the home ENRP server.
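
The registration and supervision role of a home ENRP server might be
sketched as follows. This document does not say how a home server
detects that a PE is out of service; the periodic sign-of-life
("heartbeat") and the timeout used here are assumptions made purely for
illustration, as are all names. Timestamps are passed in explicitly so
that the example is deterministic.

```python
class HomeServer:
    """Toy home ENRP server: tracks registrations and the last sign of life."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a PE is dropped
        self.last_seen = {}      # (pool handle, address) -> timestamp

    def register(self, pool_handle, address, now):
        self.last_seen[(pool_handle, address)] = now

    def deregister(self, pool_handle, address):
        self.last_seen.pop((pool_handle, address), None)

    def heartbeat(self, pool_handle, address, now):
        # Only registered PEs are supervised; unknown senders are ignored.
        if (pool_handle, address) in self.last_seen:
            self.last_seen[(pool_handle, address)] = now

    def audit(self, now):
        """Drop PEs that have been silent too long; return the dropped keys."""
        dead = [k for k, t in self.last_seen.items() if now - t > self.timeout]
        for k in dead:
            del self.last_seen[k]
        return dead

    def resolve(self, pool_handle):
        return sorted(a for (p, a) in self.last_seen if p == pool_handle)


hs = HomeServer(timeout=5.0)
hs.register("File Transfer Pool", "192.0.2.1:2021", now=0.0)
hs.register("File Transfer Pool", "192.0.2.2:2021", now=0.0)
hs.heartbeat("File Transfer Pool", "192.0.2.1:2021", now=4.0)
dropped = hs.audit(now=6.0)
```

The list returned by audit() is what the home server would report to
the other name servers, so that none of them resolves the pool handle
to a PE that is no longer in operation.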

2.2.  RSerPool Protocol Overview

The RSerPool requested features can be obtained with the help of two
protocols: ENRP (Endpoint Name Resolution Protocol) and ASAP (Aggregate
Server Access Protocol).

ENRP is designed to provide a fully distributed fault-tolerant real-time
translation service that maps a name to a set of transport addresses
pointing to a specific group of networked communication endpoints
registered under that name. ENRP employs a client-server model in
which an ENRP server responds to name translation service requests
from endpoint clients running on the same host or on different hosts.

ASAP in conjunction with ENRP provides a fault tolerant data transfer
mechanism over IP networks. ASAP uses a name-based addressing model
which isolates a logical communication endpoint from its IP address(es),
thus effectively eliminating the binding between the communication
endpoint and its physical IP address(es) which normally constitutes a
single point of failure.

In addition, ASAP defines each logical communication destination as a
server pool, providing full transparent support for server-pooling and
load sharing. It also allows dynamic system scalability - members of a
server pool can be added or removed at any time without interrupting the
service.

Fault tolerant server pooling is achieved by combining two parts,
namely ASAP and ENRP. ASAP provides the user interface for name to
address translation, load sharing management, and fault management. ENRP
defines the fault tolerant name translation service. The protocol stack
used is shown in figure 1.

               *********        ***********
               * PE/PU *        *ENRP Srvr*
               *********        ***********

               +-------+        +----+----+
  To other <-->| ASAP  |<------>|ASAP|ENRP| <---To Peer ENRP
  PE/PU        +-------+        +----+----+       Name Servers
               | SCTP  |        |  SCTP   |
               +-------+        +---------+
               |  IP   |        |   IP    |
               +-------+        +---------+
                    Figure 1: Typical protocol stack
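
The name-based addressing model can be illustrated by the following
Python sketch, in which the caller addresses a pool by its handle and
never sees an IP address. The resolve and transmit callables stand in
for the name translation service and the transport layer respectively;
they, and all other names here, are hypothetical and not part of ASAP.

```python
class AsapLayer:
    """Toy ASAP layer: callers address a pool by name, never by IP address."""

    def __init__(self, resolve, transmit):
        self.resolve = resolve     # pool handle -> list of transport addresses
        self.transmit = transmit   # (address, message) -> True on success
        self.cache = {}            # pool handle -> addresses still believed usable

    def send(self, pool_handle, message):
        if pool_handle not in self.cache:
            self.cache[pool_handle] = list(self.resolve(pool_handle))
        for address in list(self.cache[pool_handle]):
            if self.transmit(address, message):
                return address
            # The transport reported a failure: forget this address, try the next.
            self.cache[pool_handle].remove(address)
        raise ConnectionError(f"no reachable element in pool {pool_handle!r}")


def demo_resolve(pool_handle):
    return ["192.0.2.1:2021", "192.0.2.2:2021"]


def demo_transmit(address, message):
    return address != "192.0.2.1:2021"   # pretend the first PE is down


asap = AsapLayer(demo_resolve, demo_transmit)
used = asap.send("File Transfer Pool", b"hello")
```

Because the application only ever names the pool, the failure of the
first address is absorbed inside the layer and the message is delivered
to another member of the same pool.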

2.3.  Typical Interactions between RSerPool Components

The following drawing shows the typical RSerPool components and their
possible interactions with each other:

  ~                                                  operation scope ~
  ~  .........................          .........................    ~
  ~  .        Server Pool 1  .          .        Server Pool 2  .    ~
  ~  .  +-------+ +-------+  .    (d)   .  +-------+ +-------+  .    ~
  ~  .  |PE(1,A)| |PE(1,C)|<-------------->|PE(2,B)| |PE(2,A)|<---+  ~
  ~  .  +-------+ +-------+  .          .  +-------+ +-------+  . |  ~
  ~  .      ^            ^   .          .      ^         ^      . |  ~
  ~  .      |      (a)   |   .          .      |         |      . |  ~
  ~  .      +----------+ |   .          .      |         |      . |  ~
  ~  .  +-------+      | |   .          .      |         |      . |  ~
  ~  .  |PE(1,B)|<---+ | |   .          .      |         |      . |  ~
  ~  .  +-------+    | | |   .          .      |         |      . |  ~
  ~  .      ^        | | |   .          .      |         |      . |  ~
  ~  .......|........|.|.|....          .......|.........|....... |  ~
  ~         |        | | |                     |         |        |  ~
  ~      (c)|     (a)| | |(a)               (a)|      (a)|     (c)|  ~
  ~         |        | | |                     |         |        |  ~
  ~         |        v v v                     v         v        |  ~
  ~         |     +++++++++++++++    (e)     +++++++++++++++      |  ~
  ~         |     + ENRP-Server +<---------->+ ENRP-Server +      |  ~
  ~         |     +++++++++++++++            +++++++++++++++      |  ~
  ~         v            ^                          ^             |  ~
  ~     *********        |                          |             |  ~
  ~     * PU(A) *<-------+                       (b)|             |  ~
  ~     *********   (b)                             |             |  ~
  ~                                                 v             |  ~
  ~         :::::::::::::::::      (f)      *****************     |  ~
  ~         : Other Clients :<------------->* Proxy/Gateway * <---+  ~
  ~         :::::::::::::::::               *****************        ~
     Figure 2: RSerPool components and their possible interactions.

In figure 2 we can identify the following possible interactions:

     (a)  Server Pool Elements <-> ENRP Server: (ASAP)

          Each PE in a pool uses ASAP to register or de-register itself
          as well as to exchange other auxiliary information with the
          ENRP Server. The ENRP Server also uses ASAP to monitor the
          operational status of each PE in a pool.

     (b)  PU <-> ENRP Server: (ASAP)

          A PU normally uses ASAP to request a name-to-address
          translation from the ENRP Server before the PU can send user
          messages addressed to a server pool by the pool's name.

     (c)  PU <-> PE: (ASAP)

          ASAP can be used to exchange some auxiliary information of the
          two parties before they engage in user data transfer.

     (d)  Server Pool <-> Server Pool: (ASAP)

          A PE in a server pool can become a PU to another pool when the
          PE tries to initiate communication with the other pool. In
          such a case, the interactions described in (b) and (c) above
          will apply.

     (e)  ENRP Server <-> ENRP Server: (ENRP)

          ENRP can be used to fulfill various Name Space operation,
          administration, and maintenance (OAM) functions.

     (f)  Other Clients <-> Proxy/Gateway: standard protocols

          The proxy/gateway enables clients ("other clients"), which are
          not RSerPool aware, to access services provided by an RSerPool
          based server pool. It should be noted that these
          proxies/gateways may become a single point of failure.

3.  Examples

In this section the basic concepts of ENRP and ASAP will be illustrated.
First, an RSerPool aware FTP server is considered, and its interaction
with both an RSerPool aware and an unaware client is given. Finally, a
telephony example is considered.

3.1.  Two File Transfer Examples

In this section we present two separate file transfer examples using
ENRP and ASAP: one with an ENRP/ASAP aware client, and one with a client
that uses a Proxy or Gateway to perform the file transfer. Both use an
FTP [RFC959] model with some modifications. The first example (the
RSerPool aware one) modifies FTP concepts so that the file transfer
takes place over SCTP. The second example uses TCP between the unaware
client and the Proxy. The Proxy itself uses the modified FTP with
RSerPool as illustrated in the first example.

Please note that in the examples we do NOT follow FTP [RFC959] precisely
but use FTP-like concepts and attempt to adhere to the basic FTP model.
These examples use FTP for illustrative purposes; FTP was chosen since
many of its basic concepts are well known and should be familiar to
readers.

3.1.1.  The RSerPool Aware Client

~                                                  operation scope ~
~  .........................                                       ~
~  . "File Transfer Pool"  .                                       ~
~  .  +-------+ +-------+  .                                       ~
~ +-> |PE(1,A)| |PE(1,C)|  .                                       ~
~ |.  +-------+ +-------+  .                                       ~
~ |.      ^            ^   .                                       ~
~ |.      +----------+ |   .                                       ~
~ |.  +-------+      | |   .                                       ~
~ |.  |PE(1,B)|<---+ | |   .                                       ~
~ |.  +-------+    | | |   .                                       ~
~ |.      ^        | | |   .                                       ~
~ |.......|........|.|.|....                                       ~
~ |  ASAP |    ASAP| | |ASAP                                       ~
~ |(d)    |(c)     | | |                                           ~
~ |       v        v v v                                           ~
~ |   *********   +++++++++++++++                                  ~
~ + ->* PU(X) *   + ENRP-Server +                                  ~
~     *********   +++++++++++++++                                  ~
~         ^     ASAP     ^                                         ~
~         |     <-(b)    |                                         ~
~         +--------------+                                         ~
~               (a)->                                              ~
           Figure 3: Architecture for RSerPool aware client.

To effect a file transfer the following steps would take place.

     (1)  The application in PU(X) would send a login request. The
          PU(X)'s ASAP layer would send an ASAP request to its ENRP
          server to request the list of pool elements (using (a)). The
          pool handle to identify the pool would be "File Transfer
          Pool". The ASAP layer queues the login request.

     (2)  The ENRP server would return a list of the three PEs PE(1,A),
          PE(1,B) and PE(1,C) to the ASAP layer in PU(X) (using (b)).

     (3)  The ASAP layer selects one of the PEs, for example PE(1,B). It
          transmits the login request and the other FTP control data,
          and finally starts the transmission of the requested files
          (using (c)). For this the multiple stream feature of SCTP
          could be used.

     (4)  If PE(1,B) fails during the file transfer conversation, and
          assuming the PEs were sharing the state of the file transfer,
          a fail-over to PE(1,A) could be initiated. PE(1,A) would
          continue the
          transfer until complete (see (d)). In parallel a request from
          PE(1,A) would be made to ENRP to request a cache update for
          the server pool "File Transfer Pool" and a report would also
          be made that PE(1,B) is non-responsive (this would cause
          appropriate audits that may remove PE(1,B) from the pool if
          the ENRP servers had not already detected the failure) (using
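
The fail-over in step (4) depends on the PEs sharing the state of the
transfer. The Python sketch below reduces that shared state to a single
number, the index of the next block to send, and shows that a transfer
started by one PE can be completed by another. How PEs really share
state is not described in this document; the function, its parameters
and the block-based model are assumptions made for illustration.

```python
def transfer(chunks, pes, fails_at):
    """Send chunks in order; switch PE when the current one fails.

    chunks:   list of data blocks making up the file
    pes:      PE names in the order returned by the name server
    fails_at: dict mapping a PE name to the chunk index at which it dies
    Returns (received data, log of (chunk index, serving PE)).
    """
    received, log = [], []
    candidates = list(pes)
    current = candidates.pop(0)
    index = 0                             # shared state: next chunk to send
    while index < len(chunks):
        if fails_at.get(current) == index:
            if not candidates:
                raise ConnectionError("all pool elements failed")
            current = candidates.pop(0)   # fail over; index is preserved
            continue
        received.append(chunks[index])
        log.append((index, current))
        index += 1
    return b"".join(received), log


data, log = transfer(
    chunks=[b"ab", b"cd", b"ef"],
    pes=["PE(1,B)", "PE(1,A)", "PE(1,C)"],
    fails_at={"PE(1,B)": 1},
)
```

In this run PE(1,B) serves the first block and then fails; PE(1,A)
resumes at the second block, so the pool user receives the complete
file without restarting the transfer.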

3.1.2.  The RSerPool Unaware Client

In this example we investigate the use of a Proxy server, assuming the
same scenario as illustrated above.

~                                                  operation scope ~
~  .........................                                       ~
~  . "File Transfer Pool"  .                                       ~
~  .  +-------+ +-------+  .                                       ~
~  .  |PE(1,A)| |PE(1,C)|  .                                       ~
~  .  +-------+ +-------+  .                                       ~
~  .      ^            ^   .                                       ~
~  .      +----------+ |   .                                       ~
~  .  +-------+      | |   .                                       ~
~  .  |PE(1,B)|<---+ | |   .                                       ~
~  .  +-------+    | | |   .                                       ~
~  .......^........|.|.|....                                       ~
~         |        | | |                                           ~
~         |    ASAP| | |ASAP                                       ~
~         |        | | |                                           ~
~         |        v v v                                           ~
~         |       +++++++++++++++          +++++++++++++++         ~
~         |       + ENRP-Server +<--ENRP-->+ ENRP-Server +         ~
~         |       +++++++++++++++          +++++++++++++++         ~
~         |                                ASAP   ^                ~
~         |     ASAP       (c)                (b) |  ^             ~
~         +---------------------------------+  |  |  |             ~
~                                           |  v  | (a)            ~
~                                           v     v                ~
~         :::::::::::::::::     (e)->     *****************        ~
~         :   FTP Client  :<------------->* Proxy/Gateway *        ~
~         :::::::::::::::::     (f)       *****************        ~
          Figure 4: Architecture for RSerPool unaware client.

In this example the following steps will occur:

     (1)  The FTP client and the Proxy/Gateway use the TCP-based FTP
          protocol. The client sends the login request to the proxy
          (using (e)).

     (2)  The proxy behaves like a client and performs the actions
          described under (1), (2) and (3) of the above description
          (using (a), (b) and (c)).

     (3)  The FTP communication continues and will be translated by the
          proxy into the RSerPool aware dialect. This interworking uses
          (f) and (c).

Note that in this example high availability is maintained between the
Proxy and the server pool, but a single point of failure exists between
the FTP client and the Proxy, i.e., the TCP control connection and the
single IP address it uses for commands.
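
The relaying role of the proxy might be sketched in Python as follows.
The proxy has one side facing the unaware client and one side facing
the pool; the pool-facing side is represented by a callable that a real
implementation would back with ASAP. The class, the callable and the
reply strings are hypothetical and only loosely FTP-like.

```python
class Proxy:
    """Toy proxy: one legacy-facing side, one pool-facing side."""

    def __init__(self, pool_handle, pool_send):
        self.pool_handle = pool_handle
        self.pool_send = pool_send   # (pool handle, command) -> reply

    def handle_client_command(self, line):
        """Take one command line from the unaware client and relay it."""
        command = line.strip()
        if not command:
            return "500 empty command"
        return self.pool_send(self.pool_handle, command)


def fake_pool_send(pool_handle, command):
    # Stand-in for the RSerPool aware side; a real one would use ASAP.
    return f"200 {command.split()[0]} handled by {pool_handle}"


proxy = Proxy("File Transfer Pool", fake_pool_send)
reply = proxy.handle_client_command("USER anonymous\r\n")
```

Everything behind pool_send can fail over between PEs, whereas the
client only ever talks to this one proxy object, which is where the
single point of failure noted above comes from.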

3.2.  Telephony Signaling Example

This example shows the use of ASAP/RSerPool to support server pooling
for high availability of a telephony application such as a Voice over IP
Gateway Controller (GWC) and Gatekeeper services (GK).

In this example, we show two different scenarios of deploying these
services using RSerPool in order to illustrate the flexibility of the
RSerPool architecture.

3.2.1.  Decomposed GWC and GK Scenario

In this scenario, both GWC and GK services are deployed as separate
pools with some number of PEs, as shown in the following diagram. Each
pool will register its unique pool handle (i.e., name) with the ENRP
Server. We also assume that a Signaling Gateway (SG) and a Media
Gateway (MG) are present and that both are RSerPool aware.

                           .    Gateway      .
                           . Controller Pool .
    .................      .   +-------+     .
    .   Gatekeeper  .      .   |PE(2,A)|     .
    .     Pool      .      .   +-------+     .
    .   +-------+   .      .   +-------+     .
    .   |PE(1,A)|   .      .   |PE(2,B)|     .
    .   +-------+   .      .   +-------+     .
    .   +-------+   . (d)  .   +-------+     .
    .   |PE(1,B)|<------------>|PE(2,C)|<-------------+
    .   +-------+   .      .   +-------+     .        |
    .................      ........^..........        |
                                   |                  |
                                (c)|               (e)|
                                   |                  v
        +++++++++++++++        *********       *****************
        + ENRP-Server +        * SG(X) *       * Media Gateway *
        +++++++++++++++        *********       *****************
               ^                   ^
               |                   |
               |     <-(a)         |

             Figure 5: Deployment of Decomposed GWC and GK.

As shown in figure 5, the following sequence takes place:

     (1)  the Signaling Gateway (SG) receives an incoming signaling
          message to be forwarded to the GWC. SG(X)'s ASAP layer would
          send an ASAP request to its "local" ENRP server to request the
          list of pool elements (PE's) of GWC (using (a)). The key used
          for this query is the pool handle of the GWC. The ASAP layer
          queues the data to be sent in local buffers until the ENRP
          server responds.

     (2)  the ENRP server would return a list of the three PE's A, B and
          C to the ASAP layer in SG(X) together with information to be
          used for load-sharing traffic across the gateway controller
          pool (using (b)).

     (3)  the ASAP layer in SG(X) will select one PE (e.g., PE(2,C)) and
          send the signaling message to it (using (c)). The selection is
          based on the load sharing information of the gateway
          controller pool.

     (4)  to progress the call, PE(2,C) finds that it needs to talk to
          the Gatekeeper. Assuming it already has the gatekeeper pool's
          information in its local cache (e.g., obtained and stored from
          a recent query to the ENRP Server), PE(2,C) selects PE(1,B)
          and sends the call control message to it (using (d)).

          We assume PE(1,B) responds back to PE(2,C) and authorizes the
          call to proceed.

     (5)  PE(2,C) issues media control commands to the Media Gateway
          (using (e)).
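
The selection in step (3) is based on load sharing information returned
by the ENRP server, but this document does not define the selection
rule. The Python sketch below assumes one simple possibility, choosing
the PE with the lowest reported load; the load values and the function
name are invented for the example.

```python
def select_pe(candidates):
    """Pick the PE with the lowest reported load; ties go to the first name."""
    if not candidates:
        raise ValueError("empty pool")
    # Sorting the names first makes the tie-break deterministic.
    return min(sorted(candidates), key=lambda name: candidates[name])


# Hypothetical load figures (fraction of capacity in use) for the GWC pool.
gwc_pool = {"PE(2,A)": 0.70, "PE(2,B)": 0.55, "PE(2,C)": 0.20}
chosen = select_pe(gwc_pool)
```

With these figures the sketch selects PE(2,C), which matches the choice
assumed in the sequence above. Other policies, such as round robin or
weighted random selection, would fit the same interface.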

RSerPool provides service robustness if some failure occurs in the
system.

For instance, if PE(1, B) in the Gatekeeper Pool crashed after receiving
the call control message from PE(2, C) in step (4) above, what most
likely will happen is that, due to the absence of a reply from the
Gatekeeper, a timer expiration event will trigger the call state machine
within PE(2, C) to resend the control message. The ASAP layer at PE(2,
C) will then notice the failure of PE(1, B) through (likely) the
endpoint unreachability detection by the transport protocol beneath ASAP
and automatically deliver the re-sent call control message to the
alternate GK pool member PE(1, A). With appropriate intra-pool call
state sharing support, PE(1, A) will be able to correctly handle the
call and reply to PE(2, C) and hence progress the call.
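
The recovery sequence just described can be traced with the following
Python sketch. It collapses the call state machine's timer and the
transport's unreachability detection into one loop iteration per
attempt, which is a simplification; the function and parameter names
are hypothetical.

```python
def progress_call(members, crashed, max_attempts=3):
    """Return (replying member, attempts) for one call control message.

    members: GK pool members in the order the sender prefers them
    crashed: set of members that will never reply
    """
    unreachable = set()   # filled in by transport feedback
    for attempt in range(1, max_attempts + 1):
        target = next((m for m in members if m not in unreachable), None)
        if target is None:
            break
        if target not in crashed:
            return target, attempt   # reply arrived before the timer fired
        # No reply: the timer expires, and by then the transport has marked
        # the target unreachable, so the resend goes to another member.
        unreachable.add(target)
    raise TimeoutError("no gatekeeper answered")


replier, attempts = progress_call(
    members=["PE(1,B)", "PE(1,A)"], crashed={"PE(1,B)"})
```

Here the first attempt goes to the crashed PE(1,B) and times out; the
resent message is delivered to PE(1,A), which answers on the second
attempt.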

3.2.2.  Collocated GWC and GK Scenario

In this scenario, the GWC and GK services are collocated (e.g., they are
implemented as a single process). In such a case, one can form a pool
that provides both GWC and GK services as shown in figure 6 below.

     .  Gateway Controller/Gatekeeper Pool  .
     .                  +-------+           .
     .                  |PE(3,A)|           .
     .                  +-------+           .
     .           +-------+                  .
     .           |PE(3,C)|<---------------------------+
     .           +-------+                  .         |
     .    +-------+  ^                      .         |
     .    |PE(3,B)|  |                      .         |
     .    +-------+  |                      .         |
     ................|.......................         |
                     |                                |
                     +-------------+                  |
                                   |                  |
                                (c)|               (e)|
                                   v                  v
        +++++++++++++++        *********       *****************
        + ENRP-Server +        * SG(X) *       * Media Gateway *
        +++++++++++++++        *********       *****************
               ^                   ^
               |                   |
               |     <-(a)         |

             Figure 6: Deployment of Collocated GWC and GK.

The same sequence as described in 3.2.1 takes place, except that step
(4) now becomes internal to PE(3,C) (again, we assume Server C is
selected by the SG).

4.  Acknowledgements

The authors would like to thank Bernard Aboba, Matt Holdrege,
Christopher Ross, Werner Vogels and many others for their invaluable
comments and suggestions.

5.  References

[RFC793]    J. B. Postel, "Transmission Control Protocol", RFC 793,
            September 1981.

[RFC959]    J. B. Postel, J. Reynolds, "File Transfer Protocol (FTP)",
            RFC 959, October 1985.

[RFC2026]   S. Bradner, "The Internet Standards Process -- Revision 3",
            RFC 2026, October 1996.

[RFC2608]   E. Guttman et al., "Service Location Protocol, Version 2",
            RFC 2608, June 1999.

[RFC2719]   L. Ong et al., "Framework Architecture for Signaling
            Transport", RFC 2719, October 1999.

[RFC2960]   R. R. Stewart et al., "Stream Control Transmission
            Protocol", RFC 2960, November 2000.

[RFC3237]   M. Tuexen et al., "Requirements for Reliable Server
            Pooling", RFC 3237, January 2002.

6.  Authors' Addresses

Michael Tuexen                Tel.:   +49 89 722 47210
Siemens AG                    e-mail: Michael.Tuexen@icn.siemens.de
D-81359 Munich

Qiaobing Xie                  Tel.:   +1 847 632 3028
Motorola, Inc.                e-mail: qxie1@email.mot.com
1501 W. Shure Drive, #2309
Arlington Heights, Il 60004

Randall Stewart               Tel.:   +1 815 477 2127
Cisco Systems, Inc.           e-mail: rrs@cisco.com
24 Burning Bush Trail
Crystal Lake, Il 60012

Melinda Shore                 Tel.:   +1 607 272 7512
Cisco Systems, Inc.           e-mail: mshore@cisco.com
809 Hayts Rd
Ithaca, NY 14850

Lyndon Ong                    Tel.:   +1 408 321 8237
Point Reyes Networks          e-mail: long@pointreyesnet.com
1991 Concourse Drive
San Jose, CA

John Loughney                 Tel.:
Nokia Research Center         e-mail: john.loughney@nokia.com
PO Box 407
FIN-00045 Nokia Group

Maureen Stillman              Tel.:   +1 607 273 0724 62
Nokia                         e-mail: maureen.stillman@nokia.com
127 W. State Street
Ithaca, NY 14850

             This Internet Draft expires September 1, 2002.
