INTERNET-DRAFT                                    J. Cohen, N. Phadnis
<draft-vinod-carp-v1-01.txt>                   Netscape Communications
                                                    Vinod Valloppillil
                                                 Microsoft Corporation
                                                         Keith W. Ross
                                            University of Pennsylvania
                                                     29 September 1997
                                                    Expires March 1998


                Cache Array Routing Protocol v1.1

Status of this Memo

 This document is an Internet-Draft.  Internet-Drafts are working
 documents of the Internet Engineering Task Force (IETF), its areas,
 and its working groups.  Note that other groups may also distribute
 working documents as Internet-Drafts.

 Internet-Drafts are draft documents valid for a maximum of six months
 and may be updated, replaced, or obsoleted by other documents at any
 time.  It is inappropriate to use Internet-Drafts as reference
 material or to cite them other than as ``work in progress.''

 To learn the current status of any Internet-Draft, please check the
 ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
 ftp.isi.edu (US West Coast).

Abstract

  This draft documents proposed changes to the original version of
  the draft with this same name, to be called 'the Cache Array Routing
  Protocol (CARP) v1.1'  for dividing URL-space among an array of loosely
  coupled proxy  servers.

  The modified sections are 3.1 and 3.2.  Aside from those two
  sections, the drafts remains the same.

  An overview of the changes are presented in section 0.

  An HTTP client agent (either a proxy server or a client browser)
  which implements CARP v1.1 can allocate and intelligently route
  requests for the correct URLs to any member of the Proxy
  Array.  Due to the resulting sorting of requests through these
  proxies, duplication of cache contents is eliminated and global
  cache hit rates are improved.

Valloppillil                                           [Page 1]


INTERNET-DRAFT   Cache Array Routing Protocol v1.1    30 June 1997

Table of Contents

  0.  Proposed Change Overview........................ +
  1.  Overview........................................ 2
  2.  Proxy Array Membership Table.................... 3
      2.1  Global Information......................... 3
      2.2  Member Information......................... 4
  3.  Routing Function................................ 5
      3.1  Hash Function.............................. 5
      3.2  Hash Combination........................... 6
      3.3  Load Factor................................ 6
      3.4  Route Selection............................ 7
      3.5  Member Failure Routing..................... 7
  4.  Client-Side Implementation...................... 7
  5.  Versioning...................................... 7
  6.  Security Considerations......................... 7
  7.  Open Issues..................................... 7
  8.  Acknowledgements................................ 7
  9.  References...................................... 8
  10. Author's Information............................ 8


INTERNET-DRAFT   Cache Array Routing Protocol v1.1        30 June 1997

0. Proposed Change Overview

  Change in hash functions
  ------------------------
  The following changes are to make hash functions faster, better
  (resulting in more uniform distribution), and easier to evaluate
  (especially in JavaScript on client side).

  1.  Convert a URL to lower case to evaluate hash value to avoid parsing
  the URL. This makes hashing faster at client as well as server side.

  2.  Use shift instead of rotate operator. It is four times faster and
  gives more uniform distribution based on our experiments.

  3.  Additive multiplication is replaced with a simple multiplication.
  Additive effect can be achieved simply by adding 1 to the multiplier
  constant if required. Since the multiplier constant is anyway a set of
  random bits, adding 1 to it has no effect toward improving distribution.

  4.  The second rotate operation in evaluation of member proxy hash
  values as well as resultant hash value is eliminated. It is a relatively
  expensive operation and does not help improve distribution after the
  multiplication.


1. Overview

  The Cache Array Routing Protocol describes a distributed caching
  protocol based on

        1) a known membership list of loosely coupled proxies and
        2) a hash function for dividing URL space among those proxies

  The Proxy Array Membership Table is defined as a plain ASCII text
  file retrieved from an Array Configuration URL.  This document does
  NOT describe how this table is constructed, merely the format of
  the fields used by agents implementing.

  The hash function plus routing algorithm defined in this document
  take member proxies described in the Proxy Array Membership Table
  and make an on-the-fly determination as to which Proxy Array member
  should be the proper receptacle for a cached version of a resource
  keyed by URL.

  Downstream agents may then access the cached resource by forwarding
  the proxied HTTP request [5] for that resource to the appropriate
  member of the Proxy Array.

Valloppillil                                           [Page 3]


INTERNET-DRAFT   Cache Array Routing Protocol v1.1        30 June 1997

2. Proxy Array Membership Table

  The Proxy Array Membership Table is a plain-text ASCII file which
  can be published from a URL.

  The format of the table is:

  Proxy Array Information/<Version number>
  ArrayEnabled:  <0 | 1>
  ConfigID:  <opaque string>
  ArrayName:  <opaque string>
  ListTTL:  <minutes until next check>

  <name> <IP addr> <listening port> <table URL> <agent str>
  <statetime> <status UP | DOWN> <load factor> <cache size>

2.1 Global Information

  These are fields that describe the array itself and are not specific
  to any one member of an array

  Global information is terminated in the Proxy Array Membership Table
  by a CR/LF/CR/LF.

2.1.1 Version number

  The version number for implementations of this specification is
  1.0

2.1.2 ArrayEnabled

  This field allows proxies to advertise their implementation of CARP i
  v1 even if they are not members of a Proxy Array.

2.1.3 ConfigID

  ConfigID is an opaque number no larger than 32bits similar to an
  ETag in HTTP 1.1.  It is used to track the current state of an
  Array table and may be used to match multiple yet independently
  published copies of the Proxy Array Membership Table.

2.1.4 ArrayName

  ArrayName is an opaque string which is used to provide a convenient
  administrative name for a given array.

2.1.5 ListTTL

  ListTTL is the number of seconds for which an HTTP client entity
  should consider the current table image valid.  After ListTTL
  has expired, that client should retrieve a new copy of
  the Proxy Array Membership Table.

Valloppillil                                           [Page 4]


INTERNET-DRAFT   Cache Array Routing Protocol v1.1        30 June 1997

2.2 Member Information

  The following fields are published per member in an array and are
  separated by single spaces.  The end of an array member's record is
  terminated by a CR/LF.

2.2.1 Name

  The name of the proxy server.  Typically this is the fully qualified
  DNS name.  Downstream HTTP agents should use resolution of this name
  to determine how to connect to this proxy.

2.2.2 IP Addr

  The IP address that other proxy servers within this array should use
  to connect to this proxy server.  This is necessary for proxy
  servers which may be hosted on multi-hommed servers where requests
  are only accepted by one of the interfaces.

2.2.3 Listening Port

  The TCP port number this proxy is expecting requests on.

2.2.4 Table URL

  A URL which may be maintained by this proxy server on which a copy
  of the array membership table can be found.

2.2.5 Agent String

  An opaque string identifying the vendor / version of the
  proxy Server in the Array Membership Table.

2.2.7 Statetime

  How long a Proxy Server has been in its current state and has been
  a member of this table.  This is useful for dynamic generation of
  the Array Membership Table where the host generating the table has
  knowledge of the proxy's operational status.

2.2.8 Status

  Status provides a simple text string indicating whether a member
  proxy is currently able to handle requests (UP) or refused a
  connection when last contacted (DOWN).

Valloppillil                                           [Page 5]


INTERNET-DRAFT   Cache Array Routing Protocol v1.1        30 June 1997

2.2.9 Load Factor

  Load Factor is a relative amount of the total load for an array that
  should be handled by any given member of the array.

2.2.10 Cache Size

  Cache size is an informational field that indicates the size of the
  cache held by a particular member of an array.

3. Routing Function

  Once an agent has a Proxy Array Membership Table.  It uses a
  mathematical hash function to determine which of the members of
  the array should be the receptacle of a particular URL request.

  This routing function involves constructing n "scores" using a hash
  of the request URL plus a hash of each of the n proxies in the Proxy
  Array Membership Table.

  Both the URL and the proxy names are hashed in order to minimize the
  disruption of target routes if a member of the target array can't
  be contacted.

  Hashes of the URL and proxy name are constructed using the algorithm
  described in 3.1 and combined using the algorithm described in 3.2.

3.1. Hash Function

  The hash function outputs a 32 bit unsigned integers based on a
  zero-terminated ASCII input string.  Machine names as well as URLs
  should be converted to lower case for hash evaluation. This makes sense
  because machine names are case insensitive.

  Because irreversibility and strong cryptographic features are
  unnecessary for this application, a very simple and fast hash
  function based on the bitwise left shift operator is used.

  For (each char in URL):
        URL_Hash += (URL_Hash << 9) + char ;

  Member proxy hashes are computed in a similar manner:

  For (each char in MemberProxyName):
        MemberProxy_Hash +=   (MemberProxy_Hash << 9) + char ;

  Becaues member names are often similar to each other, their hash
  values are further spread across hash space via a multiplication:

  MemberProxy_Hash = MemberProxy_Hash * 0x62531965 ;


Valloppillil                                           [Page 6]


INTERNET-DRAFT   Cache Array Routing Protocol v1.1        30 June 1997

3.2. Hash Combination

  Hashes are combined by first exclusive or-ing (XOR) the URL hash by
  the machine name and then multiplying by a constant.

  All final and intermediate values are 32 bit unsigned integers.

  Combined_Hash = (URL_hash ^ MemberProxy_Hash) ;
  Combined_Hash = Combined_Hash * 0x62531965 ;

3.3. Load Factor

  Support for array members with differing HTTP processing & caching
  capacity is achieved by multiplying each of the combined hash values
  by a Load Factor Multiplier.

  The Load Factor Multiplier for an individual member is calculated by
  taking each member's relative Load Factor and applying the
  following formula:

  For each proxy server 1,...,K, the Load Factor Multiplier, X_k, is
  calculated iteratively as follows:

  X_1 = arbitrary positive constant

  X_k = ([K-k+1] * [P_k - P_{k-1}])/(X_1 * X_2 * ... * X_{k-1})
  X_k += (X_{k-1}^{K-k+1})
  X_k = X_k^{1/(K-k+1)}

  where:

  X_k = Load Factor Multiplier for proxy k
  K = number of proxies in an array
  P_k = relative percent of the load that proxy k should handle

  This is then combined with the previously computed hashes as

  Resultant_value = Combined_Hash * X_k


Valloppillil                                           [Page 7]


INTERNET-DRAFT   Cache Array Routing Protocol v1.1        30 June 1997

3.4. Route Selection

  The "score" for a particular combination of URL plus proxy is its
  resultant value. Once the agent determines of the scores of the
  K proxies, it routes the URL query to the proxy with the hightest
  score.

3.5. Member Failure Routing

  If a proxy can not contact the designated member of a proxy array
  in order to forward an HTTP request, that proxy should route
  the request to the second highest scoring proxy in the target array.

4. Client-side implementation

  CARP can be implemented on client-side HTTP browsers via the
  use of the Proxy AutoConfig file described in [1] and [2].

5. Versioning

  If a downstream proxy receives an Array Membership Table with a
  greater version # than that proxy is able to parse, it should
  fall back to simple proxy request routing to any administrator
  defined upstream proxy server.

6. Security Considerations

  This draft does not discuss relevant security considerations.

7. Open Issues

8. Acknowledgements

  The author would like to thank Brian Smith, Kip Compton, and
  Kerry Schwartz for their assistance in preparing this document.

  Most of the architecture & design of CARP stem from work conducted
  by Brian Smith at Microsoft Corp.

Valloppillil                                           [Page 8]

INTERNET-DRAFT   Cache Array Routing Protocol v1.1       30 June 1997

9. References

  [1] Luotonen, Ari., "Navigator Proxy Auto-Config File Format",
 Netscape Corporation, http://home.netscape.com/eng/mozilla/2.0/
 relnotes/demo/proxy-live.html, March 1996.

  [2] Microsoft Corporation., "Automatic Proxy Configuration",
 http://www.microsoft.com/ie/ieak/autosys.htm, March 21, 1997.

  [3] Wessels, Duane., "Internet Cache Protocol Version 2", http://ds.
 internic.net/internet-drafts/draft-wessels-icp-v2-00.txt, March 21,
 1997.

  [4] Sharp Corporation., "Super Proxy Script",
 http://naragw.sharp.co.jp/sps/, August 9, 1996.

  [5] Fielding, R., et. al, "Hypertext Transfer Protocol -- HTTP/1.1",
 RFC 2068, UC Irvine, January 1997.

  [6] Valloppillil & Cohen, "Hierarchical HTTP Routing Protocol",
 http://ircache.nlanr.net/Cache/ICP/draft-vinod-icp-traffic-dist-00.txt,
 April 21, 1997.

10.  Author Information

    Josh Cohen <josh@netscape.com>
    Nilkanth Phadnis <phadnis@netscape.com>
    Netscape Communications
    501 E. Middlefield Rd
    Mountain View, Ca 94043

    Vinod Valloppillil
    Microsoft Corporation
    One Microsoft Way
    Redmond, WA 98052

    Phone:  1.206.703.3460
    Email:  VinodV@Microsoft.Com

    Keith W. Ross
    University of Pennsylvania
    Department of Systems Engineering
    Philadelphia, PA  19104

    Phone:  1.215.898.6069
    Email:  Ross@UPenn.Edu

Expires December 1997