Internet Draft                                               Danny Cohen
                                                                 Myricom
Expires in six months                                         Craig Lund
                                                       Mercury Computers
                              Tony Skjellum, Thom McMahon, Robert George
                                            Mississippi State University
                                                               June 1998

           The Router-to-Router (RRP) PacketWay Protocol for
         High-Performance Interconnection of Computer Clusters

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).




Table of Contents

   Introduction .....................................................  2
   Notations ........................................................  2
   PacketWay and IP .................................................  3
   Node Attributes ..................................................  3
   RRP Messages .....................................................  4
   Structure of RRP Messages ........................................  5
   RRP Records ......................................................  8
   Example .......................................................... 12
   Glossary ......................................................... 16
   Acronyms and Abbreviations ....................................... 18
   Editor's Address ................................................. 19











Cohen et al.                  Experimental                      [Page 1]


Internet-Draft               PacketWay RRP                      May 1998



Introduction

   The PacketWay family of protocols is introduced in "The End-to-End
   (EEP) PacketWay Protocol for High-Performance Interconnection of
   Computer Clusters".  This document defines the Router-to-Router
   Protocol (RRP), the basic messages used by routers to exchange
   routing information with endpoints and with each other.

   In the PacketWay model a router is a set of cooperating hosts on two
   (or more) networks.  These hosts, each a full-fledged host on its
   SAN, are called "half-routers" (HRs).  RRP defines, via message
   structure and behavior, the interactions between HRs as well as the
   interactions between HRs and nodes.

   RRP does not define the lower-level protocols that deliver its
   messages.  Nor does RRP define the connection between the HRs within
   a router; this is left to mutual agreement among the implementors of
   each HR.

   The intra-router communication among these hosts is nevertheless a
   "public" issue, handled according to RRP, which defines only the
   Network level [Level-3] of this communication and not the lower
   levels.  All RRP messages are carried in EEP packets with the
   "Packet-Type" field of the EEP header set to "RRP".

   This document does not define how source routes are initially
   constructed.  It is expected that static tables may be manually
   maintained for simple or very stable systems.  Dynamic table-
   maintenance protocols will likely be outlined in a future document.

Notations

   8B    means "8-byte" (64 bits).

   0x    indicates hexadecimal values, e.g., 0x0100 is 2^8=256
         (decimal).

   0b    indicates binary values, e.g., 0b0100 is 4 (decimal).

   xxxx  indicates a field that is discarded without any checking (e.g.,
         padding).

   [exp] in equations, is the integral part, rounded down, of `exp`
         (e.g., [23/8]=2).

   Length fields do not include themselves, and therefore may be 0.
   Lengths are specified either (a) by byte count, implying that some
   padding bytes may follow to fill 8B-words, or (b) by 8B-word count
   and PL, the number of trailing padding bytes (with PL between 0 and
   7).
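
   As an illustration of form (b), the following sketch converts
   between a byte count and an (8B-word count, PL) pair.  The function
   names are hypothetical and are not part of RRP; the sketch only
   restates the arithmetic described above.

```python
def words_and_padding(nbytes):
    """Return (8B-word count, PL) needed to hold nbytes of data."""
    words = (nbytes + 7) // 8      # round up to whole 8B-words
    pl = words * 8 - nbytes        # trailing padding bytes, 0..7
    return words, pl


def data_bytes(words, pl):
    """Inverse: recover the byte count from an 8B-word count and PL."""
    if not 0 <= pl <= 7:
        raise ValueError("PL must be between 0 and 7")
    return words * 8 - pl
```

   For example, 23 data bytes need 3 8B-words with PL=1.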


PacketWay and IP

   The architecture of PacketWay is very similar to that of the IP
   family (in fact it borrows heavily from IP), but it emphasizes
   performance rather than the generality and scalability that were
   chosen for IP.

   Like IP, PacketWay is based on an End-to-End protocol (EEP) that
   assumes that if an address (or an equivalent specification of the
   destination) is placed in the appropriate field of the packet
   header, then the packet will arrive at that destination.  Neither IP
   nor EEP specifies how this happens.

   Routers are responsible for transferring packets from their source
   networks to their destination networks (possibly via other networks).
   The communication among the routers (such as the entire family of
   GGPs [Gateway/Gateway Protocols], as they were originally called) is
   NOT a part of IP (as defined originally in RFC-791 and MIL-STD-1777).
   Similarly, it is not a part of EEP.

   Like the IP family, PacketWay defines separately its Router-to-Router
   Protocol (RRP), in a device- and network-independent manner.

   However, the model of routers in PacketWay is slightly different from
   the original model in the IP family.  IP routers (or gateways as they
   were called then) are monolithic devices, provided by their vendors.
   Each IP-router is a bona-fide host on two (or more) networks.  The
   communication among these intra-router hosts is an internal "private"
   issue, handled by each vendor as it sees fit, not subject to pub-
   lished standards.

Node Attributes

   Each node must have a Physical Address.  Optionally it may also have
   Name, Capabilities, and Logical-Addresses:

   Physical Address    23 bits, flat, unique in this PacketWay.

   Name                flat, globally unique (e.g., IP address), arbi-
                       trary length

   Capabilities        regular GP node, router, PacketWay-server, NFS,
                       paging server, M/C server, SRVLOC-server, DSP,
                       printer, etc.

                       Some capabilities may need additional parameters
                       (e.g., SAN-ID for routers, and resolution+colors
                       for printers).  Their parameters are capability-
                       specific.  The capabilities are defined in the
                       PacketWay Enumeration document.




   Logical-Addresses   a set of (logical) addresses to which this node
                       requests to listen.  Logical addresses designate
                       multicast and broadcast groups.

                       The control and management of Logical-Addresses
                       (e.g., JOIN and LEAVE, a la IGMP) is not defined
                       in this document.  It will be designed by the
                       applications that use it (e.g.,
                       PacketWay-Multicast).


RRP Messages

   RRP messages are PacketWay messages with PT="RRP" and TE="RRP-Type"
   in their EEP-header, followed by zero or more RRP-records according
   to their RRP-type and completed by the TAIL which is the EI field of
   the EEP packet.  The RRP-records are defined in the next sections.

   The RRP-records constitute the Data Block (DB) of the PacketWay-
   message.  They must be in Big-Endian order, with e=0 in the EEP-
   header.

   We use "[XXX]" to indicate the RRP-message XXX, and <YYY> to indicate
   the RRP-record YYY.  XXX is the RRP-Type, carried in the Type
   Extension (TE) field of the EEP header (with Packet-Type of "RRP"),
   and YYY is the RTyp field, carried in the first byte of that
   RRP-record.

   Following are the 7 RRP messages, with their RRP-type, and the
   related error messages.  The column S->D (Source to Destination)
   shows who sends such messages to whom, where N is for Node, H is for
   HR, and A is for Any.

    RRP-
    Type       S->D   Description
   --------   ------  -----------------------------------------------
   [GVL2]      N->H   Please give me L2-routes to node (address)
                      Replies to [GVL2]: [L2SR], [RDRC], or [ERR/UNK].

   [L2SR]      H->N   Here are L2-routes to node (address)

   [HRTO]      N->H   Which HR should I use for node (address)?
                      Replies to [HRTO]: [RDRC] or [ERR/UNK].

   [RDRC]      H->N   Re-direct to node (address) via an HR on same SAN

   [TELL]      N->H   Please tell me about node (address, name, capa's)
                      The reply to [TELL] is [INFO], or [ERR/UNK].



   [INFO]      A->A   Info about node (address, name, capabilities, LAs)

   [WRU?]      A->A   Who/what-Are-You?  (Tell me all about yourself)
                      The reply to [WRU?] is [INFO] about the replier.

   RRP also uses the following error messages:

   [ERR/UNK]          Destination Unknown (address)
   [ERR/HRDOWN]       HR Down
   [ERR/LINKDOWN]     Link Down
   [ERR/GENERAL]      General error message
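
   The request/reply pairing listed in the table above can be captured
   in a small lookup, for instance to check that a received message is
   a plausible answer to an outstanding request.  This is only a
   sketch: the names EXPECTED_REPLIES and is_expected_reply are
   hypothetical, and the table does not say whether the other error
   messages may also arrive in response to a request.

```python
# Expected replies per request type, transcribed from the table above.
EXPECTED_REPLIES = {
    "GVL2": {"L2SR", "RDRC", "ERR/UNK"},
    "HRTO": {"RDRC", "ERR/UNK"},
    "TELL": {"INFO", "ERR/UNK"},
    "WRU?": {"INFO"},
}


def is_expected_reply(request, reply):
    """True if `reply` is listed as a reply to `request`."""
    return reply in EXPECTED_REPLIES.get(request, set())
```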


Structure of RRP Messages

   [GVL2]          Please give me L2-routes from you to node (address)

                   PH     (with [PT/TE]=[RRP/GVL2])
                   <ADDR> (address of the node for which [L2SR] is
                           requested)

   [L2SR]          Here are L2-routes from me to node (address)

                   PH     (with [PT/TE]=[RRP/L2SR])
                   <ADDR> (address of the node for which the following
                           <SRQR> is provided)
                   <SRQR> (Source Route/Quality record)
                   <MTUR> (optional) MTU records for the above <SRQR>

                   This message may have several (<SRQR>, <MTUR>) pairs,
                   one such pair for each source route.

   [HRTO]          Which HR should I use for node (address)

                   PH     (with [PT/TE]=[RRP/HRTO])
                   <ADDR> (address of the node for which initial HR
                           is requested)


   [RDRC]          Re-direct to destination node (address) via an HR
                   (address), on the same SAN.

                   PH     (with [PT/TE]=[RRP/RDRC])
                   <ADDR> (address of the destination node)
                   <ADDR> (address of the HR to be used for that
                           destination)

                   The above addresses are expected to be physical
                   (but they may be otherwise).


   [TELL]          Please tell me about node
                                 (address | name | capabilities)

                   PH     (with [PT/TE]=[RRP/TELL])
                   <ADDR> (address of that node)

                   or

                   PH     (with [PT/TE]=[RRP/TELL])
                   <NAME> (name of that node)

                   or

                   PH     (with [PT/TE]=[RRP/TELL])
                   <CAPA> (capabilities for which nodes are requested)

                   This message may have several <CAPA>'s, one for each
                   capability.

                   [TELL] identifies a node by an address and/or a name
                   and/or capabilities.  If more than one attribute is
                   specified (e.g., an address and a name), any node
                   that meets any of them should be considered (an
                   implied OR).
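
   The implied-OR selection for [TELL] might be sketched as follows.
   The Node class and the tell_matches function are hypothetical
   illustrations; RRP does not define how an HR stores what it knows
   about nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    address: int
    name: Optional[bytes] = None
    capabilities: frozenset = frozenset()   # capability codes (CC)


def tell_matches(node, addresses=(), names=(), capabilities=()):
    """True if the node meets ANY attribute given in the [TELL]."""
    return (node.address in addresses
            or (node.name is not None and node.name in names)
            or any(cc in node.capabilities for cc in capabilities))
```

   A node is selected as soon as one attribute matches, so a [TELL]
   carrying both an <ADDR> and a <NAME> may select two different nodes.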


   [INFO]          Info about node(s) (address, name, capabilities)

                   PH     (with [PT/TE]=[RRP/INFO])
                   <ADDR> (address of that node)
                   <NAME> (name of that node)
                   <CAPA> (capabilities of that node)
                   <LADR> (Logical-Addresses for the requested node)

                   This message may have several <CAPA>'s, one for each
                   capability.  For nodes without <NAME>, <LADR>, or any
                   <CAPA>, these records are omitted.

                   [INFO] provides all the known information about all
                   the nodes that match the [TELL].  The <ADDR>-records
                   are the separators between the nodes.
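
   Because each <ADDR>-record starts the description of a new node, a
   receiver can split the records of an [INFO] into per-node groups as
   sketched below.  The representation of a record as a (type, payload)
   pair is an assumption made for illustration only.

```python
def split_by_node(records):
    """Group a flat list of (rtyp, payload) records into per-node lists.

    Each "ADDR" record opens a new group; the records that follow it
    belong to that node until the next "ADDR".
    """
    groups = []
    for rtyp, payload in records:
        if rtyp == "ADDR":
            groups.append([(rtyp, payload)])
        elif groups:
            groups[-1].append((rtyp, payload))
        else:
            raise ValueError("record found before the first <ADDR>")
    return groups
```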


   [WRU?]          Who/what-Are-You?

                   PH     (with [PT/TE]=[RRP/WRU?] and [DD]=0x7FFFFE)







   [ERR/UNK]       Destination Unknown (address)

                   PH     (with [PT/TE]=[ERROR/UNK])

                   <XXXX> (XXXX of the Destination node for which the
                           requested information is not available),
                           where <XXXX> is the <ADDR> and/or <NAME>
                           and/or <CAPA> of the node(s) about which
                           this message is sent


   [ERR/HRDOWN]    HR Down (or Router-Down)

                   PH     (with [PT/TE]=[ERROR/HRDOWN])
                   <ADDR> (address of the HR that is down)
                   <ADDR> (the other address of the router that is down)


   [ERR/LINKDOWN]  Link Down

                   PH     (with [PT/TE]=[ERROR/LINKDOWN])
                   <ADDR> (address of one end of the link that is down)
                   <ADDR> (address of the other end of the link that is
                           down)


   [ERR/GENERAL]   General Error (i.e., none of the above)

                   PH     (with [PT/TE]=[ERROR/GENERAL])
                   XX     (The entire message that caused the error:
                           PH+OH+DB+TAIL)






















RRP Records

   Each RRP-record starts with an 8B-word header as shown below.  Its
   first byte identifies the record type (RTyp).  The second byte is the
   Pad-Count byte (PL) indicating the number of padding bytes.  The
   third and the fourth bytes (RL) are the length (in 8B-words) of the
   record, excluding the record header, hence it may be zero.  The rest
   of the header bytes depend on the record type (RTyp).

+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |   PL   |       RL        |........|........|........|........|
+--------+--------+--------+--------+--------+--------+--------+--------+

   Some records that have an arbitrary length are "right justified" by
   having PL padding bytes before the data (Padding Before Data [PBD]).
   Some records that have an arbitrary length are "left justified" by
   having PL bytes after the data (Padding After Data [PAD]).  In either
   case the total number of data bytes is: (8*RL+4-PL).
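
   A minimal sketch of packing and unpacking the first four bytes of a
   record header, and of the data-length formula above, follows.  It
   assumes Big-Endian byte order (as required for the Data Block); the
   RTyp value used in any call is up to the caller, since the actual
   codes are listed in the PacketWay Enumeration document and not here.

```python
import struct


def pack_header(rtyp, pl, rl, rest=b"\x00\x00\x00\x00"):
    """Build the 8B header word: RTyp, PL, RL, then 4 type-specific bytes."""
    if len(rest) != 4:
        raise ValueError("the header word has exactly 4 trailing bytes")
    return struct.pack(">BBH", rtyp, pl, rl) + rest


def unpack_header(word):
    """Return (RTyp, PL, RL, the 4 type-specific bytes)."""
    rtyp, pl, rl = struct.unpack(">BBH", word[:4])
    return rtyp, pl, rl, word[4:8]


def record_data_bytes(rl, pl):
    """Number of data bytes in a padded record: 8*RL+4-PL."""
    return 8 * rl + 4 - pl
```

   For instance, a record with RL=1 and PL=7 carries 5 data bytes, and
   a record with RL=0 and PL=0 carries 4.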

   Following are the RRP-records.  These records are the building blocks
   used to construct RRP-messages.  In the following, "xxxx" indicate
   bytes that are discarded, such as for padding.  It is recommended to
   set them to all-0.

===> <ADDR> Node-Address Record [PAD]

   This record specifies either a single address (with AT=1) or a range
   of addresses (with AT=2 followed by AT=3, or by AT=4 followed by
   AT=5).  AT is the "Address-Type".

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=0       |  AT=1  |     PacketWay-Address    |
+--------+--------+--------+--------+--------+--------+--------+--------+

   or:
    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <ADDR> |  PL=4  |      RL=1       |  AT=2  |   Min-PacketWay-Address  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=3  |   Max-PacketWay-Address  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   or:
    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <ADDR> |  PL=4  |      RL=1       |  AT=4  |  PacketWay-Address-Value |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=5  |  PacketWay-Address-Mask  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+



   The address-mask follows the address-value.

   The above addresses may be physical or logical.

   The address X is specified by an <ADDR>-record if:

   if AT=1:                               X  == PacketWay-Address

   if AT=2,3:   Min-PacketWay-Address <=  X  <= Max-PacketWay-Address

   if AT=4,5:   (PacketWay-Address-Mask & X) == PacketWay-Address-Value
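
   The three matching rules above can be written as one predicate.
   This is a sketch only; the function name and argument layout are
   hypothetical, and addresses are treated as plain integers.

```python
def addr_matches(x, at, first, second=None):
    """Does address x fall under an <ADDR>-record?

    at     : the first Address-Type in the record (1, 2, or 4)
    first  : the address (AT=1), the minimum (AT=2), or the value (AT=4)
    second : the maximum (AT=3) or the mask (AT=5); unused for AT=1
    """
    if at == 1:
        return x == first
    if at == 2:
        return first <= x <= second
    if at == 4:
        return (second & x) == first
    raise ValueError("unsupported Address-Type: %r" % (at,))
```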

   An <ADDR>-record defines only one PacketWay-address (or one range),
   unlike an <LADR>-record (see below) that may specify multiple addresses
   and multiple address-ranges.

   If the <ADDR>-record is followed by other records that describe the
   same node (such as <NAME>, <CAPA>, <LADR>, <SRQR>, and <MTUR>) then
   the RL of the <ADDR>-record also covers all these records.  All
   these records apply to all the addresses specified in this
   <ADDR>-record.  Needless to say, <NAME> is not expected to appear
   within a record that specifies more than one address.

   Hence, if an <ADDR>-record with AT=1 has RL>0, or if an <ADDR>-record
   with AT>1 has RL>1, then this <ADDR>-record includes additional records
   (such as <CAPA>, <LADR>, <SRQR>, and/or <MTUR>) about the specified
   address(es).
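
   The RL of an <ADDR>-record that covers nested records can be
   computed as sketched below, on the assumption (consistent with the
   M4 and M6 examples later in this document) that every nested record
   contributes its own header word plus its RL.  The function name is
   hypothetical.

```python
def enclosing_rl(own_rl, nested_rls):
    """RL of an <ADDR>-record that also covers the records after it.

    own_rl     : RL of the bare <ADDR> (0 for AT=1, 1 for a range)
    nested_rls : the RL of each nested record, in order
    """
    return own_rl + sum(1 + rl for rl in nested_rls)
```

   In M4 the <ADDR> covers a <NAME> with RL=1 and two <CAPA>'s with
   RL=0, giving RL=4; in M6 it covers an <SRQR> with RL=1 and an <MTUR>
   with RL=0, giving RL=3.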

   The enumeration is guaranteed not to have overlap between the AT and
   the RTyp codes.

===> <NAME> Node-Name Record [PAD]

   (e.g., a name with 7 bytes B1..B7)

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <NAME> |  PL=5  |      RL=1       |   B1   |   B2   |   B3   |   B4   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|   B5   |   B6   |   B7   |  xxxx  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   The number of bytes in the name is 8*RL+4-PL.
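
   The following sketch builds a left-justified [PAD] record, such as
   <NAME>, from its data bytes by solving the formula above for RL and
   PL.  The RTyp value is supplied by the caller because the real codes
   are not defined in this document; padding is set to all-0 as
   recommended.

```python
import struct


def build_pad_record(rtyp, data):
    """Build a [PAD] record: header, data, then PL bytes of zero padding."""
    extra = max(len(data) - 4, 0)      # bytes that spill past the header word
    rl = (extra + 7) // 8              # additional 8B-words needed
    pl = 8 * rl + 4 - len(data)        # so that len(data) == 8*RL+4-PL
    return struct.pack(">BBH", rtyp, pl, rl) + data + b"\x00" * pl
```

   For the 5-byte name "Super" this yields RL=1 and PL=7, for a total
   record length of two 8B-words.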










===> <CAPA> Node-Capability Record [PAD]

   (e.g., 9 parameter bytes)

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <CAPA> |  PL=2  |      RL=1       | CC=Cx  |   P1   |   P2   |   P3   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|   P4   |   P5   |   P6   |   P7   |   P8   |   P9   |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Byte#4 is the Capability Code, CC, followed by as many parameter
   bytes as needed (9 in the above example).

   The capability codes are listed in the PacketWay Enumeration docu-
   ment.

   The number of bytes used by the parameters is 8*RL+3-PL.
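
   A receiver could extract the Capability Code and its parameters as
   sketched below.  The function name is hypothetical, and the record
   is assumed to be passed as a complete byte string starting at its
   RTyp byte.

```python
def parse_capa(record):
    """Return (CC, parameter bytes) from a <CAPA>-record."""
    pl = record[1]
    rl = int.from_bytes(record[2:4], "big")
    nparams = 8 * rl + 3 - pl          # parameter bytes after the CC byte
    if nparams < 0 or len(record) < 8 * (rl + 1):
        raise ValueError("inconsistent PL/RL for this record")
    cc = record[4]                     # Byte#4 is the Capability Code
    return cc, record[5:5 + nparams]
```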

===> <LADR> Logical-Addresses Record [PAD]

   (e.g., 2 logical addresses and a range of logical addresses)

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <LADR> |  PL=4  |      RL=2       |  AT=1  |1110  Logical-Address-#1  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=2  |1110  Min-Logical-Address |  AT=3  |1110  Max-Logical-Address |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=1  |1110  Logical-Address-#2  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Whereas an <ADDR>-record defines only one PacketWay-address (or one
   range), an <LADR>-record may specify multiple addresses (each with
   AT=1) and multiple ranges (each with a pair of AT=2,3 or AT=4,5).
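
   Decoding an <LADR>-record into its single addresses and ranges might
   look like the sketch below.  It assumes that every entry occupies 4
   bytes (one AT byte followed by a 24-bit address, as drawn above) and
   that the record is passed as a complete byte string; the function
   name is hypothetical.

```python
def parse_ladr(record):
    """Split an <LADR>-record into (singles, min/max ranges, value/mask pairs)."""
    pl = record[1]
    rl = int.from_bytes(record[2:4], "big")
    data = record[4:4 + 8 * rl + 4 - pl]          # 8*RL+4-PL data bytes
    if len(data) % 4:
        raise ValueError("entries are 4 bytes each: AT + 24-bit address")
    entries = [(data[i], int.from_bytes(data[i + 1:i + 4], "big"))
               for i in range(0, len(data), 4)]
    singles, ranges, masked = [], [], []
    it = iter(entries)
    for at, addr in it:
        if at == 1:
            singles.append(addr)
        elif at in (2, 4):
            nxt = next(it, None)                  # AT=3 or AT=5 must follow
            if nxt is None or nxt[0] != at + 1:
                raise ValueError("AT=%d must be followed by AT=%d" % (at, at + 1))
            (ranges if at == 2 else masked).append((addr, nxt[1]))
        else:
            raise ValueError("unexpected AT=%d" % at)
    return singles, ranges, masked
```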

===> <SRQR> Source-Route Record [PBD], with Q for that route.

   This record carries one or more L2RHs.  In the following example the
   SR combines 2 L2RHs, one with an SR of 13 bytes followed by one with
   an SR of 4 bytes.

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <SRQR> |  PL=2  |      RL=3       |  xxxx  |  xxxx  |        Q        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=13B|  SR01  |  SR02  |  SR03  |  SR04  |  SR05  |  SR06  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  SR07  |  SR08  |  SR09  |  SR10  |  SR11  |  SR12  |  SR13  |   xxxx |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=4B |  SR01  |  SR02  |  SR03  |  SR04  |   xxxx |   xxxx |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Q (the Route Quality) is an unsigned 16-bit integer.  The units are
   not defined here.  It is assumed that it is monotonic with all-0
   being the best and all-1 the worst.  If there is an <MTUR>
   (MTU-record) for that SR it should follow this <SRQR>-record.
   However, the RL of the <SRQR> does not include the RL of the <MTUR>.
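
   When an [L2SR] offers several source routes, a node may want the one
   with the best quality.  Since the units of Q are not defined, the
   sketch below relies only on the stated ordering (lower is better);
   the function name and the (Q, route) pair representation are
   hypothetical.

```python
def best_route(routes):
    """Pick the (q, source_route) pair with the lowest Q.

    Q is an unsigned 16-bit value: 0x0000 is best, 0xFFFF is worst.
    On a tie the earliest route in the list is kept.
    """
    routes = list(routes)
    if not routes:
        raise ValueError("no routes offered")
    for q, _ in routes:
        if not 0 <= q <= 0xFFFF:
            raise ValueError("Q must fit in 16 bits")
    return min(routes, key=lambda r: r[0])
```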

===> <MTUR> MTU record [PBD]

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| <MTUR> |  PL=0  |      RL=0       |         MTU (in 8B-words)         |
+--------+--------+--------+--------+--------+--------+--------+--------+

   The MTU record provides the MTU for the SR defined before (by an
   <SRQR>).

   The value of 0 means indefinite MTU (i.e., any length is OK).
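
   Two small helpers illustrate how an MTU value might be interpreted
   and combined along a path.  Taking the lesser of the MTUs of the
   networks involved follows the example later in this document;
   treating 0 as "no limit" when combining is an assumption of this
   sketch, and both function names are hypothetical.

```python
def mtu_bytes(mtu_words):
    """Convert an MTU in 8B-words to bytes; None means no limit."""
    return None if mtu_words == 0 else mtu_words * 8


def path_mtu_words(*mtus):
    """Smallest finite MTU (in 8B-words) along a path; 0 if none is finite."""
    finite = [m for m in mtus if m != 0]
    return min(finite) if finite else 0
```

   For example, an MTU of 1,024 8B-words corresponds to 8,192 bytes
   (8KB).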

























Example

   In the following PacketWay network used for this example, 3 SANs are
   interconnected via 2 routers, Router-A (RTRA) between SAN1 and SAN3,
   and RTRB between SAN1 and SAN2.

+-------+          +--0--+   SAN1   +--0--+          +--0--+
| Node1 +----------3 SW0 1----------3 SW1 1----------3 SW2 1   MTU=16KB
+-------+          +--2--+          +--2--+          +--2--+
                      |                                 |
           RTRA1 ***********       +---+---+       *********** RTRB1
                 * RouterA *       | Node2 |       * RouterB *
           RTRA3 ***********       +---+---+       *********** RTRB2
                      |                |                |
+-------+   SAN3   +--0--+          +--0--+   SAN2   +--0--+
| Node3 +----------3 SW3 1          3 SW4 1----------3 SW5 1   MTU=8KB
+-------+          +--2--+          +--2--+          +--2--+

   In this example Node1 on SAN1 (with MTU=16KB) is looking for Node2,
   which is on SAN2 (with MTU=8KB).  It first asks its default router
   (RTRA1) which HR to use for Node2.  RTRA1 redirects Node1 to RTRB1
   regarding Node2.

   Node1 asks RTRA1 (by [HRTO], in message M1) which router to use for
   Node2.  RTRA1 suggests (using [RDRC], M2) to use RouterB.  Node1 uses
   L3-forwarding ([WRU?], M3), via RouterB, to verify that RTRB can
   indeed get to Node2, by asking Node2 for information about itself.
   Node2 provides this information ([INFO], M4), which Node1 likes.
   Node1 asks RouterB ([GVL2], M5) for L2RH(s) to Node2.  RouterB
   provides ([L2SR], M6) the requested L2RH with its MTU of 1,024
   8B-words (8KB).

   Finally, Node1 sends data (by M7) to Node2 using L2-forwarding.
   Similarly, Node2 may ask its default router which HR to use for Node1
   and for L2RH(s) to Node1.

   The sequence of messages (M1 thru M7) is shown below.

   (M1) Node1 sends [HRTO] to its default router RTRA1 asking which HR
   to use for Node2.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from Node1 to RouterA1    ----> |
| It may be any number of bytes.  In this example it's 9 bytes:230000000|
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          RTRA1          |      "HRTO"     |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=1 (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=0       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   (M2) RTRA1 uses [RDRC] to re-direct to Node2 via RouterB.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from RouterA1 to Node1    ----> |
| It may be any number of bytes.  In this example it's 9 bytes:330000000|
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          Node1          |      "RDRC"     |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=2 (8B-words) |0|  RZ  |0          RTRA1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=0       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=0       |  AT=1  |0          RTRB1          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Node1 knows how to get to RouterB over its SAN.

   (M3) Node1 uses [WRU?] (still using L3-forwarding via RouterB) to
   verify the capabilities of Node2, and that RTRB can indeed get to
   it.  This is done by asking Node2 for information about itself.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from Node1 to RouterB1    ----> |
|    It may be any number of bytes.  Here it is 11 bytes: 11230000000   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          Node2          |      "WRU?"     |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=0 (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+





   (M4) Node2 uses [INFO] (via RouterB2, also using L3-forwarding) to
   provide information about itself to Node1.  This info includes its
   PacketWay-address and its name ("Super").  If Node2 had also
   implemented Level-C of the RRP, it would also provide records about
   its capabilities (shown in this example as 2 capabilities, with
   codes 7 and 5).

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from Node2 to RouterB2    ----> |
|     It may be any number of bytes.  Here it is 10 bytes: 1030000000   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          Node1          |      "INFO"     |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=5 (8B-words) |0|  RZ  |0          Node2          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=4       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
| <NAME> |  PL=7  |      RL=1       |  "S"   |  "u"   |  "p"   |  "e"   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  "r"   |  xxxx  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
| <CAPA> |  PL=1  |      RL=0       |  CC=7  |   4    |   8    |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
| <CAPA> |  PL=3  |      RL=0       |  CC=5  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   By receiving this message Node1 knows that RTRB could indeed be used
   for communication with Node2.

   (M5) Node1 uses [GVL2] to ask RouterB for L2RH(s) from RouterB to
   Node2.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from Node1 to RouterB1    ----> |
|    It may be any number of bytes.  Here it is 11 bytes: 11230000000   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          RTRB1          |      "GVL2"     |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=1 (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=0       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+






Cohen et al.                  Experimental                     [Page 14]


Internet-Draft               PacketWay RRP                      May 1998

   (M6) RouterB uses [L2SR] to provide Node1 with an L2RH from RTRB2 to
   Node2, with its Q and MTU.  This L2RH is {3,0,3,0,0,0,0,0,0,0} from
   RouterB to Node2, and the MTU is 1,024 (meaning 8KB).

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from RouterB1 to Node1    ----> |
|    It may be any number of bytes.  Here it is 11 bytes: 33330000000   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00  P   |0          Node1          |      "L2SR"     |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=4 (8B-words) |0|  RZ  |0          RTRB1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
| <ADDR> |  PL=0  |      RL=3       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
| <SRQR> |  PL=2  |      RL=1       |  xxxx  |  xxxx  |        Q        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=4B |   3    |   0    |   3    |   0    |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
| <MTUR> |  PL=1  |      RL=0       |      MTU=1,024 (in 8B-words)      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   The MTU in the <MTUR> above is the lesser of the MTUs of both
   networks.
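
   The MTU arithmetic of (M6) can be checked with a short sketch.  The
   function name and the two per-network MTU values are hypothetical;
   only the 8-byte word size and the 1,024-word result come from the
   example above.

```python
WORD_BYTES = 8  # RRP counts lengths in 8-byte (8B) words


def path_mtu_words(*network_mtus_words):
    """Return the MTU of a path: the smallest MTU of the networks crossed."""
    return min(network_mtus_words)


# Hypothetical per-network MTUs (in 8B-words) for the two SANs.
mtu_words = path_mtu_words(2048, 1024)
mtu_bytes = mtu_words * WORD_BYTES
print(mtu_words, mtu_bytes)  # 1024 8192
```

   An MTU of 1,024 8B-words therefore corresponds to 8,192 bytes (8KB).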

   The RL (record-length) of the last <MTUR>-record is NOT included in
   the RL of the preceding <SRQR>-record, but is included in the RL of
   the preceding <ADDR>-record (since the RL of the <SRQR> is included
   in the RL of the <ADDR>).  The RL=3 of the <ADDR> includes 2 words of
   <SRQR> and 1 word of <MTUR>.
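
   The nesting rule can be illustrated with a small model.  This is
   only a sketch of one reading of the examples in this appendix: it
   assumes that RL counts the 8B-words that follow a record's first
   word and belong to it (its own continuation words plus all nested
   records).  The class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    name: str
    extra_words: int = 0  # own 8B-words after the record's first word
    children: list = field(default_factory=list)  # nested records

    def rl(self):
        """Words following the first word that belong to this record."""
        return self.extra_words + sum(c.total_words() for c in self.children)

    def total_words(self):
        """The first word plus everything counted by RL."""
        return 1 + self.rl()


srqr = Record("SRQR", extra_words=1)  # one extra word carrying the L2RH
mtur = Record("MTUR")                 # fits in a single word
addr = Record("ADDR", children=[srqr, mtur])

print(srqr.rl(), mtur.rl(), addr.rl())  # 1 0 3
print(addr.total_words())               # 4, the Data-Length of (M6)
```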

   (M7) Finally, Node1 sends data to Node2 using L2-forwarding.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <----     The L2-header needed to get from Node1 to RouterB1    ----> |
|    It may be any number of bytes.  Here it is 11 bytes: 11230000000   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=4B |   3    |   0    |   3    |   0    |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00  P   |0          Node2          |Sensor.SubType=? |     "Sensor"    |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=3|PL=0| Data-Length=? (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|                                                                       |
| <------------------- The sensor data goes here ---------------------> |
|                                                                       |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+




   E=3 (0b0011) indicates that all the data is 64-bit, in Big-Endian
   order.
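
   A receiver could unpack such a payload as sketched below.  Treating
   each 64-bit item as an unsigned integer is an assumption made for
   illustration; this memo states only that the data is 64-bit and
   Big-Endian.  The function name and the sample bytes are
   hypothetical.

```python
import struct


def decode_be64(payload: bytes):
    """Decode a payload as unsigned 64-bit Big-Endian integers."""
    if len(payload) % 8:
        raise ValueError("payload must be a whole number of 8-byte words")
    count = len(payload) // 8
    return list(struct.unpack(f">{count}Q", payload))


# Two sample 8B-words: 0x0100 and 0b0100, most significant byte first.
sample = bytes([0, 0, 0, 0, 0, 0, 0x01, 0x00]) + bytes([0, 0, 0, 0, 0, 0, 0, 4])
print(decode_be64(sample))  # [256, 4]
```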

   All the messages shown in this appendix start with local L2 routing
   bytes needed to get across either SAN1 or SAN2 (indicated with "The
   L2-header needed to get from ... to ...") which are not L2RHs.  The
   difference is that these bytes are in front of the packet, exposed to
   the local switches, whereas the L2RHs are only exposed to PacketWay-
   entities.

   These local L2 routing bytes are the actual bytes required by the
   SANs and are likely to be consumed as the message traverses the SAN,
   unlike the L2RHs, which remain intact until converted to actual
   routing bytes.
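
   The difference can be sketched with a toy SAN in which every switch
   strips one leading routing byte.  That per-switch behavior, the
   function name, and the 4-byte local route are assumptions made for
   illustration only; the L2RH bytes assume vv=0.

```python
def forward_through_san(packet: bytes, switch_count: int):
    """Simulate switches that each consume one leading routing byte."""
    ports = []
    for _ in range(switch_count):
        ports.append(packet[0])  # the switch reads its output port ...
        packet = packet[1:]      # ... and consumes that byte
    return ports, packet


local_route = bytes([1, 1, 2, 3])             # hypothetical local bytes
l2rh = bytes([0x00, 0x84, 3, 0, 3, 0, 0, 0])  # L2RH carrying {3,0,3,0}
ports, remaining = forward_through_san(local_route + l2rh, len(local_route))
print(ports, remaining == l2rh)  # [1, 1, 2, 3] True
```

   The local routing bytes are gone when the packet leaves the SAN,
   while the L2RH arrives unchanged at the next PacketWay entity.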

   Each L2RH starts with 0bvv00000010, followed by the number of routing
   bytes in that L2RH, then the routing bytes themselves, and possibly
   several bytes of padding.
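
   A minimal sketch of building an L2RH follows.  It assumes, based on
   the diagrams in this appendix, that the count occupies the low 6
   bits of the second byte and that padding extends the L2RH to a whole
   number of 8B-words.  Zero-valued padding and vv=0 are arbitrary
   choices (the diagrams show the padding as "xxxx"), and the function
   name is hypothetical.

```python
def build_l2rh(routing_bytes: bytes, vv: int = 0) -> bytes:
    """Build an L2RH: vv000000, then 10 + 6-bit count, route, padding."""
    if not 0 <= vv <= 0b11:
        raise ValueError("vv is a 2-bit field")
    if len(routing_bytes) > 0b111111:
        raise ValueError("count must fit in 6 bits (assumed field width)")
    first = vv << 6                            # 0bvv000000
    second = (0b10 << 6) | len(routing_bytes)  # 0b10 followed by the count
    body = bytes([first, second]) + routing_bytes
    padding = (-len(body)) % 8                 # fill up to an 8-byte boundary
    return body + bytes(padding)


l2rh = build_l2rh(bytes([3, 0, 3, 0]))
print(l2rh.hex())  # 0084030003000000
```

   The result occupies exactly one 8B-word, matching the L2RH word
   shown in (M6) and (M7).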


Glossary

   Address             A unique designation of a node (actually an
                       interface to that node) or a SAN.

   Buddy-HR            HRs are "buddies" if they are on the same SAN.

   Cut-Through         See Wormhole-routing.

   Destination         The node for which a packet is intended.

   Dynamic-Routing     Routing according to dynamic information (i.e.,
                       acquired at run time, rather than pre-set).

   Endianness          The property of being Big-Endian or Little-Endian
                       (transmission order, etc.)

   Ethertype           A 16-bit value designating the type of Level-3
                       packets carried by a Level-2 communication
                       system.

   HR                  Half-Router, the part of a router that handles
                       one network only.

   L2-Forwarding       Forwarding based on Level-2 (i.e., data-link
                       layer of the ISORM) information, e.g., the native
                       technique of each SAN or LAN.  Also called
                       "source routing."

   L3-Forwarding       Forwarding based on end-to-end (Level-3, i.e.,
                       network layer of the ISORM) addresses.  Also
                       called "destination routing."




   Map                 The topology of a network.

   Mapper              A node on a SAN/LAN that has the map and an RT
                       for that network.  It is expected that the mapper
                       dynamically updates the map and the RT.

   Multi-homed         A node with more than one network interface,
                       where each interface has its own address.

   Node                Whatever can send and receive packets (e.g., a
                       computer, an MPP, a software process, etc.)

   Node                A C-struct (or equivalent) containing values for
                       some attributes of a node.

   Planned             A transfer of information that occurs after an
                       initial phase in which the sender decides which
                       Level-2 route to use for that transfer.

   RCVF                The "Received From" set includes all the physical
                       addresses through which an RT was disseminated,
                       starting with the address of the mapper that
                       created that RT.

   Redirect            A message that tells nodes which HR should be
                       used in order to get to a certain remote address.

   Router              The inter-SAN communication device.

   Security            A relationship between 2 (or more) nodes that
                       defines how the nodes utilize security services
                       to communicate securely.

   Source              The node that created a packet.

   Source-Route        A Level-2 route that is chosen for a packet by
                       its source.

   Symbol              Data preceding the EEP header of a PacketWay
                       message, interleaved with the L2RHs.

   Twin-HR             Two HRs are twins if they both are parts of the
                       same inter-SAN router.

   Wormhole-routing    (aka cut-through routing) Forwarding packets out
                       of switches as soon as possible, without storing
                       the entire packet in the switch (unlike store-
                       and-forward).

   Zero-copy           A TCP system that copies data directly between
                       the user area and the network device, bypassing
                       OS copies.




Acronyms and Abbreviations


   0bNNNN  The binary number NNNN (e.g., 0b0100 is 4-decimal)

   0xNNNN  The hexadecimal number NNNN (e.g., 0x0100 is 256-decimal)

   8B      8-byte (64-bit) entity

   ADDR    The Address-record of RRP

   AT      Address Type

   BER     Bit Error Rate

   CAPA    The CAPAbility-record of RRP

   CSR     Common Source-Route

   DA      Destination Address

   DB      Data Block

   DL      Data Length (in 8B words)

   DT      Destination-Type

   EEP     End-to-End Protocol

   EI      Error Indication

   GVL2    An RRP message requesting an L2 route to a given destination

   GVRT    An RRP message asking an HR to give its routing tables

   HR      Half Router

   HRTO    An RRP message asking which HR to use for a given destination

   INFO    An RRP message providing information about nodes

   L2      Level-2 of the ISO Reference Model (Link)

   L2RH    Level-2 Routing Header

   L2SR    An RRP message providing an L2 source route (L2RH)

   L3      Level-3 of the ISO Reference Model (Network)

   LADR    The Logical-addresses-record of RRP





   LSbit   Least Significant bit

   LSbyte  Least Significant byte

   MSbit   Most Significant bit

   MSbyte  Most Significant byte

   MTU     Maximum Transmission Unit

   MTUR    The MTU-record of RRP

   NAME    The name-record of RRP

   OH      Optional Header field

   OH-TYPE The Type of an Optional Header field

   Q       Quality (of a path)

   RCVF    Received-From list, or the Received-From record of RRP

   RDRC    A re-direct message of RRP

   RRP     Router-to-Router Protocol

   RTBL    An RRP message providing a Routing Table

   SRQR    The Source-Route-and-Q-record of RRP

   TELL    An RRP message requesting INFO about a partially specified node

   UNK     Unknown

   WRU?    An RRP message asking its recipient to identify itself




Editor's Address

   Anthony Skjellum
   Computer Science Department
   Box 9637
   Mississippi State University
   Mississippi State, MS 39762-9637

   Phone: 601-325-8435
   Fax:   601-325-8997
   Email: tony@cs.msstate.edu


