Network Working Group                                     R. R. Stewart
INTERNET-DRAFT                                                   Q. Xie
                                                               Motorola
                                                             S. Hussain
                                                               C. Sharp
                                                                  Cisco
                                                     H. J. Schwarzbauer
                                                                Siemens
                                                              T. Taylor
                                                        Nortel Networks
                                                              I. Rytina
                                                               Ericsson

expires in six months                                      June 2, 1999



           MULTI_NETWORK DATAGRAM TRANSMISSION PROTOCOL
                <draft-ietf-sigtran-mdtp-05.txt>


Status of This Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups.  Note that other groups may also distribute
working documents as Internet-Drafts.

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.



Abstract

This Internet Draft discusses a new protocol, namely the Multi-network
Datagram Transmission Protocol (MDTP), that is intended to provide
fault-tolerant reliable data transfer between communicating entities
over IP networks [1].

MDTP is proposed as an application-level protocol that is designed to
support redundant networks and transparent fault management. MDTP also
provides timing control and configuration flexibilities to meet the
stringent timing requirements often found in telephony signaling
protocols. The motivation of developing MDTP is to support
Internet-based high reliability applications such as signaling and
call control for Internet telephony.




Stewart, et al                                                  [Page  1]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


                        TABLE OF CONTENTS

1.  Introduction.......................................................3
  1.1 Terminology......................................................3
  1.2 Design Requirements of MDTP......................................4
  1.3 Interface to MDTP................................................5
2.  MDTP Datagram Format...............................................5
  2.1 MDTP Common Header Field Descriptions............................6
  2.2 MDTP Control Parameter Part Definitions..........................7
  2.3 MDTP Data Part Definitions......................................11
3.  Endpoint Association Initialization...............................12
  3.1 Initiation Message and Tag Lock.................................12
  3.2 Tag Unlock and TSN Initialization...............................13
  3.3 Datagram Processing during Tag Lock ............................14
  3.4 An Example of Association Initialization .......................14
  3.5 Other Initiation Issues.........................................15
    3.5.1 Selection of Tag Value......................................15
    3.5.2 Initiation from behind a NAT................................15
    3.5.3 Initialization Collision....................................16
    3.5.4 Association Re-initialization...............................16
4.  Transfer User Datagram............................................16
  4.1 Timer Management Rules..........................................17
    4.1.1 T3-send Timer Adjustment with RTT...........................18
  4.2 Multihoming Rotation............................................18
    4.2.1 Remote Multihoming Rotation.................................18
    4.2.2 Local Multihoming Rotation..................................19
  4.3 Stream Sequence Number..........................................19
  4.4 Ordered and Un-ordered Delivery.................................19
  4.5 Report Missing Datagrams........................................20
  4.6 Range Check on TSN .............................................21
  4.7 Advisory Ack Request............................................21
5   Congestion Controls...............................................22
  5.1 Send with Window Control........................................22
    5.1.1 Window Length Adjustment....................................23
  5.2 Send Timer Back-off at Re-transmission..........................24
6.  Network Management................................................25
  6.1 Failure Detection in Redundant Networks.........................25
  6.2 RTT Measurement.................................................26
  6.3 Network Heart Beat .............................................26
7.  Termination of Association........................................27
  7.1 Graceful Shutdown of an Association.............................28
8.  Stream Operations.................................................29
  8.1 Stream Initiation...............................................29
  8.2 Stream Termination..............................................29
  8.3 Other Issues with Stream Operations.............................30
9.  Interface with Upper Layer........................................30
10. Suggested MDTP Timer and Protocol Parameter Values................34
11. Acknowledgments...................................................34
12. Authors' Addresses................................................34
13. References........................................................35



Stewart, et al                                                  [Page  2]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999



1.  Introduction

This Internet Draft discusses a new protocol, namely the Multi-network
Datagram Transmission Protocol (MDTP). The intention of developing
MDTP is to provide a fault-tolerant, real-time reliable data transfer
mechanism between communicating endpoints over IP networks [1].

MDTP is proposed as an application-level protocol that is designed to
support redundant networks and transparent fault management. MDTP also
provides timing control and configuration flexibilities to meet the
stringent timing requirements often found in telephony signaling
protocols. The motivation of developing MDTP is to support
Internet-based high reliability applications such as signaling and
call control for Internet telephony.

MDTP is also designed to be scalable in order to support different
signaling transport requirements for different interfaces to a
telephony network.

For example, the transportation of signaling protocols such as ISDN
PRI may not require redundant networks, and hence only a subset of
MDTP will need to be implemented.  On the other hand, redundant
networks may be mandated when transporting SS7 signaling messages
amongst different components in a carrier-grade telephony core
network.  In such cases, the transparent support for redundant
networks, load sharing, and fault management defined in MDTP become
essential.

Many of the fundamental concepts that have made TCP such a  useful
protocol are reused in MDTP, and some of the advantages of UDP are
also merged into the design.


1.1 Terminology

The following terms are defined and used in this document:

- Redundant networks:

  An endpoint may be able to transmit or receive on more than one IP
  address/UDP port. RFC 1122 refers to this as multihoming. This
  constitutes a redundant local network (for MDTP) relative to the
  endpoint. MDTP makes no attempt to assure routing diversity within
  the internet connecting two endpoints. Each endpoint attempts to
  send to its peer endpoint using all the IP addresses and UDP ports
  its peer has open (within the constraints of any application
  specified restrictions). The choice of which local socket to send
  upon is an implementation detail (it is possible only one socket is
  available and bound to all of the local networks to which the machine is
  connected). The O/S also will play a role in the multihoming/redundancy.
  MDTP attempts a best effort at spreading the traffic across a

Stewart, et al                                                  [Page  3]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

  destination's available interfaces. It is assumed by MDTP that the
  network (if fault tolerance is desired) is engineered for diversity
  and MDTP's best effort will play only a small role in that diversity.

- Endpoint:

  Representation of the logical point where MDTP datagrams can be sent
  to or received from. Moreover, an MDTP endpoint shall be defined as
  a set of IP address/port combinations in order to support redundant
  networks. For example, an endpoint on a multi-homed host connected
  with N IP networks can be represented as:

     [IP addr1/port1,
       ...
      IP addrN/portN]

  where the port numbers or IP addresses may not be unique, but their
  combinations shall be guaranteed to be unique by the underneath IP
  networks.

- Association:

  Representation of an ongoing communication channel between two MDTP
  endpoints.

- Stream:

  Defines a sub-channel within an association. Datagrams sent through
  a stream shall be reliably transmitted and delivered independent to
  datagrams from other streams.

  Each stream shall be identified by a stream number that is unique
  within the association. Stream 0xffff is reserved and shall not be
  used.


1.2 Design Requirements of MDTP

The following are some of the design requirements of MDTP to
make MDTP capable of supporting real-time call control environments
that may employ redundant networks:

A) High communication fan-out: an endpoint may need to be in
   simultaneous communication with hundreds or thousands of endpoints
   performing various call processing functions. These endpoints may
   be codec converters, SS7 to IP translation applications, or, in the
   case of mobile networks, data selector and combiner applications.

B) Stringent timer control: an endpoint needs to have a very fine
   control over the timing for delivering a datagram. The timing
   should be easily adjusted depending on the message type and the
   destination. For example, after a few seconds of non-delivery the
   call which the message is about may not exist anymore.

Stewart, et al                                                  [Page  4]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


C) Support multiple network paths: an endpoint communicating with a peer
   should be able to take advantage of the multiple network paths and
   multihoming in a transparent way. Therefore, the protocol must
   be able to take advantage of local multi-homed hosts and remote
   multi-homed hosts to provide resilient data delivery. This means
   that the application or upper layer protocols need not to be involved
   in the network fault management. Instead, when network failure occurs
   MDTP should be able to automatically transmit out-bound datagrams to an
   alternate destination network interface (if one exists) without
   intervention from the application.

D) Reliable transport: datagrams might be lost or discarded while
   traveling in the IP network towards the destination. The protocol
   must handle the re-transmission of lost messages in an autonomous
   way without any intervention from the upper layer. Also, sometimes
   datagrams may arrive in duplicate copies, in such cases MDTP must
   be able to detect and remove the duplicates automatically.

E) Support both ordered and un-ordered delivery: MDTP must support
   both ordered and un-ordered delivery. In the case of ordered
   delivery, the receiver shall detect out-of-order datagrams and
   re-order them before dispatching them to the upper layer. In the
   un-ordered case, received datagrams shall be dispatched without any
   effort of re-ordering.

F) Support stream sequencing: on the demand of the upper layer
   protocols or applications, MDTP should be able to support sequenced
   delivery with regard to each individual stream, i.e., the delay caused
   by the loss and retransmission of a datagram should be isolated to
   only the stream to which the datagram belongs. This is particularly
   important in some call control applications, where a loss of a
   message should only affect the call whom the message belongs to.


1.3 Interface to MDTP

The application programs or upper layer protocols interface with MDTP
through a set of primitives (see section 9).

Towards the IP networks, it is assumed that UDP is used for the
transport layer. No special interfaces or changes are assumed within
UDP or at the UDP/MDTP interface.  MDTP maintains its own queuing and
endpoint association.  When MDTP runs on a router or on a
gateway-enabled host, it will place no special constraints on the
lower layer protocol implementations other than those described in the
Router Requirements and Host Requirements RFCs.


2.  MDTP Datagram Format

A MDTP datagram consists of a common header and possibly a control
parameter part, a data part, or both.


Stewart, et al                                                  [Page  5]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

                      MDTP Datagram Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  MDTP Protocol Identifier             | Vers  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |C/D| Msg Type  |   Reserved    |         Data Size             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \                                                               \
   /                    Control Parameter Part                     /
   \                                                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \                                                               \
   /                          Data Part                            /
   \                                                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Note: integers in the header of MDTP datagrams MUST be transmitted in
network byte-order.

Note: when both the control part and data part are present in an MDTP
datagram, the control part MUST be processed first.


2.1 MDTP Common Header Field Descriptions

    MDTP Protocol Identifier: 28 bits

      This shall be a fixed value of 0xf787307. The receiver shall
      always verify this Protocol Identifier before it proceeds any
      further in interpreting the header fields.

    Version: 4 bits

      This field represents the version number of the MDTP protocol,
      and shall be set to 0x3.

    C/D Bits: 2 bits

      This field indicates whether the Message Type and Data Size fields
      are filled in the present datagram:

         00 - reserved, shall not be used. The receiver shall silently
              discard any datagram with C/D bits set to 00.
         01 - Data Size only
         10 - Message Type only
         11 - Message Type and Data Size

    Message Type: 6 bits

      This shall indicate the type of control message. Its value is valid
      only when the C/D bits are set to either "10" or "11". Otherwise it

Stewart, et al                                                  [Page  6]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

      shall be set to 0x0 and ignored by the receiver.

      Message Type determines whether the control part is present in
      the current datagram.

      The value of Message Type is defined as the follows:

      0x0  - reserved and shall not be used

      0x1  - Initiation
      0x2  - Initiation Ack
      0x3  - Extended Data Ack
      0x4  - Advisory Ack Request
      0x5  - Window-up
      0x6  - Window-up Ack
      0x7  - RTT-request
      0x8  - RTT-ack
      0x9  - Abort
      0xa  - Graceful Shutdown
      0xb  - Graceful Shutdown Ack
      0xc  - Stream Initiation
      0xd  - Stream Initiation Ack
      0xe  - Stream Termination
      0xf  - Stream Termination Ack

      0x10 to 0x3f - reserved and shall not be used

    Reserved: 8 bits

      These bits are reserved for future use. The sender shall always
      set these bits to '0', and the receiver shall ignore there
      values.

    Data Size: 16 bits

      This value represents, in number of octets, the size of the user
      data present in the Data Part of the current datagram. Its value
      is only valid when C/D bits are set to either "01" or "11".
      Otherwise it shall be set to 0x0 and ignored by the receiver.


2.2 MDTP Control Parameter Part Definitions

This section defines whether a control parameter part is present for
each message type, and its format if a control parameter part is
present.

2.2.1 Initiation (0x1) and Initiation Ack (0x2):

    The parameter field of the Initiation and Initiation Ack messages
    shall carry two initiation Tags, the maximum window length and the
    sender's local network information. Note that the endpoint MAY
    be multi-homed.


Stewart, et al                                                  [Page  7]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

    The following defines the parameter format for carrying N IPv4
    Network addresses (other network address formats can be carried by
    setting the size and type fields accordingly):

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Tag Value 1 (Seen)                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Tag Value 2 (Send)                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Max Window Length                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Number of Networks = N                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Size of address=8       |    Type of Address=2          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   IP Address of Network 1                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Port # 1              |      Padding = 0              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    /                                                               /
    \                              ...                              \
    /                                                               /
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Size of address=8       |    Type of Address=2          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   IP Address of Network N                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Port # N              |      Padding = 0              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    If there is any implementation-specific data needed to be
    exchanged at the setup of the association, it should be appended
    to the end of the above data structure. The format of the
    implementation-specific data should follow "Size/Type/Data Field"
    format as defined above. In case an endpoint does not support the
    implementation-specific data received, it shall ignore the
    additional fields.


2.2.2 Extended Data Ack (0x3):

    The parameter field contains 0 or more gap reports and the
    highest transmission sequence number (TSN) received.










Stewart, et al                                                  [Page  8]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Number of Gaps = N                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Gap #1 Start TSN                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          Gap #1 End TSN                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    /                                                               /
    \                              ...                              \
    /                                                               /
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Gap #N Start TSN                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          Gap #N End TSN                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Last TSN Seen                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


2.2.3 Advisory Ack Request (0x4):

    No parameter field.

2.2.4 Window-up (0x5):

    No parameter field.

2.2.5 Window-up Ack (0x6):

    Same as that of Extended Data Ack.

2.2.6 RTT-request (0x7) and RTT-ack (0x8):

    The parameter field shall contain the time value that is used for
    RTT calculation (see section 6.2), and optionally an
    acknowledgment Seen value.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Time Value 1                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Time Value 2                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       0x0 or TSN Seen                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2.2.7 Abort (0x9):

    The Abort message shall carry the initiation Tag of the
    destination endpoint as a measure of security.

Stewart, et al                                                  [Page  9]

Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Init-Tag                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2.2.8 Graceful Shutdown (0xa):

    The destination endpoint initiation Tag shall be carried as a
    measure of security.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            Init-Tag                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            TSN Seen                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2.2.9 Graceful Shutdown Ack (0xb):

    Same as that of Abort.

2.2.10 Stream Initiation (0xc):

    The parameter field shall contain the initiation Tag of the
    destination endpoint (see section 3.1), the Stream Identifier,
    and the Initial Sequence Number of this stream. Also, there shall
    be a "Size of Stream Info" and "Stream Information" fields that
    may contain an opaque user data structure specific to the stream
    being opened. The "Stream Information" field should be padded with
    '0's to 32 bit word boundary before transmission.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           Init-Tag                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Stream Identifier     |       Reserved (set to 0)     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Size of Stream Info = N                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    /                                                               /
    \                Stream Information (N octets)                  \
    /                                                               /
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2.2.11 Stream Initiation Ack (0xd):

    The parameter field shall contain the Stream Identifier.




Stewart, et al                                                  [Page 10]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Stream Identifier   |       Reserved (set to 0)       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2.2.12 Stream Termination (0xe):

    The parameter field shall contain the initiation Tag value (see
    section 3.1) and the Stream Identification

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Init-Tag                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Stream Identifier   |       Reserved (set to 0)       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2.2.13 Stream Termination Ack (0xf):

    Same has that of Stream Initiation Ack.


2.3 MDTP Data Part Definitions

The following format shall be used for MDTP datagram Data Part:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           TSN Seen                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           TSN Send                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Stream Identifier N      |    Sequence Number n          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \                                                               \
   /                 User Data (seq n of Stream N)                 /
   \                                                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


    TSN Seen: 32 bits

      This is a piggy-backed acknowledgment, indicating the reception
      of datagrams up to this TSN.

    TSN Send: 32 bits

      This value represents the TSN of the user data carried in this
      datagram.


Stewart, et al                                                  [Page 11]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

    Stream Number: 16 bits

      Identify the stream to which the following user date belongs.

    Sequence Number: 16 bits

      This value presents the sequence number of the following user
      data within the stream.

      Sequence number 0x0 indicates that the following user data shall
      be treated as un-ordered, and shall be dispatched to the upper
      layer by the receiver without any attempt of re-ordering.

    User Data: variable length

      This is the payload user data. The size of the user data shall
      be specified in the Data Size field. The implementation may
      optionally have some '0' padded at the end of User Data field.


3. Endpoint Association Initialization

Before the first data transmission can take place from one endpoint
("A") to another endpoint ("Z"), the two endpoints must complete an
initialization process in order to set up an association between them.

The upper layer may explicitly request MDTP to initialize an
association to an endpoint, or implicitly open the association by
sending the first datagram to that endpoint on stream 0.

Once the association is established, the global stream, i.e., stream
0, is automatically open and ready for datagram transmission. Other
streams must be explicitly opened before data transmission can occur.

A tag-and-lock mechanism must be employed during the initialization
in order to guard against security attacks as well as erroneous
datagrams.


3.1 Initiation Message and Tag Lock

The initialization process consists of the following steps (assuming
that MDTP endpoint "A" tries to set up an association with MDTP
endpoint "Z"):

A) "A" shall first send an Initiation message to "Z", with Tag Seen
   field set to 0x0 and Tag Send field set to Tag_A, where Tag_A shall
   be a random number in the range of 0x80000000 to 0xffffffff (see
   3.1.4 for Tag value selection), and enter the Tag-lock mode.

B) "Z" shall respond immediately with an Initiation Ack message, with
   Seen set to Tag_A and Send set to Tag_Z (same range as Tag_A), and
   enter the Tag-lock-new mode.


Stewart, et al                                                  [Page 12]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

   At this point, "Z" is ready to send user datagrams to "A" in stream
   0. And upon the reception of the above Initiation Ack from "Z", "A"
   also becomes ready to send user datagrams to "Z" in stream 0.

   Note: user data in other streams can not be sent until the
   respective streams are opened.

C) "Z" shall leave Tag-lock-new mode and enter Tag-lock mode only if a
   user datagram has been sent out from "Z" to "A".

   Note: to guard against "man in the middle" attacks, a limit should
   be imposed on the number of associations in the Tag-lock-new mode
   at any given endpoint; whenever that limit is reached, any further
   association Initiations received by the endpoint shall be silently
   discarded. Also, a timer shall be used on each association that is
   in the Tag-lock-new mode; at the expiration of that timer, that
   association shall be shutdown by the endpoint.

Note: no user data shall be carried in both the Initiation and
Initiation Ack messages, i.e., the C/D bits must be set to 10.

Note: both side must exchange their local network information and
their maximal window length in the Initiation and Initiation Ack
messages.


3.2 Tag Unlock and TSN Initialization

The first user datagram transmitted by "A" to "Z" shall have the TSN
Seen value set to Tag_Z in the Data Part (see 2.3).

Similarly, the first user datagram transmitted by "Z" to "A" shall
have the TSN Seen value set to Tag_A.

The reception of this first datagram with user data and with the
correct Tag value in the TSN Seen field from its peer shall unlock the
Tag and cause the endpoint to leave the Tag-lock or Tag-lock-new mode.

The receiver shall immediately send back an Extended Data Ack to
acknowledge the reception of this first user datagram.

The TSN Send value carried in this first datagram with user data shall
be used to establish the initial TSN of this peer, i.e., the sender of
this datagram.

To strengthen the security, this initial TSN shall be randomly
selected from the range between 0x1 and 0x7fffffff by the sender, by
means such as those suggested in RFC 1750 [9].

Note: if there exists any un-acked datagram(s) when an endpoint is to
send its first user datagram to its peer, the endpoint MUST send a
stand-alone Extended Data Ack to acknowledge the un-acked datagram(s)
it has received from that peer before it sends out its first user
datagram. This is because the TSN Seen field in the first out-bound

Stewart, et al                                                  [Page 13]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

user datagram can not be used as a TSN ack, instead it is used to
carried the peer's Tag.


3.3 Datagram Processing during Tag Lock

In Tag-lock or Tag-lock-new mode, an endpoint shall silently discard
any user datagrams from the peer endpoint that does not carried the
correct Tag value.

However, if there is a control part present in a discarded user
datagram (i.e., C/D = 0x11), the endpoint shall always process the
control part even when the data part is being discarded.

If another Initiation from "A" is received by "Z" after "Z" sent out
its Initiation Ack, "Z" shall respond to this second Initiation by
re-sending the Initiation Ack if the Tag Send field of this second
Initiation has the same value as that of the original Initiation.
Otherwise, "Z" shall respond by sending an Initiation of its own, with
Tag Send field set to Tag_Z, so as to elicit an Initiation Ack from
"A".


3.4 An Example of Association Initialization

In the following example, "A" initiates the association first and then
sends a user datagram to "Z", then "Z" sends two user datagrams
sometimes later:

Endpoint A                                          Endpoint Z

{app sets association with Z}
Initiation(C/D = 10)
   [Tag Seen=0,Tag Send=Tag_A
    & net addr info] --------\
(Start T1-init timer)         \
(Enter Tag_A-lock mode)        \---->Initiation Ack(C/D = 10)
                                       [Tag Seen=Tag_A,Tag Send=Tag_Z
                                /----   & net addr info]
                               /     (Enter Tag_Z-lock-new mode)
(Cancel T1-init timer)<-------/

{app sends 1st user data; strm 0}
U-Data(C/D = 01)
   [Seen=Tag_Z,Send=init TSN-A
    Strm=0,Seq=1,
    & user data]      -------\
(Start T3-send timer)         \
                               \---->(Leave Tag_Z-lock-new mode)
                               ------Ext Data Ack(C/D = 10)
                              /        [Gap=0,TSN Seen=init TSN-A]
(Cancel T3-send timer) <-----/
..


Stewart, et al                                                  [Page 14]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

                                     ..
                                     {app sends 2 datagrams;strm 0}
                               /---- U-data(C/D = 01)
                              /        [Seen=Tag_A,Send=init TSN-Z
(Leave Tag_A-lock mode) <----/          Strm=0,Seq=1,
(Start T2-receive timer)                & user data 1]
                               /---- U-data(C/D = 01)
                              /        [Seen=init TSN-A,
                             /          Send=init TSN-Z +1,
                        <---/           Strm=0,Seq=1,
                                        & user data 2]



If T1-init timer expires at "A" after the Initiation is sent, the same
Initiation message with the same Tag_A value shall be retransmitted and
the timer restarted. This shall be repeated Max.Init.Retransmit times
before "A" considers "Z" unreachable and optionally reports the
failure.


3.5 Other Initiation Issues

3.5.1 Selection of Tag Value

Tag values should be selected from the range of 0x80000000 to
0xffffffff. It is very important that the Tag value be randomized to
guard against "man in the middle" and "sequence number" attacks. It is
suggested that RFC 1750 [9] be used for the Tag randomization.


3.5.2 Initiation from behind a NAT

When a NAT is present between two endpoints, the endpoint that is
behind the NAT, i.e., one that does not have a publicly available
network address, shall take one of the following options:

A) Indicate that it has only one network by setting the 'Number of
   networks' field in the Initiation message to 0. This will make the
   endpoint that receives this Initiation message to consider the sender
   as only having that one address. This method can be used for a dynamic
   NAT, but any multihoming configuration at the endpoint that is behind
   the NAT will not be visible to its peer, and thus not be taken
   advantage of.

B) Indicate all of its networks in the Initiation by specifying all
   the actual IP addresses and ports that the NAT will substitute for the
   endpoint. This method requires that the endpoint behind the NAT must
   have pre-knowledge of all the IP addresses and ports that the NAT will
   assign.





Stewart, et al                                                  [Page 15]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

3.5.3 Initialization Collision

If two endpoints attempt to initialize an association with each other
at about the same instance, a collision will occur. As a result, each
side will receive an Initiation datagram from the other side after it
transmitted its own. In such a case, both sides shall send an
Initiation Ack datagram to the other side using the procedure
described above.


3.5.4 Association Re-initialization

An endpoint shall be allowed to re-initialize an established
association with the other endpoint.

Once an endpoint has left the Tag-lock or Tag-lock-new mode of the
previous association initialization process, it shall treat any new
Initiation message from its peer as a re-initialization event.

During a re-initialization, both endpoint shall follow the same
procedure as defined in section 3.1. And a new Init-Tag must be used
by the endpoint that receives the Initiation message if it has already
left the previous Tag-lock or Tag-lock-new mode.


4.  Transfer User Datagram

The receiver of a user datagram shall always acknowledge the reception
to the sender of the datagram. Normally, delayed acknowledgment shall
be used. The delay shall be controlled by a T2-receive timer.

At the expiration of T2-receive timer, if there is out-bound user data,
the ack should be piggy-backed on the data part of the out-bound user
datagram, occupying the TSN Seen field (see section 2.3). Otherwise, a
stand-alone Extended Data Ack shall be used to carry the
acknowledgment.

When Extended Data Ack is used, the sender shall fill the Last TSN
Seen field to indicate the highest TSN Send number it has received
from the peer. Any detected gaps must also be reported
(see section 4.5).

The following example illustrates both stand-alone and piggy-backed
acknowledgments:

Endpoint A                                      Endpoint Z
{App sends 3 messages in strm 0}
U-Data(C/D = 01)
   [Seen=5,Send=7,Strm=0,Seq=3]--------> (Start T2-receive timer)
(Start T3-send timer)

U-Data(C/D = 01)
   [Seen=5,Send=8,Strm=0,Seq=4]-------->


Stewart, et al                                                  [Page 16]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


U-Data(C/D = 01)
   [Seen=5,Send=9,Strm=0,Seq=5]-------->
                                         ...
                                         {Timer T2 expires}
                              /--------- Extended Data Ack(C/D=10)
                             /             [Gap=0,Seen=9]
(cancel T3-send timer) <----/
...
...
{App sends 1 message; strm 0}
U-Data(C/D = 01)
   [Seen=5,Send=10,Strm=0,Seq=6]-------> (Start T2-receive timer)
(Start T3-send timer)
                                         ...
                                         {App sends 1 message; strm 1}
                                         (cancel T2-receive timer)
                                 /------ U-Data (C/D=01)
                                /          [Seen=10,Send=6,Strm=1,Seq=2]
                               /         (Start T3-send timer)
(cancel T3-send timer) <------/
(Start T2-receive timer)
..
{Timer T2 Expires}
Extended Data Ack(C/D=10)
   [Gap=0,Seen=6]----------------------> (cancel T3-send timer)


4.1 Timer Management Rules

The the following rules shall be used to manage the timers during
normal datagram transfer, unless otherwise stated for some special
cases:

A) When a user datagram is received, the endpoint shall start a
   T2-receive timer if no T2-receive timer is currently running. Upon
   the expiration of the T2-receive timer, the endpoint shall
   acknowledge to the sender all the un-acked user datagrams it has
   received.

B) When a user datagram is sent out, the sending endpoint shall start
   a T3-send timer if no T3-send timer is currently running.

   If the T2-receive timer is running, the endpoint shall first stop
   the T2 timer, piggy-back an ack (or Extended Data Ack) unto the
   out-bound datagram, and then start a T3-send timer.

   If the T3-send timer expires, the endpoint shall follow the rules
   described in 4.6 for possible re-transmission of the un-acked
   datagrams.

   Moreover, whenever the T3-send timer is started the RTT estimate
   last calculated for that remote network address should be added to
   the base T3-send timer value (see sections 6.2 and 6.3 for RTT).

Stewart, et al                                                  [Page 17]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


C) When all outstanding datagrams are acknowledged, the T3-send timer
   shall be stopped if one is still running.

D) If an endpoint has a T3-send timer running and receives a partial
   acknowledgment (one that acknowledges some of the outstanding
   datagrams) then the endpoint shall restart the T3-send timer.

The following example shows the use of various timers.

Endpoint A                                         Endpoint Z
{App sends 2 messages; strm 0}
U-Data (C/D=01)
  [Seen=5,Send=7,Strm=0,Seq=3] ---------> (Start T2-receive timer)
(Start T3-send timer)

U-Data (C/D=01)                           {App sends 1 message; strm 1}
  [Seen=5,Send=8,Strm=0,Seq=4] -\     /-- (cancel T2-receive timer)
                                 \   /    U-Data (C/D=01)
                                  \ /       [Seen=7,Send=6,Strm=1,Seq=2]
                                   \      (Start T3-send timer)
                                  / \
(Re-start T3-send timer) <-------/   \
(Start T2-receive timer)              \
...                                    -> (Start T2-receive timer)
...
{T2-receive timer expires}
Extended Data Ack(C/D=10)
  [Gap=0,Seen=6] -----------------------> (Cancel T3-send timer)
                                          ..
                                          {T2-receive timer expires}
(Cancel T3-send timer)  <---------------- Extended Data Ack(C/D=10)
                                            [Gap=0,Seen=8]

4.1.1 T3-send Timer Adjustment with RTT

If the RTT measurement is available to a remote IP address, the sender
shall adjust the T3-send timer each time when sending datagrams to
that IP address. The calculation and adjustment of the timer should
follow the method described in [4]. RTT measurement shall be tracked
for each destination IP address if the remote host is multi-homed.

MDTP defines three methods to obtain RTT measurements, see sections
4.7, 6.2, and 6.3.


4.2 Multihoming Rotation

4.2.1 Remote Multihoming Rotation

When an endpoint is transmitting to a remote multi-homed endpoint, the
transmitting endpoint shall rotate between destination IP addresses.
Every time the application transmits a datagram, MDTP MUST keep track
of the remote IP address to which it sent the datagram in the MDTP

Stewart, et al                                                  [Page 18]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

protocol variable 'last.sent.intf'. MDTP should rotate each send in a
round robin fashion amongst all available destination IP addresses on
the remote multi-homed host and should update the protocol variable
'last.sent.intf' to indicate which destination IP address it last
used.

If possible, acks should be transmitted to the same IP address from
which the acked messages were received.  When acknowledging multiple
messages, this may not be possible.  In the latter case, MDTP SHOULD
rotate the transmission of acknowledgments to all remote IP addresses.

The MDTP implementation MUST allow an application to override this
rotation by specifying the destination IP address to which to send a
datagram.  The implementation must also provide an interface to add
and remove a remote IP address from rotation eligibility.


4.2.2 Local Multihoming Rotation

As discussed in section 3.3.4 of RFC 1122, an endpoint MAY rotate
transmitted messages amongst all local network interfaces by
specifying the local IP address and UDP port or it may allow the
networking protocol to decide which local IP address (and network
interface) to use to transmit a datagram..

If possible, acks should be transmitted from the same IP address over
which the acked messages were received. When acknowledging multiple
messages, this may not be possible. In the latter case, MDTP SHOULD
rotate the transmission of acknowledgments from all configured IP
address/port pairs.


4.3 Stream Sequence Number

The datagram stream sequence number shall always be set to 1 when the
stream is opened.

Also, when the stream sequence number reaches the value 0xffff the
next sequence number shall be set to 1. Sequence number '0' has
special meaning (see section 4.4) and shall not be used in normal
sequence number rotation..


4.4 Ordered and Un-ordered Delivery

Normally, the receiver shall ensure the user datagrams within any
given stream be delivered to the upper layer according to the order of
their stream sequence number. If there are datagram arrived out of
order of their stream sequence number, the receiver must hold the
received datagrams from delivery until they are re-ordered.

However, a sender can set the stream sequence number of a user
datagram to 0, to indicate that no ordering shall be performed on that
datagram within that stream. Upon the reception of the datagram, the

Stewart, et al                                                  [Page 19]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

receiver must by-pass the ordering mechanism and immediately delivery
the datagram to the upper layer.

This provides an effective way to transmit "out-of-band" data in any
given stream. Also, a stream can be used as an "un-ordered" stream by
simply setting the stream sequence number of each out-bound user
datagram to 0.


4.5 Report Missing Datagrams

MDTP uses a receiver-based retransmission policy, where the sender
attempts to elicit from the receiver information on the missing
datagrams before the retransmission.

If a receiver detects holes in the received user datagram sequence (by
examining TSN Send numbers), an Extended Data Ack with gap reports
shall be sent back to inform the sender so that the missing datagrams
can be re-transmitted.

Multiple gaps can be indicated in one single Extended Data Ack.

If there is out-bound user data, the endpoint shall piggy-back the
Extended Data Ack with the user data in the same MDTP datagram, by
setting the C/D bits to '11'. And the TSN Seen field in the data part
shall not be used, i.e., the sender shall set the field to 0x0 and the
receiver shall ignore it.

The following example shows the use of gap report in an Extended Data
Ack.

Endpoint A                                    Endpoint Z
{App sends 3 messages; strm 0}
U-Data (C/D=01)
   [Seen=3,Send=6,Strm=0,Seq=2]-------> (Start T2-receive timer)
(Start T3-send timer)

U-Data (C/D=01)
   [Seen=3,Send=7,Strm=0,Seq=3]-----X (lost)

U-Data (C/D=01)
   [Seen=3,Send=8,Strm=0,Seq=4]-------> (A gap detected in data)
                                        ..
                                        {T2-receive timer expires}
                                /------ Extended Data Ack (C/D = 10)
                               /          [Gap=1,Strt=7,End=7,Seen=8]
(Prepare retransmission) <----/


In this example, when "Z" receives the third datagram from "A" it
realizes that a gap exists in the received data. At the expiration of
T2-receive timer, "Z" sends an Extended Data Ack with a gap report to
"A" to indicate the missing datagram. Note that the Start and End
fields in the gap report specify the edges of the gap, i.e., the TSN

Stewart, et al                                                  [Page 20]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

numbers between Start and End are missing.

When the peer endpoint is multi-homed, the Extended Data Ack should be
sent out to the destination IP address specified in the MDTP protocol
variable 'last.good.intf'. The value of 'last.good.intf' is always
updated to point to the source IP address from which the last datagram
from the peer endpoint arrived.


4.6 Range Check on TSN

For security reasons, the receiver must check the range of the TSN
Send value in each received user datagrams.

Assume that the highest TSN received from a peer is T and the maximal
window length of the same peer is W (exchanged during association
initiation, see section 3.1). When the next user datagram arrives from
this peer, the receiver shall silently discard the datagram if the TSN
Send value carried in the datagram is greater than T+W (calculation
rounds up at 0x7fffffff to 0x1).


4.7 Advisory Ack Request

An endpoint may use Advisory Ack Requests to improve bandwidth
utilization.

The endpoint should send an Advisory Ack Request to its peer when it
reaches half of its current window length, and also when it detects
that the next send will reach the full window length (see section 5.1
for window control).

Upon the reception of an Advisory Ack Request, when it is not under
flow control condition the peer endpoint should immediately
acknowledge all the datagrams it has received but not yet
acknowledged, and then cancel the T2-receive timer if one is still
running. Otherwise, the peer endpoint shall take no action and ignore
the Advisory Ack Request.

The following shows an example of using Advisory Ack Request:

Endpoint A                                      Endpoint Z
{App sends 3 messages; strm 0}
U-Data(C/D = 01)
   [Seen=5,Send=7,Strm=0,Seq=3]-------------> (Start T2-recv timer)
(Start T3-send timer)

U-Data(C/D = 01)
   [Seen=5,Send=8,Strm=0,Seq=4]----------->






Stewart, et al                                                  [Page 21]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


{detects window half full, use Advisory Ack Req}
Adv Ack Request(C/D=11)
   [Seen=5,Send=9,Strm=0,Seq=5]------\
                                      \
                                       \----> (cancel T2-receive timer)
                            <---------------- Extended Data Ack(C/D=10)
                                                 [Gap=0,Seen=9]

An endpoint sending an Advisory Ack Request may also use this request
for its RTT calculation. The sending endpoint may note the time mark
when sending the datagram with the Advisory Ack Request.  When the
peer endpoint responds with an Extended Data Ack, the sender of the
Advisory Ack Request may use the time mark of the arriving Extend Data
Ack and the stored time mark to calculate the RTT as defined in
[4]. However, the sender of the Advisory Ack Request shall abandon the
RTT calculation if more datagrams are sent to its peer and no Extended
Data Ack is received.


5   Congestion Controls

Several different mechanisms shall be used jointly to achieve
congestion control in MDTP. These mechanisms are always used in regard
to the association, not a individual stream.


5.1 Send with Window Control

The sending endpoint shall use a transmission window to control the
number of outstanding datagrams, i.e., datagrams that have been sent,
but yet to be acknowledged. The length of the window is defined as the
maximal number of outstanding datagrams a sending endpoint can
allow. This length is adjusted dynamically, depending on the current
number of successful transmissions as well as the number of lost
datagrams or retransmissions.

When the number of outstanding datagrams reaches the current window
length, the endpoint shall still accept send requests from its upper
layer, but shall transmit no more datagrams until some or all of the
outstanding datagrams are acknowledged. The endpoint may also elect
to queue only a specified number of datagram when the window is full.
When this maximal number of queued datagrams is reached the endpoint
shall return an error to its upper layer.

Moreover, when the window length is reached, the next send request
from the upper layer will trigger a Window-up message to be
transmitted. Upon receiving this Window-up the receiver must respond
with a Window-up Ack, as illustrated by the following example
(assuming current window length is 3):





Stewart, et al                                                  [Page 22]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


Endpoint A                                      Endpoint Z
{App sends 3 messages, strm 0}
U-Data(C/D = 01)
   [Seen=5,Send=7,Strm=0,Seq=3]--------> (Start T2-receive timer)
(Start T3-send timer)

U-Data(C/D = 01)
   [Seen=5,Send=8,Strm=0,Seq=4]-------->

U-Data(C/D = 01)
   [Seen=5,Send=9,Strm=0,Seq=5]-------->

{App sends a new message, strm 1}
(queue new message and send Win-up)
Window-up(C/D = 10)     ---------------> (cancel T2-recv timer)
                                   /---- Window-up Ack(C/D = 10)
                                  /         [Gap=0,Seen=9]
(Cancel T3-send timer)  <--------/
U-Data(C/D = 01)
   [Seen=5,Send=10,Strm=1,Seq=2]-------> (Start T2-receive timer)
(Start T3-send timer)

In the above example, after the transmission of the first three
datagrams, "A" reached its window length. The next message from the
user triggered a Window-up that was sent to "Z". The Window-up shall
contain no user data. In response, "Z" cancelled timer T2 and
immediately sent a Window-up Ack. The arrival of this Window-up Ack
effectively resolved all the outstanding datagrams at "A", thus
allowing "A" to send out the next datagram.


5.1.1 Window Length Adjustment

The window length shall be initially set to 2, and shall then be
dynamically adjusted based on datagram loss and acknowledgment.

If the current window length is less than or equal to 4, every time
the number of consecutive outstanding datagrams acknowledged in a
single ack is equal to or greater than half the current window length,
the sender's window length shall be raised by 1, until it reaches
'Max.Outstanding.dg'(which should be a user configurable parameter).

If the current window length is greater than 4, every time the number
of consecutive outstanding datagrams acknowledged in a single ack is
equal to or greater than 4, the sender's window length shall be raised
by 1, until it reaches 'Max.Outstanding.dg'.

In the following circumstances, the sender's window length shall be
decreased. However, when the window length reaches 2 it shall not be
decreased any further.

The peer endpoint may report reception gaps which may correspond to
multiple datagram losses (indicated by an Extended Data Ack or

Stewart, et al                                                  [Page 23]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

Window-up Ack). If between 1 to 3 datagrams are lost, the window
length shall be decreased by 1. If between 4 to 7 datagrams are lost,
the window length shall be decreased by 2. If 8 or more datagrams are
lost, the window length shall be decreased by 4.

Any time a Window Up is sent to the receiving endpoint the sender's
window length shall be decreased by 1. Also, if a timeout forces a
retransmission the sender's window length shall be reduced to half of
its currently value.

The following table summarizes these rules:

-----------------------------------------------------------------
  duplicate ack received by sender  | Adjust down by 4
-----------------------------------------------------------------
  8 or more datagrams lost          | Adjust down by 4
-----------------------------------------------------------------
  4 to 7 datagrams lost             | Adjust down by 2
-----------------------------------------------------------------
  1 to 3 datagrams lost             | Adjust down by 1
-----------------------------------------------------------------
  Timeout forced retransmission     | Adjust down by 1/2 of the
                                    | current window.
-----------------------------------------------------------------
  Window up sent                    | Adjust down by 1
-----------------------------------------------------------------
  4 or more consecutive datagrams   | Adjust up by 1
  acknowledged (window length > 4)  |
-----------------------------------------------------------------
  1/2 Window length or more acked   | Adjust up by 1
  (window length <=4)               |
-----------------------------------------------------------------


5.2 Send Timer Back-off at Re-transmission

Whenever a T3-send timer expires, the endpoint shall re-transmit the
un-acked datagram that has the highest TSN Send value in that and
re-start the T3-send timer, unless:

A) If the current window length is reached, a Window-up message shall
   be sent out (see section 5.1), or

B) If the current window length is not reached and there is still user
   data pending for transmission, a new datagram with user data shall
   be sent out and T3-send timer shall be restarted.

When the T3-send timer is re-started at a retransmission, the length
of the timer shall be doubled from its previous value. Also, the
latest estimated RTT value for that network should be added to the new
timer value. The following shows the calculation of T3-send timer
value, where 'TL3-default' is a configurable protocol parameter.



Stewart, et al                                                  [Page 24]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

<at normal transmission>
  1. TL3-value = 'TL3-default'
  2. T3-send = TL3-value + RTT

<at re-transmissions>
  1. TL3-value = TL3-value * 2
  2. T3-send = TL3-value + RTT

The T3-send timer at the sender shall be restored to its default value
when a datagram is received from the peer endpoint.

The total number of consecutive re-transmissions to all destination IP
addresses in an association shall be recorded. If this value exceeds
the limit defined in 'Max.Retransmit', the sending endpoint shall
consider the peer endpoint unreachable and shall stop transmitting any
more data to it. The sending endpoint MAY report the failure to the
upper layer, including all datagrams in its out-bound buffer which
have not been acknowledged. Whenever a datagram is received from a
peer endpoint the total number of retransmissions shall be cleared.


6.  Network Management


6.1 Failure Detection in Redundant Networks

When the peer endpoint is multi-homed, the re-transmission of a
datagram should be attempted to the destination IP address specified
in the MDTP protocol variable 'last.good.intf'. The value of
'last.good.intf' is always updated to point to the source IP address
from which the last datagram from the peer endpoint arrived.

The number of consecutive T3-send timeout events is also recorded in
a variable 'retran.count' for each destination IP address. This count
should be incremented when a T3-send time-out event occurs for that
destination IP address. Every time a datagram is received from a peer
endpoint, the receiving endpoint shall reset to 0 the 'retran.count'
corresponding to the source IP address .

If the value in 'retran.count' exceeds half of the value of the
protocol parameter 'Max.Retransmit', the destination IP address shall
be reported to the upper layer as out-of-service and shall be removed
from eligibility for rotation.  When re-transmitting a datagram, the
re-transmission should use 'last.good.intf' as the preferred
destination IP address to which to re-transmit, unless 'last.good.intf'
points to the destination IP address on which the original T3-send
time-out event occurred.

In the event that a datagram is received from an IP address that has
been reported as out-of-service, the 'retran.count' shall be cleared
as specified above, the destination IP address shall be reported as
in-service to the upper layer, and the destination IP address shall be
considered valid for rotation.


Stewart, et al                                                  [Page 25]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


6.2 RTT Measurement

On occasions an endpoint of an association may need to perform an RTT
measurement of the network (or one of the redundant networks) between
itself and its peer.

RTT-request and RTT-ack messages shall be used to perform the RTT
measurement. In the messages, two 32 bit long opaque integers are used
in the control parameter field to carry the time value.

At the request of its upper layer, an endpoint shall initiate an RTT
measurement by sending an RTT-request (to a specific network if
redundant networks exist). The sender shall also place in Time value 1
and Time value 2 the value of the current time mark.

Upon the reception of this RTT-request message, the recipient shall
immediately respond with a RTT-ack to the sender (over the same
network on which the RTT-request arrives if the recipient is
multi-homed), with the time mark carried in the original RTT-request
copied into its own Time value fields.

Upon the reception of this reply, the sender shall use the time mark
in the reply RTT-ack to calculate the RTT (to the specific destination
IP address if redundant networks exist) as defined in [4].

Endpoint A                                      Endpoint Z
{RTT - Request Now=x.y}
RTT-request (C/D=10)
   [Time-value1=x,
    Time-value2=y,
    Seen=81]       ----------------------->
                                    /------- RTT-ack (C/D=10)
                                   /            [Time-value1=x,
                                  /              Time-value2=y,
                                 /               Seen=3]
(Endpoint A uses     <----------/
 x.y to calculate RTT)


6.3 Network Heart Beat


At the request of its upper layer, an endpoint shall enable heart beat
to a specific peer with which it has an established association.

The RTT-request message defined in section 2.2 shall be used as
the heart beat while the RTT-ack shall be used as the heart beat
response.

After having heart beat enabled, the endpoint shall transmit a heart
beat to that specific peer and start a T5-heartBeat timer. The peer
shall immediately respond to the heart beat in the same manner as the
RTT measurement procedure described in section 6.2. This response, as
well as the new RTT measurement, shall be stored by the endpoint.

Stewart, et al                                                  [Page 26]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


When the T5-heartBeat timer expires, the endpoint shall first check if
the previous heart beat has been responded to (on the same network it
was sent in the case of multi-homed hosts). If not, the destination IP
address to which the last heart beat was sent shall have the
'retran.count' incremented and checked following the rules described
in section 6.1. Then, the endpoint shall send another heart beat and
re-start the T5-heartBeat timer.

In the case where one or both endpoints are multi-homed, the sending of
Heart beats shall follow the network rotation rules outlined in
section 4.2.

If, before the expiration of T5-heartBeat timer, a datagram is
received by the endpoint, the T5-heartBeat timer shall be stopped and
the appropriate T2-receive timer shall be started. In other words, the
T5-heartBeat timer has the lower precedence than the T2-receive timer.

When there are no datagrams to send and no other timers are running,
the T5-heartBeat timer shall be started and the above procedure shall
continue.

The suggested interval for T5-heartBeat timer is 4000 ms, and may be
dynamically adjusted by adding the current RTT measurement if it is
available.


7.  Termination of Association

Before an endpoint terminates itself, it shall send an Abort message
to each of its peer endpoints in all existing associations. The Abort
shall be sent without requiring an acknowledgment from the peer
endpoint. However, the sender of the Abort message MUST fill in the
peer's Init-Tag.

When the peer endpoint receives the Abort, after verifying the Tag,
the peer shall remove the sender from its record, and optionally
report the termination of the sender to its upper layer. However if
the Tag sent with the Abort message is incorrect, the peer must
silently discard the Abort message.

The following shows an example of the termination of Endpoint A:

Endpoint A
{App indicates termination}
Abort (C/D = 10)
    [Tag-X]   --------------------------------> to Endpoint X

Abort (C/D = 10)
    [Tag-Y]   --------------------------------> to Endpoint Y

Abort (C/D = 10)
    [Tag-Z]   --------------------------------> to Endpoint Z


Stewart, et al                                                  [Page 27]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


7.1 Graceful Shutdown of an Association

An endpoint in an association may decide to "graceful shutdown" the
association without completely closing it down. With graceful
shutdown, both endpoints shall remove any record and pending datagrams
associated with the association. Further communications between the
two endpoints can be resumed by going through a re-initialization
procedure (see section 3.5.4).

A Graceful Shutdown message shall be sent to the peer endpoint of the
association, and the peer shall send back an acknowledgment.  Note
that it shall be the responsibility of the endpoint that sends the
Graceful Shutdown message to assure that all the outstanding datagrams
from its side have been resolved before it initiates the graceful
shutdown procedure.

In the Graceful Shutdown message, the sender shall indicate the
highest TSN Seen it has received from the peer, as well as the
Init-Tag of the peer.

Upon the reception of the Graceful Shutdown, the peer shall first
verify that Tag value contained in the Graceful Shutdown message is
valid. If the Tag is invalid, the message must be silently discarded.

The peer then shall verify, by checking the Seen numbers from the
Graceful Shutdown message, that all the out-bound datagrams have
reached the destination.  Otherwise, the peer shall re-transmit all
lost datagrams.

After sending the Graceful Shutdown, if the endpoint receives any new
user datagram it shall immediately respond with an Extended Data Ack
and re-start its T3-send timer.

The peer shall send a Graceful Shutdown Ack when all the outstanding
datagrams are acknowledged, then start a T4-shutdown timer. The
endpoint, after receiving the Graceful Shutdown Ack, must also
validate the Tag value contained in the message. If it does not match
the Tag value that unlocked the association, the message should be
silently discarded.

The following sequence shows an example of Graceful Shutdown:

    Endpoint A                                  Endpoint X
{App indicates graceful shutdown}
Graceful Shutdown (C/D=10)
   [Tag-X, Seen=10] ---------------------> (all datagrams resolved)
(start T3-send timer)            /-------- Graceful Shutdown Ack (C/D=10)
                                /             [Tag-A]
                               /           (start T4-shutdown timer)
(cancel T3-send timer) <------/            ...
(clean-up the association)                 (T4-shutdown expires)
                                           (clean-up the association)


Stewart, et al                                                  [Page 28]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

Both endpoints shall reject any new data request from their upper layers
while the graceful shutdown procedure is in progress.


8.  Stream Operations


8.1 Stream Initiation

An MDTP association between the two endpoints must be established
before any stream operation.

Except for the global stream (i.e, stream id 0), a stream shall be
initiated (opened) by the sender before any datagrams can be sent in
that stream. When a stream is no longer used, it shall be terminated
(closed) by the user. Moreover, both sides of the association shall be
able to initiate or terminate streams independently.

The sender initiates a stream by sending a Stream Initiation. In
addition to specifying the Stream Identifier, the sender must set the
Init-Tag field of the Stream Initiation to the Tag value of the peer
endpoint.

The sender shall also attach the stream-specific data, if any (usually
provided by the upper layer), with the Stream Initiation. Otherwise,
the Size of Stream Info shall be set to 0x0.

Then, the sender shall start a T3-send timer. If the T3-send timer
expires, the sender shall re-transmit the Stream Initiation.

Upon the reception of the Stream Initiation, the peer must first
verify that the correct Tag value is carried in the Init-Tag field of
the Stream Initiation. If so, the peer shall respond immediately with
a Stream Initiation Ack. Otherwise, the peer must silently discard the
Stream Initiation.

The following example shows the opening of stream 5 by "A":

    Endpoint A                                   Endpoint Z
{App Initiates stream 5}
Stream Initiation (C/D=10)
   [Tag=Tag-Z,Strm=5]   ----------------->
(Start T3-send timer)
(Cancel T3-send timer)  <----------------- Stream Initiation Ack
                                             (C/D=10) [Strm=5]


8.2 Stream Termination

An endpoint shall be allowed to terminate one of its streams by
sending a Stream Termination to the other side.

The same Tag verification process used for stream initiation shall
be applied to stream termination.

Stewart, et al                                                  [Page 29]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


The peer shall send a Stream Termination Ack in response to the Stream
Termination.

The following example shows the termination of stream 5 by "A":

    Endpoint A                                  Endpoint Z
{App closes stream 5}
Stream Termination (C/D=10)
   [Tag=Tag-Z,Strm=5] ------------------->
(Start T3-send timer)
(Cancel T3-send timer) <------------------ Stream Termination Ack
                                             (C/D=10) [Strm=5]

Received datagrams associated with a terminated stream shall be
silently discarded. It is up to the endpoint to assure that all
outstanding user datagrams in the stream are acknowledged before the
stream termination.


8.3 Other Issues with Stream Operations

When an association is re-initialized (see section 3.5.4), all existing
streams within that association will be automatically terminated.

The receiver shall silently discard any datagrams associated with a
stream which has not yet been opened or has already been terminated.


9.  Interface with Upper Layer

The upper layer protocols (ULP) shall request for services by passing
primitives to MDTP and shall receive notifications from MDTP for
various events.

The primitives and notifications described in this section should be
used as a guideline for implementing MDTP.

A) Init.MDTP primitive

This primitive allows MDTP to initialize its internal data structures
and allocate necessary resources for setting up its operation
environment. Note that once MDTP is initialized, ULP can communicate
directly with any other endpoints without re-invoking this primitive.

Mandatory attributes:

None.

Optional attributes:

The following types of attributes may be passed along with
the primitive:


Stewart, et al                                                  [Page 30]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

 o Timer selection and its operation syntax -- to indicate to MDTP
   an alternative timer the MDTP should use for its operation.
 o Initial MDTP operation mode;
 o IP port number, if ULP wants it to be specified;

B) Init.Association

This primitive allows the upper layer to initiate an association to a
specific peer endpoint. The peer endpoint shall be specified by one of
the IP address/port pairs which define the endpoint (see section 1.1).

Mandatory attributes:

 o associationID - specified as one of the IP address/port pairs of
   the peer endpoint with which the association is to be established.

Optional attributes:

 o eligibleNetList - a list of destination IP address/port pairs that
   the peer endpoint is allowed to use in its network rotation. By
   default, all destination IP address/port pairs on the peer are
   available.

C) Term.Association

Terminating an association.

Mandatory attributes:

 o associationID - specified as one of the IP address/port pairs of
   the peer endpoint with which the association is to be terminated.

Optional attributes:

None.

D) Send.Data primitive

This is the main method to send datagrams via MDTP.

Mandatory attributes:

 o data - This is the payload ULP wants to transmit;
 o size - The size of the payload in number of octets;
 o associationID - One of the IP address/port pair of the peer endpoint.
   Note that the actual destination address sent to will be determined
   by MDTP due to the network rotation, unless the current mode
   prohibits MDTP network rotation; in such a case the datagram will
   be sent to the IP address/port specified by associationID.

Optional attributes:

 o mode-flags - This indicates a new MDTP operation mode, taking effect
   immediately including the current datagram send;

Stewart, et al                                                  [Page 31]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


 o context - optional information that will be carried in the
   Send.Failure notification to the ULP if the transportation of
   this datagram fails.

 o streamID - to indicate which stream to send the data on. By
   default, the global stream will be used.

E) Receive.Data primitive

This primitive shall return the first datagram in the MDTP in-queue to
ULP, if there is one available. It may, depending on the specific
implementation, also return other informations such as the sender's
address, whether there are more datagrams available for retrieval,
etc. The behavior is undefined if no datagram is available when this
primitive is invoked.

Mandatory attributes:

 o buffer - the memory location indicated by the ULP to store the
   received datagram and other information.

Optional attributes:

   None.

F) Data.Arrive notification

MDTP shall invoke this notification on the ULP when a datagram is
successfully received and ready for retrieval.

G) Send.Failure notification

If a datagram can not be delivered MDTP shall invoke this notification
on the ULP.

The following may be optionally be passed with the notification:

 o data - the location ULP can find the un-delivered datagram.
 o context - optional information associated with this datagram (see
   D).

H) Network.Status.Change notification

When a endpoint-id is marked down (e.g., when MDTP detects a failure),
or marked up (e.g., when MDTP detects a recovery), MDTP shall
invoke this notification on the ULP.

The following shall be passed with the notification:

 o endpoint-id - This indicates the IP address/port of the
   peer endpoint affected by the change;
 o new-status - This indicates the new status.


Stewart, et al                                                  [Page 32]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

I) Communication.Up notification

This notification is used when MDTP becomes ready to send or receive
datagrams, or when a lost communication to an endpoint is restored.

The following shall be passed with the notification:

 o status - This indicates what type of event that has occurred;
 o associationID - An IP address/port to identify the peer endpoint;

J) Communication.Lost notification

When MDTP loses communication to an endpoint completely or detects
that the endpoint has performed a abort or graceful shutdown
operation, it shall invoke this notification on the ULP.

The following shall be passed with the notification:

 o status - This indicates what type of event that has occurred;
 o associationID - An IP address/port number to identify the peer
   endpoint;
 o packets-enqueue - The number and location of un-sent datagrams
   still holding by MDTP;
 o last-acked - the sequence number last acked by that peer endpoint;
 o last-sent - the sequence number last sent to that peer endpoint;

K) Change.Network.Rotation primitive

When the upper layer wants to inform MDTP to make a specific network
eligible or ineligible for in network rotation, the upper layer will send
this primitive to MDTP.

Mandatory attributes:

 o  action - This indicates if the network is to be made eligible or
             ineligible for network rotation.
 o  network-id - This is the IP address/port of the peer endpoint to
    be added or removed from network rotation consideration.

L) Open.Stream primitive

This should be used by the upper layer to open a new stream.

Mandatory attributes:

 o associationID - One of the IP address/port to identify the peer
   endpoint of the association to which the stream is to be opened. An
   association must have existed at the time of stream open.

Optional attributes:

   streamInfo - The upper layer should use this field to pass any
   stream-specific data to the other endpoint of the association.


Stewart, et al                                                  [Page 33]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999

Returned attributes:

 o The stream number that is opened.

M) Close.Stream primitive

This shall be used by the upper layer to request to close a stream.

Mandatory attributes:

 o associationID - One of the IP address/port to identify the peer
   endpoint of the association to which the stream is to be closed.

 o stream number - The stream number to identify the stream to be
   closed (this should be the number returned by the Stream.Open
   primitive on this stream).


10. Suggested MDTP Timer and Protocol Parameter Values

The following are suggested timer values for MDTP:

T1-init Timer       -   160 ms
T2-receive Timer    -    20 ms
T3-send Timer       -   160 ms (TL3-default)
T4-shutdown Timer   -   300 ms
T5-heartBeat timer  -  4000 ms (TL5-default)

The following protocol parameters are recommended:

Max.Outstanding.dg      - 20 messages
Max.Retransmit          - 10 attempts
Max.Init.Retransmit     -  8 attempts
Min.Mcast.Time.To.Reset -  5 seconds
Num.Of.Mcast.Reset.Msg  -  5 messages


11. Acknowledgments

The authors wish to thank Brian Wyld, A. Sankar, Henry Houh, Gary
Lehecka, Ken Morneault, Lyndon Ong, Greg Sidebottom and others for
their very valuable comments.


12.  Authors' Addresses

Randall R. Stewart                          Tel: +1-847-632-7438
Cellular Infrastructure Group               EMail: stewrtrs@cig.mot.com
Motorola, Inc.
1475 W. Shure Drive, #2C-6
Arlington Heights, IL 60004
USA



Stewart, et al                                                  [Page 34]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


Qiaobing Xie                                Tel: +1-847-632-3028
Cellular Infrastructure Group               EMail: xieqb@cig.mot.com
Motorola, Inc.
1501 W. Shure Drive, #2309
Arlington Heights, IL 60004
USA

Suheel Hussain                              Tel: +1-919-472-2312
Cisco Systems Inc.                          EMail:ssh@cisco.com
7025 Kit Creek Road
Research Triangle Park, NC  27709

Chip Sharp                                  Tel: +1-919-851-2085
Cisco Systems Inc.                          EMail:chsharp@cisco.com
7025 Kit Creek Road
Research Triangle Park, NC  27709

Hanns Juergen Schwarzbauer                  Tel: +49-89-722-24236
SIEMENS AG
Hofmannstr. 51
81359 Munich,  Germany
EMail: HannsJuergen.Schwarzbauer@icn.siemens.de

Tom Taylor                                  Tel: +1-613-736-0961
Nortel Networks                             EMail:taylor@nortelnetworks.com
1852 Lorraine Ave.
Ottawa Ontario Canada
K1H6Z8

Ian Rytina                                  Tel:
Ericsson Australia                          EMail:ian.rytina@ericsson.com
37/360 Elizabeth Street
Melbourne, Victoria 3000, Australia


13. References

[1] Postel, J. (ed.), "Internet Protocol - DARPA Internet Program
Protocol Specification", RFC 791, USC/Information Sciences Institute,
September 1981.

[2] Postel, J., "User Datagram Protocol", RFC 768, USC/Information Sciences
Institute, August 1980.

[3] Postel, J. (ed.), "Transmission Control Protocol", RFC 793, USC/
Information Sciences Institute, September 1981.

[4] Jacobson V., "Congestion Avoidance and Control", Proceedings of
SIGCOMM '88, pp 314-329, August 1988.

[5] Seth, T., etc. "Performance Requirements for Signaling in Internet
Telephony", Internet-Draft <draft-seth-sigtran-req-00.txt>, May 1999.


Stewart, et al                                                  [Page 35]


Internet Draft   Multi-network Datagram Transmission Protocol   June 1999


[6] Rytina, I., "Framework for Generic Common Signaling Transport
Protocol", draft-rytina-sigtran-generic-framework-00.txt>, Feb. 1999.

[7] Ashworth, J., "The Naming of Hosts", RFC 2100, April 1997.

[8] Braden, R., "Requirements for Internet hosts - Application and
Support", RFC 1122, October 1989.

[9] Eastlake 3rd, D., Crocker, S., and Schiller, J., "Randomness
Recommendations for Security", December 1994.











      This Internet Draft expires in 6 months from June 1999.































Stewart, et al                                                  [Page 36]