Internet Draft: RMTP-II Specification B. Whetten, M. Basavaiah
Document: draft-whetten-rmtp-ii-00.txt GlobalCast Communications, Inc.
Expires: Six months S. Paul
Lucent Technologies, Bell Labs
T. Montgomery
West Virginia University
N. Rastogi, J. Conlan, T. Yeh
GlobalCast Communications, Inc.
April 8, 1998
THE RMTP-II PROTOCOL
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is not appropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".
To view the entire list of current Internet-Drafts, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
(Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
(Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
(US West Coast).
Abstract
The Reliable Multicast Transport Protocol II, RMTP-II, is a reliable
multicast protocol, designed to reliably and efficiently send data
from a few senders to large groups of simultaneous recipients. It is
designed primarily for use over controlled network topologies. It
works over both symmetric networks, as well as over asymmetrical
network topologies such as those provided by satellite, cable modem
or Asymmetrical Digital Subscriber Line (ADSL) carriers. Before
sending, each sender must connect with a trusted Top Node, to receive
permission and control parameters for its data stream. The top node
provides network managers with a single point of control for the
senders, allowing them to monitor and control the traffic being sent.
RMTP-II [Page 1]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Table of Contents
1 INTRODUCTION 4
2 ENTITIES 6
2.1 Node 6
2.2 RMTP-II Tree 6
2.3 Multicast Group Address 6
2.4 Data Channels 6
2.5 Control Channels 6
2.6 Data Stream 7
2.7 StreamID 7
2.8 Packet Sequence Numbers 7
2.9 Data Queue 7
2.10 RMTP-II Packets 7
3 NODE FUNCTIONS 10
3.1 Sender Node 10
3.2 Receiver Node 11
3.3 Top Node 12
3.4 Aggregator Node 13
3.5 Designated Receiver Node 13
4 OPERATION OF THE RMTP-II PROTOCOL 14
4.1 Basic Operation of the Protocol 14
4.2 Global Tree Parameters 15
4.3 Data and Control Paths 16
4.4 Data Transmission 16
4.5 Retransmission 16
4.6 Options 17
4.7 Tree Connections 18
4.8 Flow and Congestion Control 18
4.9 Membership 20
4.10 Fault Detection and Recovery 20
4.11 Quality of Service (QoS) 21
4.12 SNMP Support 22
4.13 Session Manager Support 22
4.14 Forward Error Correction 22
5 DETAILS: 23
5.1 RMTP-II Global Parameters 23
5.2 Tree Configuration 24
5.3 Tree Membership Algorithms 26
5.4 Data Transmission 28
5.5 Data Reception 32
5.6 HACK Generation 34
5.7 Retransmissions in response to a HACK 39
5.8 NACK Generation 40
5.9 NACK Response 41
5.10 Congestion Control 41
5.11 Failure Detection and Recovery 41
5.12 Control tree discovery 44
RMTP-II [Page 2]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
5.13 Control Tree Round Trip Times 46
6 RMTP-II PACKET FORMATS 48
6.1 RMTP-II Fixed Header 48
6.2 Data 50
6.3 Retransmission Packet 52
6.4 HACK 54
6.5 JoinConfirm 56
6.6 JoinStream 58
6.7 JoinAck 59
6.8 LeaveStream 60
6.9 Heartbeat 61
6.10 HeartbeatResponse 62
6.11 NullData 63
6.12 Eject 64
6.13 EOS 65
6.14 LeaveConfirm 66
6.15 RMTP-II Options Header 67
7 ACKNOWLEDGEMENTS 81
8 REFERENCES 83
RMTP-II [Page 3]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
1 Introduction
IP Multicast (RFC 1112) provides highly efficient delivery of
information to many receivers at once, with each packet going over
any given link no more than a single time. To date, there are no
standardized, reliable multicast protocols that provide the
equivalent of TCP for IP Multicast. It is widely recognized that,
unlike TCP, no single reliable multicast protocol can meet the needs
of all application types over all network types.
The Reliable Multicast Transport Protocol II, RMTP-II, is a reliable
multicast protocol which reliably and efficiently sends data from a
few senders to large groups of simultaneous recipients. RMTP-II
works over all types of networks, including symmetric networks as
well as asymmetrical networks such as those provided by satellite,
cable modem or Asymmetrical Digital Subscriber Line (ADSL) carriers.
RMTP-II requires some configuration of topology.
The objectives of RMTP-II are guaranteed reliability, high
throughput, and low end-to-end delay on any network topology, while
providing the network manager with control over transmission traffic.
RMTP-II is based on a hierarchical tree structure in which receivers
are grouped into local regions. In each region a special control
node is responsible for maintaining receiver membership and for
aggregating the acknowledgments from the receivers in its region and
forwarding them to the sender. The control node may also take
responsibility for retransmitting dropped packets to the local
receivers.
RMTP-II is a sender-reliable multicast protocol which uses a fully
distributed membership protocol to keep track of the current
membership of the tree. This allows senders to determine when
packets become stable and may be deleted. RMTP-II provides strong
guarantees on packet delivery and gives the senders a count of the
number of receivers that successfully received each packet. It also
supports optional negative acknowledgements of missed packets for
faster recovery of data and lower control traffic on low loss
networks.
The Internet is highly dependent on the TCP congestion control
mechanisms which allow all streams to share bandwidth fairly.
Widespread deployment of a transport protocol that does not interact
gracefully with the TCP protocol has the potential to do significant
damage to the Internet. This has been an obstacle to the
standardization of a reliable multicast protocol. RMTP-II is
designed to interact closely with congestion control algorithms and
provide these algorithms with constant information about the loss
RMTP-II [Page 4]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
rates and round trip times in the tree. The RMTP-II congestion
control algorithms are under development and will be presented in a
companion document.
RMTP-II complements its congestion control algorithms with additional
centralized management control over all RMTP-II streams running on
the network. RMTP requires senders to interact with a trusted Top
Node and accept configuration information from this node. The Top
Node provides the network manager with an SNMP interface for
monitoring and controlling the streams with which it is associated.
The network manager can place bandwidth limits on each stream and can
specify the congestion control parameters that each sender must use.
This provides graceful, controlled deployment of reliable multicast
that protects the network both through explicit management and
automatic congestion control policies.
RMTP-II requires some topology configuration which is left to an
external Session Manager. A Session Manager can be implemented as
either a centralized tool controlled by the network manager, as
static configuration files, or as a set of fully distributed
algorithms. The Session Manager is also responsible for optional
session advertisements and optional total group membership tracking.
RMTP-II provides the following additional features:
- Optional, integrated Forward Error Correction (FEC)is provided.
- Receivers may join or leave a stream already in progress.
- SNMP support is furnished all nodes.
- Senders and receivers may be ejected.
- Many-to-many multicast or special upgrades to routers is not
required.
- Time Bounded Reliability is provided for bounded recovery of data
for synchronous real-time streaming applications.
- A fault tolerant top node is supported.
- Dynamic fault detection and recovery are supported.
- The rate of generation of acknowledgement traffic can be
controlled.
Appendix A contains a design rationale for RMTP-II.
RMTP-II [Page 5]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
2 Entities
2.1 Node
A node is an entity with a network address which provides
transmission, control, or reception services for the network.
There are five types of nodes: sender nodes which transmit data,
receiver nodes which receive data, and three types of control nodes
which provide acknowledgment, aggregation and communication services
for control functions. The control nodes are: aggregator nodes which
aggregate and forward control information, designated receiver nodes
which function as aggregators and also receive and retransmit data,
and top nodes which perform aggregation and other control functions
at the top of the RMTP communication tree. Node functionality is
described later.
2.2 RMTP-II Tree
An RMTP-II tree is a collection of nodes connected into a tree-like
communication network with the top node at the root. The primary
function of the tree is to efficiently distribute acknowledgement
packets.
2.3 Multicast Group Address
A multicast group address is a pair consisting of an IP multicast
address and a UDP port number. Multicast group addresses are used for
sending data to receivers, and for sending control information to
local groups.
2.4 Data Channels
A data channel is a communication path used to send data streams.
Each data channel is associated with a multicast group address used
for sending and receiving the data streams. A sender sends a data
stream on a data channel. Receivers must join the data channel to
receive the data stream.
One RMTP-II Tree supports multiple data channels. A single data
channel can be used by one or more senders to send data. A receiver
must join all data channels corresponding to the streams it wishes to
receive.
2.5 Control Channels
A control channel is a communication path used to send control
information. A control channel may be associated with a unicast IP
RMTP-II [Page 6]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
address and port or may be associated with a multicast group address.
2.6 Data Stream
A data stream is a sequence of Data packets having a unique StreamID
and a data channel.
2.7 StreamID
A StreamID is a 16-bit number, chosen either by the application that
creates the stream or by RMTP-II. Senders and receivers use the
StreamID to distinguish data streams. A sender may specify a
StreamID in the range from 32768 (2^15) to 65535 (2^16)-1. Numbers
in the range from 0 to 32768 are reserved. If a sender specifies 0
as the StreamID, then RMTP-II assigns a unique StreamID in the range
from 1 to 32768. If a sender application specifies a StreamID that
is in use, RMTP-II will not allow the sender to join the RMTP tree.
2.8 Packet Sequence Numbers
A packet sequence number is a 32 bit number in the range from 1
through 2^32 - 1 which is used to specify the sequential order of a
Data packet in a stream of Data packets. A sender node assigns
consecutive sequence numbers to the Data packets provided by the
sender application for the data stream. Zero is reserved.
2.9 Data Queue
A data queue is a buffer, maintained by a sender or a designated
receiver node, for transmission and retransmission of the Data
packets provided by the sender application. New Data packets are
added to the data queue as they arrive from the application, up to a
specified buffer limit. The admission rate of packets to the network
is controlled by congestion control algorithms. Once a packet has
been received by its target recipients, it may be deleted from the
buffer under appropriate conditions.
2.10 RMTP-II Packets
All RMTP-II packets consist of a fixed header, optional option
headers, followed by data or control information.
Data is carried by Data packets and Retransmission packets. Control
information is carried in the following types of control packets:
HACK, NACK, JoinStream, JoinAck, JoinConfirm, LeaveStream, Heartbeat,
HeartbeatResponse, NullData, Eject, EOS, and LeaveConfirm.
2.10.1 Data Packet
RMTP-II [Page 7]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
The sender application provides Data packets to the sender node. The
sender node assigns consecutive sequence numbers to the Data packets
and multicasts the packets on the data channel.
Each Data packet has a StreamID and a sequence number which identify
the packet and allow a receiver application to reconstruct the data
stream from the Data packets. A receiver learns about Data packets
that have not yet arrived by receiving a packet later in the sequence
or from a NullData packet which mentions later packets.
A Data packet is said to be "stable", if it has been acknowledged as
received by all its target recipients.
2.10.2 Retransmission Packet
Retransmitted data is sent in Retransmission packets. A separate
packet type is used for retransmission to more easily enable use of
subtree multicast when this option is eventually supported by
routers.
2.10.3 HACK Packets
A HACK packet is unicast from a child node to its parent node to
indicate the status of the Data packets which have arrived and to
furnish statistics about the state of data reception at the node.
A HACK requests retransmission of Data packets that have not been
received. A HACK also acknowledges packets that have become stable.
A packet becomes stable when it has been received by sufficient,
critical nodes so that the sender is no longer required to hold the
packet for retransmission.
2.10.4 NACK
A NACK is a HACK that is used to request immediate recovery of lost
Data packets.
2.10.5 JoinStream Packet
All nodes except the top node send a JoinStream request to join a
data stream. The first JoinStream packet implicitly joins the RMTP-
II tree.
2.10.6 JoinConfirm Packet
A parent sends a JoinConfirm after it processes a JoinStream packet.
The JoinConfirm indicates the success or failure of the JoinStream
request.
RMTP-II [Page 8]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
2.10.7 JoinAck Packet
If a control node is unable to respond immediately to a JoinStream
request by a child with a JoinConfirm packet, it unicasts a JoinAck
packet to that child.
2.10.8 LeaveStream Packet
All nodes, except the top node, send a LeaveStream request when the
node leaves the data stream.
2.10.9 Heartbeat Packet
A control node periodically sends Heartbeat packets to notify its
child nodes that it is alive.
If a child node does not receive a Heartbeat packet within a
specified time limit, it detects that the parent has failed and joins
another parent node.
2.10.10 HeartbeatResponse Packet
All nodes, except the top node, unicast a HeartbeatResponse packet to
its parent node in response to a Heartbeat packet. The
HeartbeatResponse packet indicates that the child node is still
alive.
If a parent does not receive a HeartbeatResponse packet or a HACK
packet for a specified interval of time, then it detects that the
child node has failed and removes that child from its list of child
nodes.
2.10.11 NullData Packet
If a sender node has no data to send for a stream, it periodically
multicasts a NullData packet on the data channel. NullData packets
inform receivers about the state of the data stream and the sender.
A NullData packet contains the sequence number of the last packet
sent, which allows the receivers to determine missing packets. If a
receiver does not receive any Data or NullData packets from the
sender for a specified time, then the receiver will declare the
sender failed.
2.10.12 Eject Packet
If a child node is not operating normally, or a parent node restarts
after a failure and receives a packet from a child not in its child
RMTP-II [Page 9]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
list, then the parent node unicasts an Eject packet to the child
node. This child node can be another control node, a sender, or a
receiver.
2.10.13 LeaveConfirm Packet
A parent node unicasts a LeaveConfirm packet in response to a
LeaveStream packet from a child node.
3 Node Functions
3.1 Sender Node
Stream Variables
A sender node maintains the values of variables relating to the data
stream: the size of the data queue, the packet admission rate, the
sequence number of the lowest numbered unstable packet, and the
sequence number of the highest numbered packet in the data queue.
Join Stream
A sender node sends a JoinStream request to the top node for each
stream it intends to send. The sender provides a data channel with
the JoinStream request and either provides a unique StreamID or else
requests that the top node generate a unique StreamID.
Data Transmission
A sender multicasts Data packets to the receivers on the data channel
at a sending rate determined by the flow and congestion control
algorithms. If there is no data to send for a stream, the sender
periodically sends NullData packets to indicate that the data stream
is active.
Data Queue
When a sender receives a HACK from the top node, it uses the
information to delete packets from the data stream's data queue. A
packet is deleted from the queue if that packet is stable and all
lower numbered packets are also stable.
Data Retransmission
RMTP-II [Page 10]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
A HACK or NACK indicates which packets have not been received. A
sender retransmits the missed packets on the data channel. The time
interval between successive retransmissions for a data packet is
doubled for each retransmission, with an upper limit of 64
seconds(exponential backoff).
Data retransmission has higher priority than new data transmission.
The sending rate is controlled by the flow and congestion control
algorithms.
Tree Integrity
A sender gets Heartbeat packets from the top node's multicast control
channel and sends HeartbeatResponse packets to the top node on the
unicast control channel. This allows the sender and the top node to
determine whether the other is functioning. The NullData packets
also inform the receivers that the sender is functioning.
Receivers send HACKs in response to NullData packets, which informs
the sender of the status of the receivers.
3.2 Receiver Node
Join Stream
A receiver node sends a JoinStream request to its parent node for
each of the data streams that it wishes to receive.
Data Reception
A receiver node receives the data packets on the data channels and
delivers the data to its application.
Data Reliability
A receiver delivers the Data packets to the receiver application in
accordance with quality of service specified for the data stream.
A receiver sends HACKs to its parent to indicate the status of the
Data packets received and missed.
A receiver may also send NACKs to expedite the recovery of missing
Data packets.
Tree Integrity
A receiver receives Heartbeats from its parent on the parent's
multicast control channel to insure that the parent is operating.
RMTP-II [Page 11]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
A receiver sends HACKs and NACKs to its parent for each of its active
streams to indicate its availability. If none of the streams that a
receiver has joined are active, it periodically sends a
HeartbeatResponse packet to its parent.
3.3 Top Node
Tree Creation
The top node is assigned administratively and is the core of the
tree.
Sender Control
The top node controls transmission parameters and congestion control
parameters to each sender, and can change these dynamically while the
sender is transmitting.
Data Reliability
The top node aggregates HACKs received from its children and forwards
the aggregated HACK to the sender node.
Group Membership
The top node aggregates the membership count of the receivers, and
passes this to each sender.
Data Transmission
The top node may optionally accept unicast data from senders and
provide multicast transmission to the group. This is needed if the
sender node cannot multicast data.
Stream Identification
The top node allocates new StreamID values and guarantees the
uniqueness of the StreamIDs. If any sender requires a system
generated StreamID, the top node provides a unique value. If a
sender provides a StreamID value, the top node verifies that the
value is unique. If the value is not unique, the sender is not
allowed to join the RMTP-II tree.
Tree Integrity
The top node sends Heartbeats on its multicast control channel to
notify its child nodes and senders that it is available.
RMTP-II [Page 12]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
The child nodes of the top node respond to its Heartbeats with HACKs
or Heartbeat response packets.
The sender nodes respond to the top node Heartbeats with
HeartbeatResponse packets.
3.4 Aggregator Node
Data Reliability
An aggregator node aggregates HACKs from its child nodes and sends
them to its parent node on the unicast control channel.
NACK Forwarding
An aggregator node forwards NACKs from its child nodes to its parent
node.
Retransmission Forwarding
An aggregator node forwards retransmissions from its parent to its
children.
Group Membership
The aggregator node accumulates the membership count of its
receivers, and passes this count to its parent. An aggregator node
also maintains a list of its receivers.
Tree Integrity
An aggregator sends Heartbeats on its multicast control channel to
notify its child nodes of its availability.
An aggregator receives Heartbeats from its parent to insure that its
parent is available.
An aggregator receives HACKs or HeartbeatResponse packets from its
child nodes to insure that the child modes are available.
3.5 Designated Receiver Node
Aggregation
A designated receiver node has all the functionality of the
aggregator node.
Data Retransmission
RMTP-II [Page 13]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
A designated receiver receives data from all the data streams that
are joined by its child receiver nodes. A designated receiver
buffers the data for potential retransmission to its child receiver
nodes.
4 Operation of the RMTP-II protocol
4.1 Basic Operation of the Protocol
RMTP-II provides sequenced, reliable delivery of data from a few
senders to a large group of receivers. RMTP-II consists of a network
that has one or more sender nodes, many receiver nodes and one or
more control nodes.
The simplest configuration consists of a single sender and a top node
and receivers on multiple hosts connected to the network. An
implementation could allow multiple nodes of different types to run
in a single process, allowing a host to act as a sender and a
receiver, a sender and a top node, a receiver and a designated
receiver, or other combinations.
The figure below illustrates an RMTP tree with multiple control
nodes.
HACKs
-----------> (Top Node)----------------->(Sender node)
^ ^^^ |
| / | \ |
HACKs | / | \ |
| / | \ |
| / | \ (Designated |
/ | \ Receiver |
(Aggregator / | \ node) v
nodes) A N A N D N <------------|
^^ ^^^ ^^ | Data
/ | / | \ | \ | Channel
/ | / | \ | \ |
/ | / | \ | \ v
RN RN RN RN RN RN RN <------
(Receiver Nodes)
A sender joins the RMTP tree and multicasts Data packets on the data
channel.
In the case of a symmetrical network whose control tree topology
RMTP-II [Page 14]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
congruent to the multicast routing tree topology, the data channel
may have the same path as the control channel. In this case, a data
packet would be multicast from the sender through the control nodes
to the receivers.
In the case of an asymmetrical network such as a one way satellite
network with a terrestrial return path, the receivers and designated
receivers would receive the multicast data directly, but the
aggregators and top nodes will only receive control traffic.
A receiver joins a data channel to receive data. A receiver
periodically informs its parent about the packets that it has or has
not received by unicasting a HACK packet to the parent. Each parent
node aggregates the HACKs from its child nodes and unicasts a single
aggregated HACK to its parent. The top node aggregates the HACKs
from its child nodes and unicasts a single HACK to the sender.
Each control node multicasts Heartbeat packets that inform their
child nodes that the parent node is still functioning.
A tree forms a loop from the sender to the receivers, and back to the
sender. Data and NullData regularly exercise the downward data
direction. Heartbeat packets exercise the downward control direction.
HACKs, NACKs, and HeartbeatResponse packets regularly exercise it in
the upward direction. This combination constantly checks that all of
the nodes in the tree are still functioning correctly, and initiates
fault recovery when required.
4.2 Global Tree Parameters
A collection of tree-wide parameters are set and controlled at the
top node. All the nodes of the RMTP tree acquire these parameters
when they join the RMTP tree for the first time and receive a
JoinConfirm packet.
These global parameters can be changed at the top node. The changes
are propagated to the nodes of the tree in optional fields of the
Heartbeat packets.
The top node sends the modified global parameters in Heartbeat
packets and waits for all of its child nodes to confirm the change.
A child node receives and applies the changed parameters and sends a
confirmation to its parent in a special HeartbeatResponse packet.
When a control node receives a Heartbeat packet with modified global
parameter options, it propagates the modified global parameters to
its child nodes in its Heartbeat packets. A control node sends the
RMTP-II [Page 15]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
confirmation HeartbeatResponse packet to its parent only after it
receives HeartbeatResponse confirmations from all its child nodes.
When the top node gets confirmations from all its child nodes, it is
guaranteed that all the nodes in the tree have updated their global
parameters.
4.3 Data and Control Paths
Data is multicast by a sender on the data channel. Data may be
retransmitted by the sender on the data channel or be multicast or
unicast on a local control channel by a designated receiver.
The control channels are organized into a tree with the top node at
the top of the tree and the receivers at the bottom of the tree.
Aggregators and designated receivers are always in the middle of the
tree.
4.4 Data Transmission
In order to receive Data packets from a sender, a receiver must join
the multicast group for the stream to be received. Each multicast
group is called a data channel. A receiver only joins the data
channels for the streams it will receive.
A designated receiver joins all of the multicast groups that its
descendants have joined.
A sender transmits Data packets to a group using IP Multicast. The
intermediate nodes in the RMTP-II tree structure are not responsible
for forwarding these Data packets. The standard IP Multicast routers
forward the packets to all receivers and designated receivers.
4.5 Retransmission
There are three types of retransmissions: local retransmission,
global retransmission, and subtree retransmission.
4.5.1 Local Retransmission
If a designated receiver determines from its child node's HACKs or
NACKs that a Data packet was missed, the designated receiver
retransmits the Data packet. The designated receiver multicasts the
retransmission on its multicast control channel. Each control node
child forwards a retransmission, as required, to its children.
4.5.2 Global Retransmission
RMTP-II [Page 16]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
If a sender node receives a HACK or NACK that indicates missing
packets, the sender performs global retransmission. The
unacknowledged packets are multicast as Retransmission packets on the
data channel.
4.5.3 Subtree Retransmission
On a network whose routers support subtree multicast transmission, a
designated receiver at the top of a subtree may multicast
retransmitted data to its subtree. In this case, it is not necessary
for the control nodes below the designated receiver in the subtree to
forward the retransmissions. Currently, no commercial routers
support this feature.
4.6 Options
4.6.1 Expedited Recovery
Expedited Recovery is an optional feature, which can be enabled on a
per-stream basis. If the NACK option is enabled for a Data stream,
an explicit request, called a negative acknowledgment, or NACK, may
be sent by a child node to its parent requesting retransmission of
missed data packets. Expedited recovery is used for data streams
that require very low latency for data recovery or for networks that
have low loss rates and wish to reduce the amount of control traffic
sent.
4.6.2 NACK Suppression
A receiver waits a random interval, within a specified range, before
sending a NACK. If the receiver receives the retransmitted data
before the NACK timer expires, the receiver cancels the NACK. This
reduces the chance that multiple receivers generate a NACK for the
same packet.
A designated receiver node multicasts a Data packet to its children
as soon as it gets a NACK request for that packet.
An aggregator node or a top node forwards only one NACK for a missing
Data packet within a specified period of time. Other NACKs for a
missing packet are ignored because the up-stream designated receiver
or Sender will multicast the retransmission.
4.6.3 Control Tree discovery
RMTP supports an option which allows the nodes in the RMTP tree to
acquire the addresses and location of its ancestors in the tree and
the addresses of its parent's siblings
RMTP-II [Page 17]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
If an RMTP-II node's parent fails, then the node can use the acquired
information to join an alternate control node.
4.6.4 Forward Error Correction
Option fields in RMTP-II can be used to implement Forward Error
Correction. RMTP-II supports both proactive FEC, in which a fixed
amount of parity is sent with the data, and reactive FEC, in which
parity is sent in retransmission packets to repair multiple
independent losses. See the Appendix - Forward Error Correction for
details.
4.6.5 Fault Tolerant Top Node
RMTP-II provides support for the implementation of a fault tolerant
top node. The top node uses the options field of its Heartbeat
packets to send the address of a backup top node to all of its
children. The top node sets this option when a backup top node joins
or leaves the RMTP tree.
4.6.6 Congestion Control
Congestion Control option fields are used to propagate the loss rate
and other congestion control parameters. RMTP-II proposes one
congestion control mechanism. See the Appendix - Congestion Control
for details.
4.6.7 Time Bounded Reliability
A bound for delivery time may be optionally specified for a stream.
Packets are delivered exactly once and in the same order in which
they were sent from the source, but if the time bound on a packet
expires it is dropped. See the Appendix - Time Bounded Reliability
for details.
4.7 Tree Connections
In order to join a tree, each node, except top node, sends a
JoinStream packet to its parent node and sets a Join timer for the
confirmation.
If the parent node can immediately confirm the JoinStream request, it
responds with a JoinConfirm packet. If the parent node needs more
time to process the JoinStream request, it responds with a JoinAck
packet. After it fully processes the JoinStream, it sends a
JoinConfirm packet.
RMTP-II [Page 18]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
The top node does additional processing for a JoinStream request from
a sender node. The top node validates the uniqueness of the StreamID
of the JoinStream request. If a sender requests a system generated
StreamID, the top node generates a unique StreamID.
4.8 Flow and Congestion Control
Flow and congestion control algorithms act to prevent the senders
from overloading the receivers.
RMTP-II uses a send-ahead mechanism to allow continuous transmission
of data without waiting for packet acknowledgments. This is used in
conjunction with flow and rate control algorithms.
Research in congestion control for large scale reliable multicast is
not yet mature. RMTP-II is currently designed to support a range of
window based and rate based flow and congestion control mechanisms.
Each of these mechanisms can be specified as an option and is
controlled by the top node of the tree. For general Internet use,
specific congestion control policies will eventually be mandated.
4.8.1 Fixed Rate Based Control
Fixed, rate-based flow control limits the transmission speed to a
predefined value. The flow control rate includes the global
multicast retransmissions from the sender.
4.8.2 Congestion Control
RMTP-II uses a Loss Tolerant Rate Controller algorithm (LTRC) for
congestion control. See Appendix B. This controller uses loss report
information from the receivers and perceived state at the sender to
perform congestion control. The goal of the controller is to be
responsive to congestion, but not overly reactive to spurious
independent loss.
LTRC allows the sender application to dynamically adjust the admit
rate between specified limits. A sender gets loss rate information
for the receivers from the HACK packets. The bottom level control
nodes calculate loss rate for each of its receiver child nodes from
the HACK packet information. The control nodes send the maximum loss
rate for the child nodes to its parent in a HACK packet. The next
level control nodes do not aggregate the loss rate, but send the
maximum loss rate for all the child nodes.
4.8.3 Ejection of Slow Receivers
Each sender node is responsible for ejecting slow receivers.
RMTP-II [Page 19]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
The sender application provides the minimum and maximum admit rate
limits. If the admit rate falls below the minimum rate, the sender
maintains the admit rate at the minimum rate and unicasts an Eject
packet to the top node. The Eject packet contains the loss rate
threshold for the stream and a flag that indicates whether all
receivers that exceed the threshold should be ejected or only the
receiver(s) with the maximum loss rate exceeding the threshold. The
top node sends one or more Eject packets to its children based on the
loss rate threshold and bit flag in the Eject packet. If the flag is
set then it will send Eject packet to all of its children having loss
more than the threshold else only to the child having maximum lass
over threshold. If the child node is a control node it uses the same
mechanism to send Eject packet to its children. The receiver(s)
having loss rates at or above the threshold get Eject packets and are
required to leave the stream. The receivers may rejoin the stream at
a later time.
4.9 Membership
RMTP provides a simple counted membership for each stream, available
at the top node and sender node. This counted membership is a simple
count of the receiver nodes that have joined a stream.
The detailed RMTP tree and stream membership information is
distributed across all the control nodes in the RMTP tree. An
external membership server can be used to acquire the full tree and
per stream membership from all the control nodes.
4.10 Fault Detection and Recovery
4.10.1 Sender node failure
A sender node that has no data to send will periodically send
NullData packets on the data channel. If a receiver fails to receive
Data packets or NullData packets for a stream sent by the sender, the
receiver detects a sender failure.
A sender node sends HeartbeatResponse packets to the top node. If
the top node fails to receive a HeartbeatResponse from a sender, it
detects the sender's failure.
4.10.2 Receiver node failure
A receiver node sends HACKs for each of the active streams that it
has joined. If none of the streams are active, then the receiver
sends HeartbeatResponse packets to its parent.
If a receiver's parent node does not receive a HACK or a
RMTP-II [Page 20]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
HeartbeatResponse within a specified time interval, the parent
detects the failure of the receiver and removes the child from its
child list.
4.10.3 Top node failure
The top node sends Heartbeat packets to its child nodes on its
multicast control channel. If a child node does not receive
Heartbeats from the top node, it detects a failure of the top node.
4.10.3.1 Fault Tolerant Top Node
RMTP-II provides for the implementation of a fault tolerant top node.
The top node sends the address of a backup top node to all of its
children in the Heartbeat packets. A backup top node monitors the
Heartbeat packets of the top node. If the top node fails, the backup
top node takes over.
If a child node of the top node detects a failure of the top node, it
reconnects to the backup top node.
4.10.4 Aggregator/Designated Receiver failure
Each aggregator and designated receiver node sends Heartbeat packets
to its child nodes on its multicast control channel. If the child
nodes do not receive any Heartbeats from the parent node, they detect
failure of the parent.
4.10.4.1 Recovery
When a child node detects failure of its parent node, it can try to
reconnect to a randomly chosen alternate aggregator, designated
receiver, or the top node of the RMTP-II tree.
4.11 Quality of Service (QoS)
RMTP-II supports three levels of QoS:
Reliable Unordered: packets are delivered exactly once, but packet
order is not guaranteed.
Reliable Source Ordered: Packets are delivered exactly once and in
the same order in which they were sent from the source.
Time Bounded Reliability: This is an optional QoS in which packets
are delivered exactly once and in the same order in which they
were sent from the source, but packets that cannot be delivered
within a specified time bound are dropped. See Appendix D - Time
RMTP-II [Page 21]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Bounded Reliability.
4.12 SNMP Support
The control nodes, and the top node in particular, are designed to
interact with SNMP management tools. In cooperation with the top
node's centralized control of the senders, this allows network
managers to easily monitor and control the sessions being
transmitted.
All nodes of RMTP-II provide SNMP support. SNMP support is optional
for sender and receiver nodes, but is required for all control nodes.
See the Appendix - SNMP MIBs for details.
4.13 Session Manager Support
The Session Manager is an optional component associated with RMTP-II
which is responsible for providing each node with topology
configuration information. A session manager can be implemented
either as a set of static configuration files, as a fully distributed
algorithm, or as a centralized management tool that interfaces with
each of the nodes.
A session manager can optionally be responsible for providing total
group membership listings by aggregating the membership lists at each
control node.
A session manager can optionally provide an interface with directory
services for the announcement of sessions.
A session manager be responsible for partitioning the data at the
sender, selecting the appropriate sets of layers at each receiver,
and for reassembling the data at the receiver. This work is currently
in progress.
4.14 Forward Error Correction
Recent work [NB96, NBT97, Rizzo97] has shown the benefits of
incorporating reactive forward error correction (FEC) into reliable
multicast protocols. This feature encodes data packets with FEC
algorithms, but does not transmit the parity packets until a loss is
detected. The parity packets are then transmitted and are able to
repair different lost packets at different receivers. This is a
powerful tool for providing scalability in the face of independent
loss. When implemented, it is a simple matter to also provide
proactive FEC which automatically transmits a certain percentage of
parity packets along with the data. This is particularly useful when
RMTP-II [Page 22]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
a high minimum error rate is expected, or when low latency is
particularly important. This can be implemented as an option to
RMTP- II. See Appendix D for FEC details.
5 Details:
5.1 RMTP-II Global Parameters
5.1.1 The Tree-Wide Parameters
The following global, or tree-wide, parameters are defined for an
RMTP-II tree. These parameters are configured at the top node. The
other nodes acquire the parameters when they connect to the RMTP-II
tree.
The parameters can be modified at the top node. The changes will be
propagated to the rest of the tree.
B: The branching factor B denotes the maximum number of children for
any node in the tree.
C: The constant C is a scaling coefficient for Thack. The constant C
determines how quickly Thack increases when no data packets are
being sent.
R: The number R specifies the maximum number of HACKs which should
be received, on average, for each Data packet sent, at any node.
The stored value of R is divided by 100 to get an R value of the
form dd.dd.
Thack_max: The number Thack_max specifies the maximum time allowed
between HACK transmissions for each receiver.
Tjoin_response: The number Tjoin_response is the maximum time to
wait for a response to a JoinStream request from the parent node.
The response should be either a JoinAck or a JoinConfirm.
Rjoin: The number Rjoin is the number of retries of the JoinStream
request before declaring the parent unreachable.
Thb: the number Thb is the time interval at which control nodes
multicast Heartbeat packets.
F: The failure threshold constant, F, determines the threshold time
for failure detection.
RMTP-II [Page 23]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Tnulldata_max: The number Tnulldata_max is the maximum time interval
for sending NullData packets.
Optimistic: All the designated receivers should use optimistic HACK
mechanism.
RxMax: The number RxMax is the maximum number of retransmission that
can be done for a data packet.
Radmit_rate_min: This is the minimum admission rate that the sender
can use for Data packets.
Radmit_rate_max: This is the maximum admission rate that the sender
can use for Data packets.
5.1.2 Global Tree Parameter Distribution
When a node joins an RMTP tree for the first time, it receives the
global parameters from the options extension of the JoinConfirm
packet.
A node receives changes to global parameters in Heartbeat packets
whose option flag, Global Parameters Modified, is set to 1.
If a control node receives a Heartbeat packet containing modified
variables, it sends Heartbeat packets to its children with the option
flag, global parameters modified, set to 1. The control node
continues sending the Heartbeat packet until it receives confirming
HeartbeatResponse packets from all of its children. The control node
then applies the new values of global parameters and sends the
confirming HeartbeatResponse packet to its parent.
When a receiver node receives a Heartbeat with the option flag,
global parameter modified, set to 1, it applies the new parameter
values and sends confirmation in the next HeartbeatResponse packet to
its parent.
Each time modifications are made to global parameters at the top
node, it increments a 16-bit sequence number and inserts this
sequence number in the Heartbeat packets it sends to its child nodes.
The nodes of the RMTP tree identify global parameter modifications
based on the sequence number in the Heartbeat packet.
The top node does not allow additional modifications to the global
parameters, until it gets confirmations from all of its children that
the current modifications have been received.
5.2 Tree Configuration
RMTP-II [Page 24]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
A node requires information about the RMTP-II tree and the data
stream it wishes to join. The mechanism for acquiring the
configuration information is not part of the protocol, but is
implemented by the associated Session Manager. The RMTP-II protocol
specifies only the rules by which a node joins the tree.
A node requires the following information to join an RMTP-II tree.
The information depends on the type of the node.
5.2.1 Top Node Configuration
A top node requires:
tree ID: a unique identifier for the RMTP-II tree
UDP listen port: the number of the port on which the top node will
listen for its children's control messages
Local Multicast Control Channel: the address on which the top node
sends Heartbeat packets to its children.
5.2.2 Backup Top Node Configuration
A backup top node requires:
tree ID: a unique identifier for the RMTP-II tree
UDP listen port: the number of the port on which the top node will
listen for its children's control messages
Local Multicast Control Channel: the address on which the top node
sends Heartbeat packets to its children.
5.2.3 Aggregator node Configuration
An aggregator requires:
tree ID: the unique identifier for the RMTP-II tree to join
parent address: the address and port of the parent node to which the
node should connect
UDP listen port: the number of the port on which the node will listen
for its children's control messages
Local Multicast Control Channel: the address on which this node sends
Heartbeat packets to its children.
RMTP-II [Page 25]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
5.2.4 Designated Receiver node Configuration
A designated receiver requires:
tree ID: the unique identifier for the RMTP-II tree to join
parent address: the address and port of the parent node to which the
node should connect
UDP listen port: the number of the port on which the node will
listen for its children's control messages
Local Multicast Control Channel: the address on which this node
sends Heartbeat packets to its children.
5.2.5 Sender or Receiver node
A sender or receiver requires:
tree ID: the unique identifier for the RMTP-II tree to join
parent address: the address and port of the parent node to which the
node should connect.
StreamID: the unique identifier of the stream the node wants to send
or receive
data multicast group address: the multicast address on which the data
stream is sent.
5.3 Tree Membership Algorithms
5.3.1 Join Algorithm
A sender node makes a JoinStream request for every data stream that
it intends to send. The parent for a sender is the top node.
When a receiver node is created, it sends a separate JoinStream
packet for each of the streams that it intends to receive. When a
receiver node rejoins a stream after a parent failure, it may send a
single JoinStream packet to join multiple streams.
Aggregator and designated receiver nodes send a JoinStream packet
with StreamID value zero to join the RMTP tree, without joining any
stream. An aggregator or designated receiver later joins the streams
that their children join.
A node sends a JoinStream request to its parent node. It waits a
RMTP-II [Page 26]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
time interval, Tjoin_response, for a response from the parent node. A
parent node responds to a JoinStream request with a JoinAck packet or
a JoinConfirm packet. If the parent node cannot process the request
immediately, it responds with a JoinAck packet. Otherwise, it sends
a JoinConfirm packet.
If no response is received, the node retransmits the JoinStream
request. The node retransmits the JoinStream request a maximum number
of times, Rjoin. If a node does not receive a response for the
JoinStream requests after Rjoin tries, it reports a Parent-
Unreachable failure after Rjoin retries.
If a node receives a JoinAck packet, it continues transmitting
JoinStream requests at exponentially longer times, until it receives
a JoinConfirm rather than a JoinAck. Every time that a node does not
receive a response from its JoinStream request, it increments a local
variable NumberFailures. Every time that it does not receive a
JoinAck in response to its JoinStream request, it increments
NumberFailures. If NumberFailures exceeds Rjoin, it reports a
Parent- Unreachable failure.
A node receives the tree's constant parameters and the parent's
multicast control channel address from the JoinConfirm.
When joining, the sender can request that RMTP-II provide the
StreamID, in which case the top node generates a unique StreamID and
sends it in the JoinConfirm packet.
If a sender fails and than attempts to rejoin the tree, the top node
will not allow this new sender to join until a period 6*F*Thb, the
sender failure detect interval, has elapsed since the failure.
A control node can proactively join the RMTP-II tree without joining
a stream by issuing a JoinStream request with a StreamID value of
zero. A control node can reactively join the RMTP-II tree when it
receives a JoinStream request from any of its children. The control
node joins all the streams that its children join.
When a node issues a JoinStream request during a failure recovery, it
sets the rejoin flag in the JoinStream request. During a rejoin, a
node can specify all the streams that it wants to join in a single
JoinStream request.
5.3.2 Sender TimeStamp
When a sender is created, it records the current time, expressed in
elapsed seconds and microseconds since 00:00 Universal Coordinated
Time, January 1, 1970. The sender uses the seconds part to
RMTP-II [Page 27]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
initialize the value of TimeStamp. TimeStamp is a 32 bit number
which is unique within approximately 136 years.
The resolution of the TimeStamp is 1 second. Because this is UTC,
there is no counter discontinuity when daylight savings occurs.
The TimeStamp value is sent in every Data, NullData, Retransmission,
and HACK packet for the data stream, and does not change over time
for a stream.
The value of TimeStamp allows nodes to distinguish different
incarnations of the sender. A new TimeStamp on a data stream
indicates that the sender has restarted.
5.3.3 Leave Algorithm
A node sends a LeaveStream request when it wishes to leave the data
stream. The node waits for a period Tjoin_response to receive the
LeaveConfirm packet from its parent node. If the LeaveConfirm packet
is not received within the time Tjoin_response, then the node resends
the LeaveStream request. The node sends the LeaveStream request a
maximum of Rjoin times.
A control node can leave a stream only if it has no children.
To leave a stream, a sender node sends an EOS packet and waits for
the EOS in the HACK before leaving the stream.
5.4 Data Transmission
An application provides Data packets to a sender node. The sender
node assigns consecutive sequence numbers to the Data packets and
buffers the Data packets in the data queue.
A sender can use any sequence number other than 0 as the starting
sequence number for the stream. Sequence number 0 indicates an
inactive data stream and should not be used for other purposes.
The sender initializes LastStable as StartSequenceNumber-1.
LastStable is the sequence number of the last stable packet which the
sender has removed from its data queue. All lower numbered packets
are stable. The sender cannot retransmit any packet having sequence
number less than or equal to LastStable.
LastStable in conjunction with the sequence number is used by
receivers to detect the first Data packet for the stream as well as
missing packets.
RMTP-II [Page 28]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
For example, suppose that a sender's StartSequenceNumber = 10 and the
receiver's first Data packet has sequence number 15. LastStable = 9
(10-1). The missing packets are 10, 11, 12, 13 and 14.
5.4.1 Stable Packets
If the lowest numbered Data packet in the data queue becomes stable,
that is, all the target recipients have received the packet, then the
sender deletes the packet from the data queue. The Sender receives
stability information from the Stable Sequence Number of the HACKs it
receives.
5.4.2 Stream Parameters
The sender maintains the following values for each of its data
streams:
Ndata_size: This is the data queue size, the number of Data packets
the sender can buffer for transmission or retransmission. A
sender can admit a packet to the data queue only if the number of
packets in the data queue is less than Ndata_size.
Radmit_rate_max: A sender uses Radmit_rate_max to control the
maximum sending rate. The sending rate is determined by
congestion control algorithms, and can fluctuate between
Radmit_rate_min and Radmit_rate_max.
Nseq_low: This is the sequence number of the lowest numbered packet
in the data queue which is not yet stable. It is the sequence
number of the oldest Data packet that can be retransmitted.
Nseq_high: This is the sequence number of the highest numbered
packet in the data queue.
Tsend_alarm: This is the interval between multicasts of Data packets
by the sender.
Tnulldata_min: This is the minimum interval between the transmission
of NullData packets by the sender.
Tnulldata_alarm: This is the current interval between transmissions
of NullData packets by the sender.
5.4.3 Example: Packet Stability and Transmission
Suppose that: Ndata_size equals 150, Nseq_low equals 200, Nseq_high
equals 300, and all the Data packets numbered from 200 to 230 have
gone stable. In this case, the sender will delete the stable packets
RMTP-II [Page 29]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
from 200 to 230 from the data queue. The new value of Nseq_low will
be 231. The sender can transmit Data packets until Nseq_high reaches
380 (Nseq_low + Ndata_size).
5.4.4 Example: Packet Rate
Suppose the sender is using fixed size packets and Radmit_rate = 512
Kbps = 65536 Bps, Ndata_size = 200 packets, PACKET_SIZE = 1024 bytes,
Tsend_alarm = 50 ms = 20 ticks per second.
The maximum number of packets that can be sent per interval
Tsend_alarm is: (Radmit_rate*1024)/(8*(1000 /
Tsend_alarm)*PACKET_SIZE) For this example the sender can at most
send (512 * 1024)/(8 * (1000/50) * 1024) = 512/160 which is
approximately 3 packets per interval
5.4.5 NullData Packets
If a sender has no Data packets to transmit for a data stream, it
transmits NullData packets on the data channel. A NullData packet
contains a sender time stamp, TimeStamp, the sequence number of the
last packet, LastSequence, and the sequence number of the last stable
packet, LastStable.
NullData messages keep the stream alive, inform the receivers about
the state of the sender, and detect packets missed by receivers.
When there are no Data packet to send, a sender sets Tnulldata_alarm
to Tnulldata_min. If Tnulldata_alarm expires, then the sender
transmits a NullData packet and sets Tnulldata_alarm to twice its
current value. The maximum value of Tnulldata_alarm is
Tnulldata_max. This is done to utilize minimum bandwidth and insure
timely recovery of missing Data packets.
Example: Increase of Tnulldata_alarm
Suppose that:
Tnulldata_min = 2000 ms
Tnulldata_max = 16000 ms
Time of transmission of last Data packet = t1
Let s = Tnulldata_min
If there is no data to send, the sender transmits NullData packets at
times t2 = t1+s, t3 = t2+2s, t4 = t3+4s, t5 = t4+8s, t6 = t5+16s
|--|----|--------|----------------|----------------|---------------
t1 t2 t3 t4 t5 t6
5.4.6 End Of Stream
RMTP-II [Page 30]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
The end of a data stream is indicated by the flag EOS in the final
Data packet of the data stream. The sender transmits a Data packet
with the EOS bit set to 1 and waits for the packet to become stable.
If a receiver receives a Data packet with EOS set to 1 and it is not
missing any Data packets, it continuously generates HACK packets,
based on Thack, with the EOS bit set to 1, until it gets an EOS
packet from its parent confirming the end of stream. The receiver
can then leave the stream.
If a parent node receives a HACK with the EOS bit set to 1, it
responds with an EOS packet to that child node confirming the end of
stream.
If a parent node has received HACKs for a data stream with the EOS
bit set to 1 from all of its child nodes, then it sends a HACK with
EOS set to 1 to its parent node. Once a sender indicates the end of
the stream, it must wait until it gets a confirming EOS packet from
the top node before it can leave the group.
5.4.7 Computation of Feedback Round Trip Time
The round trip time, RTT, is the difference between the time a Data
packet is sent by the sender and the time that a HACK is received by
the sender indicating that the packet was received or requesting
retransmission of the packet.
The sender measures the RTT for the first packet of a stream and
every packet with sequence number equivalent to 1 mod H where
H = Ceiling(B/R)
B = MaxChildren
R is the overhead ratio.
For example, if H = 200, the sender will calculate RTT on packet
number 1, then on packet 201, 401, 601. The sender will receive, on
average, one HACK per 200 Data packets.
5.4.8 Retransmission Timeout
The Retransmission Timeout, RTO, is the time interval between
retransmissions, in milliseconds.
The RTO value is calculated by the sender node using the algorithm of
[Jacobson88], [Jacobson90] which depends on both the smoothed RTT and
the smoothed mean deviation. The algorithm can be implemented with
integer arithmetic.
RMTP-II [Page 31]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
M = measured RTT at the sender
A = smoothed RTT
D = smoothed mean deviation
Initially, let
g = 1/8, g is the gain for the RTT average.
h = 1/4, h is the gain for deviation
A = 0, current RTT estimator
D = 3 seconds
Err <= M - A
A <= A + g*Err
D <= D + h*(|Err| - D)
RTO = A + 4D
5.5 Data Reception
5.5.1 New Stream
When a receiver joins a data stream it receives a JoinConfirm packet
which contains the StreamID, the TimeStamp, and the value of
LastStable.
If the data stream is not active, then LastStable equals 0. If the
data stream is active, LastStable is the number of the stable packet
with the greatest sequence number. The receiver can only receive
packets with higher numbers than LastStable. The receiver calculates
the starting sequence number, StartSequenceNumber, for the stream by
adding 1 to the value LastStable. If the result equals 0 because of
wrap-around, then the value is incremented to 1, since sequence
number 0 is reserved.
5.5.2 Quality of Service
A receiver node delivers Data packets based on the quality of service
value, QoS, specified for the data stream. If the QoS value is
Source Ordered, then the Data packets must be delivered to the
application in the order defined by the sequence numbers of the
packets. If the QoS is Unordered Reliable, the packets are delivered
to the application in the order they are received, but are never
delivered more than once.
An optional QoS level, Time Bounded Reliability, is detailed in
Appendix - Time Bounded Reliability.
5.5.3 Detection of Missing Packets
RMTP-II [Page 32]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Data, NullData, and Retransmission packets contain information that a
receiver uses to detect missing packets.
A NullData packet for a stream contains the sequence number of the
last Data packet transmitted for the stream. A receiver determines
whether any intervening packets in the sequence are missing.
Each Data packet and Retransmission packet contains its own sequence
number and the number of the last stable packet, LastStable. A
receiver determines whether any packets have been missed between the
last stable packet and the highest numbered packet received.
5.5.4 Recovery of Missing Packets
Both HACKs and NACKs are used for recovery of missing packets.
If a receiver detects a missing Data packet, it sets to zero the bit
of the HACK bitmap which corresponds to the sequence number of the
missing Data packet.
If the bit flag N of Data packets, NullData Packets, or
Retransmission packets for a data stream is set to 1, then NACKs are
enabled for the data stream. If a receiver detects a missing Data
packet and NACKs are enabled for the stream, then the receiver sends
a NACK based on the NACK generation algorithm.
The HACK and NACK generation algorithms are described later.
5.5.5 Stream failure detection
A receiver detects the failure of a stream by the following
mechanisms:
Timeout Detection If a receiver does not receive any Data, NullData,
or Retransmission packets within an interval 2*F*Tnulldata_max, then
it detects the failure of the data stream
Timestamp Detection If a receiver receives a Data packet whose
TimeStamp value is greater than the current TimeStamp value for the
data stream, then the receiver detects the failure of the stream.
The receiver changes the current TimeStamp value to the TimeStamp
value of the received packet. The receiver discards any Data packet
with a TimeStamp value that is less than the current TimeStamp value.
For example, suppose the current StreamID equals 0x1000 and TimeStamp
equals 0x1234. The receiver receives a new Data packet with StreamID
0x1000 and TimeStamp 0x1300. The new TimeStamp value is higher than
RMTP-II [Page 33]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
the current value. The receiver detects that the stream has been
abnormally terminated and restarted with the TimeStamp 0x1300.
If a control node receives a HACK packet from a child node for a data
stream which has a higher TimeStamp value than the current TimeStamp
value for the data stream, then the node detects that the data stream
has failed. The control node updates its information for the data
stream with the new TimeStamp value and it ignores all HACK packets
with TimeStamp value less than the new TimeStamp value.
5.5.6 End of Stream
If a receiver receives a confirming EOS packet from its parent for a
data stream, it discards any further Data packets for that data
stream.
5.6 HACK Generation
HACK packets are used to acknowledge received Data packets and to
detect missed packets. The purpose of a HACK is to acknowledge the
packets that are considered stable, and to request retransmission of
packets that have not been received.
The following fields of a HACK packet are used to determine the
status of the received data:
High Sequence Number: If the node is a receiver, the High Sequence
Number is the sequence number of the highest numbered packet that
the node has received. If the node is a control node, then the
High Sequence Number is the sequence number of the highest
numbered packet received by any of its child nodes.
Low Sequence Number: If the node is a receiver, the Low Sequence
Number is the sequence number of the lowest numbered packet that
the node has not yet received. If the node is a control node,
then the Low Sequence Number is the sequence number of the lowest
numbered packet that has not yet been received by any of its child
nodes.
Stable Sequence Number: The stable sequence number is the sequence
number of the highest contiguous stable packet known to the node.
If the node is a receiver, this number is the Low Sequence Number
minus 1. If the node is an aggregator or top node then it is the
aggregated Low Sequence number minus 1. If the node is a
designated receiver then the Stable Sequence Number has different
meaning depending on the HACK mechanism used.
Bitmap: The bitmap consists of a sequence of single bit
RMTP-II [Page 34]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
"acknowledged" flags, corresponding to the sequence of packets
with sequence numbers between Low Sequence Number and the High
Sequence Number. A bit value 0 indicates that the corresponding
packet is missing and requires retransmission. For reasons of
efficiency, the bit corresponding to Low Sequence Number is at the
bit position Low Sequence Number Modulo 32. The remaining bits
are assigned to consecutive sequence numbers up to High Sequence
Number. Any remaining initial or terminal bits are ignored.
Example for details of HACK
LSN, the Lowest Sequence Number not yet received by this node = 40
HSN, the Highest Sequence Number received by this node = 72
Bitmap[0] = 0xFF7EDC7F
Bitmap[1] = FF800000
Here is the bitmap array written as 0s and 1s:
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1|1 1 1 1|0 1 1 1|1 1 1 0|1 1 0 1|1 1 0 0|0 1 1 1|1 1 1 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1|1 1 1 1|1 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The bitmap begins at position LSN modulo 32, which is position 8.
Bits 0-7 in Bitmap[0] are ignored. The bitmap extends for HSN - LSN
= 32 positions. The final position is at position 8 of Bitmap[1].
Bits 9 - 31 of bitmap[1] are ignored. The bitmap indicates that the
following packets are missing: 40, 47, 50, 54, 55, 56.
The receiver nodes and control nodes use different algorithms for
generating HACKs.
5.6.1 Receiver node HACK generation
Congestion caused by bursts of ACK or HACK traffic limits the
scalability of ACK or HACK based protocols. RMTP resolves this
problem by allowing the system to control the level of HACK overhead
by dynamically adjusting the rate of control packets to match the
rate of data packets, and by limiting control packet generation with
the rotating HACK algorithm.
Control rate of HACK
Congestion due to control packet traffic is most likely to occur when
data packets are sent at a high rate and the control channel has
significantly lower bandwidth than the data channel. RMTP-II
controls the rate by controlling the response ratio R = C/D. C is the
RMTP-II [Page 35]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
number of control packets a parent receives from its children in
response to the number D of data packets received. This ratio is
controlled by limiting the acknowledgments returned by the children.
The Rotating HACK Algorithm
Each receiver is informed of its index number, M, in its parent's
child list from the JoinConfirm packet received when the receiver
joins the data stream. The receiver calculates the value of H
defined by: H = Ceiling(B/R). B and R are tree wide constants.
A receiver only sends a HACK for a packet whose sequence number
Modulo H equals its M number modulo H. The value of H insures that
the receiver child nodes of a single parent send a total of no more
than R HACKs for each Data packet
If a packet is dropped in transit, and the packet would have
generated a HACK for the rotating HACK algorithm, then the first
packet received with a sequence number greater than the missing
trigger packet will cause a HACK to be generated.
5.6.1.1 Example
Assume a parent has B = 6 children, and the desired ratio of HACK
packets received to Data packets sent is R = 2. Then the children
will respond to every H = 6/2 = 3 Data packets. The children are
separated into three groups by their M numbers (Modulo H), 0, 1, 2.
The children with number 0 will respond to sequence numbers that
equal 0 (Modulo 3). The children with number 1 will respond to
sequence numbers that equal 1 (Modulo 3). The children with number 2
will respond to sequence numbers that equal 2 (Modulo 3). The 6
children are partitioned into three response classes of size 6/3 = 2.
There will be B/H = 2 children responding to each Data packet.
Sequence #: 101 102 103 104 105 106 107 108
Sequence #, Mod 3: 2 0 1 2 0 1 2 0
Receiver Index #: 1, 2, 3, 4, 5, 6
Receiver Index (Modulo 3): 1, 2, 0, 1, 2, 0
A small R value makes H large, and decreases the volume of control
traffic. However, since the children do not send HACKs immediately,
the time to reach a stable state is longer.
5.6.1.2 Guaranteed HACK Generation
When data traffic is low, a receiver may not send a packet for a long
time. This could cause long waits for packet stability and could also
RMTP-II [Page 36]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
make the receiver appear to have failed. The tree wide constant
Thack_max guarantees that receivers respond in a timely manner.
Thack_max specifies the maximum time that can elapse between HACKs
while the stream is active. If the stream is active, a receiver
sends at least one HACK within the interval Thack_max.
The value of the variable Thack dynamically controls the time between
HACKs. Thack varies as the average of the previous two inter-HACK
times, Thack1 and Thack2. The values of Thack1 and Thack2 are
initially set to Thack_max. Thack = Min((Thack1 + Thack2)*C,
Thack_max) where C is a tree-wide constant that determines the
responsiveness of Thack.
The Thack timer is initialized when a receiver receives the first
packet of a data stream. A new Thack value is calculated and the
Thack timer is reset each time the receiver sends a HACK packet.
If the data traffic load is high, H packets are received in time
Thack or less and the rotating HACK algorithm determines the
generation of HACKs.
If the load is low, the Thack timer expires before H packets are
received and Thack determines the generation of HACKs. Under low
load conditions, the value of Thack grows in an approximately
exponential manner to Thack_max.
5.6.2 Control node HACK generation
A control node has two functions in the HACK algorithm, HACK
aggregation and HACK generation. Designated receivers use a modified
HACK mechanism.
Aggregator and top node HACK mechanism
An aggregated HACK from a control node acknowledges that a packet has
gone stable only if all descendants of the control node have received
the packet. The Lowest Sequence Number refers to the lowest numbered
packet that has not been received by at least one of the control
node's child nodes. The Highest sequence number is the highest
numbered packet that has been received by all of the control node's
child nodes. The Stable Sequence Number is the aggregated Lowest
Sequence Number minus 1. The parent bitmap is a logical AND of
corresponding bits of all the child bitmaps. The bitmap indicates the
packets missed by all of the control node's descendents.
Example for HACK aggregation
HACK information from receiver-1
Lowest sequence number = 40
RMTP-II [Page 37]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Highest sequence number = 72
Stable sequence number = 39
Bitmap[0] = 0xFF7EDC7F Bitmap[1] = 0xFF800000
missing packets: 40, 47, 50, 54, 55, 56
HACK information from receiver-2
Lowest sequence number = 38
Highest sequence number = 74
Stable sequence number = 37
Bitmap[0] = 0xFDFEDD7F Bitmap[1] = 0xFF600000
missing packets: 38, 47, 50, 54, 56, 72
Aggregated HACK
Lowest sequence number = 38
Highest sequence number = 71
Stable sequence number = 37
Bitmap[0] = 0xFD7EDC7F Bitmap[1] = 0xFF600000
The control node generates a HACK when it receives HACKs from all its
child nodes. To guarantee HACK generation the control nodes sends a
HACK if its Thack timer expires. The algorithm for Thack timer is
identical to the receiver, refer to "Guaranteed HACK generation".
receiver-1: 40 47 50 54 55 56 (72)
receiver-2: 38 47 50 54 56 72 (74)
aggregated: 38 40 47 50 54 55 56 (71)
Packet 38 is missing from receiver-2 and is the lowest missing
packet. If a packet is missing in either bitmap, it is missing in the
aggregated HACK. Packet 71 is the highest packet received by both
receivers.
Designated receiver HACK mechanism
The designated receiver has two kinds of HACK mechanisms: Pessimistic
and Optimistic.
Pessimistic HACK
The pessimistic HACK mechanism is very similar to the aggregator's
HACK mechanism. The Stable Sequence Number is the aggregated lowest
Stable Sequence Number of all the child HACKs. The Lowest Sequence
Number is the lowest missing packet above the Stable Sequence Number
of the designated receiver. The Highest Sequence Number is the
Highest Sequence Number received by the designated receiver. The
bitmap only contains information about the data packets missed by the
designated receiver.
RMTP-II [Page 38]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Optimistic HACK
An optimistic HACK acknowledges that a packet is stable if the parent
node receives it. It is the responsibility of the parent node to
insure that every child receives the packet. A node generates an
optimistic HACK based on its own list of received packets and uses
the same rotating HACK algorithm used by the receiver nodes.
With optimistic HACKs, the designated receiver still needs to
aggregate HACKs from its child nodes because it needs to discard
buffered data packets. It discards a packet as soon as all of its
children have received it.
The advantage of the Optimistic HACK mechanism is that the sender
gets stability of packets before the packets are actually stable at
the receivers. This decreases the buffering requirements at the
receiver. The disadvantage of the Optimistic HACK mechanism is the
resulting decrease in fault recovery.
Consider a RMTP tree in which a receiver has a designated receiver as
parent and a designated receiver as the next ancestor node. The
parent node receives all the packets, but the receiver misses a
packet. If the receiver's parent dies before it can retransmit the
missing packet and the receiver rejoins the ancestor node, the
receiver cannot recover the mission packet. This is because the
packet has already gone stable at the ancestor and at the sender
because of the optimistic HACK mechanism.
5.7 Retransmissions in response to a HACK
HACK packets prompt retransmission of missing packets. This
retransmission can be done by the sender or by designated receiver
nodes.
A designated receiver node retransmits any missing packets for which
it receives a request in a HACK and for which it is currently holding
the requested packets. The retransmissions are multicast on the
local control channel. When a designated receiver fulfills a
retransmission request, it deletes that request from the request bit
field in the HACK to prevent any of its parents from duplicating the
retransmission.
When a designated receiver retransmits a packet, it calculates a
minimum period of time, Tmin_retransmit, before another
retransmission request will be honored for this packet. The value of
Tmin_retransmit is stored at the designated receiver along with the
packet's sequence number and a count of the number of times the
packet has been retransmitted. A request for retransmission which
RMTP-II [Page 39]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
arrives at a designated receiver during the period of time
Tmin_retransmit is ignored.
Tmin_retransmit is calculated as the local RTT of the control channel
multiplied by 2 to the power of the number of previous
retransmissions of that packet. Tmin_retransmit can never exceed
Tmax_retransmit, a fixed system constant.
If the number of retransmissions reaches RXmax, the sender or a
designated receiver will ignore further requests for retransmission.
The designated receiver will mark the packet as stable. The sender
will terminate the stream. The receiver will eventually get a Data
Fail Indication.
5.8 NACK Generation
A negative acknowledgment, or NACK, is an explicit request by a
receiver or a designated receiver to its parent node for a
retransmission of missed data. Networks that operate under low loss
conditions can use NACKs, combined with reduced HACK traffic, to get
faster recovery of lost Data packets with less control traffic
overhead.
A NACK is a HACK packet with the NACK bit, N, set to 1. NACKs are
allowed only for NACK enabled data streams.
After detecting a missing packet, a receiver waits for a random time
Z before generating a NACK. The time, Z, is random variable having a
truncated exponential distribution on the interval [0, T]. Z is
obtained from a uniformly distributed random variable, X, on the [0,
1] by the formula:
Z = T/b1*Ln(b2*X - 1)
where
N = 3, This is the expected number of NACKs, if all the Children miss
the packet.
T = (0.6)*RTT*b1/Ln(N)
b1 = Ln(B) + 1
b2 = Exp(b1) -1
RTT is the local round trip time to the parent
This backoff policy is discussed in [NB98], which also shows that
this policy has a better response time than a uniformly distributed
random variable and quantifies how the average latency varies as a
function of the average worst case number of responses, N.
If a receiver receives the missing Data packet before the NACK timer
RMTP-II [Page 40]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
expires, it cancels its NACK. This reduces the chance that multiple
receivers generate a NACK for the same packet and suppresses NACKs.
5.9 NACK Response
When a designated receiver receives a NACK, it immediately multicasts
the retransmission on the control channel, unless it suppresses
retransmission due to a recent HACK or NACK. Retransmissions due to
a NACK are suppressed the same way as retransmissions due to a HACK.
If a designated receiver gets a NACK requesting some packets that it
has and some packets that it does not have, it clears the
retransmission request bits for the packets it has, forwards the NACK
to its parent. It retransmits the packets that is has available. If
some packets are unavailable, the designated receiver sends a
Heartbeat with the NACK Suppression Option to its children. The NACK
Suppression option suppresses NACKs from the designated receiver's
children. A single NACK Suppression packet can cover multiple
packets.
If an aggregator node receives a Heartbeat packet from its parent
with the NACK Suppression option, it sends a Heartbeat packet to its
children with bits set for all requested Data packets.
When a designated receiver receives retransmissions from its parent,
it forwards these retransmissions to its children.
A top node or aggregator that receives a NACK acts just like a
designated receiver that does not have the packets for
retransmission. The sender for that stream is treated as the parent
of the top node.
5.10 Congestion Control
The aggregators and designated receivers at the lowest level of the
tree calculate the receiver loss averages based on the HACKs
received. Using the HACK bitmap, the control node calculates the loss
rate. Assuming the HACK bitmap covers 50 packets and 5 of the packets
are missing, the loss rate for that receiver will be 10%. The control
node reports the maximum of its children's loss rates to its parent.
Finally, the top node passes the loss rate information to the sender.
Congestion control is a required component of RMTP-II. See Appendix -
Congestion Control for more detail.
5.11 Failure Detection and Recovery
The Heartbeat, HeartbeatResponse and NullData packets provide the
RMTP-II [Page 41]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
primary failure detection mechanism in RMTP-II.
5.11.1 Fault tolerant top node
RMTP-II provides a fault-tolerant top node capability by allowing the
configuration of a backup top node. The primary function of a backup
top node is to monitor the top node's Heartbeat packets. If the top
node fails, the backup top node takes over the responsibilities of
the top node.
A backup top node is created administratively when configuring the
top node and the backup top node. The backup top node is configured
as a top node. It joins with StreamID set to zero. The backup top
node receives the top node's Heartbeat packets. The backup top node
sends HeartbeatResponse packets to the top node to enable the top
node to detect the failure of the backup top node.
The top node sends the backup top node's address in the Heartbeat
packets to it child nodes. An address of zero for the backup top
node indicates that there is no backup top node available.
5.11.2 Top Node Failure Detection
The top node sends Heartbeat packets on its multicast control channel
at time intervals Thb. If a child node does not receive a Heartbeat
packet from the top node for a period of 2*F*Thb, it declares the top
node failed.
If the backup top node does not receive a Heartbeat for an interval
F*Thb, then it declares the top node failed.
If a child of the top node does not receive any Heartbeat from the
top node for an interval 2*F*Thb, it detects the failure of the top
node and the failure of the backup top node. The child node declares
the top node failed and, because no backup exists, the child reports
the parent unreachable and fails.
5.11.3 Top Node Failure Recovery
The backup top node detects the top node's failure within an interval
that is half the interval required by the other children of top node.
The backup top node takes over the top node's responsibility and
sends Heartbeat packets on the top node's multicast control channel.
When the top node's children start to receive the backup top node's
Heartbeat packets, they validate the backup top node's identity and
send a JoinStream request with rejoin flag set to 1 to reconnect to
the backup top node. The backup top node acquires information about
RMTP-II [Page 42]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
the children from their JoinStream requests.
5.11.4 Parent Node Failure Detection by a Child
Each control node sends Heartbeat packets on its multicast control
channel every interval Thb. All the children of the control node
receive the Heartbeat packets. If the child does not receive any
Heartbeat from the parent in an interval F*Thb, it declare the parent
failed.
5.11.5 Node Failure Detection by a Parent
A parent node receives its child node's HACKs and HeartbeatResponse
packets. A child node sends either a HACK or a HeartbeatResponse
packet every interval F*Thb. If a parent fails to receive a packet
from a receiver child node within an interval 3*F*Thb, it detects the
receiver's failure.
If a parent fails to receive a packet from a control node child
within an interval 6*F*Thb, it detects the failure of the control
node child.
If a the top node fails to receive a HeartbeatResponse from the
backup top node for a period of 6*F*thb, it detects the failure of
the sender node.
The failure detection interval for a control node or sender node is
longer than that of a receiver node because these nodes reside in
controlled environments and the probability of their failure is much
less than that of a receiver node.
5.11.6 Child Node Failure Recovery
If a control node detects the failure of a child, it sends an Eject
packet with the Reason Code set to 1 to indicate that there was no
response from the child.
A control node may fail and restart before its children can detect
the failure. If a control node receives a HACK or HeartbeatResponse
packet from a child node that is not in its child list, it sends an
Eject packet with Reason Code 2 to indicate that the parent
restarted.
If a child node has information about other control nodes, it can try
to reconnect to an alternate parent node. The child node makes a
JoinStream request with the rejoin flag set to 1 to join an alternate
parent node. The child node picks a random parent node from the list
of parents.
RMTP-II [Page 43]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
If a child node receives an Eject packet from its parent, it must
send a JoinStream request to establish its state at the parent.
There is no guarantee that the JoinStream will succeed. If the
parent rejects the JoinStream request, the child node can try to
reconnect to an alternate parent node.
5.11.6.1 Example
Thb = 1 sec.
F = 3
Control node Heartbeat interval = 1 sec.
Control node's failure detection interval = 3 sec.
Receiver's HeartbeatResponse interval = 3 sec.
Receiver child node's failure detection interval = 9 sec.
Control child node's failure detection interval = 18 sec.
If a control node fails, its child receiver nodes detect the parent's
failure in 3 sec. The control node's parent detects the control
node's failure in 18 sec. The receivers have 15 sec. to connect to
an alternate parent node before losing their connection to the RMTP-
II tree.
5.12 Control tree discovery
A Heartbeat packet from a control node may optionally include an
ancestor list, a peer list, and a child list.
An ancestor list contains the address of the control node's parent
and the intervening parent nodes up to the top node. The list is
ordered from the top node to the issuing node's parent.
A peer list contains the addresses of the control node and its
siblings under the parent node.
A children list contains the addresses of the child nodes of the
current node.
A control-node only provides the addresses of the control child nodes
from its children list.
(Top Node) TN | Heartbeat from TN: Tree Info
/^\ | ancestors:
/ | \ | peers:
/ | \ \ / children: AN1, AN2, AN3
/ | \
AN1 AN2 AN3 | Heartbeat from AN1: Tree Info
/|\ | ancestors: TN
/ | \ | peers: AN1, AN2, AN3
/ | \ \ / children: AN4, AN5, AN6
/ | \
RMTP-II [Page 44]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
AN4 AN5 AN6 | Heartbeat from AN4: Tree Info
| | ancestors: TN, AN1
| | peers: AN4, AN5, AN6
| \ / children:
RN
The top node, TN, is at the root of the tree and has no ancestor list
and no peer list. The top node's children list contains the addresses
of AN1, AN2, and AN3.
The node AN1 receives the Heartbeat from the top node, TN, and
receives: - an ancestor list with no entries, indicating its parent
is
the top node
- a peer list with no entries, confirming that its parent is
the top node
- a Children list which contains the addresses of AN1 and its
peers.
The node AN1 sends the following lists in the Heartbeat packet to its
child nodes:
- an ancestor list containing the address of its parent, TN
- a peer list containing the addresses of it and its siblings, AN
- a children list containing the address of AN4, AN5, and AN6
The node AN4 receives the following lists in the Heartbeat packet
from AN1:
- an ancestor list containing the address of TN
- a peer list containing the addresses of AN1 and its peers. If AN1
fails then AN4 will try to join one of its parent's peers AN2 and
AN3.
If the join is not successful, AN4 will try to connect to TN.
- a children list containing the addresses AN4 and its siblings
The node AN4 sends the following lists in the Heartbeat packet to its
child nodes:
- an ancestor list containing the addresses of TN, AN1, and AN4
- a peer list containing the addresses of AN4 and its siblings, AN5
RMTP-II [Page 45]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
and AN6
- a children list with no entries; only control children are revealed
The receiver node, RN, receives the following lists in the Heartbeat
packet from AN4:
- an ancestor list containing the addresses of TN, AN1, and AN4
- a peer list containing the addresses of AN4 and its peers, AN5 and
AN6
- a children list with no entries
If AN4 fails, RN can try to join its peers, AN5 or AN6. If AN4's
peers don't respond, RN can attempt to the join AN4's parent.
5.13 Control Tree Round Trip Times
RMTP provides an optional mechanism to get the round trip time
(HopRTT), between a node and its children, the round trip time from a
node to its bottom-most descendent (TotalRTT-down), and the round
trip time from a node to the top node (TotalRTT-up). The RTT Option
is used in Heartbeat and HeartbeatResponse packets to calculate and
propagate these RTTs.
A child node gets information about HopRTT and TotalRTT-up from the
Heartbeats of its parent. A parent node gets TotalRTT-down from the
HeartbeatResponse of its children.
5.13.1 Algorithm for RTT
Each node uses the following values:
ResponseDelay: This is the time difference, in milliseconds, between
the time that a Heartbeat packet was received from a parent and
the time the HeartbeatResponse packet sent to the parent. The
child node sends this information in HeartbeatResponse to its
parent.
HopRTT: This is the maximum round trip time from a parent node to its
children. The parent node calculates this from the TimeStamp and
reception time of HeartbeatResponse. The children receive this
from the parent in a Heartbeat packet.
TotalRTT-down: This is the node's RTT to its most distance
descendent.
RMTP-II [Page 46]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
TotalRTT-up: This is the node's RTT to the top node.
A parent node sends a first Heartbeat packet with the RTT option to
its child nodes to calculate the RTT. It sets the value of TimeStamp
in the RTT option to the sending time. Other fields of the RTT option
are set to 0.
The parent node receives HeartbeatResponse packets from its children
with the following fields:
TimeStamp: This is the original Heartbeat TimeStamp value.
ResponseDelay: This is the time difference, in milliseconds, between
the time the child received the Heartbeat packet and the time it
sent the HeartbeatResponse packet.
HopRTT: This field is 0 in HeartbeatResponse packets.
TotalRTT-down: This is the total round trip time from the child to
its most distant descendent. TotalRTT equals 0 for a node with no
children.
The parent node calculates the round trip time for each of its
children:
round trip time = reception time - TimeStamp - ResponseDelay
The parent node sets the value of HopRTT:
HopRTT = the maximum of child round trip times. HopRTT = 0 for a node
with no children.
The parent node calculates the distance to its most distance
descendent:
TotalRTT-down = maximum of the RTT to each child plus the TotalRTT-
down values to that child. TotalRTT-down equals 0 for nodes with no
children.
The node is also the child of its parent and receives Heartbeat
packets from it parent. If the Heartbeat has the RTT option and the
TimeStamp value is non-zero, then the parent is requesting
information to calculate the RTT. The node returns a
HeartbeatResponse with the following values:
TimeStamp: This is the parent Heartbeat TimeStamp value.
ResponseDelay: This is the difference between the time the node
RMTP-II [Page 47]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
received the Heartbeat and the time it sends the
HeartbeatResponse.
HopRTT: This field is 0 in HeartbeatResponse packets.
TotalRTT-down: This is the total round trip time from the node to its
most distant descendent, described above.
If the Heartbeat has the RTT option and the TimeStamp value is zero,
then the parent is providing the RTT information. The HopRTT and the
TotalRTT-up fields have the relevant information. The node calculates
its own TotalRTT-up value:
TotalRTT-up = HopRTT plus parent's TotalRTT-up
The node sends the new information in the next Heartbeat packet it
sends. The following information is set in the RTT option of the
Heartbeat packet.
TimeStamp: This is set to 0.
ResponseDelay: This is set to 0.
HopRTT: This is the maximum of the round trip times from the parent
to its children.
TotalRTT-up: This is the parent's RTT to the top node, determined
from Heartbeats from its parent, described above. TotalRTT-up
equal 0 for the top node
6 RMTP-II Packet Formats
This section contains the packet formats for the RMTP-II packet
types. Each field of the packet is shown and briefly discussed. The
packet formats assume a 32-bit addressing scheme.
6.1 RMTP-II Fixed Header
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VER |O Num|Res| TYPE | RMTP-II Tree ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RMTP-II Tree ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
RMTP-II [Page 48]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
| Data or Optional Fields |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
VER: Specifies the version number of the protocol, the first field
to be checked by a receiver
O: a number that specifies the number of options present in this
packet. If no options are present, then this field is set to 0.
TYPE: Packet type.
Value Meaning
0 Reserved
1 Data
2 Retransmission
3 HACK
4 JoinStream
5 JoinAck
6 JoinConfirm
7 LeaveStream
8 Heartbeat
9 NullData
10 Eject
11 EOS
12 HeartbeatResponse
13 LeaveConfirm
14-255 Reserved for Future Use
Tree ID: Tree ID is a six-byte identifier of the RMTP-II tree.
Currently Tree ID consists of the four byte IP address and a two
byte UDP port of the top node.
RMTP-II [Page 49]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.2 Data
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Last Stable Packet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| StreamID |N|E| Reserved | QOS |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Length | Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sequence Number: A sender stamps each Data packet that it sends with
a sequence number. Sequence numbers are 32 bits in length and
have a valid range of 1 to 2^32-1. RMTP-II does not use sequence
number 0 and it is always ignored. Because this range is finite,
all arithmetic and comparison with these numbers is Modulo 2^32.
RMTP-II requires that no more than 2^31 packets can be created
over a period equal to the maximum lifetime for a datagram packet
in the network. This condition holds true when IP is used as the
datagram service.
Last Stable Packet: This field contains the sequence number of
highest numbered stable packet at the sender.
TimeStamp: This is a 32 bit time stamp with one second granularity
used to indicate the time when the stream was started.
StreamID: This is a 16 bit ID which uniquely identifies the stream.
N: NACK enabled. This is a flag that is set to 1 if NACKs are
enabled for this stream.
E: This is a flag that is set to 1 to indicate the last packet of
the stream.
QOS: This specifies the desired quality of service for the Data
packet. The valid values for the QoS field are:
2 Unordered
3 Source Ordered
4 Time bound Reliability
Data Length: This is the length of the data in bytes.
RMTP-II [Page 50]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Data: This holds the data to be delivered.
RMTP-II [Page 51]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.3 Retransmission Packet
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Last Stable Packet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| StreamID |N|E|D|S| Reserved | QOS |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Length | Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sequence Number: This is the sequence number of the original data
packet.
Last Stable Packet: This field contains the sequence number of the
last stable packet at the sender.
TimeStamp: This is a 32 bit time stamp with one second granularity
used to indicate the time when the stream was started.
StreamID: This is a 16 bit ID which uniquely identifies the stream.
N: This is a flag that is set to 1 if NACKs are enabled for this
stream
E: This is a flag that is set to 1 to indicate the last packet of a
stream.
D: This a flag that is set to 1 for a retransmission packet from a
designated receiver.
S: This flag is set to 1 to enable subtree multicast for
retransmission. If the flag is set to 1, an intermediate control
node will not forward the packet to its child nodes.
QOS: This is a number that specifies the desired QoS for the Data
packet. The valid values for the QoS field are:
2 Unordered
3 Source Ordered
4 Time bound Reliability
Data Length: This is the length of the data in bytes.
RMTP-II [Page 52]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Data: Data to be delivered.
RMTP-II [Page 53]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.4 HACK
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Mcast Group Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Mcast Port | StreamID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Child Index |E|N| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| HACK Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| High Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Low Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stable Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Bitmap length | Receiver Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Bitmap | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
TimeStamp: This is a 32 bit time stamp with one second granularity
used to indicate the time when the stream was started.
McastGroup Address: This is the Multicast group address for this
stream.
Mcast Port: This is the UDP port for this stream.
StreamID: This is a 16 bit ID which uniquely identifies the stream.
Child Index: This is the index number of the child node in its
parent's child list.
E: This flag is set to 1 for the last HACK, to indicate all packets
have been received.
N: This flag is set to 1 to indicate that this is a NACK packet.
HACK Sequence Number: This is the sequence number of the HACK
packet.
High Sequence Number: This is the sequence number of the highest
RMTP-II [Page 54]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
numbered packet received.
Low Sequence Number: This is the sequence number of the lowest
numbered missing packet.
Stable Sequence Number: The stable sequence number is the sequence
number of the highest contiguous stable packet known to the node.
Bitmap Length: This is the length of the bitmap in 32 bit words.
Receiver Count: This is the number of receivers connected to this
node.
Bitmap: The bitmap of the missing packets between Low Sequence
Number and High Sequence Number; starts at bit position Low
Sequence Number modulo 32; Leading and trailing bits are ignored.
RMTP-II [Page 55]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.5 JoinConfirm
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Child Index | Role | |C|R| HB_TTL |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Heartbeat address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Heartbeat Port | Overhead |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence No | Number of Senders |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 1 Last Stable |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 1 TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 1 StreamID | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 2 Last Stable |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 2 TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 2 StreamID | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Child Index: This is the index of the child node in its parent's
child list.
Role: This identifies or acknowledges the role of the parent.
C: This flag is set to 1, if the parent accepts the child, or to 0
if the parent rejects the child.
R: This flag is set to 1 to indicate confirmation for a Rejoin
packet.
HB_TTL: This is the time-to-live for the Heartbeat packet.
Parent Heartbeat Address: This is the multicast address for the
parent's local control channel where Heartbeats are sent.
Parent Heartbeat Port: This is the UDP port for the parent's local
control channel where Heartbeats are sent
Overhead: This is the response constant that specifies the number of
HACKs that the answering parent expects for each Data packet.
RMTP-II [Page 56]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Sequence No.: This is the sequence number of the JoinStream packet.
Number of Senders: This is the number of known senders.
Sender Last Stable: This is the number of the last stable packet for
the data stream.
Sender TimeStamp: This is a 32 bit time stamp with one second
granularity used to indicate the time when the stream was started.
StreamID: This is the identifier of the stream.
RMTP-II [Page 57]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.6 JoinStream
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Orig. TTL |R| Res | Role | Res |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence No | Number of Senders |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 1 StreamID | Sender 1 Mcast Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 1 Mcast Addr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 2 StreamID | Sender 2 Mcast Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender 2 Mcast Addr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|...
Orig. TTL: This is the original TTL sent from the originator. This
field allows receiving node to assess roughly the distance from
the source.
R: This flag is set to 1 to indicate a rejoin request.
Role: This specifies the role of the joining node in the RMTP tree.
Sequence No: This is the number of times the same request has been
sent.
Number of Senders: This is the number of known senders.
Sender StreamID: This is the stream identifier for the sender's data
stream.
Sender Mcast Port: This is the multicast port for the sender's data
channel.
Sender Mcast Addr: This is the multicast address for the sender's
data channel.
RMTP-II [Page 58]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.7 JoinAck
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Orig. TTL | Res | Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Heartbeat address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Heartbeat Port | HB_TTL | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Orig. TTL: This is the original TTL sent from the originator. This
field allows a receiving node to roughly assess the distance from
the source.
Sequence No.: This indicates how many times the same packet has been
sent.
Parent Heartbeat Address: This is the Heartbeat and the parent
control multicast address.
Parent Heartbeat Port: This is the Heartbeat and the parent control
UDP port.
HB_TTL: This is the time-to-live for the Heartbeat packet.
RMTP-II [Page 59]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.8 LeaveStream
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Orig. TTL | Sequence No | Role | Res |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| StreamID | Mcast Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Mcast Group Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Orig. TTL: This is the original time-to-live sent from the
originator.
Sequence no.: This indicates how many times the same packet has been
sent.
Role: This is the role of the node sending the leave packet.
StreamID: This is the ID of the stream.
Mcast UDP Port: This is the multicast port of this stream.
McastGroup Address: This is the multicast group address of this
stream.
RMTP-II [Page 60]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.9 Heartbeat
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Port |N|T| Reserved | Role |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Parent Address: This is the IP address of the parent.
Parent Port: This is the UDP port number of the parent.
N: This flag is set to 1 to indicate that the node must respond
immediately.
T: This flag is set to 1 to indicate that the backup top node is
taking over as the top node for the tree.
Role: This is the role of the parent in the RMTP tree.
RMTP-II [Page 61]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.10 HeartbeatResponse
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Role | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Child Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Role: This is the role of the node in the RMTP tree.
Child Identifier: This is a unique identifier for the child, the
StreamID for a sender node, the Child Index for other nodes
RMTP-II [Page 62]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.11 NullData
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Last Stable |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|N| Reserved | StreamID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sequence Number: This is the sequence number of the last Data packet
sent.
Last Stable: This is the sequence number of last stable Data packet
at the Sender.
TimeStamp: This is a 32-bit time stamp with one second granularity
used to indicate reincarnation of the sender.
N: This flag is set to 1 to enable NACKs for this stream.
StreamID : This is the 32-bit stream identifier.
RMTP-II [Page 63]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.12 Eject
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reason Code | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Reason Code: This code indicates the reason for sending the eject
packet
RMTP-II [Page 64]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.13 EOS
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Mcast Group Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Mcast Port | StreamID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
TimeStamp: This is a 32-bit time stamp with one second granularity
used to indicate reincarnation of the sender.
McastGroup Address: This is the multicast group address for this
stream.
Mcast Port: This is the UDP port number for this stream.
StreamID: This is the unique 32-bit stream identifier.
RMTP-II [Page 65]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.14 LeaveConfirm
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence No | Reserved | StreamID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sequence No.: This is the sequence number of LeaveConfirm packet.
StreamID: This is the unique 32-bit stream identifier.
RMTP-II [Page 66]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15 RMTP-II Options Header
RMTP-II Options take a general form for their first 32 bits. That
form is:
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Option Specific Format |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: a 2-bit flag that indicates how the packet is processed if the
option is not
supported by the implementation.
00: ignore the option, but process the packet
01: ignore the packet
10: leave the RMTP tree
11: reserved
The flag A is set to 00 to indicate that the option is to be
skipped and the packet processed. A is set to 01 to indicate that
the whole packet is to be discarded. A is set to 10 to indicate
that this is an important option which modifies the functional
characteristics of the RMTP protocol and so the node must exit the
RMTP tree.
OTYPE: A number that indicates the option type. The predefined values
of OTYPE are:
1 Sender Forward
2 FEC
3 Control tree discovery
4 Global parameter
5 fault tolerant backup top node information
6 congestion control information
7 Round-trip Time for control tree
8 Time bounded reliability
9 Sender status Update Option
10 Update Sender Parameters Option
11 NACK Suppression Option
12-63 reserved for future use
LENGTH: Specifies the length of this option block in 32 bit words.
RMTP-II [Page 67]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.1 Sender Forward Option
This option allows a sender which does not have multicast capability,
to forward data to the top node which will multicast the data to the
tree.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | RESERVED |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 00 ignore option, process packet
OTYPE: set to 1
LENGTH: set to 1
RMTP-II [Page 68]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.2 FEC Option
See Appendix D for a description of the Forward Error Correction
option.
RMTP-II [Page 69]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.3 RMTP tree control node Info Option
A control node may optionally include lists of control node addresses
in its Heartbeat packet.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LEN | reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| parent nodes | peer nodes | child nodes | reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+- +-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 00 ignore option, process packet
OTYPE: 3
LEN: This is the number of 32-bit words in the option packet.
Ancestor Nodes: This is the number of address-port pairs in
ancestor's address list. The addresses are ordered from the top node
down to the current node's parent. If the number is zero then this
node does not have any parents and must be the top node.
Peer Nodes: This is the number of addresses-port pairs in the peer
address list. The peers list contains the addresses of the control
node children of this node's parent. The top node has zero peer
nodes.
Child Nodes: This is the number of address-port pairs in the child
control node list of this node.
Node Address: These entries contain the IP address of the listed
nodes.
RMTP-II [Page 70]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Node Port: The port number of the listed nodes
RMTP-II [Page 71]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.4 Global Parameter Modifications Option
This option enables modifications to the global parameters of the
RMTP tree. This option is available for JoinConfirm, Heartbeat, and
HeartbeatResponse packets.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | sequence number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Branching factor | Thack constant |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RMax | R |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Thack_max | Tjoin_response |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Rjoin | Theartbeat |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Fthreshold | Tnulldata_max |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|O| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 10 Exit the RMTP tree, this option modified the
behavior of the timings.
OTYPE: 4
LENGTH: 6 (6 32-bit words)
Sequence Number: This is the sequence number that identifies this
update. This number is set by the top node. All the nodes of the RMTP
tree use this sequence number to determine if this is a duplicate
packet. The Sequence Number is the only field when the option is use
in a HeartbeatResponse packet.
Branching factor: The branching factor B denotes the maximum number
of children for any node of the tree.
Thack constant: The Thack constant is a scaling coefficient for
Thack.
R: This is a real number which specifies the number of HACKs which
should be received, on average, for each Data packet sent, at any
node.
RxMax: This is the maximum number of retransmissions allowed for a
RMTP-II [Page 72]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
data packet.
Thack_max: This number specifies the maximum time allowed between
HACK transmissions for each receiver.
Tjoin_response: This number specifies the maximum time to wait for a
response to a JoinStream request from the parent node. The response
should be either a JoinAck or a JoinConfirm.
Rjoin: This number specifies the number of retries of the JoinStream
request before declaring the parent unreachable.
Theartbeat: This number specifies the time interval Thb at which
control nodes multicast Heartbeat packets.
Fthreshold: This number specifies the threshold time for failure
detection.
Tnulldata_max: This number specifies the maximum time interval for
sending NullData packets. If the receiver does not receive any data
packet from the sender within 2*Tnulldata_max, it detects a sender
failure.
O: This is a bit flag that is set to 1 to indicate that all
designated receivers should use the optimistic HACK mechanism. The
flag is set to 0 to indicate the pessimistic HACK mechanism.
RMTP-II [Page 73]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.5 Fault Tolerant Top Node Option
This option enables a backup top node.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Backup top node UDP Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Backup top node IP Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 00 skip option, process packet
set to 01 discard packet
OTYPE: set to 5 fault tolerant backup top node information
Lenth: set to 2 two 32-bit words in the packet
Backup top node UDP Port:
Set to 0 no backup top node
Set to UDP port of backup top node, if the backup top node exists
Backup top node IP Address:
Set to 0 no backup top node
Set to IP Address of backup top node, if the backup top node exists
RMTP-II [Page 74]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.6 LTRC Congestion Control Option
This option enables transmission of loss rate for congestion control.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | LossRate |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 00 skip option, process packet
set to 01 discard packet
OTYPE: set to 6 LTRC Congestion Control
LENGTH: set to 1 one 32-bit field
LossRate: Set to loss rate value * 100. If the loss rate is , say,
10.34%, then LossRate will be 1034 (10.34*100).
RMTP-II [Page 75]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.7 Control Tree Round Trip Time (RTT)
This option is used in heartbeat and HeartbeatResponse packets to
calculate the roundtrip time and to propagate the RTT to all the
nodes in the RMTP tree.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp[0] |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TimeStamp[1] |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ResponseDelay |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| HopRTT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TotalRTT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 00 skip option, process packet
OTYPE: set to 7: Round Trip Time (RTT)
LENGTH: set to 6
TimeStamp: Set to the current time when a control node wants to
calculate the RTT. When a child node receives a Heartbeat with this
field set to a non-zero value, it sets the RTT option in the next
HeartbeatResponse it sends. If this field is set to zero, then the
control node is providing HopRTT-up and TotalRTT-up to its child
nodes.
ResponseDelay: This field is set only in HeartbeatResponse packets.
ResponseDelay is difference, in milliseconds, between the time the
heartbeat packet was received and the time the HeartbeatResponse is
sent.
HopRTT: The maximum round trip time from a parent node to child
nodes. This is set only in heartbeat packets.
TotalRTT: If the packet is a heartbeat, then this field is TotalRTT-
up, the round trip time from top node to the child node. If this
packet is HeartbeatResponse packet, then this field is TotalRTT-down,
the maximum round trip time from the node sending the
HeartbeatResponse to its most distance descendent.
RMTP-II [Page 76]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.8 Time Bounded Reliability Option
This option provides support for time bounded reliability and can be
used by a sender in the Data packets.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Bound |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Tsent_0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Tsent_1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A: set to 10 leave RMTP tree
OTYPE: set to 8 Time bounded reliability
LENGTH: set to 3 three 32-bit words in the packet
Bound: This is the maximum time in seconds allowed for recovery of
the packet
Tsent: This is the time at the sender when the packet was generated.
The timestamp is expressed in elapsed seconds and microseconds since
00:00 Universal Coordinated Time, January 1, 1970. Tsent_0 contains
the seconds part of the timestamp and Tsent_1 contains the
microseconds part of the timestamp.
RMTP-II [Page 77]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.9 Sender Status Update Option
The sender uses this option in the HeartbeatResponse packet to update
top node about current status at the sender.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Current Rate |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A:
set to 00 skip option, process packet
OTYPE: set to 9 This is the Sender Status Update Option
LENGTH: set to 1 There is one 32-bit word in the packet
Current Rate: This is the current admit rate at the sender.
RMTP-II [Page 78]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.10 Sender Parameters Update Option
The top node uses this option in heartbeat, HeartbeatResponse, Join,
and JoinConfirm packets to pass updated parameters to the sender.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Minimum Rate |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Maximum Rate | Initial Rate |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A:
set to 10 leave the RMTP tree
OTYPE: set to 10 This is an Update Sender Parameters Option.
LENGTH: set to 2 There are two 32-bit words in the packet.
Minimum Rate: This is the minimum admit rate for the sender.
Maximum Rate: This is the maximum admit rate sender for the sender.
Initial Rate: This is the starting rate. If currently active then use
Initial Rate as the current rate.
RMTP-II [Page 79]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
6.15.11 NACK Suppression Option
This option is sent in Retransmission or Heartbeat packets by control
nodes to support NACK suppression.
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A | OTYPE | LENGTH | Not Used |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Start Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| End Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Suppression Bitmap |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A:
set to 00 skip option, process the packet
set to 01 discard packet
OTYPE: set to 11 NACK Suppression Option
LENGTH: set to 3 + Bitmap length
Start Sequence Number: This specifies the lowest sequence number of
Data packet, requested for retransmission but not available at the
control node.
End Sequence Number: This specifies the highest sequence number of
Data packet, requested for retransmission but not available at the
control node.
Suppression Bitmap: This is a bitmap of missing Data packets at the
control node, requested by at least one child node for
retransmission.
RMTP-II [Page 80]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
7 ACKNOWLEDGEMENTS
RMTP-II draws heavily on the rich work done by the reliable multicast
research community over the last decade. The area of tree based
reliable multicast started with RMTP [PSK94, PSLB97] and TMTP
[YGS95]. RMTP was the first protocol to propose organizing receivers
into a tree formation for local recovery and avoidance of ACK
implosion. TMTP introduced a number of new concepts, most notably the
efficiencies gained by combining local NACKs with global ACKs. The
MTP family of protocols [AFK92, BOGKS94] were the first to introduce
the notion of different types of nodes having different roles in the
communication group. The SRM protocol [FVLMZ96] showed how to use
multicast NACKs for efficient local recovery and NACK suppression.
PGM [SFLT98] showed how to do NACK suppression without relying on
many-many multicast.
The optional forward error correction is taken from [NB96, NBT97,
Rizzo97]. The NACK timer backoff algorithms are taken directly from
[NB98].
LORAX [LLG96] showed how to have multiple senders share a common
acknowledgement tree. Studies have been done comparing different
classes of reliable multicast protocols, showing that local recovery
is a key piece of a scaleable reliable multicast protocol [NLJBC98],
and that the combination of NACKs and ACKs proposed originally in
TMTP has the highest scalability of any class of reliable multicast
protocols [LG96]. MFTP [MRTW97] was the first protocol to emphasize
the importance of support for satellite and asymmetrical networks.
Recent work [PPV98, LG97, LC98, SFLT98] has shown the benefits
changing routers to better support reliable multicast. RMTP-II has
drawn on all of these ideas, and owes much of its design to these
works and many others.
CONTRIBUTORS AND REVIEWERS
Special thanks goes to the following individuals, who have greatly
contributed to the design and review of RMTP-II.
Ernst Biersack, Institut EURECOM
Carsten Bormann, University of Bremen, TZI
Rick Buskens, Lucent
Jon Crowcroft, UC London
Christophe Diot, INRIA
Jamal Golestani, Lucent
Brian Levine, UC Santa Cruz
John Lin, Lucent
Don Newell, Intel
Jorg Nonnenmacher, Institut EURECOM
RMTP-II [Page 81]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
Joerg Ott, University of Bremen, TZI
Christos Papadopoulos, University of Washington, St. Louis
Krishan Sabnani, Lucent
Nils Seifert, Telligue
Muhammad Siddiqui, Lucent
Tony Speakman, Cisco
Paul Stirpe, Reuters
Gursel Taskale, Reuters
Wei Wu, Reuters
Qingming Wang, Lucent
Rajendra Yavatkar, Intel
RMTP-II [Page 82]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
8 REFERENCES
[AFK92] S. Armstrong, A. Freier, and K. Marzullo, "Multicast
Transport Protocol", DARPA RFC 1301, February 1992.
[BOGKS94] C. Bormann, J. Ott, H.-C. Gehrcke, T. Kerschat and N.
Seifert, "MTP-2: Towards Achieving the S.E.R.O. Properties for
Multicast Transport'', International Conference on Computer
Communications and Networks (ICCCN-94), 1994.
[FVLMZ96] S. Floyd, V. Jacobson, C. Liu, S. McCanne, L. Zhang, "A
Reliable Multicast Framework for Light-weight Sessions and
Application Level Framing", ACM Transactions on Networking, November
1996.
[Jacobson88] V. Jacobson, "Congestion Avoidance and Control",
Computer Communications Review, vol. 18, no. 4, pp.314-s29.
[Jacobson90] V. Jacobson, "Berkeley TCP Evolution from 4.3-Tahoe to
4.3-Reno", Proceedings of the Eighteenth internet Engineering Task
Force, p 365, (Sept), University of British Columbia, Vancouver, B.C.
[LC98] D. Li and D. Cheriton. "OTERS: Exploiting the Routing
Topology for High-Performance Reliable Multicast", work in progress.
[LG96] B. Levine and J.J. Garcia-Luna-Aceves, "A Comparison of Known
Classes of Reliable Multicast Protocols", Proc. International
Conference on Network Protocols (ICNP-96), October 1996.
[LG97] B. Levine and J.J. Garcia-Luna-Aceves, "Improving Internet
Multicast with Routing Labels," IEEE International Conference on
Network Protocols (ICNP-97), October 28 - 31, 1997. p. 241-250.
[LLG96] B. Levine, D. Lavo, and J.J. Garcia-Luna-Aceves. "The Case
for Concurrent Reliable Multicasting Using Shared ACK Trees", Proc.
The ACM Multimedia '96 Conference, November 1996.
[MRTW97] K. Miller, K. Robertson, A. Tweedly, and M. White.
"StarBurst Multicast File Transfer Protocol (MFTP) Specification",
<draft-miller-mftp-spec-02.txt>, January 1997, work in progress.
[NB96] J. Nonnenmacher and E.W. Biersack, "Reliable Multicast: Where
to use Forward Error Correction", Proc. 5th. Workshop on Protocols
for High Speed Networks, Sophia Antipolis, France, Oct. 1996.
[NB98] J. Nonnenmacher and E. W. Biersack, "Optimal Multicast
Feedback", Proc. IEEE INFOCOM 1998, March 1998.
RMTP-II [Page 83]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
[NBT97] J. Nonnenmacher, E. W. Biersack, and Don Towsley. "Parity-
Based Loss Recovery for Reliable Multicast Transmission", In Proc. of
ACM SIGCOMM '97, Cannes, France, September 1997.
[NLJBC98] J. Nonnenmacher, M. Lacher, M. Jung, E. W. Biersack and
Georg Carle, "How bad is reliable multicast without local recovery?",
In Proc. of IEEE INFOCOM'98, San Francisco, CA, USA, March 1998.
[PPV98] C. Papadopoulos, G. Parulkar, and G. Varghese, "An Error
Control Scheme for Large Scale Multicast Applications", INFOCOM '98,
March 1998.
[PSK94] S. Paul, K. Sabnani, and D. Kristol, "Multicast Transport
Protocols for High Speed Networks", Proceedings of International
Conference on Network Protocols (ICNP-94), 1994.
[PSLB97] "Reliable Multicast Transport Protocol (RMTP)", S. Paul, K.
K. Sabnani, J. C. Lin, and S. Bhattacharyya, IEEE Journal on Selected
Areas in Communications, Vol. 15, No. 3, April 1997.
[Rizzo97] L. Rizzo, "Effective erasure codes for reliable computer
communications protocols", DEIT Technical Report LR-970115.
[SFLT98] T. Speakman, D. Farinacci, S. Lin, and A. Tweedly,
"Pragmatic Group Multicast (PGM) Transport Protocol Specification",
Internet Draft <draft-speakman-pgm-spec-01.txt>, January 1998, work
in progress.
[WMK94] B. Whetten, T. Montgomery, S. Kaplan, "A High Performance
Totally Ordered Multicast Protocol", in "Theory and Practice in
Distributed Systems", Springer Verlag LCNS938, 1994.
[YGS95] R. Yavatkar, J. Griffioen, and M. Sudan, "A Reliable
Dissemination Protocol for Interactive Collaborative Applications,"
Proceedings of the ACM Multimedia '95 Conference, November 1995.
RMTP-II [Page 84]
Internet Draft The RMTP-II PROTOCOL 8 April 1998
AUTHORS' ADDRESSES
Brian Whetten
whetten@gcast.com
Murali Basavaiah
murali@gcast.com
Sanjoy Paul
sanjoy@dnrc.bell-labs.com
Todd Montgomery
tmont@cs.wvu.edu
Naveen Rastogi
naveen@gcast.com
Jim Conlan
jim@gcast.com
Thomas Yeh
yeh@gcast.com
RMTP-II [Page 85]