Internet Draft                                              Man Yeob Lim
<draft-lim-ip-reliable-multicast-00.txt>                   Dae Young Kim
                                                     Chungnam Nat. Univ.
                                                           November 1997

                  IP Extension for Reliable Multicast

Status of This Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress".

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe),
ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

Abstract

This memo presents an IP extension for recovering multicast packets
dropped because of congestion. IP routers that implement the extension
can recover dropped packets far faster than group member end hosts can.
Because the necessary interactions are limited to adjacent routers, the
scheme substantially reduces the overall signaling overhead among group
members for packet recovery.

Lim & Kim                    Expires May 1998                   [page 1]


Internet Draft    IP Extension for Reliable Multicasting   November 1997

                           Table of Contents

Status of This Memo
Abstract

1. Introduction

2. Overview
   2.1. Recovering single loss by cache routers
   2.2. Recovering burst loss by buffering routers
   2.3. Three schemes of recovery
   2.4. Delay before retransmission

3. Protocol description
   3.1. Extension to IP datagram adding cache/buffering router address
   3.2. Converting multicast datagram to unicast datagram
   3.3. Drop ICMP Message
   3.4. ICMP register buffering host message

4. Implementation issues
   4.1. Compatibility
   4.2. Minimum implementation
   4.3. Cache/Buffer size consideration


1. Introduction

Since IP multicast was proposed [1], there has been much research on
reliable multicast protocols. However, the fact that multicast itself
is performed in the IP layer while solutions are sought in the
transport layer makes the search for solutions more difficult. The
transport protocol resides on group member end hosts, which are spread
over a large geographical area, so when packets are lost in the IP
layer, the loss not only takes a long time to detect in the transport
layer but also makes group coordination very complicated. Even though
many schemes have been proposed to overcome multicast communication
losses [2-6], it is hard to devise an efficient solution without
involving the IP protocol in the task.

There are two types of packet loss in real Internet environments. One
type is invisible to routers, while the other is not. The first type
includes packet loss due to link errors or router failures. Recovering
these packets requires an end-to-end Ack/Nak operation. The second type
includes packets dropped because of congestion or TTL (time to live)
expiration. This type of loss results from an explicit router decision.
As transmission quality improves, the first type of loss is diminishing
to a negligible level and congestion is becoming the major cause of
packet loss. Because routers know which packets they drop under
congestion, a recovery scheme based on explicit coordination among
routers becomes possible. If lost packets are recovered immediately
and actively by IP routers, before any later intervention by a
higher-layer protocol, the end-to-end multicast protocol can be
significantly simplified and recovery can be much faster. The minimal
requirement on a congested router is that it be able to examine a
packet and collect the necessary information before actually dropping
it.

A study [7] of the multicast network observed that loss on links
accounts for only 2% or less of all packet loss, and that the remaining
congestion loss falls into two types, single and burst. Most congestion
loss consists of isolated single losses, but a few very long loss
bursts, lasting from a few seconds up to 3 minutes (around 2000
consecutive packets), contribute heavily to the total packet loss.
Single losses are believed to come from momentary congestion of
relatively short duration, while burst losses come from long-lasting
congestion.

We propose extensions to the IP and ICMP protocols for efficient
recovery of both single and burst packet losses due to router
congestion. We propose to place multicast routers with a recovery cache
or buffer at various places in the network so that lost packets can be
recovered by coordination among routers. It is not appropriate for
routers to recover every lost packet. Recovery should be limited to
important multicast packets, which are to be specially tagged, so that
the cache size can be minimized and multicast routers do not incur too
much processing overhead. In IP version 4, the reliability bit in the
type of service field can be used for this purpose. In IP version 6,
one value of the priority field is suitable for specifying reliable
multicasting per packet, or the flow label can be used to specify
reliable multicasting per data stream.


2. Overview

2.1. Recovering single loss by cache routers

Single losses are recovered by the so-called cache router, which is
located one hop upstream of the point where congestion occurs. A cache
router continuously copies every multicast packet whose QoS carries the
recovery attribute into its ring-type cache. When a cache router
forwards a multicast packet, it writes its own IP address into the
cache router address in the option field of the packet. As the packet
travels through the network toward its destinations, the cache router
address is updated every time the packet passes through a cache router.
When a router has to drop a packet due to congestion, it sends a drop
message to the cache router whose IP address is specified in the option
field of the packet. Upon receiving the drop message, the cache router
looks for the same packet in its cache. If the packet is still there,
the cache router decrements the TTL field of the packet, retransmits
the packet, and stores a copy in the cache again, so that
retransmission can be repeated as long as the packet's TTL has not
reached zero. To generate a drop message properly, the congested router
must be able to do a minimum of processing before actually dropping a
packet, namely to accept the packet into a buffer and to build a drop
message that duplicates the header part.
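
The cache router behavior described above might be sketched as follows.
This is only an illustration under stated assumptions: the class name
RingCache, the dictionary packet representation, and the use of the
(source address, identification) pair as lookup key are not defined by
this memo, and the sketch stops retransmitting once a further decrement
would bring the TTL to zero.

```python
from collections import OrderedDict


class RingCache:
    """Fixed-capacity packet cache; the oldest entry is overwritten first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = OrderedDict()  # (src, ident) -> packet dict

    def store(self, packet):
        key = (packet["src"], packet["ident"])
        self.packets.pop(key, None)            # re-storing moves it to the newest slot
        self.packets[key] = packet
        if len(self.packets) > self.capacity:
            self.packets.popitem(last=False)   # evict the oldest entry

    def handle_drop(self, src, ident):
        """Return the packet to retransmit, or None if recovery must escalate."""
        packet = self.packets.get((src, ident))
        if packet is None:
            return None                        # no longer cached
        if packet["ttl"] <= 1:                 # a decrement would reach zero
            del self.packets[(src, ident)]
            return None
        resend = dict(packet, ttl=packet["ttl"] - 1)
        self.store(resend)                     # keep a copy for repeated recovery
        return resend
```

A return value of None corresponds to the case in which the drop message
has to be passed on to the buffering router.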

2.2. Recovering burst loss by buffering routers

Burst loss occurs when a router remains congested for a long time. A
burst can range from a few packets to several thousand packets, so a
large cache is needed to store the copies. Because of this size, it is
not feasible to equip every router with a cache big enough to recover
burst loss. Instead, we place special routers for burst loss, so-called
buffering routers, at several nodes where the multicast tree branches.
A buffering router covers recovery from burst loss that occurs between
itself and the next buffering router in the routing tree. When a burst
loss occurs at a router, the router sends a series of drop messages to
the previous-hop router, i.e. the cache router, which relays the drop
messages to the buffering router. Once a burst loss occurs, it will
take a while before the router is relieved of the congestion, so the
router includes an estimated recovery time in the drop message. The
buffering router waits until the congested router has recovered from
congestion and then retransmits the packets. The packets are converted
to unicast packets and directed to the congested router. The congested
router then restores the multicast packets from the unicast packets and
resumes multicast forwarding.


When a host is available to serve as the buffering device in place of
the buffering router itself, the host can be used to store copies of
multicast packets. If a host registers itself with a buffering router
as a buffering device, the buffering router sends a duplicate of every
multicast packet of the proper QoS that passes through it to the
buffering host, and updates the buffering router address in the option
field of the multicast packets with the IP address of the buffering
host.

2.3. Three schemes of recovery

Recovery is done in three steps. First, if a router encounters a loss,
either single or burst, it sends a drop message to the cache router. If
the cache router still has the packet in its cache, it transmits the
packet to the requester. Second, if the cache router finds that the
packet is no longer in its cache, it forwards the drop message to the
buffering router. The buffering router then searches for the packet in
its buffer and sends it to the requester by unicast. The congested
router converts the unicast packet into a multicast packet and resumes
multicast routing. Third, if the buffering router finds that the packet
is no longer in its buffer, it forwards the message to the original
source of the packet. The source host then retransmits the packet to
the congested router by unicast. The three schemes can be implemented
separately or in any combination. Used together, they ensure full
recovery from packet loss due to congestion.
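
The order in which the three stages are tried can be illustrated with a
short sketch. The function name recover and the modelling of the cache,
the buffer, and the source's send store as plain dictionaries are
assumptions made only for this example; a real implementation would
exchange drop messages between separate nodes.

```python
def recover(key, cache, buffer, source_store):
    """Try each recovery stage in order.

    Returns (stage, delivery, packet), or (None, None, None) when no
    stage still holds the packet.
    """
    if key in cache:
        # Stage 1: the adjacent cache router resends in multicast format.
        return ("cache router", "multicast", cache[key])
    if key in buffer:
        # Stage 2: the buffering router resends by unicast.
        return ("buffering router", "unicast", buffer[key])
    if key in source_store:
        # Stage 3: the original source resends by unicast.
        return ("source host", "unicast", source_store[key])
    return (None, None, None)
```

Because each stage is consulted only when the previous one misses, a
deployment that omits the cache or buffering routers simply passes an
empty dictionary for that stage.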

2.4. Delay before retransmission

When a single or burst loss occurs, the question arises of how soon the
router will recover from congestion and be ready to receive packets
again. If the congestion lasts for a long time, fast retransmission is
useless or even makes the problem worse. The current version of IP has
no provision for this; TCP flow control handles the situation by
reducing the window size. However, window size control does not
correspond precisely to the recovery timing of the router, and it
cannot do better because no information about the router's recovery
time is available to it. The congested router, on the other hand, may
know why the overflow occurred, whether the loss is single or burst,
and how soon it can be relieved. If so, the router can add this
information, the estimated recovery time, to the drop message. The
cache router, which is in charge of the primary retransmission, decides
whether to retransmit immediately or after a certain delay. If a delay
is required, the cache router forwards the drop message to the
buffering router together with the delay information. Because the
buffering router has enough buffer space to hold packets for a long
time, the packets are retransmitted after the delay that the congested
router needs to recover.
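
The decision made by the cache router could be sketched as below. This
memo does not define the unit of the estimated recovery time or the
rule the cache router applies, so the millisecond unit, the function
name plan_retransmission, and the parameter instant_threshold_ms are
all hypothetical choices for illustration.

```python
def plan_retransmission(estimated_recovery_ms, instant_threshold_ms=0):
    """Choose who retransmits and after what delay (hypothetical rule).

    Estimates at or below the threshold are served at once from the
    cache router; longer estimates are handed to the buffering router
    together with the delay to wait.
    """
    if estimated_recovery_ms <= instant_threshold_ms:
        return {"retransmit_from": "cache router", "delay_ms": 0}
    return {"retransmit_from": "buffering router",
            "delay_ms": estimated_recovery_ms}
```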


3. Protocol description

3.1. Extension to IP datagram adding cache/buffering router address

An option is defined in IP datagrams to carry the cache and buffering
router IP addresses. The cache IP address is the IP address of the
cache router that provides recovery from single loss, and the buffering
IP address is the IP address of the buffering router or host that
provides recovery from burst loss. Figure 1 shows the format of the
option in the IP datagram.


 0                8              16              24             31
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   code(10)    |    length     |           reserved            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                   Cache Router IP Address                     |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |               Buffering Router/Host IP Address                |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 1.  The format of the router option in an IP datagram


The source host initializes the cache address to 0 and the buffering
address to the IP address of the source host. Each cache router sets
the cache address field to its own IP address, and each buffering
router sets the buffering address field to its own IP address. If a
host is used as a buffering device, the buffering address field is set
to the IP address of the buffering host instead. When routing multicast
packets with the proper QoS, the buffering router forwards a copy of
each packet to the buffering host. The cache and buffering address
fields are updated continually as the packets pass through the routers,
so that recovery is always performed by the router nearest to the
congested router.
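
As an illustration of the layout in Figure 1, the option could be
encoded and updated as in the following sketch. It assumes that the
length field counts all 12 octets of the option and that the reserved
field is sent as zero; neither is stated in this memo, and the function
names are hypothetical.

```python
import socket
import struct

OPTION_CODE = 10            # "code(10)" in Figure 1
OPTION_FORMAT = "!BBH4s4s"  # code, length, reserved, cache addr, buffering addr
OPTION_LENGTH = struct.calcsize(OPTION_FORMAT)  # 12 octets (assumed meaning of length)


def pack_option(cache_addr, buffering_addr):
    """Encode the option from two dotted-quad IPv4 addresses."""
    return struct.pack(OPTION_FORMAT, OPTION_CODE, OPTION_LENGTH, 0,
                       socket.inet_aton(cache_addr),
                       socket.inet_aton(buffering_addr))


def unpack_option(data):
    """Decode the option and return (cache_addr, buffering_addr)."""
    code, length, _reserved, cache, buffering = struct.unpack(OPTION_FORMAT, data)
    if code != OPTION_CODE or length != OPTION_LENGTH:
        raise ValueError("not a cache/buffering router option")
    return socket.inet_ntoa(cache), socket.inet_ntoa(buffering)


def source_option(source_addr):
    """Source host: cache address 0, buffering address = source."""
    return pack_option("0.0.0.0", source_addr)


def update_cache_address(data, router_addr):
    """Cache router: overwrite only the cache address field."""
    _cache, buffering = unpack_option(data)
    return pack_option(router_addr, buffering)


def update_buffering_address(data, addr):
    """Buffering router or host: overwrite only the buffering address field."""
    cache, _buffering = unpack_option(data)
    return pack_option(cache, addr)
```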

3.2. Converting multicast datagram to unicast datagram

When a buffering router receives a drop message, it searches for the
packet in its buffer. If it finds the packet, it converts the packet
into a unicast packet by saving the multicast address in the option
field and changing the destination address to the address of the
congested router. If it does not find the packet, it forwards the drop
message to the original source host. A router that receives such a
unicast packet should convert it back to a multicast packet and
continue multicast routing. Figure 2 shows the format of the multicast
address option. A packet retransmitted by a cache router is forwarded
in multicast format because the cache router is adjacent to the
congested router.


 0                8              16              24             31
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   code(11)    |    length     |           reserved            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                      Multicast Address                        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 Figure 2. The format of the multicast address option in an IP datagram
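
The conversion in both directions can be sketched as follows. Packets
are modelled as dictionaries with an "options" mapping keyed by option
code; this representation and the function names to_unicast and
to_multicast are assumptions for illustration only.

```python
MULTICAST_ADDRESS_OPTION = 11  # "code(11)" in Figure 2


def to_unicast(packet, congested_router):
    """Buffering router: save the group address in the option and
    address the packet to the congested router."""
    options = dict(packet["options"])
    options[MULTICAST_ADDRESS_OPTION] = packet["dst"]
    return dict(packet, dst=congested_router, options=options)


def to_multicast(packet):
    """Congested router: restore the group address from the option
    and remove the option before resuming multicast forwarding."""
    options = dict(packet["options"])
    group = options.pop(MULTICAST_ADDRESS_OPTION)
    return dict(packet, dst=group, options=options)
```

Both functions return new dictionaries, so the copy held in the buffer
is left unchanged and can serve later drop messages.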


3.3. Drop ICMP Message

This message is sent to the cache router when a packet is dropped
because of congestion. One message is generated for each dropped
packet. On receiving the drop message, the cache router searches for
the requested packet in its cache. If the packet is found, the cache
router sends it to the congested router. If not, the drop message is
forwarded to the buffering router without changing the source IP field,
thus preserving the IP address of the congested router. On receiving
the drop message, the buffering router searches for the requested
packet in its buffer. If the packet is found, the buffering router
converts the multicast packet into a unicast packet and sends it to the
congested router. If not, the drop message is forwarded to the original
source host, again without changing the source IP field. The source
host can either retransmit the requested packet to the congested router
or notify the upper-layer protocol. The congested router specifies the
estimated time to recover from congestion in the drop message, so that
routers can delay retransmission accordingly. Figure 3 shows the format
of the ICMP drop message.


 0                8              16              24             31
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   type        |    code       |           checksum            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                   estimated time to recover                   |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |           internet header + 64 bits of datagram               |
 |                             prefix                            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 3.  ICMP drop message format
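
A congested router could build the message of Figure 3 roughly as shown
below. This memo assigns no ICMP type or code to the drop message, so
DROP_TYPE and DROP_CODE are hypothetical placeholders, and the unit of
the estimated time is left to the caller. The checksum is the usual
16-bit one's complement sum used by ICMP.

```python
import struct

DROP_TYPE = 254  # hypothetical placeholder; no value is assigned in this memo
DROP_CODE = 0    # hypothetical placeholder


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_drop_message(estimated_recovery, dropped_datagram):
    """Quote the IP header (with options) plus 64 bits of the dropped datagram."""
    header_len = (dropped_datagram[0] & 0x0F) * 4  # IHL in 32-bit words
    quoted = dropped_datagram[:header_len + 8]
    body = struct.pack("!I", estimated_recovery) + quoted
    unchecked = struct.pack("!BBH", DROP_TYPE, DROP_CODE, 0) + body
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH", DROP_TYPE, DROP_CODE, checksum) + body
```

Quoting the full header keeps the cache/buffering router option of
Figure 1 inside the drop message, which is what lets each node decide
where to forward it.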


3.4. ICMP register buffering host message

A host uses this message to register itself with a buffering router as
a buffering host. Once a buffering router has received this message, it
forwards all multicast packets with the proper QoS to the buffering
host. Upon receiving a drop message from a congested router, the
buffering host transmits the requested packet to the congested router
in unicast format. Figure 4 shows the format of the message.


 0                8              16              24             31
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   type        |    code       |           checksum            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                   buffering host IP address                   |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 4.  ICMP register buffering host message format
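
The effect of registration on a buffering router's forwarding path
might look like the following sketch. The class BufferingRouter, the
"reliable" flag standing in for the proper QoS marking, and the
"buffering_addr" key standing in for the option field are assumptions
made for this example.

```python
class BufferingRouter:
    """Forwarding-path sketch for a buffering router (hypothetical model)."""

    def __init__(self, address):
        self.address = address
        self.buffering_host = None  # set by a register message
        self.buffer = []            # local copies when no host is registered

    def register(self, host_address):
        """Handle an ICMP register buffering host message."""
        self.buffering_host = host_address

    def forward(self, packet):
        """Return (packet to forward, copy for the buffering host or None)."""
        if not packet.get("reliable"):
            return packet, None     # untagged packets are not buffered
        if self.buffering_host is None:
            self.buffer.append(dict(packet))
            return dict(packet, buffering_addr=self.address), None
        # A host is registered: it stores the copy and is named in the option.
        return (dict(packet, buffering_addr=self.buffering_host),
                (self.buffering_host, dict(packet)))
```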


4. Implementation issues

4.1. Compatibility

The extensions proposed above are upward compatible with the current
version of IP. Cache routers and buffering routers can be used together
or separately. Routers can be replaced incrementally, starting from
congested areas. The new ICMP drop message and datagrams carrying the
new options pass through existing routers unchanged, so the network can
be improved gradually without replacing all routers at once.

4.2. Minimum implementation

The extensions proposed above could require major technology
improvements, such as router designs with large memory. As an
intermediate stage, a minimum implementation plan is possible. Without
any cache or buffering routers, the sending host can recover packet
losses in the IP layer. The IP module of the source host sends
retransmission packets to the congested router as unicast messages. The
congested router converts each unicast packet to a multicast packet and
resumes multicasting from the point where multicasting was suspended.
This enables recovery from congestion entirely within the IP layer,
which not only makes recovery much faster but also keeps the transport
multicast protocol simple and straightforward. The IP layer module can
either keep its own buffer of IP packets or manage the packets for
retransmission using a linked list together with the TCP buffer.

4.3. Cache/Buffer size consideration

Suppose cache routers are used in a gigabit network and routers are
100 kilometers apart. The one-way packet travel time is then about 0.3
millisecond. When congestion occurs, the congested router drops the
received multicast packet and sends a drop message. If we assume that
the time to process a received packet and to generate a drop message is
0.4 millisecond, the cache router must keep a duplicate copy of each
multicast packet for 1 millisecond (0.3 + 0.4 + 0.3). At 1 Gbit/s this
requires a 1 Mbit cache for each channel.

For the buffer size of a buffering router, suppose the round-trip time
between the source and destination hosts is 10 seconds. If we allow the
router 10 seconds to recover from congestion, packets must be stored in
the buffer for a total of 20 seconds. Supposing that 1 percent of the
total 1 gigabit per second traffic is multicast traffic requiring
retransmission, the buffer size will be 25 Mbyte (10 Mbit/s for 20
seconds). If we increase the recovery time to 3 minutes, the buffer
size becomes around 250 Mbyte, and we feel this figure is not difficult
to implement.
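
The sizing estimates in this section can be reproduced with a few lines
of arithmetic. The function names below are hypothetical, and the cache
holding time is taken to be two one-way trips plus the processing time,
which is an interpretation of the 1 millisecond figure above.

```python
def cache_bits(rate_bps, one_way_us, processing_us):
    """Bits a cache router must hold per channel."""
    hold_us = 2 * one_way_us + processing_us  # packet out, drop message back
    return rate_bps * hold_us // 1_000_000


def buffer_bytes(rate_bps, reliable_percent, rtt_s, recovery_s):
    """Bytes a buffering router must hold for the reliable traffic share."""
    reliable_bps = rate_bps * reliable_percent // 100
    return reliable_bps * (rtt_s + recovery_s) // 8


GIGABIT = 1_000_000_000
print(cache_bits(GIGABIT, 300, 400))      # 1000000 bits, i.e. 1 Mbit
print(buffer_bytes(GIGABIT, 1, 10, 10))   # 25000000 bytes, i.e. 25 Mbyte
print(buffer_bytes(GIGABIT, 1, 10, 180))  # 237500000 bytes, roughly 250 Mbyte
```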



Authors:

Man Yeob Lim, Mr                  Dae Young Kim, Prof.
InfoCom Eng. Dept.                InfoCom Eng. Dept.
Chungnam National University      Chungnam National University
Daejeon 305-764                   Daejeon 305-764
Korea                             Korea

Phone: +82 42 821 3544            Phone: +82 42 821 6862
Fax:   +82 42 821 2225            Fax:   +82 42 823 5586
Email: mylim@sunam.kreonet.re.kr  Email: dykim@ccl.chungnam.ac.kr
                                  http://ccl.chungnam.ac.kr/~dykim/


REFERENCES

[1]  S. Deering, Host Extensions for IP Multicasting, RFC 1112, Aug.
     1989.
[2]  S. Kasera, J. Kurose, and D. Towsley, Scalable reliable multicast
     using multiple multicast groups, Proc. ACM Sigmetrics Conference,
     1997.
[3]  J. M. Chang and N. F. Maxemchuk, Reliable broadcast protocols, ACM
     Trans. Computer Systems, 2(3):251-273, August 1984.
[4]  S. Armstrong, A. Freier, and K. Marzullo, Multicast Transport
     Protocol, RFC 1301, Feb. 1992.
[5]  B. Whetten, T. Montgomery, and S. Kaplan, A high performance
     totally ordered multicast protocol, Theory and Practice in
     Distributed Systems, Springer Verlag, LNCS 938.
[6]  C. Papadopoulos, G. Parulkar, and G. Varghese, An error control
     scheme for large-scale multicast applications, Washington
     University, St. Louis.
[7]  M. Yajnik, J. Kurose, and D. Towsley, Packet loss correlation in
     the Mbone multicast network, University of Massachusetts at
     Amherst.
