draft-ietf-rmt-bb-lcc-00

RMT Working Group                               M. Luby/Digital Fountain
INTERNET DRAFT                                         L. Vicisano/Cisco
Expires May          2001                      A. Haken/Digital Fountain
                                                           November 2000


              Reliable Multicast Transport Building Block:
                       Layered Congestion Control
                     <draft-ietf-rmt-bb-lcc-00.txt>


Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026. Internet-Drafts are working docu-
ments of the Internet Engineering Task Force (IETF), its areas, and its
working groups.  Note that other groups may also distribute working doc-
uments as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.


Abstract

This document describes LCC, a scalable layered congestion control
building block for multicast.  LCC is a combination of approaches that
allow multiple receivers to concurrently receive packets from a single
sender at varying rates depending on individual bandwidth connections
and network conditions.  Two basic goals of the approach are to allow
each receiver to obtain the full benefit of the available bandwidth to
the sender and to be fair to other flows in the network.

For all of the approaches described in this memo, a sender sends data
for one or more objects to multiple multicast groups, potentially at
different rates for each group, where an object is any well-defined



Luby, Vicisano, Haken                                   FORMFEED[Page 1]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


content or file.  The set of multicast groups carrying data for an
object or sequence of objects out of a single sender is called a ses-
sion.  LCC assumes that each receiver is free to join and leave at any
time one or multiple groups carring the data for the session, as
required by the congestion control algorithm. This implies that, in gen-
eral, only a subset of the packets sent are actually received.  Applica-
tions that use LCC must either be tolerant to this, or they must be able
to compensate through some erasure-recovery mechanism (E.g. FEC tech-
niques [FECBBSTAN] are an afficient way to achieve erasure-recovery if
LCC is used in a reliable file transfer protocol).  The number of groups
and which groups in the session each receiver joins is dictated by the
local bandwidth availability and network conditions experienced by the
receiver.   In particular, receivers reduce their reception rate as soon
as they feel congestion as evidenced by measured packet loss.

LCC is receiver driven, i.e., each receiver adjusts its reception rate
in response to measured packet loss at the receiver in a manner reminis-
cent of TCP congestion control.  Thus, each receiver experiences a
reception rate appropriate to that receiver independent of other
receivers.

LCC has the following properties:


o To each receiver, it appears as if there is a dedicated unicast ses-
  sion from the sender to the receiver, where the reception rate adjusts
  to congestion along the path from sender to receiver similar to TCP.

o To the sender, there is no difference in load or outgoing rate if one
  receiver is joined to the session or a million (or any number of)
  receivers are joined to the session, independent of when the receivers
  join and leave.

o For each link in the network, the packet traffic from the session and
  its reaction to competing traffic is the same whether there is one
  receiver or a million receivers beyond the link, and this reaction is
  similar to how TCP reacts.

o Receivers adjust their reception rate by joining and leaving groups,
  i.e., with no feedback to the sender.

  Thus, LCC provides a massively scalable layered congestion control
  approach that is network friendly.

1. Introduction

This document describes a scalable layered congestion control (LCC)
building block that can be used by applications built on top of IP



Luby, Vicisano, Haken                                   FORMFEED[Page 2]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


multicast [DEE88].  For example, LCC can be used as the congestion con-
trol scheme for LCT [LCT00]. Many congestion control schemes have been
built on top of multicast. However, scalability is not a design goal for
many of these congestion control schemes, in the sense that they require
all receivers to be receiving at the same rate, and in general this rate
is dictated by the available bandwidth along the most constrained path
from the sender to a receiver.  Thus, the reception rate of receivers is
restricted to that of the worst case receiver.  In contrast, a general
design goal of LCC is that each receiver can be receiving at the rate
that is dictated by the available bandwidth along the path from the
sender to that receiver.  Thus, the reception rate of receivers is inde-
pendent of that of other receivers.

One of the key difficulties in scaling sender-based multicast congestion
control schemes is dealing with the amount of data that flows from
receivers back to the sender to adjust the sender rate. LCC avoids any
such feedback, and thus is massively scalable.

An attractive feature of a scalable congestion control scheme is the
ability for different receivers to join and leave the session asyn-
chronously without adversely affecting the reception experience of other
receivers and without affecting the scalability of the scheme.  This is
one of the features provided by LCC.

To transmit data about an object or sequence of objects using LCC, a
sender sends the data concerning the object(s) to one or more multicast
groups.  The rate at which data is sent to different multicast groups
may vary.  The set of groups pertaining to an object or set of objects
emanating from a single server over which congestion control is per-
formed is called a session.

The original ideas for LCC are from [MCC96], [VIC98A], [VIC98B],
[BYE00].  In all of these schemes, the sender places congestion control
information into the header of each packet sent to the session.  The set
of groups a receiver joins is determined by the receiver based on sig-
nals placed into packets by the sender and by loss measured along the
path from the sender to that receiver.  Receivers that can receive pack-
ets at a rate higher than their current rate are allowed to periodically
increase their reception rate, and receivers that are receiving packets
at a higher rate than they have the capacity for (as evidenced by packet
loss) MUST reduce their rate.

A primary goal for LCC session is to be fair to other LCC sessions as
well as to sessions using other congestion control schemes within other
protocols such as TCP.  In particular, if several sessions are flowing
through a bottleneck link, then it is desirable for the sessions to
share the bandwidth capacity of the link fairly, and it is also desir-
able that the link not be overly congested.



Luby, Vicisano, Haken                                   FORMFEED[Page 3]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


This document describes two congestion control schemes, a static layer
scheme called FLID-SL and a dynamic layer scheme called FLID-DL.  FLID-
SL is applicable to networks where joins to multicast groups and leaves
from multicast groups are processed quickly, e.g., on the same time
scale as the round trip time from the receiver to the sender.

2. Environmental Requirements

FLID-DL is applicable to networks where joins to multicast groups are
process quickly but leaves from multicast groups may take several sec-
onds.  One of the current problems with many implementations of IP mul-
ticast routing protocols is that the IGMP protocol used between receiver
and access routers to join and leave groups has the limitation that IGMP
leave messages can take a significant amount of time to take effect,
leading to a large amount of data flowing to a receiver on a given group
long after the receiver has issued an IGMP leave message for that group
in reaction to packet loss.  This is a problem because the network
remains congested for the period of time it takes to process the IGMP
leave request, thus making it difficult to rely on IGMP leave requests
to provide a network friendly congestion control scheme.  The dynamic
layer scheme uses groups where the transmission rates on the groups
change over time in a way that makes it possible for receivers to
quickly reduce their reception rate by taking no action.  Instead,
receivers that want to maintain their current reception rate must peri-
odically issue IGMP join requests, and receivers that want to increase
their reception rate must issue an additional IGMP join request.

One of the attractions of both congestion control schemes is that they
are multicast routing independent and that they do not require multicast
reverse connectivity, i.e. LCC receivers do not send multicast traffic
or any other traffic for purposes of congestion control. In particular,
LCC works with the original multicast model introduced in [DEE88], which
we call Internet Standard Multicast (ISM) in this document, and with the
Source Specific Multicast (SSM) model that is based on [HOL99].  The
definition of a group that is used throughout this document is slightly
different with ISM and with SSM.  When using ISM, packets of a group are
sent to a multicast group address G.  When using SSM, packets of a group
are sent to a channel address (S,G), where S is the IP address of the
sender and G is a multicast group address.

SSM is more attractive to LCC than ISM for a few reasons.  First, a ses-
sion is made up of multiple groups, and LCC can be used to deliver a
large number of objects over time using the same set of groups assigned
to the session.  With ISM, the multicast group address G that corre-
sponds to a group must be allocated so that it is unique across the
Internet.  With SSM, the multicast group address G can be allocated
locally by the sender with the only requirement that it is unique to the
sender, because it is the (S,G) channel that corresponds to the group



Luby, Vicisano, Haken                                   FORMFEED[Page 4]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


that a receiver joins.

Second, LCC supports an unlimited number of receivers that are dynami-
cally joining and leaving groups within a session.  Changes in the mul-
ticast tree topology with SSM are light weight operations (a new branch
from the receiver towards S grows when a receiver joins, and the branch
is deleted when the receiver leaves), and with ISM changes can be heav-
ier weight (involving transitions from a (*,G)-tree rooted at an RP to
the tree rooted at S).

Third, LCC is scalable to an unlimited number of receivers that may span
the global Internet. Thus, the light weight mechanisms that SSM uses to
cross ISP boundaries (standard BGP+ routing tables) is a distinct advan-
tage over the heavier weight mechanisms used by ISM (the MSDP and BGMP
protocols, both of which are not needed by SSM).

Finally, a receiver joins a group by joining a channel (S,G) with SSM,
and thus the receiver will only receive packets sent from the sender S.
With ISM, the receiver joins a group by joining a multicast group G, and
all packets sent to G, regardless of their origin sender, will be
received by the receiver.  With ISM, if another sender application is
inadvertently sending packets to G at a high rate, these packets will be
sent over portions of the network and delivered to the receiver even
though they were not requested by the receiver, potentially resulting in
a large amount of unintended network congestion.  Thus, SSM has com-
pelling advantages over ISM for prevention of unintended and wasteful
network congestion.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [R2119].

3. General Architecture

The Layered Congestion Control (LCC) schemes described in this document
applies to a session emanating from a single sender.  A session consists
of data sent to one or more groups for some period of time that is
determined by the application.  For example, within LCT [LCT00], a ses-
sion is defined as the data sent from a single sender about one object
or a sequence of objects over which congestion control is to be per-
formed.

A sender sends data to one or more groups within a session.  The trans-
mission rate of data to different groups may vary among groups, and the
transmission rate to a particular group may vary over time.  Congestion
control is performed globally over all data sent to the session.  Typi-
cally, the sender continues to send data to all groups in a session
until the session is terminated.  The session may be terminated when



Luby, Vicisano, Haken                                   FORMFEED[Page 5]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


some amount of time has expired, a certain amount of data has been sent,
or some out-of-band signal (from a higher level protocol, perhaps) has
indicated completion by a sufficient number of receivers.

It is possible that a receiver may concurrently join multiple sessions
to receive data from multiple senders.  For example, three different
sessions could be transmitted from three different senders, each session
consisting of four groups to which data is being sent at different
rates, and a receiver may join and receive packets from all 12 groups
concurrently.  However, since the senders may be located at different
points in the network that experience varying network conditions, a
receiver MUST perform congestion control independently on each session
it is receiving.

All packets sent to a session should be roughly of the same size in
order for the congestion control scheme to work effectively.  Larger
packet sizes are generally desirable so that the fraction of bandwidth
lost to packet headers is reduced.  On the other hand, if the packet
size is larger than the network's maximum transmission unit (MTU), pack-
ets would be fragmented, and the loss of any fragment would require the
entire packet to be lost.  Therefore a packet size close to, but not
exceeding, the MTU is best, as this reduces packet header overhead and
packet handling overhead in routers.

A receiver must first obtain a transmission session description before
joining a session.  This includes the information about the groups asso-
ciated with a session that is needed in order to join those groups.
Once a receiver has obtained this information the receiver may join one
of the groups in the session.  In order to be in compliance with LCC,
receivers MUST join and leave groups within a session as described in
detail in this document in order to vary their reception rate in the
face of varying bandwidth capacity between the receivers and the sender.

The transmission session description is determined and agreed upon by
the senders and communicated to the receivers out-of-band, or, in some
cases, included or partially included in the header of each packet. The
session description pertinent to LCC could include the packet format and
length, the sender address and the multicast group address(es) used in
the session.  The session description could be in a form such as SDP
[HAN98]. There must be an out-of-band mechanism for receivers to obtain
the session description. The session description might be carried in a
session announcement protocol such as SAP [HAN96], located on a Web page
with scheduling information, or conveyed via E-mail or other out-of-band
methods. Discussion of session description format, and distribution of
session descriptions is beyond the scope of this document.

4. Overview of congestion control schemes




Luby, Vicisano, Haken                                   FORMFEED[Page 6]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


LCC performs congestion control by dedicating multiple groups to a ses-
sion.  Receivers joined to the session are subject to heterogeneous
reception rates, obtained by having the receiver selectively join a sub-
set of all the groups available from the sender.

The original ideas for both of the LCC congestion control schemes are
from a combination of [MCC96, VIC98A, VIC98B, BYE00].

When a sender is instructed to start a session, the range of possible
reception rates is specified by rmin and rmax, where rmin is the minimum
available reception rate and rmax is the maximum available reception
rate.  It is recommended that rmin be small enough so that any receiver
can receive at rate rmin without incurring significant packet loss.

If rmin = rmax is specified then there is only one reception rate speci-
fied for the session.  If rmax > rmin is specified, then multiple recep-
tion rates should be made available in the session. Let X = 1.3.  It is
recommended that the number of available reception rates be set to A+1,
where A is the largest value such that rmin * X^A <= rmax, and that for
all i=0,...,A the reception rate R(i) be set to rmin * X^i.

Let A+1 be the number of different reception rates available for a ses-
sion, and let R(0) < ... < R(A) be the set of available reception rates.
As an example, set R(0) = 24 Kbps, set X = 1.3, set A=30, and for all
i=1,...,30 set R(i) = R(0) * X^i.  Then, R(30) = 62.9 Mbps, and thus the
ratio of the maximum achievable rate to the minimum rate is a factor of
2,620 in this example. When a receiver is receiving packets at reception
rate R(i), we say the receiver is at layer i.  Let r(0) = R(0), and for
all i = 1,..., A, let r(i) = R(i) - R(i-1) be the incremental rate
needed to increase the rate from layer i to layer i+1.  Suppose the cur-
rent reception rate of a receiver is R(i), i.e., the receiver is at
layer i.  If the receiver is to increase its reception rate to layer
i+1, it does so increasing its reception rate by r(i+1) to R(i+1).  If
the receiver is to decrease its reception rate to layer i-1, it does so
decreasing its reception rate by r(i) to R(i-1).  If the receiver is to
maintain it reception rate at layer i, it does not change its reception
rate from R(i).

5. Fair Layered Increase/Decrease + Static layer scheme (FLID-SL)

This section provides a detailed operational description of how to apply
the LCC design principles to congestion control when IGMP leave latency
is minimal.  The next section described extends these principles to
describe a second way to achieve congestion control that overcomes long
IGMP leave latencies.

5.1. Sender operation




Luby, Vicisano, Haken                                   FORMFEED[Page 7]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


The sender uses A+1 groups numbered 0,..., A.  For all i = 0,...,A, the
ith group carries packets at rate r(i) at all times.  Group 0 is
referred to as the base group.

The sender partitions time into equal duration intervals called time
slots.  The time slot duration TSD determines the reaction time of
receivers to changing network congestion conditions.  It is recommended
that the time slot duration TSD be set to one of either 0.5, 1.0, or 2.0
seconds.  Associated with each time slot is the time slot index.  The
range of values for the time slot index is [0..G-1] for some value G.
The time slot index increments by one modulo G between each consecutive
time slot.  For example, if G = 32 then the time slot index is 0, 1, 2,
3, ... , 30, 31, 0, 1, ... in consecutive time slots.  It is recommended
that G be set to 128.  G must be set to at least 3.

In order to be able to measure loss within each group, the sender places
consecutive sequence numbers in the packets sent to a group. The
sequence numbers ignore time slot boundaries, i.e., the sequence numbers
within the same group across the time slot boundary are still consecu-
tive. Sequence numbers wrap around at 2^16, i.e., the consecutive
sequence numbers are 0, 1, 2, ..., 2^16-2, 2^16-1, 0, 1, 2, ... .

The sender places into each packet the index of the group within the set
of groups used for that session.

The sender places into each packet the time slot index.  Thus, all pack-
ets within the same time slot must have the same time slot index.

The sender places an increase signal trigger into each packet.  The
increase signal trigger is set to either 0 or 1 for each packet.  The
increase signal trigger must be the same for all packets sent to a group
within the same time slot.  An increase signal trigger of 0 indicates no
increase allowed to receivers, and an increase signal trigger of 1 indi-
cates increase is allowed to receivers.

The increase signal triggers are calculated as follows by the sender.
For all i = 0,...,A-1, let p(i) = min {1.0, 20*packet size*TSD/R(i)},
and set p(A) = 0.  Note that 1 >= p(0) >= p(1) >= ... >= p(A-1) >= p(A)
= 0.  Let B be an integer associated with each time slot that increases
by one for each consecutive time slot, and thus t = (B mod G) is the
time slot index.  For each time slot, for each i = 0,...,A, for all
packets sent to the group i that carries rate r(i) during the Bth time
slot, the increase signal trigger is set to 1 if BB <= p(i), and the
trigger is set to 0 otherwise.  Here, BB is derived from B by writing B
in reverse binary notation and considering it as a fraction that is
between 0 and 1.  For example, if B = 253 when written base ten, then
when written in binary B = 11111101, and then BB = 0.10111111 when writ-
ten as a binary fraction.  As a decimal fraction, BB = 0.7421875.  This



Luby, Vicisano, Haken                                   FORMFEED[Page 8]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


method of computing BB from B, where B increases by one at each time
slot, guarantees that increase signal triggers equal to 1 for a given
layer i are very well-spaced out over the time slots, and that on aver-
age the fraction of time slots with increase signal 1 for layer i is
p(i).  Note that increase signal trigger for the group carrying rate
r(A) during a time slot is always set to 0, since p(A) = 0.  This
ensures that there is never an attempt by a receiver to increase the
reception rate above R(A).

This method of setting the increase signal triggers implies the follow-
ing monotonicity property: if the trigger is set to 1 for layer i during
some time slot, then the trigger is also set to 1 for lower layers i' <
i in the same time slot.  This allows receivers at lower layers behind a
bottleneck link to increase to a higher layer when receivers at higher
layers increase to a higher layer.  During other time slot periods,
receivers at lower layers will be allowed to increase to a higher layer
when receivers at higher layers aren't allowed to increase to a higher
layer, thus giving receivers at lower layers a chance to catch up.

5.2. Receiver operation

When a receiver first joins a session, it must only join the base group
and remain joined only to the base group for at least one complete time
slot. The rate r(0) = R(0) of the base group must be small enough that
when a receiver is joined to just this base group there is no signifi-
cant packet loss.  If there is significant packet loss over any signifi-
cant period of time when the receiver is only joined to the base group
then the receiver must leave the session by leaving the base group.  If
there is only one reception rate offered by the session, i.e., there is
only a base group offered by the session and no other groups, then only
this first paragraph is relevant to the receiver congestion control
scheme.  If there are two or more reception rates offered by the ses-
sion, i.e. there is a base group and (at least three) other groups, then
the rest of this subsection is relevant to the receiver congestion con-
trol scheme.

The receiver must keep a timer that tracks the maximum interarrival time
between packets.  Whenever there is an interarrival time that exceeds
TSD, the receiver must leave the session by leaving all groups in the
session immediately.  The receiver may thereafter try to rejoin the ses-
sion.

During a generic time slot t a receiver is joined to some number i, 0 <=
i <= A, of the groups 0, ..., i that have rates r(0), r(1),
r(2),...,r(i), respectively, within time slot t.  Thus, during time slot
t, the receiver is at layer i and has a reception rate R(i). The
receiver does not join or leave any groups in the middle of time slot t.
To simplify the following discussion, let t+1 indicate t+1 mod G. The



Luby, Vicisano, Haken                                   FORMFEED[Page 9]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


receiver only makes changes in group membership at the beginning (indi-
cated by the first packet received) of the next time slot t+1.

If there is at least one packet loss measured in time slot t in any
group then the receiver must leave group i at the beginning of the next
time slot t+1. This will drop the reception rate for the receiver from
layer i to layer i-1, i.e., the reception rate will drop from R(i) to
R(i-1).

If there is no measured packet loss in time slot t then the action of
the receiver depends on the increase signal trigger in group i in time
slot t.  If the increase signal trigger is 0 for group i in time slot t,
indicating the receiver must not increase above layer i, the receiver
does not join or leave any groups at the beginning of time slot t+1. If
the increase signal trigger is 1 for group i in time slot t (in which
case i < A, since the increase signal trigger is always 0 for group A),
indicating the receiver can increase to layer i+1, then the receiver
joins group i+1 at the beginning of time slot t+1.

5.3. General considerations

Generally, the multicast group addresses associated with the groups con-
stitute a consecutive range of multicast address space.  For example,
the 21 groups [0..20] may be bound to the SSM channel addresses
(192.35.134.26, 232.153.220.0) through (192.35.134.26, 232.153.220.20).
However, it is not a requirement that these multicast group addresses be
consecutive.  There can be at most 256 groups associated with an LCC
session using the static layer scheme, because there are 8 bits avail-
able for specifying groups in the abstract LCC packet header.

The number of groups associated with a session, and the addresses of the
multicast groups or channels bound to these groups, must be part of the
session description information communicated out-of-band.

5.4. Fairness

A crucial variable in determining the fair share of multiple TCP ses-
sions flowing through a bottleneck link is the round trip time (RTT).
In general, the smaller the RTT for TCP the more aggressive the session
will be against other sessions, including other TCP sessions with larger
RTTs.  With FLID-SL it is desirable that receivers behind a common bot-
tleneck link are joined to the same set of groups in the session at each
point in time independent of their distance from the sender.  Thus, the
role that RTT plays in TCP with respect to fairness does not make sense
for FLID-SL.  In this document, the FLID-SL parameters are set so that a
session shares approximately equally with other FLID-SL sessions, and a
FLID-SL session shares approximately equally with a TCP session with a
RTT of 200 ms.  This is assuming all sessions use the same packet



Luby, Vicisano, Haken                                  FORMFEED[Page 10]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


length.

FLID-SL will not be fair to other sessions if the IGMP leave latency is
high.  Thus, FLID-SL should not be used with multicast routing protocols
that do not support fast IGMP leaves.

6. Fair Layered Increase/Decrease + Dynamic layer scheme (FLID-DL)

Several ideas are used to circumvent the problems associated with long
leave latency in the design of FLID-DL.  As with FLID-SL, let A+1 be the
number of different reception rates available for a session, and let
R(0) < ... < R(A) be the set of available reception rates.  One idea is
to allocate G+1 groups to the session, numbered from 0 to G, where G >
A.  Group 0 is called the base group, and this group always carries
packets at rate r(0) = R(0).  Groups 1 thru G are called dynamic groups,
because over time they carry packets at different rates.  At each point
t in time, for all i = 1, ..., A, there is exactly one dynamic group
that carries the rate r(i), and these groups are called active groups at
time t.  The remaining Q = G-A dynamic groups carry zero rate at time t,
and these groups are called quiescent groups at time t.  Which dynamic
groups are active and what rate they carry and which groups are quies-
cent varies over time in such a way that the problem of long leave
latency is overcome.

6.1. Sender operation

As for FLID-SL, the sender partitions time into equal duration intervals
called time slots.  The time slot duration TSD determines the reaction
time of receivers to changing network congestion conditions.  It is rec-
ommended that the time slot duration TSD be set to one of either 0.5,
1.0, or 2.0 seconds.  Associated with each time slot is the time slot
index.  The range of values for the time slot index is [0..G-1], i.e.,
the number of possible time slot indices is equal to the number of
dynamic groups. The time slot index increments by one modulo G between
each consecutive time slot.  For example, if G = 32 then the time slot
index is 0, 1, 2, 3, ... , 30, 31, 0, 1, ... in consecutive time slots.

Given the definition of a time slot and time slot index, we can now
define how the rates on the dynamic groups vary over time.  For i =
A+1,...,G, define r(i) = 0.  The idea is that dynamic group j in time
slot t carries packets flowing at rate r(((j-t-1) mod G)+1).  Thus, in
the time slots with indices 0, 1, 2, ..., G-A-1, G-A, G-A+1, ..., G-1,
dynamic group G carries packets at rate r(G) = 0, r(G-1) = 0, r(G-2) =
0,...,r(A+1) = 0, r(A), r(A-1),...r(1), respectively.  Thus, dynamic
group G is quiescent for Q = G-A time slots, then carries rate r(A),
r(A-1), r(A-2), ..., r(1) over the subsequent A time slots.  This same
pattern is then cyclically repeated thereafter.  Each of the other G-1
dynamic groups goes through the same cycle, where the cycle for dynamic



Luby, Vicisano, Haken                                  FORMFEED[Page 11]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


group G-1 is shifted one time slot forward from dynamic group G, and in
general the cycle for dynamic group j is shifted one time slot forward
from dynamic group j+1.  Thus, during each time slot t, for each i =
1,...,G, dynamic group ((i+t-1) mod G) + 1 carries packets at rate r(i).
Thus, during each time slot A groups are active and Q are quiescent.

The reason for this organization of the dynamic groups is that it allows
receivers to quickly decrease their reception rate within one time slot
without requiring small leave latencies.  For 1 <= i <= A, suppose a
receiver is joined to dynamic group j = ((i+t-1) mod G)+1 at time t in
order to receive at rate r(i).  Then, over the i-1 subsequent time slots
t+1, t+2,..., t+i-1 the reception rate of the receiver from group j is
r(i-1), r(i-2),...,r(1).  Then, in the next time slot t+i, the rate of
group j drops to zero and remains there for a total of Q time slots.
The idea is that once a receiver is joined to a dynamic group, the
receiver remains joined to the group independent of all other factors
such as packet loss until the group becomes quiescent, and at this point
in time the receiver immediately leaves the group.  Let LL be the upper
bound on the leave latency, i.e., LL is an upper bound on the time
between when a receiver issues a leave to a group and the time when the
leave takes effect.  Then, in order for the leave to take effect before
the group becomes active again in Q time slots, it must be the case that
LL <= (Q-1) * TSD.  For example, if LL = 10 seconds and TSD is set to 1
second, then Q must be at least 11.   Once a receiver joins a dynamic
group it remains joined to the dynamic group until it becomes quiescent,
at which time the group is left and this takes effect before the group
becomes active again.  As we describe below in the receiver congestion
control scheme, with this property long leave latencies are no longer a
problem, as the reaction time to network congestion is at most one time
slot.

Suppose there is only one reception rate available, i.e., rmin = rmax.
Then the base group is used to transmit at rate rmin and no dynamic
groups are used in the session, and A = 0, Q = 0, and thus G = 0.  When
only the base group is used, the group number in each packet is set to
zero, the time slot index in each packet is set to zero and the increase
signal trigger in each packet is set to zero.  The sequence numbers are
used as normal within the base group, as described below.

The rest of this subsection concerns the case when rmax > rmin, i.e.,
when multiple reception rates are available in the session.  Since there
are at least two different reception rates, A >= 1.  Let LL be the maxi-
mum leave latency.  Based on TSD and LL, Q must be at least LL/TSD + 1.
For example, if LL is 9.3 seconds and TSD is 1.0 second then the minimum
possible value for Q is 11.  Based on this, Q >= 2, and thus the number
G = Q+A of dynamic groups must be at least 3.

In order to be able to measure loss within each group, the sender places



Luby, Vicisano, Haken                                  FORMFEED[Page 12]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


consecutive sequence numbers in the packets sent to a group. The
sequence numbers ignore time slot boundaries, e.g., even though the rate
of packets sent to a dynamic group changes when the time slot changes,
the sequence numbers within the group across the time slot boundary are
still consecutive. Sequence numbers wrap around at 2^16, i.e., the con-
secutive sequence numbers are 0, 1, 2, ..., 2^16-2, 2^16-1, 0, 1, 2, ...
.

The sender places into each packet the index of the group within the set
of groups used for that session.

The sender places into each packet the time slot index.  Thus, all pack-
ets within the same time slot must have the same time slot index.

The sender places an increase signal trigger into each packet.  The
increase signal trigger is set to either 0 or 1 for each packet.  The
increase signal trigger must be the same for all packets sent to a
dynamic group within the same time slot.  An increase signal trigger of
0 indicates no increase allowed to receivers, and an increase signal
trigger of 1 indicates increase is allowed to receivers, in a manner
described  in detail in the description of the receiver congestion con-
trol scheme.

The increase signal triggers that indicate to receivers when to increase
from a given layer to the next layer are calculated as follows by the
sender.  For all i = 0,...,A-1, let p(i) = min {1.0, 20*packet
size*TSD/R(i)}, and set p(A) = 0.  Note that 1 >= p(0) >= p(1) >= ... >=
p(A-1) >= p(A) = 0.  Let B be an integer associated with each time slot
that increases by one for each consecutive time slot, and thus t = (B
mod G) is the time slot index.  For the  time slot, for each i =
0,...,A, for all packets sent to group j = ((i+B-1) mod G)+1 that car-
ries rate r(i) during the Bth time slot, the increase signal trigger is
set to 1 if BB <= p(i), and the trigger is set to 0 otherwise.  Here, BB
is derived from B by writing B in reverse binary notation and consider-
ing it as a fraction that is between 0 and 1.  For example, if B = 253
when written base ten, then when written in binary B = 11111101, and
then BB = 0.10111111 when written as a binary fraction.  As a decimal
fraction, BB = 0.7421875.  This method of computing BB from B, where B
increases by one at each time slot, guarantees that increase signal
triggers equal to 1 for a given layer i are very well-spaced out over
the time slots, and that on average the fraction of time slots with
increase signal 1 for layer i is p(i).  Note that increase signal trig-
ger for the dynamic group carrying rate r(A) during a time slot is
always set to 0, since p(A) = 0.  This ensures that there is never an
attempt by a receiver to increase the reception rate above R(A).

This method of setting the increase signal triggers implies the follow-
ing monotonicity property: if the trigger is set to 1 for layer i during



Luby, Vicisano, Haken                                  FORMFEED[Page 13]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


some time slot, then the trigger is also set to 1 for lower layers i' <
i in the same time slot.  This allows receivers at lower layers behind a
bottleneck link to increase to a higher layer when receivers at higher
layers increase to a higher layer.  During other time slot periods,
receivers at lower layers will be allowed to increase to a higher layer
when receivers at higher layers aren't allowed to increase to a higher
layer,  thus giving receivers at lower layers a chance to catch up.

6.2. Receiver operation

When a receiver first joins a session, it must only join the base group
and remain joined only to the base group for at least one complete time
slot. The rate r(0) = R(0) of the base group must be small enough that
when a receiver is joined to just this base group there is no signifi-
cant packet loss.  If there is significant packet loss over any signifi-
cant period of time when the receiver is only joined to the base group
then the receiver must leave the session by leaving the base group.  If
there is only one reception rate offered by the session, i.e., there is
only a base group offered by the session and no dynamic groups, then
only this first paragraph is relevant to the receiver congestion control
scheme.  If there are two or more reception rates offered by the ses-
sion, i.e. there is a base layer and (at least three) dynamic layers,
then the rest of this subsection is relevant to the receiver congestion
control scheme.

The receiver must keep a timer that tracks the maximum interarrival time
between packets.  Whenever there is an interarrival time that exceeds
TSD, the receiver must leave the session by leaving all groups in the
session immediately.  The receiver may thereafter try to rejoin the ses-
sion.

During a generic time slot t a receiver is joined to the base group at
rate r(0) and  some number i, 0 <= i <= A, of dynamic groups that have
rates r(1), r(2),...,r(i), respectively, within time slot t.  Thus, dur-
ing time slot t, the receiver is at layer i and has a reception rate
R(i). The receiver does not join or leave any groups in the middle of
time slot t.  The receiver only makes changes in group membership at the
beginning (indicated by the first packet received) of the next time slot
after t.

If the receiver is joined to one or more dynamic groups in addition to
the base group (i >= 1) in time slot t then at the beginning of the next
time slot after t the receiver must leave the dynamic group that was
carrying packets at rate r(1) during time slot t, i.e. the receiver must
leave dynamic group j = (t mod G) + 1.  Dynamic group j will be quies-
cent for Q time slots after the end of time slot t, and thus by the time
group j becomes active again, the leave request will have been pro-
cessed.



Luby, Vicisano, Haken                                  FORMFEED[Page 14]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


If there is at least one packet loss measured in time slot t then the
receiver must not join any groups at the beginning of the next time slot
after t.  If the receiver is joined to at least one dynamic group in
addition to the base group (i >= 1) then the effect of this inaction
will be to drop from layer i to layer i-1, i.e., the reception rate will
drop from R(i) to R(i-1).  This is because, for all 2 <= i' <=i, the
dynamic group that carries rate r(i') in time slot t will carry rate
r(i'-1) in the time slot after t, and the dynamic layer that carries the
rate r(1) in time slot t drops to zero and remains at zero for Q time
slots at the end of time slot t.  The net effect is that the reception
rate drops by r(i) at the beginning of the time slot after t, and thus
the reception rate drops from R(i) to R(i-1).

If there is no measured packet loss in time slot t then how many groups
the receiver joins at the beginning of the time slot after t depends on
the increase signal trigger for layer i in time slot t.  The increase
signal trigger for layer i in time slot t is carried in packets in group
((i+t-1) mod G)+1, since this is the group that is carrying packets at
rate r(i) in time slot t.  If the increase signal trigger is 0 for layer
i in time slot t, indicating the receiver must not increase above layer
i, the receiver joins dynamic group j = ((i+t) mod G)+1 at the beginning
of the time slot after t.  This is because j is the group carrying pack-
ets at rate r(i) during the time slot after t and this will maintain the
receiver reception rate at R(i) = R(i-1) + r(i).  If the increase signal
trigger is 1 for layer i in time slot t (in which case i < A, since the
increase signal trigger is always 0 for the group carrying rate r(A) in
time slot t), indicating the receiver can increase to layer i+1, then
the receiver joins dynamic groups j = (i+t) mod G + 1 and j' = (i+1+t)
mod G + 1.  This is because j is the group carrying packets at rate r(i)
during the time slot after t, and j' is the group carrying packets at
rate r(i+1) during the time slot after t, and this will increase the
receiver reception rate to R(i+1) = R(i-1) + r(i) + r(i+1).

Thus, at the beginning of each time slot, the receiver does at most one
leave and at most two joins.

6.3. General considerations

Generally, the multicast group addresses associated with the groups con-
stitute a consecutive range of multicast address space.  For example,
the 21 groups [0..20] may be bound to the SSM channel addresses
(192.35.134.26, 232.153.220.0) through (192.35.134.26, 232.153.220.20).
However, it is not a requirement that these multicast group addresses be
consecutive.  Besides the LCC base group 0, there can be at most 128
dynamic groups associated with an LCC session using a dynamic layer
scheme, or 129 groups in total including the base group. This is because
there are 7 bits allocated to time slot indices in the abstract LCC
packet header, and thus there are at most 128 time slot indices



Luby, Vicisano, Haken                                  FORMFEED[Page 15]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


possible, and there is a one-to-one correspondence between time slot
indices and dynamic groups.

The number of groups associated with a session, and the addresses of the
multicast groups or channels bound to these groups, must be part of the
session description information communicated out-of-band.

6.4. Fairness

A primary goal for LCC session is to be fair to other LCC sessions as
well as to sessions using other congestion control schemes within other
protocols such as TCP.  In particular, if several sessions are flowing
through a bottleneck link, then it is desirable for the sessions to
share the bandwidth capacity of the link fairly, and it is also desir-
able that the link not be overly congested.  A crucial variable in
determining the fair share of multiple TCP sessions flowing through a
bottleneck link is the round trip time (RTT).  In general, the smaller
the RTT for TCP the more aggressive the session will be against other
sessions, including other TCP sessions with larger RTTs.  With FLID-DL
it is desirable that receivers behind a common bottleneck link are
joined to the same set of groups in the session at each point in time
independent of their distance from the sender.  Thus, the role that RTT
plays in TCP with respect to fairness does not make sense for FLID-DL.
In this document, the FLID-DL parameters are set so that a session
shares approximately equally with other FLID-DL sessions, and a FLID-DL
session shares approximately equally with a TCP session with a RTT of
200 ms.  This is assuming all sessions use the same packet length.

7. Abstract LCC packet header

LCC defines the congestion control information that must be carried in
each packet header.  This information is of the same format for both the
static layer scheme and for the dynamic layer scheme.  Other information
may be required in the packet header to support the scheme that is using
LCC to implement congestion control, but this is outside the scope of
this document.  Thus, although we refer to packets as LCC packets, other
protocols that embed LCC header information into packets may consider
the packets to be of the type of the overall protocol.  For example, the
LCT protocol instantiation [LCT00] may use LCC for congestion control,
and the packets that LCT uses are considered to be LCT packets, even
though these packets contain LCC header information.

The LCC packets that contain the required congestion control information
are sent by the sender(s) to a multicast IP destination address.  In the
LCC header, all integer fields are carried in "big-endian" or "network
order", that is, most significant byte (octet) first.  Unless otherwise
noted, numeric constants are in decimal (base 10).




Luby, Vicisano, Haken                                  FORMFEED[Page 16]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


A LCC packet header contains the following congestion control informa-
tion that is placed into each packet by a sender:

o Increase signal trigger (T): 1 bit

o Time slot index (TSI): 7 bits

o Group number (GN): 8 bits

o Sequence number (SEQNO): 16 bits


The required congestion control information required in a LCC packet is
depicted in Figure 1 below.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T|    TSI      |      GN       |             SEQNO             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1 - LCC packet header layout

8. Applications

LCC is a good choice for a congestion control scheme to use with LCT
[LCT00].  With LCT, FEC codes [FECBBINFO00, FECBBSTAN00] are used to
provide reliability for content download applications.  When using LCC,
LCT sends coded data about one or more objects to each group that is
part of a session.  With appropriate use of FEC codes, the data sent to
the different groups in the session, or in some cases even multiple ses-
sions from different senders, can be effectively used by receivers to
recover the original object(s).  With the LCT approach to reliable mul-
ticast, the reliability scheme based on FEC codes can be made to be com-
pletely separate and independent of the congestion control scheme based
on LCC.  Furthermore, both the FEC codes described in [FECBBINFO00,
FECBBSTAN00] and LCC can be used so that in the overall LCT protocol
receivers do not send packets to the sender except perhaps to request
extension of an ongoing session or to confirm complete receipt of an
object.  Thus, using LCC for congestion control and FEC codes for relia-
bility within LCT yields a massively scalable network friendly content
distribution protocol.

LCC is potentially also applicable to other applications.  For example,
it is possible that LCC could be used as the congestion control scheme
for a layered media streaming application with LCT [LCT00].

9. LCC and Generic Router Assist



Luby, Vicisano, Haken                                  FORMFEED[Page 17]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


Router filtering of packets to assist in congestion control is described
in [LUB99], [CAI99].  The addresses of the multicast groups can be com-
municated to the routers and they can do filtering of groups based on
congestion.  This is one of the reasons why it is good to have the con-
gestion control information portion of the packet header in a fixed
place at the beginning of the packet, so that the routers can probe this
field if necessary (although it may not be).  A full exploration of this
approach is outside the scope of this document.

10. Intellectual Property Issues


Digital Fountain has patents pending for congestion control schemes that
may be needed to use some of the congestion control schemes described in
this document in a commercial product or service.  Digital Fountain is
willing to provide a blanket royalty free license to the rights it holds
that are needed to use the congestion control schemes described in this
document if and when these congestion control schemes become part of the
IETF standards.

11. References

[AFZ95]   Acharya, S., Franklin, M., and Zdonik, S.,
"Dissemination-Based Data Delivery Using Broadcast Disks", IEEE
Personal Communications, pp.50-60, Dec 1995.

[R2119]   Bradner, S., Key words for use in RFCs to Indicate Requirement
Levels (IETF RFC 2119) http://www.rfc-editor.org/rfc/rfc2119.txt

[BYE00] Byers, J.W., Frumin, M., Horn, G., Luby, M., Mitzenmacher, M.,
Roetter, A., Shaver, W., "FLID-DL Congestion Control for Layered Multicast",
Int'l Workshop on Networked Group Communication,
Vol. 2, pp. 71-81, Palo Alto, CA, Nov. 2000.

[CAI99] Cain, B., Speakman, T., and Towsley, D., "Generic Router
Assist (GRA) Building Block, Motivation and Architecture", Internet
Draft draft-ietf-rmt-gra-arch-00.txt, a work in progress.

[DEE88]   Deering, S., "Host Extensions for IP Multicasting", RFC
1058, Stanford University, Stanford, CA, 1988.

[LCT00] Luby, M., Gemmell, J., Vicisano, L., Rizzo, Handley, M., L., Crowcroft, J.,
"Layered Coding Transport: A massively scalalable multicast protocol",
Internet draft draft-ietf-rmt-lct-00, November 2000.

[FECBBINFO00] Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M.,
Crowcroft, J., "The use of Forward Error Correction in Reliable Multicast",
Internet Draft draft-ietf-rmt-info-fec-00.txt, November 2000.



Luby, Vicisano, Haken                                  FORMFEED[Page 18]


Internet Draft     RMT BB, Layered Congestion Control           May 2001


[FECBBSTAN00] Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., Handley, M.,
Crowcroft, J., "RMT BB: Forward Error Correction Codes", Internet Draft
draft-ietf-rmt-bb-fec-01.txt, November 2000.

[HAN96]   Handley, M., "SAP: Session Announcement Protocol", Internet
Draft, IETF MMUSIC Working Group, Nov 1996.

[HAN98]   Handley, M., and Jacobson, V., "SDP: Session Description
Protocol", RFC 2327, April 1998.

[HOL99] Holbrook, H., Cheriton, D., "IP Multicast Channels: Experss
Support for Large-scale Single-source Applications", ACM SIGCOMM'99

[LUB99]      Luby, M., Vicisano, L., Speakman, T. "Heterogeneous
multicast congestion control based on router packet filtering",
presented at RMT meeting in Pisa, March 1999.

[MCC96] McCanne, S., Jacobson, V., Vetterli, M.,,
"Receiver-driven Layered Multicast", Sigcomm'96, Stanford, CA, August 19996.

[VIC98A] L.Vicisano, L.Rizzo, J.Crowcroft, "TCP-like
Congestion Control for Layered Multicast Data Transfer", IEEE Infocom'98,
San Francisco, CA, Mar.28-Apr.1 1998.

[VIC98B] Vicisano, L., "Notes On a Cumulative Layered
Organization of Data Packets Across Multiple Streams with Different
Rates", University College London Computer Science Research Note
RN/98/25, Work in Progress (May 1998).

Authors' Addresses

   Michael Luby
   luby@digitalfountain.com
   Digital Fountain
   600 Alabama Street
   San Francisco, CA, USA, 94110

   Lorenzo Vicisano
   lorenzo@cisco.com
   cisco Systems, Inc.
   170 West Tasman Dr.,
   San Jose, CA, USA, 95134

   Armin Haken
   armin@digitalfountain.com
   Digital Fountain
   600 Alabama Street
   San Francisco, CA, USA, 94110



Luby, Vicisano, Haken                                  FORMFEED[Page 19]


Internet Draft     RMT BB, Layered Congestion Control           May 2001





















































Luby, Vicisano, Haken                                  FORMFEED[Page 20]

Document	Document type	Expired Internet-Draft (rmt WG) Expired & archived
	Select version	00
	Author	Michael Luby Email authors
	RFC stream
	Intended RFC status	(None)
	Other formats	txt pdf bibtex bibxml
	Additional resources	Mailing list discussion