INTERNET-DRAFT M. Handley/J. Crowcroft/C. Bormann/J. Ott
Expires: January 2001 ACIRI/UCL/TZI/TZI
July 2000
The Internet Multimedia Conferencing Architecture
draft-ietf-mmusic-confarch-03
Status of this memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This document is a product of the Multiparty Multimedia Session
Control (MMUSIC) working group of the Internet Engineering Task
Force. Comments are solicited and should be addressed to the working
group's mailing list at confctrl@isi.edu and/or the authors.
Abstract
This document provides an overview of multimedia conferencing on the
Internet. The protocols mentioned are specified elsewhere as RFCs,
Internet-Drafts, or ITU recommendations. Each of these
specifications gives details of the protocol itself, how it works and
what it does. This document attempts to provide the reader with an
overview of how the components fit together and of some of the
assumptions made, as well as some statement of direction for those
components still in a nascent stage.
M. Handley/J. Crowcroft/C. Bormann/J. Ott [Page 1]
INTERNET-DRAFT Conferencing Architecture July 2000
1. Introduction
The Internet is not currently very good at carrying audio and video.
This is hardly surprising as it was not designed or engineered with
real-time traffic in mind, but there has recently been a great deal
of interest in using the Internet for telephony services. Part of
this has come from pricing anomalies that make Internet telephony
somewhat artificially cheaper than traditional telephone services,
but this is not the whole story. The Internet itself is improving to
better handle traffic such as audio and video, and in the medium
term, the Internet should be able to provide good quality real-time
multimedia services, although such quality improvements are likely to
incur additional charges.
However, the real interest in using the Internet for audio and video
should come from the prospect of a single ubiquitous communications
network that not only allows traditional telephony services, but also
video, shared collaboration tools, and through IP Multicast, multi-
party conferences and multimedia sessions that scale from small group
meetings through to television-sized audiences. In principle, this
may lead to a ``democratization'' of telecommunication services,
where licenses to broadcast are not required to control physical
access to the limited broadcast medium (although they may still be
required for political reasons).
It is far from clear what services will eventually emerge using such
communications capabilities. We can only say that the technical
capability to have large numbers of sessions ranging in audience from
hundreds to millions of participants, largely unlimited by geographic
boundaries, will lead to services and social structures that do not
exist today. However, we can describe the basic technologies that
are likely to bring about such changes, and in this document we
attempt to provide such an overview. We leave it to the reader to
imagine the uses to which this technology will be put.
2. The Technology
In conjunction with computers, the term ``conferencing'' is often
used in two different ways: firstly, to refer to bulletin boards and
mailing-list-style asynchronous exchanges of messages between multiple
users; secondly, to refer to synchronous or so-called ``real-time''
conferencing, including audio/video communication and shared tools
such as whiteboards and other applications. This document is about
the architecture for this latter application, multimedia conferencing
in an Internet environment.
There are other infrastructures for teleconferencing in the world:
POTS (Plain Old Telephone System) networks often provide voice
conferencing and phone-bridges, while with ISDN, H.320 [14] can be
used for small, strictly organized video-telephony conferencing.
The architecture that has evolved in the Internet is far more general
as well as being scalable to very large groups, and permits the open
introduction of new media and new applications as they are devised.
As the simplest case, it also allows two persons to communicate via
audio only, so it encompasses IP telephony.
The determining factors of a conferencing architecture are
communication within (possibly large) groups of humans and real-time
delivery of information. In the Internet, this is supported at a
number of levels. The remainder of this section provides an overview
of this support, and the rest of the document describes each aspect
in more detail.
In a conference, information must be distributed to all the
conference participants. Early conferencing systems used a fan-out
of data streams, e.g., one connection between each pair of
participants, which means that the same information must cross some
networks more than once. The Internet architecture uses the more
efficient approach of multicasting the information to all
participants (see section 3).
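As a rough illustration of why fan-out scales poorly, the following sketch counts the copies of each media stream that senders must inject when every participant sends to every other participant. It assumes one unicast copy per ordered pair for fan-out and one copy per sender for multicast; the function names are illustrative and not part of any protocol.

```python
def fanout_copies(participants: int) -> int:
    """Unicast fan-out: each sender transmits a separate copy to every peer."""
    return participants * (participants - 1)


def multicast_copies(participants: int) -> int:
    """Multicast: each sender transmits one copy; the network replicates it."""
    return participants


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(n, fanout_copies(n), multicast_copies(n))
```

With ten participants the fan-out model needs 90 copies against 10 for multicast, and the gap grows quadratically with group size.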
Multimedia conferences require real-time delivery of at least the
audio and video information streams used in the conference. In an
ISDN context, fixed rate circuits are allocated for this purpose --
whether their bandwidth is required at any particular instance or
not. On the other hand, the traditional Internet service model
(``best effort'') cannot make the necessary quality of service
available in congested networks. New service models are being
defined in the Internet together with protocols to reserve capacity
or prioritize traffic in a more flexible way than that available with
circuit switching (see section 4).
In a datagram network, multimedia information must be transmitted in
packets, some of which may be delayed more than others. For audio
and video streams to be played out at the recipient with the correct
timing, information must be transmitted that allows the recipient to
reconstitute that timing. A transport protocol with the
specific functions needed for this has been defined (see section 5).
The nature of the Internet reflects that of the world in that it is
very heterogeneous. Techniques exist to exploit this, and to deliver
appropriate quality to different participants in the same conference
according to their capabilities.
Conference tools such as virtual whiteboards or shared editors are
not concerned with real-time delivery of audio or video but maintain
and update shared state between the participants. Work on support of
such applications in a multicast environment is in progress (section
6.2).
The humans participating in a conference generally need to have a
specific idea of the context in which the conference is happening,
which can be formalized as a conference policy. Some conferences are
essentially crowds gathered around an attraction, while others have
very formal guidelines on who may take part (listen in) and who may
speak at which point. In any case, initially the participants must
find each other, i.e. establish communication relationships
(conference setup, section 7). During the conference, some
conference control information is exchanged to implement a conference
policy or at least to inform the crowd of who is present (section 6).
In addition, security measures may be required to actually enforce
the conference policy, e.g. to control who is listening and to
authenticate contributions as actually originating from a specific
person. In the Internet, there is little tendency to rely on the
traditional ``security'' of distribution offered e.g. by the phone
system. Instead, cryptographic methods are used for encryption and
authentication, which need to be supported by additional conference
setup and control mechanisms (section 8).
Figure 1: Internet multimedia conferencing protocol stacks
|<---       Conference Management       --->|<---  Media Agents  --->|
|                                           |                        |
|       Conference        |   Conference    | Audio/ |   Shared   |
|    Setup & Discovery    | Course Control  | Video  |Applications|
+-------------------------+------+--------+-+--------+------------+  +
|          S D P          |      | Distr. | RTP /    |  Reliable  |  |
| SAP | SIP | HTTP | SMTP | RSVP | Ctrl(1)| RTCP     |Multicast(2)|  |
+-----+--+--+------+------+   +--+--------+----------+------------+--+
|  UDP   |     T C P      |   |                U D P                 |
+--------+----------------+---+--------------------------------------+
|                         IP + IP Multicast                          |
+--------------------------------------------------------------------+
|                   Integrated Services Forwarding                   |
+--------------------------------------------------------------------+
The protocol stacks for internet multimedia conferencing are
illustrated in Figure 1. Unlike many protocol stacks, most of these
protocols are not deeply layered; rather, they are used alongside
each other to produce a complete conference.
3. Multicast Traffic Distribution
+--------------+------------------+-----------------------------------+
|Protocol      | Documentation    | Purpose                           |
+--------------+------------------+-----------------------------------+
|IP Multicast  | RFC 1112, 2236   | Host extensions for IP Multicast  |
|              |                  | Multicast routing protocols:      |
|DVMRP         | RFC 1075         | Dense-mode Intra-domain           |
|PIM-SM        | RFC 2362         | Sparse-mode Intra-domain          |
|PIM-DM        | Internet Draft   | Dense-mode Intra-domain           |
|CBT           | RFC 2189         | Sparse-mode Intra-domain          |
+--------------+------------------+-----------------------------------+
IP multicast provides efficient many-to-many data distribution in an
internet environment. It is easy to view IP multicast as simply an
optimization for data distribution, and indeed this is the case, but
IP multicast can also result in a different way of thinking about
application design. To see why this might be the case, examine the
IP multicast service model, as described by Van Jacobson [8]:
- Senders just send to the group
- Receivers express an interest in receiving data sent to the
group
- Routers conspire to deliver data from senders to receivers
With IP multicast, the group is indirectly identified by a single IP
class-D multicast address.
Several things are important about this service model from an
architectural point of view. Receivers do not need to know who or
where the senders are to receive traffic from them. Senders never
need to know who the receivers are. Neither senders nor receivers
need care about the network topology, as the network optimizes
delivery.
An IP multicast group is scalable because information about group
membership and group changes at the IP level is kept local to
routers near the relevant members. How this is performed depends on
the particular multicast routing scheme in use local to the member,
and although it is not a trivial task, several solutions do exist and
therefore multicast routing will not be discussed in detail here.
For more detailed information on multicast routing, see [7, 6, 4, 1,
23]. Typically, as a group with s senders and r receivers increases
in size, state in routers scales O(s) or O(1) depending on the
routing scheme in use. This state may be in on-tree routers for
newer so-called sparse-mode algorithms such as PIM, or in off-tree
routers for older so-called dense-mode algorithms such as DVMRP.
Thus the most scalable current multicast routing algorithms require
O(1) state in on-tree routers, and hence the total routing state
scales O(g) in a router that is on-tree for g groups. We can also
envisage multicast routing schemes which require less than O(g)
state*, but the requirement is not currently urgent, so none of these
have yet been implemented.
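The scaling argument above can be made concrete with a small sketch. It assumes, purely for illustration, that a router on-tree for a group holds one forwarding entry per sender when per-source trees are used (the O(s) case) and one entry per group when a shared tree is used (the O(1) case). The function name and the entry model are assumptions of this sketch, not taken from any routing protocol specification.

```python
def forwarding_entries(groups: int, senders_per_group: int,
                       shared_tree: bool) -> int:
    """Entries held by one router that is on-tree for every group."""
    # Shared tree: one entry per group, independent of the sender count.
    # Per-source trees: one entry per (sender, group) pair.
    per_group = 1 if shared_tree else senders_per_group
    return groups * per_group


if __name__ == "__main__":
    print(forwarding_entries(1000, 5, shared_tree=True))
    print(forwarding_entries(1000, 5, shared_tree=False))
```

For a router on-tree for g groups, the shared-tree case gives the O(g) total mentioned in the text.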
The level of indirection introduced by the IP class D address
denominating the group solves the distributed systems binding
problem, by pushing this task down into routing; given a multicast
address (and UDP port), a host can send a message to the members of a
group without needing to discover who they are. Similarly receivers
can ``tune in'' to multicast data sources without needing to bother
the data source itself with any form of request.
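A minimal sketch of this service model using the standard sockets API is shown below. The group address 239.1.2.3 and port 5004 are arbitrary example values chosen for this sketch, and the helper names are illustrative. A receiver joins the group without contacting any sender, and a sender transmits to the group without learning who the receivers are.

```python
import socket
import struct

GROUP = "239.1.2.3"  # arbitrary example class D address (assumption)
PORT = 5004          # arbitrary example UDP port (assumption)


def is_class_d(address: str) -> bool:
    """Class D addresses have 1110 as their top four bits (224.0.0.0/4)."""
    first_octet = socket.inet_aton(address)[0]
    return 224 <= first_octet <= 239


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure passed to IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton(interface))


def open_receiver(group: str, port: int) -> socket.socket:
    """Express interest in a group; no sender is contacted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_to_group(payload: bytes, group: str, port: int, ttl: int = 1) -> None:
    """Send one datagram to the group without knowing the receivers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))
```

A receiver would call open_receiver(GROUP, PORT) and then recvfrom() on the returned socket; whether datagrams actually arrive depends on multicast support in the local network.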
IP multicast is a natural solution for multi-party conferencing
because of the efficiency of the data distribution trees, with data
being replicated in the network at appropriate points rather than in
end-systems. It also avoids the need to configure special-purpose
servers to support the session, which require support, and which
cause traffic concentration and can be a bottleneck. For larger
broadcast-style sessions, it is essential that data-replication be
carried out in a way that only requires per-receiver network-state to
be local to each receiver, and that data-replication occurs within
the network. Attempting to configure a tree of application-specific
replication servers for such broadcasts rapidly becomes a ``multicast
routing'' problem, and thus native multicast support is a more
appropriate solution.
_________________________
* With IP encapsulation, not all on-tree routers need hold the
state for a group whose traffic they are forwarding -- traffic for
the group can be encapsulated (either unicast or multicast)
between on-tree routers nearer the edge of the network, reducing
some of the state burden on backbone routers.
3.1. Address Allocation
+----------+-------------------+----------------------------------+
|Protocol  | Documentation     | Purpose                          |
+----------+-------------------+----------------------------------+
|MADCAP    | Internet Draft    | DHCP-like client protocol        |
|          |                   | for address allocation           |
|AAP       | Internet Draft    | Intra-domain address allocation  |
|MASC      | Internet Draft    | Inter-domain address allocation  |
|BGMP      | Internet Draft    | Inter-domain multicast routing   |
+----------+-------------------+----------------------------------+
How does an application choose a multicast address to use?
In the absence of any other information, we can bootstrap a multicast
application by using well-known multicast addresses. Routing
(unicast and multicast) and group membership protocols [5] can do
just that. However, this is not the best way of managing
applications of which there is more than one instance at any one
time.
For these, we need a mechanism for allocating group addresses
dynamically, and a directory service which can hold these allocations
together with some key (session information for example --- see
later), so that users can look up the address associated with the
application. The address allocation and directory functions should
be distributed to scale well.
Multicast address allocation is currently an active area of research.
For many years multicast address allocation has been performed using
multicast session directories (see section 7.1), but as the users and
uses of IP multicast increase, it is becoming clear that a more
hierarchical approach is required.
An architecture [10] is currently being developed based around a
well-defined API that an application can use to request an address.
The host then requests an address from a local address allocation
server, which in turn chooses and reserves an unallocated address
from a range dynamically allocated to the domain. By allocating
addresses in a hierarchical and topologically sensitive fashion, the
address itself can be used in a hierarchical multicast routing
protocol currently being developed (BGMP, [29]) that will help
multicast routing scale more gracefully than current schemes.
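The allocation step can be sketched as follows. This is a toy model only: the class name, its methods, and the example range are assumptions made for illustration and do not reflect the MADCAP, AAP, or MASC protocols. It shows a local server choosing and reserving an unallocated address from the range assigned to its domain.

```python
import ipaddress


class LocalAllocationServer:
    """Toy allocation server owning one multicast range (illustrative only)."""

    def __init__(self, domain_range: str) -> None:
        self._range = ipaddress.ip_network(domain_range)
        if not self._range.is_multicast:
            raise ValueError("range must lie within 224.0.0.0/4")
        self._allocated = set()

    def allocate(self):
        """Choose and reserve the first unallocated address in the range."""
        for address in self._range:
            if address not in self._allocated:
                self._allocated.add(address)
                return address
        raise RuntimeError("domain range exhausted")

    def release(self, address) -> None:
        """Return an address to the pool."""
        self._allocated.discard(address)


if __name__ == "__main__":
    server = LocalAllocationServer("239.192.0.0/30")
    print(server.allocate(), server.allocate())
```

A real allocator would also attach lifetimes to allocations and obtain its range dynamically; both are omitted here.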
4. Internet Service Models
+----------+----------------+--------------------+
|Protocol  | Documentation  | Purpose            |
+----------+----------------+--------------------+
|IP        | RFC 791        | Internet Protocol  |
+----------+----------------+--------------------+
Traditionally the internet has provided so-called best-effort
delivery of datagram traffic from senders to receivers. No
guarantees are made regarding when or if a datagram will be delivered
to a receiver; however, datagrams are normally only dropped when a
router exceeds a queue size limit due to congestion. The best-effort
internet service model does not assume FIFO queuing, although many
routers have implemented this.
With best-effort service, if a link is not congested, queues will not
build at routers, datagrams will not be discarded in routers, and
delays will consist of serialization delays at each hop plus
propagation delays. With sufficiently fast link speeds,
serialization delays are insignificant compared to propagation
delays*.
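A short worked example of the two delay components on an uncongested path follows. The signal speed of 2e8 m/s is an assumed round figure for propagation in fibre or copper, and the function names are illustrative.

```python
def serialization_delay(packet_bytes: int, link_bits_per_second: float) -> float:
    """Time to clock one packet onto a link, in seconds."""
    return packet_bytes * 8 / link_bits_per_second


def propagation_delay(distance_m: float,
                      signal_speed_m_per_s: float = 2.0e8) -> float:
    """Time for a bit to traverse the link (speed is an assumed value)."""
    return distance_m / signal_speed_m_per_s


def uncongested_path_delay(packet_bytes: int, hops) -> float:
    """Sum both delays over a list of (link_rate_bps, distance_m) hops."""
    return sum(serialization_delay(packet_bytes, rate) + propagation_delay(dist)
               for rate, dist in hops)


if __name__ == "__main__":
    print(serialization_delay(1500, 1.5e6))   # slow access link
    print(serialization_delay(1500, 1.0e9))   # fast backbone link
    print(propagation_delay(4.0e6))           # 4000 km
```

For a 1500-byte packet, serialization takes 8 ms at 1.5 Mbit/s but only 12 microseconds at 1 Gbit/s, whereas 4000 km of propagation costs about 20 ms under the assumed signal speed.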
_________________________
* For slow links, a set of mechanisms has been defined that
helps minimize serialization and link access delays [2].
If a link is congested, with best-effort service, queuing delays will
start to influence end-to-end delays, and packets will start to be
lost as queue size limits are exceeded. Real-time traffic does not
cope terribly well with packet loss levels of more than a few
percent, although it is possible to add redundancy [12] to increase
the levels at which loss becomes a problem. In the last few years a
significant amount of work has also gone into providing non-best-
effort services that would provide a better assurance that an
acceptable quality conference will be possible.
4.1. Non-best effort service
Real-time internet traffic is defined as datagrams that are delay
sensitive. It could be argued that all datagrams are delay sensitive
to some extent, but for these purposes we refer only to datagrams
where exceeding an end-to-end delay bound of a few hundred
milliseconds renders the datagrams useless for the purpose they were
intended. For the purposes of this definition, TCP traffic is
normally not considered to be real-time traffic, although there may
be exceptions to this rule.
On congested links, best-effort service queuing delays will adversely
affect real-time traffic. This does not mean that best-effort
service cannot support real-time traffic --- merely that congested
best-effort links seriously degrade the service provided. For such
congested links, a better-than-best-effort service is desirable.
To achieve this, the service model of the routers can be modified.
FIFO queuing can be replaced by packet forwarding strategies that
discriminate different ``flows'' of traffic. The idea of a flow is
very general. A flow might consist of ``all marketing site web
traffic'', or ``all fileserver traffic to and from teller machines''.
On the other hand, a flow might consist of a particular sequence of
packets from an application in a particular machine to a peer
application in another particular machine set up on request, or it
might consist of all packets marked with a particular Type-of-Service
bit.
There is really a spectrum of possibilities for non-best-effort
service, something like that shown in Figure 2.
Figure 2: Spectrum of internet service types
 best effort                 assured by                 guaranteed by
 unsignalled               type of service          per-flow reservation
      +-------------+-------------+-------------+-------------+
             prioritized by                assured by
             type of service          aggregate reservation
This spectrum is intended to illustrate that between best-effort and
hard per-flow guarantees lie many possibilities for non-best-effort
service, including having hard guarantees based on an aggregate
reservation, assurances that traffic marked with a particular type-
of-service bit will not be dropped so long as it remains in profile,
and simpler prioritization-based services.
Towards the right hand side of the spectrum, flows are typically
identifiable in the Internet by the tuple: source machine,
destination machine, source port, destination port, protocol, any of
which could be ``ANY'' (wildcarded).
In the multicast case, the destination is the group, and can be used
to provide efficient aggregation.
Flow identification is called classification and a class (which can
contain one or more flows) has an associated service model applied.
This can default to best effort.
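Classification with wildcards can be sketched as below. The FlowSpec type, the use of None as the ``ANY'' marker, and the class names are assumptions of this sketch rather than anything defined by a router implementation. Each rule pairs a partially wildcarded five-tuple with a service class, and unmatched packets default to best effort.

```python
from typing import NamedTuple, Optional

ANY = None  # wildcard marker (an assumption of this sketch)


class FlowSpec(NamedTuple):
    src: Optional[str] = ANY
    dst: Optional[str] = ANY
    sport: Optional[int] = ANY
    dport: Optional[int] = ANY
    proto: Optional[str] = ANY

    def matches(self, packet: "FlowSpec") -> bool:
        """A field matches if it is wildcarded or equals the packet's field."""
        return all(want is ANY or want == got
                   for want, got in zip(self, packet))


def classify(packet: FlowSpec, rules, default: str = "best-effort") -> str:
    """Return the service class of the first rule that matches the packet."""
    for spec, service in rules:
        if spec.matches(packet):
            return service
    return default


if __name__ == "__main__":
    rules = [(FlowSpec(dst="239.1.2.3", proto="udp"), "real-time")]
    print(classify(FlowSpec("192.0.2.1", "239.1.2.3", 4000, 5004, "udp"), rules))
    print(classify(FlowSpec("192.0.2.1", "198.51.100.7", 4000, 80, "tcp"), rules))
```

In the multicast case a single rule keyed on the group address covers every sender to that group, which is the aggregation the text refers to.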
Through network management, we can imagine establishing classes of
long lived flows -- enterprise networks (``Intranets'') often enforce
traffic policies that distinguish priorities which can be used to
discriminate in favor of more important traffic in the event of
overload (though in an underloaded network, the effect of such
policies will be invisible, and may incur no load/work in routers).
The router service model to provide such classes with different
treatment can be as simple as a priority queuing system, or it can be
more elaborate.
Although best-effort services can support real-time traffic,
classifying real-time traffic separately from non-real-time traffic
and giving real-time traffic priority treatment ensures that real-
time traffic sees minimum delays. Non-real-time TCP traffic tends to
be elastic in its bandwidth requirements, and will then tend to fill
any remaining bandwidth.
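The simplest router service model mentioned above, a priority queuing system, might look like the following sketch. It assumes just two classes and strict priority; the class and method names are illustrative.

```python
from collections import deque


class PriorityScheduler:
    """Two-queue strict-priority scheduler (illustrative only)."""

    def __init__(self) -> None:
        self._realtime = deque()
        self._elastic = deque()

    def enqueue(self, packet, realtime: bool = False) -> None:
        """Place a packet in the queue for its class."""
        (self._realtime if realtime else self._elastic).append(packet)

    def dequeue(self):
        """Serve real-time packets first; elastic traffic uses what is left."""
        if self._realtime:
            return self._realtime.popleft()
        if self._elastic:
            return self._elastic.popleft()
        return None


if __name__ == "__main__":
    s = PriorityScheduler()
    s.enqueue("tcp-1")
    s.enqueue("audio-1", realtime=True)
    s.enqueue("tcp-2")
    print([s.dequeue() for _ in range(4)])
```

Strict priority can starve the lower class if the real-time class is not itself limited, which is one reason for the admission control and policing discussed later.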
We could imagine a future Internet with sufficient capacity to carry
all of the world's telephony traffic. Since this is a relatively
modest capacity requirement, it might be simpler to establish
``POTS'' as a static class which is given some fraction of the
capacity overall, and then within the backbone of the network no
individual call need be given an allocation (i.e. we would no longer
need the call setup/tear down that was needed in the legacy POTS
which was only present due to under-provisioning of trunks, and to
allow the trunk exchanges the option of call blocking). The vision
is of a network that is engineered with capacity for all of the non-
best-effort average load sources to send without needing individual
reservations.
4.2. Reservations
+-----------------+----------------+---------------------------------------+
|Protocol         | Documentation  | Purpose                               |
+-----------------+----------------+---------------------------------------+
|RSVP             | RFC 2205       | Resource ReSerVation Protocol (RSVP)  |
|Controlled Load  | RFC 2211       | Network service model                 |
| Service         |                | selected by RSVP                      |
|Guaranteed       | RFC 2212       | Network service model                 |
| Service         |                | selected by RSVP                      |
+-----------------+----------------+---------------------------------------+
For flows that may take a significant fraction of the network (i.e.
are ``special'' and can't just be lumped under a static class), we
need a more dynamic way of establishing these classifications. In
the short term, this applies to many multimedia calls since the
Internet is largely under-provisioned at the time of writing.
RSVP has been standardized for just this purpose. It provides flow
identification and classification. Hosts and applications are
modified to speak RSVP client language, and routers speak RSVP.
Since most traffic requiring reservations is delivered to groups
(e.g. TV), it is natural for the receiver to make the request for a
reservation for a flow. This has the added advantage that different
receivers can make heterogeneous requests for capacity from the same
source. Thus RSVP can accommodate monochrome, color and HDTV
receivers from a single source (see also Figure 5).
Again the routers conspire to deliver the right flows to the right
locations.
RSVP accommodates the wildcarding noted above.
Admission Control
If a network is provisioned such that it has excess capacity for all
the real-time flows using it, a simple priority classification
ensures that real-time traffic is minimally delayed. However, if a
network is insufficiently provisioned for the traffic in a real-time
traffic class, then real-time traffic will be queued, and delays and
packet loss will result. Thus in an under-provisioned network,
either all real-time flows will suffer, or some of them must be given
priority.
RSVP provides a mechanism by which an admission control request can
be made, and if sufficient capacity remains in the requested traffic
class, then a reservation for that capacity can be put in place.
If insufficient capacity remains, the admission request will be
refused, but the traffic will still be forwarded with the default
service for that traffic's traffic class. In many cases even an
admission request that failed at one or more routers can still supply
acceptable quality as it may have succeeded in installing a
reservation in all the routers that were suffering congestion. This
is because other reservations may not be fully utilising their
reserved capacity in those routers where the reservation failed.
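The admission decision can be sketched per router as follows. This models only the capacity check described above, not RSVP's message exchange; the TrafficClass type, its fields, and reserve_along_path are assumptions made for illustration. Note that a request refused at one router may still leave reservations installed at the others.

```python
class TrafficClass:
    """Capacity bookkeeping for one traffic class at one router."""

    def __init__(self, capacity_bps: float) -> None:
        self.capacity_bps = capacity_bps
        self.reserved_bps = 0.0

    def admit(self, requested_bps: float) -> bool:
        """Install the reservation only if enough capacity remains."""
        if self.reserved_bps + requested_bps > self.capacity_bps:
            return False
        self.reserved_bps += requested_bps
        return True


def reserve_along_path(routers, requested_bps: float):
    """Try every (name, TrafficClass) hop; return the names that refused."""
    return [name for name, traffic_class in routers
            if not traffic_class.admit(requested_bps)]


if __name__ == "__main__":
    path = [("r1", TrafficClass(1.0e6)), ("r2", TrafficClass(1.0e5))]
    print(reserve_along_path(path, 5.0e5))
```

Traffic for a refused request would still be forwarded with the default service of its class at the refusing routers.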
Billing
If a reservation involves setting aside resources for a flow, this
will tie up resources so that other reservations may not succeed, and
depending on whether the flow fills the reservation, other traffic is
prevented from using the network. Clearly some negative feedback is
required in order to prevent pointless reservations from denying
service to other users. This feedback is typically in the form of
billing.
Billing requires that the user making the reservation is properly
authenticated so that the correct user can be charged. Billing for
reservations introduces a level of complexity to the internet that
has not typically been experienced with non-reserved traffic, and
requires network providers to have reciprocal usage-based billing
arrangements for traffic carried between them. It also suggests the
use of mechanisms whereby some fraction of the bill for a link
reservation can be charged to each of the downstream multicast
receivers.
4.3. Differentiated Services
+-------------------------+----------------+------------------------+
|Protocol                 | Documentation  | Purpose                |
+-------------------------+----------------+------------------------+
|Differentiated Services  | RFC 2474       | DS Field in IP Header  |
|Differentiated Services  | RFC 2475       | DS Architecture        |
+-------------------------+----------------+------------------------+
Whereas RSVP asks routers to classify packets into classes to achieve
a requested quality of service, it is also possible to explicitly
mark packets to indicate the type of service required. Of course,
there has to be an incentive and mechanisms to ensure that ``high-
priority'' is not set by everyone in all packets, and this incentive
is provided by edge-based policing and by buying profiles of higher
priority service. In this context, a profile could have many forms,
but a typical profile might be a token-bucket filter specifying a
mean rate and a bucket size with certain time-of-day restrictions.
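A token-bucket profile of the kind just described can be sketched as below. The class name and units are assumptions of this sketch, the caller supplies the clock so the behaviour is deterministic, and the time-of-day restrictions are omitted. Packets that find enough tokens are in profile; others would be policed at the edge.

```python
class TokenBucket:
    """Token-bucket filter: mean rate in bytes/s, depth in bytes."""

    def __init__(self, rate_bytes_per_s: float, bucket_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.depth = bucket_bytes
        self.tokens = bucket_bytes  # start with a full bucket
        self.last = 0.0

    def in_profile(self, packet_bytes: int, now: float) -> bool:
        """Refill for the elapsed time, then try to spend tokens."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_bytes_per_s=1000, bucket_bytes=1500)
    for t in (0.0, 0.5, 1.5):
        print(t, bucket.in_profile(1500, now=t))
```

The bucket size bounds the burst a customer may send at once, while the rate bounds the long-term average.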
This is still an active research area, but the general idea is for a
customer to buy from their provider a profile for higher quality
service, and the provider polices marked traffic from the site to
ensure that the profile is not exceeded. Within a provider's
network, routers give preferential service to packets marked with
the relevant type-of-service bit. Where providers peer, they arrange
for an aggregate higher-quality profile to be provided, and police
each other's aggregate if it exceeds the profile. In this way,
policing only needs to be performed at the edges to a provider's
network on the assumption that within the network there is sufficient
capacity to cope with the amount of higher-quality traffic that has
been sold. The remainder of the capacity can be filled with regular
best-effort traffic.
One big advantage of differentiated services over reservations is
that routers do not need to keep per-flow state, or look at source
and destination addresses to classify the traffic, and this means
that routers can be considerably simpler. Another big advantage is
that the billing arrangements for differentiated services are
pairwise between providers at boundaries -- at no time does a
customer need to negotiate a billing arrangement with each provider
in the path*.
_________________________
* With reservations there may be ways to avoid this too, but
they're somewhat more difficult given the more specific nature of
a reservation.
5. Transport Protocols
So-called real-time delivery of traffic requires little in the way of
transport protocol. In particular, real-time traffic that is sent
over more than trivial distances is not retransmittable.
With packet multimedia data there is no need for the different media
comprising a conference to be carried in the same packets. In fact
it simplifies receivers if different media streams are carried in
separate flows (i.e., separate transport ports and/or separate
multicast groups). This also allows the different media to be given
different quality of service. For example, under congestion, a
router might preferentially drop video packets over audio packets.
In addition, some sites may not wish to receive all the media flows.
For example, a site with a slow access link may be able to
participate in a conference using only audio and a whiteboard whereas
other sites in the same conference with more capacity may also send
and receive video. This can be done because the video can be sent to
a different multicast group than the audio and whiteboard. This is a
first step towards coping with heterogeneity by allowing the
receivers to decide how much traffic to receive, and hence allowing a
conference to scale more gracefully.
5.1. Receiver Adaptation and Synchronization
Figure 3: Network Jitter and Packet Audio
                                                            |x|
                                                            | | |
                                                            |x| |
                                                            | | |
                             Compression                    |x| v
                             + Packetizer                   | |
                              +--------+                 +-------+
Microphone                    |        | 1.5 Mbit/s link |       |
            +-----+ A   A   A |        |-----------------| Router|
/~\------+__|     |>>> >>> >>>|        |A   A   A   A   A|       |
\_/------+  |A->D |20 ms Audio|        |-----------------|       |
            +-----+Timeslices |        |   --------->    |       |
                              |        |                 +-------+
                              +--------+                    |A|
                                                            | | |
                                          Shared link:      |x| |
                                          Audio traffic     |A| |
                                          interspersed with |A| |
                                          other traffic     |x| |
                                                            |x| |
                                                            |x| |
                         Depacketizer                       |A| |
                         + Timing recovery                  |A| v
                         + Decompression                    | |
                              +--------+                 +-------+
Speaker                       |        | 1.5 Mbit/s link |       |
|\          +-----+ A   A   A |        |-----------------| Router|
| +---+     |     |<<< <<< <<<|        |A    A   AA    A |       |
| |   |-----|D->A |20 ms Audio|        |-----------------|       |
| +---+     +-----+Timeslices |        |   <---------    |       |
|/                            |        |                 +-------+
                              +--------+                    | |
                                                            |X| |
                                                            |X| |
                                                            | | |
                                                            |X| v
                                                            | |
Best-effort traffic is delayed by queues in routers between the
sender and the receivers. Even reserved priority traffic may see
small transient queues in routers, and so packets comprising a flow
will be delayed for different times. Such delay variance is known as
jitter, and is illustrated in Figure 3.
Real-time applications such as audio and video need to be able to
buffer real-time data at the receiver for sufficient time to remove
the jitter added by the network and recover the original timing
relationships between the media data. In order to know how long to
buffer for, each packet must carry a timestamp which gives the time
at the sender when the data was captured. Note that for audio and
video data timing recovery, it is not necessary to know the absolute
time that the data was captured at the sender, only the time relative
to the other data packets.
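The buffering described above might be sketched as follows. This is
a minimal sketch, assuming a fixed playout delay chosen in advance and
timestamps already converted to milliseconds; a real receiver would
normally adapt the delay to the jitter it observes. The names
playout_times and playout_delay are hypothetical.

```python
def playout_times(packets, playout_delay):
    """Schedule playout so that the sender's packet spacing is restored.

    packets: list of (media_timestamp, arrival_time) pairs in ms.
    playout_delay: buffering time in ms applied to the first packet.
    Returns (media_timestamp, playout_time) pairs; playout_time is None
    for a packet that arrived after its playout point.
    """
    first_ts, first_arrival = packets[0]
    # Only relative timestamps matter: fix the mapping from the sender's
    # timeline to the local playout timeline using the first packet.
    offset = first_arrival + playout_delay - first_ts
    schedule = []
    for ts, arrival in packets:
        due = ts + offset
        schedule.append((ts, due if arrival <= due else None))
    return schedule


# Hypothetical 20 ms audio packets with variable network delay.
packets = [(0, 50), (20, 75), (40, 90), (60, 125), (80, 130)]
print(playout_times(packets, playout_delay=30))
print(playout_times(packets, playout_delay=0))
```

With 30 ms of buffering every packet is played at its original 20 ms
spacing; with no buffering the two most delayed packets miss their
playout point.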
Figure 4: Inter-media synchronization
Incoming packets
---------------- +----------------+
V A V AV A --> | Host |
---------------- | Demuxing |
+----------------+
/ \
/ \
A A / A V V \ V
v v
+---------------+ +---------------+
| Depacketizer | per source | Depacketizer |
+---------------+ delay adaptation: +---------------+
v \ 45 ms 95 ms / v
+------------+ \ / +------------+
| format | \ / | format |
| conversion | +------------------+ | conversion |
+------------+ | synchronization | +------------+
| | agent | |
| +------------------+ |
| mix / \ |
v / \ v
| | / \ | |
+-----+ / \ +-----+
| | / \ | |
+-----+ / \ +-----+
| A | / 95 ms 95 ms \ | V |
+-----+ / \ +-----+
| A | <--+ +-> | V |
+-----+ /| +--------+ +-----+
| A | +---+ | |/------\| | V |
+-----+>>>>>| | | || ||<<<<<+-----+
+---+ | |\------/|
\| +--------+
As audio and video flows will receive differing jitter and possibly
differing quality of service, audio and video that were grabbed at
the same time at the sender may not arrive at the receiver at the
same time. At the receiver, each flow will need a playout buffer to
remove network jitter. Inter-flow synchronization can be performed
by adapting these playout buffers so that samples/frames that
originated at the same time are played out at the same time (see
Figure 4). This requires that the time base of different
flows from the same sender can be related at the receivers, e.g. by
making available the absolute times at which each of them was
captured.
5.2. RTP
+-------------+----------------+--------------------------------------+
|Protocol     | Documentation  | Purpose                              |
+-------------+----------------+--------------------------------------+
|RTP,RTCP     | RFC 1889       | packet format for realtime traffic   |
|RTP Profile  | RFC 1890       | specific RTP profile for AV traffic  |
|RTP Payload  | RFC 2032,      | payload formats for specific codecs  |
| Formats     | 2035, etc      |                                      |
+-------------+----------------+--------------------------------------+
The transport protocol for real-time flows is RTP [28]. This
provides a standard packet header format which carries media-specific
timestamp data, as well as payload format information and sequence
numbering, amongst other things. RTP is normally carried using UDP.
It does not provide or require any connection setup, nor does it
provide any enhanced reliability over UDP. For RTP to provide a
useful media flow, there must be sufficient capacity in the relevant
traffic class to accommodate the traffic. How this capacity is
ensured is independent of RTP.
Every original RTP source is identified by a source identifier, and
this source id is carried in every packet. RTP allows flows from
several sources to be mixed in gateways to provide a single resulting
flow. When this happens, each mixed packet contains the source IDs
of all the contributing sources.
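To make the header fields concrete, the sketch below packs and parses
the 12-octet RTP fixed header followed by the list of contributing
source identifiers, using the field layout given in RFC 1889. The
function names and identifier values are hypothetical, and header
extensions and padding are not handled.

```python
import struct


def pack_rtp_header(payload_type, seq, timestamp, ssrc, csrcs=(), marker=False):
    """Return the RTP fixed header (version 2) plus the CSRC list."""
    if len(csrcs) > 15:
        raise ValueError("the 4-bit CC field allows at most 15 CSRCs")
    byte0 = (2 << 6) | len(csrcs)  # V=2, P=0, X=0, CC=len(csrcs)
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    fixed = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                        timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return fixed + b"".join(struct.pack("!I", c) for c in csrcs)


def parse_rtp_header(data):
    """Decode the fields written by pack_rtp_header()."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    cc = byte0 & 0x0F
    csrcs = list(struct.unpack("!%dI" % cc, data[12:12 + 4 * cc]))
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": csrcs,
    }


# Hypothetical mixer (source id 0x1234) forwarding audio from two sources.
header = pack_rtp_header(0, 7, 160, 0x1234, csrcs=[0xAAAA, 0xBBBB])
print(parse_rtp_header(header))
```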
RTP media timestamp units are flow specific: they are in units
that are appropriate to the media flow. For example, 8kHz sampled
PCM encoded audio has a timestamp clock rate of 8kHz. Because each
flow counts time in its own units, inter-flow synchronization is not
possible from the RTP timestamps alone.
Each RTP flow is supplemented by RTP Control Protocol (RTCP)
packets. There are a number of different RTCP packet types. RTCP
packets provide the relationship between the realtime clock at a
sender and the RTP media timestamps so that inter-flow
synchronization can be performed, and they provide textual
information to identify a sender in a conference from the source id.
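The clock relationship carried in RTCP can be used roughly as shown
below. This is a sketch under stated assumptions: each flow has
reported one pair of (sender wallclock time, RTP timestamp), the
wallclock time is given in seconds, and the clock rates of 8000 Hz
for audio and 90000 Hz for video are example values. The function
name is hypothetical.

```python
def rtp_to_wallclock(rtp_ts, report_rtp_ts, report_wallclock, clock_rate):
    """Map an RTP timestamp to sender wallclock time in seconds.

    report_rtp_ts and report_wallclock are the corresponding pair taken
    from an RTCP sender report for the same flow.
    """
    # Treat the difference as a signed 32-bit value so that timestamp
    # wrap-around is handled.
    diff = (rtp_ts - report_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return report_wallclock + diff / clock_rate


# Hypothetical audio and video flows from the same sender.
audio_time = rtp_to_wallclock(160800, 160000, 1000.0, 8000)
video_time = rtp_to_wallclock(99000, 90000, 1000.0, 90000)
print(audio_time, video_time)
```

Although the two RTP timestamps are in unrelated units, both map to
the same point on the sender's clock, so the receiver knows that the
audio sample and the video frame should be played out together.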
5.3. Conference Membership and Reception Feedback
IP multicast allows sources to send to a multicast group without
being a receiver of that group. However, for many conferencing
purposes it is useful to know who is listening to the conference, and
whether the media flows are reaching receivers properly. Accurately
performing both these tasks restricts the scaling of the conference.
IP multicast means that no-one knows the precise membership of a
multicast group at a specific time, and this information cannot be
discovered, as to try to do so would cause an implosion of messages,
many of which would be lost*. Instead, RTCP provides approximate
membership information through periodic multicast of session messages
which, in addition to information about the recipient, also give
information about the reception quality at that receiver. RTCP
session messages are restricted in rate, so that as a conference
grows, the rate of session messages remains constant, and each
receiver reports less often. A member of the conference can never
know exactly who is present at a particular time from RTCP reports,
but does have a good approximation to the conference membership. This
is analogous to what happens in a real-world meeting hall; the
meeting organizers may have an attendance list, but if people are
coming and going all the time, they probably do not know exactly who
is in the room at any one moment.
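The rate restriction on session messages can be illustrated with a
simplified calculation. The sketch below assumes that RTCP is given
a fixed 5% share of the session bandwidth and that reports are never
sent more often than every 5 seconds, values suggested in RFC 1889;
the average packet size of 100 octets and the function name are
hypothetical. Randomization of the interval and the separate
treatment of senders are omitted.

```python
def rtcp_report_interval(members, session_bw_bps, avg_rtcp_packet_bytes=100,
                         rtcp_percent=5, min_interval=5.0):
    """Seconds between RTCP reports from one member (simplified)."""
    # Share of the session bandwidth set aside for RTCP, in octets/second.
    rtcp_bw_bytes = session_bw_bps * rtcp_percent / 100 / 8
    # All members share that budget, so the interval grows with the group.
    interval = members * avg_rtcp_packet_bytes / rtcp_bw_bytes
    return max(interval, min_interval)


# Hypothetical 64 kbit/s audio session of increasing size.
for n in (10, 1000, 10000):
    print(n, rtcp_report_interval(n, 64000))
```

In this example the aggregate RTCP traffic stays at 400 octets per
second once the minimum interval no longer applies, while each member
reports ten times less often for every tenfold growth of the group.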
Reception quality information is primarily intended for debugging
purposes, as debugging of IP multicast problems is a difficult task.
However, it is possible to use reception quality information for rate
adaptive senders, although it is not clear whether this information
is sufficiently timely to be able to adapt fast enough to transient
congestion.
5.4. Scaling Issues and Heterogeneity
The Internet is very heterogeneous, with link speeds ranging from
around 10 kbit/s up to around 10 Gbit/s, and very varied levels of
congestion. How then can a single multicast source satisfy a large
and heterogeneous set of receivers?
_________________________
* Note that a conference policy that restricts conference
membership can be implemented using encryption and restricted
distribution of encryption keys, of which more later.
Figure 5: Receiver adaptation: multiple layers and multicast groups
/~~~\ ##### 2 Mbit/s layer
| R | ===== 512 kbit/s layer
\___/ ----- 64 kbit/s layer
#
10 Mbit/s # 10 Mbit/s
link: # link
#
/~~~\ #######> +---+
| S | =======> | |
\___/ -------> +---+
\ = 1.5 Mbit/s
Source \ = link
\ = 1.5 Mbit/s
\ = link
\ +---+=========>/~~~\
| |--------->| R |
+---+ \___/
10 Mbit/s / = \
link / = \ 128 kbit/s
/ = \ link
/ = \ 10 Mbit/s
/~~~\ +---+ link /~~~\
| R | | |--------->| R |
\___/ +---+ \___/
\
\ 10 Mbit/s
\ link
\
/~~~\
| R |
\___/
In addition to each receiver performing its own adaptation to jitter,
if the sender layers [22] its video (or audio) stream, different
receivers can choose to receive different amounts of traffic and
hence different qualities. To do this the sender must code the
video as a base layer (the lowest quality that might be acceptable)
and a number of enhancement layers, each of which adds more quality
at the expense of more bandwidth. With video, these additional
layers might increase the framerate or increase the spatial
resolution of the images or both. Each layer is sent to a different
multicast group, and receivers can decide individually how many
layers to subscribe to. This is illustrated in Figure 5. Of course,
if they are going to respond to congestion in this way, then we also
need to arrange that the receivers in a conference behind a common
bottleneck tend to respond together to prevent de-synchronized
experiments by different receivers from having the net effect that
too many layers are always being drawn through a common bottleneck.
RLM [21] is one way that this might be achieved, although there is
continuing research in this area.
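The receiver's choice of layers can be illustrated with a static
calculation. This sketch assumes that the layer rates shown in
Figure 5 are per-layer rates that add up as more groups are joined,
and that the receiver already knows its bottleneck capacity. Neither
assumption holds in general: schemes such as RLM discover the
sustainable number of layers by join experiments instead. The
function name is hypothetical.

```python
def layers_to_join(layer_rates_kbps, bottleneck_kbps):
    """Number of layers, base layer first, that fit through the bottleneck."""
    total = 0
    joined = 0
    for rate in layer_rates_kbps:
        if total + rate > bottleneck_kbps:
            break  # the next enhancement layer would overload the link
        total += rate
        joined += 1
    return joined


# Layer rates taken from Figure 5: base layer first, in kbit/s.
LAYERS = [64, 512, 2000]
for link in (128, 1500, 10000):
    print(link, layers_to_join(LAYERS, link))
```

Each joined layer corresponds to one multicast group membership, so
dropping a layer is simply a matter of leaving that group.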
6. Conference Control
+---------+--------------------------+-------------------------------------+
|Protocol | Documentation            | Purpose                             |
+---------+--------------------------+-------------------------------------+
|H.323    | ITU recommendation H.323 | Tightly coupled conference setup    |
|         |                          | and control                         |
|H.332    | ITU recommendation H.332 | Loosely coupled extensions to H.323 |
+---------+--------------------------+-------------------------------------+
Conferences come in many shapes and sizes, but there are only really
two models for conference control: light-weight sessions and tightly
coupled conferencing. For both models, rendezvous mechanisms are
needed. Note that the conference control model is orthogonal to
issues of quality of service and network resource reservation, and it
is also orthogonal to the mechanism for discovering the conference.
Light-weight sessions are multicast based multimedia conferences that
lack explicit conference membership control and explicit conference
control mechanisms. Typically a lightweight session consists of a
number of many-to-many media streams supported using RTP and RTCP
using IP multicast*. Typically, the only conference control
information needed during the course of a light-weight session is
that distributed in the RTCP session information, i.e. an approximate
membership list with some attributes per member.
Tightly coupled conferences may also be multicast based and use RTP
and RTCP, but in addition they have an explicit conference membership
mechanism and may have an explicit conference control mechanism that
provides facilities such as floor control.
The most widely used tightly coupled conference control protocols
suitable for Internet use are those belonging to the ITU's H.323
family [16]. However, these protocols are inappropriate for larger
conferences, where the conference control mechanisms introduce
scaling problems. The Simple Conference Control
Protocol (SCCP) [18] has been proposed as a more scalable distributed
conference control protocol.
In order to try and address large conferences, the ITU is currently
standardising H.332 [17], which is essentially a small tightly
coupled H.323 conference with a larger lightweight-sessions-style
conference listening in as passive participants. It is not yet clear
whether H.332 will see large scale acceptance, as its benefits over a
simple lightweight session are not terribly obvious. It seems likely
that lightweight sessions combined with stream authentication (see
section 8.3) might be a more appropriate solution for many potential
customers.
_________________________
* There is some confusion on the term session, which is sometimes
used for a conference and sometimes for a single media
stream transported by RTP. In this document, we prefer to use the
less ambiguous term conference except where existing protocols use
the term session.
6.1. Controlling Multimedia Servers
+---------+---------------+---------------------------------+
|Protocol | Documentation | Purpose                         |
+---------+---------------+---------------------------------+
|RTSP     | RFC 2326      | Remote control of AV playback   |
|         |               | and recording servers           |
+---------+---------------+---------------------------------+
The Real Time Streaming Protocol (RTSP) provides a standard way to
remotely control a multimedia server. While primarily aimed at web-
based media-on-demand services, RTSP is also well suited to provide
VCR-like controls for audio and video streams, and to provide
playback and record functionality of RTP data streams. A client can
specify that an RTSP server plays a recorded multimedia session into
an existing multicast-based conference, or can specify that the
server should join the conference and record it.
6.2. Protocols for Non-A/V Applications
Applications other than audio and video have evolved in Internet
conferencing, e.g. Imm, Wb [8], NTE [11]. Such applications can be
used to substitute for meeting aids in physical conferences
(whiteboards, projectors) or replace visual and auditory cues that
are lost in teleconferences (e.g., a speaker list application); they
also can enable new styles of joint work.
Most non-A/V applications have in common that the application
protocol is about establishing and updating a shared state. Loss of
information is often not acceptable, so some form of multicast
reliability is required. The applications' requirements differ: Some
applications make per-participant additions to the shared state that
are orthogonal to each other (e.g., whiteboards), some evolve a more
closely interrelated common state (e.g., additions to a speaker list
must be properly sequenced). Some applications can make use of added
bandwidth/react to congestion in an elastic way, others transport
data that, although not strictly real-time, is time-critical.
In the IRTF research group on Reliable Multicast [13], work is in
progress on common protocol elements that can be used in such
applications. At the time of writing, some aspects of reliable
multicast are not well-understood, such as the proper way to provide
congestion control in a multi-sender multicast environment. As
congestion control is considered an essential element, standards
track protocols are not expected before this can be solved.
7. Conference Setup
There are two basic forms of conference discovery mechanism: session
advertisement and session invitation. Sessions are advertised using
a session directory, and users are invited to join a session using a
session invitation protocol such as SIP or H.323.
7.1. Session Directories
+----------+------------------+----------------------------------+
|Protocol  | Documentation    | Purpose                          |
+----------+------------------+----------------------------------+
|SDP       | RFC 2327         | Session description format       |
|SAP       | Internet draft   | Multicast session announcements  |
+----------+------------------+----------------------------------+
The rendezvous mechanism for many light-weight sessions is a
multicast based session directory. This ``broadcasts'' session
descriptions [9] to all the potential session participants. These
session descriptions provide an advertisement that the session will
exist, and also provide sufficient information including multicast
addresses, ports, media formats and session times so that a receiver
of the session description can join the session. The session
description protocol (SDP) describes the content and format of a
multimedia session, and the session announcement protocol (SAP) is
used to distribute it to all potential session recipients.
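As a rough illustration of the information a session description
carries, the sketch below parses a small description written in the
"type=value" line format of RFC 2327. The session itself (user
name, addresses, ports and times) is made up for the example, and the
parser is deliberately naive: it does not distinguish session-level
from media-level fields and performs no validation.

```python
# Hypothetical session description; all values are invented.
SDP_EXAMPLE = """\
v=0
o=alice 2890844526 2890842807 IN IP4 192.0.2.10
s=Example seminar
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31
"""


def parse_sdp(text):
    """Collect SDP lines by type letter; 'm=' lines are split into fields."""
    session = {"media": []}
    for line in text.splitlines():
        if len(line) < 2 or line[1] != "=":
            continue  # not a "type=value" line
        key, value = line[0], line[2:]
        if key == "m":
            media, port, proto, *formats = value.split()
            session["media"].append({"media": media, "port": int(port),
                                     "proto": proto, "formats": formats})
        else:
            session.setdefault(key, []).append(value)
    return session


description = parse_sdp(SDP_EXAMPLE)
print(description["s"], description["c"])
for stream in description["media"]:
    print(stream)
```

The connection ("c=") and media ("m=") lines together give a receiver
the multicast address, ports and formats it needs in order to join.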
This mechanism can also be used to advertise tightly coupled
sessions, and only requires that additional information about the
mechanism to use to join the session is given. However, as the
number of sessions in the session directory grows, we expect that
only larger-scale public sessions will be announced in this manner,
and smaller, more private sessions will tend to use direct invitation
rather than advertisement.
7.2. Session Invitation
+----------+----------------+----------------------------------------------+
|Protocol  | Documentation  | Purpose                                      |
+----------+----------------+----------------------------------------------+
|SIP       | RFC 2543       | initiating multimedia calls and conferences  |
+----------+----------------+----------------------------------------------+
Not all sessions are advertised, and even those that are advertised
may require a mechanism to explicitly invite a user to join a
session. Such a mechanism is required regardless of whether the
session is a lightweight session or a more tightly coupled session,
although the invitation system must specify the mechanism to be used
to join the session.
As users are mobile, it is important that such an invitation
mechanism is capable of locating and inviting a user in a location
independent manner. Thus user addresses need to be used as a level
of indirection rather than routing a call to a specific terminal.
The invitation mechanism should also provide for alternative
responses, such as leaving a message or being referred to another
user, should the invited user be unavailable.
The Session Initiation Protocol (SIP) provides a mechanism whereby a
user can be invited to participate in a conference. SIP does not
care whether the session is already ongoing, or is just being
created, and it doesn't care whether the conference is a small
tightly coupled session or a huge broadcast -- it merely conveys an
invitation to a user in a timely manner, inviting them to
participate, and provides enough information for them to be able to
know what sort of session to expect. Thus although SIP can be used
to make telephone-style calls, it is by no means restricted to that
style of conference.
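An invitation of this kind might look roughly like the message built
below. This is a sketch only: it uses a small subset of the header
fields defined for SIP in RFC 2543, the helper name and all addresses
are hypothetical, and no transport, retransmission or response
handling is shown. The body is assumed to be a session description
for the conference being joined.

```python
def build_invite(caller, callee, call_id, via_host, sdp_body):
    """Return a SIP INVITE request (as bytes) carrying a session description."""
    body = sdp_body.encode("utf-8")
    lines = [
        "INVITE sip:%s SIP/2.0" % callee,
        "Via: SIP/2.0/UDP %s" % via_host,
        "From: <sip:%s>" % caller,
        "To: <sip:%s>" % callee,
        "Call-ID: %s" % call_id,
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        "Content-Length: %d" % len(body),
    ]
    # Header lines end with CRLF; an empty line separates them from the body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body


# Hypothetical users and hosts.
sdp = "v=0\r\ns=Example seminar\r\nm=audio 49170 RTP/AVP 0\r\n"
invite = build_invite("alice@example.com", "bob@example.org",
                      "12345@host.example.com", "host.example.com", sdp)
print(invite.decode("utf-8"))
```

The request is addressed to a user rather than to a terminal, which
is what allows the invitation to be redirected or proxied to wherever
the user currently is.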
8. Security
There is a temptation to believe that multicast is inherently less
private than unicast communication since the traffic visits so many
more places in the network. In fact, this is not the case except
with broadcast and prune type multicast routing protocols [4].
However, IP multicast does make it simple for a host to anonymously
join a multicast group and receive traffic destined to that group
without the other senders' and receivers' knowledge. If the
application requirement (conference policy) is to communicate between
some defined set of users, then strict privacy can only be enforced
in any case through adequate end-to-end encryption.
RTP specifies a standard way to encrypt RTP and RTCP packets using
symmetric (private-key) schemes such as DES [24]. It also specifies a
standard mechanism to manipulate plain text keys using MD5 [25] so
that the resulting bit string can be used as a DES key. This allows
simple out-of-band mechanisms such as privacy-enhanced mail to be
used for encryption key exchange.
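The idea of turning a plain text key into a DES key can be sketched
as follows. Note the assumptions: this sketch simply takes the first
eight octets of the MD5 digest of the UTF-8 encoded pass phrase,
whereas the RTP specification and its profile define the exact
canonical form of the pass phrase and the key extraction rule. The
function name is hypothetical, and no encryption is performed here.

```python
import hashlib


def des_key_from_passphrase(passphrase):
    """Derive 8 octets of DES key material from a text pass phrase."""
    digest = hashlib.md5(passphrase.encode("utf-8")).digest()  # 16 octets
    # Assumption of this sketch: use the leading 8 octets of the digest.
    return digest[:8]


# Hypothetical pass phrase exchanged out of band, e.g. by secure mail.
key = des_key_from_passphrase("conference pass phrase")
print(key.hex())
```

Because the derivation is deterministic, every participant who has
been given the same pass phrase arrives at the same key without any
further protocol exchange.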
8.1. Authentication and Key Distribution
+----------+----------------------------+---------------------------+
|Protocol  | Documentation              | Purpose                   |
+----------+----------------------------+---------------------------+
|PGP       | RFC 1991                   | public key cryptography   |
|X.509     | ITU recommendation X.509   | directory authentication  |
+----------+----------------------------+---------------------------+
Key distribution is closely tied to authentication. Conference or
session directory keys can be securely distributed using public-key
cryptography on a one-to-one basis (by email, a directory service, or
by an explicit conference setup mechanism), but this is only as good
as the certification mechanism used to certify that a key given by a
user is the correct public key for that user. Such certification
mechanisms [3] are however not specific to conferencing, and it looks
likely that certificates such as those provided by PGP will be most
widely used in the near term.
Session keys can be distributed using encrypted Session Descriptions
carried in SIP session invitations, or in encrypted session
announcements as described below. Neither of these mechanisms
provides for changing keys during a session as might be required in
some tightly coupled sessions, but they are probably sufficient for
many uses in the context of lightweight sessions.
Even without privacy requirements in the conference policy, strong
authentication of a user is required if making a network reservation
results in usage based billing.
8.2. Encrypted Session Announcements
+----------+------------------+----------------------------------+
|Protocol  | Documentation    | Purpose                          |
+----------+------------------+----------------------------------+
|SAP       | Internet draft   | multicast session announcements  |
+----------+------------------+----------------------------------+
Session Directories can make encrypted session announcements using
private key encryption, and carry the encryption keys to be used for
each of the conference media streams in the session. Whilst this
does not solve the key distribution problem, it does allow a single
conference to be announced more than once to more than one key-group,
where each group holds a different session directory key, so that the
two groups can be brought together into a single conference without
having to know each other's keys.
8.3. Secured ``Broadcasts''
Figure 6: Joining a light-weight multimedia session
User A     |                                             |
creates    |  SDP/SAP                                    |
conference |----------->                                 |
           |                                             |User B
           |  SDP/SAP                           IGMP     |starts
           |----------->             IGMP   /--<---------|session
           |              IGMP    /-<------/             |directory
           |-----------<---------/                       |
           |                                             |
           |  SDP/SAP                                    |
           |-------------------------------------------->|
           |                                             |
User A     |                                             |
starts     |  RTP                                        |
sending    |===========>                                 |
           |  RTCP                                       |
           |----------->                                 |
           |                                             |
           |  RTP                                        |
           |===========>                                 |
           |                                             |
           |  RTP                                        |User B
           |===========>                        IGMP     |joins
           |                         IGMP   /--<---------|conference
           |              IGMP    /-<------/             |
           |-----------<---------/                       |User's App
           |                                    RTCP     |Sends RTCP
           |  RTP                           /--<---------|Session
           |===============================/============>|Message
           |<-----------------------------/              |
           |  RSVP Path Message                          |
           |-------------------------------------------->|
           |                                             |User's App
           |  RTP                            /-----------|makes
           |================================/===========>|reservation
           |  RSVP RESV Message      /-----/             |
           |<-----------------------/                    |
           |                                             |
           |  RTP                                        |Quality
           |============================================>|of Service
           |                                             |improves
While private-key encryption is sufficient to exclude non-members
from sending or receiving multicast conference traffic, it does mean
that all members of a session are equal. This is normally acceptable
for multi-way conferences, but will not be acceptable for many
broadcasters who require the ability to ensure that only they can
send, perhaps in addition to ensuring that only their paid customers
can receive. This is nicely illustrated by the multicast of the
Rolling Stones concert in 1994 which was billed as being the first
live concert on the Mbone. In fact, this honour goes to a little
known band called Severe Tire Damage who had multicast an impromptu
concert a year previously. To make their point, just before the
Stones were due to go on stage, Severe Tire Damage suddenly started
broadcasting one of their songs live to the same multicast group.
Clearly commercial broadcasters want to avoid occurrences like this
one.
Such secured broadcasts can be performed by digitally signing each
packet, i.e. encrypting a hash of the packet with the sender's
private key of a public-private key pair. The public key is then
given to the receivers, and they discard (and prune if possible) any
packets that do not carry a valid signature. The problem with this
is that even encrypting a 128 bit hash with a public key algorithm
can be relatively expensive at the high packet rates sometimes seen
with video. The use of
public-key cryptography for this purpose has not yet been
standardized, but some such mechanism will clearly be needed before
the Mbone becomes an acceptable environment for commercial
broadcasters.
9. Summary
This document is an attempt to gather together in one place the set
of assumptions behind the design of the Internet Multimedia
Conferencing architecture, and the services that are provided to
support it.
Figure 6 shows an example time sequence involved in setting up a
light-weight session between two sites. In this case, site A creates
a session advertisement, and some time later starts sending a media
stream even though there may be no receiver at that time. Some time
later, site B joins the session (the multicast routing protocol here
is PIM), and starts to receive the traffic. At the earliest
opportunity site B also makes an RSVP reservation to ensure the flow
quality is satisfactory. This example should be taken as
illustrative only -- there are different ways to join sessions, and
different ways to get improved quality of service.
The lightweight sessions model for Internet multimedia conferencing
may not be appropriate for all conferences, but for those sessions
that do not require tightly-coupled conference control, it provides
an elegant style of conferencing that scales from two participants to
millions of participants. It achieves this scaling by virtue of the
way that multicast routing is receiver driven, keeping essential
information about receivers local to those receivers. Each new
participant only adds state close to them in the network. It also
scales by not requiring explicit conference join mechanisms; if
everyone were to need to know exactly who is in the session at any
time, the scaling would be severely adversely affected. RTCP
provides membership information that is accurate when the group is
small, and increasingly only a statistical representation of the
membership as the group grows. Security is handled through the use
of encryption rather than through the control of data distribution.
For those that require tightly coupled conferences, solutions such as
H.323 are emerging there too.
There are still many parts of this architecture that are incomplete,
and are still the subject of active research. In particular,
differentiated services for better-than-best-effort service show
great promise to provide a more scalable alternative to individual
reservations. Multicast routing scales well to large groups, but
scales less well to large numbers of groups; we expect this will
become the subject of significant research over the next few years.
Multicast congestion control mechanisms are still a research topic,
although in the last year several schemes have emerged that show
promise. Layered codecs show great promise to allow conferences to
scale in the face of heterogeneity, but the join and leave mechanisms
that allow them to perform receiver-based congestion control are
still being examined. We have several working examples of reliable-
multicast-based shared applications; the next few years should see
the start of standardization work in this area as appropriate
multicast congestion control mechanisms emerge. Finally a complete
security architecture for conferencing would be very desirable;
currently we have many parts of the solution, but are still waiting
for an appropriate key-distribution architecture to emerge from the
security research community.
The Internet Multimedia Conferencing architecture and the Mbone have
come a long way from their early beginnings on the DARTnet testbed in
1992. The picture is not yet finished, but it has now taken shape
sufficiently that we can see the form it will take. Whether or not
the Internet does evolve into the single communications network that
is used for most telephone, television, and other person-to-person
communication, only time will tell. However, we believe that it is
becoming clear that if the industry decides that this should be the
case, the Internet should be up to the task.
10. Acknowledgments
Acknowledgments are due to the End-to-End Research Group, the Int-
serv, RSVP, MMUSIC and AVT working groups of the IETF, and discussion
with colleagues at UCL. The earliest clear exposition of some of the
ideas here was presented at ACM SIGCOMM 1994 in London by Van
Jacobson.
11. Authors' Addresses
Mark Handley
AT&T Center for Internet Research at ICSI
1947 Center St, Suite 600
Berkeley, CA 94704
EMail: mjh@aciri.org
Jon Crowcroft,
Department of Computer Science
University College London
Gower Street,
London WC1E 6BT, UK.
Email: j.crowcroft@cs.ucl.ac.uk
Carsten Bormann, Joerg Ott
Universitaet Bremen TZI
Postfach 330440
D-28334 Bremen, GERMANY.
Email: cabo@tzi.org, jo@tzi.org
References
[1] A. Ballardie, P. Francis, J. Crowcroft, ``An Architecture for
Scalable Inter-Domain Multicast Routing'', ACM SIGCOMM 1993, pp
85-95.
[2] C. Bormann, ``Providing integrated services over low-bitrate
links,'' RFC2689, September 1999.
[3] CCITT (Consultative Committee on International Telegraphy and
Telephony). ``Recommendation X.509: The Directory --
Authentication Framework.'' 1988.
[4] S. Deering, C. Partridge, D. Waitzman, ``Distance Vector
Multicast Routing Protocol'', RFC 1075, Nov 1988.
[5] Steve Deering, ``Multicast Routing in Internetworks and Extended
LANs'', ACM SIGCOMM 88, August 1988, pp 55-64 and ``Host
Extensions for IP Multicasting'', RFC 1112.
[6] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C-G. Liu, L.
Wei ``An Architecture for Wide Area Multicast Routing'' ACM
SIGCOMM 1994, pp 126-135.
[7] Estrin, Farinacci, Helmy, Thaler, Deering, Handley, Jacobson,
Liu, Sharma, Wei, ``Protocol Independent Multicast-Sparse Mode
(PIM-SM): Protocol Specification'', RFC 2362.
[8] S. Floyd, V. Jacobson, S. McCanne, C-G. Liu, L. Zhang, ``A
Reliable Multicast Framework for Light-weight Sessions and
Application Level Framing'' ACM SIGCOMM 1995, pp 342-356.
[9] M. Handley, V. Jacobson, ``SDP: Session Description Protocol''
INTERNET-DRAFT, Dec 1997.
[10] M. Handley, D. Thaler, D. Estrin, ``The Internet Multicast
Address Allocation Architecture'', INTERNET-DRAFT, Dec 1997.
[11] M. Handley, J. Crowcroft, ``Network Text Editor (NTE): A
scalable shared text editor for the MBone'', ACM SIGCOMM 1997.
[12] V. Hardman, A. Sasse, M. Handley, A. Watson, ``Reliable Audio
for Use over the Internet'' Proc INET '95, Hawaii, Internet
Society, Reston, VA, 1995.
[13] IRTF Research Group on Reliable Multicast,
http://www.east.isi.edu/RMRG/
[14] ITU ``Recommendation H.320: Narrow-band visual telephone systems
and terminal equipment'', ITU, Geneva, 1997
[15] ITU ``Recommendation T.124 -- Generic Conference Control'', ITU,
Geneva.
[16] ITU ``Recommendation H.323: Visual telephone systems and
equipment for local area networks which provide a non guaranteed
quality of service'', ITU, Geneva, 1996
[17] ITU ``Recommendation H.332: H.323 Extended for Loosely-Coupled
conferences'', ITU, Geneva
[18] C. Bormann, J. Ott, C. Reichert, ``Simple Conference Control
Protocol'' INTERNET-DRAFT, June 1996.
[19] V. Jacobson, ``Congestion Avoidance and Control'', ACM SIGCOMM
1988.
[20] J. Linn, ``Privacy Enhancement for Internet Electronic Mail:
Part I: Message Encryption and Authentication Procedures'', RFC
1421, Feb 1993
[21] S. McCanne, V. Jacobson and M. Vetterli, ``Receiver-driven
Layered Multicast''. ACM SIGCOMM 1996, pp. 117-130.
[22] S. McCanne, M. Vetterli, ``Joint Source/Channel Coding for
Multicast Packet Video''. Proceedings of the IEEE International
Conference on Image Processing. October, 1995. Washington, DC.
[23] J. Moy, ``Multicast Extensions to OSPF'', RFC 1584, March 1994.
[24] National Institute of Standards and Technology (NIST), ``FIPS
Publication 46-1: Data Encryption Standard'', January 22, 1988
[25] Rivest, R., ``The MD5 Message-Digest Algorithm'', RFC 1321, MIT
Laboratory for Computer Science and RSA Data Security, Inc.,
April 1992
[26] Schooler, E., A Distributed Architecture for Multimedia
Conference Control, ISI Research Report ISI/RR-91-289, November
1991. ftp://ftp.isi.edu/pub/hpcc-papers/mmc/mmcc.ps
[27] Schulzrinne, H., ``Personal Mobility for Multimedia Services in
the Internet'' IMDS'96, March 4-6 1996.
ftp://ftp.fokus.gmd.de/pub/step/papers/Schu9603:Personal.ps.gz
[28] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson ``RTP: A
Transport Protocol for Real-Time Applications'' RFC 1889.
[29] D. Thaler, D. Estrin, D. Meyer, ``Border Gateway Multicast
Protocol'', INTERNET-DRAFT, Oct 1997.