Internet Engineering Task Force                    MBONED Working Group
INTERNET-DRAFT                                           Kevin Almeroth
draft-ietf-mboned-mrm-use-00.txt                                   UCSB
Expires August 1999                                          Liming Wei
                                                     cisco Systems, Inc
                                                      February 26, 1999


                   Justification for and use of the
               Multicast Routing Monitor (MRM) Protocol


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.  Internet-Drafts are
   working documents of the Internet Engineering Task Force (IETF),
   its areas, and its working groups.  Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).


Abstract

This document motivates the need for the Multicast Routing Monitor
(MRM) [MRM] protocol by describing the niche that exists for a
router-based multicast management protocol.  Using the "sufficient
and necessary" argument, we suggest that existing protocols and
techniques lack important management functionality.  This document
briefly describes the methodology used by MRM, justifies the
existence of MRM, and describes some of the scenarios in which MRM
will be of value.


1. Introduction

The Multicast Routing Monitor (MRM) protocol has been designed to
assist in the detection and isolation of network faults related to
the delivery of multicast traffic [MRM].  In particular, management
functions offered by MRM are specifically designed to monitor routing
operation, and assist in the investigation of routing anomalies and
connectivity problems.

MRM has been designed with consideration for the other types of
multicast management protocols and tools that are available.  As

Almeroth, Wei                                                 [Page 1]


INTERNET-DRAFT     draft-ietf-mboned-mrm-use-00.txt      February 1999



we will show, even though there are a wide variety of tools
available today, there is a need for a router-based monitoring
protocol.  The justification for MRM as a new protocol has followed
the ``necessary and sufficient'' premise.  MRM is being developed
because it is necessary when comparing its functions to those
offered by alternatives like the Real Time Control Protocol (RTCP) [RTP]
and the Simple Network Management Protocol (SNMP) [SNMPV1,SNMPV2].

Furthermore, MRM is being developed because it is sufficient in
providing the functions needed by its target class of applications.
Using this reasoning, MRM will offer functions and provide multicast
traffic management that no other protocol currently offers.


2. Overview of MRM

MRM is a protocol intended to be implemented in both routers and
end stations.  The operation of MRM is based on communication and
coordination between three types of network entities.

   * MRM Manager:  The MRM manager provides an interface enabling
     a user to configure and execute tests, and then collect and
     present results.  The MRM manager communicates with MRM testers
     who are instructed to source and/or sink multicast traffic.
     The MRM manager, through beacon messages, also maintains and
     modifies the set of MRM testers.

   * Test Sender (TS):  A test sender is responsible for sourcing
     multicast traffic.  TSs will receive authenticated
     requests from the MRM manager and will send a specified number
     of multicast packets to a specified multicast address with a
     specified inter-transmission time between packets.

   * Test Receiver (TR): Based on instructions from an MRM manager,
     a test receiver is expected to either explicitly join a
     multicast group or simply monitor traffic on a specified group
     address.  Based on thresholds specified by the MRM manager, a
     TR will report faults.  Additionally, the MRM manager may
     request TR reports regardless of whether any thresholds were
     violated.  One of the keys to scalability is ensuring that a
     large number of TRs don't overwhelm the MRM manager with
     traffic.  Scalability is handled using a combination of
     techniques including report suppression and aggregation.
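
As a rough illustration of the roles above, the following Python
sketch models a Test Sender request and a Test Receiver threshold
check.  All names used here (SenderRequest, ReceiverThreshold,
max_loss_fraction, and so on) are hypothetical and chosen only for
illustration; they are not MRM message formats or field names, which
are defined in [MRM].  The sketch also assumes packet loss is the
monitored quantity, which is only one of the possibilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenderRequest:
    """Hypothetical view of what a manager asks a Test Sender to do."""
    group: str          # multicast group address to send to
    packet_count: int   # number of test packets to send
    interval_s: float   # inter-transmission time between packets


@dataclass(frozen=True)
class ReceiverThreshold:
    """Hypothetical fault threshold handed to a Test Receiver."""
    max_loss_fraction: float  # report a fault above this loss fraction


def send_schedule(req):
    """Return the send-time offsets (in seconds) implied by a request."""
    return [i * req.interval_s for i in range(req.packet_count)]


def loss_fraction(expected, received):
    """Fraction of expected packets that never arrived."""
    if expected <= 0:
        return 0.0
    return max(0, expected - received) / expected


def should_report(threshold, expected, received, polled=False):
    """Report on a threshold violation, or whenever the manager polls."""
    if polled:
        return True
    return loss_fraction(expected, received) > threshold.max_loss_fraction
```

In this sketch a receiver stays silent unless its threshold is
exceeded or the manager asks for a report, which mirrors the two
reporting cases described for the TR.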

As the MRM protocol specification indicates about itself, ``it only
specifies the types of information a MRM manager can obtain, and the
protocol used to acquire such information.  How an MRM manager processes
or presents the diagnostic information is an implementation issue.''
These functions are expected to be provided using companion management
tools.  Furthermore, the MRM protocol specification does not fully
describe the scenarios in which MRM is expected to be useful.  Such
functions and scenarios are described in Section 4 of this document.


3. Justification

MRM provides a set of functions not provided by any of the commonly
used MBone debugging protocols and tools.  Most of the tools used
in the MBone today fall into one of three categories: (1) SNMP-based
tools like Mview [MVIEW] or Mstat [MSTAT];  (2) RTCP-based tools like
Mhealth [MHEALTH], RTPmon [RTPMON], or MultiMON [MULTIMON]; or
(3) multicast route tracing tools like Mtrace [MTRACE1,MTRACE2].
MRM, in addition to being an independent management tool, can be
used in conjunction with these other tools to provide a richer set
of management functions.  Some of the reasons why the above protocols
or tools fall short are discussed in the following paragraphs.

   * SNMP:  SNMP provides a mechanism to poll devices for
     information or to have alarms generated when certain events
     occur.  The problem with SNMP is that a wide-ranging failure
     could potentially overwhelm a management station.  For example,
     consider a scenario in which SNMP agents in a particular
     multicast tree are configured to generate an alarm if the packet
     loss exceeds a certain level.  Then consider the implosion that
     would occur if a link close to the root becomes congested, and
     a majority of group members generated alarms.  This scenario
     demonstrates the basic drawback of SNMP: a general lack of
     scalability, especially considering the large number of
     devices/hosts that may be involved in a multicast group.
     Scalability arguments do not preclude the use of SNMP, but a
     manager using SNMP to manage multicast would have to be extremely
     careful in deciding how to configure the network.  In fact,
     properly configuring network devices to provide sufficient
     management information while avoiding management-induced congestion
     or implosion may be prohibitive in most networks.

   * RTCP:  RTCP has a much more scalable feedback mechanism but it
     has its own deficiencies.  The scalability of RTCP is based on
     a random wait time chosen from an interval calculated by each
     group member and based on an estimate of the overall group size.
     The larger the group, the larger the wait interval, and the longer
     the average inter-packet time between RTCP feedback messages.
     The goal of the RTCP feedback mechanism is to consume bandwidth
     equal to 5% of the data traffic rate.  While this algorithm seems
     reasonable, it can be problematic as a tool to manage multicast
     traffic.  Some reasons include:

     o RTCP feedback is multicast to all group members, and given
       that receivers will have heterogeneous bandwidth capabilities,
       even scalable feedback has the potential to overwhelm some
       receivers.

     o In many management applications there is no need for feedback
       data to be transmitted to all group members.  And if privacy
       is an issue, the group-wide delivery of RTCP is even less
       desirable.

     o RTCP feedback provides only a single, end-to-end loss and
       jitter value.  More generally, RTCP contains only a very small
       amount of information useful for debugging purposes.  MRM is
       designed to include a broader range of information, including
       packet duplication statistics, and also to be extensible.

     While some of the problems with RTCP are being addressed by
     redefining the standard to allow more flexibility in the use of
     RTCP [RTPNEW], these efforts do not solve all of the problems.
     The most critical deficiency of RTCP is a lack of detailed
     routing information.  In particular, when trying to
     isolate routing faults, the end-to-end style feedback provided
     by RTCP is unlikely to have sufficient granularity.  To address
     this problem, some RTCP-based tools are used in combination with
     other tools.  For example, RTPmon and mtrace are commonly used
     together.  However, the major drawback of this solution is that
     it fails to provide the sort of traffic origination and flexible
     group membership services offered by MRM.

   * Mtrace:  Mtrace is a tool designed to provide hop-by-hop path
     information for a specific source and destination.  It is a
     useful tool for figuring out a multicast path and round trip
     information.  For a specific group, mtrace will also tell a user
     hop-by-hop packet loss.  Coupled with RTCP feedback, mtrace can
     be used to monitor many of the relevant factors for an active
     source and group including per-receiver loss, hop-by-hop loss,
     tree topology, jitter, and round trip time.  Several tools in
     development, including Mhealth, provide a graphical real-time
     display of group statistics.  However, mtrace (coupled with other
     tools) only provides information about active groups.  Attempting
     to do fault detection, or more specifically, fault pre-detection,
     is nearly impossible.  The common paradigm today is to gather a
     set of willing participants who then join a ``debugging'' session.
     Further complicating the problem is that sometimes, starting an
     MBone tool in a remote location to receive and transmit RTCP
     reports is not possible.  One solution to this problem is a
     ``dumb'', non-GUI tool that simply receives and responds to an
     RTP stream.  While tools like this have been discussed, none are
     widely available, and even if they were, attempting to rapidly
     configure and change group membership would be laborious at best.
     MRM is designed with the specific purpose of facilitating
     on-the-fly, ad hoc test multicast senders and receivers to test
     a variety of multicast group configurations.
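
To make the RTCP scaling behavior discussed above concrete, the
following Python sketch computes an approximate report interval as a
function of group size.  It is a simplification of the interval
computation in [RTP]: it keeps the 5% bandwidth share, the 5-second
minimum, and the randomization over [0.5, 1.5] times the computed
value, but omits the sender/receiver bandwidth split and other
refinements.  The function and parameter names are our own, and the
numbers in the usage note are illustrative only.

```python
import random

RTCP_FRACTION = 0.05   # share of session bandwidth used for RTCP
MIN_INTERVAL_S = 5.0   # floor on the report interval, in seconds


def rtcp_interval(members, session_bw_bps, avg_rtcp_bits, rng=None):
    """Approximate time between RTCP reports from one group member.

    members        -- estimated number of group members
    session_bw_bps -- data traffic rate in bits per second
    avg_rtcp_bits  -- average RTCP packet size in bits
    rng            -- optional random.Random used to randomize the result
    """
    rtcp_bw = RTCP_FRACTION * session_bw_bps
    # All members share rtcp_bw, so the interval grows with group size.
    interval = max(MIN_INTERVAL_S, members * avg_rtcp_bits / rtcp_bw)
    if rng is None:
        return interval
    # Randomize to keep members from reporting in lockstep.
    return interval * rng.uniform(0.5, 1.5)
```

For example, under these assumptions a 128 kbit/s session with
800-bit reports stays at the 5-second floor for 10 members, while
1000 members would each report roughly every 125 seconds.  Feedback
from any one receiver therefore becomes very infrequent in large
groups.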



4. Scenarios for Use of MRM

MRM is designed to provide automated fault detection and isolation
services for multicast traffic.  In order to support these services
with any kind of automation, MRM must be both flexible and scalable.
MRM scalability implies the ability to detect faults without raising
so many alarms that delivering the alarm messages causes additional
problems.  One problem, in particular, is response implosion
at the MRM manager.  MRM flexibility implies the ability to isolate
faults by sourcing traffic from anywhere in the network and collecting
statistics from any node or subset of nodes.  In addition to basic
fault detection and isolation, MRM is intended to provide more advanced
functions.  These extended functions include:

   * Fault logging and real-time (passive) monitoring functions.

   * Proactive testing (fault isolation), including service
     provisioning and impact analysis.

The remainder of this section is dedicated to the description of
scenarios in which MRM functions are expected to be used.

   * Pre-Event Testing:  One of the best examples of this type of
     scenario is the MBone delivery of two audio/video channels from
     the IETF meetings held three times a year all over the world.
     Preceding the week of meetings for each IETF, staff members
     install a terminal room and establish network connectivity
     including multicast capability.  In some cases, setup activities
     occur weeks, days, or hours before the first meeting Monday
     morning.  Verifying that multicast routing is working both into
     and out of the IETF meeting rooms can be a challenge.
     Verification is especially challenging because the IETF meetings
     have a world-wide audience.  Ensuring that multicast is working
     at even a small number of remote sites is difficult.  One
     problem that sometimes occurs is that the MBone equipment,
     including cameras and workstations, may not be available when
     the network is first turned on.  In these cases, there are no
     multicast-capable sources or receivers inside the IETF network.
     MRM would alleviate this problem by allowing testing of multicast
     in both directions.  Furthermore, MRM would also allow someone not
     yet on site to test multicast connectivity.  Relatively extensive
     testing can be performed by choosing a set of Test Receivers
     representative of the world-wide distribution of actual IETF
     participants.  MRM would allow the IETF staff and the ISP to
     observe where major network bottlenecks are occurring.  In some
     cases, early discovery of problems could lead to fixes in time
     for the event.

     These techniques, used for pre-event testing at ``nomadic
     events'', would also be appropriate for estimating the quality
     of event transmissions in ``non-nomadic'' networks.  Instead of
     the IETF or an academic conference, an MRM manager might want to
     estimate the loss, delay, and jitter for a frequently scheduled
     event like an MBone lecture or company event. Instead of waiting
     until the event starts and using a tool like RTPmon, an MRM
     manager can set up a test session any time before the session
     starts, and evaluate the quality to most, if not all, of the
     critical company locations.  In the MBone today, if a transmitter
     wants to perform this kind of testing, the transmitter will,
     out-of-band, have to ask several friends to join a test session
     and then send a multicast stream and monitor RTCP reports.
     Obviously, this method is not very compelling.

   * Classic Fault Isolation:  A second scenario in which MRM is designed
     to assist a network manager is classic fault isolation.  Like
     unicast routing, multicast routing problems can be very difficult
     to debug.  And unlike unicast routing, the additional
     complexities of providing efficient, one-to-many delivery can
     introduce additional bugs that are difficult to find.  To date,
     a significant number of strategies, tools, and techniques have
     been developed, built, and proposed [MDH].  However, these
     attempts generally require a significant level of multicast
     routing expertise and experience, characteristics not always
     found among NOC personnel.  As a result, MRM is designed to offer
     a layer of abstraction between multicast route management and the
     intricacies of multicast routing.  MRM is also designed not to be
     completely independent of the strategies, tools, and techniques
     already in use today.  MRM and existing tools can work in concert
     to isolate multicast routing problems.

     MRM's design offers some important flexibility in isolating
     multicast routing faults.  In particular, the ability to specify
     a transmission rate allows a manager to closely inspect single,
     infrequently transmitted packets.  Also, the ability to easily
     add and remove members from the group of Test Receivers allows a
     manager to quickly and efficiently affect the topology of the
     multicast tree.

   * Session Monitoring:  The scenarios discussed so far followed
     logically, from verifying multicast connectivity to isolating
     any potential faults.  The next key scenario is monitoring of
     existing, active sessions.  Such groups will have a well-known
     multicast address, and might be exchanging group membership
     information via RTCP reports or some other out-of-band mechanism.
     If the group is small, and feedback from each receiver is
     important, the set of test receivers can be configured to send
     reports to the MRM manager via unicast.  If the group is large
     and complete feedback is not necessary, the set of test receivers
     can be frequently adjusted to represent some statistical sampling
     of the group.  The ability to send statistical reports via unicast
     helps to improve the scalability of session monitoring by not
     overwhelming all receivers with all reports.  Finally, if the
     group is using multicast tools that do not use RTCP and use no
     real-time signaling, generating a real-time list of group
     members may be difficult.  Other techniques will have
     to be used.  One network-layer approach might be to use SNMP
     information to find the set of links in the multicast tree.  A
     simpler approach might depend on other available information
     like the fact that most users start the multicast tool via a WWW
     page.  In this case, HTTP server logs can be used to estimate
     group membership.

   * Fault Logging:  In the case when session monitoring identifies
     the existence of a fault, a range of fault logging functions may
     be required.  At one extreme, the MRM manager may simply need to
     be alerted when faults occur so that appropriate investigative
     measures can be taken.  At the other extreme, service contracts
     may depend on the provision of service with certain guarantees.
     Any outages will need to be closely tracked.  These two extremes
     again demonstrate the need for MRM to be flexible.  In particular,
     when faults need to be closely monitored and logged, a wide-scale
     outage may itself cause a heavy load on the network.  While
     identifying the exact load capable of being supported by a
     distressed network is beyond the scope of MRM, MRM does and will
     support scalability and aggregation functions.
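
The statistical sampling mentioned under Session Monitoring can be
sketched as follows.  This is only one possible approach; how an MRM
manager chooses or rotates its Test Receivers is not specified here,
and the function names and the use of uniform random sampling are
assumptions made for illustration.  The member list is assumed to
come from whatever membership estimate is available (for example,
RTCP reports or HTTP server logs).

```python
import random


def pick_test_receivers(members, sample_size, rng):
    """Choose a random subset of known members to act as Test Receivers."""
    pool = sorted(set(members))          # drop duplicates, fix the order
    k = min(sample_size, len(pool))      # never ask for more than exist
    return rng.sample(pool, k)


def rotate_samples(members, sample_size, rounds, seed=0):
    """Draw a fresh sample per round so coverage spreads over time."""
    rng = random.Random(seed)
    return [pick_test_receivers(members, sample_size, rng)
            for _ in range(rounds)]
```

Drawing a new sample for each monitoring round keeps the number of
reporting receivers fixed while, over time, observing different parts
of the group.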


8.  Security

Security issues are discussed in the MRM protocol description [MRM].


9.  Authors' Addresses

Kevin Almeroth
Department of Computer Science
University of California
Santa Barbara, CA 93106-5110
USA
almeroth@cs.ucsb.edu

Liming Wei
cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134
USA
lwei@cisco.com


10. References

[MRM]      L. Wei, and D. Farinacci, "Multicast Routing Monitor (MRM)",
           IETF Internet-Draft, draft-ietf-mboned-mrm-*.txt,
           February 1999.

[RTP]      H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson,
           "RTP:  A Transport Protocol for Real-Time Applications",
           IETF RFC 1889, January 1996.

[SNMPV1]   J. Case, M. Fedor, M. Schoffstall, and J. Davin, "Simple
           Network Management Protocol", IETF RFC 1157, May 1990.

[SNMPV2]   J. Case, K. McCloghrie, M. Rose, and S. Waldbusser,
           "Protocol Operations for Version 2 of the Simple Network
           Management Protocol (SNMPv2)", IETF RFC 1905, January 1996.

[MVIEW]    D. Thaler, "Mview Tool",
           http://www.merit.edu/~mbone/mviewdoc/Welcome.html.

[MSTAT]    B. Fenner, et al., "Mstat", Available as part of mrouted at
           ftp://ftp.parc.xerox.com/pub/net-research/ipmulti/.

[MHEALTH]  D. Makofske, and K. Almeroth, "Mhealth -- Real-Time Multicast
           Tree Health Monitoring Tool", http://imj.ucsb.edu/mhealth/,
           August 1998.

[RTPMON]   A. Swan, and D. Bacher, "RTPmon",
           ftp://mm-ftp.cs.berkeley.edu/pub/rtpmon/, January 1997.

[MULTIMON] J. Robinson, and J. Stewart, "MultiMON 2.0 -- Multicast
           Network Monitor", http://www.merci.crc.ca/mbone/MultiMON/,
           August 1998.

[MTRACE1]  B. Fenner, et al., "Multicast Traceroute (mtrace) 5.2",
           ftp://ftp.parc.xerox.com/pub/net-research/ipmulti/,
           September 1998.

[MTRACE2]  B. Fenner, and S. Casner, "A `traceroute' Facility for IP
           Multicast", IETF Internet-Draft,
           draft-ietf-idmr-traceroute-ipm-*.txt, November 1995.

[RTPNEW]   H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson,
           "RTP:  A Transport Protocol for Real-Time Applications",
           IETF Internet-Draft, draft-ietf-avt-rtp-new-*.txt,
           November 1998.

[MDH]      D. Thaler, and B. Aboba, "Multicast Debugging Handbook",
           IETF Internet-Draft, draft-ietf-mboned-mdh-*.txt,
           October 1998.