INTERNET DRAFT Ali Boudani, Bernard Cousin
Expires: July 2003 Jean-Marie Bonnin
IRISA-France
February 2003
An Effective Solution for Multicast Scalability:
The MPLS Multicast Tree (MMT)
<draft-boudani-mpls-multicast-tree-03.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 except that the right to
produce derivative works is not granted.
Internet Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsolete by other documents
at anytime. It is inappropriate to use Internet Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
A multicast router should keep forwarding state for every multicast
tree passing through it. The number of forwarding states grows with
the number of groups. In this paper, we present a new approach, the
MPLS multicast Tree (MMT), which utilizes MPLS LSPs between
multicast tree branching node routers in order to reduce forwarding
states and enhance scalability. In our approach only routers that
are acting as multicast tree branching node routers for a group
need to keep forwarding state for that group. All other non-
branching node routers simply forward data packets by traffic
engineered unicast routing using MPLS LSPs. We can deduce that our
approach can be largely deployed because it uses for multicast
traffic the same unicast MPLS forwarding scheme. We will briefly
discuss the multicast scalability problem, related works and
different techniques for forwarding state reduction. We discuss also
the advantages of our approach, and conclude that it is feasible and
promising. Finally, we analytically evaluate our approach.
1 Introduction
Multicast has become increasingly important with the emergence of
network-based applications such as IP telephony, video conferencing,
Boudani & Cousin [Page 1]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
distributed interactive simulation (DIS) and software upgrading.
Using the multicast services, a single transmission is needed for
sending a packet to n destinations, while n independent
transmissions would be required using the unicast services. A
multicast routing protocol should be simple to implement, scalable,
robust, use minimal network overhead, consume minimal memory
resources, and inter-operate with other multicast routing protocols.
Multicast suffers from scalability problem with the number of
concurrently active multicast groups because it requires a router to
keep forwarding state for every multicast tree passing through it
and the number of forwarding states grows with the number of groups.
Scalability can be evaluated not only in terms of the overhead
growth in the presence of a large number of groups but also by the
number of participants per group and by groups for which the set of
participants changes often over time. Overhead can be measured in
terms of memory resources (in routers) as they relate to routing
states maintained per group and can be measured also by bandwidth
resources in terms of control or signaling messages per group and
also by processing power.
Recently, significant research effort has focused on the multicast
scalability problem. Some schemes attempt to reduce forwarding state
by tunneling [1] or by forwarding state aggregation [2]. Both these
works attempt to aggregate routing state after this has been
allocated to groups. Other architectures aim to eliminate
forwarding states at routers either completely by explicitly
encoding the list of destinations in the data packets, instead of
using a multicast address [3] or partially by using branching node
routers in the multicast tree [4,5].
The Xcast proposal [3], used for small groups, eliminates the need
for forwarding states. The source encodes the list of destinations
in the Xcast header, and then sends the packet to a router. Each
router along the way parses the header, partitions the destinations
based on each destination's next hop, and forwards a packet with an
appropriate Xcast header to each of the next hops.
In SEM [4], we proposed that the source uses unicast encoding for
multicast packets and sends them to its next hop branching node
routers. Each branching node router acts as a source and packets
travel from a branching node router to another. A special mechanism
was introduced to inform each branching node router about its next
hop branching node routers for a group.
REUNITE [5], uses recursive unicast trees to implement multicast
service. REUNITE does not use class D IP addresses. Instead, both
group identification and data forwarding are based on unicast IP
addresses. Only branching node routers for a group need to keep
multicast forwarding state. All other non-branching node routers
simply forward data packets by unicast routing.
We think that using the branching node routers to forward multicast
data packets in unicast mode is very efficient in order to reduce
forwarding state and thus enhance scalability. The main problem is
how to ensure that a branching node router has a complete knowledge
about its next hop branching node routers. And another issue is how
can we reduce the effect of encapsulated multicast packets in
Boudani & Cousin [Page 2]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
unicast packets. We think that a network information manager
system (NIMS) in each domain can be used to resolve the first
problem while the multi protocol label switching (MPLS) [9]
presents an efficient solution for the second problem.
A framework for MPLS multicast traffic engineering that explain IP
multicast deployment in MPLS environment was proposed by Ooms et
al [6]. Another studies about MPLS and multicast proposed by
Farinacci et al. [7] explain how to use PIM to distribute MPLS
labels for multicast routes. Other expired Internet drafts studied
the same subject [10].
The latest interesting proposal is aggregated multicast [8]. The key
idea of aggregated mulicast is that, instead of constructing a tree
for each individual multicast session in the core network, one can
have multiple multicast sessions share a single aggregated tree to
reduce multicast state and, correspondingly, tree maintenance
overhead at network core. In this proposal there was two
requirements: (1) original group addresses of data packets must be
preserved somewhere and can be recovered by exit nodes to determine
how to further forward these packets;(2) some kind of identification
for the aggregated tree which the group is using must be carried and
transit nodes must forward packets based on that. Instead of using
IP encapsulation as in SEM, which of course, adds complexity and
processing overhead, a potentially much better possibility is to use
MPLS [9]. To handle aggregated tree management and matching between
multicast groups and aggregated trees, a centralized management
entity called tree manager is introduced. In group to aggregated
tree matching, complication arises when there is no perfect match or
no existing aggregated tree covers a group. There was a disadvantage
in leaky matching because certain bandwidth should be wasted to
deliver data to nodes that are not involved for the group. In our
proposal we intend to resolve also this problem of leaky matching.
Using MPLS with multicast has many benefits not only for reducing
multicast forwarding states but also for traffic engineering and
QoS issues. In this paper, we only focus on the scalability
problem. We propose a novel approach to reduce multicast state, the
MPLS multicast Tree. This approach uses MPLS LSPs between multicast
tree branching node routers in order to reduce forwarding states
and enhance scalability. The main difference with previous
approaches is that we use only multicast states for branching
routers in the multicast tree. Other routers between two branching
nodes don't have multicast states. Unicast is used between two
branching routers. This way the total number of multicast forwarding
states may be significantly reduced. By using MPLS LSPs between
branching routers, memory resource is reduced also and no need for
encapsulation techniques. The MPLS forwarding process conserve the
router resources also. In this paper, we examine several design and
implementation issues of our scheme and present an evaluation for
our protocol.
The remainder of this paper is organized as follow. In section 2 the
concept of MPLS multicast aggregation is described and some
implementation related issues are discussed. Section 3 contains the
approach analysis and an evaluation for its forwarding state and
Boudani & Cousin [Page 3]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
messaging overhead. Section 4 is a summary followed by a list of
references.
2 Observation and proposal: MPLS Multicast Tree
The key idea of MPLS multicast tree is that, instead of having
multicast forwarding states in all routers in a constructed tree for
each individual multicast session in the core network (backbone),
one can have only multicast forwarding states in branching node
routers of the tree. By using the branching node routers, multicast
forwarding states are reduced, correspondingly, the tree maintenance
overhead at network core.
The MPLS multicast tree is targeted basically for intra-domain
multicast, but it is not limited for that only since it is based on
MPLS, and can be used also for inter-domain multicast.
2.1 Observation
In conventional multicast, routers on the multicast tree located
between the source and a group member should maintain multicast
states for the group.
Let's consider the network illustrated in fig.1 (single multicast
session with only one source and one group). Suppose that there are
group members at router R5, then routers R1, R2 R3, R4 should all
have multicast state for the group G even if they don't have
directly connected members.
S
|
R1
|
R2
|
R3
|
R4
/ \
/ \
R6 R5
Fig.1 : Multicast tree (one source, one group)
Let's take the same network but with multiple multicast sessions
(one single source and two groups). We suppose that there are
members for group G1 at R5 and members for group G2 at R6. Routers
R1, R2, R3, R4 should maintain states for groups G1 and G2 even if
they don't have directly connected group members. When the number of
group increase in time multicast states became harder and harder.
Fig.2 represents a multicast topology (constructed tree) resulting
from a traceroute experiments [5]. In the entire multicast tree
there are 8 branching node routers out of 97 routers.
Fig.2: Traceroutre experiment from a source to 15 other sites
Boudani & Cousin [Page 4]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
One can conclude that maintaining multicast forwarding states in non
branching node routers is a waste of resources and that only
branching node routers for multicast groups should maintain
multicast forwarding states. In conventional multicast a router can
discover that it is a branching node router for a group but it
doesn't have any idea about its next hop branching node routers for
that group.
A second observation is that even when 2 multicast trees share
multiple links, multicast forwarding states on routers for the
shared links will not be aware of that and there is no possibility
for aggregation.
In REUNITE there was a completely unicast approach for delivering
multicast packets. However, many of the LAN and WAN technologies
have native support for multicast. Sending individual unicast
messages to each of the receivers in a multicast-capable subnet as
ethernet is very inefficient. Multicast packets should always
contain the IP multicast header.
In SEM we have introduced a tunneling process which can be
implemented to conventional multicast too but messages between
branching node routers in order to construct tunnels can be
considered as expensive overheads. Our proposal will take all these
observations into account.
Overheads are the header processing time that take a router to
determine the routing and also the size of multicast routing table.
These two overheads are related. Our goal is to minimize the
multicast overheads in each router. In our proposal, only branching
node routers will contain the multicast routing table. All other
routers do not need the routing states.
In aggregated multicast[8], the solution was to have a single shared
aggregated multicast tree but as mentioned in discussions
complication arises when there is no perfect match or no existing
tree covers a group. The disadvantage here is that certain bandwidth
is wasted to deliver data to nodes that are not involved for the
group. Bandwidth can crucial factors for provisioning QoS in
multicast networks and even for best effort Internet.
2.2 proposal
In order to inform a branching node router about its next hop
branching node routers for a group, each domain should contain a
network information manager system (NIMS) for each group, charged to
collect join messages from all group members in that domain. After
collecting all join and prune messages, the NIMS should compute the
multicast tree for that group in the domain. The computation for a
group means discovering all branching node routers and the next hop
branching node routers for each group.
We should note that scalable multicast protocols like PIM-SM
(shared tree) and CBT use a core router similar to the one used in
our approach.
In our approach we are suggesting that the NIMS should collect join
messages from all group members and have a complete overview about
the multicast network.
Boudani & Cousin [Page 5]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
The NIMS send then messages to all branching node routers to inform
them about their next hop branching node routers. On receiving this
message, a branching node router creates a multicast forwarding
state for the multicast session.
Once branching node routers and their next hop are identified,
packets will be send from a branching node router to another until
achieving their target.
Tunneling between branching node routers was studied in [1], [4],
and [5] but was judged very expensive and very complicated since
multicast packets should be encapsulated as unicast packets and
send after that over the tunnel.
Based on observations, we propose a new approach, the MPLS multicast
tree, which utilizes MPLS dynamically established LSPs between
multicast tree branching node routers in order to reduce forwarding
states and enhance scalability.
When a multicast packet arrives to the ingress router of a MPLS
domain, the packet is analyzed according to its multicast IP header.
The router should determine who are the next hop branching node
routers for that packet. Based on this information, multiple copies
of the packets are generated and a MPLS label is pushed to the
multicast packet corresponding to each next branching node router.
Then a labeled multicast packet will not be different from a labeled
unicast packet.
When arriving to a next hop branching node router, the label is
pulled and again the same process is repeated. This process should
be repeated until the packet arrives to its destination.
When arriving to a LAN, the packet unlabeled can be delivered by
conventional multicast protocols according to IGMP informations.
Note that labeled multicast packets will be examined at each
branching router and that is why a multicast packet on non-branching
router will be treated as a MPLS labeled unicast packet. So no extra
charges are needed for multicast packets since the MPLS tables
already exist in routers. The examination of a multicast packets in
intermediate routers on the tree is considered as a disadvantage of
this approach but we leave this issue for further studies.
When a branching node router receive the message from the NIMS it
should calculate the label corresponding to the next hop router and
it should add a label with an interface to the (S, G) entry in the
branching node router. In this way we can ensure that there is no
extra overhead.
2.3 Protocol evaluation
The efficiency of the MMT can be evaluated in terms of state
information requirement, tree cost, data processing efficiency, and
control overhead. In this analysis, we will focus on the state
information requirement, and control overhead of the MMT.
The state information requirement can be measured using the average
multicast forwarding table size. State information analysis includes
also the MPLS overhead. The control overhead can be measured using
the total number of control packets sent to all the links that are
needed to maintain the protocol states.
2.1 Average Multicast Forwarding Table Size
Boudani & Cousin [Page 6]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
In PIM-SSM, the multicast forwarding table entries on a source
specific tree of a multicast group G rooted at a source S is a
(S, G) entry.
Using the same reasoning in [1], we consider the parameter alpha of
a distribution tree T to be the average number of multicast
forwarding table entries per router for the tree:
Alpha(T) = Ne/NT
where Ne is sum of the total number of multicast forwarding table
entries, i.e., the total number of (S, G) entries, on all the
routers for distribution tree t, and NT is the number of routers on
the tree.
When no MPLS tunnels are established, each router on a source
specific distribution tree has one (S, G) forwarding table entry for
the distribution tree, in which case Ne = NT and the value of the
alpha parameter reaches its maximum 1.0 for source specific trees.
The minimum alpha value for any particular tree is defined by the
following equation
Alpha min (T) = (Nb + Nl + Nr)/ NT
where Nb is the number of branching points on tree T, Nl is the
number of leaf nodes on the tree, Nr is the number of root node of
the tree which always 1, and Nt is the total number of nodes on tree
t.
The alpha parameter of a tree reaches its minimum when all uni-
multicast routers on the tree are bypassed by dynamic tunnels. In
conclusion, for source specific trees, the following condition holds:
0< (Nb + Nl + Nr)/Nt < alpha < 1
In paxson's work, used in [1], routes between 37 sites located all
over the world are recorded using the traceroute utility. We can use
the same analysis and deduce also that the minimum alpha parameter
values are constantly smaller than 20% which implies that for global
scope sparse multicast groups, over 80% reductions in forwarding
table size can be achieved using the MMT.
2.2 MPLS overhead
In MMT, MPLS as unicast traffic engineering tool will be used also
as multicast traffic engineering tool. IN MMT, multicast will take
all the benefits of MPLS, that already been used for unicast, and
will introduce no extra overheads. A multicast packet will be
treated like a unicast packet and that is an advantage.
2.3 Control overhead
The control overhead of MMT can be measured in terms of average
number of control messages sent per link or the total percentage of
bandwidth spent on control traffic.
Boudani & Cousin [Page 7]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
In PIM-SSM, each distribution tree needs to be refreshed
periodically and that to rapidly detect a router that belong to a
tree, goes down. In our approach, we don't need the join and prune
messages periodically. When a router goes down, the unicast tables
will detect it and thus no extra information is needed.
3 related issues
3.1 The tree manager
One can argue that using NIMS and sending join and leaves messages
to it are weak points in our approach. But we have to mention that
the most scalable approaches in conventional multicast like PIM-SM
and CBT use similar techniques. In PIM-SM and CBT, a router should
act as a Rendez-Vous point that collect join messages for each group
and these join messages will be refreshed periodically. But in our
approach, messages will not be sent periodically. Also a few
messages will be sent from the NIMS to branching routers, thus there
is no remarkable overhead. We think that our approach has an
advantage over these conventional multicast protocols since we don't
force multicast packets to be sent all the way to the Rendez-Vous
point and next to receivers. By contrast packets follow always the
constraints shortest path. Besides there is no switching between
shared tree and source specific tree.
NIMS could be unique for each group like the case in CBT and PIM-SM.
We can also imagine that there is multiple NIMSs that can
communicate with each other to update informations about topology.
Finally, in OSPF networks the NIMS can easily collect informations.
3.2 MPLS issues
3.2.1 Multicast aggregation
In conventional multicast, it is not possible to aggregate multicast
IP addresses. Receivers can be located anywhere in the Internet,
there is no other alternative than having one entry by multicast IP
address in the multicast routing table. Some scheme tried to
aggregate by using leaky trees.
Since in MMT, we are using MPLS, aggregation of multicast IP
addresses can be transformed to label aggregations. There is no
wasted bandwidth to realize aggregation, the aggregation overhead is
zero. We have an advantage here that the traffic engineer will
control perfectly its network.
Multicast groups may share some links in their multicast trees and
aggregation introduce benefits.
3.2.2 Loop detection
In MMT, multicast loop detection for multicast traffic is the same
loop detection in unicast since packets will be send using MPLS
labels between branching node routers.
Boudani & Cousin [Page 8]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
3.2.3 LSP constructions
One can say that LSP construction is very expensive if there is no
multicast traffic, but these LSPs are already constructed and used
by unicast traffic also and thus no extra LSP construction
overheads. There are no MPLS extra states for multicast traffic
needed.
3.2.4 Multicast LDP extensions
There are no modifications to LDP. Labels used for multicast packets
are the same used for unicast packets and that is the major
difference with all other MPLS multicast approaches.
We have just to be sure that at branching node routers, when a
packet arrive, the label is pulled up and the new labels
(corresponding to each next hop branching node router) are pushed
in.
3.2.5 Multicast load balancing
In MMT, multicast packets will be sent from a branching node to
another, but we are not obliged to follow the same path constructed
by conventional multicast protocols. By using different LSPs, load
balancing is ensured.
3.3 Using MMT for inter-domain multicast
Join messages in domains are sent to the NIMS. Each domain had its
own NIMS for the group.
To receive packets from sources belonging to other domains, the
NIMS, after calculating which border router will transmit the
packets, will sent a message to create forwarding state for the
group in that border router. Due to the creation of this forwarding
state, the border router will contact border routers in other
domains with a normal (S, G) join message. These border routers will
contact NIMS routers in their domains.
Other domains don't need to be implementing MMT. Once a packet
arrives in a border router for a particular domain, a label is
pushed down and sent to all receivers.
3.4 Other ideas (difference between multicast and unicast traffic)
In our approach, multicast packets will follow the same path as
unicast packets. One can say that multicast packets should follow
different paths from unicast packets, but this solution is too heavy
because encoding technique described in [11] will be used. There is
no meaning then to have the same labels for unicast and multicast
and there are extra forwarding states in the routers.
4 Conclusion
In this paper, we present a new approach, the MPLS multicast tree,
Boudani & Cousin [Page 9]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
which utilizes MPLS LSPs between multicast tree branching node
routers in order to reduce forwarding states and enhance
scalability. In our approach only routers that are acting as
multicast tree branching node routers for a group need to keep
forwarding state for that group. All other non-branching node
routers simply forward data packets by traffic engineered unicast
routing using MPLS LSPs. In our approach, we do not need MPLS
multicast labels different from those used for unicast.
The MPLS labels used for unicast are used for multicast traffic
also. Of course the disadvantage is that multicast packets need
to be examined at the IP level in each branching node router on
the multicast tree. We leave this issue for further studies.
We briefly discussed the multicast scalability problem and related
works for forwarding state reduction. We discussed also the
advantages of our approach, and concluded that it is feasible and
promising. Finally, we analytically evaluate our approach and we
deduced that our approach could be largely deployed because it uses
for multicast the same unicast MPLS forwarding scheme.
References
[1] J. Tian et al., Forwarding state reduction for sparse mode multicast
Communication,Proceedings of IEEE INFOCOM, MArch 1998.
[2] P. Radoslavov et al., Exploiting the bandwidth-memory tradeoff in
multicast state aggregation, Technical report, USC Dept. of CS
Technical Report 99-697, july 1999.
[3] R. Boivie et al., Explicit Multicast (Xcast) Basic Specification,
Internet draft, 2000.
[4] A. Boudani et al., Simple Explicit Multicast(SEM),
Internet draft, 2001.
[5] I. Stoica, et al., "REUNITE: A recursive Unicast Approach to
Multicast", http://www.ieee-infocom.org/2000/papers/613.ps.
[6] D. Ooms et al., Framework for IP multicast in MPLS, Internet draft,
december 2000
[7] D. Farinacci et al., Using PIM to distribute MPLS labels for
multicast routes, Internet draft, november 2000
[8] A. Fei and al., Aggregated multicast: an approach to reduce
multicast state, UCLA CSD Technical Report #010012, June 2001
[9] Multiprotocol label switching architecture, RFC3031, January 2001.
[10] http://ardnoc41.canet2.net/mpls/drafts/index.html.
[11] E. Rosen et al., MPLS label stack encoding, RFC3032, January 2001.
Boudani & Cousin [Page 10]
INTERNET-DRAFT The MPLS Multicast Tree (MMT) July 2003
Authors Addresses
Ali Boudani
IRISA/INRIA Rennes
Campus Universitaire de Beaulieu
Avenue du General Leclerc 35042 Rennes France
Phone : (33) 02 99 84 25 37
Fax : (33) 02 99 84 25 29
E-mail : aboudani@irisa.fr
Bernard Cousin
IRISA/INRIA Rennes
Campus Universitaire de Beaulieu
Avenue du General Leclerc 35042 Rennes France
Phone : (33) 02 99 84 73 33
Fax : (33) 02 99 84 71 71
E-mail : bcousin@irisa.fr
Jean Marie Bonnin
IRISA/INRIA Rennes
Campus Universitaire de Beaulieu
Avenue du General Leclerc 35042 Rennes France
Phone : (33) 02 99 12 70 07
Fax : (33) 02 99 12 70 30
E-mail : jm.bonnin@enst-bretagne.fr
Expiration Date: July 2003