INTERNET DRAFT                                Ali Boudani, Bernard Cousin
Expires: April 2005                                Jean-Marie Bonnin
                                                       IRISA-France
                                                      October 2004

                An Effective Solution for Multicast Scalability:
                         The MPLS Multicast Tree (MMT)
                   <draft-boudani-mpls-multicast-tree-06.txt>


Status of this Memo

     By submitting this Internet-Draft, I certify that any applicable
     patent or other IPR claims of which I am aware have been disclosed,
     or will be disclosed, and any of which I become aware will be
     disclosed, in accordance with RFC 3668."

     Internet Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and working groups. Note that other
     groups may also distribute working documents as Internet-Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsolete by other documents
     at anytime. It is inappropriate to use Internet Drafts as reference
     material or to cite them other than as "work in progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt.

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.

Abstract


     A multicast router should keep forwarding state for every multicast
     tree passing through it. The number of forwarding states grows with
     the number of groups. In this paper, we present a new approach, the
     MPLS multicast Tree (MMT), which utilizes MPLS LSPs between
     multicast tree branching node routers in order to reduce forwarding
     states and enhance scalability. In our approach only routers that
     are acting as multicast tree branching node routers for a group
     need to keep forwarding state for that group. All other non-
     branching node routers simply forward data packets by traffic
     engineered unicast routing using MPLS LSPs. We can deduce that our
     approach can be largely deployed because it uses for multicast
     traffic the same unicast MPLS forwarding scheme. We will briefly
     discuss the multicast scalability problem, related works and
     different techniques for forwarding state reduction. We discuss also
     the advantages of our approach, and conclude that it is feasible and
     promising. Finally, we analytically evaluate our approach.

     1 Introduction

     Multicast has become increasingly important with the emergence of
     network-based applications such as IP telephony, video conferencing,




Boudani et al.                                                [Page 1]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)       October 2004


     distributed interactive simulation (DIS) and software upgrading.
     Using the multicast services, a single transmission is needed for
     sending a packet to n destinations, while n independent
     transmissions would be required using the unicast services. A
     multicast routing protocol should be simple to implement, scalable,
     robust, use minimal network overhead, consume minimal memory
     resources, and inter-operate with other multicast routing protocols.
     Multicast suffers from scalability problem with the number of
     concurrently active multicast groups because it requires a router to
     keep forwarding state for every multicast tree passing through it
     and the number of forwarding states grows with the number of groups.
     Scalability can be evaluated not only in terms of the overhead
     growth in the presence of a large number of groups but also by the
     number of participants per group and by groups for which the set of
     participants changes often over time. Overhead can be measured in
     terms of memory resources (in routers) as they relate to routing
     states maintained per group and can be measured also by bandwidth
     resources in terms of control or signaling messages per group and
     also by processing power.
     Recently, significant research effort has focused on the multicast
     scalability problem. Some schemes attempt to reduce forwarding state
     by tunneling [1] or by forwarding state aggregation [2]. Both these
     works attempt to aggregate routing state after this has been
     allocated to groups. Other architectures aim to eliminate
     forwarding states at routers either completely by explicitly
     encoding the list of destinations in the data packets, instead of
     using a multicast address [3] or partially by using branching node
     routers in the multicast tree [4,5].
     The Xcast proposal [3], used for small groups, eliminates the need
     for forwarding states. The source encodes the list of destinations
     in the Xcast header, and then sends the packet to a router. Each
     router along the way parses the header, partitions the destinations
     based on each destination's next hop, and forwards a packet with an
     appropriate Xcast header to each of the next hops.
     In SEM [4], we proposed that the source uses unicast encoding for
     multicast packets and sends them to its next hop branching node
     routers. Each branching node router acts as a source and packets
     travel from a branching node router to another. A special mechanism
     was introduced to inform each branching node router about its next
     hop branching node routers for a group.
     REUNITE [5], uses recursive unicast trees to implement multicast
     service. REUNITE does not use class D IP addresses. Instead, both
     group identification and data forwarding are based on unicast IP
     addresses. Only branching node routers for a group need to keep
     multicast forwarding state. All other non-branching node routers
     simply forward data packets by unicast routing.
     We think that using the branching node routers to forward multicast
     data packets in unicast mode is very efficient in order to reduce
     forwarding state and thus enhance scalability. The main problem is
     how to ensure that a branching node router has a complete knowledge
     about its next hop branching node routers. And another issue is how
     can we reduce the effect of encapsulated multicast packets in




Boudani et al.                                            [Page 2]


INTERNET-DRAFT            The MPLS Multicast Tree (MMT)      October 2004


     unicast packets. We think that a network information manager
     system (NIMS) in each domain can be used to resolve the first
     problem while the multi protocol label switching (MPLS) [9]
     presents an efficient solution for the second problem.
     A framework for MPLS multicast traffic engineering that explain IP
     multicast deployment in MPLS environment was proposed by Ooms et
     al [6]. Another studies about MPLS and multicast proposed by
     Farinacci et al. [7] explain how to use PIM to distribute MPLS
     labels for multicast routes.  Other expired Internet drafts studied
     the same subject [10].
     The latest interesting proposal is aggregated multicast [8]. The key
     idea of aggregated mulicast is that, instead of constructing a tree
     for each individual multicast session in the core network, one can
     have multiple multicast sessions share a single aggregated tree to
     reduce multicast state and, correspondingly, tree maintenance
     overhead at network core. In this proposal there was two
     requirements: (1) original group addresses of data packets must be
     preserved somewhere and can be recovered by exit nodes to determine
     how to further forward these packets;(2) some kind of identification
     for the aggregated tree which the group is using must be carried and
     transit nodes must forward packets based on that. Instead of using
     IP encapsulation as in SEM, which of course, adds complexity and
     processing overhead, a potentially much better possibility is to use
     MPLS [9]. To handle aggregated tree management and matching between
     multicast groups and aggregated trees, a centralized management
     entity called tree manager is introduced. In group to aggregated
     tree matching, complication arises when there is no perfect match or
     no existing aggregated tree covers a group. There was a disadvantage
     in leaky matching because certain bandwidth should be wasted to
     deliver data to nodes that are not involved for the group. In our
     proposal we intend to resolve also this problem of leaky matching.
     Using MPLS with multicast has many benefits not only for reducing
     multicast forwarding states but also for traffic engineering and
     QoS issues. In this paper, we only focus on the scalability
     problem. We propose a novel approach to reduce multicast state, the
     MPLS multicast Tree. This approach uses MPLS LSPs between multicast
     tree branching node routers in order to reduce forwarding states
     and enhance scalability. The main difference with previous
     approaches is that we use only multicast states for branching
     routers in the multicast tree. Other routers between two branching
     nodes don't have multicast states. Unicast is used between two
     branching routers. This way the total number of multicast forwarding
     states may be significantly reduced. By using MPLS LSPs between
     branching routers, memory resource is reduced also and no need for
     encapsulation techniques. The MPLS forwarding process conserve the
     router resources also. In this paper, we examine several design and
     implementation issues of our scheme and present an evaluation for
     our protocol.
     The remainder of this paper is organized as follow. In section 2 the
     concept of MPLS multicast aggregation is described and some
     implementation related issues are discussed. Section 3 contains the
     approach analysis and an evaluation for its forwarding state and




Boudani et al.                                                [Page 3]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


     messaging overhead. Section 4 is a summary followed by a list of
     references.

     2 Observation and proposal: MPLS Multicast Tree

     The key idea of MPLS multicast tree is that, instead of having
     multicast forwarding states in all routers in a constructed tree for
     each individual multicast session in the core network (backbone),
     one can have only multicast forwarding states in branching node
     routers of the tree. By using the branching node routers, multicast
     forwarding states are reduced, correspondingly, the tree maintenance
     overhead at network core.
     The MPLS multicast tree is targeted basically for intra-domain
     multicast, but it is not limited for that only since it is based on
     MPLS, and can be used also for inter-domain multicast.

     2.1 Observation

     In conventional multicast, routers on the multicast tree located
     between the source and a group member should maintain multicast
     states for the group.
     Let's consider the network illustrated in fig.1 (single multicast
     session with only one source and one group). Suppose that there are
     group members at router R5, then routers R1, R2 R3, R4 should all
     have multicast state for the group G even if they don't have
     directly connected members.
                                  S
                                  |
                                  R1
                                  |
                                  R2
                                  |
                                  R3
                                  |
                                  R4
                                 / \
                                /   \
                              R6     R5

              Fig.1 : Multicast tree (one source, one group)

     Let's take the same network but with multiple multicast sessions
     (one single source and two groups). We suppose that there are
     members for group G1 at R5 and members for group G2 at R6. Routers
     R1, R2, R3, R4 should maintain states for groups G1 and G2 even if
     they don't have directly connected group members. When the number of
     group increase in time multicast states became harder and harder.
     Fig.2 represents a multicast topology (constructed tree) resulting
     from a traceroute experiments [5]. In the entire multicast tree
     there are 8 branching node routers out of 97 routers.

     Fig.2: Traceroutre experiment from a source to 15 other sites




Boudani et al.                                                [Page 4]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


     One can conclude that maintaining multicast forwarding states in non
     branching node routers is a waste of resources and that only
     branching node routers for multicast groups should maintain
     multicast forwarding states. In conventional multicast a router can
     discover that it is a branching node router for a group but it
     doesn't have any idea about its next hop branching node routers for
     that group.
     A second observation is that even when 2 multicast trees share
     multiple links, multicast forwarding states on routers for the
     shared links will not be aware of that and there is no possibility
     for aggregation.
     In REUNITE there was a completely unicast approach for delivering
     multicast packets. However, many of the LAN and WAN technologies
     have native support for multicast. Sending individual unicast
     messages to each of the receivers in a multicast-capable subnet as
     ethernet is very inefficient. Multicast packets should always
     contain the IP multicast header.
     In SEM we have introduced a tunneling process which can be
     implemented to conventional multicast too but messages between
     branching node routers in order to construct tunnels can be
     considered as expensive overheads. Our proposal will take all these
     observations into account.
     Overheads are the header processing time that take a router to
     determine the routing and also the size of multicast routing table.
     These two overheads are related. Our goal is to minimize the
     multicast overheads in each router. In our proposal, only branching
     node routers will contain the multicast routing table. All other
     routers do not need the routing states.
     In aggregated multicast[8], the solution was to have a single shared
     aggregated multicast tree but as mentioned in discussions
     complication arises when there is no perfect match or no existing
     tree covers a group. The disadvantage here is that certain bandwidth
     is wasted to deliver data to nodes that are not involved for the
     group. Bandwidth can crucial factors for provisioning QoS in
     multicast networks and even for best effort Internet.

     2.2 proposal

     In order to inform a branching node router about its next hop
     branching node routers for a group, each domain should contain a
     network information manager system (NIMS) for each group, charged to
     collect join messages from all group members in that domain. After
     collecting all join and prune messages, the NIMS should compute the
     multicast tree for that group in the domain. The computation for a
     group means discovering all branching node routers and the next hop
     branching node routers for each group.
     We should note that scalable multicast protocols like PIM-SM
     (shared tree) and CBT use a core router similar to the one used in
     our approach.
     In our approach we are suggesting that the NIMS should collect join
     messages from all group members and have a complete overview about
     the multicast network.




Boudani et al.                                                [Page 5]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


    The NIMS send then messages to all branching node routers to inform
     them about their next hop branching node routers. On receiving this
     message, a branching node router creates a multicast forwarding
     state for the multicast session.
     Once branching node routers and their next hop are identified,
     packets will be send from a branching node router to another until
     achieving their target.
     Tunneling between branching node routers was studied in [1], [4],
     and [5] but was judged very expensive and very complicated since
     multicast packets should be encapsulated as unicast packets and
     send after that over the tunnel.
     Based on observations, we propose a new approach, the MPLS multicast
     tree, which utilizes MPLS dynamically established LSPs between
     multicast tree branching node routers in order to reduce forwarding
     states and enhance scalability.
     When a multicast packet arrives to the ingress router of a MPLS
     domain, the packet is analyzed according to its multicast IP header.
     The router should determine who are the next hop branching node
     routers for that packet. Based on this information, multiple copies
     of the packets are generated and a MPLS label is pushed to the
     multicast packet corresponding to each next branching node router.
     Then a labeled multicast packet will not be different from a labeled
     unicast packet.
     When arriving to a next hop branching node router, the label is
     pulled and again the same process is repeated. This process should
     be repeated until the packet arrives to its destination.
     When arriving to a LAN, the packet unlabeled can be delivered by
     conventional multicast protocols according to IGMP informations.
     Note that labeled multicast packets will be examined at each
     branching router and that is why a multicast packet on non-branching
     router will be treated as a MPLS labeled unicast packet. So no extra
     charges are needed for multicast packets since the MPLS tables
     already exist in routers. The examination of a multicast packets in
     intermediate routers on the tree is considered as a disadvantage of
     this approach but we leave this issue for further studies.
     When a branching node router receive the message from the NIMS it
     should calculate the label corresponding to the next hop router and
     it should add a label with an interface to the (S, G) entry in the
     branching node router. In this way we can ensure that there is no
     extra overhead.

     2.3 Protocol evaluation

     The efficiency of the MMT can be evaluated in terms of state
     information requirement, tree cost, data processing efficiency, and
     control overhead. In this analysis, we will focus on the state
     information requirement, and control overhead of the MMT.
     The state information requirement can be measured using the average
     multicast forwarding table size. State information analysis includes
     also the MPLS overhead. The control overhead can be measured using
     the total number of control packets sent to all the links that are
     needed to maintain the protocol states.




Boudani et al.                                                [Page 6]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004









































     2.4 MPLS overhead

     In MMT, MPLS as unicast traffic engineering tool will be used also
     as multicast traffic engineering tool. IN MMT, multicast will take
     all the benefits of MPLS, that already been used for unicast, and
     will introduce no extra overheads. A multicast packet will be
     treated like a unicast packet and that is an advantage.

     2.5 Control overhead

     The control overhead of MMT can be measured in terms of average
     number of control messages sent per link or the total percentage of
     bandwidth spent on control traffic.




Boudani et al.                                                [Page 7]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


     In PIM-SSM, each distribution tree needs to be refreshed
     periodically and that to rapidly detect a router that belong to a
     tree, goes down. In our approach, we don't need the join and prune
     messages periodically. When a router goes down, the unicast tables
     will detect it and thus no extra information is needed.

     3 related issues

     3.1 The tree manager

     One can argue that using NIMS and sending join and leaves messages
     to it are weak points in our approach.  But we have to mention that
     the most scalable approaches in conventional multicast like PIM-SM
     and CBT use similar techniques. In PIM-SM and CBT, a router should
     act as a Rendez-Vous point that collect join messages for each group
     and these join messages will be refreshed periodically. But in our
     approach, messages will not be sent periodically. Also a few
     messages will be sent from the NIMS to branching routers, thus there
     is no remarkable overhead. We think that our approach has an
     advantage over these conventional multicast protocols since we don't
     force multicast packets to be sent all the way to the Rendez-Vous
     point and next to receivers. By contrast packets follow always the
     constraints shortest path. Besides there is no switching between
     shared tree and source specific tree.
     NIMS could be unique for each group like the case in CBT and PIM-SM.
     We can also imagine that there is multiple NIMSs that can
     communicate with each other to update informations about topology.
     Finally, in OSPF networks the NIMS can easily collect informations.

     3.2 MPLS issues

     3.2.1 Multicast aggregation

     In conventional multicast, it is not possible to aggregate multicast
     IP addresses. Receivers can be located anywhere in the Internet,
     there is no other alternative than having one entry by multicast IP
     address in the multicast routing table. Some scheme tried to
     aggregate by using leaky trees.
     Since in MMT, we are using MPLS, aggregation of multicast IP
     addresses can be transformed to label aggregations. There is no
     wasted bandwidth to realize aggregation, the aggregation overhead is
     zero. We have an advantage here that the traffic engineer will
     control perfectly its network.
     Multicast groups may share some links in their multicast trees and
     aggregation introduce benefits.

     3.2.2 Loop detection

     In MMT, multicast loop detection for multicast traffic is the same
     loop detection in unicast since packets will be send using MPLS
     labels between branching node routers.




Boudani et al.                                                [Page 8]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


     3.2.3 LSP constructions

     One can say that LSP construction is very expensive if there is no
     multicast traffic, but these LSPs are already constructed and used
     by unicast traffic also and thus no extra LSP construction
     overheads. There are no MPLS extra states for multicast traffic
     needed.

     3.2.4 Multicast LDP extensions

     There are no modifications to LDP. Labels used for multicast packets
     are the same used for unicast packets and that is the major
     difference with all other MPLS multicast approaches.
     We have just to be sure that at branching node routers, when a
     packet arrive, the label is pulled up and the new labels
     (corresponding to each next hop branching node router) are pushed
     in.

     3.2.5 Multicast load balancing

     In MMT, multicast packets will be sent from a branching node to
     another, but we are not obliged to follow the same path constructed
     by conventional multicast protocols. By using different LSPs, load
     balancing is ensured.

     3.3 Using MMT for inter-domain multicast

     Join messages in domains are sent to the NIMS. Each domain had its
     own NIMS for the group.
     To receive packets from sources belonging to other domains, the
     NIMS, after calculating which border router will transmit the
     packets, will sent a message to create forwarding state for the
     group in that border router. Due to the creation of this forwarding
     state, the border router will contact border routers in other
     domains with a normal (S, G) join message. These border routers will
     contact NIMS routers in their domains.
     Other domains don't need to be implementing MMT. Once a packet
     arrives in a border router for a particular domain, a label is
     pushed down and sent to all receivers.

     3.4 Other ideas (difference between multicast and unicast traffic)

     In our approach, multicast packets will follow the same path as
     unicast packets. One can say that multicast packets should follow
     different paths from unicast packets, but this solution is too heavy
     because encoding technique described in [11] will be used. There is
     no meaning then to have the same labels for unicast and multicast
     and there are extra forwarding states in the routers.

     3.5 Switching between L2 and L3 in branching node router

     One may say that switching between L2 and L3 in branching node
     router eliminates the gain obtained by using the MPLS. This is true




Boudani et al.                                               [Page 9]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


     if there is a lot of branching node routers in the multicast tree in
     a domain. But, it is also true that it is better than having
     multicast states in all routers. But as a solution for this problem.
     We can imagine that every multicast session (S,G) is associated at
     the ingress router of a domain to a label (this is the NIMS mission
     to reserve labels for sessions). this label represents the group
     only in the domain, so there is no need for label allocation. The
     modified label distribution protocol can manage this lable
     distribution. One may argue that we can not attribute and associate
     labels to all multicast sessions since the multicast labels are
     limited (20 bits). This is true, but the traffic engineer should
     manage the demands and include the QoS and differtiated services to
     serve the sessions in the way that more important sessions passes
     first. This is very convenient with the Internet best-effort logic.

     4 Conclusion

     In this paper, we present a new approach, the MPLS multicast tree,
     which utilizes MPLS LSPs between multicast tree branching node
     routers in order to reduce forwarding states and enhance
     scalability. In our approach only routers that are acting as
     multicast tree branching node routers for a group need to keep
     forwarding state for that group. All other non-branching node
     routers simply forward data packets by traffic engineered unicast
     routing using MPLS LSPs. In our approach, we do not need MPLS
     multicast labels different from those used for unicast.
     The MPLS labels used for unicast are used for multicast traffic
     also. Of course the disadvantage is that multicast packets need
     to be examined at the IP level in each branching node router on
     the multicast tree. We leave this issue for further studies.
     We briefly discussed the multicast scalability problem and related
     works for forwarding state reduction. We discussed also the
     advantages of our approach, and concluded that it is feasible and
     promising. Finally, we analytically evaluate our approach and we
     deduced that our approach could be largely deployed because it uses
     for multicast the same unicast MPLS forwarding scheme.

References

[1] J. Tian et al., Forwarding state reduction for sparse mode multicast
    Communication,Proceedings of IEEE INFOCOM, MArch 1998.
[2] P. Radoslavov et al., Exploiting the bandwidth-memory tradeoff in




Boudani et al.                                               [Page 10]


INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


    multicast state aggregation, Technical report, USC Dept. of CS
    Technical Report 99-697, july 1999.

[3]  R. Boivie et al., Explicit Multicast (Xcast) Basic Specification,
     Internet draft, 2000.

[4]  A. Boudani et al., Simple Explicit Multicast(SEM),
      Internet draft, 2001.

[5]  I. Stoica, et al., "REUNITE: A recursive Unicast Approach to
     Multicast", http://www.ieee-infocom.org/2000/papers/613.ps.

[6] D. Ooms et al., Framework for IP multicast in MPLS, Internet draft,
    december 2000

[7] D. Farinacci et al., Using PIM to distribute MPLS labels for
    multicast routes, Internet draft, november 2000

[8] A. Fei and al., Aggregated multicast: an approach to reduce
    multicast state, UCLA CSD Technical Report #010012, June 2001

[9] Multiprotocol label switching architecture, RFC3031, January 2001.

[10] http://ardnoc41.canet2.net/mpls/drafts/index.html.

[11] E. Rosen et al., MPLS label stack encoding, RFC3032, January 2001.


Authors Addresses

  Ali Boudani
  IRISA/INRIA Rennes
  Campus Universitaire de Beaulieu
  Avenue du General Leclerc 35042 Rennes France
  Phone : (33) 02 99 84 25 37
  Fax : (33) 02 99 84 25 29
  E-mail : aboudani@irisa.fr

  Bernard Cousin
  IRISA/INRIA Rennes
  Campus Universitaire de Beaulieu
  Avenue du General Leclerc 35042 Rennes France
  Phone : (33) 02 99 84 73 33
  Fax : (33) 02 99 84 71 71
  E-mail : bcousin@irisa.fr

  Jean Marie Bonnin
  IRISA/INRIA Rennes
  Campus Universitaire de Beaulieu
  Avenue du General Leclerc 35042 Rennes France
  Phone : (33) 02 99 12 70 07
  Fax : (33) 02 99 12 70 30




Boudani et al.                                               [Page 11]

INTERNET-DRAFT           The MPLS Multicast Tree (MMT)      October 2004


  E-mail : jm.bonnin@enst-bretagne.fr


Copyright Statement

Copyright (C) The Internet Society (2004).  This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights."

"This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

 Expiration Date: April 2005