Network Working Group Arup Acharya
Internet Draft Frederic Griffoul
<draft-acharya-ipsofacto-mpls-mcast-00.txt> Furquan Ansari
C&C Research Labs, NEC
February 23, 1999
Expires August 23, 1999
IP Multicast Support in MPLS Networks
<draft-acharya-ipsofacto-mpls-mcast-00.txt>
Status of This Memo
This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
Multicast support in an MPLS network has yet to be defined. This
document discusses both dense-mode and sparse-mode IP multicast within
the context of an MPLS network. Unlike unicast routing, dense-mode
multicast routing trees are established in a data-driven manner and it
is not possible to topologically aggregate such trees, which are
rooted at different sources. In sparse-mode multicast, source-specific
trees may coexist with a core/shared tree, and it is not possible to
assign a common label to traffic from different sources on a branch of
the shared tree. This leads us to suggest a per-source traffic-driven
label allocation scheme for supporting all three types of multicast
(dense mode, shared tree, source tree) routing trees in an MPLS
network.
Acharya, Griffoul & Ansari [Page 1]
Internet Draft draft-ipsofacto-mpls-mcast-00.txt February 1999
Table of Contents
1. Introduction 3
2. Dense-mode multicast: problem definition 4
3. Sparse-mode multicast: problem definition 6
3.1 Existing proposals for PIM-SM in MPLS 6
3.2 Shared tree/source tree co-existence problem 6
3.3 Per-source label assignment 7
4. Building block for proposed MPLS multicast 8
4.1 Assumptions 8
4.2 Upstream "implicit" label distribution 9
4.2.1 Label assignment 9
4.2.2 Label withdrawal 10
4.3 Downstream LDP-based label distribution 11
4.4 Comparison of the distribution procedures 13
5. Proposed Solution for PIM-DM in MPLS 13
5.1 Basic operations 13
5.2 Label Binding triggered by PIM-Graft 14
5.3 Label Reclamation triggered by PIM-Prune 14
5.4 Label Reclamation triggered by PIM inactivity timer 14
5.5 Example 15
6. Proposed solution for PIM-SM in MPLS 16
6.1 Source-specific/shortest-path tree 16
6.2 Shared tree 17
6.2.1 Label Reclamation 17
6.2.2 Example 18
7. Proposed solution for DVMRP and MOSPF in MPLS 19
8. Effects of L3 topology change on multicast LSP 20
8.1 Loops 20
8.2 Change of upstream router 20
9. Conclusions 20
10. Security Considerations 21
11. Acknowledgments 21
12. References 21
13. Authors Addresses 22
Appendix A: LDP Multicast FEC Definitions 23
Appendix B: LDP Initialization Session Multicast Parameter 24
Table of Abbreviations
DVMRP Distance Vector Multicast Routing Protocol
IGMP Internet Group Management Protocol
IP Internet Protocol
LSP Label Switched Path
LSR Label Switching Router
MFC Multicast Forwarding Cache
MRT Multicast Routing Table
NHLF Next Hop Label Forwarding
PIM-DM Protocol Independent Multicast-Dense Mode
PIM-SM Protocol Independent Multicast-Sparse Mode
RP Rendezvous Point
(S,G) (Source, Group) pair
(*,G) (Match any source, Group) pair
UL Unused or Unassigned Label
iif Incoming Interface
oif Outgoing Interface
1. Introduction
This document considers the problem of supporting IP multicast
efficiently within an MPLS environment. Both PIM dense-mode and
sparse-mode multicast routing protocols are discussed. We observe
that, in dense-mode operation, multicast routing entries do not exist
prior to arrival of data packets and unlike unicast routing entries,
cannot be aggregated. This suggests that labels need to be assigned on
a per-flow (source, group) basis in a traffic-driven fashion. In case
of sparse-mode, we observe that source specific trees may co-exist
with the shared/core tree for a multicast group, and so, nodes of the
shared tree may prune data packets based on the source. This implies a
single label cannot be assigned to all flows on the shared tree,
independent of the source. We suggest a data-driven, per-source
assignment of labels to traffic on the shared tree. For the three
different types of trees (dense mode, sparse mode shared and sparse
mode source specific), we present a common scheme for implicitly
distributing and binding labels to multicast FECs.
Presently, support for multicast in MPLS networks [7] is undefined,
and this document suggests a possible solution for forwarding
multicast traffic at layer 2. For a review of multicast routing
protocols and their implications for an MPLS environment, the reader is
referred to [1]. For PIM-SM, [3] suggests that the multicast
forwarding cache (MFC), which contains forwarding entries for currently
active multicast flows, be used as a trigger to set up a
label-switched path (LSP), but it suggests no specific methods for
label binding. It notes that coexistence of shared and source-specific
trees in PIM-SM is problematic for L2 forwarding and suggests that L3
forwarding be used in such situations. In this document, we present a
data-driven scheme for label assignment to set up LSPs for both
dense-mode and sparse-mode multicast; it is based on our prior work
on IP switching over ATM, IPSOFACTO ([IPSO1, IPSO2]).
2. Dense-Mode Multicast: problem definition
The current MPLS specifications for unicast traffic [ARCH,LDP]
advocate control-driven label binding and downstream label assignment.
In this section, we will point out why such a topology-driven approach
is not suitable for the multicast dense-mode case.
Let us consider a unicast example to see how topology-driven label
binding works.
[NET1] [NET2]
| |
R1 R2
\ /
A \ / B
\ /
R3----[NET3]
/ C
/ D
/
R4
|
[NET4]
Figure 1
Let us assume the LSR R3 is either a packet-switched LSR or a VC-merge
capable ATM LSR (i.e. it supports label aggregation). A partial view
of the R3 label tables is:
Next Hop | IIF | Incoming Label | OIF | Outgoing Label
----------+------+-------------------+-----+------------------
R2 | A | l1 | B | l2
R2 | C | l3 | B | l2
R2 | D | l3 | B | l2
The key points of MPLS unicast forwarding are the following:
1. Routing table updates trigger the creation or destruction of
label bindings.
2. The label bindings are advertised using a dedicated Label
Distribution Protocol (LDP). This happens before any data is
received on the corresponding ports, so all packets are
forwarded at layer 2.
3. All packets whose destination is NET2, are aggregated
in R3: they are forwarded to R2 on interface B using a
single common label l2.
Now let us suppose a multicast group G has members in NET2 and NET4
and a source S1 in NET1. According to PIM-DM, R3 receives the packets
to G on interface A and forwards them on its outgoing interfaces B, C
and D. R3 creates the following multicast routing table entry:
(S1, G) iif={A} oif={B, C, D} prune={}
Packets are then forwarded at layer 3 since no label has been assigned
to (S1, G) so far. Subsequently, a PRUNE message is received on
interface C (since NET3 has no G member) and the multicast routing
table entry is modified as:
(S1, G) iif={A} oif={B, D} prune={C}
The interface C is added again to the outgoing interface list after
the Prune timer expires. Note the following points:
1. There is no routing entry at the LSR R3 corresponding to
(S1, G) prior to arrival of data from S1.
2. It is not possible to aggregate multicast routing entries in
Dense Mode. Suppose a source S2 in NET2 starts sending traffic
to G. R3 creates a new multicast routing table entry:
(S2, G) iif={B} oif={A, C, D} prune={}
which is then modified after receiving PRUNE messages from
interfaces A and C to:
(S2, G) iif={B} oif={D} prune={A, C}
The (S2, G) entry cannot be aggregated with the entry for
(S1, G), since the incoming and outgoing interfaces are
different.
3. A given routing table entry changes dynamically (even without
any change in the unicast routing/network topology) due to
periodic pruning of branches and/or arrival of new members.
4. All packets are forwarded at L3 until incoming and outgoing
labels are assigned to the (S1, G) entry.
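Point (2) can be illustrated with a minimal sketch of R3's dense-mode
state in Figure 1. The MrtEntry class and the can_aggregate() test are
hypothetical names introduced only for this illustration; they are not
part of PIM-DM or of any MPLS specification.

```python
from dataclasses import dataclass, field


@dataclass
class MrtEntry:
    """One dense-mode multicast routing table entry (hypothetical layout)."""
    source: str
    group: str
    iif: str
    oif: set = field(default_factory=set)
    pruned: set = field(default_factory=set)

    def prune(self, interface):
        # A Prune moves the interface from the oif list to the prune list.
        self.oif.discard(interface)
        self.pruned.add(interface)


def can_aggregate(a, b):
    # Assumption: two entries could share a label only if they forward
    # identically, i.e. same incoming and same outgoing interfaces.
    return a.iif == b.iif and a.oif == b.oif


# State at R3 after the Prunes described in the text.
s1 = MrtEntry("S1", "G", iif="A", oif={"B", "C", "D"})
s1.prune("C")
s2 = MrtEntry("S2", "G", iif="B", oif={"A", "C", "D"})
s2.prune("A")
s2.prune("C")
```

Under these assumptions the two entries for the same group G differ in
both iif and oif, so can_aggregate(s1, s2) is false and each entry
needs its own labels.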
Points (1) and (3) lead us to conclude that label assignment for
dense-mode traffic needs to be hop-by-hop traffic-driven. Furthermore,
from (2), each (S, G) entry needs to be assigned separate incoming and
outgoing labels. When the first packet from source S to destination G
is received by an LSR, multicast IP forwarding carries out the RPF
check and creates an (S, G) entry in the multicast routing table. Once
this (S, G) entry exists, the procedure to bind a label to the (S, G)
FEC is activated.
Until such labels are assigned, all packets are forwarded at L3.
The label bindings therefore need to be made as quickly as possible
after the routing entry is created, to keep L3 processing to a
minimum.
Arrival of a PIM Graft (S, G) message requires adding an outgoing
branch to the existing LSP.
From (3), labels need to be withdrawn in two cases: on reception
and/or emission of a Prune (S, G), and on expiration of the activity
timer.
3. Sparse Mode Multicast: Problem definition
3.1 Existing proposals for PIM-SM in MPLS
[FAR1] suggests a piggy-backing methodology to assign and distribute
labels for multicast traffic for sparse-mode trees. The idea is that
PIM Join messages are augmented to carry labels. Besides requiring
changes to existing PIM message formats, [OOMS1] lists other drawbacks
of this piggybacking approach. As we discuss below, it is also not
possible to assign a single label, common to all sources, for
sparse-mode shared trees, and thus the piggybacking approach is not
adequate for this case. [OOMS2] recognizes the (*, G)/(S, G)
coexistence problem but only proposes to have recourse to IP L3
forwarding.
3.2 Shared tree/source tree co-existence problem
PIM-SM allows receivers to join a shared tree (*,G) for the group G
with a common core/Rendezvous Point (RP) as the root, or a
shortest-path (S, G) tree rooted at a specific source S. A receiver
may thus receive traffic for a given source S through the (S, G) tree,
and for other sources, through the shared tree. Note also that, some
members may receive the source traffic from the shared (*, G) tree
while other members may receive it from the (S, G) tree. Consequently,
the source Designated Router needs to forward the source traffic on
both the (*, G) and (S, G) trees.
In an MPLS context, a problem arises when a node on the shared (*, G)
tree needs to forward data differently depending on the source, for
instance, because some members have joined a source-specific
shortest-path tree.
Let us consider the case of Figure 2. The node R2 is not interested in
receiving S1's traffic from the (*, G) tree, since it has joined the
source-specific tree for S1. It sends a Prune(S1, G) message to R1 to
prevent S1's traffic from being forwarded on link 1. As a result, R1
forwards traffic from S1 on interface 3 only, while traffic from S2 is
forwarded on interfaces 1 and 3. To accomplish the same forwarding
behaviour at L2 within an MPLS network, a common label cannot be
assigned to all traffic on R1's incoming link 2; the traffic from S1
on R1's interface 2 must be assigned a distinct label from that of S2.
R2 -------> Join(S1,G) -------> S1
\ /
| \ 1 /
| \ /
| +----+ 2 +----+
+---> | R1 |--------------------| RP |
Prune(S1,G) +----+ ------> +----+
/ Join(*,G) \
/ 3 \
/ \
R3 S2
at R1: (*, G) iif={2} oif={1, 3}
(S1, G) iif={2} oif={3}
Figure 2
It is easy to see that such selective forwarding may be necessary at
different points of the shared tree depending on the source of the
traffic. For PIM-SM, a naive topology-driven procedure to assign
labels leads to incorrect data delivery.
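The coexistence problem at R1 in Figure 2 can be sketched as follows.
The table layout and the function names are assumptions made for this
illustration only. The sketch shows that a label switch, which sees
only the label and not the source address, cannot reproduce R1's L3
behaviour with one common label.

```python
# MRT at R1 from Figure 2: the (S1, G) entry overrides the (*, G) entry.
MRT = {
    ("*", "G"): {"iif": 2, "oif": {1, 3}},
    ("S1", "G"): {"iif": 2, "oif": {3}},
}


def outgoing_interfaces(source, group):
    """L3 lookup: prefer the (S, G) entry, fall back to (*, G)."""
    entry = MRT.get((source, group)) or MRT.get(("*", group))
    return set(entry["oif"]) if entry else set()


def label_switch(label, label_table):
    """L2 lookup: only the label is visible, never the source."""
    return label_table[label]


# One common label on link 2 can map to only one set of interfaces.
common = {"Lc": outgoing_interfaces("S2", "G")}

# Per-source labels let each source keep its own set of interfaces.
per_source = {
    "L_S1": outgoing_interfaces("S1", "G"),
    "L_S2": outgoing_interfaces("S2", "G"),
}
```

With the common label, S1's traffic would also be copied to link 1,
which the Prune(S1, G) was meant to prevent.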
3.3 Per-source label assignment
PIM-SM shortest path tree support can be equivalent to PIM-DM tree
support: a label is assigned in a hop-by-hop traffic-driven way for
each (S, G) entry.
To solve the (S, G)/(*, G) coexistence problem without resorting to IP
forwarding, source specific labels are to be assigned on intermediate
nodes of the shared tree. Multiple labels will be associated with one
(*, G) entry, corresponding to one label per active source. In order
to unambiguously distinguish a per-source (*, G) label binding from a
(S, G) binding, we propose to introduce a (G, S) FEC representing IP
packets from source S forwarded on the (*, G) tree. The other obvious
FEC, the (S, G) FEC represents IP packets from source S forwarded on
the (S, G) tree.
PIM-SM could then be supported using per-source label assignment. More
details are given in section 6.
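One possible way to represent the two FECs is sketched below. The
Tree, Fec and classify names are hypothetical, and the rule of picking
the FEC from the matching MRT entry is our assumption about how an
implementation might distinguish the two cases; this document does not
mandate it.

```python
from enum import Enum
from typing import NamedTuple


class Tree(Enum):
    SOURCE = "(S, G)"  # traffic from S forwarded on the (S, G) tree
    SHARED = "(G, S)"  # traffic from S forwarded on the (*, G) tree


class Fec(NamedTuple):
    tree: Tree
    source: str
    group: str


def classify(source, group, mrt):
    """Choose the FEC from the MRT entry that matches the packet."""
    if (source, group) in mrt:
        return Fec(Tree.SOURCE, source, group)
    if ("*", group) in mrt:
        return Fec(Tree.SHARED, source, group)
    return None  # no multicast state for this packet


# Keys of a hypothetical MRT holding one shared and one source entry.
mrt = {("*", "G"), ("S1", "G")}
fec_s1 = classify("S1", "G", mrt)
fec_s2 = classify("S2", "G", mrt)
```

Both FECs carry the source, so two sources on the same (*, G) entry
always map to two distinct bindings.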
4. Building block for proposed MPLS multicast
4.1 Assumptions
Our proposal is based on the following basic assumptions:
1. There is a label table associated with each interface of a
multicast-capable LSR.
2. On a multi-access link, multicast-capable LSRs must use
disjoint label spaces that are used for binding labels to
FECs.
An exact mechanism to achieve (2) through extensions to LDP is
deferred to a later draft. [FAR2] describes a solution for (2);
however, it augments PIM-Hello messages to achieve disjoint multicast
labels across PIM-capable LSRs on a multi-access link. [FAR2] proposes
label allocation from the downstream node; however, such a partitioned
label space can be used for upstream label allocation as well.
In the rest of the document, we use the term "Unused Label" or UL to
denote a free multicast label, i.e. a label within the multicast range
with no current binding.
We propose two types of label bindings: the first uses upstream
allocation with an "implicit" distribution, the second uses downstream
allocation based on explicit LDP-like control messages.
For both the approaches, a label binding is initiated when a FEC
is detected in the multicast flows.
4.2 Upstream "implicit" label distribution
4.2.1 Label assignment
This proposal imposes an additional requirement:
3. When a multicast-capable LSR receives a packet with a label that
has no current binding on the incoming interface, L3 processing
is invoked.
When a multicast-capable LSR detects a new multicast FEC, it invokes
L3 routing to determine the outgoing interfaces.
For each outgoing interface, it selects a UL and binds the UL to the
corresponding multicast tree. It then forwards the packet downstream.
A downstream LSR receives the packet with the UL, invokes L3 routing
(since the incoming label has no binding) to determine the outgoing
interfaces and again selects UL for each of those interfaces. An entry
is added to the label table consisting of the incoming interface/label
and outgoing interfaces/label list. Subsequent traffic on the
corresponding multicast tree is label-switched at L2.
In Figure 3, consider a new multicast flow arriving on interface 1.
The UL selected by the upstream LSR is A, and reception of the packet
invokes L3 processing. As a result of L3 processing, interfaces 2, 3
and 4 are selected as the outgoing interfaces. ULs X, Y and Z are then
picked for the interfaces 2, 3 and 4 respectively, and a copy of the
packet is forwarded on each of those interfaces with the corresponding
labels. An entry is added to the label table:
< input = (1, A), output = {(2,X), (3,Y), (4,Z)} >
Subsequent packets that arrive at interface 1 with label A are
switched at L2, without invoking L3 processing. Thus, only the first
packet undergoes L3 processing.
L3 UL=X
Processing ^ /
^ \ / /(2)
/ \-> / /
|--/---------|/ ------> UL=Y
___(1)___| / R (LSR) |_________________
|/-----------|\ (3)
---------/ \
UL=A \ \ (4)
\ \
UL=Z V \
Figure 3
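The walk-through of Figure 3 can be sketched in a few lines. The Lsr
class, its attributes and the static routes dictionary (a stand-in for
real L3 multicast routing) are assumptions made for illustration.

```python
class Lsr:
    """Hypothetical LSR using upstream implicit label distribution."""

    def __init__(self, routes, free_labels):
        self.routes = routes        # FEC -> outgoing interfaces (L3 stand-in)
        self.free = free_labels     # interface -> list of unused labels
        self.label_table = {}       # (iif, label) -> [(oif, label), ...]
        self.l3_lookups = 0

    def receive(self, iif, label, fec):
        key = (iif, label)
        if key not in self.label_table:
            # The incoming label has no binding: invoke L3 processing
            # and pick one UL per outgoing interface.
            self.l3_lookups += 1
            self.label_table[key] = [
                (oif, self.free[oif].pop(0)) for oif in self.routes[fec]
            ]
        # Bound label: forward at L2 using the stored entry.
        return self.label_table[key]


r = Lsr(routes={("S", "G"): [2, 3, 4]},
        free_labels={2: ["X"], 3: ["Y"], 4: ["Z"]})
first = r.receive(1, "A", ("S", "G"))
second = r.receive(1, "A", ("S", "G"))
```

In this sketch both calls return the entry < (2,X), (3,Y), (4,Z) >,
and only the first call increments the L3 lookup counter.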
Note that this scheme works well for both point-to-point and
multi-access interfaces. A partitioned label space between multicast
and unicast traffic avoids a situation where a label l is allocated by
a downstream LSRd for unicast traffic from LSRu1, and is then
subsequently allocated by another LSRu2 for multicast traffic
downstream.
LSRu1 LSRu2
| ^ l / |
| |l <-/ |
-------\-------------
\ |
\ |
LSRd
Figure 4
A disjoint label space amongst multicast LSRs ensures that no two LSRs
(e.g. LSRu1 and LSRu2) assign the same label on a common multi-access
link. Moreover, since there can only be one forwarder on the link for a
given (S, G), a per-source upstream label binding requires no further
coordination among multicast LSRs on a common link.
4.2.2 Label withdrawal
Once a label has been assigned on a LSR's outgoing interface, there
needs to be a mechanism to reclaim that label. To prevent traffic from
being switched along the wrong LSP, it is sufficient that the
following relation holds:
(relation 1)
if "L" is a UL on an outgoing interface of LSRu then
"L" must also be an UL on the corresponding incoming interface
of any LSRd on the same link as LSRu.
Note that traffic is not forwarded incorrectly at L2 if L is a UL on
LSRd's incoming interface, but not a UL on LSRu's outgoing interface.
In this case, any traffic that LSRu sends with label L invokes L3
processing at LSRd.
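Relation 1 amounts to a subset test between sets of unused labels. The
following sketch uses a hypothetical helper, relation_1_holds(), to
check it for one upstream LSR and the downstream LSRs on the same
link.

```python
def relation_1_holds(unused_out_u, unused_in_per_lsrd):
    """True if every UL on LSRu's outgoing interface is also a UL on
    the incoming interface of each LSRd on the same link."""
    return all(unused_out_u <= unused_in for unused_in in unused_in_per_lsrd)


# Safe: LSRd has already freed L2, LSRu has not yet done so.
safe = relation_1_holds({"L1"}, [{"L1", "L2"}])

# Unsafe: LSRu considers L2 free while LSRd still has a binding for it,
# so LSRu could reuse L2 and LSRd would switch it along the old LSP.
unsafe = relation_1_holds({"L1", "L2"}, [{"L1"}])
```

The safe case corresponds to the situation in the note above: the
label is merely unused for longer on the upstream side.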
In our multicast solution for MPLS, we need to ensure that a label is
reclaimed as a UL on the downstream LSR(s) first and only then
on the upstream LSR. When the label withdrawal is triggered by a
routing protocol control message, such as a PIM Prune, the L2 label
can be immediately reclaimed without additional coordination, since
the control message is sent from the downstream to the upstream node.
In the case where the label binding for a FEC is broken due to
expiration of the activity timer at a LSR, an explicit control message
needs to be sent to revoke the label binding. On a point-to-point
link, we propose to send an LDP Label Release message from the
downstream to the upstream. Alternatively, the upstream LSR may send a
Label Withdraw message to the downstream node, followed by a Label
Release response. In case of a multi-access link, a similar
functionality needs to be supported. However, LDP as defined
currently, operates over a point-to-point (TCP) reliable connection
between adjacent LSRs. An analogous mechanism for the multi-party
interactions (e.g. Label Release/Withdraw) over a multi-access link is
to be discussed in a subsequent draft.
4.3 Explicit label allocation
An alternative to the above mechanism is to use explicit control
messages to bind a label to a FEC. On point-to-point links, we propose
to use the Label Distribution Protocol [LDP] in downstream label
distribution mode, along with new definitions for multicast FEC
elements. This approach is useful if requirement (3) above cannot be
met by a LSR. In that case, the traffic for a new FEC is first
forwarded on a default routed path (e.g. (VPI=0,VCI=32) for LDP over
ATM VC).
To members <-- LSRd1
\
LSRu --- <---- from Source
/
To members <-- LSRd2
Figure 5
As shown in Figure 5, LSRu will initially receive packets (on a
default, routed path) that belong to a FEC for which it has no label
binding. Two options are then possible:
-- LSRu detects a new multicast FEC and sends a Label
Request message to all the next hops (for the MRT entry corresponding
to the FEC). Each downstream LSR selects a free multicast label for
its corresponding incoming port and eventually sends a Label Mapping
message for the FEC to LSRu.
-- No Label Request is sent by LSRu. Instead, arrival of packets at
LSRd1 and LSRd2 on the routed path triggers an unsolicited Label
Mapping message to LSRu.
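The two options differ only in who initiates the exchange. The sketch
below models the messages as plain tuples; the tuple layout and the
function names are assumptions for illustration and do not reflect
the LDP wire format.

```python
def on_demand(fec, next_hops, free_labels):
    """Option 1: LSRu requests a label, each LSRd answers with a mapping."""
    messages, bindings = [], {}
    for lsrd in next_hops:
        messages.append(("LabelRequest", "LSRu", lsrd, fec))
        label = free_labels[lsrd].pop(0)  # LSRd picks a free multicast label
        messages.append(("LabelMapping", lsrd, "LSRu", fec, label))
        bindings[lsrd] = label
    return messages, bindings


def unsolicited(fec, next_hops, free_labels):
    """Option 2: packets on the routed path trigger the mapping at LSRd."""
    messages, bindings = [], {}
    for lsrd in next_hops:
        label = free_labels[lsrd].pop(0)
        messages.append(("LabelMapping", lsrd, "LSRu", fec, label))
        bindings[lsrd] = label
    return messages, bindings


hops = ["LSRd1", "LSRd2"]
msgs1, b1 = on_demand(("S", "G"), hops, {"LSRd1": [100], "LSRd2": [200]})
msgs2, b2 = unsolicited(("S", "G"), hops, {"LSRd1": [100], "LSRd2": [200]})
```

Both options end with the same bindings at LSRu for the topology of
Figure 5; the unsolicited variant simply omits the Label Request
messages.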
Besides traffic-driven multicast FEC detection, a LSR initiates a
label binding procedure, when the oif list of a MRT entry is modified,
e.g. arrival of PIM-DM Graft messages and PIM-SM Join(*, G).
On point-to-point links, the above LDP procedures can be used without
additional protocol support. Multicast FEC elements and LDP
initialization session multicast extension are defined in Appendix A
and B.
As noted in the previous section, LDP messages are currently not
defined for multi-party interactions. In this document, we assume
that such a mechanism exists for assigning and withdrawing multicast
labels on a multi-access link, without specifying the exact
mechanism. Such a multicast analogue for LDP, e.g. periodic
link-local multicast of label bindings, will be described in a
subsequent draft.
4.4 Comparison of the distribution procedures
For multicast traffic, upstream label allocation is simpler since
there can only be one upstream node (per link), and therefore, there
can be only one entity that binds the label. In downstream allocation
schemes, there may be multiple receivers (on a multi-access link) and
one of them needs to be chosen as the label allocator. Additionally
if the original allocator of a label (on a multi-access link) leaves
the multicast tree, either the label binding for the tree needs to be
changed and/or another LSR needs to be elected as the label allocator.
For traffic-driven approaches, upstream allocation is preferable since
it allows the label-binding (and consequently L2 switching) to happen
earlier than for downstream allocation.
In general, the advantage of an implicit coordination is that only the
first packet carrying an UL requires L3 processing. In contrast, an
explicit control message to propagate labels incurs a delay between
the arrival of a traffic stream and label binding. During this
interval, each incoming packet is processed at L3 and requires a L3
copy-and-forward operation for each outgoing branch of the multicast
tree.
In the next sections, we describe in more detail our proposed
solution to support PIM-DM and PIM-SM in MPLS. Although we will focus
on the upstream label distribution procedure, the solutions are
equally applicable with downstream-on-demand LDP-based label
distribution, assuming that the necessary multicast extensions will
be defined for LDP at a later time.
5. Proposed Solution for PIM-DM in MPLS
5.1 Basic operations
In PIM-DM, there is a one-to-one mapping between a multicast routing
entry and a LSP, so that the only FEC to be considered is the (S, G)
FEC. We use the building block described in section 4 to propose a
solution for PIM-DM as follows.
When a multicast packet with source S and destination G is received at
an incoming interface, the UL associated with the packet triggers
PIM-DM processing, e.g. RPF check, followed by selecting the outgoing
entries. A (S, G) routing table entry is installed. An UL is selected
for each outgoing link, and the packet is forwarded onto the next hop
using the selected labels. A corresponding LSP <(iif, label) set of
(oif, label)> is created.
Following the first packet, which is processed at L3 and triggers the
LSP setup, all other packets are forwarded at L2.
In all the solutions, all PIM-DM control messages, Prune and Graft,
can be sent on a single hop LSP between adjacent LSRs.
5.2 Label Binding triggered by PIM-Graft
Arrival of a Graft(S, G) message requires adding an outgoing branch to
the existing LSP. For upstream implicit label allocation, this means
selecting a UL on the link on which the Graft(S, G) was received.
5.3 Label Reclamation triggered by PIM-Prune
Subsequent to setting up the LSP, arrival of a PIM Prune message
removes the corresponding outgoing branch of the LSP, i.e. the
previously assigned label is now marked as UL. Suppose LSR1 is
upstream to LSR2 and the label assigned for a (S, G) FEC on the
LSR1--LSR2 link is L1. Once Layer 3 processing at LSR2 sends the Prune to
LSR1, LSR2 marks the incoming label L1 as a UL on the LSR1--LSR2 link
(so that any subsequent assignment of L1 by LSR1 to a new FEC will
trigger L3 processing at LSR2). LSR1 marks L1 as a UL on receiving the
Prune, and modifies the LSP associated with the (S, G) entry. LSR1 is
now free to assign L1 to a new FEC.
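The ordering of the two reclamation steps is what matters here. A
minimal sketch follows, in which the binding tables are plain
dictionaries and the prune() helper is a hypothetical name.

```python
def prune(fec, label, upstream_out, downstream_in, log):
    """Reclaim `label` on one link when the downstream LSR prunes `fec`."""
    if downstream_in.get(label) != fec:
        raise ValueError("label is not bound to this FEC downstream")
    # Step 1: LSR2 (downstream) sends the Prune and frees its incoming label.
    del downstream_in[label]
    log.append("LSR2 marks %s as UL" % label)
    # Step 2: LSR1 (upstream) receives the Prune and frees its outgoing label.
    del upstream_out[label]
    log.append("LSR1 marks %s as UL" % label)


lsr1_out = {"L1": ("S", "G")}  # bindings on LSR1's outgoing interface
lsr2_in = {"L1": ("S", "G")}   # bindings on LSR2's incoming interface
log = []
prune(("S", "G"), "L1", lsr1_out, lsr2_in, log)
```

Because the downstream side is always freed first, there is no moment
at which L1 is unused at LSR1 but still bound at LSR2, so relation 1
of section 4.2.2 holds throughout.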
5.4 Label Reclamation triggered by PIM inactivity timer
In PIM-DM, the (S, G) forwarding state is associated with an
inactivity timer ([PIM-DM]), which is used to remove inactive (S, G)
entries, i.e. flows with no traffic for a specified amount of time T.
In a L3 router, this is achieved by resetting the timer whenever a
packet is forwarded using the (S, G) entry.
When forwarding traffic in L2 mode, no traffic will be observed at L3
and therefore, we propose that the inactivity timer is reset based on
forwarding activity on the LSP. If no activity is observed within T,
both the LSP and the multicast routing entry should be removed.
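A minimal sketch of an inactivity timer driven by L2 forwarding
activity is given below. The class name, the explicit `now` argument
(used instead of a real clock so that the example is deterministic)
and the value T = 10 are all assumptions for illustration.

```python
class LspActivityTimer:
    """Tracks the last forwarding time of each LSP (hypothetical)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # LSP key -> time of the last forwarded packet

    def packet_forwarded(self, lsp, now):
        # Called from the L2 forwarding path; resets the timer.
        self.last_seen[lsp] = now

    def expire(self, now):
        """Return and forget the LSPs idle for at least `timeout`."""
        idle = [lsp for lsp, seen in self.last_seen.items()
                if now - seen >= self.timeout]
        for lsp in idle:
            del self.last_seen[lsp]
        return idle


timer = LspActivityTimer(timeout=10)
timer.packet_forwarded(("A", "l1"), now=0)
timer.packet_forwarded(("A", "l5"), now=8)
expired = timer.expire(now=12)
```

In practice the per-packet call would more likely be replaced by a
periodic poll of hardware counters, but the expiry logic would be the
same.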
To ensure that a label is first reclaimed as UL on the incoming
interface of a LSRd prior to that of an outgoing interface of a LSRu
on the same link, LSRu will send an LDP Label Withdraw message (see
section 4.2.2).
5.5 Example
Let us return to the example of section 2. A multicast group G
has members in NET2 and NET4 and a source S1 in NET1 sends traffic to
G.
[NET1] [NET2]
| |
R1 R2
\ /
A \ / B
\ /
R3----[NET3]
/ C
/ D
/
R4
|
[NET4]
R3 receives the first packet to G on interface A with an unused label
l1. This unused label has been assigned by the upstream router R1. The
packet has to be forwarded on the outgoing interfaces B, C and D. R3
creates the following multicast routing table entry:
(S1, G) iif={A} oif={B, C, D} prune={}
At the same time, R3 chooses 3 unused labels, one for each outgoing
interface and stores the following bindings:
+------------+-----+----------------+----------------------+
|FEC Element | IIF | Incoming Label | OIF - Outgoing label |
+------------+-----+----------------+----------------------+
|            |     |                | B     l2             |
| (S1, G)    | A   | l1             | C     l3             |
|            |     |                | D     l4             |
+------------+-----+----------------+----------------------+
Figure 6: R3's DM bindings after first packet arrival
Subsequently, a PRUNE message is received on interface C, since NET3
has no member of G and the multicast routing table entry is modified
as:
(S1, G) iif={A} oif={B, D} prune={C}
while the label binding is now:
+------------+-----+----------------+----------------------+
|FEC Element | IIF | Incoming Label | OIF - Outgoing label |
+------------+-----+----------------+----------------------+
|            |     |                | B     l2             |
| (S1, G)    | A   | l1             |                      |
|            |     |                | D     l4             |
+------------+-----+----------------+----------------------+
Figure 7: R3's DM bindings after Prune arrival
The label l3 on the interface C is again in the pool of unused labels.
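The example can be replayed with a short sketch. The data layout and
the on_prune() helper are assumptions for illustration; the labels and
interfaces are those of Figures 6 and 7.

```python
# Per-interface pools of unused labels at R3.
unused = {"B": ["l2"], "C": ["l3"], "D": ["l4"]}
mrt = {"iif": "A", "oif": ["B", "C", "D"], "prune": []}

# The first packet of (S1, G) arrives on A with the upstream label l1:
# R3 binds one UL per outgoing interface.
lsp = {
    "in": ("A", "l1"),
    "out": {oif: unused[oif].pop(0) for oif in mrt["oif"]},
}
after_first_packet = dict(lsp["out"])


def on_prune(interface):
    """Update the MRT entry and return the branch label to the UL pool."""
    mrt["oif"].remove(interface)
    mrt["prune"].append(interface)
    unused[interface].append(lsp["out"].pop(interface))


on_prune("C")
after_prune = dict(lsp["out"])
```

Here after_first_packet corresponds to the bindings of Figure 6 and
after_prune to those of Figure 7, with l3 back in the pool for
interface C.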
6. Proposed solution for PIM-SM in MPLS
Unlike PIM-DM, an entry in the MRT already exists in a sparse mode
(SM) tree prior to arrival of data packets.
SM trees are either source-specific shortest-path trees (SPT) or
shared trees (RPT). The MRT entries for a SM source-tree are similar
to that of a dense-mode tree: both are (S, G) entries. However, while
DM entries are installed on arrival of the first packet, SM entries
are established and refreshed via periodic PIM-Join messages towards
the sender. For a SM shared-tree, a single (*, G) entry in MRT is used
to forward traffic from multiple sources.
6.1 Source-specific/shortest-path tree
Since MRT entries for a source-specific tree are (S, G) entries, it is
natural to do a one-to-one mapping of the L3 tree to a LSP. [FAR1]
suggests piggybacking the label on PIM-Join messages. This requires
modifying L3 protocol messages.
The solution that we propose for label assignment/binding is the same
as that for PIM-DM, i.e. (S, G) routing entry label assignment in a
data-driven fashion, using upstream implicit distribution. Expiration
of the L3 forwarding state (e.g. non-arrival of Join(S, G) messages)
leads either to removal of an outgoing branch from the (S, G) entry (and
the corresponding label of the LSP) or to the removal of both the
MRT entry and the LSP (if it is the last branch to be deleted).
Note that in this scheme, the downstream LSR marks an incoming label
as UL before the same label is marked as UL on the outgoing interface
of the upstream LSR. Thus, the label is correctly reclaimed (section
4.2.2).
Both solutions use one label for every branch; however, in our
proposed solution, the PIM protocol messages are unchanged and no
labels are assigned till the source becomes active.
6.2 Shared tree
The need for assigning source-specific labels on the intermediate
nodes of a shared tree was described in section 3.2. Our proposed
solution is similar to that for PIM-DM and SM source trees, as
follows.
When the first multicast data packet from source S (via the core/RP)
is received at an incoming interface of a LSR on the shared tree, the
UL associated with the packet triggers L3 routing. If a matching MRT
entry exists (either a (*, G) or a (S, G) entry), then UL for each
outgoing interface of the matching entry, is selected and the packet
is forwarded onto the next hop(s) using the selected label(s). A
corresponding LSP <(iif, label) set of (oif, label)> is created.
If the matching MRT entry was a (S, G) entry, then, as with a
source-specific PIM-DM tree, there can be at most one LSP associated
with the entry.
If the matching MRT entry was a (*, G) entry, then multiple LSPs may
be associated with each entry, corresponding to one LSP per active
source. For each active source S, the association between the MRT
entry and LSP should be explicitly recorded at the LSR. It is possible
that at a later time, the arrival of a PIM-Prune(S, G) message
triggers creation of a (S, G) entry (e.g. when a downstream node of
the shared tree starts to receive data from the source-specific tree
for S); the oif set for this newly created (S, G) entry will equal
that of the (*, G) entry minus the interface on which the Prune
was received. This should trigger modification of the LSP, i.e. the
label associated with the outgoing interface on which the Prune was
received is now marked as a UL.
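The per-source LSPs hanging off one (*, G) entry, and the effect of a
Prune(S, G), can be sketched as follows for R1 of Figure 2. The label
names (u1, a1, c1, ...) and both helper functions are hypothetical.

```python
mrt = {("*", "G"): {"iif": 2, "oif": {1, 3}}}
lsps = {}  # (source, group) -> {"in": (iif, label), "out": {oif: label}}
next_label = {1: iter(["a1", "a2"]), 3: iter(["c1", "c2"])}


def first_packet(source, group, in_label):
    """Create the LSP for one active source from the matching MRT entry."""
    entry = mrt.get((source, group)) or mrt[("*", group)]
    out = {oif: next(next_label[oif]) for oif in sorted(entry["oif"])}
    lsps[(source, group)] = {"in": (entry["iif"], in_label), "out": out}


def prune_sg(source, group, interface):
    """Create the (S, G) entry and free the label of the pruned branch."""
    shared = mrt[("*", group)]
    mrt[(source, group)] = {"iif": shared["iif"],
                            "oif": shared["oif"] - {interface}}
    lsp = lsps.get((source, group))
    if lsp is None:
        return None
    return lsp["out"].pop(interface, None)  # this label becomes a UL


first_packet("S1", "G", "u1")
first_packet("S2", "G", "u2")
freed = prune_sg("S1", "G", 1)
```

After the Prune, only the LSP for S1 loses its branch on interface 1;
the LSP for S2, which still follows the (*, G) entry, is unchanged.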
PIM-SM allows a sender to transmit packets either as encapsulated
messages (PIM-Register) to the RP, or as native multicast (which
typically happens when the RP joins the source specific tree). In the
former case, end-to-end LSP cannot be created since the LSP between
the source and the RP may have been setup using labels for an
aggregate (unicast) route; additionally, the data packets need to be
decapsulated at L3. In the latter case, i.e. when the RP receives
native multicast packets, end-to-end LSP can be created.
6.2.1 Label Reclamation
All LSPs associated with a (*, G) MRT entry are reclaimed when the L3
forwarding state times out, due to non-arrival of PIM-Join(*, G)
messages from all downstream nodes.
Arrival of a Prune(S, G) message triggers label reclamation for a LSP
associated with a (*, G) entry (which then becomes a (S, G) entry, see
section 6.2), or for a LSP associated with a (S, G) entry, if such a
LSP exists.
When there is (*, G) state at L3 and there are multiple active
sources, one LSP per source is set up. However, when a source S goes
inactive, there is no L3 mechanism that can act as a trigger to
reclaim its LSP. LSPs set up with PIM-DM face a similar situation,
but since PIM-DM maintains per-source timers at L3, LSP reclamation
is triggered by the expiration of those timers. In a PIM-SM
shared tree, there is no per-source timer maintained at L3 (as part of
the protocol definition; specific implementations may use a per-source
MFC entry).
In order to reclaim labels, we propose that the mapping between a MRT
entry and its multiple LSPs carry an activity timer per LSP, used in
the same fashion as the PIM-DM activity timers (see 5.4). Note,
however, that this is not a change to the L3 protocol (PIM-SM), but an
additional data structure maintained with the L3 to L2 mapping
entries.
As with PIM-DM, once the L2 LSP inactivity timer expires, the LSR must
send an LDP Label Withdraw to each downstream node of the LSP, as
described in section 4.2.2.
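A per-LSP activity timer of this kind could be kept alongside the L3 to L2 mapping roughly as sketched below. The class name LspActivityTable, the way activity is reported (a touch() call per observed packet or counter change) and the timeout value are all assumptions made for illustration; the draft does not specify them.

```python
class LspActivityTable:
    """Tracks the last activity time of each per-source LSP of a (*, G) entry."""

    def __init__(self, timeout):
        self.timeout = timeout     # inactivity period in seconds (hypothetical value)
        self.last_seen = {}        # (group, source) -> time of last observed activity

    def touch(self, group, source, now):
        # Called whenever traffic is observed on the LSP for (group, source).
        self.last_seen[(group, source)] = now

    def expire(self, now):
        """Return and forget every LSP idle for at least `timeout` seconds.

        The caller would then send an LDP Label Withdraw to each
        downstream node of the returned LSPs.
        """
        idle = [key for key, seen in self.last_seen.items()
                if now - seen >= self.timeout]
        for key in idle:
            del self.last_seen[key]
        return idle
```

Because the table is keyed by (group, source) and lives outside the MRT, the (*, G) entry itself and the PIM-SM state machine are left untouched.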
6.2.2 Example
Let us consider the case of section 3.2:
       R2 -------> Join(S1,G) -------> S1
         \                            /
      |   \ 1                        /
      |    \                        /
      |     +----+        2       +----+
      +---> | R1 |----------------| RP |
Prune(S1,G) +----+     ------>    +----+
           /          Join(*,G)         \
          / 3                            \
         /                                \
       R3                                  S2
Initially both R2 and R3 have joined the shared (*, G) tree through
R1, so that the LSP and MRT entries at R1 look like:
MRT:
(* , G) iif={2} oif={1, 3}
LSP:
(G, S1) in={2,L12} out={(1,L11);(3,L13)}
(G, S2) in={2,L22} out={(1,L21);(3,L23)}
Note that we have one LSP per source for the group G, bound to the
FECs (G, S1) and (G, S2) as defined in section 3.3. The incoming
labels L12 and L22 are distinct.
Now R2 joins the (S1, G) source-specific tree, and we suppose R1 is
not part of the (S1, G) tree. R2 eventually sends a Prune(S1, G)
message to R1. The MRT entries for G become:
MRT:
(* , G) iif={2} oif={1, 3}
(S1, G) iif={2} oif={3}
Moreover the Prune(S1, G) message leads to the removal of one
outgoing branch of the (G, S1) LSP:
LSP:
(G, S1) in={2,L12} out={(3,L13)}
(G, S2) in={2,L22} out={(1,L21);(3,L23)}
With this procedure, R2 is still receiving the traffic from S2 on an
LSP following the L3 shared tree, while the traffic from S1 follows a
shortest-path tree. R3 is not affected and keeps on receiving the
whole traffic to G on the (*, G) interface.
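The state change at R1 in this example can be replayed with a few lines of code. The sketch below is only illustrative: the function name apply_prune and the dictionary layout are assumptions, while the interface numbers and label names are those of the example.

```python
def apply_prune(mrt, lsps, source, group, prune_if):
    """Process a Prune(source, group) received on interface prune_if."""
    star = mrt[("*", group)]
    # The new (S, G) entry inherits the (*, G) entry minus the pruned interface.
    mrt[(source, group)] = {"iif": star["iif"],
                            "oif": star["oif"] - {prune_if}}
    # Drop the pruned branch from the per-source LSP; its label becomes a UL again.
    return lsps[(group, source)]["out"].pop(prune_if, None)


mrt = {("*", "G"): {"iif": 2, "oif": {1, 3}}}
lsps = {
    ("G", "S1"): {"in": (2, "L12"), "out": {1: "L11", 3: "L13"}},
    ("G", "S2"): {"in": (2, "L22"), "out": {1: "L21", 3: "L23"}},
}
freed = apply_prune(mrt, lsps, "S1", "G", prune_if=1)
```

After the call, the (G, S1) LSP keeps only the branch towards R3, the (G, S2) LSP is unchanged, and the label that was used towards R2 for S1 is available for reuse.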
7. Proposed solution for DVMRP and MOSPF in MPLS
DVMRP [DVMRP] is supported in the same fashion as PIM-DM: both are
flood-and-prune techniques which create a (S, G) entry in the MRT on
arrival of the first data packet. The difference between the two is
mainly at L3, e.g. DVMRP uses RIP specific information to disambiguate
equal-cost paths, while PIM-DM uses explicit PIM-Assert messages. Our
proposed solution for PIM-DM is equally applicable to setting up LSPs
when the L3 protocol is DVMRP.
MOSPF is not a flood-and-prune technique [MOSPF]. It uses link-state
advertisements to flood group membership to all routers within an area.
On arrival of the first data packet, a shortest path (S, G) tree
computation is triggered, and a (S, G) entry is installed in the MRT.
Again, our proposed solution for PIM-DM in MPLS is equally applicable
to setting up LSPs when the L3 protocol is MOSPF.
8. Effects of L3 topology change on multicast LSP
8.1 Loops
Multicast packet forwarding in a L3 router is preceded by a Reverse
Path Forwarding (RPF) check, i.e. a packet is forwarded only if it
arrives on the "right" interface, as specified in a matching routing
entry ((S,G) or (*, G)). Thus, L3 routing for multicast packets never
creates routing loops. In our solution, the L3 entry is mapped to a L2
forwarding path, and so, the LSP is also loop-free.
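As a reminder of what the RPF check amounts to, here is a minimal sketch. The function name rpf_accept and the dictionary-based MRT are assumptions used only for illustration.

```python
def rpf_accept(mrt, source, group, in_if):
    """Accept a packet only if it arrived on the iif of the matching entry."""
    # A (S, G) entry is preferred over the shared (*, G) entry.
    entry = mrt.get((source, group)) or mrt.get(("*", group))
    return entry is not None and entry["iif"] == in_if


mrt = {("*", "G"): {"iif": 2, "oif": {1, 3}}}
on_tree = rpf_accept(mrt, "S1", "G", in_if=2)     # arrives from the RP side
off_tree = rpf_accept(mrt, "S1", "G", in_if=1)    # arrives on a downstream interface
```

Since an LSP is only ever created from an entry that passed this check, the incoming (interface, label) pair of the LSP inherits the same property.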
8.2 Change of upstream router
Change in unicast routing entries at L3 may lead to a change in the
multicast routing tree at L3 as well. A given router R may thus be
associated with a new upstream router Ru of the multicast tree, and/or
a different set of downstream routers Rd. A change in a L3 MRT entry
triggers a corresponding change in an existing LSP as follows. If the
incoming interface of the L3 MRT entry changes, then the incoming
label of an existing LSP for that entry is marked UL (and a new LSP
will be set up mirroring the changed L3 MRT entry). If a downstream
interface is deleted from the MRT entry, then the corresponding L2
label is marked UL. (That label will also be reclaimed by the
downstream LSR when it notices that its upstream router/LSR has
changed.)
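The two rules above can be expressed as a small update function. This is a sketch under assumptions: the name on_mrt_change and the dictionary layout of MRT entries and LSPs are hypothetical, and setting up the replacement LSP is left to the normal first-packet procedure.

```python
def on_mrt_change(lsp, old_entry, new_entry):
    """Update an LSP after its L3 MRT entry changed.

    Returns the (interface, label) pairs that are marked unused (UL).
    """
    freed = []
    if new_entry["iif"] != old_entry["iif"]:
        # New upstream router: the incoming label no longer carries this flow.
        freed.append(lsp["in"])
        lsp["in"] = None
    for oif in sorted(old_entry["oif"] - new_entry["oif"]):
        # Downstream interface deleted: release the label used on it.
        label = lsp["out"].pop(oif, None)
        if label is not None:
            freed.append((oif, label))
    return freed


lsp = {"in": (2, "L12"), "out": {1: "L11", 3: "L13"}}
old = {"iif": 2, "oif": {1, 3}}
new = {"iif": 4, "oif": {3}}
freed = on_mrt_change(lsp, old, new)
```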
9. Conclusions
In this document, we first make the following observations for
existing multicast routing protocols (PIM, DVMRP, MOSPF):
1a. Dense-mode trees are created in a data-driven fashion; no L3
messages are used to create the tree.
1b. Dense-mode trees are created on a per-source basis, with no
known mechanisms to aggregate different (S, G) trees.
1c. Source-specific sparse-mode trees are set up via explicit L3
control messages, but like dense-mode trees, multiple (S, G)
trees cannot be aggregated.
1d. Nodes of a shared sparse-mode tree may forward traffic
selectively based on the traffic source.
From these observations, it appears that:
2a. The (S, G) structure of DM and source-specific SM trees at L3
favours a per-source label-assignment.
2b. Sparse-mode trees should also be mapped to a per-source LSP to
avoid L3 routing at intermediate nodes of the shared tree.
This led us to suggest a per-source LSP setup that is applicable to
all three trees. No changes are needed to any L3 routing protocol.
Further, at the level of individual nodes, we observe that:
3a. Data-driven creation of MRT entry at DM tree nodes can be
coupled with label assignment, thus avoiding L3 processing
beyond the first packet.
3b. PIM-Prune messages can be exploited to trigger immediate
reclamation of labels on the upstream and downstream nodes of
the pruned branch (DM or SM).
3c. Nodes on a shared SM tree need to perform data-driven
per-source label assignment since the sources are not known
a priori (see 1d and 2b).
As a result, we presented a basic building block, using the dual
notions of "unused labels" and "implicit binding", to achieve a
data-driven, per-source LSP that binds labels to FECs at the earliest
possible time, i.e. on arrival of the first packet.
10. Security Considerations
Security considerations are not addressed in this document.
11. Acknowledgments
Ajay Bakre, Kojiro Watanabe and D. Raychaudhuri at Princeton;
Sibylle Schaller, Jurgen Roethig and Heiner Stuettgen at Heidelberg.
12. References
[OOMS1] D.Ooms, W.Livens, B.Sales, M.Ramahlo, "Framework for IP
Multicast in MPLS", draft-ooms-mpls-multicast-00.txt,
August 1998.
[OOMS2] D.Ooms, W.Livens, B.Sales, "MPLS for PIM-SM",
draft-ooms-mpls-pimsm-00.txt, November 1998.
[PIM-SM] D.Estrin, D.Farinacci, A.Helmy, D.Thaler, S.Deering,
M.Handley, V.Jacobson, C.Liu, P.Sharma, L.Wei;
"Protocol Independent Multicast (PIM), Sparse Mode Protocol:
Specification", RFC 2362, June 1998.
[PIM-DM] S.Deering, D.Estrin, D.Farinacci, V.Jacobson, A.Helmy,
D.Meyer, L.Wei; "Protocol Independent Multicast Version 2
Dense Mode Specification", draft-ietf-pim-v2-dm-01.txt.
[DVMRP] T.Pusateri; "Distance Vector Multicast Routing Protocol",
draft-ietf-idmr-dvmrp-v3-07.
[MOSPF] J.Moy; "Multicast Extensions to OSPF",
draft-ietf-mospf-mospf-01.txt.
[IPSO1] A.Acharya, R.Dighe, F.Ansari; "IPSOFACTO: IP Switching
Over Fast ATM Cell Transport", draft-acharya-ipsw-fast-cell-00.txt.
[IPSO2] A.Acharya, R.Dighe, F.Ansari; "IP Switching Over Fast
ATM Cell Transport (IPSOFACTO) : Switching Multicast Flows",
Globecom 97.
[LDP] L.Andersson, P.Doolan, N.Feldman, A.Fredette, B.Thomas,
"LDP Specification", draft-ietf-mpls-ldp-03.txt,
January 1999.
[ARCH] E.Rosen, A.Viswanathan, R.Callon, "Multiprotocol Label
Switching Architecture", draft-ietf-mpls-arch-03.txt,
February 1999.
[FAR1] D.Farinacci, Y.Rekhter, "Multicast Label Binding and
Distribution using PIM", draft-farinacci-multicast-tagsw-01.txt,
November 1998.
[FAR2] D.Farinacci, "Partitioning Label Space among Multicast
Routers on a Common Subnet",
draft-farinacci-multicast-tag-part-01.txt,
November 1998.
13. Authors' Addresses
Arup Acharya
C&C Research Labs, NEC USA
4 Independence Way, Princeton, NJ, USA
Phone : 1 609 951 2992
Fax : 1 609 951 2499
E-mail: arup@ccrl.nj.nec.com
Frederic Griffoul
C&C Research Labs, NEC Europe Ltd.
Adenauerplatz 6
D-69115 Heidelberg, Germany
Phone : 49 6221 905 1120
Fax : 49 6221 905 1155
E-mail: griffoul@ccrle.nec.de
Furquan Ansari
C&C Research Labs, NEC USA
4 Independence Way, Princeton, NJ, USA
Phone : 1 609 951 2965
Fax : 1 609 951 2499
E-mail: furquan@ccrl.nj.nec.com
Appendix A: LDP Multicast FEC Definitions
In order to use LDP for multicast traffic, three new FEC elements need
to be defined:
- the source-group (S, G) element, type 0x04
- the group (*, G) element, type 0x05
- the group-source (G, S) element, type 0x06
The source-group element corresponds to a PIM-DM or PIM-SM
source-specific multicast routing entry. The group element corresponds
to a PIM-SM shared entry. The group-source FEC is required to support
per-source PIM-SM LSPs, as described in sections 3.3 and 6.2. Note that
the (G, S) FEC definition impacts the processing of the LDP
messages. For instance, when searching for the Next Hop of a (G, S)
FEC, the lookup must be performed only on the (*, G) entries. The
group FEC could be used in Label Withdraw/Release messages to break
label bindings related to a (*, G) routing entry that has been
removed.
Source-group element value encoding:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  SrcGrp (4)   |        Address Family         |    S/G Len    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Group Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Address Family:
Two-octet quantity containing a value from ADDRESS FAMILY NUMBERS in
RFC1700 that encodes the address family of both the source and
the group address.
S/G Len:
One-octet unsigned integer containing the length in bits of the
source address that follows. The group address length in bits is
also S/G Len, so that the length in bits of the FEC element after
the S/G Len field is 2 * S/G Len.
Source Address:
An address encoded according to the Address Family field.
Group Address:
An address encoded according to the Address Family field.
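To make the layout concrete, the following sketch builds a source-group element for IPv4 addresses. It is an illustration under assumptions rather than part of the proposal: the function name encode_src_grp is hypothetical, and only address family 1 (IPv4 in the ADDRESS FAMILY NUMBERS list) is handled.

```python
import ipaddress
import struct

SRC_GRP = 0x04    # source-group FEC element type proposed in this appendix
AF_IPV4 = 1       # IPv4 in ADDRESS FAMILY NUMBERS


def encode_src_grp(source, group):
    """Encode a (S, G) FEC element: type, address family, S/G Len, S, G."""
    src = ipaddress.IPv4Address(source).packed
    grp = ipaddress.IPv4Address(group).packed
    # "!BHB": one-octet type, two-octet address family, one-octet length in bits.
    header = struct.pack("!BHB", SRC_GRP, AF_IPV4, len(src) * 8)
    return header + src + grp


element = encode_src_grp("10.0.0.1", "224.1.1.1")
```

For IPv4 the S/G Len field is 32, so the element is 12 octets long: 4 octets of header followed by 2 * 32 bits of addresses.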
Group element value encoding:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Grp (5)    |        Address Family         |    Grp Len    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Group Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Address Family:
Two-octet quantity containing a value from ADDRESS FAMILY NUMBERS in
RFC1700 that encodes the address family of the group address.
Grp Len:
One-octet unsigned integer containing the length in bits of the
group address that follows.
Group Address:
An address encoded according to the Address Family field.
Group-source element value encoding:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  GrpSrc (6)   |        Address Family         |    G/S Len    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Group Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Address Family:
Two-octet quantity containing a value from ADDRESS FAMILY NUMBERS in
RFC1700 that encodes the address family of both the source and
the group address.
G/S Len:
One-octet unsigned integer containing the length in bits of the
source address that follows. The group address length in bits is
also G/S Len, so that the length in bits of the FEC element after
the G/S Len field is 2 * G/S Len.
Source Address:
An address encoded according to the Address Family field.
Group Address:
An address encoded according to the Address Family field.
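A receiver has to tell the three element types apart before it can look up the right routing entry. The sketch below decodes any of them, assuming IPv4 addresses only; the function name decode_fec_element and the returned tuple shape are hypothetical, and the address order follows the diagrams above (source first, then group).

```python
import ipaddress
import struct

FEC_TYPES = {0x04: "source-group", 0x05: "group", 0x06: "group-source"}


def decode_fec_element(data):
    """Decode one multicast FEC element into (type name, address dict)."""
    fec_type, family, length_bits = struct.unpack_from("!BHB", data, 0)
    if fec_type not in FEC_TYPES:
        raise ValueError("not a multicast FEC element")
    if family != 1 or length_bits != 32:
        raise ValueError("this sketch only handles IPv4 addresses")
    size = length_bits // 8
    # The group element carries one address, the other two carry two.
    n_addrs = 1 if fec_type == 0x05 else 2
    body = data[4:4 + size * n_addrs]
    if len(body) != size * n_addrs:
        raise ValueError("truncated FEC element")
    addrs = [str(ipaddress.IPv4Address(body[i * size:(i + 1) * size]))
             for i in range(n_addrs)]
    if n_addrs == 1:
        return FEC_TYPES[fec_type], {"group": addrs[0]}
    return FEC_TYPES[fec_type], {"source": addrs[0], "group": addrs[1]}
```

A "group-source" result tells the LSR to search only the (*, G) entries for the next hop, whereas a "source-group" result maps to a (S, G) entry.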
Appendix B: LDP Initialization Session Multicast Parameter
During the LDP session establishment procedure, Label Switching
Routers have to advertise their multicast label binding support and
the advertisement discipline. We propose to add a Multicast Session
Parameters TLV in the optional parameters list of the LDP
Initialization message (see [LDP]).
If the Multicast Session Parameters are not present in the
Initialization message received from LSR1 by LSR2, LSR2 will consider
LSR1 as non-multicast capable.
The encoding of the Multicast Session Parameters experimental TLV is:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|U|F| Mcast Sess Parms (0x3F01) |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| A |                         Reserved                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A = Multicast Label Advertisement Discipline
Indicates the type of Multicast Label advertisement.
00 means upstream "implicit" distribution
01 means downstream-on-demand LDP-based distribution.
If one LSR proposes upstream "implicit" and the other proposes
downstream-on-demand, a default discipline must be imposed.
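The TLV and the capability test can be sketched as follows. Several points are assumptions, since the draft leaves them open: the U and F bits default to 0 here, the Length field is taken to count the 4 octets of the value, and the names encode_mcast_params and peer_discipline are hypothetical. The choice of default discipline on a mismatch is not modelled.

```python
import struct

MCAST_SESS_PARMS = 0x3F01
IMPLICIT_UPSTREAM = 0b00
DOWNSTREAM_ON_DEMAND = 0b01


def encode_mcast_params(discipline, u_bit=0, f_bit=0):
    """Build the Multicast Session Parameters TLV (U/F values are assumed)."""
    first = (u_bit << 15) | (f_bit << 14) | MCAST_SESS_PARMS
    value = discipline << 30          # A is the top two bits, the rest is reserved
    return struct.pack("!HHI", first, 4, value)


def peer_discipline(optional_tlvs):
    """Return the peer's discipline, or None if it is not multicast capable."""
    offset = 0
    while offset + 4 <= len(optional_tlvs):
        first, length = struct.unpack_from("!HH", optional_tlvs, offset)
        end = offset + 4 + length
        if end > len(optional_tlvs):
            break                     # truncated TLV, stop parsing
        if first & 0x3FFF == MCAST_SESS_PARMS and length >= 4:
            (value,) = struct.unpack_from("!I", optional_tlvs, offset + 4)
            return value >> 30
        offset = end
    return None


tlv = encode_mcast_params(DOWNSTREAM_ON_DEMAND)
```

An LSR that finds no such TLV among the optional parameters of a received Initialization message (peer_discipline returns None) treats the peer as non-multicast capable.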