PIM Working Group Mark Handley
Internet Draft ACIRI
Expiration Date: September, 2000 Isidor Kouvelas
cisco Systems
Lorenzo Vicisano
cisco Systems
March 1, 2000
Bi-directional Protocol Independent Multicast
<draft-ietf-pim-bidir-00.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document discusses Bi-directional PIM, a variant of PIM Sparse-
Mode [1] that builds bi-directional shared trees connecting multicast
sources and receivers. Bi-directional trees are built using a fail-
safe Designated Forwarder (DF) election mechanism operating on each
link of a multicast topology. With the assistance of the DF, multi-
cast data is natively forwarded from sources to the Rendezvous-Point
without requiring source-specific state. The DF election takes place
at RP discovery time and provides a default route to the RP thus
eliminating the requirement for data-driven protocol events.
1 Introduction
Handley, Kouvelas, Vicisano [Page 1]
Internet Draft Bidir PIM February 2000
This document discusses Bi-directional PIM, a variant of PIM Sparse
Mode [1] that builds bi-directional shared trees connecting multicast
sources and receivers.
PIM Sparse-Mode (PIM-SM) version 1 and version 2 construct uni-
directional shared trees that are used to forward data from senders
to receivers of a multicast group. PIM-SM also allows the construc-
tion of source specific trees, but this capability is not related to
the proposal described in this document.
The shared tree for each multicast group is rooted at a multicast
router called the Rendezvous Point (RP). Different multicast group
ranges can use separate RPs within a PIM domain.
In unidirectional sparse-mode PIM, there are two possible methods for
distributing data packets on the shared tree. These differ in the way
packets are forwarded from a source to the RP:
o Initially when a source starts transmitting, it's first hop router
encapsulates data packets in special control messages (Registers)
which are unicast to the RP. After reaching the RP the packets are
decapsulated and distributed on the shared tree.
o A transition from the above distribution mode can be made at a
later stage. This is achieved by building source specific state on
all routers along the path between the source and the RP. This
state is then used to natively forward packets from that source.
Both these mechanisms suffer from problems. Encapsulation results in
significant processing, bandwidth and delay overheads. Forwarding
using source specific state has additional protocol and memory
requirements.
Bi-directional PIM dispenses with both encapsulation and source state
by allowing packets to be natively forwarded from a source to the RP
using shared tree state. For a complete discussion of the pros and
cons of Bi-directional PIM consult appendix C.
The ideas presented in this document are similar to those described
in [2]. The main difference between the two proposals is in the
method used to forward packets traveling upstream from a source to
the RP. In particular [2] uses an IP option (UMP option) on data
packets to assist with upstream forwarding. The UMP option identifies
the next hop router responsible for forwarding the packet upstream.
In contrast, this proposal does not alter data packets to embed con-
trol information. Instead the identification of the next hop upstream
Handley, Kouvelas, Vicisano [Page 2]
Internet Draft Bidir PIM February 2000
forwarder is performed at RP discovery time using a fail-safe elec-
tion mechanism. This significantly simplifies forwarding procedures
and eliminates forwarding loops and packet duplication problems that
exist in [2]. Appendix D presents a comparison between the proposal
in this document and [2].
The rest of this document is structured as follows. Section 2 defines
basic terms. Section 3 describes bidirectional tree formation and
forwarding. The new forwarding rules rely heavily on an election
mechanism described in section 4.
2 Definitions
In the discussion below, the terms upstream, downstream and RPF
interface are always referring to the shared tree rooted at the Ren-
dezvous Point. Downstream indicates the direction on which packets
travel from the RP to receivers along the shared tree. Upstream indi-
cates the opposite direction used by packets traveling from sources
to the RP. The RPF interface for a group is the interface unicast
routing uses to reach the RP.
We assume that the reader is familiar with the unidirectional PIM-SM
protocol [1], as much of the functionality is common to the version
of bidir PIM described below. In particular in the rest of this docu-
ment we will use the concepts of (*,G), (S,G) and (*,*,RP) state and
their component fields (olist, iif, ...). We will also reference Join
and Prune messages whose semantics and packet formats are defined in
[1]. In the context of this document, entries in Join and Prune mes-
sages always have the RP and WC bits set. Also, default timer values
are the ones given in [1].
The protocol presented in this document is largely based on the con-
cept of a Designated Forwarder (DF). A single DF exists for each RP
on every link within a PIM domain (this includes both multi-access
and point-to-point links). The DF is the router on the link with the
best unicast route to the RP. A DF for a given RP is in charge of
forwarding traffic downstream onto the link, and forwarding upstream
traffic from the link towards the RP. It does this for all the bi-
directional groups served by the RP. For those familiar with the DR
in PIM-SM, the Bidir DF provides the same support for local
receivers. The DF election procedures are described in section 4.
3 Tree Building and Forwarding
This section describes how bi-directional tree building procedures
and forwarding rules vary from normal PIM-SM operation.
A router learns which multicast addresses will be used for sparse-
Handley, Kouvelas, Vicisano [Page 3]
Internet Draft Bidir PIM February 2000
mode PIM and which will be for bidirectional groups along with the
candidate RP information through PIM-SM bootstrap messages. Thus
unidirectional and bidirectional groups can coexist in the same
domain.
Throughout the section it is assumed that on each link, all the
routers have a consistent view on which router has the best path to
the RP. This router is called the DF for that RP on the link. This
assumption rests on the DF election procedures described in section
4.
In the procedures described in the rest of this section, if DF infor-
mation is required but not available (election is incomplete), then
no tree building or forwarding action is taken.
3.1 Tree Building
3.1.1 Joining the (Shared) Tree
The procedures for joining the (*,G) shared tree, are almost identi-
cal to those used in PIM-SM with the difference that the tasks of the
DR are handled by the DF.
When a router receives a membership indication from IGMP for a
bidirectional group G with rendezvous point RP, and it is the DF for
the RP on the link on which the report was received, the following
steps are taken:
o If no (*,G) state exists for the group, then a (*,G) entry is
created and populated with the RP DF information.
o If the interface on which the report was received is not in the
olist of the entry, then the interface is added to the olist.
o According to standard PIM-SM procedures [1], if the olist transi-
tioned from null to non-null, a Join message for the group is trig-
gered upstream. The Join is directed to the DF for the (*,G) incom-
ing interface.
When a router receives a Join message addressed to it for a bidir
group G with rendezvous point RP, it must determine if it is the DF
on the link for this RP. If the router is not the DF, it must ignore
the Join message. If it is the DF, then the following steps are
taken:
o If no (*,G) state exists for the group, then a (*,G) entry is
Handley, Kouvelas, Vicisano [Page 4]
Internet Draft Bidir PIM February 2000
created and populated with the RP DF information.
o If the interface on which the Join was received is not in the olist
of the entry, then the interface is added to the olist.
o According to standard PIM-SM procedures [1], the expiration timer
of the olist interface is updated and if the olist transitioned
from null to non-null, a Join message for the group is triggered
upstream. The Join is directed to the DF for the (*,G) incoming
interface.
3.1.2 Leaving the (shared) tree
When the DF for a link receives notification that an interface is no
longer required in the olist of a group (either through IGMP or by
receiving a Prune), it follows standard PIM-SM procedures except any
originated prunes are addressed to the DF on the (*,G) iif.
3.1.3 Designated Forwarder Change
When the DF for a RP on a link changes to a different router, tree
maintenance has to take place to ensure that traffic is still
delivered for all affected groups.
3.1.3.1 Old DF Actions
On losing its status as acting DF on a link, the old DF has to take
the following actions for existing groups that are affected.
o If there were downstream receivers (discovered through IGMP or
downstream Joins), the router has to delete the interface to the
link from its olist.
o If the interface deletion results in a null olist for the (*,G)
then the usual actions are taken to propagate a Prune upstream.
3.1.3.2 New DF Actions
On assuming the role of the DF for a given link, a router has to take
the following actions for each existing group that is affected. If
the router has IGMP information from local receivers for a group, the
interface to the link must be added to the olist for the (*,G). If
the (*,G) entry did not exist then it must be created and populated
with the RP DF information. If the (*,G) olist was previously null
then the usual actions are taken to propagate a Join upstream.
3.1.3.3 Downstream Router Actions
Handley, Kouvelas, Vicisano [Page 5]
Internet Draft Bidir PIM February 2000
When learning about a switch to a new DF on the RPF interface, down-
stream routers must take the following actions for all affected
groups.
o If the router has a (*,G) entry with a non-null olist, it must send
a Join for the group towards the new DF.
o The router may also send a Prune for the group towards the old DF.
3.2 Forwarding Data
The following responsibilities are uniquely assigned to the DF of a
link:
o The DF is the only router that forwards packets traveling down-
stream onto the link.
o The DF is the only router that picks-up upstream traveling packets
off the link to forward towards the RP.
Non-DF routers on a link, that use that link to reach the RP, may
perform the following forwarding actions for bidirectional groups:
o Forward packets from the link towards downstream receivers.
o Forward packets from downstream sources onto the link (provided
they are the DF for the downstream link from which the packet was
picked-up).
When a router receives a multicast packet sent to a bidir group G, it
first looks for a (*,G) matching entry. If this is not found, then
the matching (*,*,RP) state may be used. Alternatively (*,G) state
may be created with a null olist and populated with the RP DF infor-
mation.
The router must forward the packet if either:
o It was received on the RPF interface of the entry (always forward
downstream traveling packets)
o The router is the Designated Forwarder (DF) for the RP on the
interface the packet was received (only the DF forwards upstream).
If a decision to forward the packet is made, then it is forwarded on
all the interfaces in the olist including the entry's RPF interface
Handley, Kouvelas, Vicisano [Page 6]
Internet Draft Bidir PIM February 2000
but excluding the interface the packet was received on. Otherwise the
packet is discarded.
Note: A major advantage of using a Designated Forwarder in bi-
directional PIM is that special treatment is no longer required for
sources that are directly connected to a router. Data from such
sources does not need to be differentiated from other multicast
traffic and will automatically be picked up by the DF. This removes
the need for performing a directly connected check for data to groups
that do not have existing state.
3.2.1 Source-only Branches
Source-only branches of the distribution tree for a group are
branches which do not lead to any receivers, but which are used to
forward packets traveling upstream from sources towards the RP.
Routers along source-only branches do not have an olist for the group
and hence do not need to maintain (*,G) state. Upstream forwarding
can be performed using (*,*,RP) state. An implementation may decide
to maintain (*,G) state for accounting or performance reasons.
4 Designated Forwarder
This section presents a fail-safe mechanism for electing a per-RP
designated router on each link in a PIM domain. We call this router
the Designated Forwarder (DF).
4.1 DF Requirements
The DF election chooses the best router on a link to assume the
responsibility of forwarding traffic between the RP and the link for
the range of multicast groups served by the RP. Different multicast
groups that share a common RP must use the same bi-directional tree
for data forwarding. Hence, the election of an upstream forwarder on
each link does not have to be a group specific decision but instead
can be RP-specific. As the number of RPs is typically small, the
number of elections that have to be performed is significantly
reduced by this observation.
To optimise tree creation, it is desirable that the winner of the
election process should be the router on the link with the "best"
unicast routing metric to the RP. When comparing metrics from dif-
ferent unicast routing protocols, we use the same comparison rules
used in the PIM assert process [1].
The election process needs to take place when information on a new RP
initially becomes available, and can be re-used as new bidir groups
for the same RP are encountered. There are however some conditions
Handley, Kouvelas, Vicisano [Page 7]
Internet Draft Bidir PIM February 2000
where an update to the election is required:
o There is a change in unicast metric to reach the RP for any of the
routers on the link.
o The interface on which the RP is reachable changes to an interface
for which the router was previously the DF.
o A new PIM neighbor starts up on a link.
o The elected DF dies.
The election process has to be robust enough to ensure with very high
probability that all routers on the link have a consistent view of
the DF. This is because with the forwarding rules described in sec-
tion 3.2, if multiple routers end-up thinking that they should be
responsible for forwarding, loops may result. To reduce the possibil-
ity of this occurrence to a minimum, the election algorithm has been
biased towards discarding DF information and suspending forwarding
during periods of ambiguity.
4.2 DF Election Description
To perform the election of the DF for a particular RP, routers on a
link need to exchange their unicast routing metric information for
reaching the RP.
In the election protocol described below, many message exchanges are
repeated 3 times for reliability. In all those cases the message
retransmissions are spaced in time by a small random interval.
For the purposes of the election, interface specific counters and
timers need to be maintained for each RP. When (*,G) entries are
created, they inherit information on the elected DF from the
corresponding RP database entry. Subsequent changes in the winner of
the DF election for a RP are propagated to all dependent (*,G)
entries.
4.2.1 Bootstrap Election
Initially when no DF has been elected, routers finding out about a
new RP start participating in the election by sending Offer messages.
Offer messages include the router's metric to reach the RP. Offers
are periodically retransmitted with a period randomly chosen in the
interval [0.5 * Offer-Interval, Offer-Interval].
If a router hears a better offer from a neighbor, it stops
Handley, Kouvelas, Vicisano [Page 8]
Internet Draft Bidir PIM February 2000
participating in the election for a period of [3 * Offer-Interval].
If during this period no winner is elected, then it restarts the
election from the beginning. If a router receives an offer with
worse metrics, then it restarts the election from the beginning.
The result should be that all routers except the best candidate stop
advertising.
A router assumes the role of the DF after having advertised its
metrics 3 times without receiving any offer from any other neighbor.
At that point it transmits a Winner message which declares to every
other router on the link the identity of the winner and the metrics
it is using.
Routers hearing a winner message stop participating in the election
and record the identity and metrics of the winner. If the local
metrics are better than those of the winner then the router records
the identity of the winner but reinitiates the election.
4.2.2 Loser Metric Changes
Whenever the unicast metric to a RP changes for a non-DF router to a
value that is better than that previously advertised by the DF, the
router with the new metric should take action to eventually assume
forwarding responsibility. After the metric change is detected, the
new candidate restarts participating in the election. If no response
is received after 3 retransmissions, the router assumes the role of
the DF following the usual Winner announcement procedure.
Upon receipt of an offer that is worse than its current metric, the
DF will respond with a Winner message declaring its status and
advertising its metric. Upon receiving this message, the originator
of the Offer records the identity of the DF and aborts the election.
Upon receipt of an offer that is better the its current metric, the
DF records the identity and metrics of the offering router and
responds with a Backoff message. This instructs the offering router
to hold off for a short period of time while the unicast routing sta-
bilises. The Backoff message includes the offering router's new
metric and address. All routers on the link who have pending offers
with metrics worse than those in the backoff message (including the
original offering router) will hold further offers for a period of
time defined in the Backoff message.
If during the backoff period, a third router sends a new better
offer, the Backoff message is repeated for the new offer and the
backoff period restarted.
Handley, Kouvelas, Vicisano [Page 9]
Internet Draft Bidir PIM February 2000
Before the backoff period expires, the acting DF nominates the router
having made the best offer as the new DF using a Pass message. This
message includes the IDs and metrics of both the old and new DFs.
The old DF stops performing its tasks as soon as the transmission is
made. The new DF assumes the role of the DF as soon as it receives
the Pass message. All other routers on the link take note of the new
DF and its metric.
4.2.3 Winner Metric Changes
If the DF's routing metric to reach the RP changes to a worse value,
it sends a set of 3 randomly spaced Offer messages on the link,
advertising the new metric. Routers who receive this announcement but
have a better metric may respond with an Offer message which results
in the same handoff procedure described above. All routers assume
the DF has not changed until they see a Pass or Winner message indi-
cating the change.
There is no pressure to make this handoff quickly if the acting DF
still has a path to the RP. The old path may now be suboptimal but it
will still work while the re-election is in progress.
If the routing metric at the DF changes to a better value, a single
Winner message is sent advertising the new metric.
4.2.4 Winner Loses Path
If a router's path to the RP switches to be through a link for which
it is acting as the DF, then it can no longer provide forwarding ser-
vices for that link. It therefore immediately stops being the DF and
restarts the election. As its path to the RP is through the link, an
infinite metric is used in the Offer message it sends.
Note: At this stage the old DF will have a new RPF neighbor on the
link (indicated by unicast routing) which it could use in a Pass mes-
sage but this adds unnecessary complication to the election process.
4.2.5 Late Router Starting Up
A late router starting up will have no knowledge of a previous elec-
tion outcome. As a result it will start advertising its metric in
Offer messages. As soon as this happens, the Winner will respond
either with a Winner or with a Backoff message.
4.2.6 Winner Dies
Whenever the DF dies, a new DF has to be elected. The speed at which
this can be achieved depends on whether there are any downstream
Handley, Kouvelas, Vicisano [Page 10]
Internet Draft Bidir PIM February 2000
routers on the link.
If there are downstream routers, typically their RPF neighbor
reported by unicast routing will be the DF. They will therefore
notice a change in RPF neighbor away from the DF and will restart the
election by transmitting Offer messages. If the RP is now reachable
through the link via another upstream router, an infinite metric will
be used in the Offer.
If no downstream routers are present, the only way for other upstream
routers to detect a DF failure is by the timeout of the PIM neighbor
information, which will take significantly longer.
4.3 Election Protocol Specification
4.3.1 Protocol State
The operation of the election protocol makes use of the variables and
timers described below. These are maintained per RP for each multi-
cast enabled interface on the router.
Offer-Count (O-count)
Used to maintain the number of times an Offer or Winner mes-
sage has been transmitted.
Best-Offer
Used by the DF to record who has made the last offer for
sending the Pass message.
Offer-Timer (O-timer)
Used to schedule transmission of Offer and Winner messages.
Pass-Timer (P-timer)
Used on the DF to schedule transmission of a Pass message.
4.3.2 Message Summary
The election uses the following control messages:
Offer (OfferingID, Metric)
Sent by routers that believe they have a better metric to
the RP than the metric that has been on offer so far.
Winner (DF-ID, DF-Metric)
Sent by a router when assuming the role of the DF or when
re-asserting in response to worse offers.
Handley, Kouvelas, Vicisano [Page 11]
Internet Draft Bidir PIM February 2000
Backoff (DF-ID, DF-Metric, OfferingID, OfferMetric,
BackoffInterval)
Used by the DF to acknowledge better offers. It instructs
other routers with equal or worse offers to wait till the DF
passes responsibility to the sender of the offer.
Pass (Old-DF-ID, Old-DF-Metric, New-DF-ID, New-DF-Metric)
Used by the old DF to pass forwarding responsibility to a
router that has previously made an offer. The Old-DF-Metric
is the current metric of the DF at the time the pass is
sent.
4.3.3 Protocol Events
During protocol operation, in addition to the expiration of the two
timers and reception of the four messages, the following events can
take place:
o Discovery of new RP
o Metric change
o DF loses path
o Detection of DF failure (unicast routing changed for downstream or
Hello expired)
4.3.4 Protocol Operation
In the two tables below the following rules and notation apply:
o Whenever the notation "?=" is used to assign a value to a timer,
the value is assigned only if the timer is not running or the time
left running is longer than the new value.
o When a new DF is discovered through the receipt of a Winner or Pass
message, if it is not already a PIM neighbor, a neighbor entry is
created with the default expiration interval.
o Whenever the DF is set, the associated metrics are also recorded.
o Timers in square brackets are randomly chosen between 0.5 and 1
times the supplied value.
o When a router has a path to the RP through the link on which the
election is taking place, an infinite metric is used in Offer mes-
sages.
Handley, Kouvelas, Vicisano [Page 12]
Internet Draft Bidir PIM February 2000
Event Condition Non-DF action DF action
=======================================================================
Offer |Local metric |O-count = 0 |Send Backoff
rcvd |worse |O-timer = [Offer-Int] |Best-Offer = sender
| | + 3 * Offer-Int |P-timer = Backoff-Int
| | |Stop O-timer
|--------------------------------------------------------------
|Local metric |O-count = 0 |Send Winner
|better |O-timer ?= [Offer-Int] |Stop P-timer
-----------------------------------------------------------------------
Winner |Local metric |O-count = 0
rcvd |worse |Stop O-timer
| |DF = sender
| |Stop P-timer
|--------------------------------------------------------------
|Local metric |O-count = 0
|better |O-timer ?= [Offer-Int]
| |DF = sender
| |Stop P-timer
-----------------------------------------------------------------------
Backoff |Local metric |O-count = 0
rcvd |worse or to us |O-timer = Backoff-Int + [Offer-Int]
| |DF = sender
| |Stop P-timer
|--------------------------------------------------------------
|Local metric |O-count = 0
|better |O-timer ?= [Offer-Int]
| |DF = sender
| |Stop P-timer
-----------------------------------------------------------------------
Pass |Local metric |O-count = 0
rcvd |worse or to us |Stop O-timer
| |DF = destination
| |Stop P-timer
|--------------------------------------------------------------
|Local metric |O-count = 0
|better |0-timer ?= [Offer-Int]
| |DF = destination
| |Stop P-timer
-----------------------------------------------------------------------
Handley, Kouvelas, Vicisano [Page 13]
Internet Draft Bidir PIM February 2000
Event Condition Non-DF action DF action
=======================================================================
New RP | |O-count = 0 |N/A
| |O-timer ?= [Offer-Int] |
-----------------------------------------------------------------------
Metric |DF metric |nop |O-count = 0
change |better (*) | |O-timer ?= [Offer-Int]
| | |Stop P-timer
|--------------------------------------------------------------
|DF metric |O-count = 0 |Send Winner
|worse (*) |O-timer ?= [Offer-Int] |Stop P-timer
-----------------------------------------------------------------------
No path | |nop |Send Offer (**)
to RP | | |O-count = 1
| | |O-timer ?= [Offer-Int]
| | |DF = unknown
| | |Stop P-timer
-----------------------------------------------------------------------
DF | |O-count = 0 |N/A
failure | |O-timer ?= [Offer-Int] |
-----------------------------------------------------------------------
O-timer |O-count <= 3 |Send Offer
expires | |O-count++
| |O-timer ?= [Offer-Int]
|---------------|----------------------------------------------
|else |O-count = 0
| |Send Winner (***)
| |DF = us
-----------------------------------------------------------------------
P-timer | |DF = Best-Offer
expires | |Send Pass
-----------------------------------------------------------------------
(*) These comparisons are made against the previously stored DF
metrics. In the case of the DF, the old local metrics are used to
compare against. So "DF metric better" means that the metric has
actually become worse.
(**) As the path to the RP is now through the link an infinite metric
is used in the offer.
(***) We only become the DF and send a Winner message if we have a
path to the RP (which is not through the link with the ongoing elec-
tion).
4.4 Election Message Formats
All election messages are sent with a TTL of 1 and are multicast to
Handley, Kouvelas, Vicisano [Page 14]
Internet Draft Bidir PIM February 2000
the ALL-PIM-ROUTERS group. The structure of Encoded-Unicast addresses
is described in [1].
4.4.1 Common Header
The header below is common to all election messages.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type |Subtype| Rsvd | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type
TBD
Subtype
Used to distinguish between different election messages and
is set according to the table below:
Message Subtype
-----------------------
Offer 1
Winner 2
Backoff 3
Pass 4
Rsvd
Set to zero by senders and ignored by receivers.
Checksum
Calculated as specified in [1].
RP-Address
The address of the bidir RP for which the election is taking
place.
Sender Metric Preference
Preference value assigned to the unicast routing protocol
that the message sender used to obtain the route to the RP-
address.
Handley, Kouvelas, Vicisano [Page 15]
Internet Draft Bidir PIM February 2000
Sender Metric
The unicast routing table metric used by the message sender
to reach the RP. The metric is in units applicable to the
unicast routing protocol used.
The Backoff and Pass messages have the additional fields described
below.
4.4.2 Backoff Message
The Backoff message uses the following fields in addition to the com-
mon ones described above.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-Offering-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Offering Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Offering Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interval |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Offering Address
The address of the router that made the last (best) Offer.
Offering Metric Preference
Preference value assigned to the unicast routing protocol
that the offering router used to obtain the route to RP-
address.
Offering Metric
The unicast routing table metric used by the offering router
to reach the RP. The metric is in units applicable to the
unicast routing protocol used.
Interval
The backoff interval in milliseconds to be used by routers
with worse metrics than the offering router.
4.4.3 Pass Message
The Pass message uses the following fields in addition to the common
ones described above.
Handley, Kouvelas, Vicisano [Page 16]
Internet Draft Bidir PIM February 2000
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-New-Winner-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| New Winner Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| New Winner Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
New Winner Address
The address of the router that made the last (best) Offer.
New Winner Metric Preference
Preference value assigned to the unicast routing protocol
that the offering router used to obtain the route to RP-
address.
New Winner Metric
The unicast routing table metric used by the offering router
to reach the RP. The metric is in units applicable to the
unicast routing protocol used.
4.5 Timer Values
The Offer-Interval is 100 ms.
The default Backoff-Interval used in Backoff messages is 1 sec.
5 Advertising Bi-directional Groups
Routers discover that a group operates in bi-directional mode from
the Encoded-Group Address fields in PIM Bootstrap and Candidate-RP
Advertisement messages. The Encoded-Group Address field is modified
to include the Bidir-bit (B bit) as specified below:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Addr Family | Encoding Type |B| Reserved | Mask Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group Multicast Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
When the Bidir-bit is set, all upgraded bi-directional PIM routers
will follow the forwarding rules described in this specification.
6 Security Considerations
All PIM control messages MAY use IPsec to address security concerns.
Handley, Kouvelas, Vicisano [Page 17]
Internet Draft Bidir PIM February 2000
7 References
[1] Estrin, et al., "Protocol Independent Multicast-Sparse Mode (PIM-
SM): Protocol Specification", RFC 2362, June 1998.
[2] D. Estrin, D. Farinacci, "Bi-directional Shared Trees in PIM-SM",
Work In Progress, <draft-farinacci-bidir-pim-01.txt>, May 1999.
[3] Wei, L., Farinacci, D., "PIM Version 2 DR Election Priority Option",
INTERNET-DRAFT, March 1998.
8 Acknowledgments
The bidir proposal in this draft is heavily based on the ideas and
text presented by Estrin and Farinacci in [2]. The main difference
between the two proposals is in the method chosen for upstream for-
warding.
We would also like to thank Deborah Estrin at ISI/USC as well as
Nidhi Bhaskar, Yiqun Cai, Tony Speakman and Rajitha Sumanasakera at
cisco for their contributions and comments to this draft.
9 Author Information
Mark Handley
mjh@aciri.org
AT&T Center for Internet Research at ICSI
Isidor Kouvelas
kouvelas@cisco.com
cisco Systems
Lorenzo Vicisano
lorenzo@cisco.com
cisco Systems
Handley, Kouvelas, Vicisano [Page 18]
Internet Draft Bidir PIM February 2000
Appendix A: Election Reliability Enhancements
For the correct operation of bi-directional PIM it is very important
to avoid situations where two routers consider themselves to be
Designated Forwarders for the same link. The two precautions below
are not required for correct operation but can help diagnose
anomalies and correct them.
A.1 Missing Pass
After a DF has been elected, a router whose metrics change to become
better than the DF will attempt to take over. If during the re-
election the acting DF has a condition that causes it to lose all of
the election messages (like a CPU overload), the new candidate will
transmit three offers and assume the role of the forwarder resulting
in two DFs on the link. This situation is pathological and should be
corrected by fixing the overloaded router. It is desirable that such
an event can be detected by a network administrator.
When a router becomes the DF for a link without receiving a Pass mes-
sage from the known old DF, the PIM neighbor information for the old
DF can be marked to this effect. Upon receiving the next PIM Hello
message from the old DF, the router can retransmit Winner messages
for all the RPs for which it acting as the DF. The anomaly may also
be logged by the router to alert the operator.
A.2 Periodic Winner Announcement
An additional degree of safety can be achieved by having the DF for
each RP periodically announce its status in a Winner message.
Transmission of the periodic Winner message can be restricted to
occur only for RPs which have active groups, thus avoiding the
periodic control traffic in areas of the network without senders or
receivers for a particular RP.
Appendix B: Interoperability with legacy code
The rules provided in [2] for interoperating between legacy PIM-SM
routers and new bi-directional capable routers change only slightly
to support this new proposal. The only difference is in the defini-
tion of a boundary between a bi-directional capable area and a legacy
area of the network. In [2], a bidir capable router forwarding
upstream, register encapsulates the data packet to the RP if its RPF
neighbor is not bidir capable.
In our proposal, since all the routers on a link need to co-operate
to elect the Designated Forwarder, if even one of the routers on the
link is a legacy router, the election cannot take place. As a result
Handley, Kouvelas, Vicisano [Page 19]
Internet Draft Bidir PIM February 2000
register encapsulation is necessary if one or more routers on the RPF
interface are not bi-directional capable.
As in [2], a Hello option must be used to differentiate between bi-
directional capable and legacy routers, and (S,G) state must be
created on the router doing the register encapsulation to prevent
loops.
Appendix C: Comparison with PIM-SM
This section describes the main differences between Bidir PIM and
sparse-mode PIM:
o Bidir PIM uses a single shared tree for distributing the data for
all the sources of a multicast group. The use of a signle tree sig-
nificantly reduces state requirements on a router. The drawback is
that it may produce suboptimal paths from sources to receivers pos-
sibly resulting in higher network latency and less efficient
bandwidth utilisation.
o In Bidir PIM, packets traveling from a source to the RP, are
natively forwarded on the shared tree. In contrast sparse-mode PIM
uses unicast encapsulation or source-specific state.
o In Bidir PIM, sender-only branches do not need to keep group state.
Data from the source can be natively forwarded towards the RP using
RP-specific forwarding state.
o The Bidir Designated Forwarder (DF) assumes all the responsibili-
ties of the sparse-mode DR. In a multi-access link, the DF responds
to IGMP notifications. Downstream routers on the link use the DF as
their upstream neighbor and direct all Join/Prune messages towards
it.
o To enforce a single forwarder on multi-access links, sparse-mode
PIM uses the Assert mechanism which requires data-packets to
trigger protocol events. In Bidir PIM, data-driven events are com-
pletely eliminated as a correct route is always available at packet
forwarding time.
The DF election problem is easier than the assert problem because
there is a small number of RPs and the per RP DF election can be
done in advance. With the assert mechanism, in addition to each RP,
a forwarder has to be elected for each possible source to a group.
This can not be done before data is available.
o With sparse-mode PIM, when forwarding packets using shared-tree
(*,G) state, a directly-connected-source check has to be made on
Handley, Kouvelas, Vicisano [Page 20]
Internet Draft Bidir PIM February 2000
every packet. This is done to determine if the packet was ori-
ginated by a source which is directly connected to the router. For
a connected source, source-specific state has to be created to
register packets to the RP and prune the source off the shared
tree.
With Bidir PIM directly connected sources do not need any special
handling. The DF for the RP of the group the source is sending to,
seamlessly picks-up and forwards upstream traveling packets.
Appendix D: Comparison with UMP based bidirectional PIM
Using an UMP option for upstream forwarding has the following disad-
vantages:
o Using the DF election, only routers willing to be forwarders can be
elected. In contrast in [2], the downstream router designates the
upstream neighbor responsible for forwarding (using Joins and UMP
packets).
o Using the UMP option, regular data packets are overloaded with con-
trol information for the routing protocol.
o Inserting the extra option in multicast packets transmitted from a
source may result in a packet size exceeding the MTU which will
result in fragmentation.
o The use of an option complicates the router forwarding mechanism.
Additional code to process the new special packet type needs to be
written.
o The contents of the UMP option have to be rewritten and the packet
checksum adjusted on each hop towards the RP at data forwarding
time. This introduces additional per packet processing overhead.
In bidir PIM [2], if the router elected as the DR is different from
that chosen by downstream neighbors for joining the tree, loops can
occur. The main shortcoming of the DR is that its election does not
take into consideration the location of the RP. To resolve this prob-
lem the DR priority draft [3] provides a method for manually confi-
guring the DR election winner. Although this provides a solution it
has two drawbacks:
o It requires a case by case manual configuration.
o It cannot solve the problem if there are different RPs in a domain
serving separate multicast group ranges. In this scenario the
Handley, Kouvelas, Vicisano [Page 21]
Internet Draft Bidir PIM February 2000
requirements of each RP for the DR positioning on a particular link
may differ.
Handley, Kouvelas, Vicisano [Page 22]