Network Working Group                                   Ross Finlayson
Internet-Draft                                          LIVE.COM
Expire in six months                                    1998/09/29

               The UDP Multicast Tunneling Protocol

                   <draft-finlayson-umtp-03.txt>

1. Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   1id-abstracts.txt listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

2. Abstract

Many Internet hosts - such as PCs - while capable of running multicast
applications, cannot access the MBone because (i) the router(s) that
connect them to the Internet do not yet support IP multicast routing, and
(ii) their operating systems cannot support a tunneled implementation of IP
multicast routing.

The "UDP Multicast Tunneling Protocol" (UMTP) enables such a host to
establish an 'ad hoc' connection to the MBone by tunneling multicast UDP
datagrams inside unicast UDP datagrams.  By using UDP, this tunneling can
be implemented as a 'user level' application, without requiring changes to
the host's operating system.  It is important to note, however, that this
tunneling mechanism is *not* a substitute for proper multicast routing, and
should be used *only* in environments where multicast routing cannot be
used instead.  This document describes this protocol, as currently
implemented by the liveGate multicast tunneling server [1]
(http://www.live.com/liveGate/).

3. Revision History

September 1998: draft-finlayson-umtp-03.txt
        Added descriptions of two new commands: JOIN_RTP_GROUP
                and LEAVE_RTP_GROUP, for optimizing the control of
                RTP/RTCP sessions.

February 1998:  draft-finlayson-umtp-02.txt
        Rearranged the fields in the "encapsulation trailer" to allow for
                possible variable-sized trailers in the future.

December 1997:  draft-finlayson-umtp-01.txt
        Changed the "encapsulation header" into an "encapsulation trailer".
        Added 'cookie' fields to prevent source address spoofing.

December 1997:  draft-finlayson-umtp-00.txt
        Original draft.

4. Overview

UMTP operates using (unicast) UDP datagrams between pairs of nodes - each
pair forming the endpoints of a "tunnel".  Each datagram is either a
"command" (e.g., instructing the destination endpoint to join or leave a
group address & port), or "data": an encapsulated multicast UDP datagram,
including a (group, port) tuple.

For each (group, port) being tunneled, a tunnel endpoint can act either as
a "master" or a "slave".  A tunnel master (for a particular (group, port))
periodically sends a JOIN_GROUP command to the remote endpoint (a slave),
instructing it that this (group, port) is still of interest, and should be
tunneled (or continue to be tunneled).  A slave will stop tunneling a
(group, port) if it either (i) receives a LEAVE_GROUP command from the
master, or (ii) has not received any JOIN_GROUP commands for some period of
time (currently, 60 seconds).  Typically, a host that is trying to access
the MBone (e.g., a PC) will be a master, and its remote endpoint (a host
already on the MBone) will be a slave.  (It is also possible, however, for
both endpoints of a tunnel to be masters.)
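
The slave-side timeout described above can be sketched in Python as
follows.  This is only an illustration, not the liveGate implementation:
the class name SlaveGroups, its method names, and the injectable clock
are hypothetical; only the 60 second timeout comes from the text.

```python
import time

JOIN_TIMEOUT = 60.0  # seconds without a JOIN_GROUP before a slave stops tunneling


class SlaveGroups:
    """Tracks the (group, port) tuples that a slave is currently tunneling."""

    def __init__(self, clock=time.monotonic):
        # 'clock' is injectable (an assumption of this sketch) so that the
        # timeout can be exercised without real waiting.
        self._clock = clock
        self._expires = {}  # (group, port) -> time at which tunneling stops

    def on_join(self, group, port):
        # Each JOIN_GROUP (re)starts the timeout for this (group, port).
        self._expires[(group, port)] = self._clock() + JOIN_TIMEOUT

    def on_leave(self, group, port):
        # LEAVE_GROUP stops tunneling immediately.
        self._expires.pop((group, port), None)

    def active(self):
        # Drop every (group, port) whose timeout has passed, then report
        # the tuples that are still being tunneled.
        now = self._clock()
        self._expires = {g: t for g, t in self._expires.items() if t > now}
        return set(self._expires)
```

A master that repeats its JOIN_GROUP well inside the timeout keeps the
entry alive; a master that disappears silently is forgotten once the
timeout passes.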

Whenever a tunnel endpoint - whether a master or a slave - receives a
multicast UDP datagram addressed to a (group, port) that is currently being
tunneled, it encapsulates this datagram and sends it (as a unicast
datagram) to the other end of the tunnel.  Conversely, whenever a tunnel
endpoint receives, over the tunnel, an encapsulated multicast datagram for
a (group, port) of interest, it decapsulates it and resends it as a
multicast datagram (with a TTL that was specified as a parameter in the
encapsulation).

A node (typically a slave server) can be the endpoint of several different
UMTP tunnels - i.e., each with a different endpoint master.  Although,
superficially, such a system appears similar to a multicast<->unicast
reflector, it differs in two ways:
(i)  The tunneling is application-independent, and handles any (UDP)
multicast packets.
(ii) After traversing a tunnel, a decapsulated packet is delivered to the
endpoint's application(s) via multicast, not unicast.  This allows regular
multicast-based applications to make use of a UMTP tunnel (subject to some
restrictions described below).

5. Restrictions

UMTP allows a multicast-capable - but otherwise non-MBone-connected - host
to run multicast-based applications.  Such applications, however, must
satisfy the following conditions:
1/ Their multicast packets must all be UDP - not 'raw' IP.
2/ The UMTP implementation (i.e., a tunnel endpoint) must have a way of
knowing each (group, port) that the application uses.
3/ The application must not rely upon the source address of its incoming
multicast packets being different for different original data sources.  In
particular, the application must not use source addresses to identify these
original data sources.

Most multicast-based applications - especially those based on RTP [2] -
satisfy these requirements.  If the multicast-based applications are
launched from a separate 'session directory' application, then the UMTP
implementation may be built into the session directory.  For some multicast
applications, however, the (group, port) is not specified in advance, but
instead is determined by the application itself - e.g., by querying a
separate 'licensing' server.  Depending on the host operating system, a
separate UMTP implementation might not be able to independently determine
this (group, port).  In this case, UMTP could not be used, unless it were
incorporated into the application itself.

These application restrictions reinforce the point that UMTP should be used
*only* if full multicast routing cannot be provided instead.

6. Packet Format, and Command Codes

The payload of each UMTP datagram - i.e., excluding the outer UDP header -
consists of a 12-octet 'trailer' descriptor.  For commands other than "DATA",
this descriptor makes up the entire UMTP payload.  In the case of a "DATA"
command, however, this descriptor is also preceded by the data that is being
encapsulated (i.e., the original UDP payload).  The format of the 'trailer'
descriptor is as follows:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         source cookie         |       destination cookie      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 (IPv4) multicast address                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |              port             |      TTL      |version|command|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Note that UMTP is defined only for IPv4.  (In IPv6, native multicast routing
will be ubiquitous.)

"version" - the protocol version - is currently zero.

The following "command"s are currently defined:
        0       (unused)
        1       DATA
        2       JOIN_GROUP
        3       LEAVE_GROUP
        4       TEAR_DOWN
        5       PROBE
        6       PROBE_ACK
        7       PROBE_NACK
        8       JOIN_RTP_GROUP
        9       LEAVE_RTP_GROUP
        10-15   (unused)

"source cookie" and "destination cookie" are used to protect against IP
source address spoofing, as described below.

Notes:
- The primary reason for having the encapsulated data appear *before*
  the encapsulation descriptor is that the encapsulated data will
  often be an RTP packet [2], and by retaining this data in its usual
  position, IP/UDP/RTP header compression [3] - if present - will
  work properly.  A secondary reason is that this order may allow
  UMTP implementations written in some type-safe programming
  languages - such as Java - to reduce the amount of byte copying that
  they would otherwise have to perform.  Note that UMTP implementations
  must allow for the possibility of the 'trailer' descriptor not being
  aligned on a 4-byte boundary.
- The version/command fields always occupy the last byte of the data
  payload.  This allows for the possibility of having different-sized
  encapsulation trailers (e.g., for trailer compression) in future
  versions of this protocol.
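
As an illustration of the layout above, the following Python sketch
packs and parses the 12-octet trailer.  It assumes network byte order
for the multi-octet fields, and that "version" occupies the high-order
four bits of the final octet (as the diagram suggests).  The names
pack_trailer and unpack_datagram are hypothetical.

```python
import socket
import struct

# Command codes, as listed in the table above.
DATA, JOIN_GROUP, LEAVE_GROUP, TEAR_DOWN = 1, 2, 3, 4
PROBE, PROBE_ACK, PROBE_NACK = 5, 6, 7
JOIN_RTP_GROUP, LEAVE_RTP_GROUP = 8, 9

# Two 16-bit cookies, a 4-octet IPv4 group address, a 16-bit port,
# an 8-bit TTL, and one octet holding version and command.
# "!" selects network byte order (assumed here) with no padding.
TRAILER = struct.Struct("!HH4sHBB")  # 12 octets


def pack_trailer(src_cookie, dst_cookie, group, port, ttl, command, version=0):
    """Build the trailer; for DATA, append it to the encapsulated payload."""
    ver_cmd = ((version & 0x0F) << 4) | (command & 0x0F)
    return TRAILER.pack(src_cookie, dst_cookie,
                        socket.inet_aton(group), port, ttl, ver_cmd)


def unpack_datagram(payload):
    """Split a UMTP payload into its encapsulated data and trailer fields."""
    if len(payload) < TRAILER.size:
        raise ValueError("UMTP payload shorter than the 12-octet trailer")
    # The trailer is always the last 12 octets; slicing copes with a
    # trailer that is not aligned on a 4-byte boundary.
    data, trailer = payload[:-TRAILER.size], payload[-TRAILER.size:]
    src, dst, addr, port, ttl, ver_cmd = TRAILER.unpack(trailer)
    return {
        "data": data,
        "src_cookie": src,
        "dst_cookie": dst,
        "group": socket.inet_ntoa(addr),
        "port": port,
        "ttl": ttl,
        "version": ver_cmd >> 4,
        "command": ver_cmd & 0x0F,
    }
```

For example, a DATA datagram is the original UDP payload followed by
pack_trailer(...), and a JOIN_GROUP datagram is the trailer alone.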

7. Protocol Operation

For the purposes of describing the protocol, a tunnel endpoint has the
following state:
- a set E of allowable remote endpoints (each defined by a unicast
  address & port).  Each such endpoint, e, is also tagged with:
        - localCookie_e: a hard-to-guess (e.g., random) 16-bit value
        - remoteCookie_e: a 16-bit value (initially, some arbitrary value)
- a set G of active (group, port) tuples.  Each such tuple, g, is also
  tagged with:
        - a flag F_g with one of two values: {"master", "slave"}
        - a subset E_g of E: the tunnels that are members of g
        - a default TTL T_g (the TTL to use if one is not otherwise known)

Also, for each tunnel endpoint, the following events of note may occur:
1/ The local node requests the joining of a (group, port) g,
   with default TTL t
2/ The local node requests the leaving of a (group, port) g
3/ The local node sends a multicast packet to a (group, port) g, with TTL t
4/ A (non-local) multicast packet arrives for a (group, port) g,
   with source address s
5/ A 'JOIN_GROUP timeout' occurs on a (group, port) g
6/ A tunneled UMTP packet arrives, with source address s,
   and containing 'cookies' (srcCookie, dstCookie)

Each of these events is handled as follows:

1/ The local node requests the joining of a (group, port) g,
   with default TTL t
        add g to G; set F_g to "master"; set T_g to t
        for each tunnel endpoint e in E_g,
                send, to e, at 15 second intervals, a command
                        JOIN_GROUP(group, port, t,
                                   srcCookie<-localCookie_e,
                                   dstCookie<-remoteCookie_e)

2/ The local node requests the leaving of a (group, port) g
        ignore this if g is not a member of G; otherwise:
        if F_g is "master"
                for each tunnel endpoint e in E_g,
                        cancel the ongoing periodic JOIN_GROUP commands
                        send, to e, a command
                                LEAVE_GROUP(group, port,
                                            srcCookie<-localCookie_e,
                                            dstCookie<-remoteCookie_e)
                                (The TTL field in the packet is unused)
                                (optional: Repeat this send several times)
                remove g from G

3/ The local node sends a multicast packet to a (group, port) g, with TTL t
        ignore this if g is not a member of G; otherwise:
        for each tunnel endpoint e in E_g,
                send, to e, a command
                        DATA(group, port, t-1,
                             srcCookie<-localCookie_e,
                             dstCookie<-remoteCookie_e)
                        with the original packet's data prepended

4/ A (non-local) multicast packet arrives for a (group, port) g,
   with source address s
        IMPORTANT (see below): If s is one of the endpoints e in E:
                send, to e, a command
                        TEAR_DOWN(srcCookie<-localCookie_e,
                                  dstCookie<-remoteCookie_e)
                        (the address, port, TTL fields are unused)
                remove e from E
        otherwise:
        ignore this if g is not a member of G; otherwise:
        if the TTL t of the incoming packet is not known:
                set t to T_g
        for each tunnel endpoint e in E_g,
                send, to e, a command
                        DATA(group, port, t-1,
                             srcCookie<-localCookie_e,
                             dstCookie<-remoteCookie_e)
                        with the original packet's data prepended

5/ A 'JOIN_GROUP timeout' occurs on a (group, port) g
        ignore this if g is not a member of G; otherwise:
        if F_g is "slave"
                remove g from G

6/ A tunneled UMTP packet arrives, with source address s,
   and containing 'cookies' (srcCookie, dstCookie)
        If s is not one of the endpoints e in E, ignore this packet
                *unless* the "command" is PROBE, in which case:
                        replace the packet's "command" field with PROBE_NACK,
                        swap the packet's source and destination cookie fields
                        send the resulting packet back to s
                        do no further processing on the incoming packet
        If dstCookie does *not* equal localCookie_s [*]
                send, to s, a command
                        PROBE_ACK(srcCookie<-localCookie_s,
                                  dstCookie<-the srcCookie from the packet)
                        (the address, port, TTL fields are ignored; they
                         should be given the same values as those in the
                         incoming packet)
                do no further processing on the incoming packet
                [*] (Note: An implementation may omit this check if it is sure
                     that the source addresses of incoming packets cannot be
                     spoofed.)
        Set remoteCookie_s <- srcCookie
        Process the packet, based on the "command" field:
        DATA(group, port, t)
                set g to (group, port)
                        (Note: It is not required that g be a member of G.)
                multicast the encapsulated (prepended) data to g, with TTL t
                        (optional: Instead limit the TTL; see below)
                for each tunnel endpoint e in E_g, EXCEPT for s
                        send, to e, a command
                                DATA(group, port, t-1,
                                     srcCookie<-localCookie_e,
                                     dstCookie<-remoteCookie_e)
                                with the original packet's data prepended
        JOIN_GROUP(group, port, t)
                set g to (group, port)
                ignore this if group is not a multicast address; otherwise:
                if g is not already a member of G:
                        add g to G; set F_g to "slave"; set T_g to t
                if F_g is "slave":
                        set a 'JOIN_GROUP timeout' to occur in 60 seconds time
                                (replacing any existing such timeout)
        LEAVE_GROUP(group, port)
                set g to (group, port)
                ignore this if g is not a member of G; otherwise:
                if F_g is "slave"
                        remove g from G
        TEAR_DOWN
                remove s from E
        PROBE
                send, to s, a command
                        PROBE_ACK(srcCookie<-localCookie_s,
                                  dstCookie<-remoteCookie_s)
                        (the address, port, TTL fields are ignored; they
                         should be given the same values as those in the
                         incoming packet)
        PROBE_ACK, PROBE_NACK
                Ignore this packet (unless we have recently sent a PROBE)
                Note: The PROBE command is one that a node can (optionally)
                        use to determine whether a prospective endpoint
                        exists, and if so, whether it would accept us as an
                        endpoint in turn.  A PROBE can also be used, by a
                        master, to obtain the correct "remoteCookie" - via
                        the subsequent PROBE_ACK - prior to sending its
                        first JOIN_GROUP.
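
The endpoint and cookie checks that begin the handling of event 6 might
be sketched in Python as follows.  This sketch assumes that a packet
failing either check is not processed any further, and that the set E
is held as a dictionary keyed by (address, port).  The names Endpoint
and check_incoming, and the (verdict, reply) return convention, are
hypothetical.

```python
from dataclasses import dataclass

PROBE, PROBE_ACK, PROBE_NACK = 5, 6, 7


@dataclass
class Endpoint:
    local_cookie: int        # localCookie_e: hard-to-guess 16-bit value
    remote_cookie: int = 0   # remoteCookie_e: initially arbitrary


def check_incoming(endpoints, s, command, src_cookie, dst_cookie,
                   trust_source=False):
    """Apply the checks made before a tunneled packet is processed.

    Returns (verdict, reply).  verdict is "process" or "drop"; reply is
    None, or a (command, srcCookie, dstCookie) tuple to send back to s.
    """
    e = endpoints.get(s)
    if e is None:
        # Unknown source: ignore, except that a PROBE is answered with a
        # PROBE_NACK carrying the incoming cookies swapped.
        if command == PROBE:
            return ("drop", (PROBE_NACK, dst_cookie, src_cookie))
        return ("drop", None)
    if not trust_source and dst_cookie != e.local_cookie:
        # Wrong cookie: tell s the correct one, but do not act on the packet.
        return ("drop", (PROBE_ACK, e.local_cookie, src_cookie))
    # Checks passed: remember the sender's cookie for later replies.
    e.remote_cookie = src_cookie
    return ("process", None)
```

Setting trust_source corresponds to omitting the cookie check, which
the note marked [*] permits only where source addresses cannot be
spoofed.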

8. Relaying RTP/RTCP Sessions

RTP/RTCP sessions [2] use two ports: an even-numbered port for RTP, and
an odd-numbered port (the next highest) for RTCP.  These sessions could
be relayed using two separate JOIN_GROUP commands - one for each port.

As an alternative, the single command JOIN_RTP_GROUP can be used.
This command works exactly like JOIN_GROUP, except that it implicitly
specifies a pair of ports: the port in the command, and that port +1.
Similarly, the command LEAVE_RTP_GROUP can be used to stop relaying an
RTP/RTCP session, as an alternative to using two separate LEAVE_GROUPs.
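
A receiver of JOIN_RTP_GROUP or LEAVE_RTP_GROUP could expand the single
command into its two (group, port) tuples along the following lines.
This is a minimal Python sketch; the function name expand_rtp_group is
hypothetical, and the range check is an assumption of the sketch rather
than a rule stated by the protocol.

```python
def expand_rtp_group(group, port):
    """Return the (group, port) tuples named by one *_RTP_GROUP command:
    the port carried in the command (RTP) and that port + 1 (RTCP)."""
    if not 0 <= port < 65535:
        # port + 1 must still fit in the 16-bit port field.
        raise ValueError("port %r leaves no room for port + 1" % (port,))
    return [(group, port), (group, port + 1)]
```

Each resulting tuple can then be handled exactly as if it had arrived
in its own JOIN_GROUP or LEAVE_GROUP command.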

A UMTP master should use JOIN_RTP_GROUP and LEAVE_RTP_GROUP only for
RTP/RTCP sessions - not for other kinds of sessions that also happen
to use port pairs.  A UMTP implementation may handle these commands
specially, based upon the knowledge that they represent RTP/RTCP
sessions.  For example, an implementation might wish to perform
RTP/RTCP-specific monitoring or statistics gathering, or to check the
RTP SSRC ("synchronization source") field in each incoming multicast
packet for possible collisions (i.e., in case two separate multicast
sources happen to be using the same SSRC, but have not yet detected
and corrected this themselves) [4].

9. Loop Detection and Avoidance

A data loop may occur if the two endpoints of a UMTP tunnel are connected
by multicast, or via another UMTP tunnel elsewhere.  Each UMTP
implementation must take steps to prevent a loop from occurring:
- When multicasting a decapsulated DATA packet, a UMTP implementation
should choose a TTL that's no larger than necessary.  It must also ensure
that if this packet is then re-received (via loop-back), it is not resent
back over the same tunnel.
- If a UMTP implementation receives a multicast packet whose source address
is also the endpoint of a tunnel, it must immediately shut down this tunnel
(& send a TEAR_DOWN command to the endpoint).
- If a UMTP implementation is running on a node for which no more than one
tunnel is expected (e.g., the node is a non-MBone-connected PC), then this
implementation should attempt to ensure that no more than one tunnel can
be started on this node.  (For example, the implementation could use
operating system-level locking to prevent more than one copy of itself
from running simultaneously.)
- If loops in the tunneling topology remain possible, then each end of
the tunnel should periodically send a short 'status' packet - containing
its unicast address - to a common multicast address, and also listen on
this address, checking the contents of each received status packet.
Should these contents be the same as its original status packet, it must
immediately shut down all of its tunnels.
(Note: These are the same loop detection techniques used by "mTunnel" [5] -
a similar multicast tunneling system, developed independently.)
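
The source-address check listed above (shutting down a tunnel when a
multicast packet arrives natively from that tunnel's endpoint) might
look like the following Python sketch.  It assumes that endpoints are
stored as (address, port) tuples and that a match is made on the
address alone; the function name tear_down_looped_tunnels is
hypothetical.

```python
def tear_down_looped_tunnels(endpoints, source_ip):
    """Remove from the set E every tunnel endpoint whose unicast address
    equals the source address of a natively received multicast packet.

    Returns the removed endpoints, so that the caller can send a
    TEAR_DOWN command to each of them.
    """
    looped = {e for e in endpoints if e[0] == source_ip}
    endpoints.difference_update(looped)  # mutates the caller's set in place
    return looped
```

A packet from a source that is not a tunnel endpoint leaves E
unchanged, and is then tunneled as usual.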

10. Security Considerations

Each UMTP implementation should specify, in advance, its set of allowable
endpoints (E), and thus should not permit arbitrary nodes to form tunnels.

Tunnels are authenticated by IP source address.  However, the 'cookie'
mechanism protects against source address spoofing.  To enhance this
protection, an implementation may choose to occasionally change its
'localCookies' while it is running.  (This should be done immediately prior
to sending a packet across the tunnel, so that the remote endpoint can
learn about the new cookie immediately.)
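
One way to generate and occasionally change a local cookie is sketched
below in Python, using the standard "secrets" module as the source of
hard-to-guess values.  The class name LocalCookie and the policy of
rotating after a fixed number of sent packets are assumptions made for
illustration; this document does not prescribe a rotation policy.

```python
import secrets


class LocalCookie:
    """Holds the localCookie for one tunnel endpoint."""

    def __init__(self, rotate_every=1000):
        self.value = secrets.randbits(16)   # hard-to-guess 16-bit value
        self._rotate_every = rotate_every   # hypothetical rotation policy
        self._sent = 0

    def for_next_send(self):
        """Return the srcCookie for the packet about to be sent.

        Any rotation happens here, immediately before the send, so that
        the remote endpoint learns the new cookie from that same packet.
        """
        if self._sent and self._sent % self._rotate_every == 0:
            self.value = secrets.randbits(16)
        self._sent += 1
        return self.value
```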

11. References

[1] LIVE.COM,
    The "liveGate" multicast tunneling server
    http://www.live.com/liveGate/
[2] Schulzrinne, H., Casner, S., Frederick, R., and Jacobson, V.,
    "RTP: A Transport Protocol for Real-Time Applications", RFC 1889,
    January, 1996.
[3] Jacobson, V., Casner, S.,
    "Compressing IP/UDP/RTP Headers for Low-Speed Serial Links"
    Work-in-Progress, Internet-Draft "draft-ietf-avt-crtp-04.txt"
    December, 1997.
[4] Casner, S.,
    Personal communication, December, 1997.
[5] Parnes, P.,
    "mTunnel"
    http://www.cdt.luth.se/~peppar/progs/mTunnel/

12. Author's Address

        Ross Finlayson,
        Live Networks, Inc. (LIVE.COM)
        email: finlayson@live.com
        WWW: http://www.live.com/