Internet Engineering Task Force                     Audio-Video Transport WG
INTERNET-DRAFT                         Schulzrinne/Casner/Frederick/Jacobson
draft-ietf-avt-rtp-06.txt                                  GMD/ISI/Xerox/LBL
                                                           November 28, 1994
                                                            Expires:  3/1/95

            RTP: A Transport Protocol for Real-Time Applications


Status of this Memo


This document is an Internet Draft.  Internet Drafts are working documents
of the Internet Engineering Task Force (IETF), its Areas, and its Working
Groups.   Note that other groups may also distribute working documents as
Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other documents
at any time.   It is not appropriate to use Internet Drafts as reference
material or to cite them other than as a ``working draft'' or ``work in
progress.''

Please check the I-D abstract listing contained in each Internet Draft
directory to learn the current status of this or any other Internet Draft.

Distribution of this document is unlimited.


                                  Abstract

     This memorandum describes the real-time transport protocol, RTP.
    RTP provides end-to-end network transport functions suitable for
    applications transmitting real-time data, such as audio, video
    or simulation data over multicast or unicast network services.
    RTP does not address resource reservation and does not guarantee
    quality-of-service for real-time services.  The data transport is
    augmented by a control protocol (RTCP) designed to provide minimal
    control and identification functionality, particularly in multicast
    networks.   RTP and RTCP are designed to be independent of the
    underlying transport and network layers.  The protocol supports the
    use of RTP-level translators and mixers.


***** DISCLAIMER: This document is not completed.   See the Open Issues
Section.
INTERNET-DRAFT          draft-ietf-avt-rtp-06.txt         November 28, 1994

This specification is a product of the Audio/Video Transport working group
within the Internet Engineering Task Force.   Comments are solicited and
should be addressed to the working group's mailing list at rem-conf@es.net
and/or the authors.


Contents


1 Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4

    1.1 Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . .   5

    1.2 Open Issues and Items to be Completed . . . . . . . . . . . . .   6

2 RTP Use Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . .   7

    2.1 Simple Multicast Audio Conference . . . . . . . . . . . . . . .   7

    2.2 Mixers  . . . . . . . . . . . . . . . . . . . . . . . . . . . .   8

    2.3 Translators . . . . . . . . . . . . . . . . . . . . . . . . . .   8

    2.4 Security  . . . . . . . . . . . . . . . . . . . . . . . . . . .   9

3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   9

4 Byte Order, Alignment, and Reserved Values  . . . . . . . . . . . . .  11

5 RTP Data Transfer Protocol  . . . . . . . . . . . . . . . . . . . . .  11

    5.1 RTP Fixed Header Fields . . . . . . . . . . . . . . . . . . . .  11

    5.2 SSRC Random Identifier Allocation . . . . . . . . . . . . . . .  13

    5.3 RTP Header Extension  . . . . . . . . . . . . . . . . . . . . .  14

6 RTP Control Protocol --- RTCP . . . . . . . . . . . . . . . . . . . .  15

    6.1 Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  15

    6.2 RTCP packet format  . . . . . . . . . . . . . . . . . . . . . .  16

    6.3 SR: Sender report . . . . . . . . . . . . . . . . . . . . . . .  17

    6.4 RR: Receiver report . . . . . . . . . . . . . . . . . . . . . .  20

    6.5 SDES: Source description  . . . . . . . . . . . . . . . . . . .  21

        6.5.1 CNAME: Canonical end-point identifier . . . . . . . . . .  22


Schulzrinne/Casner/Frederick/Jacobson        Expires 3/1/95        [Page 2]


        6.5.2 NAME: User name . . . . . . . . . . . . . . . . . . . . .  24

        6.5.3 EMAIL: User's electronic mail address . . . . . . . . . .  24

        6.5.4 PHONE: User's phone number  . . . . . . . . . . . . . . .  24

        6.5.5 LOC: Geographic user location . . . . . . . . . . . . . .  25

        6.5.6 TXT: Text describing the source . . . . . . . . . . . . .  25

        6.5.7 TOOL: Name of application or tool . . . . . . . . . . . .  25

        6.5.8 PRIV: Private extensions  . . . . . . . . . . . . . . . .  26

    6.6 BYE: Goodbye  . . . . . . . . . . . . . . . . . . . . . . . . .  27

    6.7 APP: Application-defined  . . . . . . . . . . . . . . . . . . .  27

7 RTP Translators and Mixers  . . . . . . . . . . . . . . . . . . . . .  28

    7.1 General Description . . . . . . . . . . . . . . . . . . . . . .  28

    7.2 Behavior of Mixers/Translators  . . . . . . . . . . . . . . . .  30

    7.3 Cascaded Mixers . . . . . . . . . . . . . . . . . . . . . . . .  30

8 Security  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  30

    8.1 Security Considerations . . . . . . . . . . . . . . . . . . . .  30

    8.2 Confidentiality . . . . . . . . . . . . . . . . . . . . . . . .  31

9 RTP over Network and Transport Protocols  . . . . . . . . . . . . . .  32

10 Summary of Protocol Constants  . . . . . . . . . . . . . . . . . . .  33

    10.1 RTCP packet types  . . . . . . . . . . . . . . . . . . . . . .  33

    10.2 SDES types . . . . . . . . . . . . . . . . . . . . . . . . . .  33

11 RTP Profiles and Payload Format Specifications . . . . . . . . . . .  34

A Implementation Notes  . . . . . . . . . . . . . . . . . . . . . . . .  35

    A.1 RTP Header Consistency Check  . . . . . . . . . . . . . . . . .  37

    A.2 Parsing RTCP Packets  . . . . . . . . . . . . . . . . . . . . .  38

    A.3 Generating SDES RTCP Packets  . . . . . . . . . . . . . . . . .  38

    A.4 Parsing SDES RTCP Packets . . . . . . . . . . . . . . . . . . .  39

    A.5 Generating a Random 32-bit Identifier . . . . . . . . . . . . .  40

    A.6 Computing the RTCP Transmission Period  . . . . . . . . . . . .  41

    A.7 Estimating the Interarrival Jitter  . . . . . . . . . . . . . .  44

    A.8 Determining the Expected Number of RTP Packets  . . . . . . . .  44

B Addresses of Authors  . . . . . . . . . . . . . . . . . . . . . . . .  45


1 Introduction


This memorandum specifies the real-time transport protocol (RTP),
which provides end-to-end delivery services for data with real-time
characteristics, for example, interactive audio and video.   RTP itself
does not provide any mechanism to ensure timely delivery or provide other
quality-of-service guarantees, but relies on lower-layer services to do so.
It does not guarantee delivery or prevent out-of-order delivery, nor does
it assume that the underlying network is reliable and delivers packets in
sequence.   The sequence numbers included in RTP allow the end system to
reconstruct the sender's packet sequence, but sequence numbers might also
be used to determine the proper location of a packet, for example in video
decoding, without necessarily decoding packets in sequence.  RTP typically
runs on top of UDP but may be used with other suitable underlying network
or transport protocols (see Section 9).   RTP transfers data in a single
direction, possibly to multiple destinations if supported by the underlying
network.

RTP is intended to follow the principles of Application Level Framing and
Integrated Layer Processing proposed by Clark and Tennenhouse [1].  That is,
RTP is intended to be malleable to provide the information required by a
particular application and will often be integrated into the application
processing rather than being implemented as a separate layer.   While RTP
is primarily designed to satisfy the needs of multi-participant multimedia
conferences, it is not limited to that particular application.   Storage
of continuous data, interactive distributed simulation, active badge, and
control and measurement applications may also find RTP applicable.

This document defines RTP, consisting of two closely-linked parts:


  o the real-time transport protocol (RTP), for exchanging data that has
    real-time properties.

  o the RTP control protocol (RTCP), for monitoring quality of service
    and for conveying information about the participants in an on-going
    session.  The latter aspect of RTCP is used for "loosely controlled"
    sessions, i.e., where there is no explicit membership control and
    set-up.   This functionality may be fully or partially subsumed by a
    session control protocol, which is beyond the scope of this document.


In addition to this document,  a complete specification of RTP for a
particular application will require one or more companion documents (see
Section 11):


  o a profile specification document, which defines payload type codes
    and which may be used to define extensions or modifications to RTP
    that are specific to a particular class of applications.   Typically
    an application will operate under only one profile.   A profile for
    audio and video data may be found in the companion Internet draft
    draft-ietf-avt-profile(1).

  o payload format specification documents, which define how a particular
    payload, such as an audio or video encoding, is to be carried in RTP.


A discussion of real-time services and algorithms for their implementation
and background on some of the RTP design decisions can be found in [2].

The current Internet does not support the widespread use of real-time
services.  High-bandwidth services using RTP, such as video, can
potentially seriously degrade other network services.  Thus, implementors
should take appropriate precautions to limit accidental bandwidth usage.
Application documentation should clearly outline the limitations and
possible operational impact of high-bandwidth real-time services on the
Internet and other network services.


1.1 Changes


This section highlights the changes since the July 1994 draft.


  o Length fields in RTCP all have zero as their lowest valid value to
    simplify error checking.

  o The algorithm determining the RTCP send frequency has been specified.

  o The RTP header file has been brought into agreement with the
    specification.
------------------------------
 1. ftp://ds.internic.net/internet-draft/draft-ietf-avt-profile-03.txt



  o The intended use of the RTP header extension mechanism has been
    clarified.

  o A separate table calls out the protocol constants.

  o The name 'bridge' has been changed to 'mixer'; generally, the behavior
    of mixers and translators has been clarified.   The description has
    been moved after the protocol has been described to avoid forward
    references.

  o A start has been made toward defining the delay jitter algorithm.  A
    few variations are being discussed.

  o PHONE and TOOL SDES items have been added as standard types, as these
    are likely to be used by a large number of applications.

  o For private, application-specific extensions, the PRIV SDES type has
    been added.

  o The implementation appendix adds parsing of SDES items.

  o The implementation appendix emphasizes that the header file is valid
    for big-endian bit order only.


1.2 Open Issues and Items to be Completed


There are several items which were not completed in time to make the
Internet Draft submission deadline, or which need wider input before a
decision can be made.  Because of these open items, this draft should not
be considered complete or ready to implement.


  o Additional explanation is needed for the algorithms to calculate the
    RTCP report rate, to calculate the interarrival jitter report value, to
    perform SSRC ID collision and loop detection, and to perform RTP and
    RTCP header validation.

  o Guidelines on the use of SDES items other than CNAME are needed.
    More than limited use of these values can negatively impact the RTCP
    reception reporting mechanism.

  o The numeric values assigned to the RTCP types still need to be
    decided.  There are implementations using SR=0, some using SR=1, and
    in addition a recommendation to set SR=201 in order to aid in header
    validity checking.

  o In the common case where no session member has transmitted anything,
    the receiver report would be empty.  Should it be permissible to simply
    omit it?   Is there anything to be gained by mandating its inclusion
    given that an application should probably not fall over when it is
    missing?


2 RTP Use Scenarios


The following sections describe some aspects of the use of RTP. The examples
were chosen to illustrate the basic operation of applications using RTP,
not to limit what RTP may be used for.    In these examples, RTP is
carried on top of IP and UDP, and follows the conventions established by
the profile for audio and video specified in the companion Internet draft
draft-ietf-avt-profile.


2.1 Simple Multicast Audio Conference


A working group of the IETF meets to discuss the latest protocol draft,
using the IP multicast services of the Internet for voice communications.
Through some allocation mechanism, the working group chair obtains a
multicast group address and a pair of ports.  One port is used for control
(RTCP) packets, and the other is used for audio data.  This address and port
information is distributed to the intended participants.  The exact details
of the allocation and distribution mechanism are beyond the scope of RTP.

The audio conferencing application used by each conference participant sends
audio data in small chunks of, say, 20 ms duration.  Each chunk of audio
data is preceded by an RTP header; RTP header and data are in turn contained
in a UDP packet.  The Internet, like other packet networks, occasionally
loses and reorders packets and delays them by variable amounts of time.  To
cope with these impairments, the RTP header contains timing information and
a sequence number that allow the receivers to reconstruct the timing seen by
the source, so that, in this example, a chunk of audio is delivered to the
speaker every 20 ms.  The sequence number can also be used by the receiver
to estimate how many packets are being lost.  Each RTP packet also indicates
what type of audio encoding (such as PCM, ADPCM or GSM) is being used,
so that senders can change the encoding during a conference, for example,
to accommodate a new participant that is connected through a low-bandwidth
link.
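
The loss estimate mentioned above can be sketched in C as follows.  This is
an illustration only, not part of the protocol: the structure and function
names are hypothetical, and the sketch assumes that the receiver keeps one
such record per source and that the 16-bit sequence number wraps from 65535
back to 0.

```c
#include <stdint.h>

/* Hypothetical per-source receiver state; these names are illustrative
 * and do not appear in the RTP specification. */
struct seq_state {
    uint16_t first_seq;   /* sequence number of the first packet seen */
    uint16_t highest_seq; /* highest sequence number seen so far      */
    uint32_t cycles;      /* how many times the 16-bit number wrapped */
    uint32_t received;    /* packets received, duplicates included    */
};

/* Start tracking a source with the first packet received from it. */
void seq_init(struct seq_state *s, uint16_t seq)
{
    s->first_seq = seq;
    s->highest_seq = seq;
    s->cycles = 0;
    s->received = 1;
}

/* Record one more received packet. */
void seq_update(struct seq_state *s, uint16_t seq)
{
    /* Distance ahead of the highest number seen, modulo 2**16. */
    uint16_t ahead = (uint16_t)(seq - s->highest_seq);

    if (ahead != 0 && ahead < 0x8000) {
        if (seq < s->highest_seq)
            s->cycles++;          /* wrapped from 65535 back to 0 */
        s->highest_seq = seq;
    }
    s->received++;
}

/* Packets the sender is presumed to have sent so far. */
uint32_t seq_expected(const struct seq_state *s)
{
    uint32_t extended = (s->cycles << 16) | s->highest_seq;
    return extended - s->first_seq + 1;
}

/* Estimated loss; may be negative if duplicates were received. */
int32_t seq_lost(const struct seq_state *s)
{
    return (int32_t)(seq_expected(s) - s->received);
}
```

A receiver would call seq_init() for the first packet from a source and
seq_update() for each later one.  Packets that arrive late or out of order
still count as received, so the estimate corrects itself once they arrive.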

Each audio source has to have its timing reconstructed separately at the
receiver.  Sources are identified by the synchronization source identifier
(SSRC), not their network address.  The SSRC identifier is a randomly chosen
value meant to be globally unique within a particular conference.

Since members of the working group join and leave during the conference, it
is useful to know who is participating at any moment and how well they are
receiving the audio data.   For that purpose, each instance of the audio
application in the conference periodically multicasts a reception report
plus the name of its user on the RTCP (control) port.  The email address
and other user information may also be included.  A site sends the RTCP BYE
(Section 6.6) packet when it leaves a conference.  The RTCP reception report
indicates how well the current speaker is being received and may be used to
control adaptive encodings.


2.2 Mixers


So far, we have assumed that all sites want to receive audio data in the
same format.   However, this may not always be appropriate.   Consider
the case where participants in one area are connected through a low-speed
link to the majority of the conference participants, who enjoy high-speed
network access.   Instead of forcing everyone to use a lower-bandwidth,
reduced-quality audio encoding, a mixer is placed near the low-bandwidth
area.  This mixer resynchronizes incoming audio packets to reconstruct the
constant 20 ms spacing generated by the sender, mixes these reconstructed
audio streams, translates the audio encoding to a lower-bandwidth one and
forwards the lower-bandwidth packet stream to the low-bandwidth sites.

Since the mixer has constructed a new (mixed) stream of audio, it is
now the synchronization source for the stream.  In order to preserve the
identity of the sites which are speaking, the mixer inserts one or more
contributing source (CSRC) identifiers after the fixed RTP header.  These
identifiers are the synchronization source identifiers (SSRC) of those sites
that contributed to the mixed packet.  An example of this is shown for mixer
M1 in Fig. 1.  As name and location information is received by the mixer in
RTCP packets from the high-speed sites, that information is passed on to the
receivers served by the mixer, either aggregated or as received.
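
As a rough illustration of the CSRC insertion described above, the
following C sketch writes the header of a mixed packet using the layout
given in Section 5.1.  The function name and its parameters are
hypothetical, and the sketch assumes no padding and no header extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in network byte order (most significant first). */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: write the fixed header followed by the CSRC list.
 * The mixer's own identifier goes in the SSRC field; the SSRCs of the
 * sites that were mixed go in the CSRC list.
 * Returns the number of octets written, or 0 if the arguments are bad. */
size_t mixer_write_header(uint8_t *buf, size_t buflen,
                          unsigned pt, unsigned marker,
                          uint16_t seq, uint32_t timestamp,
                          uint32_t mixer_ssrc,
                          const uint32_t *csrc, unsigned cc)
{
    size_t need = 12 + 4 * (size_t)cc;
    unsigned i;

    if (cc > 15 || pt > 127 || buflen < need)
        return 0;

    buf[0] = (uint8_t)(0x80 | cc);                /* T=2, P=0, X=0, CC */
    buf[1] = (uint8_t)(((marker & 1) << 7) | pt); /* M, PT             */
    buf[2] = (uint8_t)(seq >> 8);
    buf[3] = (uint8_t)seq;
    put_u32(buf + 4, timestamp);
    put_u32(buf + 8, mixer_ssrc);
    for (i = 0; i < cc; i++)
        put_u32(buf + 12 + 4 * i, csrc[i]);
    return need;
}
```

For mixer M1 in Fig. 1, the call would pass 48 as mixer_ssrc and the list
(1, 17) as csrc, producing a 20-octet header.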


2.3 Translators


Not all sites are reachable by IP multicast.  For these sites, mixing may
not be necessary, but a translation of the underlying transport protocol
is.   RTP-level gateways that do not mix packets from different sources
are called translators in this document.  Application-level firewalls, for
example, will not let any IP packets pass.  Two translators are installed,
one on either side of the firewall, the outside one funneling all multicast
packets received through a secure connection to the translator inside the
firewall.  The translator inside the firewall sends them again as multicast
packets to a multicast group restricted to the site's internal network.
Other examples include the connection of a group of hosts speaking only
IP/UDP to a group of hosts that understand only ST-II. The packet-by-packet
encoding translation of single sources is another example.

The SSRC identifier makes it possible to identify individual sources even
though they all pass through the same translator, i.e., carry the same
network source address.  In Fig. 1, hosts T1 and T2 are translators.


      [E1]                                    [E6]
       |                                       |
 E1:17 |                                 E6:15 |
       |                                       |   E6:15
       V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
      (M1)-------------><T1>-----------------><T2>--------------->[E7]
       ^                 ^     E4:47           ^   E4:47
  E2:1 |           E4:47 |                     |   M3:89 (64,45)
       |                 |                     |
      [E2]              [E4]     M3:89 (64,45) |
                                               |            legend:
[E3] --------->(M2)----------->(M3)------------|          [End system]
       E3:64        M2:12 (64)  ^                         (Mixer)
                                | E5:45                   <Translator>
                                |
                               [E5]          source: SSRC (CSRCs)
                                             ------------------->

   Figure 1:  Sample RTP network with end systems, mixers and translators

2.4 Security


Conference participants would often like to ensure that nobody else can
listen to their deliberations.   Encryption provides that privacy.   In
Section 8.1, RTP specifies a mechanism for using encryption, but the actual
key distribution must be accomplished by external means.


3 Definitions


RTP payload is the data following the RTP fixed header and the CSRC list.
    The payload format and interpretation are beyond the scope of this
    memo.  Examples of payload include audio samples and video data.

RTP packets consist of the fixed RTP header, a possibly empty list of
    contributing sources (CSRC list), and the payload, if any.   Some
    underlying protocols may require an encapsulation of the RTP packet to
    be defined.   A single packet of the underlying protocol may contain
    several RTP packets if permitted by the encapsulation method.

(protocol) port is the "abstraction that transport protocols use to
    distinguish among multiple destinations within a given host computer.
    TCP/IP protocols identify ports using small positive integers." [4]
    The transport selectors (TSEL) used by the OSI transport layer are
    equivalent to ports.


Synchronization source: All packets from a synchronization source form
    part of the same timing and sequence number space.  Examples
    of synchronization sources are a microphone, a mixer or a camera.
    A receiver groups packets by synchronization source for playback.
    Typically a single synchronization source emits a single medium (e.g.,
    audio or video).    A synchronization source may change its data
    format, e.g., audio encoding, over time.  Synchronization sources are
    identified by the SSRC value, a numeric identifier contained in the RTP
    header.  SSRC is defined in Section 5.2.

Contributing source: A contributing source is a source whose data
    contributed to the data coming from a synchronization source.
    Contributing source identifiers are used by mixers (see below) to
    indicate which sources were combined to generate a particular packet.
    An example application is audio conferencing, where a mixer could
    indicate all the speakers whose speech was combined to produce the
    outgoing packet, allowing the receiver to indicate the current speaker
    even though all the audio packets originated from the same
    synchronization source.

End system: An end system generates the content to be sent in RTP packets
    and consumes the content of received RTP packets.  An end system can act
    as one or more synchronization sources in a given media session, but
    typically only one.

Mixer: A mixer receives RTP packets from one or more sources, possibly
    changes their data format, combines them in some manner and then
    forwards a new RTP packet.   Since the timing among multiple input
    sources will not generally be synchronized, the mixer will make timing
    adjustments among the streams and generate its own timing for the
    combined stream.  Thus, all data packets originating from a mixer will
    be identified as having the mixer as their synchronization source.
    A mixer may indicate the contributing sources (see above) for the
    convenience of the receiver.

Translator: A translator forwards RTP packets with their synchronization
    source intact.   Examples of translators include devices that convert
    encodings without mixing or convert from multicast to unicast, and
    application-level filters in firewalls.

QOS monitor: A (QOS) monitor is an application that receives RTCP messages,
    including quality-of-service reports, and estimates the current quality
    of service for monitoring, fault diagnosis and long-term statistics.

Recorder: A recorder records RTP and RTCP packets for later playback.  A
    recorder is usually separate from an end system.   It should try to
    recreate the timing at the sender, without the jitter introduced by the
    network, using the RTP timestamp.  A recorder may not have access to
    the same encryption keys as the other participants in a session, in
    which case sender timing must be estimated if the RTP timestamps are
    encrypted.


Non-RTP mechanisms: refers to other protocols and mechanisms that may be
    needed to provide a usable service.   In particular, for multimedia
    conferences, a conference control application may distribute multicast
    addresses and keys for encryption and authentication, negotiate the
    encryption algorithm to be used, and determine the mapping from the
    RTP format field to the actual data format used.    For simple
    applications, electronic mail or a conference database may also be
    used.   The specification of such mechanisms is outside the scope of
    this memorandum.


4 Byte Order, Alignment, and Reserved Values


All integer fields are carried in network byte order, that is, most
significant byte (octet) first.   This byte order is commonly known as
big-endian.  The transmission order is described in detail in [5], Appendix
A. Unless otherwise noted, numeric constants are in decimal (base 10).

All header data is aligned to its natural length, i.e., 16-bit words are
aligned on even byte addresses, 32-bit long words are aligned at addresses
divisible by four, etc.  Octets designated as padding have the value zero.
Fields designated as "reserved" or R are set aside for future use; they
should be set to zero by senders and ignored by receivers.

An NTP timestamp is represented as a 64-bit unsigned fixed-point number, in
seconds relative to 0h UTC on 1 January 1900.  The integer part is in the
first 32 bits and the fraction part in the last 32 bits [6].
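
As an illustration of this format, the following C sketch converts a time
expressed as seconds and microseconds since 0h UTC on 1 January 1970 (the
convention of many operating systems, and an assumption of this example)
into a 64-bit NTP timestamp.  The function name is hypothetical.  The
constant 2208988800 is the number of seconds between the two epochs.

```c
#include <stdint.h>

/* Seconds from 0h UTC 1 January 1900 to 0h UTC 1 January 1970. */
#define NTP_UNIX_OFFSET UINT64_C(2208988800)

/* Hypothetical helper: build a 64-bit NTP timestamp with the integer
 * part in the upper 32 bits and the fraction in the lower 32 bits.
 * usec must be less than 1000000. */
uint64_t ntp_from_unix(uint32_t unix_sec, uint32_t usec)
{
    uint64_t sec  = (uint64_t)unix_sec + NTP_UNIX_OFFSET;
    /* Scale microseconds to units of 2**-32 seconds. */
    uint64_t frac = ((uint64_t)usec << 32) / 1000000u;

    /* Shifting keeps only the low 32 bits of the seconds count. */
    return (sec << 32) | frac;
}
```

When the value is placed in a packet, the upper 32 bits are sent first, in
network byte order, followed by the lower 32 bits.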


5 RTP Data Transfer Protocol


5.1 RTP Fixed Header Fields


The RTP header has the following format:


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+


The first twelve octets are present in every RTP packet, while the list of
CSRC identifiers is present only when inserted by a mixer.  The fields have
the following meaning:


type (T): 2 bits
    Identifies the type of RTP packet.  The type of the packet described
    here is two (2).   (The value of 2 was chosen to easily distinguish
    packets from those of the prior version of RTP and the protocol used by
    the vat audio tool.)

padding (P): 1 bit
    If the padding bit is one, the packet contains one or more additional
    octets at the end which are not part of the payload.  The very last
    octet of the packet is a count of how many padding octets should be
    ignored.   Padding may be needed by some encryption algorithms with
    fixed block sizes or for carrying several RTP packets in a lower-layer
    protocol data unit.

extension (X): 1 bit
    If the extension bit is set, the fixed header is followed by exactly one
    header extension, with a format defined in Section 5.3.

CSRC count (CC): 4 bits
    This field contains the number of CSRC identifiers that follow the
    fixed header.

marker (M): 1 bit
    The interpretation of this field is defined by a profile.  A profile
    may define additional marker bits by reducing the number of bits in the
    payload type field.

payload type (PT): 7 bits
    The payload type forms an index into a table defined through profiles
    or non-RTP mechanisms (see Section 3).   The mapping establishes the
    format of the RTP payload and determines its interpretation by the
    application.  A profile specifies a standard mapping.  An initial set
    of default mappings for audio and video is specified in the companion
    profile document RFC TBD, and may be extended in future editions of the
    Assigned Numbers RFC.

sequence number: 16 bits
    The sequence number counts RTP packets.  The sequence number increments
    by one for each packet sent.   The sequence number may be used by
    the receiver to detect packet loss and to restore packet sequence.
    The initial value of the sequence number is random (unpredictable) to
    make known-plaintext attacks on encryption more difficult, even if the
    source itself does not encrypt, because the packets may flow through a
    translator that does.

timestamp: 32 bits
    The timestamp reflects the sampling instant of the first octet in the
    RTP data packet.  The timestamp is incremented with the nominal clock
    frequency determined by the format of data carried as payload.   For
    example, for fixed-rate audio, the timestamp would likely increment by
    one for each sample.   The clock frequency is determined statically
    for each payload type by a profile or payload format specification,
    or dynamically through non-RTP means.   If RTP packets are generated
    periodically, the nominal sampling instant is to be used, not a reading
    of the system clock.  For example, for 160-octet audio packets and a
    one-octet-per-sample encoding, the timestamp should be increased by 160
    for each block of 160 samples read from the input device whether the
    block is transmitted in a packet or dropped as silent.   All samples
    must be counted so that the clock is stable.

    Several consecutive RTP packets may have equal timestamps if they are
    (logically) generated at once, e.g., belong to the same video frame.
    The initial value of the timestamp is random, as for the sequence
    number.

SSRC: 32 bits
    Synchronization source identifier.  This value is chosen randomly, with
    the intent that no two synchronization sources within the same media
    session will have the same SSRC value.   Details are described in
    Section 5.2.

CSRC: up to 15 items, 32 bits each
    Zero or more contributing source identifiers.  The number of
    identifiers is given by CC. There can be no more than 15 contributing
    sources identified.   CSRC identifiers are inserted by mixers, using
    the SSRC identifiers of contributing sources.  For example, for audio
    packets, all sources that were mixed together to create a packet are
    enumerated, allowing correct talker indication at the receiver.
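
The field layout above can be read from a received packet roughly as shown
in the following C sketch.  It is an illustration under stated assumptions,
not a reference implementation: the structure and function names are
hypothetical, packets carrying a header extension (Section 5.3) are simply
rejected, and the padding count in the last octet is assumed to include
that octet itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of the RTP header fields. */
struct rtp_header {
    unsigned type, padding, extension, cc, marker, pt;
    uint16_t seq;
    uint32_t timestamp, ssrc;
    uint32_t csrc[15];
    size_t   payload_offset; /* where the payload starts in the packet */
    size_t   payload_len;    /* payload octets, padding excluded       */
};

/* Read a 32-bit value stored in network byte order. */
static uint32_t get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the packet is malformed or unsupported. */
int rtp_parse(const uint8_t *pkt, size_t len, struct rtp_header *h)
{
    size_t offset, pad = 0;
    unsigned i;

    if (len < 12)
        return -1;                     /* shorter than the fixed header */

    h->type      = pkt[0] >> 6;
    h->padding   = (pkt[0] >> 5) & 1;
    h->extension = (pkt[0] >> 4) & 1;
    h->cc        = pkt[0] & 0x0f;
    h->marker    = pkt[1] >> 7;
    h->pt        = pkt[1] & 0x7f;
    h->seq       = (uint16_t)((pkt[2] << 8) | pkt[3]);
    h->timestamp = get_u32(pkt + 4);
    h->ssrc      = get_u32(pkt + 8);

    if (h->type != 2 || h->extension)
        return -1;                     /* extensions not handled here */

    offset = 12 + 4 * (size_t)h->cc;
    if (len < offset)
        return -1;                     /* CSRC list is truncated */
    for (i = 0; i < h->cc; i++)
        h->csrc[i] = get_u32(pkt + 12 + 4 * i);

    if (h->padding) {
        pad = pkt[len - 1];            /* assumed to count itself */
        if (pad == 0 || pad > len - offset)
            return -1;
    }
    h->payload_offset = offset;
    h->payload_len    = len - offset - pad;
    return 0;
}
```

Extracting the fields with shifts and masks, rather than with C bit
fields, keeps the sketch independent of the host's byte and bit order.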


5.2 SSRC Random Identifier Allocation


The SSRC identifier described above is a random 32-bit quantity that is
intended to be globally unique within a media session.  In particular, a
local network address, such as the IPv4 address, is not to be used as an SSRC
identifier.  An example of how to generate such an identifier is presented
in Section A.5.

If a source discovers at any time that another source is already using the
same SSRC identifier, it randomly chooses a different SSRC identifier.  If
a source has transmitted packets with the colliding identifier, it should
send a BYE control packet with the old SSRC identifier before switching
to allow applications to clear any records for this SSRC. Statistics for
receiver reports are keyed to SSRC (not CNAMEs or other identifiers); thus,
a receiver does not have to attempt to carry over statistics when a source
changes SSRC identifiers.  A source that changes its SSRC identifier should
reset the statistics transmitted through sender reports.

If N is the number of sources and L the length of the identifier (here, 32
bits), the probability that two sources independently pick the same value
can be approximated for large N [7, p.  33] as 1 - exp(- N**2 / 2**(L+1)).
For N=1000, the probability is roughly 0.01%.
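
The approximation above is easy to evaluate numerically.  The following
Python sketch (the function name is hypothetical) uses math.expm1 so that
precision is kept when the probability is small.

```python
import math


def collision_probability(n_sources, id_bits=32):
    """Approximate probability that two of n_sources pick the same id."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return -math.expm1(-(n_sources ** 2) / 2 ** (id_bits + 1))
```

For N=1000 and L=32 this evaluates to about 1.16e-4, in line with the
figure quoted above.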

Because the random identifiers are globally unique, they can be used to
detect loops that may be introduced by bridges.    For each CSRC, the
application should check that packets contain a single SSRC value.  However,
duplicate SSRC values may also indicate a collision resolution in progress.


5.3 RTP Header Extension


The existing RTP data packet header is believed to be complete for the set
of functions required in common across all the application classes that RTP
might support.  If a particular class of applications, operating under one
profile, needs additional functionality, that profile may define additional
fixed fields to follow the SSRC field of the existing fixed header.  If it
turns out that additional functionality is needed across all profiles, then
a new version of RTP should be defined to make a permanent change to the
fixed header.

However, an escape hatch is provided to allow individual implementations to
experiment with new mechanisms that require additional information to be
carried in the RTP data packet header.  The header extension mechanism is
designed so that it may be ignored by other interoperating implementations
that have not been extended.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      defined by profile       |           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

If the X bit in the RTP header is one, a variable-length header extension is
appended to the RTP header, following the CSRC list if present.  The header
extension contains a 16-bit length field that counts the number of 32-bit
words in the extension, excluding the four-octet extension header (therefore
zero is a valid length).  Only a single extension may be appended to the
RTP data header.  To allow multiple interoperating implementations to each
experiment independently with different header extensions, or to allow a
particular implementation to experiment with more than one type of header
extension, the first 16 bits of the header extension are left open for
distinguishing identifiers or parameters.  The format of these 16 bits is
to be defined by the profile specification under which the implementations
are operating.  This RTP specification does not define any header extensions
itself.

Note that this mechanism is intentionally cumbersome.  Many candidate uses
would better be done another way, for example with a profile-specific
extension to the fixed header.    In particular, additional information
required for a particular payload type, such as a video encoding, should be
carried in the payload section of the packet.  This might be in a header
that is always present at the start of the payload section, or might be
indicated by a reserved value in the data pattern.

Every conformant RTP application needs to be able to skip, but need not
process, the header extension.
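
Skipping the extension can be sketched in Python as follows.  The layout
of the fixed header is not reproduced in this section, so the sketch
assumes a 12-octet fixed header whose first octet carries the X bit as
0x10 and the CSRC count (CC) in its low four bits.  The function name is
hypothetical.

```python
import struct

FIXED_HEADER_LEN = 12  # assumed: flags/PT/sequence, timestamp, SSRC


def payload_offset(packet):
    """Return the offset of the payload, skipping CSRCs and any extension."""
    if len(packet) < FIXED_HEADER_LEN:
        raise ValueError("packet shorter than fixed header")
    first = packet[0]
    has_extension = bool(first & 0x10)  # assumed position of the X bit
    csrc_count = first & 0x0F           # assumed position of CC
    offset = FIXED_HEADER_LEN + 4 * csrc_count
    if has_extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated extension header")
        # 16 profile-defined bits, then the length in 32-bit words,
        # not counting the four-octet extension header itself.
        _profile_bits, words = struct.unpack_from("!HH", packet, offset)
        offset += 4 + 4 * words
    if offset > len(packet):
        raise ValueError("header runs past end of packet")
    return offset
```

The profile-defined 16 bits are read but deliberately ignored, which is
all that an implementation without the extension needs to do.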


6 RTP Control Protocol --- RTCP


6.1 Introduction


The RTP control protocol (RTCP) provides two functions:  (1) monitoring the
distribution of data, and (2) conveying minimal session information.

The first function is performed by the RTCP sender or receiver report
packets, described below.  This function is an integral part of RTP's
role as a transport protocol, and is mandatory when RTP is used in the IP
multicast environment.

The second RTCP function provides support for "loosely controlled" sessions,
i.e., where participants enter and leave without membership control and
parameter negotiation.

RTCP packets are sent to all members of a session,  using the same
distribution mechanism as for data packets.  The underlying protocol must
provide multiplexing of the data and control packets, for example using
separate port numbers with UDP. The period between RTCP packets should be
varied randomly to avoid synchronization of all sources.  Its mean should
increase with the number of participants in the session to limit the growth
of the overall network and host interrupt load to a small fraction of the
load induced by the media data.  An algorithm for calculating the period is
given in Appendix A.6.

The length of the RTCP period determines, for example, how long a receiver
joining a session has to wait until it can identify the source.  A receiver
may remove from its list of active sites a site that it has not heard from
for a given time-out period; the time-out period may depend on the
number of sites or the observed average interarrival time of RTCP messages.
A small multiple of the RTCP period is suggested to allow for packet loss.

Not every RTCP message has to contain all SDES descriptions for a source;
for example, SDES EMAIL might only be sent every few messages.


6.2 RTCP packet format


Each RTCP packet begins with a fixed part similar to that of RTP data
packets, followed by structured elements that may be of variable length but
always end on a 32-bit boundary.

The length field and alignment requirement are included to make RTCP packets
"stackable".  Multiple RTCP packets may be sent in a single packet of the
lower layer protocol such as UDP to combine as much information as possible
into one packet, particularly for translators and mixers.  This is advisable
since per-packet processing overhead in the network and in many operating
systems is high.   For example, in a Unix operating system running the X
windowing system, each packet is likely to cause a hardware interrupt, a
software interrupt, a context switch and an X event.

Any combination of RTCP packets may be stacked in one lower-layer packet,
and each RTCP packet is processed independently.  An application may skip
RTCP packets with types unknown to it.  Additional RTCP packet types may be
registered with the Internet Assigned Numbers Authority.
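
Because each RTCP packet carries its own length, a receiver can walk a
stack of packets without understanding every type.  The Python sketch
below shows one way to do this; the function name is hypothetical and the
header layout (2-bit type, padding bit, 5-bit count, 8-bit packet type,
16-bit length in 32-bit words minus one) follows the diagrams in the
sections below.

```python
import struct


def split_rtcp(datagram):
    """Yield (packet_type, count, body) for each RTCP packet in a datagram."""
    offset = 0
    while offset < len(datagram):
        if len(datagram) - offset < 4:
            raise ValueError("truncated RTCP header")
        first, ptype, length = struct.unpack_from("!BBH", datagram, offset)
        # length counts 32-bit words minus one, header included, so the
        # size is at least four octets and the loop always advances.
        size = 4 * (length + 1)
        if offset + size > len(datagram):
            raise ValueError("RTCP packet runs past end of datagram")
        count = first & 0x1F  # RC or CC, the low five bits
        yield ptype, count, datagram[offset + 4:offset + size]
        offset += size
```

A caller dispatches on the packet type and simply ignores bodies whose
type it does not recognize.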

The first RTCP packet is always a report packet, which may be in either of
two forms:  a sender report (SR) for sources that have recently transmitted
RTP data packets or a receiver report (RR) for sources that have not recently
sent RTP data.  It may optionally be followed by more receiver report (RR)
packets if the number of sources being reported exceeds 31, the number that
will fit into one SR or RR packet.  These one or more report packets are
followed by an SDES packet containing at least the CNAME item.  Finally,
APP, BYE or other, yet to be defined packet types may follow in any order.
Packet types may appear more than once.


6.3 SR: Sender report


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T=2|P|   RC    |  PT=RTCP_SR=0 |           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        SSRC of sender                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             NTP timestamp, most significant word              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            NTP timestamp, least significant word              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         RTP timestamp                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     sender's packet count                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      sender's octet count                     |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                  SSRC_1 (SSRC of first source)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             cumulative number of packets received             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             cumulative number of packets expected             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    interarrival jitter                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          last SR (LSR)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  delay since last SR (DLSR)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  SSRC_2 (SSRC of second source)               |
                               ...
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                  application-specific extensions              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The sender report packet consists of two sections.  The first section, the
actual sender report, is present in every sender report packet and occupies
the first 28 octets of the packet:  the four-octet header followed by 24
octets of sender information.  The second section contains zero or more
reception reports
depending on the number of sources heard since the last report.  The fields
have the following meaning:


type (T): 2 bits
    The current value of the type identifier is 2 (two), as in RTP packets.

padding (P): 1 bit
    If the padding bit is one, the packet contains some additional octets
    at the end which are not part of the payload.  The very last octet of
    the packet is a count of how many padding octets should be ignored.
    Padding may be needed by some encryption algorithms with fixed block
    sizes.

reception report count (RC): 5 bits
    This field contains the number of reception report blocks contained in
    this packet.  A value of zero is valid.

packet type (PT): 8 bits
    The value of the packet type identifier is the constant RTCP_SR, defined
    in Appendix A.

length: 16 bits
    The length of this RTCP packet in 32-bit words minus one, including the
    header and any padding.(2)

SSRC: 32 bits
    Synchronization source identifier for the sender of this RTCP packet.

NTP timestamp: 64 bits
    The NTP timestamp corresponds to the wallclock time when this
    traffic report is sent so that it may be used in combination with
    timestamps returned in reception reports from other receivers to
    measure round-trip propagation to those receivers.   Receivers should
    expect that the measurement accuracy of the timestamp may be limited to
    far less than the resolution of the NTP timestamp.   The measurement
    uncertainty of the timestamp is not transmitted as it is usually
    difficult to estimate with any degree of reliability.  A sender that
    can keep track of real time but has no notion of wallclock time may use
    the elapsed time of the session instead.  It is permissible to use the
    sampling clock to estimate elapsed wallclock time.  The elapsed time is
    assumed to be less than 68 years, so the high bit will be zero.  A
    sender that has no notion of wallclock or elapsed time may set the NTP
    timestamp to zero.

RTP timestamp: 32 bits
    Reference timestamp that corresponds to the same time as the NTP
    timestamp (above).  This correspondence may be used for intra-
    and inter-media synchronization for sources whose NTP timestamps are
    synchronized, and may be used by media-independent receivers to
    estimate the nominal RTP clock frequency.  This RTP timestamp is
    calculated from the corresponding NTP timestamp using the relationship
    between the RTP timestamp counter and real time as maintained by
    periodically checking the real time at a sampling instant.

sender's packet count: 32 bits
    The total number of RTP packets transmitted by the source from the
    start of transmission until the time this SR packet was generated.

sender's octet count: 32 bits
    The total number of octets transmitted in RTP packets by the source
    from the start of transmission until the time this SR packet was
    generated.  The octet count includes only the payload of RTP data
    packets.  This field can be used to estimate the overall payload
    data rate.
------------------------------
 2. The offset of one makes zero a valid length and avoids possible infinite
loops.
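
The NTP timestamp field described above can be filled from a Unix-style
clock as sketched below in Python.  This is an illustration only:  it
assumes the sender has a clock counting seconds since 1970, and the
function names are hypothetical.  The second helper extracts the middle
32 bits that a receiver later echoes in the LSR field.

```python
NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def to_ntp(unix_seconds):
    """Split a Unix time in seconds into the two 32-bit NTP words."""
    ntp = unix_seconds + NTP_UNIX_OFFSET
    whole = int(ntp)
    msw = whole & 0xFFFFFFFF                        # seconds since 1900
    lsw = int((ntp - whole) * 2 ** 32) & 0xFFFFFFFF  # binary fraction
    return msw, lsw


def middle_32(msw, lsw):
    """Middle 32 bits of the 64-bit timestamp, in units of 1/65536 s."""
    return ((msw & 0xFFFF) << 16) | (lsw >> 16)
```

With a floating-point clock the low-order bits of the fraction carry no
information, which is consistent with the remark above that accuracy may
be far less than the resolution of the field.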


Each reception report in the second section of the sender report packet
conveys statistics on the reception of RTP packets from a single
synchronization source.  These statistics are:


SSRC_n (source identifier): 32 bits
    SSRC identifier of the source to which the information in this
    reception report pertains.

cumulative number of packets received: 32 bits
    The field contains the total number of RTP packets received from the
    source since the beginning of reception.  By taking the difference in
    this number between two reception reports from a given source, and
    dividing by the interval between those two reports, a received packet
    rate may be calculated.

cumulative number of packets expected: 32 bits
    The field contains the total number of packets expected by the
    receiver, which may be computed according to the algorithm in
    Appendix A.8.  Using this field together with the cumulative number of
    packets received, a monitor can measure the packet loss rate over both
    short and long time periods.  The number of packets expected may also
    be used to judge the statistical validity of any loss estimates.  (For
    example, 1 out of 5 packets lost has a different significance than 200
    out of 1000.)  There will be no loss indication (and likely no
    reception report issued) for a source if all recent packets from that
    source have been lost.

interarrival jitter: 32 bits
    The interarrival jitter field should be an estimate of the statistical
    variance of the RTP data interarrival time, measured in timestamp units
    and expressed as an unsigned integer.  A particular algorithm is not
    prescribed, but a sample algorithm is shown in Section A.7.   If a
    receiver cannot estimate this value, it should use a value of zero.

last SR timestamp (LSR): 32 bits
    The middle 32 bits of the NTP timestamp (bytes 11 to 14) received as
    part of the most recent RTCP sender report (SR) packet from the source
    being reported.

delay since last SR (DLSR): 32 bits
    Delay, expressed in units of 1/65536 seconds, between receiving the
    last SR packet from the source being reported and sending this
    reception report.  The 'last SR' and 'delay since last SR' fields
    allow the source that sent the SR to compute the round-trip time.
    This may be used to cluster nodes according to propagation delay.
    If the reception report for SSRC S from receiver R arrives at time A
    at S, S can compute the round-trip time to R as A - LSR - DLSR.  Note
    that the round-trip time may be of limited use for many real-time
    applications and that some links have very asymmetric delays.
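
The round-trip computation can be written out as a short Python sketch.
It assumes that the arrival time A has been converted to the same format
as LSR, that is, the middle 32 bits of an NTP timestamp, so that all three
quantities are in units of 1/65536 seconds.  The function name is
hypothetical.

```python
def round_trip_time(arrival, lsr, dlsr):
    """Round-trip time in seconds, computed as A - LSR - DLSR.

    All three arguments are 32-bit values in units of 1/65536 second.
    The subtraction is done modulo 2**32 so that wraparound of the
    16-bit seconds part does not disturb the result.
    """
    units = (arrival - lsr - dlsr) & 0xFFFFFFFF
    return units / 65536.0
```

If the clocks involved are coarse, the result can come out slightly
negative; after the modulo operation this shows up as a very large value,
which a caller may wish to treat as zero.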


All reported numbers except interarrival jitter are cumulative.    The
difference between two reports can be used to estimate recent quality of
the distribution.  A fixed clock (NTP timestamp) is chosen so that quality
monitors do not have to be cognizant of the clock rate for the current
encoding.  If a source cannot compute a particular value, it inserts a value
of zero.
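
Taking the difference between two reports can be sketched as follows in
Python.  The function name and the tuple layout are hypothetical; the
counters are treated as 32-bit values that may wrap.

```python
def interval_stats(prev, curr, seconds):
    """Compare two reception reports for the same source.

    prev and curr are (received, expected) cumulative counts and
    seconds is the time between the two reports.  Returns the received
    packet rate and the fraction of packets lost in the interval.
    """
    received = (curr[0] - prev[0]) & 0xFFFFFFFF
    expected = (curr[1] - prev[1]) & 0xFFFFFFFF
    # Duplicated packets can make received exceed expected.
    lost = max(expected - received, 0)
    loss_fraction = lost / expected if expected else 0.0
    return received / seconds, loss_fraction
```

A monitor would normally also look at the number of packets expected in
the interval before attaching much weight to the loss fraction.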

A receiver (end system or mixer) should send sender/receiver report packets
including a reception report for each source from which it has received RTP
packets since the last report, or for as many such sources as will fit.
A mixer should not send reception reports on one side for sources it has
received on the other side.

A profile may define application-specific extensions to the sender report
if there is additional information that should be reported regularly about
the sender or receivers.  If information about receivers is to be included,
that data may be structured as an array of blocks parallel to the array of
receiver reports in the second section of the sender report.
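
As an illustration of the layout described in this section, the following
Python sketch decodes a sender report.  The dictionary keys and function
name are hypothetical; the packet type value RTCP_SR=0 and the field order
are taken from the diagram above.  Padding and application-specific
extensions are not handled.

```python
import struct

RTCP_SR = 0  # packet type value shown in the SR diagram
# SSRC, NTP msw, NTP lsw, RTP timestamp, packet count, octet count
SENDER_INFO = struct.Struct("!IIIIII")
# SSRC_n, received, expected, jitter, LSR, DLSR
REPORT_BLOCK = struct.Struct("!IIIIII")


def parse_sr(packet):
    """Decode a sender report into a dictionary."""
    if len(packet) < 4 + SENDER_INFO.size:
        raise ValueError("packet too short for a sender report")
    first, ptype, length = struct.unpack_from("!BBH", packet, 0)
    if first >> 6 != 2 or ptype != RTCP_SR:
        raise ValueError("not a sender report")
    rc = first & 0x1F  # reception report count
    end = 4 + SENDER_INFO.size + rc * REPORT_BLOCK.size
    if end > 4 * (length + 1) or end > len(packet):
        raise ValueError("report blocks exceed packet length")
    ssrc, msw, lsw, rtp_ts, packets, octets = SENDER_INFO.unpack_from(packet, 4)
    reports = [REPORT_BLOCK.unpack_from(packet, 28 + i * REPORT_BLOCK.size)
               for i in range(rc)]
    return {"ssrc": ssrc, "ntp": (msw, lsw), "rtp_timestamp": rtp_ts,
            "packet_count": packets, "octet_count": octets,
            "reports": reports}
```

Each entry of "reports" is a tuple in the order SSRC_n, packets received,
packets expected, jitter, LSR, DLSR.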



6.4 RR: Receiver report


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T=2|P|   RC    |  PT=RTCP_RR=1 |           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     SSRC of packet sender                     |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                  SSRC_1 (SSRC of first source)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             cumulative number of packets received             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             cumulative number of packets expected             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    interarrival jitter                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          last SR (LSR)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Schulzrinne/Casner/Frederick/Jacobson        Expires 3/1/95        [Page 20]


INTERNET-DRAFT          draft-ietf-avt-rtp-06.txt         November 28, 1994

|                  delay since last SR (DLSR)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  SSRC_2 (SSRC of second source)               |
                               ...
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                  application-specific extensions              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The RR packet is issued in place of an SR packet only if the application has
not recently sent any RTP data packets.  (Unless specified by a profile, the
timeout delay between sending the last RTP packet and ceasing to send SR
packets should be a small multiple of the current reporting interval.)  The
packet fields have the same meaning as for the SR packet.  Additional RR
packets may follow the initial SR or RR packet if there are more than 31
sources to be reported.



6.5 SDES: Source description


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T=2|P|    CC   | PT=RTCP_SDES=2|           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          SSRC/CSRC_1                          | chunk
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          SDES items                           |
|                              ...                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                          SSRC/CSRC_2                          | chunk
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          SDES items                           |
|                              ...                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The SDES packet is composed of a header and zero or more chunks containing
items describing the sources identified in those chunks.   The items are
described individually below.


type (T), padding (P), packet type (SDES), length:
    As described for the SR packet.

CC: 5 bits
    This field contains the number of SSRC/CSRC chunks included in this
    SDES packet.


Each chunk consists of an SSRC/CSRC identifier followed by a list of zero or
more items, which carry information about the SSRC/CSRC. Each chunk starts
on a 32-bit boundary.  Each item consists of an 8-bit type field, an 8-bit
octet count describing the length of the text (thus, not including this
two-octet header) and text.   The text is encoded according to the UTF-2
encoding specified in Annex F of ISO standard 10646 [8,9].  This encoding
is also known as UTF-8 or UTF-FSS. It is described in ``File System Safe UCS
Transformation Format (FSS_UTF)'', X/Open Preliminary Specification, Document
Number:  P316 and Unicode Technical Report #4.  US-ASCII is a subset of this
encoding and requires no additional encoding.  The presence of multi-octet
encodings is indicated by setting the most significant bit to a value of
one.

Items are contiguous, i.e., items are not individually padded to a 32-bit
boundary.  Text is not zero terminated.  The list of items in each chunk is
terminated by one or more binary zeroes to denote the end of the list and
pad until the next 32-bit boundary.  An SDES packet with zero chunks or a
chunk with zero items is valid but useless.
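
The chunk encoding rules above can be sketched in Python as follows.  The
function names are hypothetical, and the parser assumes that the chunk
starts at an offset that is a multiple of four within the buffer.  It
raises IndexError on truncated input rather than checking bounds
explicitly.

```python
import struct


def build_chunk(ssrc, items):
    """Encode one SDES chunk; items is a list of (type, text) pairs."""
    out = bytearray(struct.pack("!I", ssrc))
    for item_type, text in items:
        data = text.encode("utf-8")
        if not 1 <= item_type <= 255 or len(data) > 255:
            raise ValueError("item type or length out of range")
        out += bytes([item_type, len(data)]) + data  # no per-item padding
    # One to four zero octets: terminate the list and pad to 32 bits.
    out += bytes(4 - len(out) % 4)
    return bytes(out)


def parse_chunk(buf, offset=0):
    """Decode one chunk; return (ssrc, items, offset_of_next_chunk)."""
    (ssrc,) = struct.unpack_from("!I", buf, offset)
    offset += 4
    items = []
    while buf[offset] != 0:
        item_type, length = buf[offset], buf[offset + 1]
        text = buf[offset + 2:offset + 2 + length].decode("utf-8")
        items.append((item_type, text))
        offset += 2 + length
    offset += 4 - offset % 4  # skip the terminator and padding
    return ssrc, items, offset
```

Note that a chunk whose items happen to end on a 32-bit boundary still
needs a terminator, so four zero octets are appended in that case.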

End systems send one SDES packet containing their own source identifier (the
same as the SSRC in the fixed RTP header).  A mixer sends one SDES packet
containing a chunk for each contributing source from which it is receiving
SDES information, or more than one SDES packet if there are more than 31
such sources.

The following SDES items are currently defined.   Additional items may be
defined in a profile; some items shown here may be useful for particular
profiles only.  Not all items need to be sent with every SDES packet, except
for the CNAME item, which is mandatory.(3)


6.5.1 CNAME: Canonical end-point identifier


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    CNAME=1    |    length     | user and domain name         ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The CNAME identifier has the following properties:


  o Because the randomly allocated SSRC identifier may change if a conflict
    is discovered or if a program is restarted, the CNAME item is required
    to provide the binding to an identifier for the source that remains
    constant.

  o Like the SSRC identifier, the CNAME identifier should also be unique
    within one medium of a session.

  o To provide a binding among multiple media tools used in a session by
    one participant, the CNAME should be fixed for that participant.

  o To facilitate third-party monitoring, the CNAME should be suitable for
    either a program or a person to locate the source.

------------------------------
 3. Items are defined here rather than in the profile to simplify
profile-independent applications, using common type numbers.


Therefore, the CNAME should be derived algorithmically and not entered
manually, when possible.  To meet these requirements, the following format
should be used unless a profile specifies an alternate syntax or semantics.
The CNAME item should have the format "user@host" or "host", where "host" is
the fully qualified domain name of the host from which the real-time data
originates, formatted according to the rules specified in RFC 1034, RFC 1035
and Section 2.1 of RFC 1123.  The "host" form may be used if a user name
is not available, for example on single-user systems.  Only if a system
cannot obtain a valid domain name may it use the printable representation
of its lowest-numbered numeric network address.  Hosts using IP version
4 use the 'dotted decimal' (also known as 'dotted quad') representation.
Application writers should be aware that address assignments such as the
Net-10 assignment proposed in RFC 1597 may create IP network addresses that
are not globally unique.  This may create difficulties if sites that do not
have direct IP connectivity to the public Internet forward RTP packets to
the public Internet through an RTP-level firewall.  (See also RFC 1627.)  To
handle this case, applications should provide a means to configure a unique
name.

Examples are:


 "doe@sleepy.megacorp.com" or "sleepy.megacorp.com" or "doe@192.35.149.160"
                            or "192.35.149.160"


The user name should be in a form that a program such as "finger" or "talk"
could use, i.e., it typically is the login name rather than the real-life
name.  The host name is not necessarily identical to the one in the
electronic mail address of the participant.
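
Deriving the CNAME algorithmically might look like the following Python
sketch.  The function names are hypothetical, and the use of the standard
library calls getpass.getuser and socket.getfqdn is only one possible way
to obtain the login name and the fully qualified domain name.

```python
import getpass
import socket


def make_cname(host, user=None):
    """Return "user@host", or just "host" when no user name is available."""
    if not host:
        raise ValueError("a host name or address is required")
    return f"{user}@{host}" if user else host


def default_cname():
    """Derive a CNAME from the local environment (best effort)."""
    try:
        user = getpass.getuser()  # login name, not the real-life name
    except Exception:
        user = None               # no user name available
    # getfqdn() may return an unqualified name on a misconfigured host;
    # an application should then let the user configure a unique name.
    return make_cname(socket.getfqdn(), user)
```

For example, make_cname("sleepy.megacorp.com", "doe") gives the first of
the example strings shown above.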

This syntax will not provide unique identifiers for each source if an
application permits a user to generate multiple sources from one host.  Such
an application would have to rely on the SSRC to further identify the
source, or the profile for that application would have to specify additional
syntax for the CNAME identifier.



6.5.2 NAME: User name


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     NAME=2    |    length     | common name of source        ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The real name used to describe the source, e.g., "John Doe, Bit Recycler,
Megacorp".   This name may be in any form desired by the user.   For
applications such as conferencing, this form of name may be the most
desirable for display in participant lists, and therefore might be sent
most frequently (profiles may establish such priorities).  The NAME value
is expected to remain constant at least for the duration of a session.  It
should not be relied upon to be unique across the session.



6.5.3 EMAIL: User's electronic mail address


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    EMAIL=3    |    length     | email address of source      ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The email address is formatted according to RFC 822, for example,
"John.Doe@megacorp.com".  The EMAIL value is expected to remain constant for
the duration of a session.



6.5.4 PHONE: User's phone number


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    PHONE=4    |    length     | phone number of source       ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The phone number should be formatted with the plus sign replacing the
international access code, for example, "+1 908 555 1212" for a number in
the United States.



6.5.5 LOC: Geographic user location


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     LOC=5     |    length     | geographic location of site  ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Depending on the application, different degrees of detail are appropriate
for this item.  For conference applications, a string like "Murray Hill, New
Jersey" may be sufficient, while, for an active badge system, strings like
"Room 2A244, AT&T BL MH" might be appropriate.  The degree of detail is left
to the implementation and/or user, but format and content may be prescribed
by a profile.  The LOC value is expected to remain constant for the duration
of a session, except for mobile hosts.



6.5.6 TXT: Text describing the source


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     TXT=7     |    length     | text describing source       ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Message describing the current state of the source, e.g., "can't talk,
having lunch".  During a seminar, this field might be used to convey the
title of the talk.  The TXT value is likely to change during a session.




6.5.7 TOOL: Name of application or tool


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     TOOL=6    |    length     | name/version of source appl. ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

String giving the name and possibly version of the application generating
the stream, e.g., "videotool 1.2".   This information may be useful for
debugging purposes and is similar to the Mailer or Mail-System-Version SMTP
headers.



6.5.8 PRIV: Private extensions


  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |     PRIV=8    |    length     | length of type| type string  ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...              |                 value string                 ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

This type is used to define experimental or application-specific SDES
extensions.  The item contains a prefix consisting of a length-string pair,
followed by the value string filling the remainder of the item.  The prefix
length field is one octet long.   The prefix string is a name chosen by
the person defining the PRIV item to be unique with respect to other PRIV
items this application might receive.  The application creator might choose
to use the application name plus an additional subtype identification if
needed.  Alternatively, it is recommended that others choose a name based on
the entity they represent, then coordinate the use of the name within that
entity.  Note that the prefix consumes some space within the item's total
length of 255 octets, so the prefix should be kept as short as possible.

The second string is the value, that is, the information carried by this
item.   SDES PRIV types will not be registered by IANA. If a type proves
to be of general utility, it should be assigned a regular SDES type
and registered with IANA instead for ease of handling and transmission
efficiency.
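
As an illustration of the PRIV layout described above, the following
sketch writes one PRIV item into a buffer.  The function name
sdes_priv_write and its arguments are hypothetical and not part of this
specification.  The sketch assumes that the item length octet counts the
prefix length octet, the prefix string and the value string.

```c
#include <stddef.h>
#include <string.h>

#define SDES_PRIV 8   /* PRIV item type, see Section 10.2 */

/*
 * Hypothetical helper: writes one PRIV item (type, length, prefix
 * length, prefix string, value string) into 'buf'.
 * Returns the number of octets written, or 0 if the item does not fit
 * into 'cap' octets or its contents exceed 255 octets.
 */
size_t sdes_priv_write(unsigned char *buf, size_t cap,
                       const char *prefix, const char *value)
{
  size_t plen = strlen(prefix);
  size_t vlen = strlen(value);
  size_t body = 1 + plen + vlen;  /* prefix length octet + both strings */

  if (body > 255 || 2 + body > cap) return 0;

  buf[0] = SDES_PRIV;
  buf[1] = (unsigned char)body;   /* item length */
  buf[2] = (unsigned char)plen;   /* prefix length */
  memcpy(buf + 3, prefix, plen);
  memcpy(buf + 3 + plen, value, vlen);  /* value fills the remainder */
  return 2 + body;
}
```

A short prefix leaves more of the 255 octets for the value, which is why
the text recommends keeping it as short as possible.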




6.6 BYE: Goodbye


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T=2|P|     CC  | PT=RTCP_BYE=3 |           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           SSRC/CSRC                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                               ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       reason for leaving                     ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


The BYE packet indicates that one or more sources are no longer active.


type (T), padding (P), packet type (BYE), length:
    As described for the SR packet.

CC: 5 bits
    This field contains the number of SSRC/CSRC identifiers included in
    this BYE packet.  A count value of zero is valid, but meaningless.


If a BYE packet is received by a mixer, the mixer forwards the BYE packet
with the SSRC/CSRC identifier(s) unchanged.  If a mixer shuts down, it
should send a BYE packet listing all contributing sources it handles, as
well as its own SSRC identifier.  Optionally, the BYE packet may include an
octet count followed by that many characters of text giving the reason for
leaving, e.g., "camera malfunction".  The string has the same encoding as
that described for SDES.  If the string fills the RTCP packet to the next
32-bit boundary, the string is not zero terminated.  If not, the RTCP BYE
packet is padded with zeroes.
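
The layout and padding rule above can be sketched as follows.  The
function name rtcp_bye_write is hypothetical.  The sketch assumes that
the length field counts 32-bit words without the first word, as noted for
rtcp_common_t in Appendix A, and it writes all fields octet by octet in
network order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: builds a BYE packet listing 'count' sources,
 * with an optional reason string ('reason' may be NULL).
 * Returns the packet size in octets, or 0 on invalid arguments.
 */
size_t rtcp_bye_write(unsigned char *buf, size_t cap,
                      const uint32_t *src, unsigned count,
                      const char *reason)
{
  size_t rlen = reason ? strlen(reason) : 0;
  size_t len = 4 + 4 * (size_t)count;   /* common header + source list */
  size_t words, i;
  unsigned char *p;

  if (count > 31 || rlen > 255) return 0;  /* 5-bit CC, 1-octet count */
  if (reason) len += 1 + rlen;             /* octet count + text */
  len = (len + 3) & ~(size_t)3;            /* round up to 32 bits */
  if (len > cap) return 0;

  memset(buf, 0, len);                     /* also supplies zero padding */
  words = len / 4 - 1;                     /* length excludes first word */
  buf[0] = (unsigned char)((2u << 6) | count);  /* T=2, P=0, CC */
  buf[1] = 3;                              /* PT=RTCP_BYE */
  buf[2] = (unsigned char)(words >> 8);
  buf[3] = (unsigned char)(words & 0xff);

  p = buf + 4;
  for (i = 0; i < count; i++) {            /* identifiers, big-endian */
    *p++ = (unsigned char)(src[i] >> 24);
    *p++ = (unsigned char)(src[i] >> 16);
    *p++ = (unsigned char)(src[i] >> 8);
    *p++ = (unsigned char)src[i];
  }
  if (reason) {
    *p++ = (unsigned char)rlen;
    memcpy(p, reason, rlen);
  }
  return len;
}
```

With one source and the reason "camera malfunction" (18 characters), the
packet occupies 27 octets before padding and is padded with one zero
octet to 28.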



6.7 APP: Application-defined


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T=2|P| subtype | PT=RTCP_APP=4 |           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           SSRC/CSRC                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          name (ASCII)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   application-dependent data                 ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The APP packet is intended for experimental use as new applications and new
features are developed, without requiring packet type value registration.
APP packets with unrecognized names should be ignored.   After testing
and if wider use is justified, it is recommended that each APP packet
be redefined without the subtype and name fields and registered with the
Internet Assigned Numbers Authority using an RTCP packet type.


type (T), padding (P), packet type (APP), length:
    As defined for the SR packet.

subtype: 5 bits
    May be used as a subtype to allow a set of APP packets to be defined
    under one unique name, or for any application-dependent data.

name: 4 octets
    A name chosen by the person defining the set of APP packets to
    be unique with respect to other APP packets this application might
    receive.  The application creator might choose to use the application
    name, and then coordinate the allocation of subtype values to
    others who want to define new packet types for the application.
    Alternatively, it is recommended that others choose a name based
    on the entity they represent, then coordinate the use of the name
    within that entity.   The name is interpreted as a sequence of four
    ASCII characters, with uppercase and lowercase characters treated as
    distinct.

application-dependent data: variable length
    Application-dependent data may or may not appear in an APP packet.  It
    is interpreted by the application and not RTP itself.
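
A receiver might recognize its own APP packets along the following lines.
This is a sketch only; the function name rtcp_app_match is hypothetical,
and the offsets are read from the packet diagram above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns the subtype (0..31) if 'pkt' is an APP
 * packet whose 4-octet name equals 'name', or -1 otherwise.  The
 * comparison is case-sensitive, since uppercase and lowercase
 * characters are treated as distinct.
 */
int rtcp_app_match(const unsigned char *pkt, size_t len, const char *name)
{
  if (len < 12) return -1;             /* header + SSRC/CSRC + name */
  if ((pkt[0] >> 6) != 2) return -1;   /* T must be 2 */
  if (pkt[1] != 4) return -1;          /* PT=RTCP_APP */
  if (strlen(name) != 4) return -1;
  if (memcmp(pkt + 8, name, 4) != 0) return -1;
  return pkt[0] & 0x1f;                /* 5-bit subtype */
}
```

A return value of -1 corresponds to the rule that APP packets with
unrecognized names should be ignored.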




7 RTP Translators and Mixers


7.1 General Description


Besides end-systems, RTP also supports the notion of "translators" and
"mixers", which could be considered as "intermediate systems" at the
RTP level.  A translator connects two or more transport-level "clouds".
Typically, each cloud is defined by a common transport-level port, a
multicast address and a transport protocol (e.g., UDP).  (Exceptions are
network-level protocol translators, which we ignore here.)  The use of
translators and mixers must not result in loops.  In the explanations below,
we use the short-hand terms left and right cloud for conciseness to refer to
two clouds connected by a translator/mixer.  The concepts apply naturally in
the other direction or if a translator/mixer joins more than two clouds.

All RTP end systems that can communicate through one or more RTP
translators/mixers share the same SSRC space, that is, the SSRC identifiers
must be unique among all these end systems.

Translators may change the encoding of the data (and thus the RTP data
payload type and timestamp) and may combine several data packets.  If they
combine several data packets, they have to change the sequence number in
each.

We distinguish three basic kinds of translators and mixers:


invisible translator: Invisible translators cannot be detected by end
    systems unless the end systems know what payload type or transport
    address was used by the sender.  They do not have their own SSRC
    identifier.

    [other terms:  anonymous translator; translator without a personality
    :-); shy translator :-)]

visible translator: A visible translator has its own SSRC identifier.  It
    forwards packets with their original SSRC identifier.

    [other terms:  self-identifying translator?]

mixer: A mixer has its own SSRC identifier and forwards all incoming
    data packets, combined (mixed) into a single stream, with its own
    SSRC identifier.   A mixer may indicate the sources that contributed
    to a particular packet by adding CSRC identifiers to the RTP data
    packet.  However, this is not required and may be ill-advised
    for some applications using low-bandwidth links.  A mixer that is
    also a contributing source for some packet must explicitly include an
    identifier for itself in the CSRC list for that packet.


Fig. 1 shows a combination of mixers and translators and their effect
on CSRC and SSRC identifiers.   In the figure, end systems are shown as
rectangles (named E), translators as triangles (named T) and mixers as ovals
(named M).  The notation "M1: 48(1,17)" designates a packet originating
from a mixer M1, identified with a random SSRC value of 48 and two CSRC
identifiers, 1 and 17.







7.2 Behavior of Mixers/Translators

              invisible  visible  mixing
own SSRC      no         yes      yes
insert CSRC   no         no       may
send own RR   no         yes      yes
send own SR   no         no       yes
send own BYE  no         yes      yes


The processing of RTCP by translators and mixers is governed by these rules:


SDES: If and only if a translator/mixer has its own SSRC, it must
    send SDES CNAME information about itself.  Invisible and visible
    translators typically forward SDES information unchanged from one cloud
    to the other, but may, for example, decide to filter non-CNAME SDES
    information if bandwidth is scarce.  Mixers must forward SDES CNAME
    information for the CSRCs they include in RTP data packets.

RR: For sources in one cloud, the mixer generates its own reception reports
    and sends them to the same cloud.  It does not send these reception
    reports to the other cloud.  Invisible and visible translators forward
    reception reports with their SSRC identifier unchanged between the left
    and right cloud.

SR: An invisible translator does not generate its own sender reports, but
    rather forwards those received in one cloud to the other, suitably
    modified.  In particular, the RTP timestamp, the sender's packet and
    octet count may have to be modified if the encoding is changed.

BYE: Translators forward BYE packets unchanged.    Mixers only need to
    forward BYE packets if they use CSRC identifiers.  Mixers and visible
    translators should generate BYE packets with their own SSRC identifiers
    if they are about to cease forwarding packets.


7.3 Cascaded Mixers


8 Security


8.1 Security Considerations


RTP suffers from the same security liabilities as the underlying protocols.
For example, an impostor can fake source or destination network addresses,
or change the header or payload.  Similarly, the CNAME and NAME
information may be used to impersonate another participant.  In addition,
RTP may be sent via IP multicast, which provides no direct means for a
sender to know all the receivers of the data sent and therefore no measure
of privacy.    Rightly or not, users may be more sensitive to privacy
concerns with audio and video communication than they have been with more
traditional forms of network communication [10].   Therefore, the use of
security mechanisms with RTP is important.

As a first step, RTCP makes it easy for all participants in a session to
identify themselves; if deemed important for a particular application, it
is the responsibility of the application writer to make listening without
identification difficult.  It should be noted, however, that privacy of the
payload can generally be assured only by encryption.

The security measures described below can be used to implement
confidentiality.  Authentication and message integrity are not defined in
the current specification of RTP.  Security services might also be provided
at the IP layer as security mechanisms are developed for that layer.

The periodic transmission of RTCP or empty RTP packets from sources that are
otherwise idle may make it possible to detect denial-of-service attacks,
as the receiver can detect the absence of these expected messages.   The
messages that are received must be verified for integrity and authenticated
before being accepted for this purpose.

Key distribution and certificates are outside the scope of this document.

The section below defines a confidentiality security service and defines
standard algorithms for both RTP and RTCP.  Other services, other
implementations of services and other algorithms may be defined in the
future.  The selection presented here is meant to simplify implementation
of interoperable, secure applications and provide guidance to implementors.
No claim is made that the methods presented here are appropriate for a
particular security need.

A profile specifies which of the services and algorithms should be offered
by applications, and may provide guidance as to their appropriate use.


8.2 Confidentiality


Confidentiality means that only the intended receiver(s) can decode the
received packets; for others, the packet contains no useful information.
Confidentiality of the content is achieved by encryption.

All RTP and RTCP packets in a single lower-layer protocol data unit are
encrypted as a unit.  For RTCP, some such lower-layer packets may be sent
encrypted and others in the clear.  (This accommodates monitors that are
not privy to the encryption key.)  For RTP, no additional data structures
are required.  For RTCP, a 32-bit random number is prepended to the
unit before encryption to deter known plaintext attacks.  The presence of
encryption and the use of the correct key are confirmed by the receiver
through header or payload consistency checks.    An example of such a
consistency check is given in Section A.1.
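
The order of operations for RTCP can be sketched as follows.  The
function name rtcp_add_prefix is hypothetical; the caller is assumed to
supply a freshly drawn random value in 'prefix' and to pass the result to
the cipher afterwards.  Neither the random source nor the cipher is shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copies 'len' octets of RTCP data into 'out',
 * preceded by the 32-bit value 'prefix'.  The result is the unit that
 * is subsequently encrypted.
 * Returns the total size, or 0 if 'out' (capacity 'cap') is too small.
 */
size_t rtcp_add_prefix(unsigned char *out, size_t cap, uint32_t prefix,
                       const unsigned char *rtcp, size_t len)
{
  if (cap < 4 || len > cap - 4) return 0;
  out[0] = (unsigned char)(prefix >> 24);
  out[1] = (unsigned char)(prefix >> 16);
  out[2] = (unsigned char)(prefix >> 8);
  out[3] = (unsigned char)prefix;
  memcpy(out + 4, rtcp, len);   /* RTCP packets follow the prefix */
  return len + 4;
}
```

A receiver would discard the first four octets after decryption and then
apply its consistency checks to the remaining RTCP packets.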

The default encryption algorithm is the Data Encryption Standard (DES)
algorithm in CBC (cipher block chaining) mode, as described in Section 1.1
of RFC 1423 [11], except that padding to a multiple of 8 octets is indicated
as described for the P bit in Section 5.1.  The initialization vector is
zero because random values are supplied in the RTP header or by the random
prefix for RTCP packets.   For details on the use of CBC initialization
vectors, see [12].  Implementations that support encryption should always
support the DES algorithm in CBC mode.

As an alternative to encryption at the RTP level as described above,
profiles may define additional payload types for encrypted encodings.  Those
encodings must specify how padding and other aspects of the encryption
should be handled.   This method allows encrypting only the data while
leaving the headers in the clear for applications where that is desired.
It may be particularly useful for hardware devices that will handle both
decryption and decoding.


9 RTP over Network and Transport Protocols


This section describes issues specific to carrying RTP packets within
particular network and transport protocols.  The following rules apply
unless superseded by protocol-specific definitions outside this
specification.

RTP relies on the underlying protocol(s) to provide demultiplexing.  For UDP
and similar protocols, RTP uses an even port number and the corresponding
RTCP stream uses the next higher port number.
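
The port pairing rule can be expressed as a small helper.  The function
name rtcp_port_for is hypothetical, and the port numbers in the usage
note are arbitrary examples.

```c
/*
 * Hypothetical helper: returns the RTCP port paired with 'rtp_port',
 * or -1 if 'rtp_port' is not an even port number in the range
 * 2..65534.
 */
int rtcp_port_for(int rtp_port)
{
  if (rtp_port < 2 || rtp_port > 65534 || (rtp_port & 1)) return -1;
  return rtp_port + 1;   /* next higher port carries RTCP */
}
```

For instance, an RTP stream on port 5004 would be accompanied by RTCP on
port 5005, while an odd port such as 5005 is rejected for RTP.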

RTP packets contain no length field or other delineation; therefore, RTP
relies on the underlying protocol(s) to provide a length indication.  The
maximum length of RTP packets is limited only by the underlying transport
mechanism.

If RTP packets are to be carried in an underlying protocol that provides
the abstraction of a continuous octet stream rather than messages (packets),
an encapsulation of the RTP packets must be defined to provide a framing
mechanism.  TCP is an example of such a protocol.  Framing is also needed if
the underlying protocol may contain padding so that the extent of the RTP
payload cannot be determined.  The framing mechanism is not defined here.

A profile may specify a framing method to be used even when RTP is
carried in protocols that do provide framing in order to allow carrying
several RTP packets in one lower-layer protocol data unit, such as a UDP
packet.   Carrying several RTP packets in one network or transport packet
reduces header overhead and may simplify synchronization between different
streams.


10 Summary of Protocol Constants


In this section, the symbolic constants used in the text are assigned
numeric values.  The following constants are defined in profiles rather than
this document:  RTP payload type (PT).


10.1 RTCP packet types


abbrev. name                value
SR      sender report           0
RR      receiver report         1
SDES    source description      2
BYE     Goodbye                 3
APP     application-defined     4


Other constants are assigned by IANA.


10.2 SDES types


abbrev. name                           value
END     end of SDES list                   0
CNAME   canonical name                     1
NAME    user name                          2
EMAIL   user's electronic mail address     3
PHONE   user's phone number                4
LOC     geographic user location           5
TOOL    name of application or tool        6
TXT     text describing the source         7
PRIV    private extensions                 8


Other constants are assigned by IANA. Constants not assigned by IANA are
available for experimental use.







11 RTP Profiles and Payload Format Specifications


RTP may be used for a variety of applications with somewhat differing
requirements.  The flexibility to adapt to those requirements is provided
by allowing multiple choices in the main protocol specification, then
using a separate profile document to select the appropriate choices for a
particular class of applications and environment.  Typically an application
will operate under only one profile and there is no explicit indication of
which profile is in use.  A profile for audio and video applications may be
found in the companion Internet draft draft-ietf-avt-profile.

Within this specification, the following possible uses of a profile have
been identified, but this list is not meant to be exclusive:


  o Define a set of payload formats (e.g., media encodings) and a default
    mapping of those formats to payload type values.   Where known, the
    nominal data rate of these encodings should be provided as the RTCP
    packet rate depends on this parameter.

  o Define the number and interpretation of the RTP marker bits,  if
    different from the default specified in Section 5.1.

  o Define an extension to the fixed RTP data header if some additional
    functionality is required across the class of applications independent
    of payload type, and define the first 16 bits of the RTP data header
    extension if implementation-specific extensions are to be allowed (see
    Section 5.3).

  o Define new application-class-specific RTCP packets, or the data format,
    preferred use,  or required use of particular RTCP packets.    In
    particular, SR and RR packets may be extended if there is additional
    information that should be reported regularly about the sender or
    receivers.

  o Specify  that  a  particular  underlying  network  or  transport  layer
    protocol will be used to carry RTP packets.

  o Specify the mapping of RTP and RTCP to transport-level names, e.g., UDP
    ports, if different from the mapping defined in Section 9.

  o Specify encapsulation of RTP packets that are to be used always or with
    particular underlying protocols.


It  is  not  expected  that  a  new  profile  will  be  required  for  every
application.  Within one application class, it would be better to extend an
existing profile rather than make a new one.  For example, additional RTCP
packet types or payload type values may be defined and registered through
the Internet Assigned Numbers Authority for publication in the Assigned
Numbers RFC as an alternative to publishing a new profile specification.

A payload format document specifies how a particular kind of payload data,
such as H.261 encoded video, should be carried in RTP. Payload formats may
be useful under multiple profiles and may therefore be defined independently
of any particular profile.   The profile document is then responsible for
assigning a default mapping of that format to a payload type value if
needed.


A Implementation Notes


We describe aspects of the receiver implementation in this section.  There
may be other implementation methods that are faster in particular operating
environments or have other advantages.  These implementation notes are for
informational purposes only.

The following definitions are used for all examples; the structure
definitions are valid for 32-bit big-endian (most significant octet first)
architectures only.  Bit fields are assumed to be packed tightly in
big-endian bit order, with no additional padding.


#include <sys/types.h>

/*
 * The type definitions below are valid for 32-bit architectures and
 * may have to be adjusted for 16- or 64-bit architectures.
 */
typedef unsigned char  u_int8;
typedef unsigned short u_int16;
typedef unsigned int   u_int32;


/*
* rtp.h  --  RTP header file
*/

#include <types.h>

#define RTP_SEQ_MOD (1<<16)
#define RTP_TS_MOD  (0xffffffff)

#define RTP_MAX_SDES 256   /* maximum text length for SDES */

typedef enum {
  RTCP_SR,
  RTCP_RR,
  RTCP_SDES,
  RTCP_BYE,
  RTCP_APP
} rtcp_type_t;

typedef enum {
  RTCP_SDES_END,
  RTCP_SDES_CNAME,
  RTCP_SDES_NAME,
  RTCP_SDES_EMAIL,
  RTCP_SDES_PHONE,
  RTCP_SDES_LOC,
  RTCP_SDES_TOOL,
  RTCP_SDES_TXT,
  RTCP_SDES_PRIV
} rtcp_sdes_type_t;

typedef struct {
  unsigned int type:2;     /* packet type */
  unsigned int p:1;        /* padding flag */
  unsigned int x:1;        /* header extension flag */
  unsigned int cc:4;       /* CSRC count */
  unsigned int m:1;        /* marker bit */
  unsigned int pt:7;       /* payload type */
  u_int16 seq;             /* sequence number */
  u_int32 ts;              /* timestamp */
  u_int32 ssrc;            /* synchronization source */
  u_int32 csrc[1];         /* optional CSRC list */
} rtp_hdr_t;

typedef struct {
  unsigned int type:2;     /* packet type */
  unsigned int p:1;        /* padding flag */
  unsigned int count:5;    /* varies by packet type */
  unsigned int pt:8;       /* RTCP packet type */
  u_int16 length;          /* packet length in words, without this word */
} rtcp_common_t;

/* reception report */
typedef struct {
  u_int32 ssrc;            /* data source being reported */
  u_int32 received;        /* cumulative number of packets received */
  u_int32 expected;        /* cumulative number of packets expected */
  u_int32 jitter;          /* interarrival jitter */
  u_int32 lsr;             /* last SR packet from this source */
  u_int32 dlsr;            /* delay since last SR packet */
} rtcp_rr_t;

typedef struct {
  u_int8 type;             /* type of SDES item (rtcp_sdes_type_t) */
  u_int8 length;           /* length of SDES item (in octets) */
  char data[1];            /* text, not zero-terminated */
} rtcp_sdes_item_t;

/* one RTCP packet */
typedef struct {
  rtcp_common_t common;    /* common header */
  union {
    /* sender report (SR) */
    struct {
      u_int32 ssrc;        /* source this RTCP packet refers to */
      u_int32 ntp_sec;     /* NTP timestamp */
      u_int32 ntp_frac;
      u_int32 rtp_ts;      /* RTP timestamp */
      u_int32 psent;       /* packets sent */
      u_int32 osent;       /* octets sent */
      /* variable-length list */
      rtcp_rr_t rr[1];
    } sr;

    /* reception report (RR) */
    struct {
      u_int32 ssrc;        /* source generating this report */
      /* variable-length list */
      rtcp_rr_t rr[1];
    } rr;

    /* BYE */
    struct {
      u_int32 src[1];      /* list of sources */
      /* can't express trailing text */
    } bye;

    /* source description (SDES) */
    struct rtcp_sdes_t {
      u_int32 src;              /* first SSRC/CSRC */
      rtcp_sdes_item_t item[1]; /* list of SDES items */
    } sdes;
  } r;
} rtcp_t;


A.1 RTP Header Consistency Check


The following checks may be used to determine whether an RTP header is
likely to be valid, given a previously received RTP packet:


  o RTP type field value equal to 2

  o payload type defined

  o RTP sequence number one higher than previous packet

  o if packets contain fixed number of timestamp counts, comparison of
    timestamp increment with sequence number increment

  o length of RTP packet consistent with CC and payload type


Depending on the application, algorithms may exploit additional knowledge,
e.g., the expected increment in timestamps between packets.  Note that this
algorithm is likely to occasionally create false alarms.
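
A subset of these checks might be coded as shown below.  This is a sketch
under stated assumptions: the function name rtp_header_plausible and the
table 'pt_defined' (128 flags, nonzero for payload types known to the
receiver) are hypothetical, the field offsets follow rtp_hdr_t above, and
the timestamp comparison is omitted because it depends on the encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define RTP_FIXED_HDR 12   /* octets before the CSRC list in rtp_hdr_t */

/*
 * Returns 1 if packet 'cur' of 'len' octets looks like a plausible
 * successor to the previously received packet 'prev', 0 otherwise.
 * Both pointers must reference at least 4 readable octets.
 */
int rtp_header_plausible(const unsigned char *prev,
                         const unsigned char *cur, size_t len,
                         const unsigned char pt_defined[128])
{
  unsigned type, cc, pt;
  uint16_t prev_seq, seq;

  if (len < RTP_FIXED_HDR) return 0;
  type = cur[0] >> 6;          /* 2-bit type field */
  cc   = cur[0] & 0x0f;        /* CSRC count */
  pt   = cur[1] & 0x7f;        /* payload type */
  prev_seq = (uint16_t)((prev[2] << 8) | prev[3]);
  seq      = (uint16_t)((cur[2] << 8) | cur[3]);

  if (type != 2) return 0;                       /* type equal to 2 */
  if (!pt_defined[pt]) return 0;                 /* payload type defined */
  if (seq != (uint16_t)(prev_seq + 1)) return 0; /* one higher, mod 2^16 */
  if (len < RTP_FIXED_HDR + 4 * (size_t)cc) return 0; /* room for CSRCs */
  return 1;
}
```

Because the sequence number check requires consecutive packets, a single
lost or reordered packet makes the function return 0, which is one source
of the false alarms mentioned above.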


A.2 Parsing RTCP Packets


The following code fragment walks through one or more RTCP packets, checking
for invalid length fields.  It may also be advisable to treat the packet
type and payload type as a single field for checking and branching.


int len;           /* length of combined RTCP packets in words; signed,
                      so that the test for a negative value works */
rtcp_t *r;         /* RTCP header */

while (len > 0) {
  len -= r->common.length + 1;
  if (len < 0) {
    /* something wrong with packet format */
    break;
  }
  switch (r->common.pt) {
  case RTCP_SR:
    break;
  default:
    /* invalid type */
    break;
  }
  r = (rtcp_t *)((u_int32 *)r + r->common.length + 1);
}
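
The same walk can also be written against raw octets, which avoids
relying on the bit-field layout of the compiler.  The sketch below is one
such variant; the function name rtcp_count_packets is hypothetical, and
the header offsets are taken from rtcp_common_t above.

```c
#include <stddef.h>

/*
 * Walks a buffer of 'len' octets holding one or more RTCP packets.
 * Returns the number of packets found, or -1 if a type field is not 2
 * or a length field runs past the end of the buffer.
 */
int rtcp_count_packets(const unsigned char *buf, size_t len)
{
  int n = 0;

  while (len > 0) {
    size_t plen;

    if (len < 4) return -1;               /* truncated common header */
    if ((buf[0] >> 6) != 2) return -1;    /* T must be 2 */
    /* length field counts 32-bit words, without the first word */
    plen = 4 * ((size_t)((buf[2] << 8) | buf[3]) + 1);
    if (plen > len) return -1;            /* invalid length field */
    buf += plen;
    len -= plen;
    n++;
  }
  return n;
}
```

Checking the length before advancing keeps the pointer inside the buffer
even when the last packet is malformed.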


A.3 Generating SDES RTCP Packets


/*
* Function adds a single item 'item' to buffer 'b'.
* Returns updated buffer pointer.
*/
char *rtcp_sdes_add(char *b, rtcp_sdes_type_t type, char *item)
{
  rtcp_sdes_item_t *rsp;

  rsp = (rtcp_sdes_item_t *)b;
  rsp->type   = type;
  rsp->length = strlen(item);           /* caller ensures <= 255 octets */
  memcpy(rsp->data, item, rsp->length); /* text is not zero-terminated */
  b += rsp->length + 2;
  return b;
}

/*
* Write SDES chunk into buffer 'b' from arrays type[] and value[] with
* argc members.
* Return pointer to next available location within 'b'.
*/
char *rtp_write_sdes(char *b, u_int32 src, int argc,
  rtcp_sdes_type_t type[], char *value[])
{
  u_int32 *src_pt = (u_int32 *)b;
  int i;    /* item index */
  int pad;  /* octets for padding */

  /* SSRC header */
  *src_pt = src;
  b += 4;

  /* SDES items */
  for (i = 0; i < argc; i++) {
    b = rtcp_sdes_add(b, type[i], value[i]);
  }

  /* terminate with end marker */
  *b++ = RTCP_SDES_END;

  /* if necessary, pad with zeroes to next 4-octet boundary;
   * this assumes the chunk starts on a 4-octet boundary */
  pad = (4 - ((unsigned long)b & 0x3)) & 0x3;
  memset(b, RTCP_SDES_END, pad);
  b += pad;
  return b;
}


A.4 Parsing SDES RTCP Packets


The function below parses one SDES chunk and calls a function 'member_sdes'
that sets the corresponding information for a session member 'm' (not
defined here).  It expects 'b' to point to the first item for this chunk.


/* round a number 'n' up modulo the size of data type 't' */
#define ROUND(n,t)  (((long)(n) + sizeof (t) - 1) & ~ (sizeof (t) - 1))

char *rtp_read_sdes(member_t m, char *b)
{
  rtcp_sdes_item_t *rsp = (rtcp_sdes_item_t *)b;

  for (; rsp->type; rsp = (rtcp_sdes_item_t *)((char *)rsp +
       rsp->length + 2)) {
    if (rsp->type > RTCP_SDES_PRIV) return 0;
    member_sdes(m, rsp->type, rsp->data, rsp->length);
  }
  /* skip the END octet and any padding up to the next 4-octet boundary */
  return (char *)ROUND((char *)rsp + 1, u_int32);
}
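
When the chunk comes from an untrusted network, it may be preferable to
bound every access by the number of octets actually received.  The
following sketch looks up a single item in that way; the function name
sdes_find is hypothetical and it operates on raw octets rather than on
rtcp_sdes_item_t.

```c
#include <stddef.h>

/*
 * Searches the item list 'items' of 'len' octets for the first item of
 * the given 'type'.  On success, returns a pointer to the item text
 * (not zero-terminated) and stores its length in '*out_len'.
 * Returns NULL if the item is absent or the list is malformed.
 */
const unsigned char *sdes_find(const unsigned char *items, size_t len,
                               unsigned type, size_t *out_len)
{
  size_t off = 0;

  while (off < len && items[off] != 0) {   /* type 0 is the END marker */
    size_t ilen;

    if (off + 2 > len) return NULL;        /* no room for length octet */
    ilen = items[off + 1];
    if (off + 2 + ilen > len) return NULL; /* text runs past the buffer */
    if (items[off] == type) {
      *out_len = ilen;
      return items + off + 2;
    }
    off += 2 + ilen;                       /* type + length + text */
  }
  return NULL;
}
```

Since the text is not zero-terminated, the caller has to use the returned
length rather than string functions such as strlen().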


A.5 Generating a Random 32-bit Identifier


The following subroutine generates a random 32-bit identifier using the MD5
routines published in RFC 1321.  The system routines may not be present on
all operating systems, but they should serve as hints as to what kinds of
information may be used.  Other system calls that may be appropriate include
getdomainname() and getwd().  ``Live'' video or audio samples are also a
good source of random numbers, but care must be taken not to use a
turned-off microphone or blinded camera as a source.


/*
* Generate a random 32-bit quantity.
*/
#include <sys/types.h>  /* u_long */
#include <sys/time.h>   /* gettimeofday() */
#include <unistd.h>     /* get..() */
#include <stdio.h>      /* printf() */
#include "global.h"     /* from RFC 1321 */
#include "md5.h"        /* from RFC 1321 */

#define MD_CTX MD5_CTX
#define MDInit MD5Init
#define MDUpdate MD5Update
#define MDFinal MD5Final

static u_long md_32(char *string, int length)
{
  MD_CTX context;
  union {
    char   c[16];
    u_long x[4];
  } digest;
  u_long r;
  int i;

  MDInit (&context);
  MDUpdate (&context, string, length);
  MDFinal ((unsigned char *)&digest, &context);
  r = 0;
  for (i = 0; i < 3; i++) {
    r ^= digest.x[i];
  }
  return r;
} /* md_32 */


/*
* Return random unsigned 32-bit quantity.
*/
u_long random32(void)
{
  struct {
    struct timeval tv;
    pid_t pid;
    u_long hostid;
    uid_t  uid;
    gid_t  gid;
    char   name[8];
  } s;

  gettimeofday(&s.tv, 0);
  s.pid    = getpid();
  s.hostid = gethostid();
  s.uid    = getuid();
  s.gid    = getgid();
  gethostname(s.name, sizeof(s.name));

  return md_32((char *)&s, sizeof(s));
} /* random32 */


A.6 Computing the RTCP Transmission Period


The RTCP messages emitted by all session members should not consume more
than a small fraction of the total data bandwidth used.   Thus, the time
between transmitting RTCP messages must increase with the number of session
members and the size of the previous RTCP message sent.

A suggested value for the bandwidth used by RTCP messages is 5% of the
single-sender data bandwidth, including any lower-layer protocols.  [TBD:
Having to know the total lower-layer overhead may not be a good idea, given
that we probably don't want to start adding ATM AAL overhead, PPP overhead,
etc., although they might be more significant.]

To reduce the sender load for very small sessions and to provide
statistically meaningful sender reports, the minimum RTCP message interval
is (arbitrarily) set to 5 seconds.
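
The relation between these quantities can be illustrated with a
deliberately simplified estimate.  The function name rtcp_simple_interval
is hypothetical; the sketch ignores the split of bandwidth between
senders and receivers and the smoothing of the packet size, and is not a
replacement for the function given later in this section.

```c
/*
 * Simplified estimate of the interval between RTCP messages from one
 * member, in seconds.
 *   members     - estimated number of session members
 *   packet_size - typical RTCP packet size, in octets
 *   rtcp_bw     - RTCP bandwidth for the whole session, in octets per
 *                 second (e.g., 5% of the data bandwidth)
 */
double rtcp_simple_interval(int members, double packet_size,
                            double rtcp_bw)
{
  const double min_interval = 5.0;   /* minimum interval, in seconds */
  double t = members * packet_size / rtcp_bw;

  return t < min_interval ? min_interval : t;
}
```

For example, with 400 octets per second of RTCP bandwidth and 128-octet
packets, 10 members would each send every 5 seconds (the minimum), while
100 members would each send every 32 seconds.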

The following function returns the time until the next transmission,
measured in seconds.  It should be called after sending an RTCP message.
The parameters have the following meaning:


bw: The desired RTCP bandwidth, in octets per second.

senders: Number of active senders since last report, known from
    construction of receiver reports for this report.  Includes ourselves,
    if we also sent.

members: The estimated number of session members, including ourselves.
    Incremented as we discover new session members, decremented as session
    members time out after not having been heard from.  On the first call,
    this parameter should be one.  Session members are timed out if they
    have not been heard from TBD times our current RTCP interval mean
    value.  [move this sentence elsewhere?]

packet_size: The size of the last RTCP packet, in octets.

we_sent: Flag that is true if we have sent data since the last RTCP
    message.  If the flag is true, the RTCP message just sent contained an
    SR packet.


#include <stdlib.h>   /* drand48() */

double rtcp_period(int members, int nsenders, double rtcp_bw,
                   int packet_size, int we_sent)
{
  /*
   * Minimum time between RTCP packets from this site (in seconds).
   * This time prevents the reports from `clumping' when sessions are
   * small and the law of large numbers isn't helping to smooth out
   * the traffic.  It also keeps the report interval from becoming
   * ridiculously small during transient outages like a network
   * partition.
   */
  double const RTCP_MIN_TIME = 5.;
  /*
   * Fraction of the rtcp bandwidth to be shared among active senders.
   * (This fraction was chosen so that in a typical session with one or
   * two active senders, the computed report time would be roughly
   * equal to the min report time so that we don't unnecessarily slow
   * down receiver reports.)  The receiver fraction must be 1 - the
   * sender fraction.
   */
  double const RTCP_SENDER_BW_FRACTION = 0.25;
  double const RTCP_RECEIVER_BW_FRACTION = (1 - RTCP_SENDER_BW_FRACTION);
  /*
   * Gain (smoothing constant) for the low-pass filter that estimates
   * the average rtcp packet size.
   */
  double const RTCP_SIZE_GAIN = (1./16.);

  double t; /* interval */
  double rtcp_min_time = RTCP_MIN_TIME;

  /* The avg. rtcp size is initialized to 128 bytes which is
   * conservative (it assumes everyone else is generating SRs instead
   * of RRs).
   */
  static double avg_rtcp_size = 128;

  int n;    /* number of members for computation */

  /* On the first call (members == 1), halve the minimum interval. */
  if (members == 1) {
    rtcp_min_time /= 2;
  }

  /*
   * Senders and receivers each share their own fraction of the
   * bandwidth; n is the number of members sharing our fraction.
   */
  n = members;
  if (nsenders > 0) {
    if (we_sent) {
      rtcp_bw *= RTCP_SENDER_BW_FRACTION;
      n = nsenders;
    } else {
      rtcp_bw *= RTCP_RECEIVER_BW_FRACTION;
      n -= nsenders;
    }
  }

  /* Update avg. size of message [Is this really helpful?] */
  avg_rtcp_size += (packet_size - avg_rtcp_size) * RTCP_SIZE_GAIN;

  /* compute interval */
  t = avg_rtcp_size * n / rtcp_bw;

  /* enforce minimum spacing */
  if (t < rtcp_min_time) t = rtcp_min_time;

  /*
   * To avoid traffic bursts from unintended synchronization with
   * other sites, we then pick our actual next report interval as a
   * random number uniformly distributed between 0.5*t and 1.5*t.
   */
  return t * (drand48() + 0.5);
}
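
As an illustration of how the interval scales with session size, the
following sketch computes only the deterministic part of the calculation
above for a fixed average packet size; it omits the low-pass filter and the
randomization.  The function name rtcp_base_interval and its avg_size
parameter are assumptions made for this example only.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the protocol): deterministic part
 * of the RTCP interval, in seconds, for a fixed average packet size.
 * Uses the same 5-second minimum and the same 25%/75% sender/receiver
 * bandwidth split as the function above.
 */
double rtcp_base_interval(int members, int nsenders, double rtcp_bw,
                          double avg_size, int we_sent)
{
  /* minimum interval, halved while we know only of ourselves */
  double min_time = (members == 1) ? 2.5 : 5.0;
  int n = members;
  double t;

  assert(members >= 1 && nsenders >= 0 && nsenders <= members);
  assert(rtcp_bw > 0.0);

  if (nsenders > 0) {
    if (we_sent) {
      rtcp_bw *= 0.25;     /* senders share a quarter of the bandwidth */
      n = nsenders;
    } else {
      rtcp_bw *= 0.75;     /* receivers share the remainder */
      n -= nsenders;
    }
  }

  t = avg_size * n / rtcp_bw;
  return (t < min_time) ? min_time : t;
}
```

For example, with rtcp_bw = 400 octets per second and 128-octet packets, a
receiver in a session of 1000 members with two active senders would wait
about 426 seconds between reports under this sketch, while a receiver in a
four-member session is held at the 5-second minimum.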



A.7 Estimating the Interarrival Jitter


The interarrival jitter field in receiver reports should be an estimate of
the statistical variance of the RTP data playout delay.   The following
algorithm may be suitable:


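double alpha = 0.01;   /* gain of both low-pass filters */
double d;              /* deviation of this sample from the mean */

avg    = alpha * slack + (1-alpha) * avg;
d      = slack - avg;
jitter = alpha * d * d + (1-alpha) * jitter;   /* mean squared deviation */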
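
A minimal sketch of how such an estimator might be packaged is shown below.
It assumes the squared deviation from the running mean is what gets
smoothed, so that the result is non-negative and approximates a variance;
the structure jitter_est and the function jitter_update are hypothetical
names introduced for this example.

```c
#include <assert.h>

/* Hypothetical per-source estimator state (illustrative names). */
struct jitter_est {
  double avg;      /* low-pass filtered delay (running mean) */
  double jitter;   /* low-pass filtered squared deviation from avg */
};

/*
 * Fold one delay sample (slack) into the estimate.  alpha is the
 * filter gain; the text above uses 0.01.
 */
void jitter_update(struct jitter_est *j, double slack, double alpha)
{
  double d;

  assert(j != 0);
  assert(alpha > 0.0 && alpha <= 1.0);

  j->avg = alpha * slack + (1 - alpha) * j->avg;
  d = slack - j->avg;                 /* deviation from updated mean */
  j->jitter = alpha * d * d + (1 - alpha) * j->jitter;
}
```

A caller would keep one jitter_est per source, initialize both fields
(for example, to zero) and call jitter_update once per received packet.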


A.8 Determining the Expected Number of RTP Packets


In order to compute packet loss rates, the numbers of packets expected and
actually received need to be known.  The number of packets expected can
be computed by the receiver by tracking the first sequence number received
(seq0), the last sequence number received (seq), and the number of complete
sequence number cycles (cycles):


expected = cycles * 65536 + seq - seq0 + 1;


The cycle count cycles is updated for each packet, where seq_prior is the
sequence number of the prior packet.  The cycle count is incremented when
the sequence number wraps around in the "forward" direction, and needs to be
decremented if the sequence number wraps around in the "backward" direction.


unsigned short seq, seq_prior;
unsigned int delta;  /* forward distance seq_prior -> seq, modulo 65536 */

/* Mask to 16 bits: plain seq - seq_prior is promoted to int in C. */
delta = ((unsigned int)seq - seq_prior) & 0xffff;

if (seq > seq_prior) {
  if (delta > 32768) {
    /* out-of-order packet with wrap-around (e.g., 65530 preceded by 3) */
    cycles--;
  }
}
else if (seq < seq_prior) {
  if (delta > 32768) {
    /* out-of-order packet (e.g., 2 preceded by 3) */
  }
  else {
    /* wrap-around (e.g., 3 preceded by 65530) */
    cycles++;
  }
}
seq_prior = seq;
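
The cycle tracking and the formula for the expected packet count can be
combined as in the following sketch.  The structure seq_state and the
function seq_update are hypothetical names introduced for this example;
the sketch computes the sequence number difference modulo 65536 explicitly
so that it does not depend on the width of the C integer types.

```c
#include <assert.h>

/* Hypothetical per-source state; field names follow the text above. */
struct seq_state {
  unsigned int seq0;       /* first sequence number received */
  unsigned int seq_prior;  /* sequence number of the prior packet */
  long cycles;             /* complete sequence number cycles */
  int started;             /* nonzero once the first packet is seen */
};

/*
 * Record the arrival of a packet with 16-bit sequence number seq and
 * return the number of packets expected so far.
 */
long seq_update(struct seq_state *s, unsigned int seq)
{
  assert(s != 0);
  seq &= 0xffff;

  if (!s->started) {
    s->seq0 = seq;
    s->cycles = 0;
    s->started = 1;
  } else {
    /* forward distance from seq_prior to seq, modulo 65536 */
    unsigned int delta = (seq - s->seq_prior) & 0xffff;

    if (seq > s->seq_prior && delta > 32768) {
      s->cycles--;         /* late packet from before a wrap-around */
    } else if (seq < s->seq_prior && delta <= 32768) {
      s->cycles++;         /* forward wrap-around */
    }
  }
  s->seq_prior = seq;

  return s->cycles * 65536L + (long)seq - (long)s->seq0 + 1;
}
```

Starting from a zeroed seq_state, feeding the sequence numbers 65530,
65535 and 3 yields expected counts of 1, 6 and 10, the last reflecting one
forward wrap-around.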


Acknowledgments


This memorandum is based on discussions within the IETF Audio/Video
Transport working group chaired by Stephen Casner.  The current protocol has
its origins in the Network Voice Protocol and the Packet Video Protocol
(Danny Cohen and Randy Cole) and the protocol implemented by the vat
application (Van Jacobson and Steve McCanne).  Christian Huitema provided
ideas for the random identifier generator.


B Addresses of Authors


Henning Schulzrinne
GMD Fokus
Hardenbergplatz 2
D-10623 Berlin
Germany
electronic mail: hgs@fokus.gmd.de


Stephen Casner
University of Southern California/Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292-6695
United States
electronic mail: casner@isi.edu


Ron Frederick
Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
United States
electronic mail: frederic@parc.xerox.com


Van Jacobson
MS 46a-1121
Lawrence Berkeley Laboratory
Berkeley, CA 94720
United States
electronic mail: van@ee.lbl.gov




References


 [1] D. D. Clark and D. L. Tennenhouse, "Architectural considerations for a
     new generation of protocols," in SIGCOMM Symposium on Communications
     Architectures and Protocols, (Philadelphia, Pennsylvania),
     pp. 200--208, IEEE, Sept. 1990.

 [2] H. Schulzrinne, "Issues in designing a transport protocol for
     audio and video conferences and other multiparticipant real-time
     applications,"(4) expired Internet draft, Oct. 1993.

 [3] J.-C. Bolot, T. Turletti, and I. Wakeman, "Scalable feedback control
     for multicast video distribution in the internet,"(5) in SIGCOMM
     Symposium on Communications Architectures and Protocols, (London,
     England), pp. --, ACM, Aug. 1994.

 [4] D. E. Comer, Internetworking with TCP/IP, vol. 1. Englewood Cliffs,
     New Jersey: Prentice Hall, 1991.

 [5] J. Postel, "Internet protocol,"(6) Request for Comments (Standard)
     RFC 791, Internet Engineering Task Force, Sept. 1981. Obsoletes
     RFC0760.

 [6] D. Mills, "Network time protocol (v3),"(7) Request for Comments
     (Proposed Standard) RFC 1305, Internet Engineering Task Force, Apr.
     1992. Obsoletes RFC1119.

 [7] W. Feller, An Introduction to Probability Theory and its Applications,
     vol. 1. New York, New York: John Wiley and Sons, third ed., 1968.

 [8] International Standards Organization, "ISO/IEC DIS 10646-1:1993
     information technology -- universal multiple-octet coded character set
     (UCS) -- part I: Architecture and basic multilingual plane," 1993.

 [9] The Unicode Consortium, The Unicode Standard. New York, New York:
     Addison-Wesley, 1991.

[10] S. Stubblebine, "Security services for multimedia conferencing," in
     16th National Computer Security Conference, (Baltimore, Maryland),
     pp. 391--395, Sept. 1993.

[11] D. Balenson, "Privacy enhancement for internet electronic mail: Part
     III: algorithms, modes, and identifiers,"(8) Request for Comments
     (Proposed Standard) RFC 1423, Internet Engineering Task Force, Feb.
     1993. Obsoletes RFC1115.

[12] V. L. Voydock and S. T. Kent, "Security mechanisms in high-level
     network protocols," ACM Computing Surveys, vol. 15, pp. 135--171, June
     1983.

------------------------------
 4. ftp://gaia.cs.umass.edu/pub/hgschulz/rtp/draft-ietf-avt-issues-01.ps
 5. ftp://cs.ucl.ac.uk/darpa/multicast-congestion.ps.Z
 6. ftp://ds.internic.net/rfc/rfc791.txt
 7. ftp://ds.internic.net/rfc/rfc1305.ps
 8. ftp://ds.internic.net/rfc/rfc1423.txt