Internet Engineering Task Force                     Audio-Video Transport WG
INTERNET-DRAFT                                      H. Schulzrinne/S. Casner
                                                                    AT&T/ISI
                                                                 May 6, 1993
                                                          Expires:  10/01/93


              A Transport Protocol for Real-Time Applications



Status of this Memo


This document is an Internet Draft.   Internet Drafts are working  documents
of the Internet Engineering  Task Force (IETF), its  Areas, and its  Working
Groups.   Note that other  groups may also  distribute working documents  as
Internet Drafts.

Internet Drafts  are draft  documents valid  for a  maximum of  six  months.
Internet Drafts may be  updated, replaced, or  obsoleted by other  documents
at any time.   It  is not appropriate  to use Internet  Drafts as  reference
material or to  cite them other  than as  a ``working draft''  or ``work  in
progress.''

Please check  the I-D  abstract  listing contained  in each  Internet  Draft
directory to learn the current status of this or any other Internet Draft.

Distribution of this document is unlimited.


Contents


1 Introduction                                                             2


2 Real-time Data Transfer Protocol -- RTP                                  4

  2.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

  2.2 RTP Header Fields . . . . . . . . . . . . . . . . . . . . . . . . . 5

3 Reverse Control                                                          8


4 Real Time Control Protocol --- RTCP                                      9

  4.1 Forward Control Options . . . . . . . . . . . . . . . . . . . . . . 10


INTERNET-DRAFT                        RTP                        May 6, 1993

5 Security Considerations                                                 15


6 RTP over network and transport protocols                                15

  6.1 Defaults  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    6.1.1 Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    6.1.2 RTA option  . . . . . . . . . . . . . . . . . . . . . . . . . . 16

  6.2 UDP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

  6.3 TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

  6.4 ST-II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

A Implementation Notes                                                    17


B Addresses of Authors                                                    18


                                  Abstract


     This  draft  describes  a protocol  called  RTP  suitable for  the
    network  transport of  real-time  data,  such  as audio,  video  or
    simulation data.    The  data transport  is enhanced  by  a control
    protocol  designed to  provide minimal  control  and identification
    functionality.  A reverse  control protocol provides mechanisms for
    monitoring quality of service  and other content-specific requests.
    This protocol is intended for experimental use.


This specification is a product  of the Audio-Video Transport working  group
within the Internet  Engineering Task  Force.   Comments  are solicited  and
should be addressed to the  working group's mailing list at  rem-conf@es.net
and/or the authors.



1 Introduction


This draft concisely specifies a real-time transport protocol.  A discussion
of the design decisions can be found in the current version of the companion
Internet draft draft-ietf-avt-issues.txt.   The transport protocol  provides
end-to-end delivery services for one or more flows of data with real-time
characteristics, for example, interactive audio and video.  It does not
guarantee delivery  or prevent  out-of-order delivery,  nor does  it  assume
that the underlying network  is reliable and  delivers packets in  sequence.

H. Schulzrinne/S. Casner              Expires 10/01/93              [Page 2]


[Note that the  sequence numbers  included in RTP  allow the  end system  to
reconstruct the  sender's packet  sequence, but  sequence numbers  may  also
be used  to determine  the  proper location  of  a packet,  for  example  in
video decoding, without necessarily decoding packets  in sequence].  RTP  is
designed to run on top of a variety of network and transport protocols,  for
example, IP, ST-II or UDP.  [For most applications, RTP offers  insufficient
demultiplexing to  run directly  on  IP.] RTP  transfers  data in  a  single
direction, possibly to multiple destinations if supported by the  underlying
network.   A mechanism  for indicating  a return  path for  control data  is
provided.

While RTP is primarily  designed to satisfy  the needs of  multi-participant
multimedia conferences, it  is not limited  to that particular  application.
Storage of continuous data,  interactive distributed simulation and  control
and measurement applications  may also find  RTP applicable.   Profiles  are
used to instantiate certain  header fields and  options for particular  sets
of applications.   A profile for audio  and video data may  be found in  the
companion Internet draft draft-ietf-avt-profile.txt.

This document defines two packet formats and protocols:


  o the  real-time  transport  protocol  (RTP)  for  exchanging  data   with
    real-time properties.

  o the real-time  control protocol (RTCP)  for conveying information  about
    the sites in an on-going  association.  RTCP information may be  ignored
    without affecting the  ability to correctly  receive information.   RTCP
    is used  for loosely  controlled conferences,  i.e., where  there is  no
    explicit  admission control  and  set-up.    Its  functionality  may  be
    subsumed by a conference control protocol (which is beyond the  scope of
    this document).


Control fields  (options) for  RTP and  RTCP share  the same  structure  and
numbering space and are carried within the same packet.  Options may  appear
in any  order, unless  specifically restricted  by the  option  description.
[The position of some security options may have significance.]  Each  option
consists of the final bit, the  option type designation, a one-octet  length
field denoting the total number of  32-bit long words comprising the  option
(including final  bit, type  and length),  and  finally any  option-specific
data.  The last option before the packet data portion has the 'F' (final)
bit set to one; for all other options this bit has a value of zero.
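
The option structure described above can be sketched in Python as follows.
This is an illustration only, not part of the specification.  It assumes,
as the option diagrams in this draft suggest, that the final bit is the most
significant bit of the first octet and that the option type occupies the
remaining seven bits.  The function name parse_options is hypothetical.

```python
def parse_options(buf):
    """Split a run of options into (type, data) pairs.

    Returns the list of options and the offset at which the packet
    data portion begins.  Assumed layout: F bit = top bit of octet 0,
    option type = low 7 bits of octet 0, octet 1 = option length in
    32-bit words (these two octets included).
    """
    options = []
    offset = 0
    while True:
        if offset + 2 > len(buf):
            raise ValueError("truncated option header")
        final = buf[offset] >> 7
        opt_type = buf[offset] & 0x7F
        length = buf[offset + 1] * 4      # octets, header included
        if length == 0 or offset + length > len(buf):
            raise ValueError("bad option length")
        options.append((opt_type, buf[offset + 2:offset + length]))
        offset += length
        if final:                          # last option before the data
            return options, offset
```

A receiver would skip any (type, data) pair whose type it does not
recognize, since unknown options are to be ignored.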

Fields within the fixed header and within options are aligned to the natural
length of  the field,  i.e., 16-bit  words are  aligned  on even  addresses,
32-bit long words are aligned at addresses  divisible by four, etc.   Octets
designated as padding  have the  value zero.    Options unknown  to the  RTP
implementation or the application  are to be ignored.   Options with  option
types having values  from 64 to  127 inclusive  are to be  used for  private
extensions.  Fields designated as MBZ ('must be zero') must have a value of
binary zero and are to be ignored by the receiver.

All integer  fields  are  carried in  network  byte  order,  that  is,  most
significant byte (octet)  first.   The  transmission order  is described  in
detail in [1], Appendix A. Unless otherwise noted, constants are in  decimal
(base 10).

Textual information is encoded according to the UTF-2 encoding of the ISO
standard 10646 (Annex F) [2,3].  US-ASCII  is a subset of this encoding and
requires no additional encoding.   The presence  of multi-byte encodings  is
indicated by setting the  most significant bit to  a value of one.   A  byte
with a binary value of zero may  be used as a string terminator for  padding
purposes.



2 Real-time Data Transfer Protocol -- RTP


2.1 Definitions


A content source is the actual source of the data carried, for example, the
user and host that originally generated the audio data.

A synchronization source is the combination of one or more content sources
with its own timing.

A network source is the network-level origin of the RTP protocol data units
(RPDUs) as seen by the end system.

An end system generates the content to be used in RTP packets and delivers
the content of received RTP packets to the user application.  An end system
is a synchronization source.

An (RTP-level) bridge receives RTP packets from one or more sources,
combines them in some manner and then forwards  a new RTP packet.  A  bridge
may change the encoding.   A bridge always changes the timing  relationship,
introducing a new  time scale.   Bridges are  synchronization sources,  with
each of the sources whose packets were combined into an outgoing RTP  packet
as the  content  sources  for that  outgoing  packet.    Audio  bridges  and
media converters are examples  of bridges.   Example:  assume SMITH@FOO  and
JONES@BAR are using a bridge to  translate their audio from one encoding  to
another.  The bridge mixes audio  packets from Smith and Jones together  and
forwards the mixed packets.   If, say, Smith  was talking, she is  indicated
as the  content source  of the  outgoing packet,  allowing the  receiver  to
properly display the current speaker rather than just the bridge that  mixed
the audio.  For  an end system receiving RTP  packets from that bridge,  the
bridge is the  synchronization source  and Smith the  content source.    The
RTP-level bridges  described in  this  document are  unrelated to  the  data
link-layer bridges found in local area networks.  If there is a possibility
of confusion, the term 'RTP-level bridge' should be used.  [The name
'bridge' follows common telecommunication usage.]

An (RTP-level) translator does not alter the timing of packets.  Examples of
its use include encoding conversion  without mixing or retiming,  conversion
from multicast to unicast,  and application-level filters in  firewalls.   A
translator is neither a synchronization nor a content source.

A synchronization unit consists of one or more packets that, as a group,
share a common fixed  delay between generation and  playout of each part  of
the group, or can only be scheduled as a whole.  The delay may change at the
beginning of such a synchronization unit.   The most common  synchronization
units are talkspurts for voice and frames for video transmission.



2.2 RTP Header Fields


The header fields have the following meaning:


protocol version: 2 bits
    Defines  the protocol  version.    The version  number of  the  protocol
    defined in this draft is one.

flow: 6 bits
    The  value of  the  field is  the  flow  identifier, one  of  the  items
    used by the  receiver for demultiplexing.   A synchronization source  is
    identified by the receiver  as the unique combination of network  source
    address, flow value, and the synchronization source option, if present.

option present bit (P): 1 bit
    This flag has a value of one if the fixed RTP header is followed  by one
    or more options.

end-of-synchronization-unit (S): 1 bit
    This flag has  a value of  one in the last  packet of a  synchronization
    unit, a value of zero otherwise.

format: 6 bits
    The  'format' field  forms  an index  into  a table  defined  through  a
    conference announcement  protocol (to  be specified),  RTCP messages,  a
    conference server or  some other out-of-band means.   If no mapping  has
    been defined  in this  manner, a  standard mapping is  specified by  the
    companion profile document, RFC TBD. RFC 1340, Assigned Numbers,  or its
    successor, is to be used.

sequence number: 16 bits
    The sequence  number counts  RTP protocol  data  units (packets).    The
    sequence number increments by one  for each packet sent.  [The  sequence
    number may be used by the receiver to detect packet loss, to restore
    packet sequence and to identify packets to the application.]

timestamp: 32 bits
    The timestamp reflects the  wallclock time when the RPDU was  generated.
    The timestamp consists of the middle 32 bits of a 64-bit  NTP timestamp,
    as defined in RFC 1305  [4].  Note that several consecutive packets  may
    have equal timestamps.

    The  timestamp of  the first  packet(s)  within a  synchronization  unit
    is expected  to closely  reflect the actual  sampling instant,  measured
    by the  local system  clock.    It is  not expected  that the  timestamp
    of the  beginning of  every synchronization  unit  is based  on a  local
    synchronized system  clock.   However,  the local clock  should be  used
    frequently enough so that clock drift between synchronized  system clock
    and sampling clock can be  compensated for gradually.  The local  system
    clock should be  controlled by a  time synchronization protocol such  as
    NTP if such  a service is available.   Within one synchronization  unit,
    it may be appropriate to compute timestamps based on the  logical timing
    relationships between the packets.  For audio samples, for  example, the
    nominal sampling interval  may be used.   If the clock quality field  of
    the CDES  option does  not indicate otherwise,  it is  assumed that  the
    timestamp at the beginning  of a synchronization unit is derived from  a
    synchronized system clock.  However, it is allowable to  operate without
    synchronized time on those  systems where it is not available, unless  a
    profile or session protocol requires otherwise.
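
As a sketch of the timestamp derivation described above, the following
Python fragment takes the middle 32 bits of a 64-bit NTP timestamp, that is,
the low 16 bits of the seconds field and the high 16 bits of the fraction
field.  It assumes the local clock is available as seconds since the Unix
epoch; the constant and function names are hypothetical.

```python
NTP_UNIX_OFFSET = 2208988800   # seconds from 1 Jan 1900 to 1 Jan 1970

def rtp_timestamp(unix_time):
    """Return the middle 32 bits of the 64-bit NTP timestamp.

    unix_time is wallclock time in seconds since the Unix epoch
    (an assumption of this sketch, not a requirement of the draft).
    """
    ntp = unix_time + NTP_UNIX_OFFSET
    seconds = int(ntp) & 0xFFFF                  # low 16 bits of seconds
    fraction = int((ntp % 1) * 65536) & 0xFFFF   # high 16 bits of fraction
    return (seconds << 16) | fraction
```

The resulting value wraps around every 65536 seconds, so a receiver can
only compare timestamps that lie within that range of each other.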



 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver|   flow    |P|S|  format   |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     timestamp (seconds)       |     timestamp (fraction)      |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| options ...                                                   |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+


                        Figure 1:  RTP header format
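
The layout in Figure 1 can be packed and unpacked along the following
lines.  This Python sketch is one possible reading of the figure; the
function names and the dictionary keys are hypothetical, and options and
media data are left to the caller.

```python
import struct

def pack_header(flow, fmt, seq, timestamp, p=0, s=0, version=1):
    """Build the 8-octet fixed RTP header in network byte order."""
    octet0 = (version & 0x3) << 6 | (flow & 0x3F)
    octet1 = (p & 1) << 7 | (s & 1) << 6 | (fmt & 0x3F)
    return struct.pack("!BBHI", octet0, octet1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF)

def unpack_header(data):
    """Decode the fixed header found in the first 8 octets of data."""
    octet0, octet1, seq, timestamp = struct.unpack("!BBHI", data[:8])
    return {
        "version": octet0 >> 6,
        "flow": octet0 & 0x3F,
        "p": octet1 >> 7,
        "s": (octet1 >> 6) & 1,
        "format": octet1 & 0x3F,
        "sequence": seq,
        "timestamp": timestamp,
    }
```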

The packet  header is  followed by  options,  if any,  and the  media  data.
Optional fields are summarized below.   Unless otherwise noted, each  option
may appear only  once per packet.   Each  packet may contain  any number  of
options.









CSRC 0   Content source identifiers.  The content source option is  inserted
        only by bridges and identifies  all sources that contributed to  the
        packet.   For example,  for audio  packets, all  sources are  listed
        that were mixed  together to  create this  packet, allowing  correct
        talker indication at  the receiver.   Each CSRC  option may  contain
        one or more  content source  identifiers, each  16 bits long.    The
        identifier values must  be unique for  all content sources  received
        through a particular synchronization source (bridge) on a particular
        conference (destination address and port); the value of binary  zero
        is reserved and may not be used.   If the number of content  sources
        is even, the two octets needed to pad the list to a multiple of four
        octets are set to zero.   There should only be a single CSRC  option
        within a packet.  If no  CSRC option is present, the content  source
        is assumed to have a value of  zero.  CSRC options are not  modified
        by RTP-level translators.


         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    CSRC     |    length     | content source identifier    ...
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
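
A bridge might assemble a CSRC option roughly as shown below.  The sketch
follows the padding rule stated above (two zero octets when the number of
identifiers is even) and assumes the final bit is the top bit of the first
octet; build_csrc is a hypothetical name.

```python
import struct

CSRC = 0   # option type from this draft

def build_csrc(identifiers, final=False):
    """Encode a CSRC option carrying 16-bit content source identifiers."""
    if not identifiers:
        raise ValueError("at least one identifier is required")
    if any(not 0 < i <= 0xFFFF for i in identifiers):
        raise ValueError("identifiers are 16 bits; zero is reserved")
    body = b"".join(struct.pack("!H", i) for i in identifiers)
    if len(identifiers) % 2 == 0:
        body += b"\x00\x00"            # pad to a multiple of four octets
    length_words = (2 + len(body)) // 4
    if length_words > 255:
        raise ValueError("too many identifiers for one option")
    first = (0x80 if final else 0) | CSRC
    return bytes([first, length_words]) + body
```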

SSRC 1   Synchronization source  identifier.     The  SSRC  option  is  only
        inserted by  RTP-level translators;  the  translator must  assign  a
        unique identifier  for each  synchronization  source from  which  it
        receives packets for  a particular  conference (destination  address
        and port).    The  value zero  is reserved  and  must not  be  used.
        If no  SSRC option  is present,  the network  source is  assumed  to
        indicate the synchronization source.  There must be no more than one
        SSRC identifier per packet; thus,  a translator must remap the  SSRC
        identifier of an  incoming packet into  a new,  locally unique  SSRC
        identifier.


         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    SSRC     | length = 1    | identifier                    |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

BOP 2   (beginning of playout unit)  16-bit sequence number designating  the
        first packet within the current playout unit.


         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|     BOP     | length = 1    | sequence number               |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+




3 Reverse Control


This section describes  a means  for the receiver  of RTP  protocol data  to
signal back to the  sender or a  third party.   Reverse control packets  are
sent to the destination specified  by the sender of  the data using the  RNA
and RTA options.    Use of  reverse control packets  is optional.    Reverse
control packets have the format  shown below.  The  packet is preceded by  a
32-bit packet length  field if and  only if the  underlying transport  layer
does not support framing.   The packet length  field contains the number  of
octets within the packet, not including the packet length field itself.  The
flow index is that of the flow to which this reverse control is a  response.
Reverse control packets are only sent to the synchronization source.  It  is
the responsibility of the RTP-level bridge to convey information back to the
content sources, if necessary.



 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       0       |       0       |       0       |  flow index   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| reverse-control options (variable length) ...                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
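
The framing rule for reverse control packets can be sketched as follows.
The Python fragment prepends the 32-bit length field only when the caller
states that the transport does not frame packets.  The function and
parameter names are hypothetical, and the limit of 63 on the flow index is
an assumption based on the 6-bit flow field of the fixed header.

```python
import struct

def build_reverse_control(flow, options, framed_transport=True):
    """Assemble a reverse control packet around pre-encoded options."""
    if not 0 <= flow <= 63:
        raise ValueError("flow index out of range")
    packet = struct.pack("!BBBB", 0, 0, 0, flow) + options
    if not framed_transport:
        # The length counts the packet only, not the length field itself.
        packet = struct.pack("!I", len(packet)) + packet
    return packet
```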

The following options may be used within reverse control packets:


QOS 64   Quality of service  measurement.   The option  contains the  number
        of packets received (16  bits), the number  of packets expected  (16
        bits), the minimum delay, the  maximum delay and the average  delay.
        The delay measures are encoded as 16/16 NTP timestamps, that is,  16
        bits encode the number of seconds and 16 bits the fraction of a
        second.  [The timestamp format is  identical to the one used in  the
        fixed RTP header.]


          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |F|     QOS     | length = 5    |            MBZ                |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | packets received              | sequence number range         |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | minimum delay (seconds)       | minimum delay (fraction)      |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | maximum delay (seconds)       | maximum delay (fraction)      |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | average delay (seconds)       | average delay (fraction)      |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
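
As an illustration of the 16/16 delay encoding, the Python sketch below
builds a QOS option.  It follows the prose description, in which the second
16-bit counter is the number of packets expected; the figure labels that
field differently, and the sketch does not attempt to resolve this.  The
names to_16_16 and build_qos are hypothetical.

```python
import struct

QOS = 64   # option type from this draft

def to_16_16(seconds):
    """Encode a delay as 16 bits of seconds and 16 bits of fraction."""
    value = int(round(seconds * 65536))
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("delay does not fit in 16/16 format")
    return value

def build_qos(received, expected, dmin, dmax, davg, final=True):
    """Encode a QOS option of five 32-bit words."""
    first = (0x80 if final else 0) | QOS
    return struct.pack("!BBHHHIII", first, 5, 0, received, expected,
                       to_16_16(dmin), to_16_16(dmax), to_16_16(davg))
```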



RAD 65   Reverse application data.    The data  contained in  the option  is
        directly passed to the application, without interpretation by RTP.


          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |F|    RAD      |    length     | reverse application data      |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        ...                                                             ...
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+



4 Real Time Control Protocol --- RTCP


The real-time control protocol  (RTCP) conveys minimal out-of-band  advisory
information during a conference.  It provides support for loosely controlled
conferences, i.e.,  where  participants enter  and leave  without  admission
control and parameter negotiation.  The services provided by RTCP enhance
RTP, but an end system does not have to implement RTCP features to
participate in conferences (1).  RTCP does not aim to provide the services
of a conference control protocol and  does not provide some of the  services
desirable for two-party conversations.  If a conference control protocol  is
in use, the  services of RTCP should  not be required.   (Note:   as of  the
writing of this document, a conference  or session control protocol has  not
been specified within the Internet.)

Unless otherwise  noted,  control  information is  carried  periodically  as
options within RPDUs.    In the absence  of media  data, packets  containing
only RTCP options are sent periodically to the same multicast group as  data
packets, using the same time-to-live value.  Note that RTCP options could be
sent in separate packets even when there is data to send; however, the  RTCP
packets would consume sequence  numbers and make detection  of lost data  at
the receiver more difficult.  The period should be varied randomly to  avoid
synchronization of all sources and its mean should increase with the  number
of participants in the conference  to limit the overall  network load.   The
length of the period determines, for example, how long a receiver joining  a
conference has to wait in the worst  case until it can identify the  source.
An initial period varying randomly between 3 and 10 seconds is  recommended.
A receiver may remove a site that it has not heard from for a given
time-out period from its list of active sites; the time-out period may
depend on the number of sites or the observed average interarrival time of
RTCP messages.  Note that not every periodic message has to contain all RTCP
options; for example, the MAIL part within the SDES option might only be
sent every few messages.

------------------------------
 1. There is one exception to that rule:  if an application sends FMT
options, the receiver has to decode these in order to properly interpret the
RTP payload.
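
One way to realize the randomized period is sketched below in Python.  The
draft recommends an initial period between 3 and 10 seconds and asks that
the mean grow with the number of participants, but it does not say how.
The linear scaling and the constant REFERENCE_SIZE are therefore purely
assumptions of this sketch, as is the function name.

```python
import random

REFERENCE_SIZE = 10   # hypothetical: no scaling at or below this size

def next_rtcp_delay(participants, rng=random):
    """Return the delay in seconds before the next RTCP transmission."""
    base = rng.uniform(3.0, 10.0)      # recommended initial period
    scale = max(1.0, participants / REFERENCE_SIZE)
    return base * scale
```

Drawing a fresh random value for every transmission, rather than once per
session, keeps sources from falling into step with one another.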

The item types are defined below:



4.1 Forward Control Options


The following options are sent in the same direction as the data stream.


FMT 32   Format description.


        format:  6 bits
            The 'format' field designates the index value from  the 'format'
            fixed header field, with values ranging from 0 to 63.

        Clock quality:  8 bits
            Provides an  indication as  to the  sender-perceived quality  of
            the timestamps  in the  RTP header.   The  octet is  interpreted
            as a quantity indicating  the maximum dispersion to a root  time
            server measured  in fractions  of a  second and  expressed as  a
            power of two.

            If a source  is known to be  synchronized to standard time,  but
            with an  unknown dispersion, or the  dispersion is greater  than
            TBD, the  value TBD  is used.    If the  clock is  based on  the
            nominal sample rate of the source, a value of TBD is used.

            The clock quality indication can be used to judge how  the delay
            measurements reported by the  QOS option can be interpreted  (as
            absolute delay or only as  delay variation).  It is also  useful
            for determining  to what extent  several sources with  different
            clocks can be synchronized.

        Format-dependent data:  variable
            Format-dependent data  may or may  not appear in  a FMT  option.
            It is passed to the next layer and not interpreted by RTP.














        An FMT mapping changes the interpretation of a given 'format'
        value starting at the packet containing the FMT option.  The new
        interpretation applies  only  to packets  from  the  synchronization
        source of this packet.   A sender  should refrain from changing  the
        content type and flow index of  a mapping defined by external  means
        such as a conference  registry, conference announcement protocol  or
        otherwise agreed-upon  mapping.   Dynamic  changes to  these  values
        may result  in misinterpretation  of RTP  payload if  the  packet(s)
        containing the FMT option are lost.


          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |F|     FMT     |    length     |0|0|  format   | clock quality |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |  format-dependent data                                       ...
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
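
The FMT layout in the figure above might be encoded as in the following
Python sketch.  The clock quality octet is passed through unchanged because
its values are still to be defined.  Padding the format-dependent data with
zero octets to a 32-bit boundary is an assumption drawn from the general
padding rule in the introduction; build_fmt is a hypothetical name.

```python
FMT = 32   # option type from this draft

def build_fmt(fmt, clock_quality, data=b"", final=False):
    """Encode an FMT option with optional format-dependent data."""
    if not 0 <= fmt <= 63:
        raise ValueError("format index must fit in 6 bits")
    if not 0 <= clock_quality <= 255:
        raise ValueError("clock quality must fit in 8 bits")
    padded = data + b"\x00" * (-len(data) % 4)   # assumed zero padding
    length_words = 1 + len(padded) // 4
    if length_words > 255:
        raise ValueError("format-dependent data too long")
    first = (0x80 if final else 0) | FMT
    return bytes([first, length_words, fmt, clock_quality]) + padded
```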

SDES 33   This option provides a mapping between a numeric source identifier
        and one or more  identifying attributes.   [Several attributes  were
        combined into one option to avoid multiple mappings from identifiers
        to the receiver site data structure.]  For those applications  where
        the size of  a multipart SDES  option would be  a concern,  multiple
        SDES options may be formed with subsets  of the parts to be sent  in
        separate packets.  An end system always uses an identifier value  of
        zero.   A bridge uses  the content source  identifiers used in  CSRC
        options to identify contributors,  and a value  of zero to  identify
        itself.  Translators do not modify or insert SDES options.  The  end
        system performs the  same mapping  it uses to  identify the  content
        sources (that is, the combination of network source, synchronization
        source and the source number within this SDES option) to identify  a
        particular source.

        Currently, the following items  are defined.   Each has a  structure
        similar to that  of RTCP  and RTP  options,  that is,  a type  field
        followed by a length field  (measured in multiples of four  octets).
        No final bit is needed since the overall length is known.

        The class  identifier of  the informational  items within  the  SDES
        option is  identical  to the  CLASS  value in  the  resource  record
        (RR) in  the  Domain Name  Service  protocol (DNS)  [RFC  1034,  RFC
        1035] [5,6] and may be found in the current version of the Assigned
        Numbers RFC  issued  by  the Internet  Assigned  Numbers  Authority.
        Additional values  that  are  reserved are  used  for  SDES-specific
        identifiers.








              name  class   description
              USER  0       user and host identifier,
                            e.g., ``doe@sleepy.megacorp.com'' or
                            ``sleepy.megacorp.com''
              MAIL  3       user's electronic mail address
                            e.g., ``John.Doe@megacorp.com''
              TEXT  65535   text describing the source,
                            e.g., ``John Doe, Bit Recycler, Megacorp''
              ADDR  1       IPv4 address of source
                    2-65534 other address formats



        Class value 4 is currently assigned to historical network address
        types (Hesiod) and is thus safe for private SDES use.  Items are
        padded with zero to the next multiple of four octets.  The USER item
        must have the format  ``user@host'' or ``host'',  where ``host''  is
        the fully qualified domain name of the host where the real-time data
        originates from, formatted according to  the rules specified in  RFC
        1035.  The latter form may be used if a user name is  not available,
        for example on  single-user systems.   The  user name  should be  in
        a form that  a program  such as  ``finger'' or  ``talk'' could  use,
        i.e., it typically is the login  name rather than the ``real  life''
        name.  Note that the host  name is not necessarily identical to  the
        electronic mail address of the participant.  The latter is provided
        through the MAIL item.  The USER item is intended to be parsed by
        an application program.


          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |F|     SDES    |    length     |       source identifier       |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |            class = 0          |    length     | text         ...
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | ... describing the source ...                                ...
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |        class = 0xFFFF         |    length     | user and     ...
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         | domain name of source                                        ...
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |          class = 1            |   length      |       0       |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                          IPv4 address                         |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

RNA 36   The RNA (reverse network address) option indicates the network
        address to be used for sending reverse control data for the given
        content type.  The address class field identifies the address
        format, using the DNS-based namespace described for the SDES
        option above.  If a host has several network addresses (for
        example, for different network protocols), the RNA option is
        repeated once for each address, and the receiver chooses the
        address appropriate for its needs.  The 'interval' field contains
        the number of seconds between QOS packets, expressed as the
        exponent of a power of two.  For example, a value of 3 means that
        the source would like to receive quality-of-service reports every
        2 ** 3 = 8 seconds.  To avoid synchronization between receivers, a
        receiver should space QOS reports randomly between one half and
        twice the requested interval.  The interval is advisory only, and
        an application may choose to send QOS reports at a different
        frequency.  [This caveat is necessary because keeping track of a
        different interval for each source may be unduly burdensome.]  A
        profile may specify a different algorithm.  A value of 255
        (decimal) in the 'interval' field indicates that no QOS packets
        should be sent.
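
The interval rule above can be sketched as follows.  The function name and
the use of a uniform distribution are assumptions; the text only requires
that reports be spaced randomly between one half and twice the requested
interval.

```python
import random

NO_QOS_REPORTS = 255  # 'interval' value meaning "send no QOS packets"


def next_qos_delay(interval_field, rng=random):
    """Return seconds to wait before the next QOS report, or None if disabled."""
    if not 0 <= interval_field <= 255:
        raise ValueError("the 'interval' field is a single octet")
    if interval_field == NO_QOS_REPORTS:
        return None
    requested = 2.0 ** interval_field  # e.g. 3 -> 8 seconds
    # Assumption: draw uniformly from [requested / 2, requested * 2].
    return rng.uniform(requested / 2, requested * 2)
```

With an 'interval' value of 3, the returned delay lies between 4 and 16
seconds; a value of 255 returns None.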


         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    RNA      |    length     |    format     |   interval    |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |         address class         |       0       |       0       |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |                        network-address                       ...
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    RNA      |  length = 3   |    format     |   interval    |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |      address class = 1        |       0       |       0       |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |                          IPv4 address                         |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

RTA 37   The RTA (reverse transport address) option indicates the transport
        selector (e.g., port number) to be used for sending reverse control
        data.  The transport protocol field determines the interpretation
        of the following octets, using the IP Protocol Numbers defined in
        the current edition of the Assigned Numbers RFC.  The figures show
        the use of the RTA option for the ST-II, TCP and UDP protocols.
        [The port numbers are right-aligned so that the second 32-bit word
        can be interpreted as the port number, with the most-significant
        bits set to zero.]


         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    RTA      |    length     |    format     | transport pro.|
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |              transport-address (port number)                 ...
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    RTA      |  length >= 2  |    format     | protocol = 5  |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |       0       |       0       |       0       |   SAP bytes   |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |    padding    : ST-II service access point                   ...
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    RTA      |  length = 2   |    format     | protocol = 6  |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |       0       |       0       |        TCP port number        |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|    RTA      |  length = 2   |    format     | protocol = 17 |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |       0       |       0       |        UDP port number        |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
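
As an illustration of the TCP and UDP layouts in the figures, the sketch
below encodes and decodes a two-word RTA option.  The function names are
hypothetical.  Network byte order is assumed, and the F bit is treated as an
opaque flag supplied by the caller because its meaning is defined elsewhere
in this document.

```python
import struct

RTA = 37                          # option type from the text above
IPPROTO_TCP, IPPROTO_UDP = 6, 17  # IP protocol numbers (Assigned Numbers)


def pack_rta(fmt, protocol, port, f_bit=False):
    """Encode an RTA option for TCP or UDP (two 32-bit words)."""
    if protocol not in (IPPROTO_TCP, IPPROTO_UDP):
        raise ValueError("this sketch covers TCP and UDP only")
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port numbers are two octets long")
    first = (0x80 if f_bit else 0) | RTA
    # The second word carries the port with the upper 16 bits zero.
    return struct.pack("!BBBBI", first, 2, fmt, protocol, port)


def unpack_rta(data):
    """Decode an RTA option; returns (f_bit, format, protocol, port)."""
    first, length, fmt, protocol, port = struct.unpack("!BBBBI", data[:8])
    if first & 0x7F != RTA or length != 2:
        raise ValueError("not a two-word RTA option")
    return bool(first & 0x80), fmt, protocol, port
```

Reading the second word as a single 32-bit integer gives the port number
directly, which is the property the bracketed remark in the text relies on.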

BYE 35   The BYE option indicates that a particular site is no longer
        active.  A bridge sends BYE options with a (non-zero) content
        source identifier.  An identifier value of zero indicates that the
        source identified by the synchronization source (SSRC) option and
        network address is no longer active.  If a bridge shuts down, it
        should first send BYE options for all content sources it handles,
        followed by a BYE option with an identifier value of zero.  Each
        RTCP message can contain one or more BYE options.  [Multiple
        identifiers in a single BYE option are not allowed, to avoid
        ambiguities between the special value of zero and any necessary
        padding.]


         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |F|     BYE     | length = 1    | content source identifier     |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
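
The bridge shutdown sequence described above can be sketched as follows.
The function names are hypothetical, network byte order is assumed, and the
F bit is left to the caller because its meaning is defined elsewhere in this
document.

```python
import struct

BYE = 35  # option type from the text above


def pack_bye(identifier, f_bit=False):
    """Encode one BYE option: a single 32-bit word with one identifier."""
    if not 0 <= identifier <= 0xFFFF:
        raise ValueError("content source identifiers are 16 bits")
    return struct.pack("!BBH", (0x80 if f_bit else 0) | BYE, 1, identifier)


def bridge_shutdown_options(content_sources):
    """BYE options for every content source, then one with identifier zero."""
    ids = sorted(set(content_sources))
    if any(i == 0 for i in ids):
        raise ValueError("zero is reserved for the synchronization source")
    return [pack_bye(i) for i in ids] + [pack_bye(0)]
```

Each identifier gets its own option, in line with the rule that a single
BYE option never carries multiple identifiers.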



5 Security Considerations


RTP suffers from the same security deficiencies as the underlying protocols,
for example,  the  ability of  an impostor  to  fake source  or  destination
network addresses.

The use of network addresses for identification within the protocol (SDES
option) allows one site to impersonate another.  Impersonation and
denial-of-service attacks can be made more difficult by providing digital
signatures for all or parts of a message.  IP multicast provides no direct
means for a sender to know all the receivers of the data sent.   RTP options
make it easy for all participants in a conference to identify themselves; if
deemed important for a particular  application, it is the responsibility  of
the application writer to  make listening without identification  difficult.
It should be noted, however, that within an internet, privacy of the payload
can generally only be assured by encryption.

The TBD  RTP options  described in  Section  2 allow  the provision  of  the
following security services within this layer:  TBD.



6 RTP over network and transport protocols


This  section  describes  issues  specific  to  carrying  RTP  packets  over
particular network and  transport protocols.   Unless  otherwise noted,  the
mechanisms apply to both the forward (data) and reverse control directions.


6.1 Defaults


The following rules apply unless superseded by protocol-specific subsections
in this section.


6.1.1 Framing


If RTP protocol data units (RPDUs), in either the forward or the reverse
direction, are carried over underlying protocols that provide the
abstraction of a continuous octet stream rather than messages, each RPDU is
prefixed by a 32-bit framing field containing the length of the RPDU
measured in octets, not including the framing field itself.  If an RPDU
traverses a path over a mixture of octet-stream and message-oriented
protocols, each RTP-level bridge between these protocols is responsible for
adding and removing the framing field.  A profile may specify that framing
is also to be used for protocols that do provide framing, in order to allow
carrying several RTP packets in one underlying protocol data unit.
[Carrying several RTP packets in one network or transport packet reduces
header overhead and may ease synchronization between different streams.]
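
A minimal sketch of the framing rule follows.  The function names are
hypothetical, and the framing field is assumed to be an unsigned integer in
network byte order, which this section does not state explicitly.

```python
import struct


def frame(rpdu):
    """Prefix an RPDU with its length in octets (framing field excluded)."""
    return struct.pack("!I", len(rpdu)) + rpdu


def deframe(buffer):
    """Split complete RPDUs off the front of a received octet stream.

    Returns (rpdus, leftover); leftover holds a partial RPDU, if any,
    and should be prepended to the next chunk read from the stream.
    """
    rpdus, offset = [], 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("!I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # the rest of this RPDU has not arrived yet
        rpdus.append(bytes(buffer[offset + 4:offset + 4 + length]))
        offset += 4 + length
    return rpdus, bytes(buffer[offset:])
```

A bridge from an octet-stream protocol to a message-oriented one would call
deframe() on the incoming stream and forward each RPDU as its own message;
a bridge in the other direction would call frame() on each message.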


6.1.2 RTA option


Port numbers (or equivalent) are by default two octets long.



6.2 UDP


The format of the RTA option is shown below.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    RTA      |  length = 2   |    format     | protocol = 17 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       0       |       0       |        UDP port number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


6.3 TCP


The format of the RTA option is shown below.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    RTA      |  length = 2   |    format     | protocol = 6  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       0       |       0       |        TCP port number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


6.4 ST-II


The next protocol field (``NextPCol'', Section 4.2.2.10 in RFC-1190) is used
to distinguish two encapsulations of RTP over ST-II.  The first uses NextPCol
value TBD and places the RTP packet directly into the ST-II data area.  The
second uses NextPCol value TBD and precedes the RTP header with the 32-bit
header shown below.  The byte count gives the number of bytes in the RTP
header and payload to be checksummed.  The 16-bit checksum uses the TCP and
UDP checksum algorithm.



  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | count of bytes to be checked  |           check sum           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | ... RTP header ...                                           ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
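
The checksummed encapsulation can be sketched as follows.  The function
names are hypothetical.  The sketch assumes network byte order, that the
byte count covers the whole RTP header and payload, and that the checksum is
the plain 16-bit ones-complement sum over exactly those bytes, with no
pseudo-header and without the 32-bit header itself; none of these details is
fixed by the text above.

```python
import struct


def internet_checksum(data):
    """16-bit ones-complement checksum as used by TCP and UDP."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero octet
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def add_st2_header(rtp_packet):
    """Prepend the byte count and checksum to an RTP packet."""
    if len(rtp_packet) > 0xFFFF:
        raise ValueError("byte count must fit in 16 bits")
    return struct.pack("!HH", len(rtp_packet),
                       internet_checksum(rtp_packet)) + rtp_packet


def verify_st2(data):
    """Check the byte count and checksum of a received ST-II data area."""
    if len(data) < 4:
        return False
    count, checksum = struct.unpack_from("!HH", data)
    body = data[4:4 + count]
    return len(body) == count and internet_checksum(body) == checksum
```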

The format of the RTA option is shown below.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    RTA      |  length = 2   |    format     | protocol = 5  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       0       |       0       | ST-II service access pt (SAP) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


A Implementation Notes


This section describes one possible implementation of the part of the
receiver that maps incoming RTP packets to sources.  The receiver maintains
a single table of all sources, content and synchronization sources alike.
Synchronization sources are stored with a content source value of zero.
When an RTP packet arrives, the receiver determines its network source
address and port (from information returned by the operating system),
synchronization source (SSRC option) and content source(s) (CSRC option).
To locate the table entry containing timing information, the mapping from
content descriptor to actual encoding, etc., the receiver sets the content
source to zero and locates a table entry based on the triple (network
address and port, synchronization source identifier, 0).

The receiver identifies  the contributors to  the packet  (for example,  the
speaker who is  heard in  the packet) through  the list  of content  sources
carried in the CSRC option.   To locate the  table entry, it matches on  the
triple (network address and port, synchronization source identifier, content
source).

Note that  since  network  addresses  are  only  generated  locally  at  the
receiver, the receiver can choose whatever format seems most appropriate for
matching.  For example, a Berkeley Unix-based system may use struct sockaddr
data types if it expects network sources with non-IP addresses.
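
The table lookup described in this appendix can be sketched as follows.  The
class and method names are hypothetical, the network address is represented
as a (host, port) pair, and creating an entry the first time a triple is
seen is an assumption rather than something the text prescribes.

```python
class SourceTable:
    """One table for synchronization and content sources alike."""

    def __init__(self):
        self._entries = {}

    def __len__(self):
        return len(self._entries)

    def _entry(self, net_addr, ssrc, csrc):
        # Key is the triple (network address and port, SSRC, content source).
        return self._entries.setdefault((net_addr, ssrc, csrc), {})

    def lookup(self, net_addr, ssrc, csrcs=()):
        """Return (sync_entry, contributor_entries) for an arriving packet."""
        # Timing state and the content-descriptor mapping live in the
        # entry whose content source is zero.
        sync = self._entry(net_addr, ssrc, 0)
        contributors = [self._entry(net_addr, ssrc, c) for c in csrcs]
        return sync, contributors
```

A dictionary keyed on the triple keeps both lookups to a single hash probe
each; any other representation of the network address would work equally
well, since it never leaves the receiver.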




Acknowledgments


This draft  is based  on discussion  within the  IETF audio-video  transport
working group  chaired by  Stephen Casner.    The current  protocol has  its
origins in the Network Voice Protocol  and the Packet Video Protocol  (Danny
Cohen and Randy Cole) and the protocol implemented by the 'vat'  application
(Van Jacobson and Steve McCanne).



B Addresses of Authors


Stephen Casner
USC/Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292-6695
telephone:  +1 310 822 1511 (extension 153)
electronic mail:  casner@isi.edu


Henning Schulzrinne
AT&T Bell Laboratories
MH 2A244
600 Mountain Avenue
Murray Hill, NJ 07974
telephone:  +1 908 582 2262
electronic mail:  hgs@research.att.com



References


[1] J.  Postel, ``Internet  protocol,'' Network  Working  Group Request  for
    Comments RFC 791, Information Sciences Institute, Sept. 1981.

[2] International   Standards  Organization,   ``ISO/IEC  DIS   10646-1:1993
    information technology  -- universal multiple-octet coded character  set
    (UCS) -- part I: Architecture and basic multilingual plane,'' 1993.

[3] The Unicode Consortium, The Unicode Standard.  New York, New York:
    Addison-Wesley, 1991.

[4] D.  L. Mills,  ``Network  time protocol  (version 3)  --  specification,
    implementation  and  analysis,''  Network  Working   Group  Request  for
    Comments RFC 1305, University of Delaware, Mar. 1992.

[5] P.  Mockapetris, ``Domain names  -- concepts  and facilities,''  Network
    Working Group Request for Comments RFC 1034, ISI, Nov. 1987.


[6] P. Mockapetris,  ``Domain names  -- implementation and  specification,''
    Network Working Group Request for Comments RFC 1035, ISI, Nov. 1987.