Network Working Group                                          M. Civanlar
Request for Comments: 2343                                         G. Cash
Category: Experimental                                          B. Haskell
                                                        AT&T Labs-Research
                                                                  May 1998

                  RTP Payload Format for Bundled MPEG

Status of this Memo

   This memo defines an Experimental Protocol for the Internet
   community.  This memo does not specify an Internet standard of any
   kind.  Discussion and suggestions for improvement are requested.
   Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1998).  All Rights Reserved.


Abstract

   This document describes a payload type for bundled, MPEG-2 encoded
   video and audio data that may be used with RTP, version 2. Bundling
   has some advantages for this payload type, particularly when it is
   used for video-on-demand applications. This payload type may be used
   when its advantages are important enough to sacrifice the modularity
   of having separate audio and video streams.

1. Introduction

   This document describes a bundled packetization scheme for MPEG-2
   encoded audio and video streams using the Real-time Transport
   Protocol (RTP), version 2 [1].

   The MPEG-2 International standard consists of three layers: audio,
   video and systems [2]. The audio and the video layers define the
   syntax and semantics of the corresponding "elementary streams." The
   systems layer supports synchronization and interleaving of multiple
   compressed streams, buffer initialization and management, and time
   identification.  RFC 2250 [3] describes packetization techniques to
   transport individual audio and video elementary streams as well as
   the transport stream, which is defined at the systems layer, using
   RTP.

   The bundled packetization scheme is needed because it has several
   advantages over other schemes for some important applications,
   including video-on-demand (VOD), where audio and video are always
   used together.  Its advantages over independent packetization of
   audio and video are:

     1. Uses a single port per "program" (i.e., bundled A/V).  This may
     increase the number of streams that can be served, e.g., from a VOD
     server. Also, it eliminates the performance hit when two ports are
     used for the separate audio and video streams on the client side.

     2. Provides implicit synchronization of audio and video.  This is
     particularly convenient when the A/V data is stored in an
     interleaved format at the server.

     3. Reduces the header overhead. Since using large packets increases
     the effects of losses and delay, audio-only packets need to be
     smaller, which increases the overhead. An A/V bundled format can
     provide about 1% overall overhead reduction. Considering the high
     bitrates used for MPEG-2 encoded material, e.g., 4 Mbps, the number
     of bits saved, e.g., 40 Kbps, may provide noticeable audio or video
     quality improvement.

     4. May reduce overall receiver buffer size. Audio and video streams
     may experience different delays when transmitted separately. The
     receiver buffers need to be designed for the longest of these
     delays. For example, assume that using two buffers, each with
     a size B, is sufficient with probability P when each stream is
     transmitted individually. The probability that the same buffer size
     will be sufficient when both streams need to be received is P times
     the conditional probability of B being sufficient for the second
     stream given that it was sufficient for the first one. This
     conditional probability is generally less than one, requiring use
     of a larger buffer size to achieve the same probability level.

     5. May help with the control of the overall bandwidth used by an
     A/V program.
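
   The arithmetic behind items 3 and 4 can be sketched as follows.
   This is only an illustration under stated assumptions: the 40-byte
   per-packet header (IPv4 + UDP + fixed RTP header), the 256 kbps audio
   rate, the 1400- and 320-byte payload sizes, and the value P = 0.99
   are hypothetical choices, not figures from this memo.  Only the
   "1% of 4 Mbps" example is taken from the text above.  Payload-specific
   headers are ignored for simplicity.

```python
# Illustrative sketch only; all constants are assumptions unless noted.

HEADER_BYTES = 20 + 8 + 12  # assumed IPv4 + UDP + fixed RTP header per packet


def header_overhead_bps(payload_bps, payload_bytes_per_packet,
                        header_bytes=HEADER_BYTES):
    """Header bits per second needed to carry payload_bps when each
    packet holds payload_bytes_per_packet bytes of payload."""
    packets_per_second = payload_bps / (8 * payload_bytes_per_packet)
    return packets_per_second * header_bytes * 8


def saved_bps(total_bps, percent_saved):
    """Bits per second saved by a reduction of percent_saved percent
    (integer arithmetic)."""
    return total_bps * percent_saved // 100


def joint_sufficiency(p, conditional):
    """Probability that buffer size B suffices for both streams:
    P times the conditional probability for the second stream."""
    return p * conditional


if __name__ == "__main__":
    video_bps, audio_bps = 4_000_000, 256_000  # assumed stream rates
    large, small = 1400, 320                   # assumed payload bytes/packet

    # Item 3: separate streams (small audio packets) vs. one bundled stream.
    separate = (header_overhead_bps(video_bps, large)
                + header_overhead_bps(audio_bps, small))
    bundled = header_overhead_bps(video_bps + audio_bps, large)
    print(f"separate: {separate:.0f} bps, bundled: {bundled:.0f} bps")

    # The memo's own example: 1% of 4 Mbps.
    print(saved_bps(4_000_000, 1), "bps saved")

    # Item 4: with assumed P = 0.99 and independent delays, the
    # conditional probability equals P, so the joint value is below P.
    print(joint_sufficiency(0.99, 0.99))
```

   Under these assumptions the bundled stream spends fewer bits on
   headers than the two separate streams, and the joint probability of
   buffer size B being sufficient falls below P whenever the conditional
   probability is less than one.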

   The advantages over packetization of the transport layer streams
   are:

     1. Reduced overhead. The bundled format does not contain systems
     layer information, which is redundant with RTP (essentially, they
     address similar issues).

     2. Easier error recovery. Because of the structured packetization
     consistent with the application layer framing (ALF) principle, loss
     concealment and error recovery can be made simpler and more
     effective.
