Internet Engineering Task Force        Audio-Video Transport WG & Others
INTERNET-DRAFT                                  D. Singer & P. Westerink
draft-singer-mpeg4-ip-00                            Apple Computer & IBM
                                                             July 3 2000
                                               Expires: January  3, 2001
                                             MPEG Document number: M6150

     A Framework for the delivery of MPEG-4 over IP-based Protocols

Status of This Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

     The list of current Internet-Drafts can be accessed at

     The list of Internet-Draft Shadow Directories can be accessed at

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on (Africa), (Europe), (Pacific Rim), (US East Coast), or (US West Coast).

Distribution of this document is unlimited.


   This document forms an umbrella specification for the carriage and
   operation of MPEG-4 multimedia sessions over IP-based protocols,
   including RTP, RTSP, and HTTP, among others.

   It also serves to document the standard MIME types associated with
   MPEG-4 content.

1 Introduction

   MPEG-4 is a complex multimedia system designed for delivery over a
   variety of transport protocols.  It includes scene management,

D. Singer, P Westerink                                          [Page 1]

Internet Draft          draft-singer-mpeg4-ip-00             July 3 2000

   interactivity, video, audio, and other streams.

   This document provides a number of specifications for the detailed
   mapping of MPEG-4 into several IP-based protocols.

   Open issues:  it might be desirable to signal to the terminal the
   amount of buffering assumed by the encoding/transmission process (in
   addition to any network jitter).

2 Use of RTP

   There are a number of Internet Drafts describing RTP packetization
   schemes for MPEG-4 data [5] [6] [7] [8] [9].  This draft does not
   specify any new one.  Media-aware packetization (e.g. video frames
   split at recoverable sub-frame boundaries) is a principle in RTP, and
   thus it is likely that several RTP schemes will be needed, to suit
   both the different kinds of media - audio, video, etc. - and
   different encodings (e.g. AAC and CELP audio codecs).

   This specification requires that, no matter what packetization scheme
   is used, there are a number of common characteristics that all MUST
   share:

   2.1]  The RTP timestamp corresponds to the CTS of the earliest AU
   within the packet.

   2.2]  RTP packets SHOULD have sequence numbers, and be sent, in
   decoding order.  Note that in the case where multiple, interleaved
   access units are sent in one packet, this will not be adhered to
   completely.  However, any transmission scheme MUST respect decoding
   dependencies in any re-ordering it does, so that dependent access
   units do not arrive before the access units they depend on.

   2.3]  The MPEG-4 timescale (clock ticks per second) SHOULD be used as
   the RTP timescale, e.g. as declared in RTP.

   2.4]  To achieve a base level of interoperability, and to ensure that
   any MPEG-4 stream may be carried, all senders and receivers SHOULD
   implement the simple scheme for the carriage of one elementary stream
   over RTP [8].  It is not a requirement that any particular session
   use this scheme for any particular stream, merely that every terminal
   be able to receive this scheme.

   2.5]  There is a single payload format for the carriage of Flexmux
   over RTP [5].  Senders and receivers MAY implement this scheme.

   2.6]  Streams SHOULD be synchronized using RTP techniques (notably
   RTCP sender reports);  the MPEG-4 OCR is logically mapped to the NTP
   time axis used in RTCP.
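
   As an illustration of rules 2.1 and 2.3, the following Python sketch
   derives the RTP timestamp of a packet from the composition timestamps
   (CTS) of the access units it carries.  The function name, the
   initial_offset parameter, and the reading of "earliest AU" as the AU
   with the smallest CTS are assumptions made for this example;  they
   are not defined by this specification.

```python
def rtp_timestamp(cts_ticks, initial_offset=0):
    """Return a 32-bit RTP timestamp for one packet.

    cts_ticks: CTS values of the access units in the packet, in ticks of
               the MPEG-4 timescale (assumed equal to the RTP timescale,
               as rule 2.3 recommends).
    initial_offset: hypothetical per-stream offset chosen by the sender.
    """
    if not cts_ticks:
        raise ValueError("packet carries no access units")
    earliest = min(cts_ticks)  # rule 2.1: CTS of the earliest AU (assumed: smallest CTS)
    return (earliest + initial_offset) & 0xFFFFFFFF  # RTP timestamps are 32 bits wide


# Two interleaved access units in one packet, 90000 ticks per second.
print(rtp_timestamp([93000, 90000]))  # prints 90000
```

   The mask keeps the value inside the 32-bit RTP timestamp field, so
   the result wraps around once the sum exceeds that range.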

   Other payload formats MAY be used.  They are signalled as dynamic
   payload IDs, defined by a suitable name (e.g. a payload name in an
   SDP RTPMAP attribute).  In particular, the development of specialized
   RTP payloads for video (e.g. respecting video packets) and audio
   (e.g. providing interleave [10]) is expected.  It is possible that
   these schemes can be compatible with the simple scheme required here.

   For those streams requiring reliable delivery, the recommendation is
   to build on existing IETF work in this area (including, but not
   limited to, FEC, re-transmission, or repetition), rather than making
   reliability a characteristic of the packing scheme itself.
   However, techniques in combined source/channel coding, or error-
   correction which is dependent on the coding scheme, may make other
   schemes attractive [9].

3 SDP Information

   This specification currently assumes that any session described by
   SDP (e.g. in SAP, as a file download, as a DESCRIBE over RTSP) has at
   most one MPEG-4 session.  It is desirable that this restriction be
   lifted.

   3.1] Senders SHOULD alert receivers that an MPEG-4 session is
   included, by means of an SDP attribute that is general (i.e. before
   any "media" lines).  This takes the form of an attribute line:

   a=mpeg4-iod [<location>]

   In an RTSP session, the location is optional.  If the location is not
   supplied, the IOD is retrieved over the RTSP session by using
   DESCRIBE with an accept of type application/mpeg4-iod. Where the SDP
   information is supplied by some other means (e.g. as a file, in SAP),
   the location is obligatory and should be a URL enclosed in double-
   quotes, which will supply the IOD (e.g. small ones may be encoded
   using DATA:, or otherwise HTTP: or other suitable file-access URL).
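
   A receiver might detect the attribute described in 3.1 along the
   following lines.  This is a minimal Python sketch, not a complete
   SDP parser;  the function name and the returned tuple are
   assumptions made for the example, and the attribute syntax is taken
   literally from the line shown above.

```python
def find_iod_attribute(sdp_text):
    """Look for a session-level a=mpeg4-iod line (one before any m= line).

    Returns (present, location); location is None when it was omitted.
    """
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            break  # first media line reached: session-level part is over
        if line == "a=mpeg4-iod" or line.startswith("a=mpeg4-iod "):
            rest = line[len("a=mpeg4-iod"):].strip()
            location = rest.strip('"') or None  # the URL is enclosed in double quotes
            return True, location
    return False, None


# Hypothetical session description used only for illustration.
sdp = 'v=0\ns=Example\na=mpeg4-iod "http://example.com/scene.iod"\nm=video 0 RTP/AVP 96\n'
print(find_iod_attribute(sdp))
```

   When the attribute is present without a location, the receiver is
   expected to fetch the IOD with a further DESCRIBE over RTSP, as
   described above.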

   3.2] The mapping of RTP streams to elementary streams needs to be
   declared, covering the Flexmux case as well as the single stream.
   Within the SDP information, a stream-specific attribute SHOULD be
   present for each MPEG-4 stream.  It takes one of two forms, depending
   on whether a single elementary stream, or a flexmux, is carried.

   a=mpeg4-esid a.b.c

   or

   a=mpeg4-esids m.i:a.b.c,n.k:p.q

   The first attribute is used for single streams; the second for
   flexmux.  In this, a is the ESID of the top-level OD stream (declared
   within the IOD), b is the ESID within that scope of another OD
   stream, and c is the ESID within that scope of the stream;  similarly
   for p and q.  m and n are muxcodes (identifying the table used), and
   i and k are flexmux channel numbers within the indicated muxcode.
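
   The two attribute forms can be parsed as sketched below in Python.
   The function names and the returned data structures (a tuple for an
   ESID path, a dictionary keyed by muxcode and channel for flexmux)
   are assumptions made for the example.

```python
def parse_esid_path(text):
    """Parse a dotted ESID path such as '1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def parse_esid_attribute(line):
    """Parse an a=mpeg4-esid or a=mpeg4-esids attribute line.

    Single stream: returns the ESID path as a tuple.
    Flexmux: returns {(muxcode, channel): ESID path}.
    """
    name, _, value = line.strip().partition(" ")
    if name == "a=mpeg4-esid":
        return parse_esid_path(value.strip())
    if name == "a=mpeg4-esids":
        mapping = {}
        for entry in value.split(","):
            mux, _, path = entry.strip().partition(":")
            muxcode, channel = (int(x) for x in mux.split("."))
            mapping[(muxcode, channel)] = parse_esid_path(path)
        return mapping
    raise ValueError("not an mpeg4-esid(s) attribute: " + line)


print(parse_esid_attribute("a=mpeg4-esid 1.2.3"))  # prints (1, 2, 3)
```

   Keying the flexmux mapping by the (muxcode, channel) pair lets a
   demultiplexer look up the ESID path of each channel directly.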

   3.3] The flexmux stream also needs a muxcode table supplied to the
   receiver.  These are indicated via a stream-level attribute.

   a=mpeg4-muxcodetable <location>

   where <location> is a URL enclosed in double quotes, that will supply
   the table(s).  If they are small, a DATA: URL will probably suffice
   to carry them in-line. If not, the URL should use a file-retrieval
   scheme (e.g. HTTP, FTP). The data at the indicated URL consists of
   some number of concatenated muxcode tables, complete, in binary
   format (but note that DATA URLs allow for base64 encoding of binary
   data, which would be needed here).  The MIME type of a muxcode table
   needs deciding (application/mpeg4-muxcodetable). These tables have an
   intrinsic length, so simple concatenation suffices.
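
   For the in-line case, a receiver has to recover the binary table(s)
   from a DATA: URL.  The Python sketch below shows one way to do so
   with the standard library only.  The function name is an assumption
   made for the example, and the payload bytes in the usage line are
   placeholders rather than a real muxcode table.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_url(url):
    """Return (mime_type, payload_bytes) for a data: URL."""
    scheme, _, rest = url.partition(":")
    if scheme.lower() != "data":
        raise ValueError("not a data: URL")
    header, sep, data = rest.partition(",")
    if not sep:
        raise ValueError("data: URL has no comma separator")
    params = header.split(";")
    mime_type = params[0] or "text/plain"  # default type when none is given
    if "base64" in params[1:]:
        return mime_type, base64.b64decode(data)
    return mime_type, unquote_to_bytes(data)  # percent-encoded payload


# Placeholder payload: the three bytes 0x00 0x01 0x02, base64-encoded.
print(decode_data_url("data:application/mpeg4-muxcodetable;base64,AAEC"))
```

   The caller is assumed to have removed the enclosing double quotes
   from the attribute value before passing the URL in.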

4 MIME Types

   Amendment 1 of the MPEG-4 standard (also known as version 2) is
   nearing completion, and includes a standard file type for
   encapsulating MPEG-4 data.  This file type can be used in a number of
   ways:  perhaps the most important are its use as an interchange
   format for MPEG-4 data, its use as a content-download format, and as
   the format read by streaming media servers.

   These first two uses will be greatly facilitated if there is a
   standard MIME type for serving these files (e.g. over HTTP).

   The MPEG-4 standard is broad, and therefore the type of data which
   may be in such a file can vary.  In brief, simple compressed video
   and audio (using a number of different compression algorithms) can be
   included;  interactive scene information;  meta-data about the
   presentation;  references to MPEG-4 media streams outside the file;
   and so on.

   The historical approach for MPEG data is to declare it under "video".
   Though MPEG-4 is considerably broader than MPEG-1 and MPEG-2, I
   believe that we should follow this precedent, as "multipart" seems
   inappropriate and "application" too diffuse.  However, MPEG-4 may be
   used for a purely audio environment, and in that case the type
   audio/mpeg4 should be used.  In either case, these indicate files
   conforming to the "MP4" specification (ISO/IEC 14496 Amendment 1,
   systems file format).

   4.1] When an MP4 file is served (e.g. over HTTP) or otherwise must be
   identified by a MIME type, the type "video/mpeg4" SHOULD be used.
   The type "audio/mpeg4" MAY be used when the MPEG-4 presentation
   contained within the MP4 file has no visual aspects and is entirely
   audio.

   4.2]  In some cases, the initial object descriptor needs to be
   identified with a MIME type. In this case, the type
   "application/mpeg4-iod" SHOULD be used.

   4.3] In some cases, the muxcode table needed by a flexmux decoder
   needs to be identified with a MIME type. In this case, the type
   "application/mpeg4-muxcodetable" SHOULD be used.

   4.4] The payload names used in an RTPMAP attribute within SDP, to
   specify the mapping of payload number to its definition, also come
   from the MIME namespace.  Each of the RTP payload mappings defined
   above has a distinct name.  For those payloads carrying a variety of
   MPEG stream types, the name SHOULD be drawn from the "video"
   namespace.  For those payloads specific to audio only, the name
   SHOULD be drawn from the "audio" namespace.
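
   The type selection for files and auxiliary objects described in this
   section can be summarized in a short Python sketch.  The function
   name and the "mp4", "iod" and "muxcodetable" labels are hypothetical
   and serve only to illustrate the rules above.

```python
def mime_type_for(kind, has_visual=True):
    """Return the MIME type this document assigns to an MPEG-4 object.

    kind: "mp4", "iod" or "muxcodetable" (hypothetical labels).
    has_visual: for MP4 files, whether the presentation has visual aspects.
    """
    if kind == "mp4":
        # audio/mpeg4 is optional (MAY) for audio-only presentations;
        # this sketch chooses to use it whenever it applies.
        return "video/mpeg4" if has_visual else "audio/mpeg4"
    if kind == "iod":
        return "application/mpeg4-iod"
    if kind == "muxcodetable":
        return "application/mpeg4-muxcodetable"
    raise ValueError("unknown kind: " + kind)


print(mime_type_for("mp4", has_visual=False))  # prints audio/mpeg4
```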

   Given the broad and general nature of MPEG-4, and the interactive
   environment, it is hard to say that there are no security
   considerations.  However, none are known to the author at this time,
   and the standard was developed with the intent that there be none.

MIME media type name:              video, and audio
MIME subtype name:                 mpeg4

MIME media type name:              application
MIME subtype name:                 mpeg4-iod, mpeg4-muxcodetable
Required parameters:               none
Optional parameters:               none
Encoding considerations:           base64 generally preferred; files are
                                   binary and should be transmitted
                                   without CR/LF conversion, 7-bit
                                   stripping etc.
Security considerations:           None known at the time of writing
Interoperability considerations:   A number of interoperating
                                   implementations exist within the
                                   MPEG-4 community;  and that community
                                   has reference software for reading
                                   and writing the file format.
Published specification:           Pending (ISO/IEC 14496, MPEG-4
                                   Systems Amendment 1).
Applications:                      Multimedia
Additional information:

Magic number(s):                   none
File extension(s):                 mp4 and mpg4 are both declared at
Macintosh File Type Code(s):       mpg4  is registered with Apple

Person to contact for info:        David Singer,

Intended usage:                    Common

Author/Change controller:          David Singer, MPEG-4 file format

5  RTSP usage

   RTSP may be used as a session control protocol for sessions which
   carry MPEG-4 information.  When RTSP is used as a session-control
   protocol:

   5.1]  RTP SHOULD be used as the transport protocol.

   5.2] The initial DESCRIBE format SHOULD be SDP.  If the SDP
   information reveals that an IOD is needed, and the terminal does not
   already have it, then a second DESCRIBE accepting an IOD SHOULD be
   performed (see above).

   5.3] Note that if all MPEG-4 streams are closed (TEARDOWN) then the
   RTSP session ID will be lost.  The next (re-)opened stream will
   supply a new session ID.  Care should be taken that the target of the
   URL has not changed in the interval;  new DESCRIBEs may be needed.
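
   The two-step retrieval in 5.2 can be illustrated by the request text
   a terminal would send.  The Python sketch below only builds the
   requests;  the function name, the URL and the CSeq values are
   assumptions made for the example, and no network access is
   performed.

```python
def describe_request(url, cseq, accept):
    """Build the text of an RTSP/1.0 DESCRIBE request."""
    return (
        f"DESCRIBE {url} RTSP/1.0\r\n"
        f"CSeq: {cseq}\r\n"
        f"Accept: {accept}\r\n"
        "\r\n"
    )


url = "rtsp://example.com/movie"  # hypothetical presentation URL

# First DESCRIBE: ask for the SDP description of the session.
first = describe_request(url, 1, "application/sdp")

# Second DESCRIBE: sent only if the SDP shows that an IOD is needed
# and the terminal does not already have it.
second = describe_request(url, 2, "application/mpeg4-iod")

print(first + second)
```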


   This draft has benefited greatly from the contributions of many
   people, including Mike Coleman, Jean-Claude Duford, Carsten Herpel,
   Olivier Avaro, Paul Christ, and many others.  Their insight,
   foresight, and contributions are gratefully acknowledged.  Little has
   been invented here by the authors;  this is mostly a collation of
   greatness that has gone before.


   [1] H. Schulzrinne et al., "RTP: A Transport Protocol for Real-
   Time Applications", IETF RFC 1889, January 1996.

   [2] H. Schulzrinne et al., "RTP Profile for Audio and Video
   Conference with Minimal Control", IETF RFC 1890, January 1996.

   [3] H. Schulzrinne et al., "Real Time Streaming Protocol", IETF
   Draft, draft-ietf-mmusic-rtsp-09.txt, February 2 1998, Expires:
   August 2 1998.

   [4] M. Handley, "SDP: Session Description Protocol", IETF Draft,
   draft-ietf-mmusic-sdp-05.txt, November 21 1997, Expires: November 21

   [5] C. Roux et al., "RTP Payload Format for Flexmultiplexed MPEG-4
   Streams", IETF Draft, draft-rgcc-avt-mpeg4flexmux-00, March 9 2000,
   expires Sept 9 2000

   [6] Y. Kikuchi et al., "RTP payload format for MPEG-4
   Audio/Visual streams", IETF Draft, draft-ietf-avt-rtp-mpeg4-es-01,
   Feb 1 2000, expires Aug 1 2000

   [7] C. Guillemot et al., "RTP Payload Format for MPEG-4 with Flexible
   Error Resiliency", IETF Draft, draft-ietf-avt-mpeg4streams-00, March
   1 2000, expires Sept 1 2000

   [8] R. Civanlar et al., "RTP Payload Format for MPEG-4 Streams", IETF
   Draft, draft-ietf-avt-rtp-mpeg4-02, ?? 2000, expires ?? 2000

   [9] C. Guillemot et al., "RTP payload format for MPEG-4 Visual
   Advanced Profiles", IETF Draft, draft-gc-avt-mpeg4visual-00.txt,
   March 1 2000, expires Sept 1 2000

   [10] R. Finlayson, "A More Loss-Tolerant RTP Payload Format for MP3
   Audio", IETF Draft, draft-ietf-avt-rtp-mp3-01.txt, Mar 10 2000,
   expires Sep 10 2000

Authors' Contact Information
   David Singer
   Tel: +1 408 974 3162

   Apple Computer, Inc.
   One Infinite Loop, MS:302-3MT
   Cupertino  CA 95014

   Peter Westerink
   Tel: +1 914 784 7173

   30 Saw Mill River Road
   Hawthorne, NY 10532
