Network Working Group                                        S. Pfeiffer
Internet-Draft                                                 C. Parker
Expires: June 30, 2004                                           A. Pang
                                                                   CSIRO
                                                       December 31, 2003
 The Annodex annotation format for time-continuous bitstreams, Version
                                  2.0
                       draft-pfeiffer-annodex-01
Status of this Memo
   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 except that the right to
   produce derivative works is not granted.
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.
   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.
   This Internet-Draft will expire on June 30, 2004.
Copyright Notice
   Copyright (C) The Internet Society (2003). All Rights Reserved.
Abstract
   This specification defines a file format for annotating and indexing
   time-continuous bitstreams for the World Wide Web. The format has
   been named "Annodex" for annotating and indexing. The Annodex format
   enables the specification of named anchor points in time-continuous
   bitstreams together with textual annotations and hyperlinks in URI
   [4] format. These anchor points are merged time-synchronously with
   the time-continuous bitstreams when authoring a file in Annodex
   format. The ultimate aim of the Annodex format is to enable an
   integration of time-continuous bitstreams into the browsing and

Pfeiffer, et al.         Expires June 30, 2004                  [Page 1]


Internet-Draft                  ANNODEX                    December 2003
   searching functionality of the World Wide Web.
   At this point in time, the right to produce derivative works is not
   granted to the IETF as the authors are uncertain about the necessity
   to create a working group. The specification is not encumbered by
   patents. The Annodex format is protected by a trademark to prevent
   the use of the term "Annodex" for any related but non-conformant and
   therefore non-interoperable technology. Conformant technology is
   encouraged to use the term "Annodex" when referring to the file
   format.
   Note that this version of the Internet-Draft specifies Annodex 2.0,
   which replaces the Annodex 1.0 of the previous version.
Table of Contents
   1.    Introduction . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.    The architecture of a Continuous Media Web . . . . . . . . .  5
   3.    Overview of the Annodex bitstream format . . . . . . . . . .  7
   4.    Handling time in the Annodex format bitstream  . . . . . . .  9
   5.    Media encapsulation format . . . . . . . . . . . . . . . . . 12
   5.1   Media mapping for Ogg encapsulation  . . . . . . . . . . . . 12
   5.2   The format of the Annodex media mapping bos  . . . . . . . . 13
   5.3   The format of the media and annotation bitstream media
         mapping bos  . . . . . . . . . . . . . . . . . . . . . . . . 15
   6.    The decoding of Annodex format bitstreams to CMML  . . . . . 18
   7.    MIME media type registration for 'application/annodex' . . . 20
   7.1   URI addressing into Annodex files  . . . . . . . . . . . . . 21
   7.1.1 Query parameters for use with the HTTP protocol
         server-side  . . . . . . . . . . . . . . . . . . . . . . . . 21
   7.1.2 Fragment identifiers for use with the HTTP protocol
         client-side  . . . . . . . . . . . . . . . . . . . . . . . . 21
   7.1.3 HTTP 'Accept' header field interpretation  . . . . . . . . . 22
   8.    Security considerations  . . . . . . . . . . . . . . . . . . 23
         References . . . . . . . . . . . . . . . . . . . . . . . . . 24
         Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 25
   A.    Definitions of terms and abbreviations . . . . . . . . . . . 26
   B.    Glossary of acronyms . . . . . . . . . . . . . . . . . . . . 27
   C.    Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 28
         Intellectual Property and Copyright Statements . . . . . . . 29



1. Introduction
   When searching the World Wide Web, time-continuous data such as audio
   and video files are still treated as "dark matter" outside the
   existing infrastructure of the World Wide Web. It is not possible to
   look inside such files, to search their content through common
   text-based search engines, or to hyperlink directly to points of
   interest inside them. Such a file can only be consumed from its start
   until the point of interest is reached. In addition, such files are
   "dead ends" in that by consuming their content the hyperlinking
   functionality of the Web is left behind.
   This document specifies a file format that enables integrated
   handling of time-continuous data on the World Wide Web. By
   interleaving XML markup with the time-continuous data, the internal
   structure and content of the time-continuous data file become
   accessible: the file becomes annotated and indexed, or an "Annodex"
   file. The Annodex format, together with the Continuous Media Markup
   Language (CMML) [15] and the URI standard [4] extended by temporal
   URI references [14], forms the base technology that enables searching
   and surfing of time-continuous data via existing Web infrastructure.
   The Annodex format enables encapsulation of any type of streamable
   time-continuous bitstream format thus being independent of current or
   future compression formats. The XML tags were chosen to be very
   similar to XHTML to enable a simple transfer of knowledge for HTML
   authors.
   The file extension of Annodex files is ".anx". This document also
   applies for registration of the MIME type "application/annodex" for
   Annodex format bitstreams.
   The structure of this document is as follows: this introduction is
   followed by a section describing the architecture of a Continuous
   Media Web based on Annodex format data. The next section gives an
   overview of the Annodex file format, including the annotation
   bitstream. The handling of the different time constructs in Annodex
   is quite complex and is explained in section 4. Section 5 then
   describes in detail how media encapsulation is performed and what the
   multiplexing format consists of. How to extract the annotation and
   meta data content of an Annodex stream into a CMML file is explained
   in section 6. The MIME type application and security considerations
   constitute the final sections.
   Please note that this document assumes that the reader has a fluent
   working knowledge of XML [1], HTML [2], XHTML [3] and the World Wide
   Web. Deep knowledge of the Ogg encapsulation format version 0 [11] is
   also a prerequisite to understanding this specification. It is also a
   sister document to the specification of the Continuous Media Markup
   Language (CMML Version 2.0) [15] for authoring annotations for
   time-continuous data and for steering the encoding of Annodex format
   bitstreams.
2. The architecture of a Continuous Media Web
   As with Web pages, Annodex format bitstreams first have to be authored
   and then published on a server. Authoring includes the creation of
   the media bitstream plus the creation of annotations (i.e. textual
   data descriptions), indexes (i.e. anchor points) and hyperlinks (i.e.
   URIs [4]) for clips of the media data. Annotations, indexes and
   hyperlinks are created in "head" and "clip" tags conformant to the
   CMML [15] specification, and interleaved into the media document to
   create Annodex format bitstreams in a time-synchronous fashion. This
   procedure can be performed both on files and live streams. The
   collection of Annodex format bitstreams on the Internet is called the
   Continuous Media Web as it builds a Web of time-continuous resources.
   Distribution of Annodex format bitstreams happens via a network
   protocol such as HTTP [5] or RTP/RTSP [7]. The basic process is the
   following: The client dispatches a download or streaming request to
   the server with a certain URI. The server resolves the URI and starts
   packetising Annodex format bitstreams, taking into account URI
   addressed offsets or fragments. Currently the distribution with HTTP
   is clear and discussed in this document, while the details of a
   distribution via RTP/RTSP are not yet examined and thus unspecified.
   The Annodex format has been designed to accommodate both reliable and
   unreliable transport. In case of packet loss due to an unreliable
   transport, media data or clip data may get lost; this may be
   important to the application or not. Both media and mark-up data are
   treated with the same importance. If a user doesn't care whether the
   media data is completely received, then the mark-ups will be regarded
   the same way. Clips are typically treated as state changes; if a clip
   tag is lost, the next clip tag will restore the proper state. We
   envisage, however, that a client may require the current state
   information, so there should be a protocol request for sending the
   current state again. This will be delivered by the server by
   inserting another copy of the currently active clip into the Annodex
   bitstream.
   To access the Continuous Media Web, a client such as a conformant Web
   browser is required. A client can link to an Annodex bitstream via a
   URI. A URI can point to a temporal offset in the Annodex bitstream
   using URI time interval specifiers [14] or to a named offset by
   using the id attribute of a clip element as a URI fragment identifier.
   In this way, direct access to points of interest in the media document
   is enabled. While playing back Annodex format bitstreams, the user is
   offered hyperlinks (URIs) to other Web resources which are
   related to the currently displayed media content.
   A client may be a special player or a browser plugin. This
   application must split an Annodex format bitstream into its
   constituent time-continuous data streams and the annotation bitstream
   consisting of "head" and "clip" tags. A decoder is required to
   display the encapsulated media document after decoding it with the
   appropriate media decoder. While playing back the media document, the
   application displays the hyperlinks and the annotations for the
   active clips.
   Search engines can include published Annodex format files in their
   search repertoire by finding annotations in the clip tags in a
   standard way independent of the encoding and packetising format of
   the annodexed time-continuous data streams. This allows any media
   format to be spidered. In addition, the protocol should allow the
   downloading of only the CMML mark-up from a published Annodex format
   file in order to discourage spiders from creating extensive network
   loads, as they do not need to download the media bitstream to gain
   the necessary information. It also reduces the size of search
   archives, even for large amounts of published Annodex format files,
   because a CMML file contains all searchable annotations for the clips
   of its Annodex format file.
3. Overview of the Annodex bitstream format
   The format of Annodex bitstreams consists of interleaved bitstreams
   of time-continuous data and structured XML mark-up of an annotation
   bitstream. It is designed to be used both as a persistent file format
   and as a streaming format. Any encoding format for time-continuous
   data can be encapsulated in the Annodex format as long as it is
   streamable and is based on a regular data sampling rate (called
   granulerate). XML mark-up is inserted between data packets at the
   synchronised point in time.
   An Annodex bitstream is designed to allow several tracks of
   temporally synchronous time-continuous data. Each bitstream track
   represents codec data for one type of time-continuous data stream.
   The annotation bitstream is regarded as one of these data bitstreams
   representing a CMML [15] file. It annotates the Annodex bitstream by
   subdividing it sequentially into clips of data and providing
   annotations for them. Several annotation tracks may be represented in
   one annotation bitstream, which allows the Annodex bitstream to be
   described from different aspects, e.g. by giving different language
   tracks, or by representing a shot structure and a scene structure.
   Thus an Annodex bitstream has the following conceptual structure:
     Annodexed media file with data bitstreams D1-D3 and an annotation bitstream
     with two annotation tracks A1, A2:
       _______________________________________________________________________
   D1  |    |   |        |         |    |        |      |       |   |        |
       _______________________________________________________________________
   D2  |          |            |            |             |          |       |
       _______________________________________________________________________
   D3  |  |   |  |  |   |   |  |  |   |   |  |  |  |   |   |  |   |   |  |   |
       _______________________________________________________________________
   A1  | clip 1                  | clip 2                     | clip 3       |
       _______________________________________________________________________
   A2  | clip 1                       | --  | clip 2      | clip 3           |
       _______________________________________________________________________
   The time axis                                                              t
       |---------------------------------------------------------------------->
   For the purposes of Annodex, data bitstreams are regarded as a
   sequence of data packets, each of which has a timestamp representing
   the time at which the packet data starts and contains all the data
   required for the interval until the next packet starts. Thus, to
   insert a gap in a data bitstream (as in annotation track 2 of the
   example above), a data packet that explicitly annuls the data has to
   be inserted.
   Data bitstreams generally contain the following information:
   o  setup information for a codec
   o  content data
   The setup information is inserted at the start of a data bitstream
   before any content data.
   For the annotation bitstream, which represents a CMML file [15], the
   codec setup information consists of a CMML "head" tag containing
   annotations and meta data for the complete Annodex bitstream. It is
   thus inserted at the start of an annotation bitstream as setup
   information for that bitstream. The content data consists of the CMML
   "clip" tags without timing information. They contain information on
   the fragment of data between the clip's insertion time and the next
   clip on the same track (or the end of the document if none follows).
   CMML "clip" tags are encoded as described in the CMML specification
   [15].
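   The rule that a clip covers the interval from its insertion time to
   the next clip on the same track can be illustrated with a small
   Python sketch. It is not part of the format: the function name, the
   list representation of a track, the use of None for a gap packet and
   the start times are all assumptions made for illustration.

```python
from bisect import bisect_right

def active_clip(clips, t):
    """Return the clip active at time t on one annotation track, or None.

    `clips` is a list of (start_time, clip_id) pairs sorted by start_time.
    A clip_id of None stands for an explicit gap packet (an assumption of
    this sketch, not a representation defined by the format).
    """
    starts = [start for start, _ in clips]
    i = bisect_right(starts, t) - 1
    if i < 0:
        return None  # t lies before the first clip on this track
    return clips[i][1]

# Annotation track A2 from the figure, with hypothetical start times in seconds.
track_a2 = [(0.0, "clip 1"), (30.0, None), (36.0, "clip 2"), (50.0, "clip 3")]

print(active_clip(track_a2, 12.0))
print(active_clip(track_a2, 33.0))
print(active_clip(track_a2, 99.0))
```

   The last clip stays active until the end of the document, which is
   why a lookup far beyond the last start time still returns "clip 3".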









4. Handling time in the Annodex format bitstream
   Annodex format bitstreams inherently represent one timeline only,
   where the different data and the annotation bitstream can be thought
   of as content tracks on that timeline. All of these tracks relate to
   the same timeline which starts at a certain time point and ends when
   the last bitstream ends. An example bitstream can be seen in the
   following figure. It consists of an Annodex format bitstream that
   contains 4 media bitstreams and the annotation bitstream.
   The following figure is a conceptual representation of the time
   intervals covered by the different logical bitstreams. In the flat
   representation these will be multiplexed such that the data packets
   of each of these bitstreams occur at the correct time.
   t0                                                                   tn
   |------------------------------------------------------------------->|
   ----------------------------------------------------------------------
   |clip1  | clip 2 | clip 3               | clip 4                     |
   ----------------------------------------------------------------------
   annotation bitstream
   ---------------------------------------------
   | audio bitstream 1                         |
   ---------------------------------------------
           --------------------------------------------------------------
           | video bitstream 1                                          |
           --------------------------------------------------------------
                    -----------------------------------------------------
                    | audio bitstream 2                                 |
                    -----------------------------------------------------
                           ------------------------------
                           | video bitstream 2          |
                           ------------------------------
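   The flattening of the tracks in the figure above into one sequence
   can be sketched conceptually as a merge by timestamp. This is only
   an illustration under simplifying assumptions: real Ogg multiplexing
   operates on pages rather than on bare packets, and the tuple layout,
   the function name and the timestamps below are hypothetical.

```python
import heapq

def multiplex(*bitstreams):
    """Merge per-bitstream packet lists into one time-ordered sequence.

    Each bitstream is an iterable of (timestamp, name, payload) tuples
    that is already sorted by timestamp. Packets with equal timestamps
    keep the order in which their bitstreams were passed in.
    """
    return list(heapq.merge(*bitstreams, key=lambda packet: packet[0]))

# Hypothetical packets; timestamps are in seconds relative to t0.
annotation = [(0.0, "cmml", "clip1"), (4.0, "cmml", "clip 2")]
audio1 = [(0.0, "audio1", b"a0"), (2.0, "audio1", b"a1"), (4.0, "audio1", b"a2")]
video1 = [(3.0, "video1", b"v0"), (5.0, "video1", b"v1")]

for timestamp, name, _ in multiplex(annotation, audio1, video1):
    print(timestamp, name)
```

   Passing the annotation bitstream first places a clip ahead of media
   packets that carry the same timestamp.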
   The time point at which the Annodex format bitstream starts (t0 in
   the above diagram) is called the "timebase" and represents the
   playback time in seconds associated with the beginning of the Annodex
   bitstream. This start time may be 0, but it does not have to be: it
   can be any positive time offset.
   Each of the encapsulated media bitstreams and the annotation
   bitstream has its own temporal resolution at which it can
   provide data to cover the given timeline. This temporal resolution is
   usually given through the sampling rate of the particular bitstream.
   For example, a raw audio bitstream at CD quality is sampled with a
   sampling rate of 44100 Hz. A video bitstream may be sampled with a
   frame rate of 25 frames per second. This temporal resolution is
   stored in the "granulerate" field of the bos page of the bitstream.
   The "granulerate" is used for the calculation of the time position
   for which a data packet of a media bitstream contains data. The
   "granulepos" field in an Ogg page when divided by the "granulerate"
   of that page's logical bitstream provides the time position that is
   reached in that bitstream after decoding all data packets finished on
   this page. E.g. if an audio bitstream has a granulerate of 44100 and
   starts at 0, then a granulepos of 88200 signifies that the bitstream
   has reached the 2 second mark once this page's packets have been
   decoded.
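   The division described above can be written as a short Python
   sketch. The function name is hypothetical, and adding the timebase
   for bitstreams that do not start at 0 is an assumption of this
   sketch; the text only gives the case of a bitstream starting at 0.

```python
from fractions import Fraction

def page_time(granulepos, granulerate, timebase=Fraction(0)):
    """Time position in seconds reached after decoding a page's packets.

    Exact rational arithmetic avoids rounding errors. Adding `timebase`
    is an assumption of this sketch for bitstreams not starting at 0.
    """
    return timebase + Fraction(granulepos) / Fraction(granulerate)

# The example from the text: granulerate 44100, start at 0, granulepos 88200.
print(page_time(88200, 44100))  # prints 2

# A 25 frames per second video bitstream with a hypothetical timebase of 5/1000 sec.
print(page_time(50, 25, Fraction(5, 1000)))
```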
   The annotation bitstream's "granulerate" can be chosen arbitrarily by
   the bitstream multiplexer. One option is to choose the least common
   multiple of the granulerates of all the media bitstreams to gain at
   least the resolution of the bitstreams. However, that resolution may
   not be enough compared to the one that the author of clips asks
   for at insertion time. One solution is to accommodate all
   possible time schemes of the clips. Thus, selecting the least common
   multiple of the resolutions of all the possible npt and smpte time
   schemes as the resolution of the annotation bitstream is another
   option.
   The possible time schemes with their respective resolutions are:
   o  npt: 1000
   o  smpte-24: 24
   o  smpte-24-drop: 24/1.001 = 23.976 (approx. as per SMPTE)
   o  smpte-25: 25
   o  smpte-30: 30
   o  smpte-30-drop: 30/1.001 = 29.970 (approx. as per SMPTE)
   o  smpte-50: 50
   o  smpte-60: 60
   o  smpte-60-drop: 60/1.001 = 59.940 (approx. as per SMPTE)
   To get to integer values, it is necessary to multiply all resolutions
   by 1000 and then take the least common multiple: lcm(1000000, 24000,
   23976, 25000, 30000, 29970, 50000, 60000, 59940) = 2997000000. The
   "granulerate" would therefore be 2997000. This provides a
   temporal resolution of about 0.33 microseconds, accommodating a mixed
   use of all the above given time schemes with complete accuracy on the
   annotation bitstream.
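   The least common multiple given above can be checked with a few
   lines of Python. This is only a verification sketch; the variable
   names are chosen for illustration.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

# Resolutions of the time schemes listed above, each multiplied by 1000
# to obtain integer values (23.976 -> 23976, 29.970 -> 29970, ...).
scaled = [1000000, 24000, 23976, 25000, 30000, 29970, 50000, 60000, 59940]

common = reduce(lcm, scaled)
granulerate = common // 1000  # undo the scaling by 1000

print(common, granulerate)  # prints 2997000000 2997000
```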
   The "granulepos" of the (set of) page(s) holding a "clip" element of
   the annotation stream has to signify the start time of that "clip"
   element. E.g. if the "granulerate" of the annotation bitstream is
   1000, the "timebase" is 0, and a clip is to be inserted at
   npt=12.020, its "granulepos" will be 12020. Clips can be repeated in
   the Annodex format bitstream, which will be signified by having the
   same "track" attribute and the same page_sequence_number as the
   previous "clip" element.
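   The granulepos of a clip can be derived from its start time as in
   the following Python sketch. The function name is hypothetical, and
   subtracting the timebase before scaling is an assumption; the
   example in the text only covers a timebase of 0.

```python
from fractions import Fraction

def clip_granulepos(start, granulerate, timebase=Fraction(0)):
    """Granulepos for a clip that starts at `start` seconds.

    `start` may be a decimal string such as "12.020" so that no binary
    floating point rounding occurs. Subtracting `timebase` is an
    assumption of this sketch.
    """
    pos = (Fraction(start) - Fraction(timebase)) * granulerate
    if pos.denominator != 1:
        raise ValueError("start time is not representable at this granulerate")
    return int(pos)

# The example from the text: granulerate 1000, timebase 0, npt=12.020.
print(clip_granulepos("12.020", 1000))  # prints 12020

# The same clip at the granulerate of 2997000 derived earlier.
print(clip_granulepos("12.020", 2997000))
```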













5. Media encapsulation format
   An Annodex format bitstream consists of XML markup in the annotation
   bitstream interleaved with the related media packets of the media
   bitstreams into a single bitstream.
   It is not possible to use straight XML as encapsulation because XML
   cannot enclose binary data unless it is encoded as Unicode text. Such
   an encoding would introduce too much overhead. Therefore, an
   encapsulation format that can handle both binary bitstreams and
   textual packets was required.
   The following list gives a summary of the requirements for the
   Annodex format bitstream:
   o  framing for binary time-continuous data and XML.
   o  temporal synchronisation between time-continuous media bitstreams
      and XML on interleaving.
   o  temporal resynchronisation after parsing error.
   o  detection of corruption.
   o  seeking landmarks for direct random access.
   o  streaming capability (i.e. the information required to parse and
      decode a bitstream part is available at the time at which the
      bitstream part is reached and does not come e.g. at the end of the
      stream).
   o  small overhead.
   o  simple interleaving format with a track paradigm.
   The Ogg encapsulation format version 0 [11] was chosen as the
   encapsulation format for Annodex format bitstreams as it meets
   all of these requirements and has proven reliable and stable.
5.1 Media mapping for Ogg encapsulation
   This section specifies the way the Ogg media encapsulation framework
   is used for creating Annodex format bitstreams. As such, knowledge of
   the Ogg bitstream format as specified in the Ogg RFC [11] is
   presumed. Please also refer to that document for descriptions of the
   terms used in this document. This section describes the specific
   media mapping that is used for Annodex format bitstreams.

   Annodex format bitstreams consist of one or more time-continuous data
   bitstreams and an XML annotation bitstream concurrently interleaved
   (in Ogg terms: multiplexed) into an Ogg bitstream. Sequential
   multiplexing is allowed, but can only happen with complete Annodex
   format bitstreams.
   As with any Ogg bitstream, the physical bitstream starts with the bos
   pages of all logical bitstreams, followed by the secondary header
   pages, followed by the data pages. Every Annodex format bitstream
   consists of at least two logical bitstreams: the Annodex media
   mapping bitstream and the annotation bitstream that represents a CMML
   [15] file. An Annodex physical bitstream starts with the bos pages of
   these two (in order), followed by the bos pages of any number of data
   bitstreams. Then all the secondary header pages of all the data
   bitstreams follow, including the secondary header page(s) of the
   annotation bitstream containing the CMML "head" tag. Finally, all the
   data bitstreams and the annotation bitstream are multiplexed in Ogg
   pages in a time-synchronous fashion.
   The next sections describe the different bos pages. The pages of an
   Annodex physical bitstream occur in the following order:
   1.  Annodex media mapping bos
   2.  annotation bitstream media mapping bos
   3.  data bitstream(s) media mapping bos
   4.  empty Annodex media mapping eos
   5.  annotation bitstream secondary header page(s)
   6.  data bitstream(s) secondary header page(s)
   7.  annotation and data bitstream(s) content pages
   8.  annotation and data bitstream(s) eos-s
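   As a rough illustration, the order above can be expressed as a check
   over a sequence of page kinds. The labels and the function name are
   hypothetical. The sketch also assumes that content pages and eos
   pages (items 7 and 8) may interleave, since one bitstream can end
   before another; the list itself does not state this.

```python
# Hypothetical labels for the page groups listed above, mapped to their rank.
RANK = {
    "annodex_bos": 0,
    "annotation_bos": 1,
    "data_bos": 2,
    "annodex_eos": 3,
    "annotation_secondary_header": 4,
    "data_secondary_header": 5,
    "content": 6,
    "eos": 6,  # assumption: content and eos pages may interleave
}

def follows_page_order(kinds):
    """True if the sequence of page kinds never steps back to an earlier group."""
    ranks = [RANK[kind] for kind in kinds]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

pages = [
    "annodex_bos", "annotation_bos", "data_bos", "data_bos", "annodex_eos",
    "annotation_secondary_header", "data_secondary_header",
    "content", "content", "eos", "content", "eos", "eos",
]
print(follows_page_order(pages))
```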
5.2 The format of the Annodex media mapping bos
   The Annodex media mapping bitstream consists only of one bos page
   which contains information for the complete Annodex physical
   bitstream. The bos page has the following format:


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Identifier 'Annodex\0'                                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Version major                 | Version minor                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Timebase numerator                                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Timebase denominator                                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | UTC                                                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Fields with more than one byte length are encoded LSB (least
   significant byte) first.
   The fields in the Annodex bos page have the following meaning:
   1.  Identifier: an 8 Byte field that identifies this file to be of
       Annodex format. It contains the magic numbers:
          0x41 'A'
          0x6e 'n'
          0x6e 'n'
          0x6f 'o'
          0x64 'd'
          0x65 'e'
          0x78 'x'
          0x00 '\0'
   2.  Version major: 2 Byte short integer number signifying the major
       version number of the Annodex format bitstream. This document
       specifies the major version 2.
   3.  Version minor: 2 Byte short integer number signifying the minor
       version number of the Annodex format bitstream. This document
       specifies the minor version 0.
   4.  Timebase numerator & denominator: 8 Byte integer number each.
       Together they represent the timebase of the Annodex format
       bitstream given as a rational number. The denominator represents
       the temporal resolution at which the timebase is given. E.g. a
       numerator of 5 and a denominator of 1000 result in a timebase of
       0.005 sec. This enables a very high temporal resolution without
       having to store floating point numbers.
   5.  UTC: a 20 Byte string containing a UTC time in the form of
       YYYYMMDDTHHMMSS.sssZ. It associates a calendar date and a
       wall-clock time with the timebase. It is a sequence of 20 NUL
       Bytes if not in use, making this bos page constant length.
   Please note: The possible temporal resolution of the timebase is on
   the order of 2^-64. However, the time formats in use for media that
   are described in this document range from 1/24 to 1/60 for the
   different smpte formats and to 1/1000 for npt. Thus, this resolution
   is enough for any one of them. Moreover, this resolution is
   expected to accommodate any future needs of time resolution for any
   other time format (and time-continuous sampled data).
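   The 48 byte layout of the Annodex bos packet described in this
   section could be read and written as in the following Python sketch.
   It handles only the packet payload, not the surrounding Ogg page
   header. The function names are hypothetical, and treating the
   integer fields as unsigned is an assumption, since the text does not
   state their signedness.

```python
import struct
from fractions import Fraction

# "<" selects LSB-first encoding without padding: 8 byte identifier,
# 2 byte version major, 2 byte version minor, 8 byte timebase numerator,
# 8 byte timebase denominator, 20 byte UTC string.
# Unsigned integer fields are an assumption of this sketch.
ANNODEX_BOS = struct.Struct("<8sHHQQ20s")
NO_UTC = b"\0" * 20

def pack_annodex_bos(numerator, denominator, utc=NO_UTC, major=2, minor=0):
    """Build the payload of an Annodex media mapping bos packet."""
    return ANNODEX_BOS.pack(b"Annodex\0", major, minor, numerator, denominator, utc)

def unpack_annodex_bos(payload):
    """Parse the payload of an Annodex media mapping bos packet."""
    ident, major, minor, numerator, denominator, utc = ANNODEX_BOS.unpack(payload)
    if ident != b"Annodex\0":
        raise ValueError("not an Annodex bos payload")
    return {
        "version": (major, minor),
        "timebase": Fraction(numerator, denominator),
        "utc": None if utc == NO_UTC else utc.decode("ascii"),
    }

# Timebase of 5/1000 sec as in the example, with a hypothetical UTC value.
payload = pack_annodex_bos(5, 1000, b"20031231T120000.000Z")
print(len(payload), unpack_annodex_bos(payload))
```

   Keeping the timebase as a Fraction preserves the exact rational
   value that the numerator and denominator fields are meant to carry.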
5.3 The format of the media and annotation bitstream media mapping bos
   The media and annotation bitstreams each start with one bos page
   containing information required for the decoding of the bitstream.
   After that, secondary header pages follow that contain information to
   set up the decoder for the bitstream and other stream-specific
   information. Then, the pages that contain the actual data follow.
   The bos page of a media or annotation bitstream has the following
   format:


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Identifier 'AnxData\0'                                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Granule rate numerator                                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Granule rate denominator                                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Number of secondary header pages                              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Message header fields ...                                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Fields with more than one byte length are encoded LSB (least
   significant byte) first.
   The fields in an AnxData bos page have the following meaning:
   1.  Identifier: an 8 Byte field that identifies this bitstream as a
       logical input bitstream with encoded information. It contains the
       magic numbers:
          0x41 'A'
          0x6e 'n'
          0x78 'x'
          0x44 'D'
          0x61 'a'
          0x74 't'
          0x61 'a'
          0x00 '\0'
   2.  Granule rate numerator & denominator: 8 Byte integer number each.
       They represent the temporal resolution of the logical bitstream
       in Hz given as a rational number in the same way as the timebase
       attribute above.
   3.  Number of secondary header pages: a 4 Byte integer number that
       contains the number of secondary header pages of that particular
       logical bitstream following after this bos page.
   4.  Message header fields: header fields, following the generic
       Internet Message Format defined in RFC 2822 [6]. Each header
       field consists of a name followed by a colon (":") and the field
       value. Field names are case-insensitive. The field value MAY be
       preceded by any amount of LWS, though a single SP is preferred.
       Header fields can be extended over multiple lines by preceding
       each extra line with at least one SP or HT.
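   As a non-normative illustration, the fixed-length part of an AnxData
   bos packet could be decoded as sketched below in Python. The function
   name parse_anxdata_fixed, the treatment of the 8 Byte integers as
   signed, and the rejection of a zero denominator are assumptions of
   this sketch and are not mandated by this specification.

```python
import struct

ANXDATA_MAGIC = b"AnxData\0"
# '<' = LSB first; 8s = identifier; q q = granule rate numerator and
# denominator (8 Bytes each, signedness assumed); I = 4 Byte page count.
FIXED = struct.Struct("<8sqqI")

def parse_anxdata_fixed(packet: bytes):
    """Return (numerator, denominator, n_secondary, remaining_bytes)."""
    if len(packet) < FIXED.size:
        raise ValueError("packet too short for an AnxData header")
    magic, num, den, n_secondary = FIXED.unpack_from(packet, 0)
    if magic != ANXDATA_MAGIC:
        raise ValueError("not an AnxData bos packet")
    if den == 0:
        raise ValueError("granule rate denominator must not be zero")
    # Everything after the fixed part holds the Message header fields.
    return num, den, n_secondary, packet[FIXED.size:]
```

   The bytes returned last are the Message header fields, which start
   directly after the 28 Bytes of fixed-length fields.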
   Message header fields are considered protocol data, i.e. they are
   not expected to contain human-readable text, and are entirely
   encoded in UTF-8.
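   The following Python sketch shows one possible way of reading the
   Message header fields. It is illustrative only: the function names
   are hypothetical, line termination by CRLF (or a bare LF) is assumed
   from RFC 2822 usage, and unfolding is simplified to joining
   continuation lines with a single SP.

```python
def parse_header_fields(data: bytes):
    """Return a list of (name, value) pairs in their original order."""
    text = data.decode("utf-8")
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines, e.g. a trailing terminator
        if line[0] in " \t" and fields:
            # Continuation line: append to the previous field value.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("header line without a colon: %r" % line)
        fields.append((name.strip(), value.strip()))
    return fields

def get_field(fields, name):
    """Case-insensitive lookup of the first field called `name`."""
    for key, value in fields:
        if key.lower() == name.lower():
            return value
    return None
```

   Keeping the fields as an ordered list rather than a dictionary
   preserves their order, which matters because the position of the
   first field is significant in this format.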
   There is one mandatory Message header field for all of the logical
   bitstreams: the "Content-type" header field. For an application that
   is parsing the Annodex file, this field contains the MIME type and
   the character encoding of the data in the logical bitstream. For
   example, for the annotation bitstream this field will contain the
   value "Content-type: text/x-cmml; UTF-8" if the character set used
   is UTF-8, and for a bitstream containing Ogg Vorbis data the value is
   "Content-type: audio/x-vorbis". The Content-type message header field
   comes first of all the Message header fields so that it can be
   found at a fixed location in the AnxData header.
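   A decoder needs the MIME type and, for text formats, the character
   set from the "Content-type" field value. The Python sketch below is
   one possible reading of the two example values given above. The
   function name is hypothetical, and the tolerance for a
   "charset=" prefix is an assumption added for robustness, not
   something this specification defines.

```python
def split_content_type(value: str):
    """Split a Content-type field value into (mime_type, charset or None)."""
    mime, _, rest = value.partition(";")
    charset = rest.strip()
    # Assumption: also tolerate the common MIME spelling "charset=UTF-8".
    if charset.lower().startswith("charset="):
        charset = charset[len("charset="):].strip().strip('"')
    return mime.strip().lower(), (charset or None)
```

   The function expects the field value only, that is, the text after
   the colon of the "Content-type" header field.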
6. The decoding of Annodex format bitstreams to CMML
   The decoding of an Annodex format bitstream to CMML is roughly the
   inverse of encoding an Annodex format bitstream from a CMML file.
   There are some special cases to take care of; therefore the decoding
   steps are outlined in order here.
   The core of a CMML file can be created from the "head" tag taken from
   the secondary header page of the annotation bitstream, and from the
   sequence of "clip" tags extracted from the content of the annotation
   bitstream. A decoder MUST take care to reinsert the start time of
   each "clip" element into the "start" attribute of the respective CMML
   "clip" tag. The start time will be calculated from the Granule rate
   in the annotation bos page and the Granule pos given in the
   respective "clip" Ogg page.
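   The start time calculation can be sketched in Python as follows. The
   sketch assumes that the Granule rate is the number of granules per
   second, so that the time in seconds is the Granule pos divided by
   the Granule rate. The function names and the npt output format with
   millisecond precision are illustrative assumptions, and any offset
   contributed by the timebase is not handled here.

```python
from fractions import Fraction

def clip_start_seconds(granulepos: int, rate_num: int, rate_den: int) -> Fraction:
    """Granule position divided by the granule rate (granules per second)."""
    if rate_num <= 0 or rate_den <= 0:
        raise ValueError("granule rate must be positive")
    # granulepos / (rate_num / rate_den), kept exact as a rational number
    return Fraction(granulepos * rate_den, rate_num)

def npt_string(seconds: Fraction) -> str:
    """Format seconds as an npt value with millisecond precision."""
    millis = round(seconds * 1000)
    return "npt:%d.%03d" % (millis // 1000, millis % 1000)
```

   Exact rational arithmetic avoids rounding drift for granule rates
   such as 30000/1001 that have no finite decimal representation.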
   If the Annodex bitstream has a non-zero timebase or a non-null utc
   time in the Annodex bos page, a "stream" tag MUST be created with
   these attribute values. That "stream" tag is empty by default. A
   ripping application MAY however extract all the data bitstreams out
   of the Annodex bitstream into files, and then reference these files
   in the "src" attribute of "import" tags.
   Other attributes of the "import" tags MAY also be filled in from the
   logical bitstreams:
   o  the "contenttype" attribute from the "Content-type" Message header
      field of the respective bos,
   o  the "granulerate" attribute from the Granule rate fields of the
      respective bos,
   o  the "id" attribute from a Message header field called "ID" if
      available,
   o  and "param" elements from all the remaining Message header fields
      of the respective bos, where the field name gets stored in the
      "name" attribute and the value in the "value" attribute.
   A stream tag will thus roughly be created like this:
   <stream timebase="[Timebase]" utc="[UTC]">
     <import id="[ID]" granulerate="[Granulerate]" contenttype="[Content-type]"
             src="[stream1.mpg]" start="0"/>
   </stream>
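   The mapping from Message header fields to "import" attributes and
   "param" elements could be implemented along the lines of the Python
   sketch below. The function name build_stream, the dictionary keys
   used to describe each bitstream, and the textual form of the
   granulerate value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def build_stream(timebase, utc, bitstreams):
    """bitstreams: list of dicts with keys 'src', 'granulerate', 'fields'."""
    stream = ET.Element("stream", {"timebase": timebase, "utc": utc})
    for bs in bitstreams:
        attrs = {"src": bs["src"], "start": "0",
                 "granulerate": bs["granulerate"]}
        params = []
        for name, value in bs["fields"]:
            key = name.lower()          # field names are case-insensitive
            if key == "content-type":
                attrs["contenttype"] = value
            elif key == "id":
                attrs["id"] = value
            else:
                params.append((name, value))
        imp = ET.SubElement(stream, "import", attrs)
        # All remaining header fields become param elements.
        for name, value in params:
            ET.SubElement(imp, "param", {"name": name, "value": value})
    return stream
```

   The resulting element can be serialised with
   xml.etree.ElementTree.tostring() when writing the CMML file.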
   If the annotation bitstream has Message header fields called "ID",
   "Content-Language", or "Content-Dir", the "cmml" tag of the decoded
   CMML file MUST use these field values in its "id", "lang", and "dir"
   attributes. This ensures that the default language setting of the
   annotation bitstream gets preserved:
   <cmml id="[ID]" lang="[Content-Language]" dir="[Content-Dir]">
   To restore the correct XML preamble for the CMML file, the charset
   part of the "Content-type" Message header field of the annotation
   bitstream MUST be extracted and used as value of the "encoding"
   attribute of the XML processing instruction. All the other fields of
   the XML preamble are fixed:
   <?xml version="1.0" encoding="[charset of Content-type]" standalone="yes"?>
   <!DOCTYPE cmml SYSTEM "cmml.dtd">
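   Restoring the preamble and the opening "cmml" tag might look like
   the following Python sketch. The function name is hypothetical, the
   charset is taken as the text after the semicolon in the
   "Content-type" value, and falling back to UTF-8 when no charset is
   present is an assumption of this sketch.

```python
from xml.sax.saxutils import quoteattr

def cmml_preamble(content_type_value, fields):
    """Return the XML preamble and the opening cmml tag as one string."""
    _, _, charset = content_type_value.partition(";")
    charset = charset.strip() or "UTF-8"   # assumed default
    lines = ['<?xml version="1.0" encoding="%s" standalone="yes"?>' % charset,
             '<!DOCTYPE cmml SYSTEM "cmml.dtd">']
    # Message header field name -> attribute of the cmml tag
    mapping = {"id": "id", "content-language": "lang", "content-dir": "dir"}
    attrs = ""
    for name, value in fields:
        attr = mapping.get(name.lower())
        if attr:
            attrs += " %s=%s" % (attr, quoteattr(value))
    lines.append("<cmml%s>" % attrs)
    return "\n".join(lines)
```

   quoteattr() adds the surrounding quotes and escapes characters that
   are not allowed in an XML attribute value.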
7. MIME media type registration for 'application/annodex'
   This section contains the registration information for the
   'application/annodex' media type. Until this media type is approved
   by the IANA, 'application/x-annodex' may be used.
   To: ietf-types@iana.org
   Subject: Registration of MIME media type application/annodex
   MIME media type name: application
   MIME subtype name: annodex
   Required parameters: none
   Optional parameters: none
   Encoding Considerations: the Annodex format enables encapsulation of
   any type of encoding format. The authoring software has to supply the
   encoders and provide the MIME type (and potentially the charset for
   text-based formats) in the "Content-type" Message header field of
   each bitstream track. The client software can select an appropriate
   decoder based on this information.
   Security considerations: see next section.
   Interoperability considerations: the Annodex bitstream format is a
   free specification that is independent of any media encoding format.
   It is designed to provide interoperability with the existing World
   Wide Web. Its specification is not patented and can be implemented by
   third parties without patent considerations.
   Additional information:
      Magic numbers: "OggS" identifies an Ogg page, "Annodex" identifies
      an Ogg page with an Annodex format bitstream, and "AnxData"
      signifies an Ogg page with media or annotation bitstream data.
      File extension: .anx
      Macintosh File Type Code: "ANDX"
      Intended usage: COMMON
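   As an illustration of how the magic numbers can be used, the Python
   sketch below classifies a single Ogg page. It relies on the Ogg page
   layout of RFC 3533 (a 27 Byte header whose last Byte gives the
   number of segment table entries, followed by the segment table and
   the payload). The function name and the returned labels are
   hypothetical.

```python
def classify_ogg_page(page: bytes) -> str:
    """Return 'annodex', 'anxdata', 'other' or 'not-ogg' for one page."""
    if len(page) < 27 or page[:4] != b"OggS":
        return "not-ogg"
    n_segments = page[26]              # number of lacing values
    body = page[27 + n_segments:]      # payload follows the segment table
    if body.startswith(b"Annodex"):
        return "annodex"
    if body.startswith(b"AnxData"):
        return "anxdata"
    return "other"
```

   A real application would additionally verify the page checksum and
   the bos flag before trusting the payload.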
7.1 URI addressing into Annodex files
   There are two ways of hyperlinking via URIs into Annodex files: via
   specification of a temporal interval or via specification of a clip.
   Both of these ways of addressing are supported for URI queries and
   URI fragments of Annodex files.
7.1.1 Query parameters for use with the HTTP protocol server-side
   For the purposes of URI queries on Annodex files, it is assumed that
   the query string takes the format of a CGI query string. The Common
   Gateway Interface, or CGI, is a standard for external gateway
   programs to interface with information servers such as HTTP servers
   (see http://hoohoo.ncsa.uiuc.edu/cgi/). This query string is expected
   to be interpreted by the HTTP server to return a valid Annodex file
   that differs from the original Annodex file only by reducing it to
   the specified interval.
   Addressing of temporal intervals of Annodex files is possible through
   specification of temporal query intervals in URIs [14]. An example is
   the following URI: http://www.blah.au/sample.anx?t="npt:4" , which
   relates to a complete Annodex file composed from sample.anx by
   starting it at an offset of 4 seconds.
   Addressing of a clip is possible through specification of the clip's
   id attribute value. An example is the following URI:
   http://www.blah.au/sample.anx?id="dolphin" , which relates to the
   clip whose id attribute value is "dolphin". Note that id attribute
   values of all elements have to be unique throughout an XML file (and
   thus also throughout an Annodex file which represents a CMML file).
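   A server-side handler might separate the two kinds of query as in
   the Python sketch below. The function name and the returned labels
   are hypothetical. Stripping the double quotes follows the example
   URIs above. The sketch only extracts the request and does not cut
   the Annodex file.

```python
from urllib.parse import urlsplit, parse_qs

def parse_annodex_query(uri: str):
    """Return ('time', spec), ('clip', id) or (None, None)."""
    query = parse_qs(urlsplit(uri).query)
    for key, kind in (("t", "time"), ("id", "clip")):
        if key in query:
            # The examples quote the value; strip the quotes if present.
            return kind, query[key][0].strip('"')
    return None, None
```

   Giving "t" precedence over "id" when both are present is an
   arbitrary choice of this sketch.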
7.1.2 Fragment identifiers for use with the HTTP protocol client-side
   For the purposes of URI fragment specifications on Annodex files, it
   is assumed that the fragment gets interpreted by the HTTP client
   after the retrieval action. The HTTP client is expected to restrict
   the usage of the resource to the specified interval.
   Addressing of temporal intervals of Annodex files is possible through
   specification of temporal fragments in URIs [14]. An example is the
   following URI: http://www.blah.au/sample.anx#npt:4 . This then
   relates to starting the Annodex file at a 4 second offset. This may,
   for example, be useful for zooming into a retrieved Annodex
   resource.
   The values of the id attribute of the clip tags can be used for
   addressing media clips directly through fragment identifiers as in
   http://www.blah.au/sample.anx#dolphin.
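   On the client side, the two kinds of fragment could be told apart as
   in the following Python sketch. The function name is hypothetical,
   and recognising a temporal fragment by an "npt" or "smpte" prefix
   followed by a colon is an assumption based on the time formats named
   in this document. The normative syntax is given in [14].

```python
from urllib.parse import urlsplit

TIME_SCHEMES = ("npt", "smpte")   # assumed prefixes of time specifications

def parse_annodex_fragment(uri: str):
    """Return ('time', spec), ('clip', id) or (None, None)."""
    frag = urlsplit(uri).fragment
    if not frag:
        return None, None
    scheme, sep, _ = frag.partition(":")
    if sep and scheme.lower().startswith(TIME_SCHEMES):
        return "time", frag
    return "clip", frag
```

   Under this rule a clip id that itself starts with "npt:" would be
   treated as a time specification.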
7.1.3 HTTP 'Accept' header field interpretation
   The Annodex file and the CMML file that can be extracted from it are
   tightly related to each other: the CMML file contains all annotation
   and indexing information including timebase and UTC time about the
   Annodex file. Therefore, receiving the CMML file instead of the
   Annodex file is like receiving all information about the bitstreams
   in the Annodex file except for the data bitstreams themselves.
   This situation can be taken advantage of with the "Accept" header of
   HTTP. When an Annodex file is requested from an HTTP server and the
   acceptable content types given in the "Accept" message header field
   contain "text/x-cmml" with a higher priority than
   "application/x-annodex", then the HTTP server SHOULD return the CMML
   file instead of the requested Annodex file itself. As is standard,
   the HTTP response will contain a "Content-type" field indicating what
   content was actually returned. A Web crawler of a search engine, for
   example, can thus avoid extra network load and retrieve information
   that is easier to parse. It SHOULD set the "Accept" HTTP header to
   "Accept: text/x-cmml" for every requested Annodex URI.
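   The server-side decision can be sketched in Python as follows. The
   function names are hypothetical. The sketch compares only the q
   values of the two exact media types and ignores wildcard ranges such
   as "*/*", which a complete implementation of HTTP content
   negotiation would have to consider.

```python
def parse_accept(header: str):
    """Return {media_range: q} from an HTTP Accept header value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0                          # default quality value
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0              # treat a malformed q as unacceptable
        prefs[parts[0].lower()] = q
    return prefs

def prefers_cmml(accept_header: str) -> bool:
    """True if text/x-cmml is ranked above application/x-annodex."""
    prefs = parse_accept(accept_header)
    q_cmml = prefs.get("text/x-cmml", 0.0)
    q_anx = prefs.get("application/x-annodex", 0.0)
    return q_cmml > 0.0 and q_cmml > q_anx
```

   With this rule, a crawler sending only "Accept: text/x-cmml"
   receives the CMML file, while a client listing both types with equal
   priority receives the Annodex file.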
8. Security considerations
   Annodex format bitstreams contain several multiplexed binary media
   bitstreams and one XML annotation bitstream. There is no generic
   encryption or signing mechanism provided for the complete bitstream
   or any one of its parts. As the format of the encapsulated media
   bitstreams is not prescribed and is identified through the
   "Content-type" Message header field in that bitstream's bos page, it
   is possible to encrypt or sign that media bitstream and then mark it
   accordingly with a MIME type that signifies the encryption. It is up
   to the applications that use this bitstream to provide an
   appropriate codec to handle such bitstreams.
   As Annodex format bitstreams contain binary media bitstreams, it is
   possible to include executable content in them. This can be an issue
   with applications that decode these bitstreams, especially when they
   are used in a network scenario. Such applications have to ensure
   correct handling of manipulated bitstreams, of buffer overflow and
   the like.
References
   [1]   World Wide Web Consortium, "Extensible Markup Language (XML)
         1.0", W3C XML, October 2000,
         <http://www.w3.org/TR/2000/REC-xml-20001006>.
   [2]   World Wide Web Consortium, "HTML 4.01 Specification", W3C HTML,
         December 1999, <http://www.w3.org/TR/html4/>.
   [3]   World Wide Web Consortium, "XHTML(TM) 1.0 The Extensible Hyper
         Text Markup Language", W3C XHTML, January 2000,
         <http://www.w3.org/TR/xhtml1/>.
   [4]   Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform
         Resource Identifiers (URI): Generic Syntax", RFC 2396, August
         1998.
   [5]   Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L.,
         Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol --
         HTTP/1.1", RFC 2616, June 1999.
   [6]   Resnick, P., "Internet Message Format", RFC 2822, April 2001.
   [7]   Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
         Protocol (RTSP)", RFC 2326, April 1998.
   [8]   Alvestrand, H., "Tags for the Identification of Languages", RFC
         1766, March 1995, <http://www.ietf.org/rfc/rfc1766.txt>.
   [9]   Freed, N. and N. Borenstein, "Multipurpose Internet Mail
         Extensions (MIME) Part Two: Media Types", RFC 2046, November
         1996, <http://www.ietf.org/rfc/rfc2046.txt>.
   [10]  Whitehead, E. and M. Murata, "XML Media Types", RFC 2376, July
         1998, <http://www.ietf.org/rfc/rfc2376.txt>.
   [11]  Pfeiffer, S., "The Ogg encapsulation format version 0", RFC
         3533, May 2003, <http://www.ietf.org/rfc/rfc3533.txt>.
   [12]  The Society of Motion Picture and Television Engineers, "SMPTE
         STANDARD for Television, Audio and Film - Time and Control
         Code", ANSI 12M-1999, September 1999.
   [13]  ISO, TC154., "Data elements and interchange formats --
         Information interchange -- Representation of dates and times",
         ISO 8601, 2000.
   [14]  Pfeiffer, S., Parker, C. and A. Pang, "Specifying time
         intervals in URI queries and fragments of time-based Web
         resources (BCP) (work in progress)", I-D
         draft-pfeiffer-temporal-fragments-02.txt, December 2003,
         <http://www.annodex.net/TR/URI_fragments.txt>.
   [15]  Pfeiffer, S., Parker, C. and A. Pang, "The Continuous Media
         Markup Language (CMML), Version 2.0 (work in progress)", I-D
         draft-pfeiffer-cmml-01.txt, December 2003,
         <http://www.annodex.net/TR/cmml.txt>.
Authors' Addresses
   Silvia Pfeiffer
   Commonwealth Scientific and Industrial Research Organisation CSIRO, Australia
   Locked Bag 17
   North Ryde, NSW  2113
   Australia
   Phone: +61 2 9325 3141
   EMail: Silvia.Pfeiffer@csiro.au
   URI:   http://www.ict.csiro.au/
   Conrad D. Parker
   Commonwealth Scientific and Industrial Research Organisation CSIRO, Australia
   Locked Bag 17
   North Ryde, NSW  2113
   Australia
   Phone: +61 2 9325 3133
   EMail: Conrad.Parker@csiro.au
   URI:   http://www.ict.csiro.au/
   Andre T. Pang
   Commonwealth Scientific and Industrial Research Organisation CSIRO, Australia
   Locked Bag 17
   North Ryde, NSW  2113
   Australia
   Phone: +61 2 9325 3156
   EMail: Andre.Pang@csiro.au
   URI:   http://www.ict.csiro.au/
Appendix A. Definitions of terms and abbreviations
   Clip element: XML data containing information on a fragment of a
      time-continuous bitstream.
   Fragment: a subpart of a media document covering some temporal
      interval.
   Mark-up: XML tags and their content used to describe a media
      document.
   Annodex bitstream: encapsulated time-continuous bitstream with head
      and clip elements.
   Annotating: the task of giving textual descriptions to fragments of
      media documents.
   Indexing: the task of identifying index points for media documents or
      fragments thereof.
   Hyperlinking: the task of linking from one Web resource to another.
      If a link has an offset into the resource, this is sometimes
      called deep hyperlinking.
   head element: XML data containing information on an Annodexed media
      file.
   media packet: a block of digital data that represents a temporal
      subpart of a stream of continuous media. Media packets of one
      continuous media file do not overlap in time.
   bitstream: a sequence of time-continuous data.
Appendix B. Glossary of acronyms
   CMML: Continuous Media Markup Language.
   DTD: Document Type Definition.
   XML: eXtensible Markup Language.
   CMWeb: Continuous Media Web.
   Web: World Wide Web.
   URI: Uniform Resource Identifier.
Appendix C. Acknowledgements
   The authors gratefully acknowledge the contributions of Zentaro
   Kavanagh, Andrew Nesbit and Simon Lai in developing this
   specification.
Intellectual Property Statement
   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights. Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11. Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.
   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard. Please address the information to the IETF Executive
   Director.
Full Copyright Statement
   Copyright (C) The Internet Society (2003). All Rights Reserved.
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.
   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.
   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgment
   Funding for the RFC Editor function is currently provided by the
   Internet Society.