M. Baker



                    The 'application/xhtml+xml' Media Type
                      draft-baker-xhtml-media-reg-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-
   Drafts as reference material or to cite them other than as
   "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 27, 2001.

Abstract

   This document defines the "application/xhtml+xml" MIME media type
   for XHTML based markup languages; it is not intended to obsolete
   any previous IETF documents, in particular RFC 2854, which
   registers "text/html".

   This document was prepared by members of the W3C HTML working group
   based on the structure, and some of the content, of RFC 2854, the
   registration of "text/html". Please send comments to
   www-html@w3.org, a public mailing list with archives at
   <http://lists.w3.org/Archives/Public/www-html/>.

1. Introduction

   In 1998, the W3C HTML working group began work on reformulating HTML
   in terms of XML 1.0 [XML] and XML Namespaces [XMLNS].  The first
   part of that work concluded in January 2000 with the publication of
   the XHTML 1.0 Recommendation [XHTML1], a reformulation of HTML
   4.01 [HTML401].

   Work continues in the HTML WG on XHTML Modularization (see
   http://www.w3.org/TR/xhtml-modularization), the decomposition of
   XHTML 1.0 into modules that can be used to compose new XHTML based
   languages, plus a framework for supporting this composition.

   As of December 2000, the HTML WG has taken no official position on
   what MIME media type should be used to describe XHTML 1.0 or any
   other XHTML based language, except in the case where XHTML 1.0
   documents satisfy certain additional requirements (see [XHTML1]
   section 5.1) and can be described with "text/html" (see [TEXTHTML]).

   This document only registers a new MIME media type,
   "application/xhtml+xml".  It does not define anything more than is
   required to perform this registration.  The HTML WG expects to
   publish further documentation on this subject, including, but not
   limited to, rules for which documents should and should not be
   described with this new media type, and further information about
   recognizing XHTML documents.

   This document follows the convention set out in [XMLMIME] for the
   MIME subtype name, attaching the suffix "+xml" to denote that the
   entity being described conforms to the XML syntax as defined in XML
   1.0 [XML].

2. Registration of MIME media type application/xhtml+xml

   MIME media type name:      application
   MIME subtype name:         xhtml+xml
   Required parameters:       none
   Optional parameters:

     charset
       This parameter has identical semantics to the charset parameter
       of the "application/xml" media type as specified in [XMLMIME].

     schema-location
       See Section 8 of this document.

   Encoding considerations:
     See Section 4 of this document.

   Security considerations:
     See Section 7 of this document.

   Interoperability considerations:
     XHTML 1.0 [XHTML1] specifies user agent conformance rules that
     dictate behaviour that must be followed when dealing with, among
     other things, unrecognized elements.

     With respect to XHTML Modularization (see
     http://www.w3.org/TR/xhtml-modularization) and the existence of
     XHTML based languages (referred to as XHTML family members) that
     are not XHTML 1.0 conformant languages, it is possible that
     "application/xhtml+xml" may be used to describe some of these
     documents.  The HTML WG will be releasing further guidelines about
     what documents should and should not be described with this type.
     However, it should suffice for now for the purposes of
     interoperability that user agents accepting
     "application/xhtml+xml" content use the user agent conformance
     rules in [XHTML1].

     Although conformant "application/xhtml+xml" interpreters can
     expect that content received is well-formed XML (as defined in
     [XML]), it cannot be guaranteed that the content is valid XHTML
     (as defined in [XHTML1]).  This is in large part due to the
     reasons in the preceding paragraph.

   Published specification:
     XHTML 1.0 is defined by a W3C Recommendation; the latest
     published version is [XHTML1].  It provides for the description of
     some types of conformant content as "text/html", but does not
     disallow the use of other media types (effectively allowing for
     the possibility of this new type).

   Applications which use this media type:
     Some content authors have already begun hand and tool
     authoring on the Web with XHTML 1.0.  However, that content
     is currently described as "text/html", allowing existing
     Web browsers to process it without reconfiguration for a
     new media type.

     There is no experimental, vendor specific, or personal tree
     predecessor to "application/xhtml+xml", reflecting the fact that
     no applications currently recognize it.  This new type is being
     registered in order to allow for the expected deployment of XHTML
     on the World Wide Web, as a first class XML application where
     authors can expect that user agents are conformant XML 1.0 [XML]
     processors.

   Additional information:

     Magic number:
       There is no single initial byte sequence that is always present
       for XHTML files. However, Section 5 below gives some guidelines
       for recognizing XHTML files.

     File extension:
       There are two known file extensions that are currently in use
       for XHTML 1.0: ".xht" and ".xhtml".

       It is not recommended that the ".xml" extension (defined in
       [XMLMIME]) be used, as web servers may be configured to
       distribute such content as type "text/xml" or "application/xml".
       [XMLMIME] discusses the unreliability of this approach in
       section 3.

     Macintosh File Type code: TEXT

   Person & email address to contact for further information:
     Mark Baker <mark.baker@canada.sun.com>

   Intended usage: COMMON

   Author/Change controller:
     The XHTML specifications are a work product of the World
     Wide Web Consortium's HTML Working Group.  The W3C has change
     control over these specifications.

3. Fragment identifiers

   For documents labeled as "application/xhtml+xml", the fragment
   identifier notation is exactly that for "application/xml", as
   specified in [XMLMIME].

4. Encoding considerations

   By virtue of XHTML content being XML, it has the same considerations
   when sent as "application/xhtml+xml" as does XML.  See [XMLMIME],
   section 3.2.

5. Recognizing XHTML files

   All XHTML files will have the string "<html" near the beginning
   of the file.  Some will also begin with an XML declaration
   which begins with "<?xml", though that alone does not indicate
   an XHTML document.  All XHTML 1.0 documents will include a DOCTYPE
   declaration that begins with "<!DOCTYPE html"; however, other XHTML
   based languages (including those conformant with XHTML
   Modularization) may not.

   XHTML Modularization provides a naming convention that conformant
   document types must use that guarantees that the FPI of the doctype
   contains the string "//DTD XHTML ".  And while some XHTML based
   languages require the doctype declaration to occur within documents
   of that type, such as XHTML 1.0, or XHTML Basic
   (http://www.w3.org/TR/xhtml-basic), it is not the case that all
   XHTML based languages will include it.

   All XHTML files should also include a declaration of the XHTML
   namespace.  This should appear shortly after the string
   "<html", and should read 'xmlns="http://www.w3.org/1999/xhtml"'.
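
   As a non-normative illustration, the hints described in this
   section can be checked with a short Python sketch.  The function
   names and the 1024-character inspection window are assumptions of
   this sketch and are not defined by this registration.  The sketch
   also assumes an ASCII-compatible encoding and matches only the
   double-quoted form of the namespace declaration given above.

```python
XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"

# How many leading characters to inspect; an arbitrary choice for this sketch.
SNIFF_LIMIT = 1024


def sniff_xhtml(data: bytes) -> dict:
    """Report which textual hints appear near the start of `data`."""
    # Lenient decode: undecodable bytes are replaced rather than raising.
    head = data[:SNIFF_LIMIT].decode("utf-8", errors="replace")
    html_at = head.find("<html")
    return {
        # Optional XML declaration (a leading byte order mark is skipped).
        "xml_declaration": head.lstrip("\ufeff").startswith("<?xml"),
        # Present in all XHTML 1.0 documents, but not in every XHTML language.
        "doctype_html": "<!DOCTYPE html" in head,
        # FPI naming convention from XHTML Modularization.
        "fpi_dtd_xhtml": "//DTD XHTML " in head,
        "html_element": html_at != -1,
        # The namespace declaration should follow the "<html" string.
        "xhtml_namespace": html_at != -1
        and ('xmlns="%s"' % XHTML_NAMESPACE) in head[html_at:],
    }


def looks_like_xhtml(data: bytes) -> bool:
    """Heuristic only: require "<html" followed by the XHTML namespace."""
    hints = sniff_xhtml(data)
    return hints["html_element"] and hints["xhtml_namespace"]
```

   A caller would typically treat a positive result as a hint rather
   than as proof, since none of these strings alone establishes that
   a document is XHTML.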

6. Charset default rules

   By virtue of all XHTML content being XML, it has the same
   considerations when sent as "application/xhtml+xml" as does XML.
   See [XMLMIME], section 3.2.
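
   As a non-normative illustration, a receiver might extract the
   optional charset parameter from a Content-Type header value as
   sketched below in Python.  The function name is an assumption of
   this sketch.  The sketch does not implement the default rules of
   [XMLMIME]; it returns None when no charset parameter is present,
   leaving the caller to apply those rules.

```python
from email.message import Message

XHTML_TYPE = "application/xhtml+xml"


def declared_charset(content_type: str):
    """Return the lowercased charset parameter, or None if it is absent."""
    msg = Message()
    # Reuse the standard library's MIME header parameter parser.
    msg["Content-Type"] = content_type
    if msg.get_content_type() != XHTML_TYPE:
        raise ValueError("not %s: %r" % (XHTML_TYPE, content_type))
    return msg.get_content_charset()
```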

7. Security considerations

   The considerations for "text/html" as specified in [TEXTHTML] also
   hold for "application/xhtml+xml".

   In addition, because of the extensibility features for XHTML as
   provided by XHTML Modularization, it is possible that
   "application/xhtml+xml" may describe content that has security
   implications beyond those described here.  However, if the user
   agent follows the user agent conformance rules in [XHTML1], this
   content will be ignored.  Only in the case where the user agent
   recognizes and processes the additional content, or where further
   processing of that content is dispatched to other processors, would
   security issues potentially arise.  And in that case, they would
   fall outside the domain of this registration document.

8. The "schema-location" optional parameter

   This parameter is meant to solve the short-term problem of using
   MIME media type based content negotiation (such as that done with
   the HTTP "Accept" header) to negotiate for a variety of XHTML based
   languages.  It is intended to be used only during content
   negotiation.  It is not expected that it be used to deliver content,
   or that origin web servers have any knowledge of it (though they are
   welcome to).  It is primarily targeted for use on the network by
   proxies in the HTTP chain that manipulate data formats (such as
   transcoders).

   The parameter value is a URI that identifies the particular schema
   that the user agent supports, independent of the schema language.
   For example, it could be a URI to a DTD, an XML Schema
   (see http://www.w3.org/TR/xml-schema-1), or other schema
   representation.

   As an example, user agents supporting only XHTML Basic (see
   http://www.w3.org/TR/xhtml-basic) currently have no standard means
   to convey their inability to support the additional functionality in
   XHTML 1.0 [XHTML1] that is not found in XHTML Basic.  While the
   XHTML Basic user agent conformance rules (which are identical to
   those of XHTML 1.0) provide some guidance to user agent
   implementors for handling some additional content, the additional
   content in XHTML 1.0 that is not part of XHTML Basic is
   substantial, making those conformance rules insufficient for
   practical processing and rendering to the end user.  There is also
   the matter of the potentially substantial burden on the user agent
   in receiving and parsing this additional content.

   It is expected that more fine grained content negotiation
   mechanisms will eventually solve this problem in the general case.
   For example, CC/PP (see http://www.w3.org/TR/CCPP-struct) combined
   with an XHTML feature description language could be used to
   communicate the details of what XHTML (and extension) features the
   user agent supports.

   An example use of this parameter as part of an HTTP GET transaction
   would be:

     Accept: application/xhtml+xml; \
       schema-location=\
       "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"
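
   As a non-normative illustration, an intermediary such as a
   transcoding proxy might build or read this parameter as sketched
   below in Python.  The function names are assumptions of this
   sketch, and the handling of quoted strings is deliberately
   simplified (escaped quotes inside a quoted value are not
   supported).

```python
import re
from email.message import Message

XHTML_TYPE = "application/xhtml+xml"


def build_accept(schema_uri: str) -> str:
    """Build an Accept header value carrying schema-location."""
    return '%s; schema-location="%s"' % (XHTML_TYPE, schema_uri)


def split_media_ranges(accept_value: str) -> list:
    """Split on commas that are not inside a double-quoted string."""
    parts = re.findall(r'(?:[^,"]|"[^"]*")+', accept_value)
    return [part.strip() for part in parts]


def schema_location(accept_value: str):
    """Return the schema-location URI of the first
    application/xhtml+xml media range, or None if there is none."""
    for media_range in split_media_ranges(accept_value):
        msg = Message()
        # Reuse the standard library's MIME header parameter parser.
        msg["Content-Type"] = media_range
        if msg.get_content_type() == XHTML_TYPE:
            return msg.get_param("schema-location")
    return None
```

   For instance, passing the XHTML Basic DTD URI shown above to
   build_accept() yields the same header value as the example, on a
   single line and without the line continuations.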

9. Author's Address

   Mark A. Baker
   Sun Microsystems Inc.
   126 York St., Suite 325
   Ottawa, Ontario, CANADA. K1N 5T5
   phone:+1-613-261-5172
   mailto:mark.baker@canada.sun.com
   mailto:distobj@acm.org

10. References

[HTML401] Raggett, D., et al., "HTML 4.01 Specification", W3C
         Recommendation, December 1999. Available at
         <http://www.w3.org/TR/html4>
         (or <http://www.w3.org/TR/1999/REC-html401-19991224>).

[MIME]   Freed, N., and Borenstein, N., "Multipurpose Internet Mail
         Extensions (MIME) Part Two: Media Types", RFC 2046, November
         1996.

[XHTML1] "XHTML 1.0: The Extensible HyperText Markup Language: A
         Reformulation of HTML 4 in XML 1.0", W3C Recommendation,
         January 2000. Available at <http://www.w3.org/TR/xhtml1>.

[XML]    "Extensible Markup Language (XML) 1.0", W3C Recommendation,
         February 1998.  Available at <http://www.w3.org/TR/REC-xml>
         (or <http://www.w3.org/TR/1998/REC-xml-19980210>).

[TEXTHTML] Connolly, D., Masinter, L., "The 'text/html' Media Type",
         RFC 2854, June 2000.

[XMLMIME] Murata, M., St.Laurent, S., Kohn, D., "XML Media Types",
         Internet-Draft (work in progress).  Available at
         <http://www.ietf.org/internet-drafts/draft-murata-xml-09.txt>.