INTERNET-DRAFT                                             Eric A. Hall
  Document: draft-hall-mime-app-mbox-01.txt                      May 2004
  Expires: December, 2004
  Category: Standards Track
  
  
                       The APPLICATION/MBOX Media-Type
  
  
     Status of this Memo
  
     This document is an Internet-Draft and is in full conformance with
     all provisions of Section 10 of RFC 2026.
  
     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups. Note that
     other groups may also distribute working documents as Internet-
     Drafts.
  
     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time. It is inappropriate to use Internet-Drafts
     as reference material or to cite them other than as "work in
     progress."
  
     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt
  
     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.
  
  
     Copyright Notice
  
     Copyright (C) The Internet Society (2004).  All Rights Reserved.
  
  
     Abstract
  
     This document requests that the application/MBOX media-type be
     authorized for allocation by IANA, according to the terms
     specified in RFC 2048 [RFC2048].
  
  
  
  Internet Draft     draft-hall-mime-app-mbox-01.txt          May 2004
  
  
  
  1.      Background and Overview
  
     UNIX and look-alike operating systems have historically made use
     of "mbox" database files for a variety of messaging purposes. In
     the common case, these database files hold collections of
     electronic mail messages which are collectively manipulated as
     "folders" in a private mail-store. These files are also widely
     used by a variety of filtering systems, archival programs, and
     other messaging-related tools, and are widely supported on
     non-UNIX platforms for similar purposes.
  
     The increased pervasiveness of these files has led to an increased
     demand for improvements in cross-system, network-wide interchange
     of these files. In turn, this requirement also dictates a need for
     a media-type definition for mbox files in general, so that the
     data can be tagged and identified during transfer.
  
     Note that there are many inconsistencies in how mbox databases are
     structured and stored. For example, it is entirely possible for an
     mbox database to contain untagged eight-bit character data or to
     use end-of-line sequences which are peculiar to a host platform,
     although the mbox database format does not provide any means for
     indicating these options within the database itself. As a result,
     some form of external negotiation or prior agreement is usually
     necessary before these files can be used in automated tasks, in
     order to ensure that the contents of the database are accurately
     read by messaging systems on other hosts.
  
     On this point, it is useful to note that the multipart/digest
     [RFC2046] media-type has authoritative, platform-independent
     formatting rules which facilitate much more predictable transfer
     and conversion routines in multi-system environments, and
     implementations are strongly encouraged to give preference to the
     multipart/digest media-type where possible.
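
     As a non-normative illustration of the multipart/digest
     alternative, the following minimal sketch uses the Python
     standard library "email" package to wrap messages that have
     already been separated from one another. The function name
     build_digest and the sample messages are hypothetical and are
     not part of this registration.

```python
from email import message_from_bytes
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart


def build_digest(raw_messages):
    """Wrap already-separated messages in one multipart/digest entity."""
    digest = MIMEMultipart("digest")
    for raw in raw_messages:
        # Each body part of the digest carries one complete message
        # (labeled message/rfc822 by MIMEMessage).
        digest.attach(MIMEMessage(message_from_bytes(raw)))
    return digest


# Hypothetical sample messages, used only for illustration.
samples = [
    b"From: a@example.com\nSubject: one\n\nfirst body\n",
    b"From: b@example.com\nSubject: two\n\nsecond body\n",
]
digest = build_digest(samples)
print(digest.get_content_type())
```

     Because each part is a complete message with its own headers, a
     receiver does not need to guess at separator lines or at
     host-specific end-of-line conventions.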
  
  2.      Prerequisites and Terminology
  
     Readers of this document are expected to be familiar with the
     specification for MIME registrations [RFC2048].
  
     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
     NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
     in this document are to be interpreted as described in RFC 2119
     [RFC2119].
  
  Hall                  I-D Expires: December 2004             [page 2]


  
  
  
  3.      The APPLICATION/MBOX Media-Type Registration Request
  
     This section provides the registration request, as per [RFC2048],
     which will be submitted to IANA after IESG approval.
  
     MIME media type name: application
  
     MIME subtype name: mbox
  
     Required parameters: none
  
     Optional parameters: none
  
     Encoding considerations: mbox data typically consists of seven-bit
     ASCII characters in an eight-bit file stream, although this is not
     required, nor can it be assumed. mbox databases may contain
     unencoded and untagged eight-bit character data, or may contain
     data which has been encoded to fit within a seven-bit stream, or
     may contain a mixture of both types simultaneously.
  
     Any transfer encoding may be used with mbox files, with the
     appropriate encoding for any specific file being entirely
     dependent upon the database contents.
  
     Security considerations: mbox data is passive, and does not
     generally represent a unique or new security threat. However,
     there is some risk in sharing any kind of data, in that
     information may be exposed unintentionally, and that risk applies
     to mbox data as well.
  
     Interoperability considerations: mbox databases on UNIX-like
     systems typically use ASCII Line Feed (0x0A) as the end-of-line
     character. Messaging systems on other platforms may use the UNIX-
     centric end-of-line marker, but may also use an end-of-line signal
     that is specific to their host operating system.
  
     Messages in mbox databases contain header fields which appear to
     mirror common Internet message headers, but which typically use
     local encodings rather than Internet formats. For example, message
     headers in an mbox database may contain untagged eight-bit
     character data, or may contain email addresses with no domain
     name, with these usages reflecting local-system mail-store
     services that are incompatible with defined Internet formats.
  
  
  
  
     As a result of these and other vagaries, mbox databases generally
     require some kind of out-of-band negotiation or prior agreement
     before they can be successfully parsed and read on other systems.
  
     Published specification: see Appendix A.
  
     Applications which use this media type: scores of messaging
     products make use of the mbox database format.
  
     Magic number(s): no standard
  
     File extension(s): mbox files sometimes have a ".mbox" extension,
     but this is neither required nor expected.
  
     Macintosh File Type Code(s): no standard
  
     Person & email address to contact for further information: Eric A.
     Hall (ehall@ntrg.com)
  
     Intended usage: COMMON
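
     As a non-normative illustration of how the registration above
     might be used, the following minimal sketch labels an mbox file
     as application/mbox with the Python standard library "email"
     package. The helper name mbox_part, the file name, and the sample
     data are hypothetical. Base64 is chosen here only because it is
     safe for untagged eight-bit data and for host-specific
     end-of-line sequences; other transfer encodings may be
     appropriate for other files.

```python
from email.mime.application import MIMEApplication


def mbox_part(data, filename="archive.mbox"):
    """Return a MIME part that carries raw mbox bytes as application/mbox."""
    # MIMEApplication applies base64 by default, so the octets of the
    # database are transferred without modification.
    part = MIMEApplication(data, _subtype="mbox")
    part.add_header("Content-Disposition", "attachment", filename=filename)
    return part


# Hypothetical sample containing one untagged eight-bit octet (0xE9).
sample = (
    b"From alice@example.com Sat May  1 12:00:00 2004\n"
    b"Subject: caf\xe9\n"
    b"\n"
    b"hello\n"
)
part = mbox_part(sample)
print(part.get_content_type())
```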
  
  4.      Security Considerations
  
     See the discussion in section 3.
  
  5.      IANA Considerations
  
     After any IESG approval which may be forthcoming, IANA would be
     expected to register the application/mbox media-type, using the
     application provided in section 3 above.
  
  6.      Normative References
  
          [RFC2046]     Freed, N., Borenstein, N., "Multipurpose
                         Internet Mail Extensions (MIME) Part Two:
                         Media Types", RFC 2046, November 1996.
  
          [RFC2048]     Freed, N., Klensin, J., Postel, J.,
                         "Multipurpose Internet Mail Extensions (MIME)
                         Part Four: Registration Procedures", BCP 13,
                         RFC 2048, November 1996.
  
          [RFC2119]     Bradner, S., "Key words for use in RFCs to
                         Indicate Requirement Levels", BCP 14, RFC
                         2119, March 1997.
  
  
  
  
  
  Appendix A.    The mbox Database Format
  
     The mbox database format is not documented by any authoritative
     source, but instead only exists as commonly-understood output from
     historical messaging tools. Partly due to the lack of
     authoritative documentation, the mbox format has been adapted and
     mutated by various utilities over the years, and does not exist in
     a form which is syntactically precise.
  
     In general, mbox files typically contain a sequence of messages,
     each of which begins with a "From_" line (the word "From" followed
     by a space, shown here as an underscore), and which are further
     separated from their neighboring messages by an empty line that
     precedes the next "From_" line. This means that the first message
     in an mbox file will immediately begin with a "From_" line, while
     every other message will begin with a "From_" line that is
     immediately preceded by an empty line.
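
     The separation rule described above can be sketched as follows.
     This is a minimal illustration in Python, not a definitive
     parser: the function name split_mbox and the sample data are
     hypothetical, and the sketch assumes that a "From_" line is only
     a separator when it starts the file or follows an empty line.

```python
def split_mbox(data):
    """Split raw mbox bytes into chunks that each start at a "From " line."""
    messages = []
    current = []
    previous_blank = True  # the first line of the file needs no empty line
    # bytes.splitlines() accepts LF, CR LF, and CR line endings.
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From ") and previous_blank:
            if current:
                messages.append(b"".join(current))
            current = [line]
        else:
            # Any data before the first "From " line ends up in its own chunk.
            current.append(line)
        previous_blank = line.strip(b"\r\n") == b""
    if current:
        messages.append(b"".join(current))
    return messages


# Hypothetical sample that mixes LF and CR LF line endings.
sample = (
    b"From alice@example.com Sat May  1 12:00:00 2004\n"
    b"Subject: one\n"
    b"\n"
    b"first body\n"
    b"From here on, this line is body text, not a separator.\n"
    b"\n"
    b"From bob@example.com Sat May  1 12:05:00 2004\r\n"
    b"Subject: two\r\n"
    b"\r\n"
    b"second body\r\n"
)
messages = split_mbox(sample)
print(len(messages))
```

     Note that the separating empty line is kept at the end of each
     chunk, so joining the chunks reproduces the input. A body line
     that begins with "From " directly after an empty line would still
     be misread by this sketch as the start of a new message.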
  
     The structure of the "From_" lines varies somewhat, but they almost
     always contain the exact character sequence of "From", followed by
     whitespace, followed by an email address of some kind, followed by
     more whitespace, and terminated by a timestamp sequence of some
     kind. Note that the email address may reflect any addressing
     syntax which has ever been used on any system in all of history,
     and the timestamp sequences can also vary according to system
     functions. In most cases, the timestamp is followed by an end-of-
     line signal, but some messaging systems have also been known to
     append additional information after the timestamp.
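
     A tolerant reading of the "From_" line can be sketched as
     follows. This minimal Python illustration makes several
     assumptions that do not hold for every mbox file: the address is
     taken to contain no whitespace, and the timestamp is only decoded
     when it matches the common "Sat May  1 12:00:00 2004" layout with
     English day and month names. The function name parse_from_line
     is hypothetical.

```python
import re
from datetime import datetime


def parse_from_line(line):
    """Return (address, remainder, timestamp or None), or None if no match."""
    text = line.decode("ascii", errors="replace").rstrip("\r\n")
    match = re.match(r"From[ \t]+(\S+)[ \t]+(.+)", text)
    if match is None:
        return None
    address = match.group(1)
    remainder = match.group(2).strip()
    try:
        # Assumed layout; %a and %b depend on the process locale.
        timestamp = datetime.strptime(remainder, "%a %b %d %H:%M:%S %Y")
    except ValueError:
        # Unknown timestamp form, or extra data after the timestamp.
        timestamp = None
    return address, remainder, timestamp


# Hypothetical sample lines.
plain = parse_from_line(b"From alice@example.com Sat May  1 12:00:00 2004\n")
extra = parse_from_line(
    b"From uucp!host!bob Sat May  1 12:05:00 2004 remote from relay\n"
)
print(plain)
print(extra)
```

     Returning the undecoded remainder alongside the optional
     timestamp lets a caller keep whatever a particular system
     appended, rather than discarding lines that do not fit one
     layout.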
  
     The exact format of the "From_" line in use with a particular mbox
     file can often be determined by examining the first line of the
     file itself, which will likely be a "From_" line, and which is
     easy to locate, although implementers are cautioned that multiple
     mbox files may have been joined together, or a single file may
     have been accessed from multiple systems, resulting in different
     "From_" line formats being used within a single file.
  
     Many implementations are also known to escape body lines beginning
     with "From " with a leading Greater-Than symbol (0x3E) so that
     excessively-liberal parsers do not misinterpret these sentences as
     new "From_" lines. However, other implementations are known not to
     escape such lines unless they also appear to contain an email
     address and a timestamp, while others are known to perform
     secondary escapes against text which is already escaped or quoted.
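
     Two of the escaping behaviors described above can be sketched as
     follows. This is a minimal Python illustration with hypothetical
     function names, and it does not reproduce the behavior of any
     particular implementation. The first function escapes only lines
     that begin with "From ", while the second also escapes lines that
     are already quoted, which allows the escaping to be undone.

```python
import re

# Zero or more ">" characters followed by "From " at the start of a line.
QUOTED_FROM = re.compile(rb"^>*From ")


def escape_simple(body_lines):
    """Prefix ">" only to lines that start with "From " (not reversible)."""
    return [b">" + line if line.startswith(b"From ") else line
            for line in body_lines]


def escape_nested(body_lines):
    """Also prefix ">" to lines that are already quoted, such as ">From "."""
    return [b">" + line if QUOTED_FROM.match(line) else line
            for line in body_lines]


def unescape_nested(body_lines):
    """Undo escape_nested by removing exactly one leading ">"."""
    return [
        line[1:] if line.startswith(b">") and QUOTED_FROM.match(line) else line
        for line in body_lines
    ]


# Hypothetical body lines.
body = [b"From the start\n", b">From a quoted reply\n", b"plain text\n"]
print(escape_simple(body))
print(escape_nested(body))
```

     With the simple form, the first two sample lines both come out
     beginning with ">From ", so a reader cannot tell which one was
     escaped. With the nested form, every affected line gains exactly
     one ">" and the original text can be recovered.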
  
  
  
  
     A comprehensive description of mbox database files on UNIX-like
     systems can be found at http://qmail.org./man/man5/mbox.html, and
     should be treated as anecdotally authoritative.
  
  Acknowledgments
  
     Funding for the RFC editor function is currently provided by the
     Internet Society.
  
  
  Author's Address
  
     Eric A. Hall
     ehall@ntrg.com
  
  
  Full Copyright Statement
  
     Copyright (C) The Internet Society (2004). All Rights Reserved.
  
     This document and translations of it may be copied and furnished
     to others, and derivative works that comment on or otherwise
     explain it or assist in its implementation may be prepared,
     copied, published and distributed, in whole or in part, without
     restriction of any kind, provided that the above copyright notice
     and this paragraph are included on all such copies and derivative
     works. However, this document itself may not be modified in any
     way, such as by removing the copyright notice or references to the
     Internet Society or other Internet organizations, except as needed
     for the purpose of developing Internet standards in which case the
     procedures for copyrights defined in the Internet Standards
     process must be followed, or as required to translate it into
     languages other than English.
  
     The limited permissions granted above are perpetual and will not
     be revoked by the Internet Society or its successors or assigns.
  
     This document and the information contained herein is provided on
     an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
     ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
     IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
  
  