MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text
RFC 2047
Network Working Group K. Moore
Request for Comments: 2047 University of Tennessee
Obsoletes: 1521, 1522, 1590 November 1996
Category: Standards Track
MIME (Multipurpose Internet Mail Extensions) Part Three:
Message Header Extensions for Non-ASCII Text
Status of this Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Abstract
STD 11, RFC 822, defines a message representation protocol specifying
considerable detail about US-ASCII message headers, and leaves the
message content, or message body, as flat US-ASCII text. This set of
documents, collectively called the Multipurpose Internet Mail
Extensions, or MIME, redefines the format of messages to allow for
(1) textual message bodies in character sets other than US-ASCII,
(2) an extensible set of different formats for non-textual message
bodies,
(3) multi-part message bodies, and
(4) textual header information in character sets other than US-ASCII.
These documents are based on earlier work documented in RFC 934, STD
11, and RFC 1049, but extends and revises them. Because RFC 822 said
so little about message bodies, these documents are largely
orthogonal to (rather than a revision of) RFC 822.
This particular document is the third document in the series. It
describes extensions to RFC 822 to allow non-US-ASCII text data in
Internet mail header fields.
Moore Standards Track [Page 1]
RFC 2047 Message Header Extensions November 1996
Other documents in this series include:
+ RFC 2045, which specifies the various headers used to describe
the structure of MIME messages.
+ RFC 2046, which defines the general structure of the MIME media
typing system and defines an initial set of media types,
+ RFC 2048, which specifies various IANA registration procedures
for MIME-related facilities, and
+ RFC 2049, which describes MIME conformance criteria and
provides some illustrative examples of MIME message formats,
acknowledgements, and the bibliography.
These documents are revisions of RFCs 1521, 1522, and 1590, which
themselves were revisions of RFCs 1341 and 1342. An appendix in RFC
2049 describes differences and changes from previous versions.
1. Introduction
RFC 2045 describes a mechanism for denoting textual body parts which
are coded in various character sets, as well as methods for encoding
such body parts as sequences of printable US-ASCII characters. This
memo describes similar techniques to allow the encoding of non-ASCII
text in various portions of a RFC 822 [2] message header, in a manner
which is unlikely to confuse existing message handling software.
Like the encoding techniques described in RFC 2045, the techniques
outlined here were designed to allow the use of non-ASCII characters
in message headers in a way which is unlikely to be disturbed by the
quirks of existing Internet mail handling programs. In particular,
some mail relaying programs are known to (a) delete some message
header fields while retaining others, (b) rearrange the order of
addresses in To or Cc fields, (c) rearrange the (vertical) order of
header fields, and/or (d) "wrap" message headers at different places
than those in the original message. In addition, some mail reading
programs are known to have difficulty correctly parsing message
headers which, while legal according to RFC 822, make use of
backslash-quoting to "hide" special characters such as "<", ",", or
":", or which exploit other infrequently-used features of that
specification.
While it is unfortunate that these programs do not correctly
interpret RFC 822 headers, to "break" these programs would cause
severe operational problems for the Internet mail system. The
extensions described in this memo therefore do not rely on little-
used features of RFC 822.
Moore Standards Track [Page 2]
RFC 2047 Message Header Extensions November 1996
Instead, certain sequences of "ordinary" printable ASCII characters
(known as "encoded-words") are reserved for use as encoded data. The
syntax of encoded-words is such that they are unlikely to
"accidentally" appear as normal text in message headers.
Furthermore, the characters used in encoded-words are restricted to
those which do not have special meanings in the context in which the
encoded-word appears.
Generally, an "encoded-word" is a sequence of printable ASCII
Show full document text