Internet Draft: The TEXT/PLAIN FORMAT Parameter       R. Gellens, Editor
Document: draft-gellens-format-02.txt                           Qualcomm
Expires: 17 May 1999                                    17 November 1998


                    The TEXT/PLAIN FORMAT Parameter


Status of this Memo:

    This document is an Internet Draft.  Internet Drafts are working
    documents of the Internet Engineering Task Force (IETF), its Areas,
    and its Working Groups.  Note that other groups may also distribute
    working documents as Internet Drafts.

    Internet Drafts are draft documents valid for a maximum of six
    months.  Internet Drafts may be updated, replaced, or obsoleted by
    other documents at any time.  It is not appropriate to use Internet
    Drafts as reference material or to cite them other than as a
    "working draft" or "work in progress."

    To learn the current status of any Internet Draft, please check the
    "1id-abstracts.txt" listing contained in the Internet Drafts shadow
    directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
    munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or
    ftp.isi.edu (US West Coast).


    A version of this draft document is intended for submission to the
    RFC editor as a Proposed Standard for the Internet Community.
    Discussion and suggestions for improvement are requested.

Comments:

    Private comments should be sent to the author.  Public comments may
    be sent to the IETF 822 mailing list, <ietf-822@imc.org>.  To
    subscribe, send a message to <ietf-822-request@imc.org> with the
    word SUBSCRIBE as the body of the message.  Archives for the list
    are at <http://www.imc.org/ietf-822/>.


Copyright Notice

    Copyright (C) The Internet Society 1998.  All Rights Reserved.











Gellens                    [Page 1]                    Expires May 1999

Internet Draft            The FORMAT Parameter            November 1998

Table of Contents

     1.  Changes in this Version
     2.  Introduction
     3.  Conventions Used in this Document
     4.  The Problem
       4.1.  Paragraph Text
       4.2.  Embarrassing Line Wrap
       4.3.  New Media Types
     5.  The FORMAT Parameter to the TEXT/PLAIN Media Type
       5.1.  Generating Format=Flowed
       5.2.  Usenet Signature Convention
       5.3.  Quoting
       5.4.  Digital Signatures and Encryption
       5.5.  Line Analysis Table
       5.6.  Examples
     6.  ABNF
     7.  Failure Modes
       7.1.  Trailing White Space Corruption
     8.  Security Considerations
     9.  Acknowledgments
    10.  References
    11.  Editor's Address
    12.  Full Copyright Statement


1.  Changes in this Version
        -  Reworded Q-P prohibition again.
        -  Added section on digital signatures and encryption.


2.  Introduction

    Interoperability problems have been observed with erroneous
    labelling of paragraph text as TEXT/PLAIN, and with various forms of
    'embarrassing line wrap.' (See section 4.)

    Attempts to deploy new media types, such as TEXT/ENRICHED [RICH] and
    TEXT/HTML [HTML], have suffered from a lack of backwards
    compatibility and an often hostile user reaction at the receiving
    end.

    What is desired is a format which is in all significant ways
    TEXT/PLAIN, and therefore is quite suitable for display as
    TEXT/PLAIN, and yet allows the sender to express to the receiver
    which lines can be considered a logical paragraph, and thus flowed
    (wrapped and joined) as appropriate.

    This memo proposes a new parameter to be used with TEXT/PLAIN, and,
    in the presence of this parameter, the use of trailing whitespace to
    indicate flowed lines.  This results in an encoding which appears as
    normal TEXT/PLAIN in older implementations, since it is in fact
    normal TEXT/PLAIN.


3.  Conventions Used in this Document

    The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD
    NOT", and "MAY" in this document are to be interpreted as described
    in "Key words for use in RFCs to Indicate Requirement Levels"
    [KEYWORDS].


4.  The Problem

    The TEXT/PLAIN media type is the lowest common denominator of
    Internet email, with lines of no more than 998 characters (by
    convention usually no more than 80), and where the CRLF sequence
    represents a line break [MIME-IMT].

    TEXT/PLAIN is usually displayed as preformatted text, often in a
    fixed font.  That is, the characters start at the left margin of the
    display window, and advance to the right until a CRLF sequence is
    seen, at which point a new line is started, again at the left
    margin.  When a line length exceeds the display window, some clients
    will wrap the line, while others invoke a horizontal scroll bar.

    Some interoperability problems have been observed with this media
    type:

4.1.  Paragraph Text

    Many modern programs use a proportional-spaced font and CRLF to
    represent paragraph breaks.  Line breaks are "soft", occurring as
    needed on display.  That is, characters are grouped into a paragraph
    until a CRLF sequence is seen, at which point a new paragraph is
    started.  Each paragraph is displayed, starting at the left margin
    (or paragraph indent), and continuing to the right until a word is
    encountered which does not fit in the remaining display width.  The
    display shifts to the next line, starting with the word which would
    not fit on the previous line.  This continues until the paragraph
    ends (a CRLF is seen).  Extra vertical space is left between
    paragraphs.

    Numerous software products erroneously label this paragraph-form
    text as TEXT/PLAIN, resulting in much user discomfort.

4.2.  Embarrassing Line Wrap

    As TEXT/PLAIN messages get quoted in replies or forwarded, the
    length of each line gradually increases, resulting in "embarrassing
    line wrap." This results in text which is at best hard to read, and
    often confuses attributions.

    In addition, as devices with display widths smaller than 80
    characters become more popular, embarrassing line wrap has become
    even more prevalent, even with unquoted text.

4.3.  New Media Types

    Attempts to deploy new media types, such as TEXT/ENRICHED [RICH] and
    TEXT/HTML [HTML], have suffered from a lack of backwards
    compatibility and an often hostile user reaction at the receiving
    end.

    In particular, TEXT/ENRICHED requires that open angle brackets ("<")
    and hard line breaks be doubled, with resulting user unhappiness
    when viewed as TEXT/PLAIN.  TEXT/HTML requires even more alteration
    of text, with a corresponding increase in user complaints.

    A proposal to define a new media type to explicitly represent the
    paragraph form suffered from a lack of interoperability with
    currently deployed software.  Some programs treat unknown subtypes
    of TEXT as an attachment.

    What is desired is a format which is in all significant ways
    TEXT/PLAIN, and therefore is quite suitable for display as
    TEXT/PLAIN, and yet allows the sender to express to the receiver
    which lines can be considered a logical paragraph, and thus flowed
    (wrapped and joined) as appropriate.


5.  The FORMAT Parameter to the TEXT/PLAIN Media Type

    This document defines a new MIME parameter for use with TEXT/PLAIN:

        Name:  Format
        Value:  Fixed, Flowed

    (Neither the parameter name nor its value is case sensitive.)

    If not specified, a value of Fixed is assumed.  The semantics of the
    Fixed value are those usually associated with TEXT/PLAIN [MIME-IMT].

    A value of Flowed indicates that any line which ends in exactly one
    space MAY be treated as a "flowed" line.  A series of one or more
    such lines, together with the fixed line which follows them, is
    considered a paragraph, and MAY be flowed (wrapped and joined) as
    appropriate on display and in the construction of new messages (see
    section 5.3).

    A line consisting of exactly one space is considered a flowed line.
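
    As an illustration only, a receiving agent might read the Format
    parameter as sketched below in Python.  The helper name text_format
    is hypothetical, and treating an unrecognized value as Fixed is an
    assumption of this sketch rather than a requirement of this memo.
    The calls used (message_from_string, get_content_type, get_param)
    come from the Python standard "email" package.

```python
from email import message_from_string


def text_format(msg):
    """Return 'flowed' or 'fixed' for a message or body part.

    The parameter name is matched without regard to case by get_param();
    the value is lower-cased here because it is not case sensitive.
    """
    if msg.get_content_type() != "text/plain":
        return "fixed"
    value = msg.get_param("format", "fixed")   # absent parameter: Fixed
    value = str(value).strip().lower()
    # Assumption of this sketch: unknown values fall back to Fixed.
    return "flowed" if value == "flowed" else "fixed"


sample = message_from_string(
    "Content-Type: Text/Plain; charset=us-ascii; Format=Flowed\n"
    "\n"
    "A flowed line \n"
    "and its fixed line.\n"
)
print(text_format(sample))
```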

    Because flowed lines are all-but-indistinguishable from fixed lines,
    currently deployed software will treat flowed lines as normal
    TEXT/PLAIN (which is what they are).  Thus, no interoperability
    problems are expected.


5.1.  Generating Format=Flowed

    When generating Format=Flowed text, lines SHOULD be shorter than 80
    characters.  As suggested values, any paragraph longer than 79
    characters in total length could be wrapped using lines of 72 or
    fewer characters.  While the specific line length used is a matter
    of aesthetics and preference, longer lines are more likely to
    require rewrapping.  It has been suggested that 66 character lines
    are the most readable.

    When creating flowed text, the generating agent wraps, that is,
    inserts 'soft' line breaks (SPACE CRLF sequences) as needed.  Soft
    line breaks are added between words.

    A generating agent SHOULD:
        1.  Ensure all lines (fixed and flowed) are less than 80
            characters in length, not counting the CRLF.
        2.  Trim spaces before user-inserted hard line breaks.

    A generating agent SHOULD NOT:
        1.  Wrap immediately before a close angle-bracket (">").
        2.  Generate multiple spaces before the CRLF on flowed lines.
        3.  Wrap immediately before "From ".
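
    The rules above could be sketched roughly as follows.  This is a
    minimal Python illustration, not a reference implementation:  the
    function name encode_paragraph and its width and limit arguments
    are hypothetical, runs of white space inside the paragraph are
    assumed to collapse to one space, and quoting is not handled.  When
    the next word would begin a line with ">" or "From ", the sketch
    keeps that word on the current line, which can occasionally make
    the line longer than the requested width.

```python
def encode_paragraph(text, width=72, limit=79):
    """Encode one logical paragraph as Format=Flowed lines (no CRLF).

    Every returned line except the last ends in exactly one space (a
    soft line break); the last line is fixed and has no trailing space.
    """
    words = text.split()              # assumption: white space collapses
    if not words:
        return [""]                   # empty paragraph: one empty fixed line
    joined = " ".join(words)
    if len(joined) <= limit:
        return [joined]               # short paragraphs are not wrapped

    lines, current = [], words[0]
    for i in range(1, len(words)):
        word = words[i]
        # Starting a new line with this word would begin it with ">"
        # or with "From " (the latter only if another word follows).
        risky = word.startswith(">") or (word == "From" and i + 1 < len(words))
        fits = len(current) + 1 + len(word) + 1 <= width
        if fits or risky:
            current += " " + word
        else:
            lines.append(current + " ")   # soft break: SPACE before CRLF
            current = word
    lines.append(current)                 # final line of the paragraph is fixed
    return lines
```

    An agent would then emit each returned line followed by CRLF.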

    A Format=Flowed message consists of zero or more paragraphs, each
    containing zero or more flowed lines, and one or more fixed lines.
    The usual case is a series of paragraphs with blank (empty) lines
    between them.

    [Quoted-Printable] encoding SHOULD only be used with Format=Flowed
    when absolutely necessary (for example, non-US-ASCII 8-bit
    characters over a strictly 7-bit transport such as unextended SMTP).
    In particular, Quoted-Printable SHOULD NOT be used solely to protect
    the trailing space.  Since gateways which strip trailing spaces have
    become less common than user agents which fail to correctly decode
    Quoted-Printable in all cases (for example, view, reply and save),
    the safer course is to not protect the trailing spaces unless the
    body part is cryptographically signed (see Section 5.4).

    The intent of Format=Flowed is to allow user agents to generate
    flowed text which is non-obnoxious when viewed as pure Text/Plain;
    use of Quoted-Printable hinders this and may cause Format=Flowed to
    be rejected by end users.


5.2.  Usenet Signature Convention

    There is a convention in Usenet news of using "-- " as the separator
    line between the body and the signature of a message.  When
    generating a Format=Flowed message containing a Usenet-style
    separator before the signature, the separator line is sent as-is.
    This is a special case; an (optionally quoted) line consisting of
    DASH DASH SPACE is not considered flowed.


5.3.  Quoting

    In Format=Flowed, the canonical quote indicator is one or more close
    angle bracket (">") characters followed by one space. (The space is
    required because some systems alter in-transit messages to insert a
    close angle-bracket before any line which starts with "From ".)
    Lines which start with the quote indicator are considered quoted.
    Flowed lines which are also quoted may require special handling on
    display and when copied to new messages.

    When creating quoted flowed lines, each such line starts with the
    quote indicator.

    When generating quoted flowed lines, an agent needs to pay attention
    to changes in quote level (depth).  A sequence of quoted lines of
    the same quote depth SHOULD be encoded as a paragraph, with the last
    line generated as fixed and prior lines generated as flowed.

    If a receiving agent wishes to reformat flowed quoted lines (joining
    and/or wrapping them) on display or when generating new messages,
    the lines SHOULD be de-quoted, reformatted, and then re-quoted.  To
    de-quote, the number of close angle brackets in the quote indicator
    at the start of each line is counted.  Consecutive lines with the
    same quoting depth are considered one logical entity and are
    reformatted together.  To re-quote after reformatting, a quote
    indicator containing the same number of close angle brackets
    originally present is prefixed to each line.

    On reception, if a change in quoting depth occurs on a flowed line,
    there are two possible interpretations.  One, the 'quote-depth wins'
    rule, would be to ignore the flowed indicator and treat the line as
    fixed.  The other, 'flowed wins', would be to pay attention to the
    flowed indicator and treat the following line as non-quoted, or as
    quoted but at the same depth (ignoring the change in quote level).

    For example, consider the following sequence of lines (using '*' to
    indicate a soft line break, and '#' to indicate a hard line break):

       > Thou villainous ill-breeding spongy dizzy-eyed*
       > reeky elf-skinned pigeon-egg!*     <--- problem ---<
       >> Thou artless swag-bellied milk-livered*
       >> dismal-dreaming idle-headed scut!#
       >>> Thou errant folly-fallen spleeny reeling-ripe*
       >>> unmuzzled ratsbane!#
       >>>> Henceforth, the coding style is to be strictly*
       >>>> enforced, including the use of only upper case.#
       >>>>> I've noticed a lack of adherence to the coding*
       >>>>> styles, of late.#
       >>>>>> Any complaints?#


    The second line ends in a soft line break, even though it is the
    last line of the one-deep quote block.  The question then arises as
    to how this line should be interpreted, considering that the next
    line is the first line of the two-deep quote block.

    The two approaches can be classified as 'quote-depth wins' or
    'flowed wins'.  In the former, the change in quote depth overrides
    the soft line break; in the latter, the soft line break is
    unconditionally obeyed (either ignoring the quote altogether or
    ignoring the change in quote depth).

    The example text above, when processed according to quote-depth
    wins, results in the first two lines being considered as one quoted,
    flowed section, with a quote depth of 1; the third and fourth lines
    become a quoted, flowed section, with a quote depth of 2.

    To implement flowed wins, a receiving agent always obeys the flowed
    indicator.  Quote depth is still important for operations such as
    displaying excerpt bars, generating replies, etc.  When flowed wins
    is used on the example text above, the second line either becomes an
    unquoted, flowed line; a quoted, flowed line with a quote depth
    different from other lines in its section; or a quoted, flowed line
    with an implied quote depth of the other lines in its section.

    A generating agent SHOULD NOT create this situation; a receiving
    agent SHOULD handle it using quote-depth wins.
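
    One way a receiving agent might de-quote, reflow, and re-quote
    using quote-depth wins is sketched below in Python.  The names
    split_line, paragraphs, and reflow are hypothetical, the default
    display width is arbitrary, and the standard textwrap module stands
    in for whatever wrapping the display actually performs.  Lines are
    assumed to have had their CRLF removed already.

```python
import textwrap


def split_line(line):
    """Return (depth, body, flowed) for one line with its CRLF removed."""
    depth = len(line) - len(line.lstrip(">"))
    if depth and line[depth:depth + 1] == " ":
        body = line[depth + 1:]
    else:
        depth, body = 0, line     # no canonical quote indicator: unquoted
    # Flowed means exactly one trailing space; "-- " is never flowed.
    flowed = (body != "-- " and line.endswith(" ")
              and not line.endswith("  "))
    return depth, body, flowed


def paragraphs(lines):
    """Group lines into (depth, text) pairs using quote-depth wins."""
    result, depth, parts = [], None, []
    for line in lines:
        d, body, flowed = split_line(line)
        if parts and d != depth:
            # The quote depth changed after a flowed line: the change
            # overrides the soft break, so the pending paragraph ends.
            result.append((depth, "".join(parts).rstrip(" ")))
            parts = []
        depth = d
        parts.append(body)
        if not flowed:
            result.append((depth, "".join(parts)))
            parts = []
    if parts:                     # input ended on a flowed line
        result.append((depth, "".join(parts).rstrip(" ")))
    return result


def reflow(lines, width=40):
    """De-quote, rewrap to the display width, then re-quote."""
    out = []
    for depth, text in paragraphs(lines):
        prefix = ">" * depth + " " if depth else ""
        room = max(width - len(prefix), 1)
        wrapped = textwrap.wrap(text, room, break_long_words=False,
                                break_on_hyphens=False) or [""]
        out.extend(prefix + piece for piece in wrapped)
    return out
```

    Applied to the first four lines of the example above, paragraphs()
    yields one paragraph at depth 1 and one at depth 2, even though the
    second line ends in a soft line break.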


5.4.  Digital Signatures and Encryption

    If a message is digitally signed or encrypted, and is natively in
    paragraph form, it is important that cryptographic processing use
    the on-the-wire Format=Flowed format.  That is, during generation
    the message SHOULD be prepared for transmission, including addition
    of soft line breaks, before being digitally signed or encrypted;
    similarly, on receipt the message SHOULD have the signature verified
    or be decrypted before removal of soft line breaks and reflowing.
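
    The ordering described above can be illustrated with a short Python
    sketch.  It is only an illustration:  a keyed hash from the
    standard hmac module stands in for a real digital signature, and
    the key, the helper names sign and unflow, and the sample text are
    all hypothetical.  The unflow helper ignores quoting and the
    signature separator for brevity.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical shared secret, for illustration only


def sign(wire_bytes):
    """Stand-in for a real signature over the on-the-wire bytes."""
    return hmac.new(KEY, wire_bytes, hashlib.sha256).hexdigest()


def unflow(wire_text):
    """Join flowed lines into paragraphs by removing soft line breaks."""
    lines = wire_text.split("\r\n")
    if lines and lines[-1] == "":
        lines.pop()                       # the text ended with a CRLF
    result, pending = [], ""
    for line in lines:
        pending += line
        if not (line.endswith(" ") and not line.endswith("  ")):
            result.append(pending)        # a fixed line ends the paragraph
            pending = ""
    if pending:
        result.append(pending)
    return result


# Sender: add soft line breaks first, then sign exactly those bytes.
wire = ("`Take some more tea,' the March Hare said to Alice, very \r\n"
        "earnestly.\r\n")
signature = sign(wire.encode("ascii"))

# Receiver: verify the bytes as received, and only then reflow.
verified = hmac.compare_digest(signature, sign(wire.encode("ascii")))
display = unflow(wire) if verified else []
```

    Had the receiver removed the soft line breaks first, the bytes
    checked would no longer match the bytes that were signed.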


5.5.  Line Analysis Table

    Lines contained in a Text/Plain body part with Format=Flowed can be
    analyzed by examining the start and end of the line.  If the line
    starts with the quote indicator, it is quoted.  If the line ends
    with exactly one space character, it is flowed.  This is summarized
    by the following table:

        Starts          Ends in
        with            Exactly            Line
        Quote           One Space          Type
        ------          ---------          ---------------
        no              no                 unquoted, fixed
        yes             no                 quoted,   fixed
        no              yes                unquoted, flowed
        yes             yes                quoted,   flowed
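
    The table can be expressed as a small function.  The following
    Python sketch is illustrative only; the name classify is
    hypothetical, lines are assumed to have had their CRLF removed, and
    the signature separator of section 5.2 is treated as the one
    exception to the trailing-space test.

```python
def classify(line):
    """Return (quoted, flowed) for one line with its CRLF removed."""
    # Quoted: one or more ">" characters followed by one space.
    depth = len(line) - len(line.lstrip(">"))
    quoted = depth > 0 and line[depth:depth + 1] == " "
    body = line[depth + 1:] if quoted else line
    if body == "-- ":
        return quoted, False      # signature separator is never flowed
    # Flowed: the line ends in exactly one space.
    flowed = line.endswith(" ") and not line.endswith("  ")
    return quoted, flowed
```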


5.6.  Examples

    The following example contains three paragraphs:

       `Take some more tea,' the March Hare said to Alice, very
       earnestly.

       `I've had nothing yet,' Alice replied in an offended tone, `so I
       can't take more.'

       `You mean you can't take LESS,' said the Hatter: `it's very easy
       to take MORE than nothing.'

    This could be encoded as follows (using '*' to indicate a soft line
    break, that is, SPACE CRLF sequence, and '#' to indicate a hard line
    break, that is, CRLF):

       `Take some more tea,' the March Hare said to Alice, very*
       earnestly.#
       #
       `I've had nothing yet,' Alice replied in an offended tone, `so*
       I can't take more.'#
       #
       `You mean you can't take LESS,' said the Hatter: `it's very*
       easy to take MORE than nothing.'#


    Here we have the same exchange, in quoted form:

       >>> Take some more tea.#
       >> I've had nothing yet, so I can't take more.#
       > You mean you can't take LESS, it's very easy to take*
       > MORE than nothing.#


6.  ABNF

    The constructs used in Format=Flowed are described using [ABNF]:

        paragraph     = *flowed-line fixed-line
        fixed-line    = fixed / sig-sep
        fixed         = [quote] *CHAR (non-sp / 2*SP) CRLF
        flowed-line   = flow-qt / flow-unqt
        flow-qt       = quote [non-empty SP] CRLF
        flow-unqt     = [non-empty] SP CRLF
        non-empty     = *CHAR non-sp
        non-sp        = %x01-1F / %x21-7F ; any 7-bit except null or SP
        quote         = 1*">" SP
        sig-sep       = [quote] "--" SP CRLF


7.  Failure Modes

7.1.  Trailing White Space Corruption

    There are systems in existence which alter trailing whitespace on
    messages which pass through them.  Such systems may strip, or in
    rarer cases, add trailing whitespace, in violation of RFC 821 [SMTP]
    section 4.5.2.

    Stripping trailing whitespace has the effect of converting flowed
    lines to fixed lines, which results in a message no worse than if
    Format=Flowed had not been used.

    Adding trailing whitespace most often has no effect or merely
    converts flowed lines to fixed, but if exactly one trailing space is
    added to one or more lines of a message which uses the Format=Flowed
    parameter, the effect may be a corrupted display or reply.  Since
    most systems which add trailing white space do so to create a line
    which fills an internal record format, the result is almost always a
    line which contains an even number of characters (counting the added
    trailing white space).

    One possible avoidance, therefore, would be to define Format=Flowed
    lines to use either one or two trailing space characters to indicate
    a flowed line, such that the total line length is odd.  However,
    considering the scarcity of such systems today, it is not worth the
    added complexity.
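
    The two failure modes, and the odd-length alternative that this
    memo chooses not to adopt, can be illustrated with a few lines of
    Python.  The gateway behaviours below are simplified models, and
    all function names are hypothetical.

```python
def is_flowed(line):
    """A line (CRLF removed) is flowed if it ends in exactly one space."""
    return line.endswith(" ") and not line.endswith("  ")


def strip_trailing(lines):
    """Model of a gateway which strips trailing white space."""
    return [line.rstrip(" ") for line in lines]


def pad_to_even(lines):
    """Model of a gateway which pads each line to an even length."""
    return [line + " " if len(line) % 2 else line for line in lines]


def soft_break_odd(text):
    """The alternative not adopted: one or two trailing spaces, chosen
    so that the total line length is odd."""
    return text + (" " if len(text) % 2 == 0 else "  ")


message = ["The quick brown fox ", "jumps.", "odd"]

# Stripping turns every flowed line into a fixed line.
print([is_flowed(line) for line in strip_trailing(message)])

# Padding can turn a fixed line of odd length into a flowed line.
print([is_flowed(line) for line in pad_to_even(message)])
```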


8.  Security Considerations

    This parameter introduces no security considerations beyond those
    which apply to text/plain.

    Section 5.4 discusses the interaction between Format=Flowed and
    digital signatures or encryption.


9.  Acknowledgments

    This proposal evolved from a discussion of Chris Newman's
    TEXT/PARAGRAPH draft which took place on the IETF 822 mailing list.
    Steve Dorner and Laurence Lundblade, among others, contributed
    heavily.


10.  References

    [ABNF] Crocker, Overell, "Augmented BNF for Syntax Specifications:
    ABNF", RFC 2234, Internet Mail Consortium, Demon Internet Ltd.,
    November 1997.

    [KEYWORDS] Bradner, "Key words for use in RFCs to Indicate
    Requirement Levels", RFC 2119, Harvard University, March 1997.

    [RICH] Resnick, Walker, "The text/enriched MIME Content-type", RFC
    1896, QUALCOMM, InterCon, February 1996.

    [MIME-IMT] Freed, Borenstein, "Multipurpose Internet Mail Extensions
    (MIME) Part Two:  Media Types", RFC 2046, Innosoft, First Virtual,
    November 1996.

    [Quoted-Printable] Freed, Borenstein, "Multipurpose Internet Mail
    Extensions (MIME) Part One:  Format of Internet Message Bodies", RFC
    2045, Innosoft, First Virtual, November 1996.

    [SMTP] Postel, "Simple Mail Transfer Protocol", RFC 821, Information
    Sciences Institute, August 1982.


11.  Editor's Address

    Randall Gellens                    +1 619 651 5115
    QUALCOMM Incorporated              randy@qualcomm.com
    6455 Lusk Blvd.
    San Diego, CA  92121-2779
    USA


12.  Full Copyright Statement

    Copyright (C) The Internet Society 1998.  All Rights Reserved.

    This document and translations of it may be copied and furnished to
    others, and derivative works that comment on or otherwise explain it
    or assist in its implementation may be prepared, copied, published
    and distributed, in whole or in part, without restriction of any
    kind, provided that the above copyright notice and this paragraph
    are included on all such copies and derivative works.  However, this
    document itself may not be modified in any way, such as by removing
    the copyright notice or references to the Internet Society or other
    Internet organizations, except as needed for the purpose of
    developing Internet standards in which case the procedures for
    copyrights defined in the Internet Standards process must be
    followed, or as required to translate it into languages other than
    English.

    The limited permissions granted above are perpetual and will not be
    revoked by the Internet Society or its successors or assigns.

    This document and the information contained herein is provided on an
    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.