Network Working Group                                      Jutta Degener
Internet Draft                                           Philip Guenther
Expires: February 2007                                    Sendmail, Inc.
                                                             August 2006

                 Sieve Email Filtering: Body Extension

Status of this memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-
   Drafts as reference material or to cite them other than as
   "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document defines a new primitive for the "Sieve" email
   filtering language that tests for the occurrence of one or more
   strings in the body of an email message.

Degener & Guenther           Standards Track                    [Page 1]

Internet-Draft   Sieve Email Filtering: Body Extension       August 2006

1. Introduction

   The proposed "body" test checks for the occurrence of one
   or more strings in the body of an email message.
   Such a test was initially discussed for the [SIEVE] base
   document, but was subsequently removed because it was
   thought to be too costly to implement.

   Nevertheless, several server vendors have implemented
   some form of the "body" test.

   This document reintroduces the "body" test as an extension,
   and specifies its syntax and semantics.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

   Conventions for notations are as in [SIEVE] section 1.1, including
   use of the "Usage:" label for the definition of text and tagged
   arguments syntax.

   The capability string associated with the extension defined in
   this document is "body".

3. Test body

   Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
                 <key-list: string-list>

   The body test matches text in the body of an email message, that
   is, anything following the first empty line after the header.
   (The empty line itself, if present, is not considered to be part
   of the body.)

   The COMPARATOR and MATCH-TYPE keyword parameters are defined
   in [SIEVE].  The BODY-TRANSFORM is a keyword parameter
   discussed in section 4, below.

   If a message consists of a header only, not followed by an empty
   line, all "body" tests return false, including that for an empty
   string.

   If a message consists of a header followed only by an empty
   line with no body lines following it, the message is considered
   to have an empty string as a body.
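
   As an illustration of the rules above, the following sketch
   combines the optional arguments and tests for an empty body.  The
   folder names are hypothetical.  Under ":raw", a message whose
   header is followed only by an empty line would be expected to
   satisfy the second test, while a header-only message fails both.

        require ["body", "fileinto"];

        # Case-sensitive substring match; the "i;octet" comparator
        # is defined in [SIEVE].
        if body :raw :comparator "i;octet" :contains "Invoice" {
                fileinto "invoices";
        }

        # True only for a body that is present but empty.
        if body :raw :is "" {
                fileinto "empty";
        }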

4. Body Transform

   Prior to matching text in a message body, "transformations"
   can be applied that filter and decode certain parts of the body.
   These transformations are selected by a "BODY-TRANSFORM"
   keyword parameter.

   Usage: ":raw"
        / ":content" <content-types: string-list>
        / ":text"

   The default transformation is :text.

4.1 Body Transform ":raw"

   The ":raw" transform is intended to match against the undecoded
   body of a message.

   If the specified body-transform is ":raw", the [MIME] structure
   of the body is irrelevant.  The implementation MUST NOT remove
   any transfer encoding from the message, MUST NOT refuse to filter
   messages with syntactic errors (unless the environment it is
   part of rejects them outright), and MUST treat multipart boundaries
   or the MIME headers of enclosed body parts as part of the text
   being matched against instead of MIME structures to interpret.


        require ["body", "reject"];

        # This will match a message containing the literal text
        # "MAKE MONEY FAST" in body parts (ignoring any
        # content-transfer-encodings) or MIME headers other than
        # the outermost RFC 2822 header.

        if body :raw :contains "MAKE MONEY FAST" {
                # Rejection text is illustrative only.
                reject "Unsolicited bulk mail is not accepted here.";
        }

4.2 Body Transform ":content"

   If the body transform is ":content", only MIME parts that have
   the specified content-types are selected for matching.

   If an individual content type begins or ends with a '/' (slash)
   or contains multiple slashes, it matches no content types.
   Otherwise, if it contains a slash, then it specifies a full
   <type>/<subtype> pair, and matches only that specific content
   type.  If it is the empty string, all MIME content types are
   matched.  Otherwise, it specifies a <type> only, and any subtype
   of that type matches it.
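
   The content-type rules above can be sketched with a few tests.
   The key strings and folder names are hypothetical and chosen only
   to show which parts each specification selects.

        require ["body", "fileinto"];

        # "text" names a type only: text/plain, text/html, and any
        # other text subtype are searched.
        if body :content "text" :contains "unsubscribe" {
                fileinto "lists";
        }

        # "text/html" names a full type/subtype pair: only
        # text/html parts are searched.
        if body :content "text/html" :contains "<script" {
                fileinto "suspect";
        }

        # "" selects every MIME content type.
        if body :content "" :contains "confidential" {
                fileinto "review";
        }

        # "/html" begins with a slash and so matches no content
        # type; this test can never be true.
        if body :content "/html" :contains "<script" {
                fileinto "unreachable";
        }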

   The search for MIME parts matching the :content specification
   is recursive and automatically descends into multipart and
   message/rfc822 MIME parts.  All MIME parts with matching types
   are searched for the key strings.  The test returns true if any
   combination of searched MIME part and key-list argument match.

   If the :content specification matches a multipart MIME part,
   only the prologue and epilogue sections of the part will be
   searched for the key strings; the contents of nested parts are
   only searched if their respective types match the :content
   specification.

   If the :content specification matches a message/rfc822 MIME part,
   only the header of the nested message will be searched for the
   key strings; the body parts of the nested message are searched
   only if their content types match the :content specification.

   (Matches against container types with an empty match string can
   be useful as tests for the existence of such parts.)
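
   A minimal sketch of such an existence test, assuming a
   hypothetical "forwarded" folder: the empty key is contained in
   any searched text, so the test would be expected to succeed
   whenever an enclosed message is present.

        require ["body", "fileinto"];

        # True if the message contains a message/rfc822 part.
        if body :content "message/rfc822" :contains "" {
                fileinto "forwarded";
        }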

        From: Whomever
        To: Someone
        Date: Whenever
        Subject: whatever
        Content-Type: multipart/mixed; boundary=outer

     &  This is a multi-part message in MIME format.
        --outer
        Content-Type: multipart/alternative; boundary=inner

     &  This is a nested multi-part message in MIME format.
        --inner
        Content-Type: text/plain; charset="us-ascii"

     $  Hello
        --inner
        Content-Type: text/html; charset="us-ascii"

     %  <html><body>Hello</body></html>
        --inner--
     &  This is the end of the inner MIME multipart.
        --outer
        Content-Type: message/rfc822

     !  From: Someone Else
     !  Subject: hello request

     $  Please say Hello
        --outer--
     &  This is the end of the outer MIME multipart.

   In the above example, the '&', '$', '%', and '!' characters at
   the start of a line are used to illustrate what portions of the
   example message are used in tests:

   - the lines starting with '&' are the ones that are tested when
     a 'body :content "multipart" :contains "MIME"'
     test is executed.

   - the lines starting with '$' are the ones that are tested when
     a 'body :content "text/plain" :contains "Hello"' test is
     executed.

   - the lines starting with '%' are the ones that are tested when
     a 'body :content "text/html" :contains "Hello"' test is executed.

   - the lines starting with '$' or '%' are the ones that are tested
     when a 'body :content "text" :contains "Hello"' test is executed.

   - the lines starting with '!' are the ones that are tested when
     a 'body :content "message/rfc822" :contains "Hello"' test is
     executed.

   Comparisons are performed on octets.  Implementations decode
   the content-transfer-encoding and convert text to [UTF-8] as
   input to the comparator.  MIME parts that cannot be decoded and
   converted MAY be treated as plain US-ASCII, omitted, or processed
   according to local conventions.  A NUL octet (character zero)
   SHOULD NOT cause early termination of the content being compared
   against.  Implementations MUST support the "quoted-printable",
   "base64", "7bit", "8bit", and "binary" content transfer encodings.
   Implementations MUST be capable of converting to UTF-8 the
   US-ASCII, ISO-8859-1, and the US-ASCII subset of
   ISO-8859-* character sets.
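
   To illustrate the decoding requirement, consider a hypothetical
   text/plain part with a base64 content transfer encoding whose
   encoded content is the single line "SGVsbG8=" (the base64 form
   of "Hello").  The sketch below shows how the two transforms
   would be expected to differ; the folder names are hypothetical.

        require ["body", "fileinto"];

        # :content removes the transfer encoding first, so the key
        # is compared against the decoded text "Hello".
        if body :content "text/plain" :contains "Hello" {
                fileinto "decoded-match";
        }

        # :raw sees the octets as transmitted, so only the encoded
        # form can be found in this part.
        if body :raw :contains "SGVsbG8=" {
                fileinto "raw-match";
        }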

   Search expressions MUST NOT match across MIME part boundaries.
   MIME headers of the containing text MUST NOT be included in the
   data.

        require ["body", "fileinto"];

        # Save any message with any text MIME part that contains the
        # words "missile" or "coordinates" in the "secrets" folder.

        if body :content "text" :contains ["missile", "coordinates"] {
                fileinto "secrets";
        }

        # Save any message with an audio/mp3 MIME part in
        # the "jukebox" folder.

        if body :content "audio/mp3" :contains "" {
                fileinto "jukebox";
        }

4.3 Body Transform ":text"

   The ":text" body transform matches against the results of
   an implementation's best effort at extracting UTF-8 encoded
   text from a message.

   In simple implementations, :text MAY be treated the same
   as :content "text".

   Sophisticated implementations MAY strip mark-up from the text
   prior to matching, and MAY convert media types other than text
   to text prior to matching.

   (For example, they may be able to convert proprietary text
   editor formats to text or apply optical character recognition
   algorithms to image data.)

        require ["body", "fileinto"];

        # Save messages mentioning the project schedule in the
        # project/schedule folder.
        if body :text :contains "project schedule" {
                fileinto "project/schedule";
        }

5. Interaction with Other Sieve Extensions

   Any extension that extends the grammar for the COMPARATOR or
   MATCH-TYPE nonterminals will also affect the implementation of
   "body".

   The [REGEX] extension can place a considerable load on a system
   when applied to whole bodies of messages, especially when
   implemented naively or used maliciously.

   Regular and wildcard expressions used with "body" are exempt
   from the side effects described in [VARIABLES].  That is, they
   MUST NOT set match variables (${1}, ${2}...) to the input values
   corresponding to wildcard sequences in the matched pattern.
   However, if the "variables" extension is present, variable
   references in the key strings or content type strings are
   evaluated as described in [VARIABLES].
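
   A short sketch of this interaction, assuming the "variables"
   extension is available; the variable name, key strings, and
   folder names are hypothetical.

        require ["body", "variables", "fileinto"];

        set "project" "apollo";

        # ${project} is expanded in the key string and in the
        # folder name before the test is evaluated.
        if body :text :contains "${project} schedule" {
                fileinto "projects/${project}";
        }

        # A successful wildcard match in "body" leaves ${1} and
        # ${2} unchanged.
        if body :text :matches "*order number *" {
                fileinto "orders";
        }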

6.  IANA Considerations

    The following template specifies the IANA registration of the Sieve
    extension specified in this document:

    Subject: Registration of new Sieve extension

    Capability name: body
    Description:     adds the 'body' test for matching against the
                     body of the message being processed
    RFC number:      this RFC
    Contact Address: Jutta Degener <>

    This information should be added to the list of sieve extensions
    given on

7. Security Considerations

   The system MUST be sized and restricted in such a manner that
   even malicious use of body matching does not deny service to
   other users of the host system.

   Filters relying on string matches in the raw body of an email
   message may be more general than intended.  Text matches are no
   replacement for a spam, virus, or other security related
   filtering system.

8. Acknowledgments

   This document has been revised in part based on comments and
   discussions that took place on and off the SIEVE mailing list.
   Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
   Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison,
   Greg Shapiro, Tim Showalter, Nigel Swinson, and Dowson Tong for
   reviews and suggestions.

9. Authors' Addresses

   Jutta Degener
   5245 College Ave, Suite #127
   Oakland, CA 94618


   Philip Guenther
   Sendmail, Inc.
   6425 Christie Ave, 4th Floor
   Emeryville, CA 94608


10. Discussion

   This section will be removed when this document leaves the
   Internet-Draft stage.

   This draft is intended as an extension to the Sieve mail filtering
   language.  Sieve extensions are discussed on the MTA Filters mailing
   list at <>.  Subscription requests can
   be sent to <> (send an email
   message with the word "subscribe" in the body).

   More information on the mailing list along with a WWW archive of
   back messages is available at <>.

10.1 Changes from draft-ietf-sieve-body-03.txt

   Update IANA registration to match 3028bis.

   Added direct boilerplate for [KEYWORDS].

10.2 Changes from draft-ietf-sieve-body-02.txt

   Updated charset conversion to match draft-ietf-sieve-3028bis-06.txt.

   Change "Syntax:" to "Usage:".

   Updated references.

10.3 Changes from draft-ietf-sieve-body-01.txt

   Updated charset conversion requirements to match those in
   draft-ietf-sieve-3028bis-03.txt for headers.

10.4 Changes from draft-ietf-sieve-body-00.txt

   Updated IPR boilerplate to RFC 3978/3979.

   Many prose corrections in response to WGLC comments.  Of particular
   note:
     - made clear that :raw treats MIME boundaries and headers as
       text to be matched against
     - corrected description in comment of :raw example
     - clarified the interpretation of invalid content-types in
       :content
     - gave precise description of what gets matched when :content
       is used with message/rfc822 or any multipart type, as well
       as a comprehensive example
     - include an example of :text
     - tightened wording of interaction with [VARIABLES]
     - added informative reference to [REGEX]

10.5 Changes from draft-degener-sieve-body-04.txt

   Renamed to draft-ietf-sieve-body-00.txt; tweaked the title and

   Added Philip Guenther as co-author.

   Split references into normative and informative.  Updated [UTF-8]
   and [VARIABLES] references.

   Updated IPR boilerplate.

10.6 Changes from draft-degener-sieve-body-03.txt

   Made "body" exempt from variable-setting side effects in the
   presence of the "variables" extension and wild cards.  It's too
   hard to implement.

   Removed :binary.  It's uglier and less useful than it needs to be
   to bother.

   Added IANA section.

Appendix A.  Normative References

   [KEYWORDS]   Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", RFC 2119, March 1997.

   [MIME]       Freed, N. and N. Borenstein, "Multipurpose Internet Mail
                Extensions (MIME) Part One: Format of Internet Message
                Bodies", RFC 2045, November 1996.

   [SIEVE]      Guenther, P. and T. Showalter, "Sieve: A Mail Filtering
                Language", draft-ietf-sieve-3028bis-06, March 2006.

   [UTF-8]      Yergeau, F., "UTF-8, a transformation format of ISO
                10646", RFC 3629, November 2003.

Appendix B.  Informative References

   [REGEX]     Murchison, K., "Sieve Email Filtering -- Regular
               Expression Extension",
               draft-ietf-sieve-regex-00.txt, February 2006.

   [VARIABLES] Homme, K.T., "Sieve Extension: Variables",
               draft-ietf-sieve-variables-07.txt, October 2005.

Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is
   subject to the rights, licenses and restrictions contained in
   BCP 78, and except as set forth therein, the authors retain all
   their rights.

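   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.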

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository

   The IETF invites any interested party to bring to its attention
   any copyrights, patents or patent applications, or other
   proprietary rights that may cover technology that may be required
   to implement this standard.  Please address the information to the
   IETF at


   Funding for the RFC Editor function is currently provided by
   the Internet Society.
