APPSAWG M. Kucherawy
Internet-Draft Cloudmark, Inc.
Intended status: BCP February 29, 2012
Expires: September 1, 2012
Best Current Practices for Handling of Malformed Messages
draft-ietf-appsawg-malformed-mail-01
Abstract
The email ecosystem has long had a very permissive set of common
processing rules in place, despite increasingly rigid standards
governing its components, ostensibly to improve the user experience.
The handling of these come at some cost, and various components are
faced with decisions about whether or not to permit non-conforming
messages to continue toward their destinations unaltered, adjust them
to conform (possibly at the cost of losing some of the original
message), or outright rejecting them.
This memo includes a collection of the best current practices in a
variety of such situations, to be used as implementation guidance.
It must be emphasized, however, that the intent of this memo is not
to standardize malformations or otherwise encourage their
proliferation. The messages that are the subject of this memo are
manifestly malformed, and the code and culture that generates them
needs to be fixed. Nevertheless, many malformed messages from
otherwise legitimate senders are in circulation and will be for some
time and, unfortunately, commercial reality shows that we cannot
simply reject or discard them. Accordingly, this memo presents
recommendations for dealing with them in ways that seem to do the
least additional harm until the infrastructure is tightened up to
match the standards.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Kucherawy Expires September 1, 2012 [Page 1]
Internet-Draft Mailformed Mail BCP February 2012
This Internet-Draft will expire on September 1, 2012.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. The Purpose Of This Work . . . . . . . . . . . . . . . . . 3
1.2. Not The Purpose Of This Work . . . . . . . . . . . . . . . 3
2. Keywords . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Internal Representations . . . . . . . . . . . . . . . . . . . 4
5. Invariate Content . . . . . . . . . . . . . . . . . . . . . . . 4
6. Mail Submission Agents . . . . . . . . . . . . . . . . . . . . 4
7. Header Anomalies . . . . . . . . . . . . . . . . . . . . . . . 5
7.1. Non-Header Lines . . . . . . . . . . . . . . . . . . . . . 5
7.2. Header Malformations . . . . . . . . . . . . . . . . . . . 6
7.3. Header Field Counts . . . . . . . . . . . . . . . . . . . . 7
8. MIME Anomalies . . . . . . . . . . . . . . . . . . . . . . . . 8
8.1. Missing MIME-Version Field . . . . . . . . . . . . . . . . 8
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 8
10. Security Considerations . . . . . . . . . . . . . . . . . . . . 9
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 9
11.1. Normative References . . . . . . . . . . . . . . . . . . . 9
11.2. Informative References . . . . . . . . . . . . . . . . . . 9
Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . . 9
Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . . 9
Kucherawy Expires September 1, 2012 [Page 2]
Internet-Draft Mailformed Mail BCP February 2012
1. Introduction
1.1. The Purpose Of This Work
The history of email standards, going back to [RFC822] and beyond,
contains a fairly rigid evolution of specifications. But
implementations within that culture have also long had an
undercurrent known formally as the robustness principle, but also
known informally as Postel's Law: "Be conservative in what you do, be
liberal in what you accept from others."
In general, this served the email ecosystem well by allowing a few
errors in implementations without obstructing participation in the
game. The proverbial bar was set low. However, as we have evolved
into the current era, some of these lenient stances have begun to
expose opportunities that can be exploited by malefactors. Various
email-based applications rely on strong application of these
standards for simple security checks, while the very basic building
blocks of that infrastructure, intending to be robust, fail utterly
to assert those standards.
This memo presents some areas in which the more lenient stances can
provide vectors for attack, and then presents the collected wisdom of
numerous applications in and around the email ecosystem for dealing
with them to mitigate their impact.
1.2. Not The Purpose Of This Work
It is important to understand that this work is not an effort to
endorse or standardize certain common malformations. The code and
culture that introduces such messages into the mail stream needs to
be repaired, as the security penalty now being paid for this lax
processing arguably outweighs the reduction in support costs to end
users who are not expected to understand the standards. However, the
reality is that this will not be fixed quickly.
Given this, it is beneficial to provide implementers with guidance
about the safest or most effective way to handle malformed messages
when they arrive, taking into consideration the tradeoffs of the
choices available especially with respect to how various actors in
the email ecosystem respond to such messages in terms of handling,
parsing, or rendering to end users.
2. Keywords
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [KEYWORDS].
Kucherawy Expires September 1, 2012 [Page 3]
Internet-Draft Mailformed Mail BCP February 2012
3. Background
The reader would benefit from reading [EMAIL-ARCH] for some general
background about the overall email architecture. Of particular
interest is the Internet Message Format, detailed in [MAIL].
Throughout this document, the use of the term "messsage" should be
assumed to mean a block of text conforming to the Internet Message
Format.
4. Internal Representations
Any agent handling a message could have one or two (or more) distinct
representations of a message it is handling. One is an internal
representation, such as a block of storage used for the header and a
block for the body. These may be sorted, encoded, decoded, etc., as
per the needs of that particular module. The other is the
representation that is output to the next agent in the handling
chain. This might be identical to the version that is input to the
module, or it might have some changes such as added or reordered
header fields, body modifications to remove malicious content, etc.
In some cases, advice is provided only for internal representations.
However, there is often occasion to mandate changes to the output as
well.
5. Invariate Content
Experience has shown that it is beneficial to ensure that, from the
first analysis agent at ingress to the last before delivery to the
user, the message each agent sees is identical. Absent this, it can
be impossible for different agents in the chain to make assertions
about the content that correlate. For example, where a user might
make a complaint about the content, altered content will not match
what the automated components saw when they processed the message,
which can lead to inaccurate handling of the complaint.
Therefore, agents comprising an inbound message processing
environment SHOULD ensure that each agent sees the same content, and
the message reaches the end user unmodified. An exception to this is
content that is identitfied as certainly harmful, such as some kind
of malicious executable software included in the message.
6. Mail Submission Agents
Within the email context, the single most influential component that
can reduce the presence of malformed items in the email system is the
Mail Submission Agent (MSA). This is the component that is
essentially the interface between end users that create content and
Kucherawy Expires September 1, 2012 [Page 4]
Internet-Draft Mailformed Mail BCP February 2012
the mail stream.
The lax processing described earlier in the document creates a high
support and security cost overall. Thus, MSAs MUST evolve to become
more strict about enforcement of all relevant email standards,
especially [MAIL] and the [MIME] family of documents.
Relay Mail Transport Agents (MTAs) SHOULD also be more strict;
although preventing the dissemination of malformed messages is
desirable, the rejection of such mail already in transit also has a
support cost, namely the creation of a [DSN] that many end users
might not understand.
7. Header Anomalies
This section covers common syntactical and semantic anomalies found
in headers of messages, and presents preferred mitigations.
7.1. Non-Header Lines
It has been observed that some messages contain a line of text in the
header that is not a valid message header field of any kind. For
example:
From: user@example.com
To: userpal@example.net
Subject: This is your reminder
about the football game tonight
Date: Wed, 20 Oct 2010 20:53:35 -0400
Don't forget to meet us for the tailgate party!
The cause of this is typically a bug in a message generator of some
kind. If the fourth line was intended to be a continuation of the
third, it should be indented by whitespace as set out in Section
2.2.3 of [MAIL].
This anomaly has varying impacts on processing software, depending on
the implementation:
1. some agents choose to separate the header of the message from the
body only at the first empty line (i.e. a CRLF immediately
followed by another CRLF);
2. some agents assume this anomaly should be interpreted to mean the
body starts at line four, as the end of the header is assumed by
encountering something that is not a valid header field or folded
portion thereof;
Kucherawy Expires September 1, 2012 [Page 5]
Internet-Draft Mailformed Mail BCP February 2012
3. some agents assume this should be interpreted as an intended
header folding as described above;
4. some agents reject this outright as line four is neither a valid
header field nor a folded continuation of a header field prior to
an empty line.
This can be exploited if it is known that one message handling agent
will take one action while the next agent in the handling chain will
take another. For example, a filter trained to detect malicious body
anomalies (e.g. references to dangerous web sites) that is fed by a
Mail Transfer Agent (MTA) implementing (1) above might not get the
opportunity to identify something dangerous in a message if it is
unaware of the anomaly and does not itself check for it.
Consensus indicates the preferred implementation is to terminate
header processing before the first character in line four, as
described in (2) above. Thus, a module compliant with this
specification MUST terminate header processing upon encountering the
first line of text that is not a valid header field. That is, all
data after that point in the input MUST NOT be considered part of the
header of the message. If that line is not an empty line, an empty
line MUST be inserted at that point in the emitted version of the
message being processed.
It should be noted that a few implementations make choice (4) above
since any reputable message generation program will get header
folding right, and thus anything so blatant as this malformation is
likely an error caused by a malefactor.
7.2. Header Malformations
There are various malformations that exist. A common one is
insertion of whitespace at unusual locations, such as:
From: user@example.com
To: userpal@example.net
Subject: This is your reminder
MIME-Version : 1.0
Content-Type: text/plain
Date: Wed, 20 Oct 2010 20:53:35 -0400
Don't forget to meet us for the tailgate party!
Note the addition of whitespace in line four after the header field
name but before the colon that separates the name from the value.
The acceptance grammar of [MAIL] permits that extra whitespace, so it
Kucherawy Expires September 1, 2012 [Page 6]
Internet-Draft Mailformed Mail BCP February 2012
cannot be considered invalid. However, a consensus of
implementations prefers to remove that whitespace. There is no
perceived change to the semantics of the header field being altered
as the whitespace is itself semantically meaningless. Thus, a module
compliant with this memo MUST remove all whitespace after the field
name but before the colon, and MUST emit that version of that field
on output.
7.3. Header Field Counts
Section 3.6 of [MAIL] prescribes specific header field counts for a
valid message. Few agents actually enforce these in the sense that a
message whose header contents exceed one or more limits set there are
generally allowed to pass; they may add any required fields that are
missing, however.
Also, few agents that use messages as input, including Mail User
Agents (MUAs) that actually display messages to users, verify that
the input is valid before proceeding. Two popular open source
filtering programs and two popular Mailing List Management (MLM)
packages examined at the time this memo was drafted select either the
first or last instance of a particular field name, such as From, to
decide who sent a message. Absent enforcement of [MAIL], an attacker
can craft a message with multiple fields if that attacker knows the
filter will make a decision based on one but the user will be shown
the other.
This situation is exacerbated when a claim of message validity is
inferred by something like a valid [DKIM] signature. Such a
signature might cover one instance of a constrained field but not
another, and a naive consumer of DKIM's output, not realizing which
one was covered by a valid signature, presume the wrong one was the
"good" one. An MUA, for example could show the first of two From
fields as "good" or "safe" while the DKIM signature actually only
verified the second.
Thus, an agent compliant with this specification MUST enact one of
the following:
1. reject outright or refuse to process further any input message
that does not conform to Section 3.6 of [MAIL];
2. remove or, in the case of an MUA, refuse to render any instances
of a header field whose presence exceeds a limit prescribed in
Section 3.6 of [MAIL] when generating its output;
3. alter the name of any header field whose presence exceeds a limit
prescribed in Section 3.6 of [MAIL] when generating its outputso
Kucherawy Expires September 1, 2012 [Page 7]
Internet-Draft Mailformed Mail BCP February 2012
that later agents can produce a consistent result.
8. MIME Anomalies
[MIME], et seq, define a mechanism of message extensions for
providing text in character sets other than ASCII, non-text
attachments to messages, multi-part message bodies and similar
facilities.
Some anomalies with MIME-compliant generation are also common. This
section discusses some of those and presents preferred mitigations.
8.1. Missing MIME-Version Field
Any message that uses [MIME] constructs is required to have a MIME-
Version header field. Without them, the Content-Type and associated
fields have no semantic meaning.
It is often observed that a message has complete MIME structure, yet
lacks this header field.
As described at the end of Section 7.1, this is not expected from a
reputable content generator and is often an indication of mass-
produced spam or other undesirable messages.
Therefore, an agent compliant with this specification MUST internally
enact one or more of the following in the absence of a MIME-Version
header field:
1. Ignore all other MIME-specific fields, even if they are
syntactically valid, thus treating the entire message as a
single-part message of type text/plain;
2. Remove all other MIME-specific fields, even if they are
syntactically valid, both internally and when emitting the output
version of the message;
3. Rename all other MIME-specific fields, even if they are
syntactically valid, both internally and when emitting the output
version of the message.
9. IANA Considerations
This memo contains no actions for IANA.
Kucherawy Expires September 1, 2012 [Page 8]
Internet-Draft Mailformed Mail BCP February 2012
10. Security Considerations
The discussions of the anomalies above and their prescribed solutions
are themselves security considerations. The practises enumerated in
this memo are generally perceived to resolve security considerations
that already exist rather than introducing new ones.
11. References
11.1. Normative References
[KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[MAIL] Resnick, P., "Internet Message Format", RFC 5322,
October 2008.
11.2. Informative References
[DKIM] Allman, E., Callas, J., Delany, M., Libbey, M., Fenton,
J., and M. Thomas, "DomainKeys Identified Mail (DKIM)
Signatures", RFC 4871, May 2007.
[DSN] Moore, K. and G. Vaudreuil, "An Extensible Message
Format for Delivery Status Notifications", RFC 3464,
January 2003.
[EMAIL-ARCH] Crocker, D., "Internet Mail Architecture", RFC 5598,
July 2009.
[MIME] Freed, N. and N. Borenstein, "Multipurpose Internet
Mail Extensions (MIME) Part One: Format of Internet
Message Bodies", RFC 2045, November 1996.
[RFC822] Crocker, D., "Standard for the Format of Internet Text
Messages", RFC 822, August 1982.
Appendix A. Examples
Examples, if needed, can go here.
Appendix B. Acknowledgements
The author wishes to acknowledge the following for their review and
constructive criticism of this proposal: (names)
Kucherawy Expires September 1, 2012 [Page 9]
Internet-Draft Mailformed Mail BCP February 2012
Author's Address
Murray S. Kucherawy
Cloudmark, Inc.
128 King St., 2nd Floor
San Francisco, CA 94107
US
Phone: +1 415 946 3800
EMail: msk@cloudmark.com
Kucherawy Expires September 1, 2012 [Page 10]