Network Working Group                                          J. Levine
Internet-Draft                                      Taughannock Networks
Intended status: Informational                                R. Gellens
Expires: May 14, 2012                              Qualcomm Incorporated
                                                           November 2011

                   Mailing Lists and UTF-8 Addresses
                    draft-ietf-eai-mailinglistbis-00

Abstract

   This document describes considerations for mailing lists with the
   introduction of internationalized email addresses.  It outlines some
   possible scenarios for handling lists with mixtures of
   internationalized and traditional addresses, but does not offer
   implementation or deployment advice.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 14, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (http://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  2
   2.  Scenarios Involving Mailing Lists  . . . . . . . . . . . . . .  4

Levine & Gellens                  info                          [Page 1]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011

     2.1.  Fully EAI lists  . . . . . . . . . . . . . . . . . . . . .  4
     2.2.  Mixed EAI and ASCII lists  . . . . . . . . . . . . . . . .  4
     2.3.  SMTP issues  . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  List headers . . . . . . . . . . . . . . . . . . . . . . . . .  5
     3.1.  EAI list headers . . . . . . . . . . . . . . . . . . . . .  5
     3.2.  Downgrading list headers . . . . . . . . . . . . . . . . .  6
     3.3.  Subscribers' addresses in downgraded headers . . . . . . .  7
   4.  Security considerations  . . . . . . . . . . . . . . . . . . .  7
   5.  References . . . . . . . . . . . . . . . . . . . . . . . . . .  7
   Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . .  8
     Appendix A.1.  Changes up to -00 . . . . . . . . . . . . . . . .  8
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . .  8

1.  Introduction

   This document describes considerations for mailing lists with the
   introduction of internationalized email addresses.

   Mailing lists are an important part of email usage and collaborative
   communications.  The introduction of internationalized email
   addresses affects mailing lists in three main areas: (1) transport
   (receiving and sending messages); (2) message headers of received and
   retransmitted messages; and (3) mailing list operational policies.

   A mailing list is a mechanism that distributes a message to multiple
   recipients when the originator sends it to a single address.  An
   agent (typically not a human being) at that single address receives
   the message and then causes the message to be redistributed to a list
   of recipients.  This agent sets the envelope return address of the
   redistributed message to a different address from that of the
   original message.  Using a different envelope return address
   (reverse-path) directs error and other automatically generated
   messages to an error handling address associated with the mailing
   list.  This sends error and other automatic messages to the list
   agent, which can often do something useful with them, rathern than to
   the original sender, who typically doesn't control the list and hence
   can't do anything about them.

   In most cases, the mailing list agent redistributes a received
   message to its subscribers as a new message, that is, conceptually it
   uses message submission [RFC6409] (as did the sender of the original
   message).  The exception, where the mailing list is not a separate
   agent that receives and redistributes messages in separate
   transactions, but is instead an expansion step within an SMTP
   transaction where one local address expands to multiple local or non-












Levine & Gellens                  info                          [Page 2]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011

   local addresses, is not addressed by this document.

   Some mailing lists alter message header fields, while others do not.
   A number of standardized list-related header fields have been
   defined, and many lists add one or more of these headers.  Separate
   from these standardized list-specific header fields, and despite a
   history of interoperability problems from doing so, some lists alter
   or add header fields in an attempt to control where replies are sent.
   Such lists typically add or replace the "Reply-To" field and some add
   or replace the "Sender" field.  Some lists alter or replace other
   fields, including "From".

   Among these list-specific header fields are those specified in
   [RFC2369] and [RFC2919].  For more information, see [listheaders].

   While the mail transport protocol does not differ between regular
   email recipients and mailing list recipients, lists have special
   considerations with internationalized email addresses because they
   retransmit messages composed by other agents to potentially many
   recipients.

   There are considerations for internationalized email addresses in the
   envelope as well as in header fields of redistributed messages.  In
   particular, a message with UTF-8 addresses in the headers or envelope
   cannot be sent to non-UTF8 recipients.

   With mailing lists, there are two different types of considerations:
   first, the purely technical ones involving message handling, error
   cases, and the like, and second, those that arise from the fact that
   humans use mailing lists to communicate.  As an example of the first,
   mailing lists might choose to reject all messages from
   internationalized addresses.  As an example of the second, a user who
   sends a message to a list often is unaware of the list membership.
   In particular, the user often doesn't know if the members are UTF-8
   mail users or not, and often neither the original sender nor the
   recipients personally know each other.  As a consequence of this,
   remedies that may be readily available for one-to-one communication
   might not be appropriate when dealing with mailing lists.  For
   example, if a user sends a message which is undeliverable, normally
   the telephone, instant messaging, or other forms of communication are
   available to obtain a working address.  With mailing lists, the users
   may not have any recourse.  Of course, with mailing lists, the
   original sender usually does not know if the message was successfully
   delivered to any list members, or if it was undeliverable to some.

   Conceptually, a mailing list's internationalization can be divided
   into three capabilities: First, does it have a UTF-8 submission or
   return-path address?  Second, does it accept subscriptions to UTF-8
   addresses?  And third, does it accept EAI messages?








Levine & Gellens                  info                          [Page 3]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011

   If a list has subscribers with ASCII addresses, those subscribers
   might or might not be able to accept EAI messages.

2.  Scenarios Involving Mailing Lists

   Generally (and exclusively within the scope of this document), an
   original message is sent to a mailing list as a completely separate
   and independent transaction from the mailing list agent sending the
   retransmitted message to one or more list recipients.  In both cases,
   the message might have only one recipient, or might have multiple
   recipients.  That is, the original message might be sent to
   additional recipients as well as the mailing list agent, and the
   mailing list might choose to send the retransmitted message to each
   list recipient in a separate message submission transaction, or might
   choose to include multiple recipients per transaction.  Often,
   mailing lists are constructed to work in cooperation with, rather
   than include the functionality of, a message submission server, and
   hence the list transmits to a single submission server one copy of
   the retransmitted message, with all list recipients specified in the
   SMTP envelope.  The submission server then decides which recipients
   to include in which transaction.

2.1.  Fully EAI lists

   Some lists may wish to be fully EAI.  That is, all subscribers are
   expected to be able to receive EAI mail.  For list hygeine reasons,
   such a list would probably want to prevent subscriptions from
   addresses that are unable to receive EAI mail.  If a putative
   subscriber has a UTF-8 address, it must be able to receive EAI mail,
   but there is no way to tell whether a subscriber with an ASCII
   address can receive EAI mail short of sending an EAI probe or
   confirmation message and somehow finding out whether it was
   delivered.

2.2.  Mixed EAI and ASCII lists

   Other lists may wish to handle a mixture of EAI and ASCII
   subscribers, either as a transitional measure as subscribers upgrade
   to EAI-capable mail software, or as an ongoing feature.  While it is
   not possible in general to downgrade EAI mail to ASCII mail, list
   software might divide the recipients into two sets, EAI and ASCII
   recipients, and create a downgraded version of EAI list messages to
   send to ASCII recipients.

   To determine which set an address belongs in, list software might
   make the conservative assumption that ASCII addresses get ASCII
   messages, it might try to probe the address with an EAI test message,










Levine & Gellens                  info                          [Page 4]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011

   or it might let the subscriber set the message format manually,
   similar to the way that some lists now let subscribers choose between
   plain text and HTML mail, or individual messages and a daily digest.

   To determine whether a message needs to be downgraded for ASCII
   recipients, list software might assume that any message received via
   an EAI SMTP session is an EAI message, or might examine the headers
   and body of the message to see whether it needs EAI treatment.
   Depending on the interface between the list software and the MTA and
   MDA that handle incoming messages, it may not be able to tell the
   type of of session for incoming messages.

2.3.  SMTP issues

   Mailing list software invariably changes the envelope addresses on
   each message.  The return path is set to an address that will return
   bounces to the list software, and the recipient addresses are set to
   the subscribers of the list.  For some lists, all messages to a list
   get the same bounce address.  For others, list software may create a
   bounce address per recipient, or a unique bounce address per message
   per recipient, bounce management techniques known as VERP.

   The bounce address for a list typically includes the name of the
   list, so a list with an EAI name will have an EAI bounce address.
   Given the unknown paths that bounce messages might take, list
   software might modify the bounce address to be ASCII to make it more
   likely that bounces can be delivered back to the list.  Similarly, a
   VERP address for each subscriber typically embeds a version of the
   subscriber's address so the VERP bounce address for an EAI subscriber
   address will be an EAI address.  For the same reason, the list
   software might want to modify the VERP bounce addresses to be ASCII.

3.  List headers

   List software typically adds list-specific headers to each message
   before resending the message to list recipients.

3.1.  EAI list headers

   The list headers in [RFC2369] and [RFC2919] were all specified before
   EAI mail existed and their definitions do not address where UTF-8
   characters might appear.  These include, for example:















Levine & Gellens                  info                          [Page 5]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011


   List-Id: List Header Mailing List
      <list-header.example.com>
   List-Help:
      <mailto:list@example.com?subject=help>
   List-Unsubscribe:
      <mailto:list@example.com?subject=unsubscribe>
   List-Subscribe:
      <mailto:list@example.com?subject=subscribe>
   List-Post:
      <mailto:list@example.com>
   List-Owner:
      <mailto:listmom@example.com> (Contact Person for Help)

   List-Archive: <mailto:archive@example.com?subject=index%20list>

   As described in [RFC2369], "The contents of the list header fields
   mostly consist of angle-bracket ('<', '>') enclosed URLs, with
   internal whitespace being ignored."  [RFC2919] specifies that, "The
   list identifier will, in most cases, appear like a host name in a
   domain of the list owner."

   These mailing list header fields contain URLs.  The most common
   schemes are generally HTTP, HTTPS, mailto, and FTP.  Schemes that
   permit both URI and IRI forms should use the URI-encoded form
   described in [RFC3987].  Future work may extend  these header fields
   or define replacements to directly support non-encoded UTF-8 in IRIs
   (for example, [I-D.duerst-mailto-bis]), but in the absence of such
   extension or replacement, non-ASCII characters can only appear within
   when encoded as ASCII. Note that discussion  on whether
   internationalized domain names should be percent-encoded or puny-
   coded, is ongoing; see [I-D.duerst-iri-bis].

   The most commonly-used URI schemes in List-* headers tend to be HTTP
   and mailto.  The current specification for mailto does not permit
   unencoded UTF-8 characters, although work has been proposed to extend
   or more likely replace mailto in order to permit this.

   The List-ID header field uniquely identifies a list.  The intent is
   that the value of this header remain constant, even if the machine or
   system used to operate or host the list changes.  This header field
   is often used in various filters and tests, such as client-side
   filters, Sieve filters, and so forth.  Such filters and tests may not
   properly compare a non-ASCII value which has been encoded into ASCII.
   In addition to these comparison considerations, it is generally
   desirable that this header field contain something meaningful that
   users can type in.  However, ASCII encodings of non-ASCII characters
   are unlikely to be meaningful to users or easy for them to accurately
   type.

3.2.  Downgrading list headers






Levine & Gellens                  info                          [Page 6]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011


   If list software prepares a downgraded version of an EAI message, all
   the List-* headers must be downgraded.  In particular, if a List-*
   header contains a UTF-8 mailto (even encoded in ASCII), it may be
   advisable to edit the header to remove the UTF-8 address, or replace
   it with an equivalent ASCII address if one is known to the list
   software.  Otherwise, a client might run into trouble if the decoded
   mailto results in a non-ASCII address.

   When downgrading list headers, it may not be possible to produce a
   downgraded version that is satisfactorily equivalent to the original
   header.  In particular, if a UTF-8 List-ID is downgraded to an ASCII
   version, software and humans at recipient systems will typically not
   be able to tell that both refer to the same list.

   If lists permit mail with multiple MIME parts, some MIME headers in
   EAI messages may include UTF-8 characters in filenames and other
   descriptive text strings.  Downgrading these strings may lose the
   sense of the names, break references from other MIME parts (such as
   HTML IMG refereneces to embedded images) and otherwise damage the
   mail.

3.3.  Subscribers' addresses in downgraded headers

   List software typically leaves the original submitter's address in
   the From: line, both so that receipients can tell who wrote the
   message, and so that they have a choice of responding to the list or
   directly to the submitter.  If a submitter has a UTF-8 address, there
   is no way to downgrade the From: header and preserve the address so
   that ASCII recipients can respond to it, since non-EAI mail systems
   can't send mail to UTF-8 addresses.

   Possible workarounds (none implemented that we know of) might include
   allowing subscribers with UTF-8 addresses to register an alternate
   ASCII address with the list software, having the list software itself
   create ASCII forwarding addresses, or just putting the list's address
   in the From: line and losing the ability to respond to the submitter.

4.  Security considerations

   None beyond what mailing lists do now.

5.  References

   [I-D.duerst-iri-bis]
              Duerst, M, Suignard, M and L Masinter, "Internationalized
              Resource Identifiers (IRIs)", Internet-Draft draft-duerst-
              iri-bis-07, October 2009.

   [I-D.duerst-mailto-bis]
              Duerst, M, Masinter, L and J Zawinski, "The 'mailto' URI
              Scheme", Internet-Draft draft-duerst-mailto-bis-10, May
              2010.




Levine & Gellens                  info                          [Page 7]


Internet-Draft     Mailing Lists and UTF-8 Addresses       November 2011


   [RFC2369]  Neufeld, G. and J.D. Baer, "The Use of URLs as Meta-Syntax
              for Core Mail List Commands and their Transport through
              Message Header Fields", RFC 2369, July 1998.

   [RFC2919]  Chandhok, R. and G. Wenger, "List-Id: A Structured Field
              and Namespace for the Identification of Mailing Lists",
              RFC 2919, March 2001.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [RFC6409]  Gellens, R. and J. Klensin, "Message Submission for Mail",
              STD 72, RFC 6409, November 2011.

Appendix A.  Change Log

   *NOTE TO RFC EDITOR: This section may be removed upon publication of
   this document as an RFC.*

Appendix A.1.  Changes up to -00

   Rewrite completely.

Authors' Addresses

   John Levine
   Taughannock Networks
   PO Box 727
   Trumansburg, NY 14886

   Phone: +1 831 480 2300
   Email: standards@taugh.com
   URI:   http://jl.ly


   Randall Gellens
   Qualcomm Incorporated
   5775 Morehouse Drive
   San Diego, CA 92121

   Email: rg+ietf@qualcomm.com















Levine & Gellens                  info                          [Page 8]