Network Working Group                                         R. Gellens
Internet-Draft                                     QUALCOMM Incorporated
Obsoletes: 5721 (if approved)                                  C. Newman
Intended status: Standards Track                                  Oracle
Expires: February 1, 2013                                         J. Yao
                                                                   CNNIC
                                                             K. Fujiwara
                                                                    JPRS
                                                           July 31, 2012


                         POP3 Support for UTF-8
                    draft-ietf-eai-rfc5721bis-07.txt

Abstract

   This specification extends the Post Office Protocol version 3 (POP3)
   to support UTF-8 encoded international strings in user names,
   passwords, mail addresses, message headers, and protocol-level
   textual strings.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 1, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect



Gellens, et al.         Expires February 1, 2013                [Page 1]


Internet-Draft           POP3 Support for UTF-8                July 2012


   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
     1.1.  Conventions Used in This Document
   2.  UTF8 Capability
     2.1.  The UTF8 Command
     2.2.  USER Argument to UTF8 Capability
   3.  LANG Capability
   4.  Non-ASCII character Maildrops
   5.  UTF8 Response Code
   6.  IANA Considerations
   7.  Security Considerations
   8.  Change History
     8.1.  draft-ietf-eai-rfc5721bis: Version 00
     8.2.  draft-ietf-eai-rfc5721bis:  Version 01
     8.3.  draft-ietf-eai-rfc5721bis:  Version 02
     8.4.  draft-ietf-eai-rfc5721bis:  Version 03
     8.5.  draft-ietf-eai-rfc5721bis:  Version 04
     8.6.  draft-ietf-eai-rfc5721bis:  Version 05
     8.7.  draft-ietf-eai-rfc5721bis:  Version 06
     8.8.  draft-ietf-eai-rfc5721bis:  Version 07
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Design Rationale
   Appendix B.  Acknowledgments

1.  Introduction

   This document forms part of the Email Address Internationalization
   protocols described in the Email Address Internationalization
   Framework document [RFC6530].  As part of the overall Email Address
   Internationalization work, email messages could be transmitted and
   delivered containing Unicode strings encoded in UTF-8 in the header
   and/or body, and maildrops that are accessed using POP3 [RFC1939]
   might natively store UTF-8.

   This specification extends POP3 [RFC1939] using the POP3 extension
   mechanism [RFC2449] to permit un-encoded UTF-8 [RFC3629] in headers
   and bodies (e.g., transferred using 8-bit Content Transfer Encoding)
   as described in "Internationalized Email Headers" [RFC6532].  It also
   adds a mechanism to support login names and passwords containing
   UTF-8 strings, a mechanism to support UTF-8 strings in protocol-level
   response strings, and the ability to negotiate a language for such
   response strings.

   This specification also adds a new response code to indicate that a
   message was not delivered because it required UTF-8 mode (discussed
   in Section 2) and the server was unable or unwilling to create and
   deliver a variant form of the message as discussed in Section 7 of
   [I-D.ietf-eai-5738bis].

   This specification replaces an earlier, experimental approach to the
   same problem, RFC 5721 [RFC5721].

1.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in "Key words for use in
   RFCs to Indicate Requirement Levels" [RFC2119].

   The terms "UTF-8 string" and "UTF-8 character" are used to refer to
   Unicode characters, which may or may not be members of the ASCII
   subset, encoded in UTF-8 [RFC3629], a standard Unicode Encoding Form.
   All other specialized terms used in this specification are defined in
   the Email Address Internationalization framework document [RFC6530].

   In examples, "C:" and "S:" indicate lines sent by the client and
   server, respectively.  If a single "C:" or "S:" label applies to
   multiple lines, then the line breaks between those lines are for
   editorial clarity only and are not part of the actual protocol
   exchange.

   Note that examples always use ASCII characters due to limitations of
   this document format; otherwise, some examples for the "LANG" command
   may appear incorrectly.

2.  UTF8 Capability

   This specification adds a new POP3 Extension [RFC2449] capability
   response tag and command to specify support for header field
   information in UTF-8 rather than only ASCII.  The capability tag and
   new command and functionality are described below.

   CAPA tag:
      UTF8

   Arguments with CAPA tag:
      USER

   Added Commands:
      UTF8

   Standard commands affected:
      USER, PASS, APOP, LIST, TOP, RETR

   Announced states / possible differences:
      both / no

   Commands valid in states:
      AUTHORIZATION

   Specification reference:
      this document

   Discussion:

   This capability adds the "UTF8" command to POP3.  The UTF8 command
   switches the session from the ASCII-only mode of RFC 1939 to UTF-8
   mode.  UTF-8 mode means that all messages transmitted between servers
   and clients are UTF-8 strings, and that both servers and clients can
   send and accept UTF-8 strings.

2.1.  The UTF8 Command

   The UTF8 command enables UTF-8 mode.  The UTF8 command has no
   parameters.
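
   As a non-normative illustration, the following Python sketch shows
   how a client might enable UTF-8 mode before authenticating.  It
   assumes the standard-library poplib module, whose POP3 objects
   provide capa() and utf8() methods.  The helper enter_utf8_mode and
   the FakeConnection stand-in are hypothetical names introduced only
   for this example.

```python
import poplib


def enter_utf8_mode(conn) -> bool:
    """Issue UTF8 if the server advertises it; return True on success.

    `conn` is expected to behave like poplib.POP3 and must still be in
    the AUTHORIZATION state (before USER/PASS, APOP, or AUTH).
    """
    caps = conn.capa()          # e.g. {"UTF8": ["USER"], "LANG": []}
    if "UTF8" not in caps:
        return False            # stay in the ASCII-only mode of RFC 1939
    try:
        conn.utf8()             # sends the parameterless UTF8 command
    except poplib.error_proto:
        return False            # the server answered -ERR
    return True


class FakeConnection:
    """Hypothetical stand-in for poplib.POP3, for illustration only."""

    def __init__(self, caps, utf8_ok=True):
        self._caps = caps
        self._utf8_ok = utf8_ok

    def capa(self):
        return self._caps

    def utf8(self):
        if not self._utf8_ok:
            raise poplib.error_proto(b"-ERR UTF8 unavailable")
        return b"+OK UTF8 enabled"
```

   With a real server, the same helper would be called on a
   poplib.POP3 or poplib.POP3_SSL object right after connecting.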

   UTF-8 mode has no effect on messages in an ASCII-only maildrop.
   Messages in native UTF-8 maildrops can be ASCII or UTF-8 using
   internationalized headers [RFC6532] and/or 8bit
   content-transfer-encoding, as defined in MIME Section 2.8 [RFC2045].
   The character encoding format of a maildrop might be something other
   than UTF-8 or ASCII.  In UTF-8 mode, if the character encoding format
   of the maildrop is UTF-8 or ASCII, the messages are sent to the
   client as-is; if the maildrop uses some other character encoding
   format, the messages SHOULD be converted to UTF-8 before they are
   sent to the client.  When not in UTF-8 mode, messages in the maildrop
   that contain non-ASCII strings, including UTF-8 strings, MUST NOT be
   sent to the client as-is.  If a client requests a UTF-8 message when
   not in UTF-8 mode, the server MUST either create a message content
   variant (discussed in Section 7 of [I-D.ietf-eai-5738bis]) that
   complies with unextended POP and Internet Mail Format without UTF-8
   mode support and send that variant to the client, or fail the request
   with a -ERR response containing the UTF8 response code (see
   Section 5).  The UTF8 command MAY fail.
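
   The server-side choice described above can be sketched in Python as
   follows.  This is a non-normative illustration that assumes the
   maildrop already stores each message as UTF-8 or ASCII octets, so
   the conversion from other encodings is not shown.  The names
   answer_retr and downgrade, and the wording of the status lines, are
   hypothetical; the bracketed form of the response code follows the
   POP3 extension mechanism [RFC2449].

```python
from typing import Callable, Optional, Tuple


def is_ascii(message: bytes) -> bool:
    """True if every octet of the message is in the ASCII range."""
    return all(octet < 0x80 for octet in message)


def answer_retr(
    message: bytes,
    utf8_mode: bool,
    downgrade: Optional[Callable[[bytes], bytes]] = None,
) -> Tuple[str, Optional[bytes]]:
    """Return (status line, content to send) for one stored message."""
    if utf8_mode or is_ascii(message):
        return "+OK message follows", message      # sent as-is
    if downgrade is not None:
        # Hypothetical hook that builds an ASCII-only message variant.
        return "+OK message follows", downgrade(message)
    # No variant available: fail with the UTF8 response code.
    return "-ERR [UTF8] message requires UTF-8 mode", None
```

   A server that never creates variants would simply omit the
   downgrade hook and rely on the final -ERR branch.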

   Note that even in UTF-8 mode, MIME binary content-transfer-encoding
   as defined in MIME Section 6.2 [RFC2045] is still not permitted.

   The octet count (size) of a message reported in a response to the
   LIST command SHOULD match the actual number of octets sent in a RETR
   response (not counting byte-stuffing).  Sizes reported elsewhere,
   such as in STAT responses and non-standardized, free-form text in
   positive status indicators (following "+OK"), need not be accurate,
   but it is preferable that they be.
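
   As a non-normative illustration of the size rule above, the
   following Python sketch separates the stored message from its
   byte-stuffed form on the wire.  The function names are hypothetical;
   the stuffing rule (an extra "." in front of any line that begins
   with ".") is the one from the POP3 base specification [RFC1939].

```python
def byte_stuff(message: bytes) -> bytes:
    """Prefix every line that starts with '.' with one more '.'."""
    lines = message.split(b"\r\n")
    return b"\r\n".join(
        b"." + line if line.startswith(b".") else line for line in lines
    )


def list_size(message: bytes) -> int:
    """Octet count for LIST: the message as sent, before stuffing."""
    return len(message)
```

   Note that the count is in octets, so a non-ASCII character encoded
   in UTF-8 contributes more than one to the reported size.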

   Normal operation for maildrops that natively support non-ASCII
   characters will be for both servers and clients to support the
   extension discussed in this specification.  Upgrading of both clients
   and servers is the only fully satisfactory way to support the
   capabilities offered by the "UTF8" extension and SMTPUTF8 mail more
   generally.  Servers must, however, anticipate the possibility of a
   client attempting to access a message that requires this extension
   without having issued the "UTF8" command.  There are no completely
   satisfactory responses for that case other than upgrading the client
   to support this specification.  One solution, unsatisfactory because
   the user may be confused by being able to access the message through
   some means and not others, is that a server MAY choose to reject the
   command to retrieve the message as discussed in Section 5.  Other
   alternatives, including the possibility of creating and delivering a
   variant form of the message, are discussed in Section 7 of
   [I-D.ietf-eai-5738bis].

   Clients MUST NOT issue the STLS command [RFC2595] after issuing UTF8;
   servers MAY (but are not required to) enforce this by rejecting with
   an "-ERR" response an STLS command issued subsequent to a successful
   UTF8 command.  (Because this is a protocol error as opposed to a
   failure based on conditions, an extended response code [RFC2449] is
   not specified.)
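
   A server that chooses to enforce this ordering could track it with
   a small amount of session state, as in the non-normative Python
   sketch below.  The class name, the attribute names, and the wording
   of the responses are hypothetical.

```python
class AuthorizationState:
    """Session state relevant to the UTF8 and STLS commands."""

    def __init__(self) -> None:
        self.utf8_mode = False
        self.tls_active = False

    def handle(self, command: str) -> str:
        verb = command.strip().upper()
        if verb == "UTF8":
            self.utf8_mode = True
            return "+OK UTF8 enabled"
        if verb == "STLS":
            if self.utf8_mode:
                # Protocol error, so no extended response code is used.
                return "-ERR STLS not permitted after UTF8"
            self.tls_active = True
            return "+OK Begin TLS negotiation"
        return "-ERR unknown command"
```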



2.2.  USER Argument to UTF8 Capability

   If the USER argument is included with this capability, it indicates
   that the server accepts UTF-8 user names and passwords.

   Servers that include the USER argument in the UTF8 capability
   response SHOULD apply SASLprep [RFC4013] or one of its standards-
   track successors to the arguments of the USER and PASS commands.

   A client or server that supports APOP and permits UTF-8 in user names
   or passwords MUST apply SASLprep [RFC4013] or one of its standards-
   track successors to the user name and password used to compute the
   APOP digest.

   When applying SASLprep [RFC4013], servers MUST reject UTF-8 user
   names or passwords that contain a UTF-8 character listed in Section
   2.3 of SASLprep.  When applying SASLprep to the USER argument, the
   PASS argument, or the APOP username argument, a compliant server or
   client MUST treat each of them as a query string [RFC3454].  When
   applying SASLprep to the APOP password argument, a compliant server
   or client MUST treat it as a stored string [RFC3454].
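
   The following Python sketch is a simplified, non-normative
   illustration of these rules.  It assumes the standard-library
   stringprep module for the RFC 3454 tables and the Unicode 3.2
   database that Python exposes as unicodedata.ucd_3_2_0.  The
   function names saslprep and apop_digest are hypothetical, and
   encoding the shared secret as UTF-8 before hashing is an assumption
   of this sketch.  Production code would normally use a vetted
   SASLprep library instead.

```python
import hashlib
import stringprep
from unicodedata import ucd_3_2_0 as unicode32

# Tables prohibited by Section 2.3 of SASLprep [RFC4013].
PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c21,
    stringprep.in_table_c22, stringprep.in_table_c3,
    stringprep.in_table_c4, stringprep.in_table_c5,
    stringprep.in_table_c6, stringprep.in_table_c7,
    stringprep.in_table_c8, stringprep.in_table_c9,
)


def saslprep(text: str, *, stored: bool) -> str:
    """Simplified SASLprep; `stored` selects stored-string rules."""
    # Mapping: non-ASCII spaces become U+0020, B.1 characters vanish.
    mapped = "".join(
        " " if stringprep.in_table_c12(ch) else ch
        for ch in text
        if not stringprep.in_table_b1(ch)
    )
    prepared = unicode32.normalize("NFKC", mapped)
    for ch in prepared:
        if any(table(ch) for table in PROHIBITED):
            raise ValueError("prohibited character U+%04X" % ord(ch))
        # Only stored strings have to reject unassigned code points.
        if stored and stringprep.in_table_a1(ch):
            raise ValueError("unassigned code point U+%04X" % ord(ch))
    # Bidirectional text check from stringprep [RFC3454].
    if any(stringprep.in_table_d1(ch) for ch in prepared):
        if any(stringprep.in_table_d2(ch) for ch in prepared):
            raise ValueError("mixed right-to-left and left-to-right")
        if not (stringprep.in_table_d1(prepared[0])
                and stringprep.in_table_d1(prepared[-1])):
            raise ValueError("right-to-left text must start and end "
                             "with a right-to-left character")
    return prepared


def apop_digest(timestamp: str, password: str) -> str:
    """MD5 over timestamp + shared secret, as APOP does in RFC 1939."""
    secret = saslprep(password, stored=True)
    data = (timestamp + secret).encode("utf-8")
    return hashlib.md5(data).hexdigest()
```

   Because the password is normalized first, a precomposed and a
   decomposed spelling of the same password yield the same digest.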

   The client does not need to issue the UTF8 command prior to using
   UTF-8 in authentication.  However, clients MUST NOT use UTF-8
   strings in USER, PASS, or APOP commands unless the USER argument is
   included in the UTF8 capability response.
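
   A client could apply this rule by inspecting the CAPA response
   before choosing credentials, as in the non-normative Python sketch
   below.  The function names are hypothetical, and the input is
   assumed to be the already-decoded lines of a CAPA response.

```python
from typing import Dict, Iterable, List


def parse_capa(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each capability tag to its arguments (upper-cased)."""
    caps: Dict[str, List[str]] = {}
    for line in lines:
        if line == ".":
            break                      # end of the multi-line response
        if line.startswith("+OK") or not line.strip():
            continue                   # skip the status line
        tag, *args = line.split()
        caps[tag.upper()] = [arg.upper() for arg in args]
    return caps


def may_send_utf8_credentials(caps: Dict[str, List[str]]) -> bool:
    """True only if the UTF8 capability carries the USER argument."""
    return "USER" in caps.get("UTF8", [])


def check_credentials(caps: Dict[str, List[str]],
                      user: str, password: str) -> None:
    """Raise ValueError if non-ASCII credentials are not permitted."""
    ascii_only = user.isascii() and password.isascii()
    if not ascii_only and not may_send_utf8_credentials(caps):
        raise ValueError("server did not advertise UTF8 USER")
```

   Note that a bare "USER" capability line is a separate tag and does
   not imply the USER argument of the UTF8 capability.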

   The server MUST reject UTF-8 user names or passwords that fail to
   comply with the formal syntax in UTF-8 [RFC3629].

   Use of UTF-8 strings in the AUTH command is governed by the POP3 SASL
   [RFC5034] mechanism.

3.  LANG Capability

   This document adds a new POP3 Extension [RFC2449] capability response
   tag to indicate support for a new command: LANG.  The capability tag
   and new command are described below.

   CAPA tag:
      LANG

   Arguments with CAPA tag:
      none

   Added Commands:
      LANG

   Standard commands affected:
      All

   Announced states / possible differences:
      both / no

   Commands valid in states:
      AUTHORIZATION, TRANSACTION

   Specification reference:
      this document

   Discussion:

   POP3 allows most +OK and -ERR server responses to include human-
   readable text that, in some cases, might be presented to the user.
   But that text is limited to ASCII by the POP3 specification
   [RFC1939].  The LANG capability and command permit a POP3 client to
   negotiate which language the server uses when sending human-readable
   text.

   The LANG command requests that human-readable text included in all
   subsequent +OK and -ERR responses be localized to a language matching
   the language range argument (the "Basic Language Range" as described
   by [RFC4647]).  If the command succeeds, the server returns a +OK
   response followed by a single space, the exact language tag selected,
   and another space.  Human-readable text in the appropriate language
   then appears in the rest of the line.  This and subsequent
   protocol-level human-readable text is encoded in the UTF-8 charset.

   If the command fails, the server returns an -ERR response and
   subsequent human-readable response text continues to use the language
   that was previously used.

   If the client issues a LANG command with the special "*" language
   range argument, it indicates a request to use a language designated
   as preferred by the server administrator.  The preferred language MAY
   vary based on the currently active user.
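
   The handling of a LANG command that carries an argument can be
   sketched in Python as follows.  This is a non-normative
   illustration that assumes the server matches the range with the
   basic filtering scheme of [RFC4647] and takes the first supported
   tag that matches.  The function names, the text_for callback, and
   the wording of the -ERR text are hypothetical.

```python
from typing import Callable, Optional, Sequence


def select_language(language_range: str, supported: Sequence[str],
                    preferred: str) -> Optional[str]:
    """Pick a supported tag for a basic language range, or None."""
    if language_range == "*":
        return preferred        # administrator's preferred language
    wanted = language_range.lower()
    for tag in supported:
        candidate = tag.lower()
        # Basic filtering: exact match, or the range is a prefix of
        # the tag that ends at a "-" subtag boundary.
        if candidate == wanted or candidate.startswith(wanted + "-"):
            return tag
    return None


def lang_response(language_range: str, supported: Sequence[str],
                  preferred: str,
                  text_for: Callable[[str], str]) -> str:
    """Build the single-line reply to a LANG command with a range."""
    tag = select_language(language_range, supported, preferred)
    if tag is None:
        # A real server would word this in the language already in use.
        return "-ERR invalid language " + language_range
    return "+OK {} {}".format(tag, text_for(tag))
```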

   If no argument is given and the POP3 server issues a positive
   response, that response will usually consist of multiple lines.
   After the initial +OK, for each language tag the server supports, the
   POP3 server responds with a line for that language.  This line is
   called a "language listing".

   In order to simplify parsing, all POP3 servers are required to use a
   certain format for language listings.  A language listing consists of
   the language tag [RFC5646] of the message, optionally followed by a
   single space and a human-readable description of the language in the
   language itself, using the UTF-8 charset.  There is no specific order
   in which languages are listed; the order may depend on configuration
   or implementation.
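
   A client could parse this format with a few lines of Python, as in
   the non-normative sketch below.  The function name is hypothetical,
   and the input is assumed to be the response lines already decoded
   from UTF-8 with line terminators removed.

```python
from typing import List, Sequence, Tuple


def parse_language_listing(lines: Sequence[str]) -> List[Tuple[str, str]]:
    """Return (language tag, description) pairs from a LANG listing."""
    if not lines or not lines[0].startswith("+OK"):
        raise ValueError("language listing failed: %r" % (lines[:1],))
    listing = []
    for line in lines[1:]:
        if line == ".":
            break                       # end of the multi-line response
        # The description is optional, so it may come back empty.
        tag, _, description = line.partition(" ")
        listing.append((tag, description))
    return listing
```

   Since no listing order is guaranteed, a client that needs a stable
   presentation would sort the resulting pairs itself.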

   Examples:

      Note that some examples do not include the correct character
      accents due to limitations of this document format.


      C: USER karen
      S: +OK Hello, karen
      C: PASS password
      S: +OK karen's maildrop contains 2 messages (320 octets)

      Client requests deprecated MUL language.  Server replies
      with -ERR response.

      C: LANG MUL
      S: -ERR invalid language MUL

      A LANG command with no parameters is a request for
      a language listing.

      C: LANG
      S: +OK Language listing follows:
      S: en English
      S: en-boont English Boontling dialect
      S: de Deutsch
      S: it Italiano
      S: es Espanol
      S: sv Svenska
      S: .

      A request for a language listing might fail.

      C: LANG
      S: -ERR Server is unable to list languages

      Once the client selects the language, all responses will be in
      that language, starting with the response to the LANG command.

      C: LANG es
      S: +OK es Idioma cambiado

      If a server returns an -ERR response to a LANG command
      that specifies a primary language, the current language
      for responses remains in effect.

      C: LANG uga
      S: -ERR es Idioma <<UGA>> no es conocido

      C: LANG sv
      S: +OK sv Kommandot "LANG" lyckades

      C: LANG *
      S: +OK es Idioma cambiado

4.  Non-ASCII character Maildrops

   When a POP3 server uses a native non-ASCII character maildrop, it is
   the responsibility of the server to comply with the POP3 base
   specification [RFC1939] and Internet Message Format [RFC5322] when
   not in UTF-8 mode.  When the server is not in UTF-8 mode and the
   message requires that mode, requests to download the message MAY be
   rejected (as specified in the next section) or the various other
   alternatives outlined in Section 2.1 above, including creation and
   delivery of variations on the original message, MAY be considered.

5.  UTF8 Response Code

   Per "POP3 Extension Mechanism" [RFC2449], this document adds a new
   response code: UTF8, described below.

   Complete response code:
      UTF8

   Valid for responses:
      -ERR

   Valid for commands:
      LIST, TOP, RETR

   Response code meaning and expected client behavior:

   The UTF8 response code indicates that a failure is due to a request
   when not in UTF-8 mode for message content containing UTF-8 strings.
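
   A client can recognize this condition by extracting the bracketed
   response code from the status line, as in the non-normative Python
   sketch below.  The function names are hypothetical; the bracketed
   syntax is the one defined by the POP3 extension mechanism
   [RFC2449].  Because Section 2 lists the UTF8 command as valid only
   in the AUTHORIZATION state, a client that sees this code would
   presumably need a new session in order to enter UTF-8 mode.

```python
import re
from typing import Optional

# "-ERR [CODE] human-readable text"
_RESPONSE_CODE = re.compile(r"^-ERR \[([^\]]+)\]")


def response_code(status_line: str) -> Optional[str]:
    """Return the upper-cased response code of a -ERR line, if any."""
    match = _RESPONSE_CODE.match(status_line)
    return match.group(1).upper() if match else None


def needs_utf8_mode(status_line: str) -> bool:
    """True if the failure was reported with the UTF8 response code."""
    return response_code(status_line) == "UTF8"
```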

   The client MAY reissue the command after entering UTF-8 mode.

6.  IANA Considerations

   Sections 2 and 3 of this specification update two capabilities
   ("UTF8" and "LANG") in the POP3 capability registry [RFC2449].

   Section 5 of this specification also adds one new response code
   ("UTF8") to the POP3 response codes registry [RFC2449].

7.  Security Considerations

   The security considerations of UTF-8 [RFC3629], SASLprep [RFC4013]
   and Unicode Format for Network Interchange [RFC5198] apply to this
   specification, particularly with respect to use of UTF-8 in user
   names and passwords.

   The "LANG *" command might reveal the existence and preferred
   language of a user to an active attacker probing the system if the
   active language changes in response to the USER, PASS, or APOP
   commands prior to validating the user's credentials.  Servers are
   strongly advised to implement a configuration to prevent this
   exposure.

   It is possible for a man-in-the-middle attacker to insert a LANG
   command in the command stream, thus making protocol-level diagnostic
   responses unintelligible to the user.  A mechanism to protect the
   integrity of the session, such as Transport Layer Security (TLS)
   [RFC2595], can be used to defeat such attacks.

   As with other internationalization upgrades, modifications to server
   authentication code (in this case, to support non-ASCII strings)
   need to be made with care to avoid introducing vulnerabilities (for
   example, in string parsing or matching).  This is particularly
   important if the native databases or mailstore of the operating
   system use some character set or encoding other than Unicode in
   UTF-8.

8.  Change History

8.1.  draft-ietf-eai-rfc5721bis: Version 00

   following the new charter

8.2.  draft-ietf-eai-rfc5721bis:  Version 01

   refine the texts

8.3.  draft-ietf-eai-rfc5721bis:  Version 02

   update the texts based on Joseph's comments

8.4.  draft-ietf-eai-rfc5721bis:  Version 03

   improve the texts

   text instructing servers to either downconvert or reject

   new UTF-8 response code for use

8.5.  draft-ietf-eai-rfc5721bis:  Version 04

   improve the texts

8.6.  draft-ietf-eai-rfc5721bis:  Version 05

   updated according to jabber interim meeting result

   updated according to john and apparea's review comments

8.7.  draft-ietf-eai-rfc5721bis:  Version 06

   improve the texts, updated section 3.2 to provide for SASL successor
   specs.

8.8.  draft-ietf-eai-rfc5721bis:  Version 07

   updated according to John's comments

9.  References

9.1.  Normative References

   [I-D.ietf-eai-5738bis]  Resnick, P., Newman, C., and S. Shen, "IMAP
                           Support for UTF-8", draft-ietf-eai-5738bis-03
                           (work in progress), December 2011.

   [RFC1939]               Myers, J. and M. Rose, "Post Office Protocol
                           - Version 3", STD 53, RFC 1939, May 1996.

   [RFC2045]               Freed, N. and N. Borenstein, "Multipurpose
                           Internet Mail Extensions (MIME) Part One:
                           Format of Internet Message Bodies", RFC 2045,
                           November 1996.

   [RFC2047]               Moore, K., "MIME (Multipurpose Internet Mail
                           Extensions) Part Three: Message Header
                           Extensions for Non-ASCII Text", RFC 2047,
                           November 1996.

   [RFC2119]               Bradner, S., "Key words for use in RFCs to
                           Indicate Requirement Levels", BCP 14,
                           RFC 2119, March 1997.

   [RFC2449]               Gellens, R., Newman, C., and L. Lundblade,
                           "POP3 Extension Mechanism", RFC 2449,
                           November 1998.

   [RFC3454]               Hoffman, P. and M. Blanchet, "Preparation of
                           Internationalized Strings ("stringprep")",
                           RFC 3454, December 2002.

   [RFC3629]               Yergeau, F., "UTF-8, a transformation format
                           of ISO 10646", STD 63, RFC 3629,
                           November 2003.

   [RFC4013]               Zeilenga, K., "SASLprep: Stringprep Profile
                           for User Names and Passwords", RFC 4013,
                           February 2005.

   [RFC4647]               Phillips, A. and M. Davis, "Matching of
                           Language Tags", BCP 47, RFC 4647,
                           September 2006.

   [RFC5198]               Klensin, J. and M. Padlipsky, "Unicode Format
                           for Network Interchange", RFC 5198,
                           March 2008.

   [RFC5322]               Resnick, P., Ed., "Internet Message Format",
                           RFC 5322, October 2008.

   [RFC5646]               Phillips, A. and M. Davis, "Tags for
                           Identifying Languages", BCP 47, RFC 5646,
                           September 2009.

   [RFC6530]               Klensin, J. and Y. Ko, "Overview and
                           Framework for Internationalized Email",
                           RFC 6530, February 2012.

   [RFC6532]               Yang, A., Steele, S., and N. Freed,
                           "Internationalized Email Headers", RFC 6532,
                           February 2012.

9.2.  Informative References

   [RFC2595]               Newman, C., "Using TLS with IMAP, POP3 and
                           ACAP", RFC 2595, June 1999.

   [RFC5034]               Siemborski, R. and A. Menon-Sen, "The Post
                           Office Protocol (POP3) Simple Authentication
                           and Security Layer (SASL) Authentication
                           Mechanism", RFC 5034, July 2007.

   [RFC5721]               Gellens, R. and C. Newman, "POP3 Support for
                           UTF-8", RFC 5721, February 2010.

Appendix A.  Design Rationale

   This non-normative section discusses the reasons behind some of the
   design choices in the above specification.

   Due to interoperability problems with RFC 2047 and limited deployment
   of RFC 2231, it is hoped these 7-bit encoding mechanisms can be
   deprecated in the future when UTF-8 header support becomes prevalent.

   The USER capability (Section 2.2), and hence the upgraded USER
   command and additional support for non-ASCII credentials, are
   optional because the implementation burden of SASLprep [RFC4013] is
   not well understood, and mandating such support in all cases could
   negatively impact deployment.

Appendix B.  Acknowledgments

   Thanks to John Klensin, Joseph Yee, Tony Hansen, Alexey Melnikov and
   other Email Address Internationalization working group participants
   who provided helpful suggestions and interesting debate that improved
   this specification.

Authors' Addresses

   Randall Gellens
   QUALCOMM Incorporated
   5775 Morehouse Drive
   San Diego, CA  92651
   US

   EMail: rg+ietf@qualcomm.com


   Chris Newman
   Oracle
   800 Royal Oaks
   Monrovia, CA  91016-6347
   US

   EMail: chris.newman@oracle.com


   Jiankang YAO
   CNNIC
   No.4 South 4th Street, Zhongguancun
   Beijing

   Phone: +86 10 58813007
   EMail: yaojk@cnnic.cn


   Kazunori Fujiwara
   Japan Registry Services Co., Ltd.
   Chiyoda First Bldg. East 13F, 3-8-1 Nishi-Kanda
   Tokyo

   Phone: +81 3 5215 8451
   EMail: fujiwara@jprs.co.jp