Those Troublesome Characters: A Registry of Unicode Code Points Needing Special Consideration When Used in Network Identifiers
draft-freytag-troublesome-characters-02

Document Type Active Internet-Draft (individual)
Last updated 2018-06-30
IESG state I-D Exists
IETF                                                          A. Freytag
Internet-Draft                                               ASMUS, Inc.
Intended status: Standards Track                              J. Klensin
Expires: December 31, 2018
                                                             A. Sullivan
                                                            Oracle Corp.
                                                           June 29, 2018

Those Troublesome Characters: A Registry of Unicode Code Points Needing
         Special Consideration When Used in Network Identifiers
                draft-freytag-troublesome-characters-02

Abstract

   Unicode's design goal is to be the universal character set for all
   applications.  That goal entails the inclusion of very large numbers
   of characters.  Unicode is also focused on written language in
   general; special provisions have always been needed for identifiers.
   The sheer size of the repertoire increases the possibility of
   accidental or intentional use of characters that can cause confusion
   among users, particularly where linguistic context is ambiguous,
   unavailable, or impossible to determine.  A registry of code points
   that can sometimes be especially problematic may be useful to guide
   system administrators in setting parameters for allowable code points
   or combinations in an identifier system, and to aid applications in
   creating security aids for users.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 31, 2018.

Freytag, et al.         Expires December 31, 2018               [Page 1]
Internet-Draft           Troublesome Characters                June 2018

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Unicode code points and identifiers . . . . . . . . . . . . .   3
   2.  Background and Conventions  . . . . . . . . . . . . . . . . .   5
   3.  Techniques already in place . . . . . . . . . . . . . . . . .   5
   4.  A registry of code points requiring special attention . . . .   7
     4.1.  Description . . . . . . . . . . . . . . . . . . . . . . .   7
     4.2.  Maintenance . . . . . . . . . . . . . . . . . . . . . . .  10
     4.3.  Scope . . . . . . . . . . . . . . . . . . . . . . . . . .  10
   5.  Registry initial contents . . . . . . . . . . . . . . . . . .  11
     5.1.  Overview  . . . . . . . . . . . . . . . . . . . . . . . .  11
     5.2.  Interchangeable Code Points . . . . . . . . . . . . . . .  12
     5.3.  Excludable Code Points  . . . . . . . . . . . . . . . . .  13
     5.4.  Combining Marks . . . . . . . . . . . . . . . . . . . . .  14
     5.5.  Mitigation  . . . . . . . . . . . . . . . . . . . . . . .  15
       5.5.1.  Mitigation Strategies . . . . . . . . . . . . . . . .  16
       5.5.2.  Limits of Mitigation  . . . . . . . . . . . . . . . .  18
     5.6.  Notes . . . . . . . . . . . . . . . . . . . . . . . . . .  19
   6.  Table of Code Points  . . . . . . . . . . . . . . . . . . . .  19
     6.1.  References for Registry . . . . . . . . . . . . . . . . .  27
   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  28
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  29
   9.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  29
     9.1.  Normative References  . . . . . . . . . . . . . . . . . .  29
     9.2.  Informative References  . . . . . . . . . . . . . . . . .  30
   Appendix A.  Additional Background  . . . . . . . . . . . . . . .  31
     A.1.  The Theory of Inclusion . . . . . . . . . . . . . . . . .  31
     A.2.  The Difference Between Theory and Practice  . . . . . . .  33
       A.2.1.  Confusability . . . . . . . . . . . . . . . . . . . .  33