Lightweight Directory Access Protocol (LDAP): Internationalized String Preparation
RFC 4518

Document Type RFC - Proposed Standard (June 2006; Errata)
Author Kurt Zeilenga 
Last updated 2020-01-21
Stream IETF
Responsible AD Ted Hardie
Network Working Group                                        K. Zeilenga
Request for Comments: 4518                           OpenLDAP Foundation
Category: Standards Track                                      June 2006

             Lightweight Directory Access Protocol (LDAP):
                  Internationalized String Preparation

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).


Abstract

   The previous Lightweight Directory Access Protocol (LDAP) technical
   specifications did not precisely define how character string matching
   is to be performed.  This led to a number of usability and
   interoperability problems.  This document defines string preparation
   algorithms for character-based matching rules defined for use in
   LDAP.

1.  Introduction

1.1.  Background

   A Lightweight Directory Access Protocol (LDAP) [RFC4510] matching
   rule [RFC4517] defines an algorithm for determining whether a
   presented value matches an attribute value in accordance with the
   criteria defined for the rule.  The proposition may be evaluated to
   True, False, or Undefined.

      True      - the attribute contains a matching value,

      False     - the attribute contains no matching value,

      Undefined - it cannot be determined whether the attribute contains
                  a matching value.

Zeilenga                    Standards Track                     [Page 1]
RFC 4518       LDAP: Internationalized String Preparation      June 2006

   For instance, the caseIgnoreMatch matching rule may be used to
   compare whether the commonName attribute contains a particular value
   without regard for case and insignificant spaces.
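
   The three possible outcomes can be illustrated with a minimal
   sketch.  The names MatchResult, naive_prepare, and evaluate are
   hypothetical and do not come from this specification.  The
   preparation shown is a deliberately naive stand-in, and treating an
   unpreparable presented value as Undefined is an assumption made for
   illustration only.

```python
from enum import Enum
from typing import Callable, Iterable, Optional


class MatchResult(Enum):
    TRUE = "True"            # the attribute contains a matching value
    FALSE = "False"          # the attribute contains no matching value
    UNDEFINED = "Undefined"  # it cannot be determined


def naive_prepare(value: str) -> Optional[str]:
    """Hypothetical stand-in for string preparation.

    Folds case and squeezes runs of SPACE (U+0020). Returns None when the
    value cannot be prepared; U+FFFD is used here purely as an
    illustrative failure case.
    """
    if "\ufffd" in value:
        return None
    folded = value.casefold()
    return " ".join(part for part in folded.split(" ") if part)


def evaluate(
    attribute_values: Iterable[str],
    presented: str,
    prepare: Callable[[str], Optional[str]] = naive_prepare,
) -> MatchResult:
    """Evaluate a caseIgnoreMatch-like proposition to True/False/Undefined."""
    target = prepare(presented)
    if target is None:
        # Assumption: an unpreparable presented value yields Undefined.
        return MatchResult.UNDEFINED
    for stored in attribute_values:
        if prepare(stored) == target:
            return MatchResult.TRUE
    return MatchResult.FALSE
```

   For example, evaluate(["Kurt  Zeilenga"], " kurt zeilenga ") yields
   MatchResult.TRUE under this naive preparation, because case and
   the extra spaces are disregarded.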

1.2.  X.500 String Matching Rules

   "X.520: Selected attribute types" [X.520] provides (among other
   things) value syntaxes and matching rules for comparing values
   commonly used in the directory [X.500].  These specifications are
   inadequate for strings composed of Unicode [Unicode] characters.

   The caseIgnoreMatch matching rule [X.520], for example, is simply
   defined as being a case-insensitive comparison where insignificant
   spaces are ignored.  For printableString, there is only one space
   character and case mapping is bijective, hence this definition is
   sufficient.  However, for Unicode string types such as
   universalString, this is not sufficient.  For example, a case-
   insensitive matching implementation that folded lowercase characters
   to uppercase would yield different results than an implementation
   that used uppercase to lowercase folding.  Or one implementation may
   view space as referring to only SPACE (U+0020), a second
   implementation may view any character with the space separator (Zs)
   property as a space, and another implementation may view any
   character with the whitespace (WS) category as a space.
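
   Both ambiguities can be reproduced with a short sketch.  The function
   names are hypothetical, and the results depend on the Unicode data
   shipped with the Python runtime.  Reading "whitespace (WS)" as the
   Unicode bidirectional class WS is an assumption made for this
   illustration.

```python
import unicodedata


def match_fold_upper(a: str, b: str) -> bool:
    """Case-insensitive comparison that folds lowercase to uppercase."""
    return a.upper() == b.upper()


def match_fold_lower(a: str, b: str) -> bool:
    """Case-insensitive comparison that folds uppercase to lowercase."""
    return a.lower() == b.lower()


def is_space_only(ch: str) -> bool:
    """View 1: only SPACE (U+0020) counts as a space."""
    return ch == "\u0020"


def is_space_separator(ch: str) -> bool:
    """View 2: any character with general category Zs counts as a space."""
    return unicodedata.category(ch) == "Zs"


def is_bidi_whitespace(ch: str) -> bool:
    """View 3 (assumed reading): bidirectional class WS counts as a space."""
    return unicodedata.bidirectional(ch) == "WS"


def classify(ch: str) -> tuple:
    """Return the verdict of each of the three views for one character."""
    return (is_space_only(ch), is_space_separator(ch), is_bidi_whitespace(ch))


if __name__ == "__main__":
    # LATIN SMALL LETTER SHARP S uppercases to "SS" but has no
    # single-character lowercase change, so the two foldings disagree.
    print(match_fold_upper("straße", "strasse"))
    print(match_fold_lower("straße", "strasse"))
    for ch in ("\u0020", "\u00a0", "\u000c"):
        print(f"U+{ord(ch):04X}", classify(ch))
```

   With these definitions, the two folding directions disagree on
   whether "straße" matches "strasse", and the three space tests
   disagree on NO-BREAK SPACE (U+00A0) and FORM FEED (U+000C).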

   The lack of precise specification for character string matching has
   led to significant interoperability problems.  When used in
   certificate chain validation, security vulnerabilities can arise.  To
   address these problems, this document defines precise algorithms for
   preparing character strings for matching.

1.3.  Relationship to "stringprep"

   The character string preparation algorithms described in this
   document are based upon the "stringprep" approach [RFC3454].  In
   "stringprep", presented and stored values are first prepared for
   comparison so that a character-by-character comparison yields the
   "correct" result.

   The approach used here is a refinement of the "stringprep" [RFC3454]
   approach.  Each algorithm involves two additional preparation steps.

   a) Prior to applying the Unicode string preparation steps outlined in
      "stringprep", the string is transcoded to Unicode.

   b) After applying the Unicode string preparation steps outlined in
      "stringprep", the string is modified to appropriately handle
      characters insignificant to the matching rule.

   Hence, preparation of character strings for X.500 [X.500] matching
   [X.501] involves the following steps:

      1) Transcode
      2) Map
      3) Normalize
      4) Prohibit
      5) Check Bidi (Bidirectional)
      6) Insignificant Character Handling

   These steps are described in Section 2.
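
   The ordering of the six steps can be sketched as a simple pipeline.
   This is not the algorithm of Section 2: every function name below is
   hypothetical, the character sets handled in each step are small
   illustrative samples rather than the tables of this specification,
   and the Bidi check and insignificant character handling are
   simplified placeholders.  UTF-8 input is assumed for the transcoding
   step.

```python
import unicodedata
from typing import Optional


def transcode(raw: bytes) -> str:
    """Step 1: transcode to Unicode (assumes UTF-8 input)."""
    return raw.decode("utf-8")  # raises UnicodeDecodeError on bad input


def map_chars(s: str) -> str:
    """Step 2: map characters (illustrative subset only)."""
    out = []
    for ch in s:
        if ch == "\u00ad":  # SOFT HYPHEN: dropped in this sketch
            continue
        # Space separators (Zs) become SPACE in this sketch.
        out.append(" " if unicodedata.category(ch) == "Zs" else ch)
    return "".join(out).casefold()  # case folding for case-ignore rules


def normalize(s: str) -> str:
    """Step 3: normalize (NFKC chosen here as an assumption)."""
    return unicodedata.normalize("NFKC", s)


def prohibit(s: str) -> str:
    """Step 4: reject prohibited characters (illustrative subset only)."""
    for ch in s:
        if ch == "\ufffd" or unicodedata.category(ch) in {"Cn", "Co", "Cs"}:
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return s


def check_bidi(s: str) -> str:
    """Step 5: placeholder; the real check is specified in Section 2."""
    return s


def handle_insignificant(s: str) -> str:
    """Step 6: simplified placeholder that trims and squeezes SPACE runs."""
    return " ".join(part for part in s.split(" ") if part)


def prepare(raw: bytes) -> Optional[str]:
    """Run the six steps in order; return None if any step fails."""
    try:
        s = transcode(raw)
        for step in (map_chars, normalize, prohibit, check_bidi,
                     handle_insignificant):
            s = step(s)
    except (UnicodeDecodeError, ValueError):
        return None
    return s
```

   Under these assumptions, prepare("Kurt\u00a0 ZEILENGA".encode("utf-8"))
   and prepare(b"kurt zeilenga") produce the same string, so a
   character-by-character comparison of the prepared values succeeds.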

   It is noted that while various tables of Unicode characters included
   or referenced by this specification are derived from Unicode
   [Unicode] data, these tables are to be considered definitive for the