Character Normalization in IETF Protocols
draft-duerst-i18n-norm-04
Document | Type | Expired Internet-Draft (individual) | |
---|---|---|---|
Authors | Martin Dürst, Mark Davis | ||
Last updated | 2000-09-13 | ||
Stream | (None) | ||
Intended RFC status | (None) | ||
Stream | Stream state | (No stream defined) | |
Consensus Boilerplate | Unknown | ||
RFC Editor Note | (None) | ||
IESG | IESG state | Expired | |
Telechat date | |||
Responsible AD | (None) | ||
Send notices to | (None) |
https://www.ietf.org/archive/id/draft-duerst-i18n-norm-04.txt
Abstract
The Universal Character Set (UCS) [ISO10646, Unicode] covers a very wide repertoire of characters. The IETF, in [RFC 2277], requires that future IETF protocols support UTF-8 [RFC 2279], an ASCII-compatible encoding of the UCS. The wide range of characters included in the UCS has led to some cases of duplicate encodings. This document proposes that, in IETF protocols, the class of duplicates called canonical equivalents be dealt with by using Early Uniform Normalization according to Unicode Normalization Form C, Canonical Composition (NFC) [UTR15]. This document describes both Early Uniform Normalization and Normalization Form C.
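As an illustration of the canonical equivalents the abstract refers to, the following minimal sketch uses Python's standard `unicodedata` module to show two code point sequences for the same text and their reduction to NFC before UTF-8 encoding. The helper name `to_nfc` and the sample strings are hypothetical and chosen only for this example. Reading Early Uniform Normalization as "normalize once, where the text is produced" is an assumption based on the term itself, not a quotation from the draft.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# Two canonically equivalent spellings of the same name:
precomposed = "D\u00fcrst"   # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "Du\u0308rst"   # "u" followed by U+0308 COMBINING DIAERESIS

# Without normalization, a plain comparison treats them as different strings.
assert precomposed != decomposed

# After NFC, both spellings collapse to the same code point sequence.
assert to_nfc(decomposed) == precomposed
assert to_nfc(precomposed) == precomposed

# Assumption: the sender normalizes once, before putting text on the wire,
# so that receivers can compare the UTF-8 bytes directly.
wire_bytes = to_nfc(decomposed).encode("utf-8")
assert wire_bytes == precomposed.encode("utf-8")
```

If senders normalize in this way, a receiver can match identifiers by comparing bytes and does not need to run its own normalization step.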
Authors
Martin Dürst
(duerst@it.aoyama.ac.jp)
Mark Davis
(mark.davis@macchiato.com)
(Note: The e-mail addresses provided for the authors of this Internet-Draft may no longer be valid.)