Independent Submission                                       S. Sharikov
Request for Comments: 5992                                   Regtime Ltd
Category: Informational                                    D. Miloshevic
ISSN: 2070-1721                                                  Afilias
                                                              J. Klensin
                                                            October 2010
Internationalized Domain Names Registration and Administration
Guidelines for European Languages Using Cyrillic
Abstract
This document provides guidelines for registries and registrars on
registering, in a DNS zone, internationalized domain names (IDNs)
based on the following languages (in alphabetical order): Bosnian,
Bulgarian, Byelorussian, Kildin Sami, Macedonian, Montenegrin,
Russian, Serbian, and Ukrainian.  It describes the characters
appropriate for registration and the variant considerations for
characters from the Greek and Latin scripts that have similar
appearances and/or derivations.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This is a contribution to the RFC Series, independently of any other
RFC stream. The RFC Editor has chosen to publish this document at
its discretion and makes no statement about its value for
implementation or deployment. Documents approved for publication by
the RFC Editor are not a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc5992.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Sharikov, et al. Informational [Page 1]
RFC 5992 Cyrillic IDNs October 2010
Table of Contents
1. Introduction
  1.1. Similar Characters and Variants
  1.2. Terminology
2. Languages and Characters
  2.1. Bosnian and Serbian
  2.2. Bulgarian
  2.3. Byelorussian (Belarusian, Belarusan)
  2.4. Kildin Sami
  2.5. Macedonian
  2.6. Montenegrin
  2.7. Russian
  2.8. Serbian
  2.9. Ukrainian
3. Language-Based Tables
4. Table Processing Rules
5. Table Format
6. Steps after Registering an Input Label
7. Security Considerations
8. Acknowledgments
9. References
  9.1. Normative References
  9.2. Informative References
Appendix A. European Cyrillic Character Tables
1. Introduction
Cyrillic is one of a fairly small number of scripts that are used,
with different subsets of characters, to write a large number of
languages, some of which are not closely related to the others.  When
those languages might be used together in a zone, which is typical of
generic TLDs (gTLDs) but also likely in other zones both at and below
the root, special considerations for intermixing characters may apply.
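As an informal illustration of why intermixing matters, the following Python sketch checks which language repertoires can fully cover a candidate label.  The two repertoires are assumptions made for this example: they are the everyday modern Russian and Ukrainian alphabets, not the registration tables defined by this document, and the helper name languages_covering is hypothetical.

```python
import unicodedata

# Assumed, illustrative repertoires: the everyday modern alphabets only.
# They are NOT the registration tables defined by this document.
RUSSIAN = set(unicodedata.normalize("NFC", "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"))
UKRAINIAN = set(unicodedata.normalize("NFC", "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"))

REPERTOIRES = {"russian": RUSSIAN, "ukrainian": UKRAINIAN}


def languages_covering(label, repertoires):
    """Return the sorted names of repertoires containing every character of label."""
    # Normalize and lowercase so that precomposed and decomposed input compare equal.
    chars = set(unicodedata.normalize("NFC", label).lower())
    return sorted(name for name, allowed in repertoires.items() if chars <= allowed)


print(languages_covering("москва", REPERTOIRES))  # ['russian', 'ukrainian']
print(languages_covering("київ", REPERTOIRES))    # ['ukrainian']
print(languages_covering("объїзд", REPERTOIRES))  # []
```

The last label is written entirely in Cyrillic, yet no single repertoire in this sketch covers it, because it combines a letter used only in the assumed Russian set with one used only in the assumed Ukrainian set.  A zone that accepts several Cyrillic-based languages would need a policy for such labels.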
Cyrillic also has the property that, while it is usually considered a
separate script from the Latin (Roman) and Greek ones, it shares many
characters with them, creating opportunities for visual confusion.
Those difficulties are especially pronounced when "all of Cyrillic"
is used rather than only the characters associated with a particular