Internet Draft                      Author: Ben Chan
July 15, 2001
Expires in six months


                Supreme Chinese Domain Name System
                     draft-chan-idn-scdns-00.txt




Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html


Abstract

Chinese can be written in 2 different scripts, traditional Chinese and
simplified Chinese.  Many people, depending on their background,
culture, or group, do not distinguish between the two and use them
interchangeably.  As a result, users of Chinese Domain Names (CDNs)
have special needs that can only be satisfied by adding a label to each
CDN that distinguishes a CDN with traditional characters from a CDN
with simplified characters.  This labeling forms an entire system that
can be accomplished with SLDs or by creating a new type of TLD called a
Language Script TLD (lsTLD).  This draft describes the benefits that
the system will provide and the techniques involved in implementing it.


1.  Introduction

(Labeling of a CDN can be accomplished with either SLDs or lsTLDs.
However, for simplicity, most of this draft will only use lsTLD to
describe the system and its techniques.  For more information on how
SLDs can be used, please see section 5.2.)

The <.traditional> and <.simplified> TLDs in Chinese characters are:

a) <.traditional> in traditional Chinese is '.' (U+7E41)(U+9AD4)
b) <.traditional> in simplified Chinese is '.' (U+7E41)(U+4F53)
c) <.simplified> in simplified Chinese is '.' (U+7B80)(U+4F53)
d) <.simplified> in traditional Chinese is '.' (U+7C21)(U+9AD4)
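
Because the labels above are given only as Unicode code points, a short
sketch may help show how they can be rendered as actual characters.
The table name LSTLD_CODE_POINTS and the function render_label are
hypothetical names chosen for this illustration; only the code points
come from this draft.

```python
# Code points copied from items a) to d) of this section.
# Keys are (meaning of the label, script the label is written in).
LSTLD_CODE_POINTS = {
    ("traditional", "TC"): (0x7E41, 0x9AD4),
    ("traditional", "SC"): (0x7E41, 0x4F53),
    ("simplified", "SC"): (0x7B80, 0x4F53),
    ("simplified", "TC"): (0x7C21, 0x9AD4),
}


def render_label(code_points):
    """Turn a sequence of code points into a dotted TLD label."""
    return "." + "".join(chr(cp) for cp in code_points)


if __name__ == "__main__":
    for (meaning, script), cps in LSTLD_CODE_POINTS.items():
        print(f"<.{meaning}> written in {script}: {render_label(cps)}")
```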

Using the 4 language script TLDs above, a Chinese Domain Name System
can be created that satisfies the needs of CDN users by combining the
following 3 benefits:

Benefit A- A registrant is given the choice of pointing a traditional
CDN to one location (e.g. a traditional Chinese website) and pointing
the corresponding simplified CDN to another location (e.g. a simplified
Chinese website).

Benefit B- The registration of a simplified CDN will automatically
reserve the corresponding traditional CDN(s) and vice versa, thus
giving users the flexibility of using either script.

Benefit C- A method that will guide users to enter a CDN in their
applications (e.g. web browsers) in the form that matches the meaning
intended by the registrant when the CDN was registered.


2.  Description of the Importance of the Benefits

It is important to understand the needs of ordinary everyday people who
will be the users of CDNs.  The following subsections explain in detail
the 3 benefits of this system that satisfy those needs.


2.1  Importance of Benefit A.

It would be appropriate for a traditional CDN to point to a traditional
Chinese website with content that is suitable for visitors from Hong
Kong or Taiwan.  Along the same lines, it would be appropriate for a
simplified CDN to point to a simplified Chinese website with content
that is suitable for visitors from China or Singapore.


2.2  Importance of Benefit B.

Since many Chinese users can read and write both scripts, it is
appropriate for a traditional CDN to be mapped to its corresponding
simplified CDN by applying a conversion.  This will ensure that no
matter which script the user types in, he will always be able to reach
the intended location(s).


2.3  Importance of Benefit C.

The relationship between simplified Chinese (SC) and traditional
Chinese (TC) is very complicated.  A TC character that corresponds to
an SC character may not have the same meaning.  To complicate the
situation further, one TC character can map to many different SC
characters and vice versa.  One CDN can therefore potentially have a
great number of different written variations.  Without a method to
guide him, a user who is given a CDN may type it in correctly and still
fail to reach the proper destination, because what he entered is a
variation of the original CDN that was not intended by the registrant.
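
To make the growth in variations concrete, the following is a minimal
sketch that enumerates the TC spellings of an SC label.  The mapping
table is an illustrative assumption and is not taken from this draft:
it uses two commonly cited one-to-many cases (SC U+53D1 corresponding
to TC U+767C and U+9AEE, and SC U+5E72 corresponding to TC U+4E7E,
U+5E79 and U+5E72).  The names SC_TO_TC and traditional_variants are
hypothetical.

```python
from itertools import product

# Illustrative one-to-many SC -> TC mapping (an assumption, not part of
# this draft).  A real table would be far larger.
SC_TO_TC = {
    "\u53d1": ["\u767c", "\u9aee"],
    "\u5e72": ["\u4e7e", "\u5e79", "\u5e72"],
}


def traditional_variants(sc_label):
    """Return every TC spelling that the SC label could correspond to."""
    # Characters missing from the table are assumed to map to themselves.
    choices = [SC_TO_TC.get(ch, [ch]) for ch in sc_label]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    variants = traditional_variants("\u53d1\u5e72")
    print(len(variants), "possible TC spellings:", variants)
```

The number of spellings is the product of the number of choices for
each character, which is why a label of only a few characters can
already have many written variations.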


3.  Solution / Method

This system is implemented both at the registration system and at the
client end.


3.1  Implemented at the registration system

The solution for delivering the 3 benefits explained above is a Chinese
domain name system that uses language script TLDs: a TLD of
<.traditional> for traditional CDNs (defined here as CDNs that use only
traditional characters) and a TLD of <.simplified> for simplified CDNs
(defined here as CDNs that use only simplified characters).  During
registration, a person is allowed to register a CDN in either all
traditional Chinese characters or all simplified Chinese characters,
but not in a mix of the 2 scripts.  If he registers in traditional
characters, he will be given a traditional CDN (with the TLD of
<.traditional>) and any similar traditional CDNs will be reserved.  At
the same time, the corresponding simplified CDN(s) (with the TLD of
<.simplified>) will also be reserved, to be activated at a later date
if the registrant chooses to do so.  If he registers in simplified
characters, he will be given a simplified CDN (with the TLD of
<.simplified>) and any similar simplified CDNs will be reserved.  At
the same time, the corresponding traditional CDN(s) (with the TLD of
<.traditional>) will also be reserved.
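
The registration rule above can be sketched as follows.  This is a
simplified illustration, not a registry implementation: it assumes a
one-to-one SC/TC mapping built from three pairs listed in Appendix A,
so each registration reserves exactly one counterpart, and it does not
model the reservation of "similar" CDNs.  The function name register
and its return shape are hypothetical.

```python
# Three (simplified, traditional) pairs taken from Appendix A.
PAIRS = [(0x7231, 0x611B), (0x5B9D, 0x5BF6), (0x8D1D, 0x8C9D)]
SC_TO_TC = {chr(sc): chr(tc) for sc, tc in PAIRS}
TC_TO_SC = {tc: sc for sc, tc in SC_TO_TC.items()}


def register(label, script):
    """Return (granted_name, reserved_name) for a single-script label."""
    if script == "traditional":
        forbidden, convert, other = SC_TO_TC, TC_TO_SC, "simplified"
    elif script == "simplified":
        forbidden, convert, other = TC_TO_SC, SC_TO_TC, "traditional"
    else:
        raise ValueError("script must be 'traditional' or 'simplified'")

    # Reject labels that contain characters of the other script.
    mixed = [ch for ch in label if ch in forbidden]
    if mixed:
        raise ValueError(f"label mixes scripts: {mixed!r}")

    # Characters absent from the table are assumed common to both scripts.
    counterpart = "".join(convert.get(ch, ch) for ch in label)
    return f"{label}.<{script}>", f"{counterpart}.<{other}>"


if __name__ == "__main__":
    print(register("\u7231\u5b9d", "simplified"))
```

With a one-to-many mapping, the second element would become a list of
reserved names rather than a single name.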


3.2  Implemented at the client end

If a user types in a traditional CDN (with the <.traditional> TLD), the
application (e.g. a web browser, using nameprep to prohibit invalid
entries) can perform error checking on the CDN by looking up each
character in a Unicode table containing all the valid traditional
Chinese characters.  If a character is found not to be a valid
traditional character, an error will be displayed to point out which
character is invalid.  If a user types in a simplified CDN (with the
<.simplified> TLD), the same error checking will be performed against a
table of valid simplified Chinese characters.  (Please see Appendix A
for a partial list of the Unicode code points acceptable in simplified
CDNs and in traditional CDNs.)
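
The client-side check described above might look like the following
sketch.  The two tables hold only three code points each, copied from
Appendix A; a real client would need the complete tables, which this
draft does not provide.  The names VALID_CHARACTERS and
invalid_characters are hypothetical.

```python
# Tiny stand-ins for the full tables of valid characters, using three
# code points per script taken from Appendix A.
VALID_CHARACTERS = {
    "simplified": {chr(cp) for cp in (0x7231, 0x5B9D, 0x8D1D)},
    "traditional": {chr(cp) for cp in (0x611B, 0x5BF6, 0x8C9D)},
}


def invalid_characters(label, script):
    """Return (position, code point) for each character not in the table."""
    valid = VALID_CHARACTERS[script]  # KeyError for an unknown script
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(label)
        if ch not in valid
    ]


if __name__ == "__main__":
    # A simplified character typed into a <.traditional> CDN is reported.
    for index, code_point in invalid_characters("\u611b\u5b9d", "traditional"):
        print(f"character {index} ({code_point}) is not valid here")
```

Returning the position of each offending character lets the
application point out exactly which character is invalid, as the text
above requires.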


4.  Conclusion

Under such a method of creating a relationship between the lsTLDs, all
3 benefits will be delivered.  Benefit A will be delivered because the
registrant can point <whatever>.<traditional> to a traditional website
and point <whatever>.<simplified> to a simplified website.  Benefit B
will be delivered because a user who is given a
<whatever>.<traditional> CDN, but who is from mainland China and is
more comfortable using the simplified script, can simply use the
corresponding CDN of <whatever>.<simplified>.  In other words, a user
can use the script of his choice, whether traditional Chinese or
simplified Chinese, and still reach the location(s) intended by the
registrant.  Benefit C will be delivered because when the user is given
the <whatever>.<traditional> CDN, the <.traditional> tells him that he
must set his Chinese input editor to recognize TC only, thereby
preserving the meaning that was intended when the CDN was first
registered.  In other words, the language script TLDs give users much
more control and eliminate any guesswork.


5.  Other Comments

There are 2 important related issues: 'TC<->SC equivalence' and 'lsSLDs'.


5.1  TC<->SC equivalence

An interesting question is how this system is affected by TC<->SC
equivalence in the DNS protocol.  The answer is that it works even
better.  With TC<->SC equivalence in the DNS protocol, all 4 lsTLDs are
used.  No error checking will be performed.  The <.traditional> lsTLD
in both its simplified and traditional forms is considered equivalent
and points to the same location (e.g. a traditional website).  The
<.simplified> lsTLD in both its simplified and traditional forms is
considered equivalent and points to the same location (e.g. a
simplified website).  The author of this draft strongly endorses any
effort made to find a reasonable solution to TC<->SC equivalence.
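
As a rough illustration of the equivalence between the two written
forms of each lsTLD, the sketch below maps all 4 labels from section 1
onto 2 script classes.  It covers only the TLD label and says nothing
about how equivalence would be achieved inside the DNS protocol.  The
names LSTLD_SCRIPT and split_cdn are hypothetical.

```python
# The 4 lsTLD labels from section 1, keyed by their characters.
LSTLD_SCRIPT = {
    "\u7e41\u9ad4": "traditional",  # <.traditional> written in TC
    "\u7e41\u4f53": "traditional",  # <.traditional> written in SC
    "\u7b80\u4f53": "simplified",   # <.simplified> written in SC
    "\u7c21\u9ad4": "simplified",   # <.simplified> written in TC
}


def split_cdn(cdn):
    """Return (name, script class) for a CDN ending in one of the lsTLDs."""
    name, _, tld = cdn.rpartition(".")
    if tld not in LSTLD_SCRIPT:
        raise ValueError(f"not a language script TLD: {tld!r}")
    return name, LSTLD_SCRIPT[tld]


if __name__ == "__main__":
    # Both written forms of <.traditional> fall into the same class.
    print(split_cdn("example.\u7e41\u9ad4"))
    print(split_cdn("example.\u7e41\u4f53"))
```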


5.2  lsSLDs

The same techniques documented in this draft can also be applied to the
current gTLD and ccTLD registries by using SLDs.  In order to be fair,
everyone must agree to this system and make it a standard.  In
addition, every registry must change its currently registered second
level domains to third level domains (i.e.
<whatever>.<traditional>.TLD, <whatever>.<simplified>.TLD).


6.  Author's Address

Ben Chan
cc-www.com
Box 92241
2900 Warden Avenue
Scarborough, Ontario
Canada
M1W 3Y9


7.  References

[IDNREQ]  Requirements of Internationalized Domain Names, Zita Wenzel,
          James Seng, draft-ietf-idn-requirements



Appendix A-  Error checking for Unicodes of traditional/simplified
Chinese characters

(The following is a partial list for information only.  A complete list
will be presented upon actual implementation.)


Acceptable Unicodes             Acceptable Unicodes
for a simplified                for a traditional
CDN                             CDN


7691                            769A
788D                            7919
7231                            611B
8884                            8956
5965                            5967
575D                            58E9
7F62                            7F77
6446                            64FA
8D25                            6557
9881                            9812
529E                            8FA6
7ECA                            7D46
5E2E                            5E6B
7ED1                            7D81
9551                            938A
8C24                            8B17
5265                            525D
9971                            98FD
5B9D                            5BF6
62A5                            5831
9C8D                            9B91
8F88                            8F29
8D1D                            8C9D
94A1                            92C7
72C8                            72FD
5907                            5099
60EB                            618A
7EF7                            7E43
7B14                            7B46
6BD5                            7562
6BD9                            6583
5E01                            5E63
95ED                            9589
8FB9                            908A
7F16                            7DE8
8D2C                            8CB6
53D8                            8B8A

etc.