Character Sets ISO-10646 and ISO-10646-J-1
RFC 1815

Document Type RFC - Informational (July 1995; No errata)
Last updated 2013-03-02
Network Working Group                                            M. Ohta
Request For Comments: 1815                 Tokyo Institute of Technology
Category: Informational                                        July 1995

               Character Sets ISO-10646 and ISO-10646-J-1

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Abstract

   Though the ISO character set standard ISO 10646 is reasonably well
   specified for European characters, it is not so useful in a fully
   internationalized environment.

   For practical use of ISO 10646, a lot of external profiling is
   necessary, such as restricting the characters, restricting the
   combinations of characters, and adding language information.

   This memo provides information on such profiling, along with a
   charset name for each profiled instance.

   Though every effort has been made to make the resulting charsets as
   useful as 10646-based charsets can be, the result is not so good.
   So, the charsets defined in this memo are for reference purposes
   only, and their use for practical purposes is strongly discouraged.

Introduction

   This memo describes two text encoding schemes based on ISO 10646
   [10646].

   As ISO 10646 specifies too little about how text is displayed, using
   it in practice requires restricting the standard minimally and then
   adding some amount of profiling information.

   For national standards based on ISO 2022 [ISO2022], sufficient
   profiling information is provided by national standardization
   bodies, but for ISO 10646 no such profiling is yet provided.

   Because the profiling of ISO 10646 largely determines which
   characters or combinations of characters can be properly displayed,
   a change of profiling of ISO 10646 is as significant as the addition
   of a new character set to ISO 2022.

M. Ohta                      Informational                      [Page 1]
RFC 1815       Character Sets ISO-10646 and ISO-10646-J-1      July 1995

   That is, it's impractical to support the entirety of ISO 10646 (new
   restriction or profiling can always be added), so a client needs to
   know whether some restriction or profiling is being used before it
   can decide whether to display the body part.  Thus, it is necessary
   to provide multiple charset names, one for each variation of ISO
   10646.
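
   The display decision described above can be sketched as follows.
   This is only an illustration, not something this memo specifies: the
   function name can_display and the set SUPPORTED_CHARSETS are
   hypothetical, and which charset names a real client supports is an
   assumption.  The sketch reads the charset parameter of a message's
   Content-Type header with Python's standard email package.

```python
from email import message_from_string

# Hypothetical set of charset names this client is able to render.
# The name "iso-10646" comes from this memo; the choice of which
# profiles a client supports is an assumption made for illustration.
SUPPORTED_CHARSETS = {"us-ascii", "iso-10646"}


def can_display(raw_message, supported=SUPPORTED_CHARSETS):
    """Return True if the charset named by the message is supported."""
    msg = message_from_string(raw_message)
    # get_content_charset() returns the charset parameter in lower case,
    # or the fallback value when no charset parameter is present.
    charset = msg.get_content_charset(failobj="us-ascii")
    return charset in supported
```

   With these assumed settings, a body part labelled
   "charset=ISO-10646" would be displayed, while one labelled
   "charset=ISO-10646-J-1" would be rejected before any attempt to
   render it.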

   For example, in Japan with Japanese Windows NT, only those Han
   characters already supported by MS Kanji code (mostly equivalent to
   JIS X 0208 [JISX0208]) can be displayed, because no other font
   patterns are commonly provided.
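
   A rough way to see the effect of such a restriction is to test which
   characters of a text survive conversion to the legacy code.  The
   sketch below assumes that Python's "cp932" codec is an acceptable
   stand-in for MS Kanji code; that mapping, and the helper name
   undisplayable, are assumptions and not part of this memo.

```python
def undisplayable(text, codec="cp932"):
    """List the characters of text that the given legacy codec lacks.

    The default codec "cp932" is assumed here to approximate MS Kanji
    code; a character that cannot be encoded is treated as having no
    font pattern on such a system.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing
```

   Under that assumption, ordinary Japanese text yields an empty list,
   while a Han character used only in simplified Chinese is reported as
   missing.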

   The other problem of ISO 10646 for Han characters is that, to display
   them in the quality required for daily plain text processing in
   China/Japan/Korea, it is necessary to add profiling information on
   which one of Chinese/Japanese/Korean the text is written in.  It
   should be noted that this requirement makes multilingual mixed
   Chinese/Japanese/Korean text with ISO 10646 impractical.

   Also, just as [RFC1521] was unclear about how bi-directionality
   should be supported with "ISO-8859-6" and "ISO-8859-8", which was
   corrected by [RFC1556], it is also unclear how bi-directionality
   could be supported with ISO 10646.  There are too many ways to
   support bi-directionality.  So, until some bi-directionality
   mechanism(s) become widely supported, it is necessary to exclude
   characters for languages which require bi-directionality support
   from the minimal variation.  It should be noted that, though ISO
   10646 is intended to be free from long term states, save for some
   profiling information, introduction of bi-directionality with ISO
   10646 does require long term states.
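
   As an illustration of the exclusion described above, a receiver
   could check whether a text contains any character that calls for
   bi-directional handling.  The sketch relies on the bidirectional
   classes in Python's unicodedata module, which follow the Unicode
   character database; treating those classes as applicable to
   ISO 10646 text, the chosen set of classes, and the name needs_bidi
   are all assumptions, not definitions from this memo.

```python
import unicodedata

# Bidirectional classes assumed to signal a need for bi-directionality
# support: right-to-left letters, Arabic letters, Arabic numbers, and
# the explicit right-to-left embedding and override controls.
RTL_CLASSES = {"R", "AL", "AN", "RLE", "RLO"}


def needs_bidi(text):
    """Return True if any character belongs to a right-to-left class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)
```

   A text for which needs_bidi returns True would fall outside a
   minimal variation that excludes such characters.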

   Combining characters also cause problems.  In many countries where
   combining characters based on [ISO2022] are used, there are
   restrictions on how combining characters are ordered [TIS].  Without
   such restrictions, the result of combination is completely
   meaningless, which is the current state of ISO 10646.  That is, if
   one implementation allows some combination while another does not
   support it, communication between them is difficult unless ISO
   10646 is profiled to be the least common set of widely supported
   combinations.  So, again, until combination restrictions are
   developed for each language, it is necessary to exclude characters
   for such languages from the minimal variation.
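
   A check of the same kind can flag text that depends on combining
   characters.  This is a minimal sketch, assuming that the Unicode
   general categories for marks (Mn, Mc, Me), as reported by Python's
   unicodedata module, identify the combining characters in question;
   the helper name has_combining is hypothetical.

```python
import unicodedata


def has_combining(text):
    """Return True if text contains any combining mark.

    A character is treated as combining when its general category
    starts with "M" (Mn, Mc or Me).  The category is used rather than
    the canonical combining class because some marks, such as Thai
    vowel signs, have combining class 0.
    """
    return any(unicodedata.category(ch).startswith("M") for ch in text)
```

   For example, "e" followed by U+0301 (combining acute accent) is
   flagged, while the precomposed character U+00E9 is not, which shows
   how two implementations that support different combinations can
   disagree about the same visible text.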

   Conjoining characters, too, may or may not be supported, which
   requires further profiling.

   Based on these considerations, this memo defines two variations of
   ISO 10646.  They are "ISO-10646" as the minimal basic variation and
   "ISO-10646-J-1" as the variation which could be useful in Japan.
