ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages
RFC 1842

Document Type RFC - Informational (August 1995; No errata)
Last updated 2013-03-02
Stream Legacy
Network Working Group                                             Y. Wei
Request for Comments: 1842                        AsiaInfo Services Inc.
Category: Informational                                         Y. Zhang
                                                           Harvard Univ.
                                                                   J. Li
                                                              Rice Univ.
                                                                 J. Ding
                                                  AsiaInfo Services Inc.
                                                                Y. Jiang
                                                       Univ. of Maryland
                                                             August 1995

      ASCII Printable Characters-Based Chinese Character Encoding
                         for Internet Messages

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Abstract

   This document describes the encoding used in electronic mail [RFC822]
   and network news [RFC1036] messages over the Internet. The 7-bit
   representation of GB 2312 Chinese text was specified by Fung Fung Lee
   of Stanford University [Lee89] and implemented in various software
   packages under different platforms (see appendix for a partial list
   of the available software packages that support this encoding
   method). It has since been tested and used in the Usenet newsgroups
   alt.chinese.text and chinese.* as well as various other network
   forums with considerable success. Future extensions of this encoding
   method can accommodate additional GB character sets and other East
   Asian language character sets [Wei94].

   The name given to this encoding is "HZ-GB-2312", which is intended to
   be used in the "charset" parameter field of MIME headers (see [MIME1]
   and [MIME2]).
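
   As an illustration only, and not part of the original
   specification, the following Python sketch uses the "hz" codec that
   ships with Python's standard library, which the codec registry also
   resolves under the name "hz-gb-2312". The sample string and the
   variable names are assumptions chosen for the example. It shows
   that the encoded form of a GB 2312 string contains only 7-bit
   bytes and decodes back to the original text.

```python
import codecs

# "\u4e2d\u6587" is the two-character word for "Chinese (language)".
# Escapes are used so that this source file itself stays pure ASCII.
text = "\u4e2d\u6587"

# Python's codec registry resolves "hz-gb-2312" to its built-in "hz" codec.
encoded = text.encode("hz-gb-2312")

# Every byte of the encoded form should fit in 7 bits.
is_seven_bit = all(b < 0x80 for b in encoded)

# Decoding should recover the original string.
decoded = encoded.decode("hz-gb-2312")

print(codecs.lookup("hz-gb-2312").name)
print(encoded, is_seven_bit, decoded == text)
```

   A message carrying such text would label it with
   charset="HZ-GB-2312" in its MIME Content-Type header.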

Wei, et al                   Informational                      [Page 1]
RFC 1842            ASCII/Chinese Character Encoding         August 1995

Table of Contents

   1.     Introduction................................................ 2
   2.     Description................................................. 3
   3.     Formal Syntax............................................... 4
   4.     MIME Considerations......................................... 5
   5.     Background Information...................................... 5
   6.     References.................................................. 6
   7.     Acknowledgements............................................ 6
   8.     Security Considerations..................................... 7
   9.     Authors' Addresses.......................................... 7
   10.    Appendix: List of Software Implementing HZ Representation... 9

1. Introduction

   Characters of Chinese (and other East Asian languages) are encoded
   with multiple bytes to guarantee sufficient coding space for the
   large number of glyphs these languages contain. With the
   proliferation of internetwork traffic around the world, it has
   become necessary to define ways to facilitate the transfer of text
   in multiple-byte character-set languages (hereafter "Chinese text")
   over the Internet.

   Any mechanism for transferring Chinese text over the Internet must
   address concerns at two layers. The first is the application layer,
   where applications should be able to recognize the encoding of the
   text and/or discern the different character sets that may be mixed
   in it, and handle the text accordingly. The second is the actual
   transport of Chinese text from point A to point B over the
   Internet. Because the prevailing mail transport protocol on the
   Internet, the Simple Mail Transfer Protocol (SMTP), was originally
   designed for the ASCII character set only, many Internet mail
   agents are not 8-bit clean. This poses a challenge for any attempt
   to implement a mechanism for transporting Chinese text over the
   Internet.
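
   The transport problem can be sketched in a few lines of Python.
   This is an illustrative simulation, not part of the original memo:
   strip_high_bit is a hypothetical stand-in for a mail agent that is
   not 8-bit clean, and the sample string is an assumption. The sketch
   compares 8-bit GB 2312 bytes with the 7-bit HZ form of the same
   text, using the "gb2312" and "hz" codecs from Python's standard
   library.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Hypothetical model of a mail agent that clears bit 8 of every byte."""
    return bytes(b & 0x7F for b in data)


# "\u4e2d\u6587" ("Chinese language"), written with escapes to stay ASCII.
text = "\u4e2d\u6587"

# 8-bit form: each GB 2312 character becomes two bytes with the high bit set.
eight_bit = text.encode("gb2312")
# 7-bit form: the HZ representation of the same text.
seven_bit = text.encode("hz")

# The 8-bit form is altered by the simulated agent; the 7-bit form is not.
eight_bit_damaged = strip_high_bit(eight_bit) != eight_bit
seven_bit_intact = strip_high_bit(seven_bit) == seven_bit

# The HZ bytes still decode to the original text after transport.
recovered = strip_high_bit(seven_bit).decode("hz")

print(eight_bit_damaged, seven_bit_intact, recovered == text)
```

   In this model the 8-bit bytes arrive as unrelated ASCII letters,
   while the HZ bytes pass through unchanged.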

   Here we describe a mechanism for transmitting Chinese text over IP
   networks. The mechanism has been implemented by various software
   packages that provide multi-language support and has been tested on
   USENET newsgroups and other Internet forums over the last two
   years. The test results show that the HZ representation can pass
   through almost all existing mail delivery agents without being
   corrupted. The HZ representation currently handles the GB2312-80
   Chinese character set only. Expansion to other Chinese encoding
   systems and to other East Asian languages is under consideration.
