Network Working Group          Vietnamese Standardization Working Group
Request for Comments: 1456                                     May 1993

            Conventions for Encoding the Vietnamese Language
      VISCII: VIetnamese Standard Code for Information Interchange
             VIQR: VIetnamese Quoted-Readable Specification
                              Revision 1.1

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard.  Distribution of this memo is
   unlimited.

Abstract

   This document provides information to the Internet community on the
   currently used conventions for encoding Vietnamese characters in
   7-bit US ASCII and in an 8-bit form.  These conventions are widely
   used by the overseas Vietnamese who are on the Internet and are
   active in USENET.  This document only provides information and
   specifies no level of standard.

1. INTRODUCTION

   In this paper we describe two conventions for representing Vietnamese
   characters.  VISCII (pronounced "visky") is an 8-bit character
   encoding that is similar to that used with ISO-8859.  VIQR
   (pronounced "vicker") is a mnemonic encoding of Vietnamese characters
   into US ASCII for use on 7-bit systems.  A substantial body of freely
   distributable software implementing these conventions is available
   online for UNIX and personal computers.  These encodings enable
   Vietnamese-language users to take full advantage of powerful tools
   already developed for the English-speaking world, eliminating
   unnecessary reinvention.  This paper describes these conventions in
   part so that MIME-compliant software might also support the
   Vietnamese language.

   NOTE: The accented Vietnamese letters are herein represented by their
   VIQR equivalents, offset by enclosing angle brackets.  For example,
   the single letter "a acute" is written as <a'>, where the apostrophe
   is the mnemonic symbol for the acute.
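
   The notation can be illustrated with a minimal sketch that turns a
   single VIQR-style letter into one accented character.  It covers
   only the two mnemonics that appear in this section: the apostrophe
   for the acute, and the caret, which is assumed here to stand for the
   circumflex on the basis of the <e^> example.  The use of Unicode as
   the target, the MARKS table and the function name are illustrative
   assumptions and are not part of this memo.

```python
import unicodedata

# Assumption: only the two mnemonics seen in this section are mapped.
# The target is Unicode, chosen purely for illustration.
MARKS = {
    "'": "\u0301",  # apostrophe -> combining acute accent
    "^": "\u0302",  # caret -> combining circumflex accent
}


def viqr_letter_to_unicode(token: str) -> str:
    """Convert one VIQR-style letter such as "a'" to a single character.

    The first character is the base letter; every following character
    is looked up in MARKS. NFC normalization then fuses the base letter
    and its combining marks into one precomposed character.
    """
    base, marks = token[0], token[1:]
    combining = "".join(MARKS[m] for m in marks)
    return unicodedata.normalize("NFC", base + combining)


if __name__ == "__main__":
    for token in ("a'", "e^", "e'"):
        print(token, "->", viqr_letter_to_unicode(token))
```

   A mnemonic outside the small table raises KeyError, which keeps the
   sketch from silently guessing at marks it does not know.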

2. LINGUISTIC OVERVIEW

   As a romanized language, Vietnamese appears to lend itself readily to
   integration into existing English-based systems.  To cite a simple
   example, consider implementing support for French in such systems.
   One can allocate code positions in the 8-bit space necessary for
   accented letters such as <e^> or <e'>, then provide a means for users
   to access these codes through the keyboard.  The required number of
   "extra" code positions is small (see, e.g., ISO-8859/Latin-1 [1]),
   and the relatively low frequency of occurrence of accented letters
   does not place heavy demand on efficient keyboard input schemes.  The
   same cannot be said for Vietnamese, where both the number of accented
   letters and their frequency of occurrence are high.  Apart from the
   letters already available in ASCII, Vietnamese requires an additional
   134 combinations of a letter and diacritical marks.
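
   One way to arrive at the figure of 134 is sketched below.  The
   breakdown is not spelled out in this section; it assumes the modern
   Vietnamese alphabet of 12 vowel letters, each taking no tone mark or
   one of five tone marks, plus the letter d with stroke, in both
   cases.  The letters are built with Unicode combining marks purely
   for counting purposes, and all names in the code are illustrative.

```python
import unicodedata

# Assumption: the 12 vowel letters of modern Vietnamese, written as an
# ASCII base letter plus an optional combining vowel mark.
BREVE, CIRCUMFLEX, HORN = "\u0306", "\u0302", "\u031b"
VOWELS = [
    "a", "a" + BREVE, "a" + CIRCUMFLEX,
    "e", "e" + CIRCUMFLEX,
    "i",
    "o", "o" + CIRCUMFLEX, "o" + HORN,
    "u", "u" + HORN,
    "y",
]

# Assumption: no tone mark, then acute, grave, hook above, tilde and
# dot below.
TONES = ["", "\u0301", "\u0300", "\u0309", "\u0303", "\u0323"]


def non_ascii_vietnamese_letters():
    """Return every Vietnamese letter that is not already in ASCII."""
    letters = {"\u0111"}  # lower-case d with stroke
    for vowel in VOWELS:
        for tone in TONES:
            letters.add(unicodedata.normalize("NFC", vowel + tone))
    letters |= {ch.upper() for ch in letters}  # add the upper-case forms
    return {ch for ch in letters if ord(ch) > 0x7F}


extra = non_ascii_vietnamese_letters()
print(len(extra))  # prints 134
```

   In other words, 12 vowels times 6 tone choices gives 72 letters, of
   which 6 are plain ASCII vowels; the remaining 66 plus d with stroke
   make 67 per case, or 134 in all.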

   Note that one can resort to a composite encoding scheme to reduce
   this requirement, but that would mean giving up on integration into
   today's computing platforms, which for the most part do not support
   such schemes.  In addition, the heavy use of diacritical marks in
   Vietnamese text calls for a keyboard input scheme that does not
   require extra keystrokes such as a special "compose" key to generate
   accented letters.  Because of the large number of possible
   combinations, the scheme should also be easily learned and memorized.
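
   The difference between a composite scheme and the single-unit
   approach can be seen in a small sketch.  Unicode normalization forms
   are used here only as a convenient stand-in: NFD plays the role of a
   composite encoding (base letter followed by separate marks) and NFC
   the role of a precomposed one.  Neither form is part of this memo,
   and the sample word is an arbitrary choice.

```python
import unicodedata

# "Viet" with U+1EC7, a single precomposed letter: e with circumflex
# and dot below.
word = "Vi\u1ec7t"

precomposed = unicodedata.normalize("NFC", word)  # one unit per letter
composite = unicodedata.normalize("NFD", word)    # base letter + marks


def describe(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(len(precomposed), describe(precomposed))
print(len(composite), describe(composite))
```

   The precomposed form holds four units for the four letters, while
   the composite form needs six, because the third letter is split
   into a base "e" and two combining marks.  Software that assumes one
   unit per letter handles the former unchanged but not the latter.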

   Finally, to integrate Vietnamese into current electronic mail systems
   which are still limited to 7 bits, there should be a representation
   for Vietnamese text that is readily readable in its 7-bit form.

   The Viet-Std group, an electronic standardization roundtable, has
   worked over the past few years to draft proposals addressing these
   issues.  This work has culminated in the conventions described
   briefly in the next two sections.  The detailed technical
   considerations have been reported elsewhere [2].  In this memo we
   give a brief outline of the working standards and describe the
   availability of supporting software.

3. SPECIFICATION OF VISCII

   VISCII stands for VIetnamese Standard Code for Information
   Interchange, an 8-bit encoding specification.  Its salient features
   are:

    1.  Encoding of all Vietnamese letters as single units
        rather than separating base vowels and diacritical
        marks.

    2.  Retention of the complete ASCII graphics repertoire
        in order to facilitate integration.

    3.  Encoding of the 6 least-often-used upper-case letters in
        place of the 6 least problematic C0 (control) characters.

    4.  Character placement has been designed with