TELNET CHARSET Option
draft-gellens-telnet-char-option-03
This is an older version of an Internet-Draft that was ultimately published as RFC 2066.
Author: Randall Gellens
Last updated: 2013-03-02 (Latest revision 1996-07-24)
RFC stream: Legacy stream
IESG state: Became RFC 2066 (Experimental)
Network Working Group R. Gellens
INTERNET DRAFT Unisys
July 24, 1996
Document: draft-gellens-telnet-char-option-03.txt
Postscript: draft-gellens-telnet-char-option-03.ps
TELNET CHARSET Option
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are
working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum
of six months and may be updated, replaced, or obsoleted
by other documents at any time. It is inappropriate to
use Internet-Drafts as reference material or to cite them
other than as ``work in progress.''
To learn the current status of any Internet-Draft, please
check the ``1id-abstracts.txt'' listing contained in the
Internet-Drafts Shadow Directories on ftp.is.co.za
(Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific
Rim), ds.internic.net (US East Coast), or ftp.isi.edu
(US West Coast).
A mailing list has been established for discussion of
this (and related) topics. To subscribe, send mail to:
telnet-request@mv.Unisys.com
Gellens Expires January 24, 1997 [Page 1]
Internet Draft TELNET CHARSET Option July 24, 1996
Items for redistribution to the list should be sent to:
telnet@mv.Unisys.com
1. Abstract
This document specifies a mechanism for passing character
set and translation information between a TELNET client
and server. Use of this mechanism enables an application
used by a TELNET user to send and receive data in the
correct character set.
Either side can (subject to option negotiation) at any
time request that a (new) character set be used.
2. Command Names and Codes
CHARSET.......................xx
REQUEST ....................01
ACCEPTED ...................02
REJECTED ...................03
TTABLE-IS ..................04
TTABLE-REJECTED ............05
TTABLE-ACK .................06
TTABLE-NAK .................07
As a convenience, standard TELNET text and codes for
commands used in this document are reproduced here
(excerpted from [1]):
All TELNET commands consist of at least a two byte
sequence: the "Interpret as Command" (IAC) escape
character followed by the code for the command. The
commands dealing with option negotiation are three byte
sequences, the third byte being the code for the option
referenced. ... [O]nly the IAC need be doubled to be sent
as data, and the other 255 codes may be passed
transparently. The following are [some of] the defined
TELNET commands. Note that these codes and code
sequences have the indicated meaning only when
immediately preceded by an IAC.
NAME                  CODE  MEANING
SE                    240   End of subnegotiation parameters.
SB                    250   Indicates that what follows is
                            subnegotiation of the indicated
                            option.
WILL (option code)    251   Indicates the desire to begin
                            performing, or confirmation that
                            you are now performing, the
                            indicated option.
WON'T (option code)   252   Indicates the refusal to perform,
                            or continue performing, the
                            indicated option.
DO (option code)      253   Indicates the request that the
                            other party perform, or
                            confirmation that you are expecting
                            the other party to perform, the
                            indicated option.
DON'T (option code)   254   Indicates the demand that the other
                            party stop performing, or
                            confirmation that you are no longer
                            expecting the other party to
                            perform, the indicated option.
IAC                   255   Data Byte 255.
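The doubling rule quoted above can be illustrated with a minimal sketch. The function names escape_iac and unescape_iac are hypothetical and are not defined by this memo or by [1]; the sketch assumes it is applied only to data octets, never to the command stream itself.

```python
IAC = 255  # "Interpret as Command" escape octet


def escape_iac(data: bytes) -> bytes:
    """Double every data octet of value 255 so it is not read as IAC."""
    return data.replace(bytes([IAC]), bytes([IAC, IAC]))


def unescape_iac(data: bytes) -> bytes:
    """Undo the doubling applied by escape_iac (data octets only)."""
    return data.replace(bytes([IAC, IAC]), bytes([IAC]))
```

All other octet values pass through unchanged, which is why only the IAC needs special treatment.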
3. Command Meanings
A very simple meta-syntax is used, where most tokens
represent previously defined items (such as IAC);
angle-brackets (``<>'') are used for items to be further defined;
curly-braces (``{}'') are used around optional items;
ellipses represent repeated sequences of items; and quotes
are used for literal strings.
IAC WILL CHARSET
The sender REQUESTS permission to, or AGREES to, use
CHARSET option subnegotiation to choose a character set.
IAC WON'T CHARSET
The sender REFUSES to use CHARSET option subnegotiation
to choose a character set.
IAC DO CHARSET
The sender REQUESTS that, or AGREES to have, the other
side use CHARSET option subnegotiation to choose a
character set.
IAC DON'T CHARSET
The sender DEMANDS that the other side not use the
CHARSET option subnegotiation.
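As an illustration of the four negotiation commands above, the following sketch builds the three-octet sequences. The helper name negotiation is hypothetical. Because this memo leaves the CHARSET option code unassigned (xx), the code is a caller-supplied parameter here and no particular value is assumed.

```python
IAC, WILL, WONT, DO, DONT = 255, 251, 252, 253, 254


def negotiation(verb: int, option_code: int) -> bytes:
    """Return the three-octet sequence IAC <verb> <option code>."""
    if verb not in (WILL, WONT, DO, DONT):
        raise ValueError("verb must be WILL, WON'T, DO or DON'T")
    if not 0 <= option_code <= 255:
        raise ValueError("option code must fit in one octet")
    return bytes([IAC, verb, option_code])
```

For example, negotiation(WILL, code) yields IAC WILL CHARSET once code holds whatever value is assigned to the CHARSET option.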
IAC SB CHARSET REQUEST { ``[TTABLE]'' <Version> } <char set
list> IAC SE
Char set list:
<sep> <character set> { ...<sep> <character set> }
This message initiates a new CHARSET subnegotiation. It
can only be sent by a side that has received a DO CHARSET
message and sent a WILL CHARSET message (in either
order).
The sender requests that all text sent to and by it be
encoded in one of the specified character sets.
If the string [TTABLE] appears, the sender is willing to
accept a mapping (translation table) between any
character set listed in <char set list> and any character
set desired by the receiver.
<Version> is an octet whose binary value is the highest
version level of the TTABLE-IS message which can be sent
in response. This field must not be zero. See the
TTABLE-IS message for the permitted version values.
<Char set list> is a sequence of 7-BIT ASCII printable
characters. The first octet defines the separator
character (which must not appear within any character
set). It is terminated by the IAC SE sequence. Case is
not significant. It consists of one or more character
sets. The character sets should appear in order of
preference (most preferred first).
<Sep> is a separator octet, the value of which is chosen
by the sender. Examples include a space or a semicolon.
Any value other than IAC is allowed. The obvious choice
is a space or any other punctuation symbol which does not
appear in any of the character set names.
<Character set> is a sequence of 7-BIT ASCII printable
characters. Case is not significant.
If a requested character set is registered with the
Internet Assigned Number Authority (IANA) [2], it is
required that the standardized spelling of its name or a
registered alias be used. While it is permitted to
request non-standard character sets such as those not
registered with IANA, this is strongly discouraged, as
such character sets are unlikely to be recognized by the
receiver of the CHARSET REQUEST message. Even worse, a
non-registered character set could have the same name as
some other character set which is registered. Each side
would then be using a character set different from that
expected by the other.
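The CHARSET REQUEST layout described above can be sketched as follows. This is a minimal illustration, not a reference implementation: build_request and parse_request are hypothetical names, the option code is passed in because this memo leaves it as xx, and parse_request assumes the caller has already stripped the framing and removed IAC doubling.

```python
IAC, SB, SE = 255, 250, 240
REQUEST = 1            # CHARSET REQUEST subcommand code
TTABLE = b"[TTABLE]"   # literal marker string


def build_request(option_code, charsets, sep=";", ttable_version=None):
    """Build IAC SB CHARSET REQUEST {[TTABLE] <Version>} <char set list> IAC SE."""
    if not charsets:
        raise ValueError("at least one character set is required")
    if len(sep) != 1 or not sep.isascii():
        raise ValueError("separator must be a single ASCII character")
    for name in charsets:
        if not name or sep in name or not (name.isascii() and name.isprintable()):
            raise ValueError(f"bad character set name: {name!r}")
    body = bytearray([REQUEST])
    if ttable_version is not None:
        if not 1 <= ttable_version <= 255:
            raise ValueError("version must not be zero")
        body += TTABLE + bytes([ttable_version])
    # Each character set is preceded by the separator, most preferred first.
    body += "".join(sep + name for name in charsets).encode("ascii")
    escaped = bytes(body).replace(b"\xff", b"\xff\xff")  # quote any 255 octet
    return bytes([IAC, SB, option_code]) + escaped + bytes([IAC, SE])


def parse_request(body: bytes):
    """Parse the octets between the REQUEST code and IAC SE.

    Returns (version or None, list of character set names).
    """
    version = None
    if body.startswith(TTABLE):
        if len(body) <= len(TTABLE):
            raise ValueError("missing version octet")
        version = body[len(TTABLE)]
        body = body[len(TTABLE) + 1:]
    if len(body) < 2:
        raise ValueError("empty character set list")
    sep, rest = body[:1], body[1:]
    return version, [name.decode("ascii") for name in rest.split(sep)]
```

Since case is not significant, a receiver would normally compare the parsed names case-insensitively.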
The receiver responds in one of four ways:
If the receiver is already sending text to and
expecting text from the sender to be encoded in one of
the specified character sets, it sends a positive
acknowledgment (CHARSET ACCEPTED); it MUST NOT ignore
the message. (Although ignoring the message is perhaps
suggested by some interpretations of the relevant RFCs
([1], [3]), in the interests of determinacy it is not
permitted. This ensures that the issuer does not need
to time out and infer a response, while avoiding
(because there is no response to a positive
acknowledgment) the non-terminating subnegotiation
which is the rationale in the RFCs for the non-response
behavior.)
If the receiver is capable of handling at least one of
the specified character sets, it can respond with a
positive acknowledgment for one of the requested
character sets. Normally, it should pick the first set
it is capable of handling but may choose one based on
its own preferences. After doing so, each side MUST
encode subsequent text in the specified character set.
If the string [TTABLE] is present, and the receiver
prefers to use a character set not included in <char
set list>, and is capable of doing so, it can send a
translate table (TTABLE-IS) response.
If the receiver is not capable of handling any of the
specified character sets, it sends a negative
acknowledgment (CHARSET REJECTED).
Because it is not valid to reply to a CHARSET REQUEST
message with another CHARSET REQUEST message, if a
CHARSET REQUEST message is received after sending one, it
means that both sides have sent them simultaneously. In
this case, the server side MUST issue a negative
acknowledgment. The client side MUST respond to the one
from the server.
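The receiver's choice among these responses, including the simultaneous-request rule, can be sketched as below. The function name and parameters are hypothetical. As a simplifying assumption, the sketch always picks the first offered set it can handle and falls back to a translate table only when no offered set is usable, whereas the text above also lets the receiver follow its own preferences.

```python
def choose_response(offered, supported, *, is_server,
                    own_request_pending=False,
                    ttable_offered=False, can_send_ttable=False):
    """Return (message name, character set or None) for a CHARSET REQUEST."""
    # Both sides sent CHARSET REQUEST at once: the server must reject,
    # while the client must answer the server's request normally.
    if own_request_pending and is_server:
        return ("REJECTED", None)
    known = {name.lower() for name in supported}  # case is not significant
    for name in offered:                          # most preferred first
        if name.lower() in known:
            # Echo the sender's spelling of the chosen character set.
            return ("ACCEPTED", name)
    if ttable_offered and can_send_ttable:
        return ("TTABLE-IS", None)
    return ("REJECTED", None)
```

Note that the function never returns "no response": every CHARSET REQUEST is answered, as required above.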
IAC SB CHARSET ACCEPTED <Charset> IAC SE
This is a positive acknowledgment response to a CHARSET
REQUEST message; the receiver of the CHARSET REQUEST
message acknowledges its receipt and accepts the
indicated character set.
<Charset> is a character sequence identical to one of the
character sets in the CHARSET REQUEST message. It is
terminated by the IAC SE sequence.
Text messages which follow this response must now be
coded in the indicated character set. This message
terminates the current CHARSET subnegotiation.
IAC SB CHARSET REJECTED IAC SE
This is a negative acknowledgment response to a CHARSET
REQUEST message; the receiver of the CHARSET REQUEST
message acknowledges its receipt but refuses to use any
of the requested character sets. Messages cannot be
sent in any of the indicated character sets. This
message can also be sent by the sender of a TTABLE-IS
message, if multiple TTABLE-NAK messages were sent in
response. This message terminates the current CHARSET
subnegotiation.
IAC SB CHARSET TTABLE-IS <version> <syntax for version> IAC
SE
In response to a CHARSET REQUEST message in which
[TTABLE] was specified, the receiver of the CHARSET
REQUEST message acknowledges its receipt and is
transmitting a pair of tables which define the mapping
between specified character sets.
<Version> is an octet whose binary value is the version
level of this TTABLE-IS message. Different versions have
different syntax. The lowest version level is one (zero
is not valid). The current highest version level is also
one. This field is provided so that future versions of
the TTABLE-IS message can be specified, for example, to
handle character sets for which there is no simple one-
to-one character-for-character translation. This might
include some forms of multi-octet character sets for
which translation algorithms or subsets need to be sent.
Syntax for Version 1:
<sep> <char set name 1> <sep> <char size 1> <char count
1> <char set name 2> <sep> <char size 2> <char count 2>
<map 1> <map 2>
<Sep> is a separator octet, the value of which is chosen
by the sender. Examples include a space or a semicolon.
Any value other than IAC is allowed. The obvious choice
is a space or any other punctuation symbol which does not
appear in either of the character set names.
<Char set name 1> and <Char set name 2> are sequences of
7-BIT ASCII printable characters which identify the two
character sets for which a mapping is being specified.
Each is terminated by <sep>. Case is not significant.
If a character set is registered with IANA, it is
required that the standardized spelling of its name or a
registered alias be used. <Char set name 1> should be
chosen from the <char set list> in the CHARSET REQUEST
message. <Char set name 2> can be arbitrarily chosen.
Text on the wire should be encoded using <char set name
2>.
<Char size 1> and <char size 2> are single octets each.
The binary value of the octet is the number of bits
nominally required for each character in the
corresponding table. It should be a multiple of eight.
<Char count 1> and <char count 2> are each three-octet
binary fields in Network Byte Order [6]. Each specifies
how many characters (of the maximum 2**<char size>) are
being transmitted in the corresponding map.
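The version 1 field layout can be sketched as follows. The function name ttable_v1_body is hypothetical, and the sketch assumes each map is supplied as raw octets holding a whole number of characters of the stated size. It returns the <version> octet followed by the version 1 syntax, before IAC doubling and before the IAC SB ... IAC SE framing is applied.

```python
def ttable_v1_body(name1, size1, map1, name2, size2, map2, sep=";"):
    """Return <version> <syntax for version 1> as raw, unescaped octets."""
    sep_b = sep.encode("ascii")
    if len(sep_b) != 1:
        raise ValueError("separator must be a single octet")

    def header(name, size, table):
        # <char set name> <sep> <char size> <char count>
        if sep in name:
            raise ValueError("separator must not appear in a name")
        if size % 8 or not 8 <= size <= 248:
            raise ValueError("char size should be a multiple of eight")
        width = size // 8                       # octets per character
        if len(table) % width:
            raise ValueError("map is not a whole number of characters")
        count = len(table) // width
        if count > 2 ** size:
            raise ValueError("more characters than the size allows")
        # Char count is a three-octet field in network byte order.
        return name.encode("ascii") + sep_b + bytes([size]) + count.to_bytes(3, "big")

    return (b"\x01" + sep_b + header(name1, size1, map1)
            + header(name2, size2, map2) + map1 + map2)
```

Any octet of value 255 in the result, including one inside a map, still has to be doubled before transmission.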
<Map 1> and <Map 2> each consist of the corresponding
<char count> number of characters. These characters form
a mapping from all or part of the characters in one of
the specified character sets to the correct characters in
the other character set. If the indicated <char count>
is less than 2**<char size>, the first <char count>
characters are being mapped, and the remaining characters
are assumed to not be changed (and thus map to
themselves). That is, each map contains characters 0
through <char count> -1. <Map 1> maps from <char set
name 1> to <char set name 2>. <Map 2> maps from <char
set name 2> to <char set name 1>. Translation between
the character sets is thus an obvious process of using
the binary value of a character as an index into the
appropriate map. The character at that index replaces
the original character. If the index exceeds the <char
count> for the map, no translation is performed for the
character.
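The index-based translation just described can be sketched for the common case of 8-bit characters. The function name translate is hypothetical, and restricting the sketch to a char size of eight is an assumption made for brevity.

```python
def translate(text: bytes, table: bytes) -> bytes:
    """Translate 8-bit characters using one map from a TTABLE-IS message.

    Each octet value is used as an index into the map. Values at or
    beyond the map's <char count> are left unchanged.
    """
    return bytes(table[b] if b < len(table) else b for b in text)
```

With <Map 1> and <Map 2> being inverses of each other, translate(translate(x, map1), map2) returns x for every character covered by both maps.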
[Note to implementers: since TELNET works in octets, it
is possible for octets of value 255 to appear
``spontaneously'' when using multi-octet or non-8-bit
characters. All octets of value 255 (other than IAC)
MUST be quoted to conform with TELNET requirements. This
applies even to octets within a table, or text in a
multi-octet character set.]
IAC SB CHARSET TTABLE-ACK IAC SE
The sender acknowledges the successful receipt of the
translate table. Text messages which follow this
response must now be coded in the character set specified
as <char set name 2> of the TTABLE-IS message. This
message terminates the current CHARSET subnegotiation.
IAC SB CHARSET TTABLE-NAK IAC SE
The sender reports the unsuccessful receipt of the
translate table and requests that it be resent. If
subsequent transmission attempts also fail, a TTABLE-
REJECTED or CHARSET REJECTED message (depending on which
side sends it) should be sent instead of additional
futile TTABLE-IS and TTABLE-NAK messages.
IAC SB CHARSET TTABLE-REJECTED IAC SE
In response to a TTABLE-IS message, the receiver of the
TTABLE-IS message acknowledges its receipt and indicates
it is unable to handle it. This message terminates the
current CHARSET subnegotiation.
Any system which supports the CHARSET option MUST fully
support the CHARSET REQUEST, ACCEPTED, REJECTED, and TTABLE-
REJECTED subnegotiation messages. It MAY optionally fully
support the TTABLE-IS, TTABLE-ACK, and TTABLE-NAK messages.
If it does fully support the TTABLE-IS message, it MUST also
fully support the TTABLE-ACK and TTABLE-NAK messages.
4. Default
WON'T CHARSET
DON'T CHARSET
5. Motivation for the Option
Many TELNET sessions need to transmit data which is not
in 7-bit ASCII. This is usually done by negotiating
BINARY, and using local conventions (or terminal type
kluges) to determine the character set of the data.
However, such methods tend not to interoperate well, and
have difficulties when multiple character sets need to be
supported by different sessions.
Many computer systems now utilize a variety of character
sets. Increasingly, a server computer needs to document
character sets or translate transmissions and receptions
using different pairs of character sets on a per-
application or per-connection basis. This is becoming
more common as client and server computers become more
geographically dispersed. (And as servers are
consolidated into ever-larger hubs, serving ever-wider
areas.) In order for files, databases, etc. to contain
correct data, the server must determine the character set
in which the user is sending, and the character set in
which the application expects to receive.
In some cases, it is sufficient to determine the
character set of the end user (because every application
on the server expects to use the same character set, or
because applications can handle the user's character
set), but in other cases different server applications
expect to use different character sets. In the former
case, an initial CHARSET subnegotiation suffices. In the
latter case, the server may need to initiate additional
CHARSET subnegotiations as the user switches between
applications.
At a minimum, the option described in this memo allows
both sides to be clear as to which character set is being
used. A minimal implementation would have the server
send DO CHARSET, and the client send WILL CHARSET and
CHARSET REQUEST. The server could then communicate the
client's character set to applications using whatever
means are appropriate. Such a server might refuse
subsequent CHARSET REQUEST messages from the client (if
it lacked the ability to communicate changed character
set information to applications, for example). Another
system might have a method whereby various applications
could communicate to the TELNET server their character
set needs and abilities, which the server would handle by
initiating new CHARSET REQUEST negotiations as
appropriate.
In some cases, servers may have a large set of clients
which tend to connect often (such as daily) and over a
long period of time (such as years). The server
administrators may strongly prefer that the servers not
do character set translation (to save CPU cycles when
serving very large numbers of users). To avoid manually
configuring each copy of the user TELNET software, the
administrators might prefer that the software support
translate tables. (If the client software received a
translate table from the server and stored it, the table
would only need to be sent once.)
6. Description of the Option
When the client TELNET program is able to determine the
user's character set it should offer to specify the
character set by sending IAC WILL CHARSET.
If the server system is able to make use of this
information, it replies with IAC DO CHARSET. The client
TELNET is then free to request a character set in a
subnegotiation at any time.
Likewise, when the server is able to determine the
expected character set(s) of the user's application(s),
it should send IAC DO CHARSET to request that the client
system specify the character set it is using. Or the
server could send IAC WILL CHARSET to offer to specify
the character sets.
Once a character set has been determined, the server can
either perform the translation between the user and
application character sets itself, or request by
additional CHARSET subnegotiations that the client system
do so.
Once it has been established that both sides are capable
of character set negotiation (that is, each side has
received either a WILL CHARSET or a DO CHARSET message,
and has also sent either a DO CHARSET or a WILL CHARSET
message), subnegotiations can be requested at any time by
whichever side has sent a WILL CHARSET message and also
received a DO CHARSET message (this may be either or both
sides). Once a CHARSET subnegotiation has started, it
must be completed before additional CHARSET
subnegotiations can be started (there must never be more
than one CHARSET subnegotiation active at any given
time). When a subnegotiation has completed, additional
subnegotiations can be started at any time.
If either side violates this rule and attempts to start a
CHARSET subnegotiation while one is already active, the
other side MUST reject the new subnegotiation by sending
a CHARSET REJECTED message.
Receipt of a CHARSET REJECTED or TTABLE-REJECTED message
terminates the subnegotiation, leaving the character set
unchanged. Receipt of a CHARSET ACCEPTED or TTABLE-ACK
message terminates the subnegotiation, with the new
character set in force.
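The rules above (who may start a subnegotiation, only one active at a time, and which messages terminate it) can be sketched as a small state tracker. The class and method names are hypothetical. As a simplification, on_message is called for any CHARSET message that ends or continues the current subnegotiation, whether sent or received, and for TTABLE-ACK the caller passes <char set name 2> from the TTABLE-IS message.

```python
class CharsetState:
    """Tracks one side's view of CHARSET subnegotiation."""

    def __init__(self):
        self.sent_will = False      # we sent WILL CHARSET
        self.received_do = False    # we received DO CHARSET
        self.active = False         # a subnegotiation is in progress
        self.charset = None         # None: no character set negotiated yet

    def may_start(self) -> bool:
        # Only a side that sent WILL and received DO may send REQUEST,
        # and never while another subnegotiation is active.
        return self.sent_will and self.received_do and not self.active

    def start(self) -> None:
        if not self.may_start():
            raise RuntimeError("CHARSET REQUEST not permitted now")
        self.active = True

    def on_message(self, kind: str, charset=None) -> None:
        if not self.active:
            raise RuntimeError("no subnegotiation is active")
        if kind in ("ACCEPTED", "TTABLE-ACK"):
            self.charset = charset  # new character set in force
            self.active = False
        elif kind in ("REJECTED", "TTABLE-REJECTED"):
            self.active = False     # character set unchanged
        elif kind in ("TTABLE-IS", "TTABLE-NAK"):
            pass                    # subnegotiation continues
        else:
            raise ValueError(f"unknown message: {kind}")
```

A side that receives a second CHARSET REQUEST while active is True would answer it with CHARSET REJECTED.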
In some cases, both the server and the client systems are
able to perform translations and to send and receive in
the character set(s) expected by the other side. In such
cases, either side can request that the other use the
character set it prefers. When both sides simultaneously
make such a request (send CHARSET REQUEST messages), the
server MUST reject the client's request by sending a
CHARSET REJECTED message. The client system MUST respond
to the server's request. (See the CHARSET REQUEST
description, above.)
When the client system makes the request first, and the
server is able to handle the requested character set(s),
but prefers that the client system instead use the
server's (user application) character set, it may reject
the request, and issue a CHARSET REQUEST of its own. If
the client system is unable to comply with the server's
preference and issues a CHARSET REJECTED message, the
server can issue a new CHARSET REQUEST message for one of
the previous character sets (one of those which the
client system originally requested). The client system
would obviously accept this character set.
While a CHARSET subnegotiation is in progress, data
should be queued. Once the CHARSET subnegotiation has
terminated, the data can be sent (in the correct
character set).
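The queuing behavior can be sketched as below. The class name TextQueue and its methods are hypothetical, and the sketch assumes the negotiated character set corresponds to a codec name that Python's str.encode understands, which will not hold for every registered character set. IAC doubling of the encoded octets is left to the send callable.

```python
class TextQueue:
    """Holds outgoing text while a CHARSET subnegotiation is in progress."""

    def __init__(self, send, charset="ascii"):
        self.send = send            # callable taking encoded octets
        self.charset = charset      # codec name currently in force
        self.negotiating = False
        self.pending = []

    def write(self, text: str) -> None:
        if self.negotiating:
            self.pending.append(text)   # hold until the outcome is known
        else:
            self.send(text.encode(self.charset))

    def begin_negotiation(self) -> None:
        self.negotiating = True

    def end_negotiation(self, new_charset=None) -> None:
        """Call on ACCEPTED/TTABLE-ACK (new set) or a rejection (None)."""
        if new_charset is not None:
            self.charset = new_charset
        self.negotiating = False
        queued, self.pending = self.pending, []
        for text in queued:
            self.send(text.encode(self.charset))
```

Queuing text rather than octets means the encoding is applied only once the character set actually in force is known.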
Note that regardless of CHARSET negotiation, translation
only applies to text (not commands), and only occurs when
in BINARY mode [4]. If not in BINARY mode, all data is
assumed to be in NVT ASCII [1].
Also note that the CHARSET option should be used with the
END OF RECORD option [5] for block-mode terminals in
order to be clear on what character represents the end of
each record.
As an example of character set negotiation, consider a
user on a workstation using TELNET to communicate with a
server. In this example, the workstation normally uses
the Cyrillic (ASCII) character set [2] but is capable of
using EBCDIC-Cyrillic [2], and the server normally uses
EBCDIC-Cyrillic. The server could handle the (ASCII)
Cyrillic character set, but prefers that the client
system instead use the EBCDIC-Cyrillic character set.
(This and the following examples do not show the full
syntax of the subnegotiation messages.)
CLIENT                                      SERVER
WILL CHARSET                                WILL CHARSET
DO CHARSET                                  DO CHARSET
CHARSET REQUEST Cyrillic EBCDIC-Cyrillic
                                            CHARSET ACCEPTED EBCDIC-Cyrillic
Now consider a case where the workstation can't handle
EBCDIC-Cyrillic, but can accept a translate table:
CLIENT                                      SERVER
WILL CHARSET                                WILL CHARSET
DO CHARSET                                  DO CHARSET
CHARSET REQUEST [TTABLE] 1 Cyrillic
                                            CHARSET TTABLE-IS 1 Cyrillic
                                            EBCDIC-Cyrillic
CHARSET TTABLE-ACK
For another example, consider a case similar to the
previous case, but now the user switches server
applications in the middle of the session (denoted by
ellipses), and the new application requires a different
character set:
CLIENT                                      SERVER
WILL CHARSET                                WILL CHARSET
DO CHARSET                                  DO CHARSET
CHARSET REQUEST [TTABLE] 1 Cyrillic
EBCDIC-INT
                                            CHARSET TTABLE-IS 1 Cyrillic
                                            EBCDIC-Cyrillic
CHARSET TTABLE-ACK
. . .                                       . . .
                                            CHARSET REQUEST EBCDIC-INT
CHARSET ACCEPTED EBCDIC-INT
7. Security Considerations
This document raises no security issues.
8. References
[1]Postel, J. and Reynolds, J., ``Telnet Protocol
Specification'', STD 8, RFC 854, ISI, May 1983
[2]Reynolds, J., and Postel, J., ``Assigned Numbers'',
STD 2, RFC 1700, ISI, October 1994
[3]Postel, J. and Reynolds, J., ``Telnet Option
Specifications'', STD 8, RFC 855, ISI, May 1983
[4]Postel, J. and Reynolds, J., ``Telnet Binary
Transmission'', RFC 856, ISI, May 1983
[5]Postel, J., ``Telnet End-Of-Record Option'', RFC 885,
ISI, December 1983
[6]Postel, J., ``Internet Official Protocol Standards'',
STD 1, RFC 1780, IAB, March 1995
9. Author's Address
Randall C. Gellens
Unisys Corporation
25725 Jeronimo Road
Mail Stop 237
Mission Viejo, CA 92691
USA
Phone: +1.714.380.6350
Fax: +1.714.380.5912
Randy@MV.Unisys.Com