Network Working Group R. Gellens
INTERNET DRAFT Unisys
June 29, 1995
Document: draft-gellens-telnet-char-option-00.txt
Postscript: draft-gellens-telnet-char-option-00.ps
TELNET CHARSET Option
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are
working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
``work in progress.''
To learn the current status of any Internet-Draft, please
check the ``1id-abstracts.txt'' listing contained in the
Internet-Drafts Shadow Directories on ftp.is.co.za (Africa),
nic.nordu.net (Europe), munnari.oz.au (Pacific Rim),
ds.internic.net (US East Coast), or ftp.isi.edu (US West
Coast).
Gellens Expires January 3, 1996 [Page 1]
Internet Draft TELNET CHARSET Option June 29, 1995
1. Abstract
This document specifies a mechanism for passing character set
and translation information between a TELNET client and
server. Use of this mechanism enables an application used by
a TELNET user to send and receive data in the correct
character set.
Subject to option negotiation, either side can request at any
time that a new character set be used.
2. Command Names and Codes
CHARSET .......................xx
REQUEST.....................01
ACCEPTED....................02
REJECTED....................03
TTABLE-SEND.................04
TTABLE-IS...................05
TTABLE-REJECTED.............06
TTABLE-ACK..................07
TTABLE-NAK..................08
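The subnegotiation codes listed above map naturally onto an enumeration. The following Python sketch is illustrative only: the names `CharsetCmd` and `describe` are assumptions made for this example, and the CHARSET option code itself is deliberately omitted because this document gives it only as xx.

```python
from enum import IntEnum


class CharsetCmd(IntEnum):
    """CHARSET subnegotiation codes (class name is hypothetical)."""
    REQUEST = 1
    ACCEPTED = 2
    REJECTED = 3
    TTABLE_SEND = 4
    TTABLE_IS = 5
    TTABLE_REJECTED = 6
    TTABLE_ACK = 7
    TTABLE_NAK = 8


def describe(code: int) -> str:
    """Return the symbolic name for a subnegotiation code, or 'UNKNOWN'."""
    try:
        return CharsetCmd(code).name.replace("_", "-")
    except ValueError:
        return "UNKNOWN"
```

A receiver could use `describe` for logging, while dispatching on the `CharsetCmd` members themselves.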
As a convenience, standard TELNET text and codes for commands
used in this document are reproduced here (excerpted from [1]):
All TELNET commands consist of at least a two byte sequence:
the "Interpret as Command" (IAC) escape character followed by
the code for the command. The commands dealing with option
negotiation are three byte sequences, the third byte being the
code for the option referenced. ... [O]nly the IAC need be
doubled to be sent as data, and the other 255 codes may be
passed transparently. The following are [some of] the defined
TELNET commands. Note that these codes and code sequences
have the indicated meaning only when immediately preceded by
an IAC.
NAME                 CODE  MEANING
SE                   240   End of subnegotiation parameters.
SB                   250   Indicates that what follows is
                           subnegotiation of the indicated
                           option.
WILL (option code)   251   Indicates the desire to begin
                           performing, or confirmation that
                           you are now performing, the
                           indicated option.
WON'T (option code)  252   Indicates the refusal to perform,
                           or continue performing, the
                           indicated option.
DO (option code)     253   Indicates the request that the
                           other party perform, or
                           confirmation that you are expecting
                           the other party to perform, the
                           indicated option.
DON'T (option code)  254   Indicates the demand that the other
                           party stop performing, or
                           confirmation that you are no longer
                           expecting the other party to
                           perform, the indicated option.
IAC                  255   Data Byte 255.
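As the excerpt above notes, only the IAC byte needs to be doubled when it is sent as data. A minimal Python sketch of that rule follows. The function names are assumptions made for this example, and `unescape_iac` assumes its input holds only escaped data with no embedded TELNET commands.

```python
IAC = 255  # "Interpret as Command" escape byte
SB = 250   # start of subnegotiation
SE = 240   # end of subnegotiation


def escape_iac(data: bytes) -> bytes:
    """Double every byte of value 255 so that it is received as data."""
    return data.replace(bytes([IAC]), bytes([IAC, IAC]))


def unescape_iac(data: bytes) -> bytes:
    """Collapse each doubled IAC back into a single data byte 255.

    Assumes the input is pure escaped data (no commands inside it).
    """
    return data.replace(bytes([IAC, IAC]), bytes([IAC]))
```

The same escaping applies to any binary field carried inside a subnegotiation, such as a translate table.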
3. Command Meanings
IAC WILL CHARSET
The sender REQUESTS permission to, or AGREES to, use CHARSET
option subnegotiation to choose a character set.
IAC WON'T CHARSET
The sender REFUSES to use CHARSET option subnegotiation to
choose a character set.
IAC DO CHARSET
The sender REQUESTS that, or AGREES to have, the other side
use CHARSET option subnegotiation to choose a character set.
IAC DON'T CHARSET
The sender DEMANDS that the other side not use the CHARSET
option subnegotiation.
IAC SB CHARSET REQUEST <character set> IAC SE
This message initiates a new CHARSET subnegotiation. It can
only be sent by a side that has received a DO CHARSET message
and sent a WILL CHARSET message (in either order).
The sender requests that all text sent to and by it be encoded
in the specified character set.
<Character set> is a sequence of NVT ASCII printable
characters. It is terminated by the IAC SE sequence. Case is
not significant. If a requested character set is registered
with the Internet Assigned Number Authority (IANA) [2], it is
required that the standardized spelling of its name or a
registered alias be used. While it is permitted to request
non-standard character sets such as those not registered with
IANA, this is strongly discouraged, as such character sets are
unlikely to be recognized by the receiver of the CHARSET
REQUEST message. Even worse, a non-registered character set
could have the same name as some other character set which is
registered. Each side would then be using a character set
different from that expected by the other.
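The framing of a CHARSET REQUEST message can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the CHARSET option code must be supplied by the caller because this document leaves it as xx, and "NVT ASCII printable" is assumed here to mean the byte range 0x20 through 0x7E.

```python
IAC, SB, SE = 255, 250, 240
REQUEST = 1  # CHARSET REQUEST subnegotiation code


def build_charset_request(option_code: int, charset: str) -> bytes:
    """Frame IAC SB CHARSET REQUEST <character set> IAC SE.

    option_code is the CHARSET option code, which this document does
    not assign; the value passed in is entirely the caller's choice.
    """
    name = charset.encode("ascii")
    # Assumption: printable NVT ASCII is taken to be 0x20 through 0x7E.
    if not name or any(b < 0x20 or b > 0x7E for b in name):
        raise ValueError("character set name must be printable ASCII")
    return bytes([IAC, SB, option_code, REQUEST]) + name + bytes([IAC, SE])


def same_charset(a: str, b: str) -> bool:
    """Compare two names ignoring case, since case is not significant."""
    return a.lower() == b.lower()
```

Because the name is restricted to printable ASCII, it can never contain an IAC byte, so no doubling is needed for this message.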
The receiver responds in one of four ways:
If the receiver is already sending text to and expecting
text from the sender to be encoded in the specified
character set, it sends a positive acknowledgment (CHARSET
ACCEPTED); it MUST NOT ignore the message. (Although
ignoring the message is perhaps suggested by some
interpretations of the relevant RFCs ([1], [3]), in the
interests of determinacy it is not permitted. This ensures
that the issuer does not need to time out and infer a
response, while avoiding (because there is no response to a
positive acknowledgment) the non-terminating subnegotiation
which is the rationale in the RFCs for the non-response
behavior.)
If the receiver is capable of handling the specified
character set, it can respond with a positive
acknowledgment. After doing so, each side MUST encode
subsequent text in the specified character set.
If the receiver is not capable of handling the specified
character set, but is capable of receiving a translate table
to enable it to do so, it can send a request for translate
table (TTABLE-SEND) response.
If the receiver is not capable of handling the specified
character set nor of receiving a translate table, it sends a
negative acknowledgment (CHARSET REJECTED).
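The four possible responses above can be sketched as a single decision function. This Python fragment is a hedged illustration: the function and parameter names are hypothetical, and it always accepts a character set it can handle, although a receiver is also free to reject one it prefers not to use.

```python
from typing import Set


def respond_to_request(requested: str, current: str,
                       supported: Set[str], can_take_ttable: bool) -> str:
    """Choose the reply to a CHARSET REQUEST message.

    `supported` is assumed to hold lower-case names; comparison is
    case-insensitive because case is not significant.
    """
    name = requested.lower()
    if name == current.lower():
        return "ACCEPTED"      # already in use, but must still acknowledge
    if name in supported:
        return "ACCEPTED"      # capable: switch after acknowledging
    if can_take_ttable:
        return "TTABLE-SEND"   # ask the requester for a translate table
    return "REJECTED"          # cannot handle the character set at all
```

Note that the function never returns "no reply": ignoring the message is not permitted.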
Because it is not valid to reply to a CHARSET REQUEST message
with another CHARSET REQUEST message, receiving a CHARSET REQUEST
message after sending one means that both sides have sent them
simultaneously. In this case, the server side MUST issue a
negative acknowledgment (CHARSET REJECTED), and the user side MUST
respond to the server's CHARSET REQUEST message.
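The tie-break for simultaneous requests is small enough to state as code. In this Python sketch the function name and the return strings are assumptions chosen for illustration.

```python
def on_request_while_pending(is_server: bool) -> str:
    """Resolve a collision: a CHARSET REQUEST arrived after we sent one.

    Returns "REJECTED" when the local side is the server, which must
    refuse the user's request, and "RESPOND" when the local side is the
    user, which must answer the server's request in the normal way.
    """
    if is_server:
        return "REJECTED"
    return "RESPOND"
```

On the user side, the locally issued request ends when the server's CHARSET REJECTED arrives.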
IAC SB CHARSET ACCEPTED IAC SE
This is a positive acknowledgment response to a CHARSET
REQUEST message; the receiver of the CHARSET REQUEST message
acknowledges its receipt and accepts the character set. Text
messages which follow this response must now be coded in the
requested character set. This message terminates the current
CHARSET subnegotiation.
IAC SB CHARSET REJECTED IAC SE
This is a negative acknowledgment response to a CHARSET
REQUEST message; the receiver of the CHARSET REQUEST message
acknowledges its receipt but refuses to use the character set.
Messages cannot be sent in the indicated character set. This
message can also be sent in response to a TTABLE-IS message,
if the receiver of the TTABLE-IS message has problems with it.
This message terminates the current CHARSET subnegotiation.
IAC SB CHARSET TTABLE-SEND <version> <character set> IAC SE
This is a ``No, but if you hum a few bars I can fake it''
acknowledgment response to a CHARSET REQUEST message; the
receiver of the CHARSET REQUEST message acknowledges its
receipt and requests the sender to transmit a translate table
specifying the mapping between the character set in the
CHARSET REQUEST message and the character set in the TTABLE-
SEND message.
<Version> is a byte whose binary value is the highest version
level of the TTABLE-IS message which can be sent in
response. This field must not be zero. See the TTABLE-IS
message for the permitted version values.
<Character set> is a sequence of NVT ASCII printable
characters. Case is not significant. It is terminated by the
IAC SE sequence. If a character set is registered with IANA,
it is required that the standardized spelling of its name or a
registered alias be used.
If the receiver of the TTABLE-SEND message is not capable of
sending a translate table for the character sets, or is not
capable of doing so without using a version of the TTABLE-IS
message higher than <version>, it sends a TTABLE-REJECTED
message.
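The version check made by the receiver of a TTABLE-SEND message might look like the following Python sketch. The names are hypothetical, and treating a zero <version> as grounds for TTABLE-REJECTED is an assumption of this example; the text says only that the field must not be zero.

```python
from typing import Optional

HIGHEST_SUPPORTED_VERSION = 1  # only version 1 is defined in this document


def choose_ttable_version(max_version: int,
                          have_mapping: bool) -> Optional[int]:
    """Return the TTABLE-IS version to send, or None for TTABLE-REJECTED.

    max_version is the <version> byte from the TTABLE-SEND message.
    """
    if max_version == 0:
        return None  # assumption: an invalid zero version is refused
    if not have_mapping:
        return None  # no translate table for this pair of character sets
    # Never exceed what the peer allows or what this side implements.
    return min(max_version, HIGHEST_SUPPORTED_VERSION)
```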
IAC SB CHARSET TTABLE-IS <version> <syntax for version> IAC SE
In response to a TTABLE-SEND message, the receiver of the
TTABLE-SEND message acknowledges its receipt and transmits
a pair of tables which define the mapping between
the specified character sets.
<Version> is a byte whose binary value is the version level of
this TTABLE-IS message. Different versions have different
syntax. The lowest version level is one (zero is not valid).
The current highest version level is also one. This field is
provided so that future versions of the TTABLE-IS message
can be specified, for example, to handle character sets for
which there is no simple one-to-one character-for-character
translation. This might include some forms of multi-byte
character sets for which translation algorithms or subsets
need to be sent.
Syntax for Version 1:
<sep> <char set name 1> <sep> <char size 1> <char count 1>
<char set name 2> <sep> <char size 2> <char count 2> <map 1>
<map 2>
<Sep> is a separator byte, the value of which is chosen by
the sender. Any value other than IAC is allowed. The obvious
choice is a space or another punctuation symbol (such as a
semicolon) which does not appear in either of the character
set names.
<Char set name 1> and <Char set name 2> are sequences of NVT
ASCII printable characters which identify the two character
sets for which a mapping is being specified. Each is
terminated by <sep>. Case is not significant. If a character
set is registered with IANA, it is required that the
standardized spelling of its name or a registered alias be
used.
<Char size 1> and <char size 2> are single bytes each. The
binary value of the byte is the number of bits nominally
required for each character in the corresponding table. It
should be a multiple of eight. [Note to implementers: since
TCP/IP works in bytes, bytes of value 255 can appear
``spontaneously'' within characters wider than eight bits; like
any other data byte of value 255, each must be doubled when
sent [1].]
<Char count 1> and <char count 2> are each three-byte binary
fields in Network Byte Order [6]. Each specifies how many
characters (of the maximum 2**<char size>) are being
transmitted in the corresponding map.
<Map 1> and <Map 2> each consist of the corresponding <char
count> number of characters. These characters form a mapping
from all or part of the characters in one of the specified
character sets to the correct characters in the other
character set. If the indicated <char count> is less than
2**<char size>, the first <char count> characters are being
mapped, and the remaining characters are assumed to not be
changed (and thus map to themselves). That is, each map
contains characters 0 through <char count> -1. <Map 1> maps
from <char set name 1> to <char set name 2>. <Map 2> maps
from <char set name 2> to <char set name 1>. Translation
between the character sets is thus an obvious process of using
the binary value of a character as an index into the
appropriate map. The character at that index replaces the
original character. If the index is greater than or equal to the
<char count> for the map, no translation is performed for the
character.
IAC SB CHARSET TTABLE-REJECTED IAC SE
In response to a TTABLE-SEND message, the receiver of the
TTABLE-SEND message acknowledges its receipt and indicates it
is unable to comply with the request. This message terminates
the current CHARSET subnegotiation.
This message could be sent, for example, because the receiver
does not have a mapping between the character set specified in
the CHARSET REQUEST message and the character set specified in
the TTABLE-SEND message. Or perhaps it cannot send such a
mapping using a version of the TTABLE-IS message which is less
than or equal to the version specified in the TTABLE-SEND
message.
IAC SB CHARSET TTABLE-ACK IAC SE
The sender acknowledges the successful receipt of the
translate table. Text messages which follow this response
must now be coded in the requested character set. This
message terminates the current CHARSET subnegotiation.
IAC SB CHARSET TTABLE-NAK IAC SE
The sender reports the unsuccessful receipt of the translate
table and requests that it be resent. If subsequent
transmission attempts also fail, a TTABLE-REJECTED or CHARSET
REJECTED message (depending on which side sends it) should be
sent instead of additional futile TTABLE-IS and TTABLE-NAK
messages.
Any system which supports the CHARSET option MUST fully support
the CHARSET REQUEST, ACCEPTED, REJECTED, and TTABLE-REJECTED
subnegotiation messages. It MAY fully support the
TTABLE-SEND, TTABLE-ACK, and TTABLE-NAK messages. If it does
fully support the TTABLE-SEND message, it MUST also fully support
the TTABLE-ACK and TTABLE-NAK messages. If it does not fully
support the TTABLE-SEND message, it MUST at least recognize it
and respond with a TTABLE-REJECTED message.
4. Default
WON'T CHARSET
DON'T CHARSET
5. Motivation for the Option
Many computer systems now utilize a variety of character sets.
Increasingly, a server computer needs to translate
transmissions and receptions using different pairs of
character sets on a per-application or per-connection basis.
This is becoming more common as user and server computers
become more geographically dispersed, and as servers are
consolidated into ever-larger hubs serving ever-wider areas.
In order for files, databases, etc. to contain correct data,
the server must determine the character set in which the user
is sending, and the character set in which the application
expects to receive.
In some cases, it is sufficient to determine the character set
of the end user (because every application on the server
expects to use the same character set), but in other cases
different server applications expect to use different
character sets. In the former case, an initial CHARSET
subnegotiation suffices. In the latter case, the server may
need to initiate additional CHARSET subnegotiations as the
user switches between applications.
6. Description of the Option
When the user TELNET program is able to determine the user's
character set, it should offer to specify the character set by
sending IAC WILL CHARSET.
If the server system is able to make use of this information,
it replies with IAC DO CHARSET. The user TELNET is then free
to request a character set in a subnegotiation at any time.
Likewise, when the server is able to determine the expected
character set of the user's application, it should send IAC
DO CHARSET to request that the user system specify the
character set it is using. Or the server could send IAC WILL
CHARSET to offer to specify the character set.
Once a character set has been determined, the server can
either perform the translation between the user and
application character sets itself, or request by additional
CHARSET subnegotiations that the user system do so.
Once it has been established that both sides are capable of
character set negotiation (that is, each side has received
either a WILL CHARSET or a DO CHARSET message, and has also
sent either a DO CHARSET or a WILL CHARSET message),
subnegotiations can be requested at any time by whichever side
has sent a WILL CHARSET message and also received a DO CHARSET
message (this may be either or both sides). Once a CHARSET
subnegotiation has started, it must be completed before
additional CHARSET subnegotiations can be started (there must
never be more than one CHARSET subnegotiation active at any
given time). When a subnegotiation has completed, additional
subnegotiations can be started at any time.
If either side violates this rule and attempts to start a
CHARSET subnegotiation while one is already active, the other
side MUST reject the new subnegotiation by sending a CHARSET
REJECTED message.
Receipt of a CHARSET REJECTED or TTABLE-REJECTED message
terminates the subnegotiation, leaving the character set
unchanged. Receipt of a CHARSET ACCEPTED or TTABLE-ACK
message terminates the subnegotiation, with the new character
set in force.
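The rules above about who may start a subnegotiation, and when, amount to a small piece of per-connection state. The following Python sketch is one possible way to track it. The class and method names are hypothetical, and the tie-break for simultaneous requests is not modeled here.

```python
class CharsetNegotiation:
    """Tracks whether a CHARSET subnegotiation may be started."""

    def __init__(self) -> None:
        self.sent_will = False    # this side has sent WILL CHARSET
        self.received_do = False  # this side has received DO CHARSET
        self.active = False       # a subnegotiation is in progress

    def can_initiate(self) -> bool:
        """True once WILL was sent and DO received, and none is active."""
        return self.sent_will and self.received_do and not self.active

    def start(self) -> None:
        """Mark the start of a locally issued CHARSET REQUEST."""
        if not self.can_initiate():
            raise RuntimeError("may not start a CHARSET subnegotiation now")
        self.active = True

    def on_peer_request(self) -> str:
        """Handle a peer's CHARSET REQUEST; refuse it if one is active."""
        if self.active:
            return "REJECTED"
        self.active = True
        return "PROCESS"

    def finish(self) -> None:
        """Call on ACCEPTED, REJECTED, TTABLE-REJECTED, or TTABLE-ACK."""
        self.active = False
```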
In some cases, both the server and the user systems are able
to perform translations and to send and receive in the
character set expected by the other side. In such cases,
either side can request that the other use the character set
it prefers. When both sides simultaneously make such a
request (send CHARSET REQUEST messages), the server MUST
reject the user's request by sending a CHARSET REJECTED
message. The user system MUST respond to the server's
request. (See the CHARSET REQUEST description, above.)
When the user system makes the request first, and the server
is able to handle the requested character set, but prefers
that the user system instead use the server's (user
application) character set, it may reject the request, and
issue a CHARSET REQUEST of its own. If the user system is
unable to comply with the server's preference and issues a
CHARSET REJECTED message, the server can issue a new CHARSET
REQUEST message for the previous character set (the one which
the user system originally requested). The user system can be
expected to accept this character set, having requested it in
the first place.
While a CHARSET subnegotiation is in progress, data should be
queued. Once the CHARSET subnegotiation has terminated, the
data can be sent (in the correct character set).
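Queuing outbound text during a subnegotiation can be sketched in Python as follows. The class name, its attributes, and the use of Python codec names to stand in for negotiated character sets are all assumptions of this example; mapping registered character set names to codecs is outside its scope.

```python
from collections import deque
from typing import Callable, Deque, Optional


class OutboundText:
    """Holds text back while a CHARSET subnegotiation is in progress."""

    def __init__(self, send: Callable[[bytes], None], encoding: str) -> None:
        self._send = send            # callable that transmits bytes
        self.encoding = encoding     # Python codec currently in force
        self.negotiating = False
        self._queue: Deque[str] = deque()

    def write(self, text: str) -> None:
        if self.negotiating:
            self._queue.append(text)  # hold until the character set settles
        else:
            self._send(text.encode(self.encoding))

    def negotiation_done(self, encoding: Optional[str] = None) -> None:
        """Flush queued text, in the new encoding if one was accepted."""
        if encoding is not None:
            self.encoding = encoding
        self.negotiating = False
        while self._queue:
            self._send(self._queue.popleft().encode(self.encoding))
```

Passing `None` to `negotiation_done` models a rejected request, in which case the queued text goes out in the unchanged character set.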
Note that regardless of CHARSET negotiation, translation only
applies to text (not commands), and only occurs when in BINARY
mode [4]. If not in BINARY mode, all data is assumed to be in
NVT ASCII.
Also note that the CHARSET option should be used with the END
OF RECORD option [5] for block-mode terminals in order to be
clear on what character represents the end of each record.
As an example of character set negotiation, consider a user on
a workstation using TELNET to communicate with a server. In
this example, the workstation normally uses the Cyrillic
(ASCII) character set [2] but is capable of using EBCDIC-
Cyrillic [2], and the server normally uses EBCDIC-Cyrillic.
The server could handle the (ASCII) Cyrillic character set,
but prefers that the user system instead use the EBCDIC-
Cyrillic character set. (This and the following examples do
not show the full syntax of the subnegotiation messages.)
USER                          SERVER
WILL CHARSET                  WILL CHARSET
DO CHARSET                    DO CHARSET
CHARSET REQUEST Cyrillic
                              CHARSET REJECTED
                              CHARSET REQUEST EBCDIC-Cyrillic
CHARSET ACCEPTED
For another example, consider the previous case, but this time
the workstation cannot handle EBCDIC-Cyrillic, nor can it
accept a translate table:
USER                          SERVER
WILL CHARSET                  WILL CHARSET
DO CHARSET                    DO CHARSET
CHARSET REQUEST Cyrillic
                              CHARSET REJECTED
                              CHARSET REQUEST EBCDIC-Cyrillic
CHARSET REJECTED
                              CHARSET REQUEST Cyrillic
CHARSET ACCEPTED
For the next example, consider the previous case, but this
time the workstation can accept a translate table:
USER                          SERVER
WILL CHARSET                  WILL CHARSET
DO CHARSET                    DO CHARSET
CHARSET REQUEST Cyrillic
                              CHARSET REJECTED
                              CHARSET REQUEST EBCDIC-Cyrillic
CHARSET TTABLE-SEND
                              CHARSET TTABLE-IS
CHARSET TTABLE-ACK
For another example, consider the previous case, but now the
user switches server applications in the middle of the session
(denoted by ellipses), and the new application requires a
different character set:
USER                          SERVER
WILL CHARSET                  WILL CHARSET
DO CHARSET                    DO CHARSET
CHARSET REQUEST Cyrillic
                              CHARSET REJECTED
                              CHARSET REQUEST EBCDIC-Cyrillic
CHARSET TTABLE-SEND
                              CHARSET TTABLE-IS
CHARSET TTABLE-ACK
. . .                         . . .
                              CHARSET REQUEST EBCDIC-INT
CHARSET ACCEPTED
7. Security Considerations
This document raises no security issues.
8. References
[1] Postel, J. and Reynolds, J., ``Telnet Protocol
Specification'', STD 8, RFC 854, ISI, May 1983.
[2] Reynolds, J. and Postel, J., ``Assigned Numbers'',
STD 2, RFC 1700, ISI, October 1994.
[3] Postel, J. and Reynolds, J., ``Telnet Option
Specifications'', STD 8, RFC 855, ISI, May 1983.
[4] Postel, J. and Reynolds, J., ``Telnet Binary
Transmission'', RFC 856, ISI, May 1983.
[5] Postel, J., ``Telnet End-Of-Record Option'', RFC 885, ISI,
December 1983.
[6] Postel, J., ``Internet Official Protocol Standards'', STD
1, RFC 1780, IAB, March 1995.
9. Author's Address
Randall C. Gellens
Unisys Corporation
25725 Jeronimo Road
Mail Stop 237
Mission Viejo, CA 92691
USA
Phone: +1.714.380.6350
Fax: +1.714.380.5912
Randy@MV.Unisys.Com