Internet Draft                                   Marc Blanchet

draft-ietf-idn-version-00.txt                         Viagenie
October 26, 2000
Expires in six
months




     Handling versions of internationalized domain names protocols


Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.


Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.


Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."



      The list of current Internet-Drafts can be accessed at

      http://www.ietf.org/ietf/1id-abstracts.txt

      The list of Internet-Draft Shadow Directories can be accessed at

      http://www.ietf.org/shadow.html.



Abstract

This document describes some issues related to handling changes in the
codepoints and other features in the chars space of an internationalized
domain name as well as changes in the protocol that supports it (whatever
that be).  It also presents some solutions to these issues.


1. Rationale

The nameprep draft [NAMEPREP] is defining prohibited chars, normalization
algorithms, etc.. The current concensus is to take the safer approach of
not accepting new chars if they are not listed, since the new characters
that can be included in the subsequent releases of the universal character
set can have an impact on the domain name (for example, a new variation of
the dot . will mostly be non-compatible and being excluded).

The Unicode and ISO-10646 versioning is designed not for the same purpose
as the idn work: for example, some chars or properties can be changed or
added to those repertoires that cannot be taken as is by the idn
protocol.  In essence, the idn nameprep versions cannot be in sync with
unicode/iso versions. idn need its own versioning of the accepted character
set, which can or cannot be synchronized with the two others. While saying
that, there is no intent at all to define new codepoints inside this work
and any attempt to do that must be rejected.

Since internationalization and specifically internationalization of the
domain names is new, we don't really know if the chosen algorithm, protocol
and codepoitns will not have any problem in the future. We need to handle
versions of the protocol and to have a specific table for idn accepted chars.

2. Versioning of the protocol

Since the idn table of chars will be changed in the future, and possibly
the algorithms and the processing, then there is a need to handle these
situations in a controlled way. A version of the idn protocol should be
defined and included as part of the exchange between the parties so that
they can handle smoothly the new revisions.

2.1 Implementing the version in the protocol
Depending on which way the idn protocol will be, a different versioning
will have to be defined. We will discuss the current proposals and identify
where a versioning system can be handled.

2.1.1 Proposals based on extensions of the DNS protocol
  Proposals based on extensions of the DNS protocol should include in the
bits the version number and a way to exchange version numbers between the
parties. [IDNE] already defined a version number as part of the use of
EDNS. Similar versioning should be defined in the other proposed protocols.

2.1.2 Proposals based on ACE and application
  Since ACEs (for example [RACE]) in applications [IDNA] do not change the
DNS protocol but only encode the idn in a ascii compatible way, it looks
more difficult to include a version number in the ACE encoding, since it
will change the domain name.  The proposal is to use a different
prefix/suffix for each version, by using one of the chars used in the
prefix as a version number, beginning with the lowest possible ascii char
available and increase the ascii codepoint of the char by 1 for each
version. For example, if the prefix is "ra--", then the first version of
idn will be "ra--", the second version will be "rb--", the third "rc--".
While this would be more elegant, one can handle versioning by having
different prefixes for each version, while chosing prefixes randomly (i.e.
1st version: rq--, 2nd version: wz--, ...). IANA should block all possible
prefixes so that no pre-registration is permitted.

2.2 Major and minor version numbers

  A <major>.<minor> version number is proposed (i.e. 1.0, 1.1, 2.0). A
minor revision is defined to be as new chars or small changes in the table
to be handled without any modification of algorithm.  The idea here is to
enable easy upgrading of idn processors when only new chars will be
handled.  In this case, the binary software do not have to be changed and
only the table containing the chars has to be changed to enable a new
version.  A major revision means that the software has to be upgraded since
it implies new ways of handling idn which are not identified in the table.

2.3 Processing of the version numbers
Any idn processor MUST verify the version number before processing. When a
major version number is higher than what the processor currently handle is
detected, processing MUST be stopped and rejected. When a minor version
number is higher than what the processor currently handle, then any
intermediate processors MUST forward the idn but the end processor (i.e.
the dns server authoritative for that zone that is handling the request)
MUST stop and reject the request.  Specific handling of rejecting the
request should be defined in the protocol document.

2.4 Version numbers
  Version numbers of the idn protocol will be handled by IANA. A
IESG-reviewed document should be used as the basis for IANA to define a new
protocol version number.

3. Idn table
  Since the unicode consortium and ISO will issue new versions not at the
same time as the idn protocol versioning, the IETF need to have its own
registered table of accepted idn chars. This will also enable specific data
only useful for idn. The intent is not to duplicate the unicode/iso effort,
it is to support the specific needs of the idn group.  For example, it is
possible that specific chars will have a different behavior than the normal
Unicode way: the special casing for final or non-final letters in the
Unicode SpecialCasing.txt file can be merged (ie not totally identical to
the unicode processing) to only one casing since final and non-final
letters have less meaning in a domain name. Simpler processing for idn is
also useful: for example, the Lud property is probably not useful in the
idn context and can be considered equivalent to Lu. Combining those
properties together means much simple table and simpler processing and less
errors in implementations. Added chars, codepoint, properties MUST NOT be
done in this file, but must be done within the Unicode/ISO process.

This document proposes a table contained in an ascii file handled by IANA
and available in the IANA public directory. The source of the table will be
the Unicode table, with a similar format, but a simpler, and perhaps
different, content based on the needs of the idn protocol. Proposed
filename is: idntab.txt.

This file format will also help implementors to have the same input
description and exhaustive list of which chars and processing to handle
idn. This is easier for implementors to have a complete list of accepted
chars instead a list of non-accepted chars.  The hope is also to help
decrease the different behaviors between the different implementations.
This table can be considered as an implementation of [NAMEPREP].

3.1 File format

The file is divided in two parts. The first part contains variables. The
second part contains all the chars accepted for idn. The two parts are
separated by a empty line.

The format of the first part is:
tag=value<CR><LF>

Currently, only the version tag is defined. The "version" tag defines the
highest idn version number that can be found in this table.  The intent is
to have only one file that is kept current but where the beginning of the
file can be used by parsers to identify the latest version of the file.

The first idntab.txt file will define version=1.0:
version=1.0<CR><LF>

The format of the second part is:
  one codepoint per line
  lines separated by CR/LF
  each field in the line separated by ";"

The fields are (in order, from left to right):
  1. codepoint in hexadecimal (as in unicode table): HHHH
  2. first version of idn table that this char is supported

It is possible that new fields will be added in the future. Parsers should
ignore the fields that they don't understand.

Spaces (ascii  0x20) are ignored. Comments are identified by "#". All text
following the comment char up to the end of line is ignored. Any codepoint
not in the table is considered prohibited. Codepoints that are prohibited
may be included in the table inside commented lines in order to help a
reader to see explicitly which ones are prohibited.

Example of the file idntab10.txt:
version=1.0<CR><LF>
<CR><LF>
0041;1.0;

4. Security Considerations

Changing the way a software react about domain names is clearly something
that have security impacts. While the actual table doesn't present any
security threat by itself, if there is someways by a intruder to reload a
new table into a idn processor software without the knowledge of the user,
then it can completly disables name resolution or have other possible
threats.  In conclusion, care must be taken by software developers on how
to manage the insertion of a new idn table in a idn processor software.

5. IANA Considerations
  This document describes versions of the idn file called idntab.txt. The
file should be kept in the IANA public directory for references purposes.
New versions of the file should be made available after IESG review of a
document.  Old revisions of the file (idntab-xy.txt) should be kept for
documentation purposes.

6. Acknowledgements
  The author would like to acknowledge the comments and idea of the
following people: Fran‡ois Yergeau and Paul Hoffman.

7. Author

Marc Blanchet
Viagenie
2875 boul. Laurier, bur. 300
Ste-Foy, Quebec, Canada
G1V 2M2
Marc.Blanchet@viagenie.qc.ca
tel: 418-656-9254

8. References

[NAMEPREP] Preparation of Internationalized Host Names, Paul Hoffman, Marc
Blanchet. Work in progress.

[RACE] RACE: Row-based ASCII Compatible Encoding for IDN, Paul Hoffman.
Work in progress.

[IDNA] Internationalizing Host Names In Applications (IDNA), Paul Hoffman,
Patrick Falstrom. Work in progress.


Marc Blanchet
Viag‰nie inc.
tel: 418-656-9254
http://www.viagenie.qc.ca

----------------------------------------------------------
Normos (http://www.normos.org): Internet standards portal:
IETF RFC, drafts, IANA, W3C, ATMForum, ISO, ... all in one place.