Note: This was a special meeting originally intended to test
the Marratech conferencing system. Due to a meeting conflict
with ISOC, the normal voice bridge was used instead. The topic
for discussion was internationalization of various IETF protocols.
Loa Andersson
Stuart Cheshire
Olaf Kolkman (IAB Chair)
Gregory Lebovitz
Barry Leiba
Kurtis Lindqvist
Danny McPherson
Dave Oran
Lynn St. Amour (ISOC Liaison)
Dow Street (IAB Executive Director)
Dave Thaler
Lixia Zhang
Gonzalo Camarillo
Andy Malis
Lars Eggert (IESG Liaison)
Aaron Falk (IRTF Chair)
Sandy Ginoza (RFC Editor Liaison)
Russ Housley (IETF Chair)
Dave Thaler described how his attention had recently been drawn to
several standing issues with the use of internationalized domain
names, and more generally, the internationalization of various IETF
protocols. The IAB decided to look into these issues further,
partly due to recent ICANN efforts in allocating internationalized
domain names (and possibly TLDs). Dave had previously introduced
the topic via an email to the IAB, and during the call summarized
some of the key points:
– different parts of the IETF are using different encodings (e.g.,
UTF-8 and Punycode).
– there have been past IAB statements in this area, but the
guidance may not be entirely consistent or clear.
– what is implemented in real-world code does not always match the
RFCs, nor is it consistent across OSs or applications.
Dave went on to describe the properties of Punycode and UTF-8, and
Olaf and Stuart explained the various encodings used in the
software and interfaces they were familiar with. There was a
discussion about known limitations when converting between
encodings such as ASCII, UTF-8, UTF-16, and Punycode. All of
these formats are used at one point or another in common
implementations (e.g., Windows, OS X, Internet Explorer, Safari,
and DNS). Stuart noted that a browser or other application will
often attempt to guess the proper encoding if it is left unspecified
in the source file, leading to unpredictable behavior. Other
problems result from inconsistent byte-ordering conventions, and
memory allocation is complicated by the variable-length encodings
used by UTF-8, UTF-16, and Punycode. In short, there are a number
of open issues with the current use of multiple encodings for which
additional guidance might be helpful.
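A minimal sketch of the encoding differences discussed above, assuming
Python 3 and its standard codecs. The label "bücher" is an assumed
example and does not come from the meeting. It shows that the same
label has a different length and byte sequence in each encoding, that
UTF-16 depends on byte order, and that decoding with the wrong guess
silently yields different text rather than an error.

```python
# Illustrative only: "bücher" is an assumed example label.
label = "bücher"

# The same six-character label in four encodings.
encodings = {
    "utf-8": label.encode("utf-8"),
    "utf-16-le": label.encode("utf-16-le"),
    "utf-16-be": label.encode("utf-16-be"),
    "punycode": label.encode("punycode"),
}

# Byte counts differ, so a buffer sized for one form may not fit another.
for name, raw in encodings.items():
    print(f"{name:10} {len(raw):2d} bytes  {raw!r}")

# Same code units, different byte order.
same_order = encodings["utf-16-le"] == encodings["utf-16-be"]
print("UTF-16 LE and BE identical:", same_order)

# A wrong guess: UTF-8 bytes read as Latin-1 decode without error,
# but the resulting text is no longer the original label.
misread = encodings["utf-8"].decode("latin-1")
print("misread matches original:", misread == label)
```

The Punycode form here is the bare encoding of the label; the "xn--"
prefix used in internationalized domain names is added by a separate
step and is not shown.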
The group considered a scenario where an IETF WG is inventing a new
protocol and must decide which encoding to use. There have been
several RFCs on this, most recently RFC 4690. A common challenge
is how to maintain backward compatibility with old software while
moving to a new, clean standard. Olaf explained that DNS has
fairly strict conventions about when 8-bit clean data is required,
and that many applications (including DNS itself) do not perform an
8-bit clean comparison but instead apply ASCII case folding. Barry
added that this question of the on-wire interface does not even
address what the user sees in their browser or mail agent. This led to a
long discussion of the different interfaces within the overall
system: user / GUI presentation, interfaces between software
components on a single host, and communication on the wire.
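The difference between an 8-bit clean comparison and ASCII case
folding, as raised by Olaf, can be sketched as follows. This is an
illustration under stated assumptions, not a description of any
particular resolver: the helper names ascii_fold and names_equal are
hypothetical, and the sample names are made up for the example.

```python
def ascii_fold(data: bytes) -> bytes:
    """Lowercase only the octets for A-Z; leave every other octet as is."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)


def names_equal(a: bytes, b: bytes) -> bool:
    """Compare two names after ASCII-only case folding (hypothetical helper)."""
    return ascii_fold(a) == ascii_fold(b)


# Plain ASCII names: an exact octet comparison distinguishes them,
# while the folded comparison treats them as the same name.
exact_ascii = b"Example.COM" == b"example.com"
folded_ascii = names_equal(b"Example.COM", b"example.com")

# Non-ASCII text carried as UTF-8: ASCII folding does not touch the
# octets of the accented letter, so upper and lower case do not match.
upper = "BÜCHER".encode("utf-8")
lower = "bücher".encode("utf-8")
folded_utf8 = names_equal(upper, lower)

print(exact_ascii, folded_ascii, folded_utf8)
```

The last case shows why ASCII-only folding gives users inconsistent
results once raw UTF-8 appears in names: the ASCII letters fold but
the accented letter does not.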
It was agreed that Internationalized Domain Names (IDN) are only
a part of the overall problem space, and that the multiple-protocol
aspect of the space warrants further IAB consideration.
Several independent test cases were identified, and Stuart, Dave,
Olaf, and Barry will coordinate further on a set of small-scale
encoding tests involving a mixture of DNS, SMTP, IMAP, and HTTP.
Existing test pages constructed by the community will also be
leveraged in assessing the current state of interoperability.
Dave Thaler will summarize the current list of known issues in
order to determine next steps for the IAB.