
Last Call Review of draft-ietf-precis-framework-15
review-ietf-precis-framework-15-secdir-lc-kaufman-2014-04-24-00

Request: Review of draft-ietf-precis-framework
Requested revision: No specific revision (document currently at 23)
Type: Last Call Review
Team: Security Area Directorate (secdir)
Deadline: 2014-04-22
Requested: 2014-04-10
Authors: Peter Saint-Andre, Marc Blanchet
I-D last updated: 2014-04-24
Completed reviews: Genart Last Call review of -15 by Tom Taylor
                   Genart Last Call review of -22 by Tom Taylor
                   Secdir Last Call review of -15 by Charlie Kaufman
                   Opsdir Last Call review of -15 by Tim Wicinski
Reviewer: Charlie Kaufman
State: Completed
Request: Last Call review on draft-ietf-precis-framework by Security Area Directorate, Assigned
Reviewed revision: 15 (document currently at 23)
Result: Has nits
Completed: 2014-04-24

I have reviewed this document as part of the security directorate's ongoing
effort to review all IETF documents being processed by the IESG. These
comments were written primarily for the benefit of the security area
directors. Document editors and WG chairs should treat these comments just
like any other last call comments.



This document concerns international character sets. You might intuitively
think that international character sets would have few if any security
considerations, but you would be wrong. Many security mechanisms depend on
the ability to recognize that two identifiers refer to the same entity, and
inconsistent handling of international character sets can result in two
different pieces of code disagreeing as to whether two identifiers match.
This has led to a number of serious security problems.



This document defines 18 categories of characters within the Unicode
character set, with the intention that systems that want to accept subsets of
Unicode characters in their identifiers will specify profiles referencing
this document. It also defines two initial classes (IdentifierClass and
FreeformClass) that could be used directly by many protocol specifications.



While I see no problems with this document, it does seem like a missed
opportunity to specify some things that are very important to the secure use
of international character sets. The most important of these is a rule for
determining whether two strings should be considered equivalent. It is very
common, in both IETF protocols and in operating system object naming, to
adopt a preserve-case / ignore-case model. That means that if an identifier
is entered in mixed case, the mixed case is preserved as the identifier, but
a lookup using an identifier that is identical except for the case of its
characters will still find the object. Further, where uniqueness of
identifiers is enforced (e.g., user names or file names), a request to create
a second identifier that differs from an existing one only in the case of its
characters will fail.
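As a rough sketch of that model (not something the document or this review
specifies), a preserve-case / ignore-case registry might key its lookups on a
case-folded form while storing the identifier as entered; Python's casefold()
here stands in for whatever equivalence rule a profile would actually mandate:

```python
class Registry:
    """Preserve-case / ignore-case identifier store (illustrative sketch)."""

    def __init__(self):
        self._entries = {}  # case-folded key -> identifier as originally entered

    def create(self, identifier):
        key = identifier.casefold()  # stand-in for a profile's equivalence rule
        if key in self._entries:
            raise ValueError("identifier already exists (differs only in case)")
        self._entries[key] = identifier

    def find(self, identifier):
        return self._entries.get(identifier.casefold())

r = Registry()
r.create("Alice")
print(r.find("ALICE"))      # 'Alice' -- mixed case preserved, lookup ignores case
try:
    r.create("alice")       # uniqueness is enforced ignoring case
except ValueError as err:
    print(err)
```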



These scenarios require that it be well defined whether two characters differ
only in case, and while that is an easy check to make in ASCII, with 26
letters that have upper- and lowercase versions, the story is much more
complex for some international character sets. Worse, case mapping of even
ASCII characters can change based on the “culture”. The most famous example
is the Turkish dotless lowercase ‘ı’ and dotted uppercase ‘İ’, which caused
security bugs because mapping “FILE” to lowercase in the Turkish locale did
not result in the string “file”. There are also cases where two different
lowercase characters are both mapped to the same uppercase character. It is a
scary world out there.
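A small Python sketch (using the language's default, locale-independent
Unicode case mappings) shows how quickly the ASCII intuition breaks down;
note that a genuinely locale-aware mapping, such as Turkish, would behave
differently again:

```python
# Two distinct lowercase letters can share an uppercase form.
print('i'.upper(), '\u0131'.upper())   # I I   (Latin i and dotless i)
print('σ'.upper(), 'ς'.upper())        # Σ Σ   (Greek sigma and final sigma)

# Case mapping is not even guaranteed to preserve string length.
print('ß'.upper())                     # SS

# Python's default mapping is locale-independent, so this holds here; a
# Turkish-locale mapping would lowercase 'I' to dotless 'ı', giving 'fıle'.
print('FILE'.lower() == 'file')        # True only under the default mapping
```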



For international character sets to be used safely from a security
standpoint, there must be a standardized way to compare two strings for
equivalence that all programs will agree on. Programs will still have bugs,
but when two programs interpret equivalence differently it is important that
it be possible to determine objectively which one is wrong. The ideal way to
do this is to have a canonical form of any string, such that two strings are
equivalent if and only if their canonical forms are identical.
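As a sketch of what such a canonical form might look like (an illustration
only, not a rule this framework defines), one could compose Unicode
normalization with case folding and compare the results:

```python
import unicodedata

def canonical(s):
    # Illustrative canonical form: NFC normalization plus Unicode case folding.
    # A real PRECIS profile would pin down its own width mapping, case mapping,
    # normalization form, and directionality rules.
    return unicodedata.normalize('NFC', s).casefold()

def equivalent(a, b):
    return canonical(a) == canonical(b)

# U+00C5 (Å) as one code point vs. 'A' followed by U+030A (combining ring above)
print(equivalent('\u00c5ngstr\u00f6m', 'A\u030angstr\u00f6m'))  # True
print(equivalent('FILE', 'file'))                               # True
```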



Section “10.4 Local Character Set Issues” acknowledges this problem, but offers
no solution.



In section “10.6 Security of Passwords”, this document recommends that
password comparisons not ignore case (and I agree). But for passwords in
particular, it is vital that they be translated to a canonical form, because
they are frequently hashed and the hashes must test as identical. One rarely
has the luxury of comparing passwords character by character and deciding
whether the characters are “close enough”.
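A minimal sketch of why canonicalization matters here; the NFC step and the
use of SHA-256 are illustrative assumptions only (a real system would follow
whatever profile the protocol specifies and would use a dedicated password
hash such as scrypt or Argon2):

```python
import hashlib
import unicodedata

def password_hash(password):
    # Canonicalize before hashing: the same password typed on two systems may
    # arrive as different code point sequences (composed vs. decomposed
    # accents), and the hashes would otherwise never match.
    canonical = unicodedata.normalize('NFC', password)
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()

# 'é' entered as one code point vs. as 'e' plus a combining acute accent
print(password_hash('caf\u00e9') == password_hash('cafe\u0301'))  # True
```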



Section “10.5 Visually Similar Characters” discusses another hard problem:
characters that are entirely distinct but are visually similar enough to
mislead users. This problem occurs even without leaving ASCII, in the form of
the digit ‘0’ vs. the uppercase letter ‘O’, and the triple of the digit ‘1’,
the lowercase letter ‘l’, and the uppercase letter ‘I’. In some fonts,
several of these are indistinguishable. International character sets
introduce even more such collisions. To the extent that we expect users to
look at URLs like https://www.fideIity.com and recognize that something is
out of place, we have a problem. It is probably best addressed by having
tables of “looks similar” characters and disallowing the issuance of
identifiers that look visually similar to existing ones in DNS registries and
other places where this problem arises. Having a document that lists the
doppelganger character equivalents would be a useful first step towards
deploying such restrictions.
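A toy sketch of the kind of check such a table would enable; the mapping
below is a tiny, hand-picked subset chosen for illustration, not a real
confusables table like the one the Unicode Consortium publishes:

```python
# Map visually confusable characters onto a common "skeleton" character.
# This table is a small illustrative subset; a real implementation would use a
# published confusables table and would also normalize and case-fold first.
CONFUSABLE = {
    'I': 'l', '1': 'l',     # capital I and digit 1 look like lowercase l
    '0': 'o', 'O': 'o',     # digit 0 and capital O look like lowercase o
    '\u0430': 'a',          # Cyrillic a looks like Latin a
    '\u0435': 'e',          # Cyrillic e looks like Latin e
}

def skeleton(label):
    return ''.join(CONFUSABLE.get(c, c) for c in label)

def looks_like_existing(candidate, registered):
    return any(skeleton(candidate) == skeleton(name) for name in registered)

registered = {'fidelity'}
print(looks_like_existing('fideIity', registered))   # True: capital 'I' mimics 'l'
print(looks_like_existing('example', registered))    # False
```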



I suppose it is too much to expect this document to address either of these
issues, but I couldn’t resist suggesting it.



                --Charlie