i;unicode-casemap - Simple Unicode Collation Algorithm
draft-crispin-collation-unicasemap-07
Document history
Date | Rev. | By | Action |
---|---|---|---|
2012-08-22 | 07 | (System) | post-migration administrative database adjustment to the Yes position for Sam Hartman |
2007-09-07 | 07 | (System) | IANA Action state changed to RFC-Ed-Ack from Waiting on RFC Editor |
2007-09-07 | 07 | (System) | IANA Action state changed to Waiting on RFC Editor from In Progress |
2007-09-07 | 07 | (System) | IANA Action state changed to In Progress from Waiting on Authors |
2007-09-07 | 07 | (System) | IANA Action state changed to Waiting on Authors from In Progress |
2007-09-07 | 07 | (System) | IANA Action state changed to In Progress from Waiting on Authors |
2007-09-07 | 07 | (System) | IANA Action state changed to Waiting on Authors from In Progress |
2007-09-06 | 07 | Amy Vezza | State Changes to RFC Ed Queue from Approved-announcement sent by Amy Vezza |
2007-09-06 | 07 | Amy Vezza | IESG state changed to Approved-announcement sent |
2007-09-06 | 07 | Amy Vezza | IESG has approved the document |
2007-09-06 | 07 | Amy Vezza | Closed "Approve" ballot |
2007-09-06 | 07 | (System) | IANA Action state changed to In Progress |
2007-09-05 | 07 | Lisa Dusseault | State Changes to Approved-announcement to be sent from IESG Evaluation::AD Followup by Lisa Dusseault |
2007-08-31 | 07 | Sam Hartman | [Ballot Position Update] Position for Sam Hartman has been changed to Yes from Discuss by Sam Hartman |
2007-08-31 | 07 | (System) | New version available: draft-crispin-collation-unicasemap-07.txt |
2007-08-24 | 07 | (System) | Removed from agenda for telechat - 2007-08-23 |
2007-08-23 | 07 | Amy Vezza | State Changes to IESG Evaluation::AD Followup from IESG Evaluation by Amy Vezza |
2007-08-23 | 07 | Jon Peterson | [Ballot Position Update] New position, No Objection, has been recorded by Jon Peterson |
2007-08-22 | 07 | Ron Bonica | [Ballot Position Update] New position, No Objection, has been recorded by Ron Bonica |
2007-08-22 | 07 | Dan Romascanu | [Ballot Position Update] New position, No Objection, has been recorded by Dan Romascanu |
2007-08-22 | 07 | Lars Eggert | [Ballot Position Update] New position, No Objection, has been recorded by Lars Eggert |
2007-08-21 | 07 | Russ Housley | [Ballot comment] Based on Gen-ART Review from Christian Vogt. Section 1 ends with an applicability statement for the algorithm. The section says that, while the algorithm is well-suited for technical languages, it does not work correctly in certain cases when applied to natural language. My suggestion is to move the applicability statement to a more prominent place, perhaps into a new section preceding current section 1. 3rd paragraph of section 1: s/using using/using/ |
2007-08-21 | 07 | Russ Housley | [Ballot Position Update] New position, No Objection, has been recorded by Russ Housley |
2007-08-20 | 07 | David Ward | [Ballot Position Update] New position, No Objection, has been recorded by David Ward |
2007-08-20 | 07 | Jari Arkko | [Ballot Position Update] New position, Yes, has been recorded by Jari Arkko |
2007-08-17 | 07 | Chris Newman | [Ballot Position Update] New position, Yes, has been recorded by Chris Newman |
2007-08-17 | 07 | Chris Newman | [Ballot comment] In response to Sam's discuss, the normalization form described is Normalization form KD excluding the "Canonical Ordering Behavior" of non-spacing marks. Unicode TR 15 (http://www.unicode.org/reports/tr15/) defines form KD as "compatibility decomposition" and the version of the Unicode specification I have handy (v 2.0) defines compatibility decomposition as recursively applying both compatibility and canonical mappings and then re-ordering non-spacing marks (I assume this hasn't changed). As the UniData file contains both compatibility and canonical decompositions in the "decomposition property" (the Unicode Character Database overview document calls this the Decomposition_Mapping field) the core of this algorithm is the same as NFKD. If this variant of NFKD was a form visible on the wire, I would consider that problematic, but as it is a form used internal to the algorithm only, I do not consider it problematic as long as it is intentional. The one place where this difference from normalization form KD might have surprising results is if one of the input strings contains a non-canonical decomposition of a letter with multiple diacritical marks. In that case the character would not match under the equality function. As the intention of this collation is a 'cheap' collation rather than a linguistically correct collation (just as i;ascii-casemap is not a proper English collation), I consider this simplification justifiable. The "canonical ordering behavior" section of the Unicode specification is non-trivial and provides little incremental benefit (indeed little visible change) to this collation for that additional complexity. The specification could be made more helpful to implementors with a Unicode library by giving advice about whether or not it is acceptable to substitute NFKD for the partial-NFKD described here. On the issue of tracking the Unicode standard, I'll mention that many runtime environments have built-in Unicode support now and it can be difficult to determine which version of Unicode is active and most such environments provide no simple way to bind to a previous version of UniData.txt. So I consider tracking the current version of Unicode to be pragmatic and necessary to make implementations feasible. I would look to the "running code" principle of the IETF to support this position. |
2007-08-17 | 07 | Sam Hartman | [Ballot discuss] I don't expect to hold this discuss significantly past the telechat and would not be surprised if no change is required. I want to ask how much review has been done for two issues: 1) How well does the decomposition normalization in this spec align with NFKD? The text says it is effectively the same, but where does it produce different results? Do we care? 2) Are we comfortable with the unstable reference to Unicode data? Do we at least need to discuss the security considerations of these changes? |
2007-08-17 | 07 | Sam Hartman | [Ballot Position Update] New position, Discuss, has been recorded by Sam Hartman |
2007-08-16 | 07 | Cullen Jennings | Placed on agenda for telechat - 2007-08-23 by Cullen Jennings |
2007-08-16 | 07 | Cullen Jennings | [Ballot Position Update] New position, No Objection, has been recorded by Cullen Jennings |
2007-08-16 | 07 | Ron Bonica | Removed from agenda for telechat - 2007-08-23 by Ron Bonica |
2007-08-16 | 07 | Tim Polk | [Ballot Position Update] New position, No Objection, has been recorded by Tim Polk |
2007-08-14 | 07 | Ross Callon | [Ballot Position Update] New position, No Objection, has been recorded by Ross Callon |
2007-08-09 | 07 | Lisa Dusseault | Ballot has been issued by Lisa Dusseault |
2007-08-09 | 06 | (System) | New version available: draft-crispin-collation-unicasemap-06.txt |
2007-08-07 | 07 | Lisa Dusseault | [Ballot Position Update] New position, Yes, has been recorded for Lisa Dusseault |
2007-08-07 | 07 | Lisa Dusseault | Ballot has been issued by Lisa Dusseault |
2007-08-07 | 07 | Lisa Dusseault | Created "Approve" ballot |
2007-08-07 | 07 | Lisa Dusseault | Placed on agenda for telechat - 2007-08-23 by Lisa Dusseault |
2007-08-07 | 07 | Lisa Dusseault | State Changes to IESG Evaluation from Waiting for AD Go-Ahead by Lisa Dusseault |
2007-08-07 | 07 | Lisa Dusseault | State Changes to Waiting for AD Go-Ahead from Waiting for Writeup by Lisa Dusseault |
2007-08-07 | 05 | (System) | New version available: draft-crispin-collation-unicasemap-05.txt |
2007-07-30 | 07 | Lisa Dusseault | APPs-REVIEW There are a couple of clarifications I would suggest: Section 1, Para 1: "All input is valid." - it is not clear that this refers to the validity test in 4790 as opposed to input to the tests described in the previous sentence. Suggested alternative: "The validity test operation always returns a valid result." Even with that change, later the spec states that "strings in other character sets and/or encodings can not be used with this collation" so wouldn't those return an invalid response if used in the validity test? Same for invalid UTF-8 sequences? Section 1, Para 5: I would like to see an informative reference pointing to the current UnicodeData.txt file rather than just having the generic [UNICODE] reference. Section 5, Para 6: "The resulting two titlecased canonicalized UTF-8 strings are then treated as in i;octet for equality and ordering." shouldn't that also mention substring? Suggested alternative: "The resulting two titlecased canonicalized UTF-8 strings are then treated as in i;octet for equality, substring and ordering operations." Other than that looks good. This should proceed to publication asap as it is needed by a lot of apps. --Cyrus Daboo |
2007-06-20 | 07 | (System) | State has been changed to Waiting for Writeup from In Last Call by system |
2007-06-07 | 07 | Samuel Weiler | Request for Last Call review by SECDIR Completed. Reviewer: Sean Turner. |
2007-06-07 | 07 | Yoshiko Fong | IANA Last Call Comments: Upon approval of this document, the IANA will make the following assignments in the "Collation Registry" registry located at http://www.iana.org/assignments/collation/collation-index.html Collation: See Section 2 of [RFC-crispin-collation-unicasemap-02] Description: The i;unicode-casemap collation is well suited to use with many Internet protocols and computer languages. Use with natural language is often inappropriate; even though the collation apparently supports languages such as Swahili and English, in real-world use it tends to mis-sort a number of types of string. Reference: [RFC-crispin-collation-unicasemap-02] We understand the above to be the only IANA Action for this document. |
2007-05-25 | 07 | Lisa Dusseault | Issues raised to deal with in next version or RFC Ed notes: - conversion to titlecased canonicalized UTF8 must be applied recursively - deal with apparent conflict between paragraphs on using other Unicode mappings |
2007-05-25 | 07 | Samuel Weiler | Request for Last Call review by SECDIR is assigned to Sean Turner |
2007-05-25 | 07 | Samuel Weiler | Request for Last Call review by SECDIR is assigned to Sean Turner |
2007-05-23 | 07 | Amy Vezza | Last call sent |
2007-05-23 | 07 | Amy Vezza | State Changes to In Last Call from Last Call Requested by Amy Vezza |
2007-05-22 | 07 | Lisa Dusseault | State Changes to Last Call Requested from AD Evaluation::AD Followup by Lisa Dusseault |
2007-05-22 | 07 | Lisa Dusseault | Last Call was requested by Lisa Dusseault |
2007-05-22 | 07 | (System) | Ballot writeup text was added |
2007-05-22 | 07 | (System) | Last call text was added |
2007-05-22 | 07 | (System) | Ballot approval text was added |
2007-05-15 | 07 | Lisa Dusseault | State Changes to AD Evaluation::AD Followup from Publication Requested by Lisa Dusseault |
2007-05-03 | 04 | (System) | New version available: draft-crispin-collation-unicasemap-04.txt |
2007-05-02 | 07 | Lisa Dusseault | PROTO Writeup (1.a) Alexey Melnikov is the document shepherd for this document. The document is ready for publication. (1.b) This document was reviewed by several active and experienced IMAPEXT and Sieve WG members. There are no concerns about the depth of the reviews. Also note that this document is a dependency for the IMAP I18N document and an indirect dependency for Lemonade Profile Bis document (draft-ietf-lemonade-profile-bis-XX.txt). (1.c) No concerns requiring additional review. In particular, this document was reviewed by Arnt Gulbrandsen, who is one of the editors of the RFC 4790. (1.d) No specific concerns. No IPR disclosure was filed for this document. (1.e) This document is an individual submission. (1.f) No appeals threatened. (1.g) IDnits 2.04.07 was used to verify the document. Excessively long lines were found, but this is purely editorial. It also reports 2 Missing Reference, but they are defined. I think this is a bug in IDnits. Also there are some reports on possible DOWNREFs. 2 of them are informative, the other 2 point to Unicode documents. (1.h) References are properly split. There are no downward normative references. (1.i) An IANA considerations section exists and is clearly defined. It contains a registration of a new collation algorithm. (1.j) The document doesn't have any ABNF, MIB, etc. The XML registration template opens fine with Mozilla. (1.k) Document Announcement Write-Up Technical Summary This document describes "i;unicode-casemap", a simple case-insensitive collation for Unicode strings. It provides equality, substring and ordering operations. Working Group Summary This document is an individual submission. It was informally last called in the IMAPEXT WG and consensus was reached that this collation would be easier to implement than i;basic, thus it has a better chance of being deployed. Document Quality There is at least one server implementation of this document. At least 2 other server vendors are interested in implementing it. Personnel Alexey Melnikov is the document shepherd for this document. Lisa Dusseault is the responsible AD. |
2007-05-02 | 07 | Lisa Dusseault | Draft Added by Lisa Dusseault in state Publication Requested |
2007-04-17 | 03 | (System) | New version available: draft-crispin-collation-unicasemap-03.txt |
2007-04-11 | 02 | (System) | New version available: draft-crispin-collation-unicasemap-02.txt |
2007-03-22 | 01 | (System) | New version available: draft-crispin-collation-unicasemap-01.txt |
2006-12-06 | 00 | (System) | New version available: draft-crispin-collation-unicasemap-00.txt |
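
Chris Newman's ballot comment of 2007-08-17 describes the decomposition used by the collation as NFKD without the canonical ordering step, and notes that a letter whose diacritical marks are not in canonical order would then fail to match under equality. The Python sketch below may help illustrate that difference. It is not taken from the draft: `decompose_no_reorder` is a hypothetical helper, it relies on the standard `unicodedata` module (and therefore on whichever Unicode version the Python runtime ships), it omits the titlecase step entirely, and it does not attempt the algorithmic decomposition of Hangul syllables.

```python
import unicodedata


def decompose_no_reorder(s: str) -> str:
    """Recursively apply Decomposition_Mapping values without reordering marks.

    Hypothetical helper for illustration only; it approximates the
    "partial NFKD" discussed in the ballot comment, not the draft's text.
    """
    out = []
    for ch in s:
        mapping = unicodedata.decomposition(ch)
        if not mapping:
            # No decomposition mapping: keep the character unchanged.
            out.append(ch)
            continue
        parts = mapping.split()
        # Compatibility mappings carry a leading tag such as "<compat>".
        if parts[0].startswith("<"):
            parts = parts[1:]
        expanded = "".join(chr(int(p, 16)) for p in parts)
        # Recurse, because a mapping may itself contain decomposable characters.
        out.append(decompose_no_reorder(expanded))
    return "".join(out)


# Two spellings of "q" with a dot below (U+0323) and a dot above (U+0307),
# differing only in the order of the combining marks.
marks_a = "q\u0323\u0307"
marks_b = "q\u0307\u0323"

# Full NFKD applies canonical ordering, so both spellings normalize alike.
print(unicodedata.normalize("NFKD", marks_a) == unicodedata.normalize("NFKD", marks_b))

# Without the reordering step the two spellings stay distinct.
print(decompose_no_reorder(marks_a) == decompose_no_reorder(marks_b))
```

Under these assumptions the first comparison holds and the second does not, which is the "surprising result" the comment accepts as the price of a cheaper collation.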