JSON VIRTUAL INTERIM - 2013-08-21 @ 15:00-18:00 UTC via Webex

NOTE: Meeting adjourned at approximately 17:00 UTC

SUMMARY
==============================================================

GENERAL APPROACH
--------------------------------------------------------------

There was much discussion about whether to explicitly document the individual layers (including an abstract data model) or to document the known likely interoperability issues, leaving the existing document mostly intact. While some participants see merit in the layers approach, the rough consensus appears to be that the effort is not worthwhile. Instead, there appears to be rough consensus to document the known likely interoperability issues. For each issue, the document will note the most interoperable behavior, and describe a couple of behaviors that can lead to interoperability problems. The normative language will remain unchanged.

JSON TEXT
--------------------------------------------------------------

The rough consensus appears to be to start with Tim Bray's proposal, as documented at <http://www.ietf.org/mail-archive/web/json/current/msg01383.html>.

DUPLICATE NAMES
--------------------------------------------------------------

The rough consensus appears to be to start with the duplicate names summary, as documented at <http://www.ietf.org/mail-archive/web/json/current/msg01345.html>.

For determining the equality of names, there appears to be rough consensus to compare values as if the values are unescaped, so that "a" and "\u0061" are equal for purposes of name comparison. John Cowan to propose some text to the list.

ECMA TC39 / IETF JSON WG
--------------------------------------------------------------

The Area Directors and Chairs were made aware that Ecma TC39 is working on a JSON specification. However, the Area Directors and Chairs believe the efforts of the JSON Working Group are still worthwhile.
TC39 members were invited to join the Working Group, but very few have participated.

NUMERICAL PRECISION
--------------------------------------------------------------

There appears to be rough consensus to start with what is representable using a 64-bit IEEE-754 double precision value. Joe Hildebrand, Tim Bray, and John Cowan volunteered to draft proposed text.

PROFILES
--------------------------------------------------------------

John Cowan asked about including a profile of JSON, similar to Tim Bray's I-JSON <http://tools.ietf.org/html/draft-bray-i-json-00>. The rough consensus appears to be to leave that as a separate draft that the Working Group can consider once the chartered work is complete and it re-charters.

BEST PRACTICES AND OTHER DOCUMENTS
--------------------------------------------------------------

It was suggested that participants start work on a best practices document for JSON, similar to BCP 70. However, that work is not currently in charter.

RAW NOTES
==============================================================

ACTIVE PARTICIPANTS
--------------------------------------------------------------

Barry Leiba (BL)
Eliot Lear (EL)
Jim Schaad (JS)
Joe Hildebrand (JH)
John Cowan (JC)
Larry Masinter (LM)
Paul Hoffman (PH)
Pete Resnick (PR)
Tim Bray (TB)
Tony Hansen (TH)

DISCUSSION
--------------------------------------------------------------

JH - I think the approach we can take is to call out why something can challenge interop, and strengthen the SHOULD (NOT) by explaining why it's a SHOULD (NOT). I agree it would make it clearer to have a better mental model, but I don't think it's as important as making sure we explain how to not screw up.
JS - Do you mean we say SHOULD, but list why it might be ok?
JH - No, I mean SHOULD as in REALLY SHOULD, and explain how not following the SHOULD can break things.
PH - I mostly agree with Joe, but think we shouldn't say SHOULD unless ...
I think it is critical to say where interoperability issues might be, and that not following it might cause these interop issues. But I don't think we should try to categorize all of the vendors that operate certain ways.
JH - Right, I didn't mean specific vendors, but that we say "here are the things we've seen, and here's what you need to do to interoperate".
PH - Any other comments?
JS - I think that's a totally reasonable approach, and how I've been trying to push for this in the JOSE documents.
TB - I'm mostly with Joe here. I think we have a lot of shared consensus on what the interop problems are. If the result of our work is merely a document that talks about what the problems can be, we can call it done. I don't think we can strengthen the language any further.
JH - If we were to create a document that said "You SHOULD do this. But if you don't do it, here are the interop problems," will the IESG have a problem with that?
PR - I don't think there will be any issues, and I'm the one that whines about 2119 language. SHOULD means you need to do this, and you accept all the things that can go wrong if you don't.
BL - Are you not planning to talk about the interop issues?
JH - No, we're planning to talk about the issues.
PR - The fact is that you SHOULD NOT do duplicate names, and you can have problems if you do.
PH - I'm not sure we can do a document that says do this and it won't interop.
JH - No, I mean that we say, if you don't do this, here are the ways it might not work.
TH - We might want to talk about walled gardens. In the walled garden, you'd be fine, but outside you'd have problems.
PH - Are you suggesting we put that into the document?
TH - I think the document should talk about where you can expect interoperability and where things become less interoperable.
JC - I want to qualify some more. I think it's right to talk about things in terms of might break, and not will break. While there might not be problems now, there can be in the future.
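
As an illustration of the duplicate-names concern raised above (not something shown in the meeting), here is a minimal Python sketch of three behaviors a receiver might plausibly exhibit for the same text. The function names are hypothetical; only `json.loads` and its `object_pairs_hook` parameter come from the Python standard library, and other parsers may behave differently again.

```python
import json


def parse_last_wins(text):
    """Python's default: a later duplicate name silently replaces an earlier one."""
    return json.loads(text)


def parse_first_wins(text):
    """A hypothetical receiver that keeps the first value seen for each name."""
    def hook(pairs):
        obj = {}
        for name, value in pairs:
            obj.setdefault(name, value)  # ignore later duplicates
        return obj
    return json.loads(text, object_pairs_hook=hook)


def parse_reject_duplicates(text):
    """A hypothetical strict receiver that refuses texts with duplicate names."""
    def hook(pairs):
        obj = {}
        for name, value in pairs:
            if name in obj:
                raise ValueError(f"duplicate name: {name!r}")
            obj[name] = value
        return obj
    return json.loads(text, object_pairs_hook=hook)


doc = '{"a": 1, "a": 2}'
last = parse_last_wins(doc)
first = parse_first_wins(doc)
```

The same sender output yields three different outcomes (one value, the other value, or an error), which is the kind of "might break" behavior the notes describe.
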
JH - Taking the duplicate names as an example.
JC - It's possible to map them.
JH - What I'm saying is there are parsers that generate different results.
JC - I'm pointing to surrogates in particular, since the problem is more severe.
TB - We're trying to fix that.
JC - I'm just trying to say that there might be systems that can't fix the issues.
PH - Where John's desire might be getting more interesting is the newest issues. We're trying to solve the problems we know about, but there can be problems in the future. We're not going to try to solve the problem for all of the future.
JC - I have a problem with just "known", not "known possible".
PH, JH - Fine
TB - I don't see anyone having a problem with a document that is only listing the interop problems, and that's great progress.
JH - I think one more thing we do need to talk about in the document is how strings are compared, like what a human would think are equal but the
LM - Like saying this pertains to the abstract model and not the serialization layer.
JH - I agree it would be cleaner to explain it that way, but I think that is a much bigger edit than we have energy for. I think it can be good enough to say
JC - Well, that you should compare them unescaped, but if you don't you will have interop problems.
JH - Right.
LM - In IRI, we have Abstract layer, serialization, normalization, etc. I think that will help explain things better.
PH - So we have Joe's proposal to talk about ad-hoc explications versus Larry's layer system.
JC - I would prefer the layer system in principle, but evading it [missed]
JS - I think we also need to talk about what you need to do to the value half in addition to the name half in comparisons.
PH - Why is it important in JOSE?
JS - We have cases where we are caring about what the value of an attribute is, and comparing it to other data I have.
PH - Is the JOSE WG finding that this is something generally needed, or just for JOSE?
JS - Right now, I'm the one with the issue and have raised it in the WG. But I see it as the exact same issue.
JH - It may be some combination of language for object name comparison plus PRECIS rules for comparison. But I'm not convinced we have a general requirement here.
JC - I think we move past values and onto significance of the meaning of values. We have whitespace, unless it's inside the string. You can have as many spaces as you want and it doesn't impact the meaning. So that also applies to escaping; that the meaning of the escaping is what you are comparing and not the literal string.
PH - These comparisons are being done for some type of hashing, and for cryptographic comparisons? My personal preference is that we don't talk about removing escaping in value parts, and have the groups talk about it in the terms that make sense for them.
JS - I don't care about the hashing. I care about, say, checking that the algorithm value is the value I'm looking for.
JH - That's an application-level thing.
JC - So why are we not defining the algorithm?
JH - I'm trying to be clever because I might have a special way to handle it for my platform.
JC - An analogy would be LATIN SMALL LETTER A WITH ACUTE versus LATIN SMALL LETTER A combined with ACUTE ACCENT.
JH - That brings up the specter of normalization.
JC - It's only an analogy. If I send you something that would be a capital A for me, I can't assume you will see it the same.
PH - I think there will be implementations that do weird things, and we should document what can cause interop problems.
JC - My point might be too subtle.
PH - This topic might be for the best practices, which there is desire for but is not yet in charter.
JC - If I send you a "\u0061", it means the same as "a".
PH - In a key, or any string?
JC - Any string.
PH - I believe that is true.
JH - I believe that is true, but I'm not convinced about the usefulness.
If we were to have that discussion earlier in the document, then we can point to it in the [missed]
JC - I note that it says "any character may be escaped", but it is a lower-case may.
JH - What if we put something right after that saying they are compared as the same?
JC - I think we need to be careful about equivalency and equality.
JH - I think that needs wordsmithing.
PH - So John is going to propose text about string comparison.
PH - To summarize: we have some consensus that we document where there are interop problems, how you can avoid them, and here are some of the ways that they can break.
TB - I would fine-tune that some to be here is a non-exhaustive list of ways you can break.
PH - That sounds good.
LM - I was hoping for someone from TC39 to talk about their JSON effort.
PH - You have asked, and we have invited them (including phone calls), but they're not here.
TB - Are they working on a JSON standard?
PR - We have had conversations with TC39 in the last several days. There is an ECMA document on JSON floating around. I spoke to the Chairs (JSON and TC39), and the basic idea is we continue doing the work that we think needs to get done anyway, and we will deal with the Ecma output when/if it comes. TC39 has not come out and said we should stop work, so we're not going to stop.
PH - Also, this work is not public.
LM - I am aware of that, and I don't think I'm revealing anything confidential.
PH - TC39 has known about this meeting, and they've chosen not to join us today. The Chairs have done the best that they can do, and the ADs plus Eliot have done the best they can do. The WG Chairs are not part of those AD + Eliot discussions. Coming back to this group: there have been proposals on the mailing list that mostly match what we've talked about in this call. Do people think we're ready to move forward?
  1) What is a JSON text
  2) What does one do with duplicate names
There is no proposed text on number precision, but it is one area where interop problems can come up.
JH - Do you mean floating point specific, or numbers in general?
PH - In general. Tim had some proposals about the JSON text idea, and Matt posted
JC - I think this is superseded by the "you might have interoperability problems if...".
PH - Tim's proposal is talking about it in those terms. My question is are we moving too quickly?
JH - I think we should try and get a draft together.
JC - I agree.
TB - I have seen a number of concrete proposals; let's get them into the draft and see where it goes.
JH - I have one question about the text: I expect all surrogates to be paired, and all are encoded, etc...
JC - What do you mean by that? Unmatched unencoded surrogate pairs don't exist (-:
JH - My actual question: should we change the ABNF so that you're not allowed to do unpaired escaped surrogates?
JC - No, we cannot. We must not.
PH - Why can't we?
TB - Because that would cause things that are JSON now to not be JSON anymore.
JH - I was suggesting it go into the actual ABNF. It was awful, but it was legal. I understand what you're saying about invalidating existing stuff. But I think we're invalidating it anyway when we change the text.
TB - No, we're going to say that "if you do these things, then you might have interop problems".
LM - The abstract model allows for numbers that are valid numbers, but are not valid Unicode. There are interop problems because of bad implementations, and interop problems because strings are used that don't map into the prescribed model.
JC - I know people use JSON to interchange JavaScript or Java strings, and saying they can't exchange unpaired surrogates now means they don't comply.
JH - I'm willing to back off on that. You're selling past the close.
PH - So take this to the list, give it a week, then add the text to the draft?
JC - I don't object in principle, but I'm concerned in practice.
I don't think we've talked enough about number precision.
PH - We understand, and it can still be hashed out.

NUMBERS
--------------------------------------------------------------

TB - If you do these things, then you can interoperate; if you don't then you might have problems. So for numbers, I think that means you need to only use numbers that can be represented in 64-bit IEEE 754.
PH - So what you're suggesting is we do numbers as a third point of interop, and you SHOULD do it this way, and if you don't do it that way you can have interop problems.
JC - Just make sure we're talking about the right format.
JC - I think there will be people that find this too broad, not too narrow.
JH - You think there are people that accept 32-bit floating point?
JC - I think so. I think there are people that only accept integers.
JH - I think we can say it needs to be
TB - I work on a number of platforms, and 64-bit float isn't much of a problem.
JC - I think it needs to be at least discussed on list.
PR - I'm a little concerned about paralleling the interop text here as in other places. It is about if you do this, then you have a good shot of interop; but if you do this, then you can go wrong.
JC - For instance, I think one person said they can only handle fixed-point numbers.
PR - It seems like it's different. While it's a problem, it's not one the IEEE format will completely fix. I'm not sure this is the kind of text we should put in.
PH - Two things: 1) precisions beyond 32-bit (or 64-bit) float, and 2) integers only. If someone only has a 32-bit float, there are some of the things that can be described in a JSON number and what will happen when the receiver gets those, which is different from only handling integers. If we hear that there are implementations that only handle integers, then we need to discuss it in the document. If there aren't, but some can only handle 32-bit float, then I don't think we need to discuss it. What we are talking about is whether we stop or not.
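
To make the 64-bit IEEE 754 point concrete, here is a minimal Python sketch that tests whether a JSON number literal converts to a double without changing value. The function name is hypothetical, and "exactly representable" is only one possible reading of "representable"; the notes do not settle which reading the document will use.

```python
import math
from decimal import Decimal


def is_exact_in_binary64(literal: str) -> bool:
    """Return True if a JSON number literal becomes a double without changing value."""
    as_double = float(literal)  # nearest binary64 value; overflow gives inf
    if math.isinf(as_double):
        return False  # out of range for a finite double
    # Decimal(float) converts exactly, so this comparison involves no rounding.
    return Decimal(as_double) == Decimal(literal)


# 2**53 + 1 is the first positive integer a double cannot hold.
silently_rounded = float("9007199254740993")
```

Under this strict test "0.5" passes while "0.1" and "9007199254740993" do not. Most applications tolerate the tiny rounding on 0.1 but not a changed integer, which suggests any guidance text would need to say which kind of loss it means.
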
TB - The thing with surrogates is that the current spec implies you have to deal with it, but it doesn't say you only deal with integers.
JC - It does allow for 0 digits after it.
TB - I think it's an egregious violation, but you might be right.
TH - Floating point is by its very nature non-interoperable when you go between machines or into an on-the-wire representation that doesn't hold the exact bit patterns. The best you can say is that sticking to a format compatible with the IEEE standard would provide for the "best chance of interoperability".
JH - And once you move it to base-10, then it gets worse.
JC - 4627 § 4 says that an implementation may set limits on the range of numbers, but does not say anything about the precision of numbers.
PH - So we can take this to the list.
JC - Omitting one and not the other implies a certain interpretation.
LM - Are you saying that having more 0's at the end is different than limiting them?
EL - There is a law of physics in play here. We haven't specified a limit, but there will be.
JH - We can say that applications MAY put a limit on range and precision, and most use IEEE 754 64-bit.
PR - That sounds like the right approach.
PH - (throwing in a wrench) As soon as we mention IEEE format, we are going to have to say, BUT you can't do +/-Infinity and NaN.
JC - I'm fine with that. We're saying that a number needs to be represented in this format, but not that every IEEE value is represented.
PH - OK, making sure we understand that.
PH - I hear a couple of volunteers for numbers.
JC - I think the three of us can come to consensus.
PH - I don't want the list to think we have a lot of contention, when we don't really have any. Can Joe/Tim/John come to internal consensus before going to the list?
JC/JH - Yes.

OPEN MIC - ECMA RELATIONSHIP
--------------------------------------------------------------

PH - You've brought up the Ecma thing ... are there any specific concerns you want to bring up now?
LM - I have heard that some at TC39 are not happy about no formal liaison.
EL - I sit on the IAB, and I have been talking to the TC39 people about this. There is an impedance mismatch between the IETF and ECMA. The IAB doesn't see the need for a long-term relationship so we don't want to formalize it. The IETF is open to everyone, and we think the best way to get people to talk is to just have people talk.
LM - The mailing list is full of things that are foreign. It's very difficult to just come into the meeting room and start participating at any point. I think we should call them, and tell them we're having a meeting and you should come here.
JC - We've done that, but they can't because they're under non-disclosure.
EL - I haven't heard that here. We received a document, and we're reviewing it. There is a point on the charter about participation from Ecma, but they haven't participated.

OPEN MIC - APPLICATION/FORM-DATA
--------------------------------------------------------------

LM - I've been working on application/form-data, and it has a lot of the same problems with non-ASCII values. And separating things into layers helps better define what needs to happen where. We could say that the Abstract model only contains integer values that can be interpreted as more; application models might have IEEE and so on. If you want to understand why the design choices are what they are, then I think the layering helps do that.
JC - It would if it was in the data model. I think that if we had the luxury to start from scratch, it would be usable. But we can't, so I don't think it's useful now.
LM - I don't understand that, since the abstract model is already in the applications today.
JC - I'm not understanding.
LM - The spec today says there are strings and numbers, and there are problems with precision and encoding, but it's not a problem in the abstract model.
JC - We went down this road in XML. We couldn't agree on what is in or out of the data model.
LM - Does JSON have the same problems as XML?
JC - No, because no one has attempted to write a JSON data model.
PH - For the WG, does a data model document go into 4627, or in something else?
LM - I'm not proposing the data model as anything more than a way to describe why certain choices were made.
PH - I suggest that we hold that discussion for the mailing list. There is a variety of understandings among implementers, and they are not identical. Having a data model would be valuable in an implementer's guide, but not in 4627bis.
JC - Or a user's guide.
PH - When it's about what's next. Do we need an implementer's guide, or a combination implementer's/user's guide? We propose that you write up the data model, but not for inclusion in 4627bis.

OPEN MIC - PROFILE
--------------------------------------------------------------

JC - The idea of a Tim Bray profile of JSON, in the 4627bis. If you want to maximize interop, obey all of these rules...rather than scattering them throughout the document.
PH - Do they belong in the document at all, and then do they belong in a single section?
TB - No, that would read weird. Put things where you talk about it.
PH - So you want scattered?
TB - I want to judiciously sprinkle the interop guidance and challenges.
PR - There are instances where there are clear interop problems, and that's different than maximizing interop. I think we have consensus on the specific problems, and it sounds like you want a more general guidance.
JC - I bring it up because §4 talks about parsers dropping data.
TB - I think what John is trying to assemble is my I-JSON draft into 4627bis?
JC - I'm not necessarily saying we have it in multiple places, but that we should have all of these points gathered into one place, so we can reference it as a profile.
TB - I think we know what the biggest problems are, and we can do localized surgery. This is also about having an interoperable profile in this document, but it requires a lot more.
JC - I agree at that level it's not in our scope. But now I wonder if there are interop problems if using something other than UTF-8?
PH - 4627 does have an IANA consideration, which is a MIME registration that is inviolate. It only talks about UTF-8, -16, and -32.
JH - I agree with what Paul and Tim said, and we can't change what people are using today, but that we should have another document that
JC - I'm asking if we say that if we don't limit to UTF-8, that anyone that does limit is not interoperable?
JH - I'm aware of one that only understands UTF-16 ... like ECMA-262. I don't think this is a good path to go down.

OPEN MIC - BCP 70
--------------------------------------------------------------

LM - I did ask about BCP 70. It seems like we should have something like BCP 70 for JSON.
PH - There is interest, but we're not ready for that yet.
LM - I'd like to solicit volunteers on that, and other parking lot things.
PH - I think you should keep that list, but I don't want to interrupt the WG process. It's not on the charter now.

OPEN MIC - NEXT STEPS
--------------------------------------------------------------

JH - I've sent along small edits to Tim and John, so we should have something out soon.
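
The name-comparison point from the summary ("a" and "\u0061" are equal) and the acute-accent analogy from the discussion can be sketched in Python as follows. The helper name is hypothetical, and the sketch assumes a parser that unescapes strings before comparing them, which is what Python's `json` module does; it is not proposed text.

```python
import json
import unicodedata


def names_equal(raw_a: str, raw_b: str) -> bool:
    """Compare two JSON string tokens (quotes included) by their unescaped value."""
    return json.loads(raw_a) == json.loads(raw_b)


# Escaped and literal forms of the same character compare equal.
escaped_equal = names_equal('"a"', '"\\u0061"')

# After unescaping, these two names are the same, so this object has one member.
collapsed = json.loads('{"a": 1, "\\u0061": 2}')

# Unescaping is not Unicode normalization: precomposed and decomposed
# forms of "a with acute" stay different unless the application normalizes.
precomposed = json.loads('"\\u00e1"')
decomposed = json.loads('"a\\u0301"')
equal_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
```

Comparing unescaped values settles the "\u0061" case, while the normalization case stays an application-level decision, consistent with the remark in the notes that the accent example was only an analogy.
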