# CBOR working group call, 2025-10-01 {#cbor-working-group-call-2025-10-01}

Meetecho: https://meetings.conf.meetecho.com/interim/?group=db1470ae-399c-41de-afed-11a1a0e75870

# AGENDA {#agenda}

## WG documents status and issues {#wg-documents-status-and-issues}

BL doing short introductions

### cde {#cde}

A new [draft PR#42][1] is trying to implement the discussion at the 2025-cbor-16 interim.

HTML:
: https://cbor-wg.github.io/draft-ietf-cbor-cde/disentangle/draft-ietf-cbor-cde.html

Diff:
: https://author-tools.ietf.org/api/iddiff?url\_1=https://cbor-wg.github.io/draft-ietf-cbor-cde/draft-ietf-cbor-cde.txt&url\_2=https://cbor-wg.github.io/draft-ietf-cbor-cde/disentangle/draft-ietf-cbor-cde.txt

Can the WG already provide feedback?

Presented slides: https://datatracker.ietf.org/meeting/interim-2025-cbor-17/materials/slides-interim-2025-cbor-17-sessa-cde-00

CB presenting

CB: There is now a draft PR on GitHub. More work is needed before a new version of the Internet-Draft can be submitted, but it already captures the direction.

CB (p2): Not uploaded as a draft yet because it needs a bit of redundancy removal.

CB (p3): Still no changes to RFC 8949 (ACK'ing that JH wants this done differently; will need to come back to that), but new terms are still admitted if useful.

(quickly skipping over use cases, as they are not the main scope here)

CB (p6): Formalization of encoding constrained set and what is really in scope for CDE (conjunction/union of 3 particular constraints). Should this be written as logical predicates? Something else?

CB (p7): Two cases of "why are we talking about constraints".

CB (p8): Also in scope for this document: one more encoding constraint that is interesting (definite-length-only). That should be here, not in a separate document. This is for a type of partial implementation that we care enough about; it is the only currently known "partial implementation".

CB (p9): Related work in -core-href specifying CRIs. It has text on not using indefinite length.
We plan to keep the text and its meaning, but we can refine it according to the language choice developed here.

JH (on chat): non-trivial NaNs would fit into a "simplified" ECS (along with DLO, for the same reasons)

CB: Not really, these are two orthogonal issues.

CB (p11): We have to avoid too many available options; that's difficult to navigate. Also, we should not invalidate previous uses of RFC 8949.

CB (p12): We now have five named encoding constraints. preferred-serialization is not usually an interoperability thing (i.e., it is used in conjunction with a generic decoder).

CB (p12): Encoders will often rely on definite-length-encoding.

JH (on chat): well-formed has 2 constraints. no-dup-keys, no-invalid-utf8

CB: That's about validity (for a later slide)

JH (on chat): but I think valid is required for deterministic.

CB (p13): On map validity, which requires an encoder to know all map keys and thus excludes streaming encoders. The constraint lexicographic-map-sorting implies knowledge of the map keys for the sake of sorting.

JH: IIRC from 8949, it says you have to operate on the data model and not on the encoded version. I wish it'd be "encoded version" because that'd be easy to implement based on what you said. Worried they're not exactly the same.

CB: To be checked. Can you create a GitHub issue?

JH: On it. https://github.com/cbor-wg/draft-ietf-cbor-cde/issues/43

CB (p13..): Validities.

LL (on chat): check \[for duplicate keys\] on decode or encode?

CB: Both. You have to remember the previous map entry's key and compare.

JH: Remember now. If you're not doing deterministic encoding, and you have a map as map key, then are they duplicates or not? IIRC 8949, they're supposed to be the same.

CB: Talking about the lexicographic encoding constraint. That forces deterministic encoding of keys.

JH: If they're now orthogonal, then there's the question of whether you can do that.

CB: Yeah. Slide is not true, you may not have (?), and on its own… that needs to be done differently.
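
A minimal sketch of the "remember the previous map entry's key and compare" check mentioned above, assuming each map key is already available as its encoded byte string. The function name `keys_sorted_and_unique` is hypothetical and not taken from the draft. Note that comparing encoded bytes detects duplicates in the data-model sense only if the keys themselves are deterministically encoded, which is the concern raised in issue #43.

```python
from typing import Iterable, Optional


def keys_sorted_and_unique(encoded_keys: Iterable[bytes]) -> bool:
    """Return True if the encoded keys are in strictly ascending
    bytewise lexicographic order.

    Strict ordering implies that no two encoded keys are identical,
    so only the previous key has to be remembered.
    """
    previous: Optional[bytes] = None
    for key in encoded_keys:
        # Python compares bytes objects bytewise and lexicographically.
        if previous is not None and key <= previous:
            return False  # out of order, or a duplicate of the previous key
        previous = key
    return True
```

For example, the encoded keys `0a`, `1864`, `20` (the integers 10, 100 and -1) pass the check, while repeating `0a` or swapping two entries fails it.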

LL (on chat): We can't require checking encoders for deterministic encoding

CB: The encoder has the situation under control and shouldn't need to check.

LL: Suggested on-list that we require deterministic encoders to disallow duplicate keys. You said we can't do that due to circumstances… Is that a requirement?

CB: We're defining CDE to make that a requirement now.

LL: That changed from last week -- so I was right?

CB: This is a decision point; we can make a decision here.

LL: It also depends on the sorting technique. It might not be cheap.

CB: But when you serialize, you can check that. That sort of check happens when serializing C++ multimaps.

LL: Just state the requirement and that it is a requirement: that it is not allowed in deterministic encoding to encode duplicate keys. You're only relying on the encoder, and not on the decoder, for conformance and checking.

JH: Philosophical issue. Two ways of checking that the received thing is correctly encoded CDE -- might check as-is-being-decoded, or might decode/reencode/compare, or trust. When doing it the 2nd way, no need to check.

LL (on chat): Checking is expensive.

CB: My point -- checking is expensive.

CB: Decoder checking has zero cost in brutish decode/reencode, which requires some space and time.

LL (on chat): Decoder checking has to be optional. Normative behavior can't depend on it.

CB: Only checking decoders have to do decoder checking. A general decoder can decode, it will just not get the properties.

JH (chat): It's a protocol-designer question about whether you need to check at all, and an implementation decision which approach to use.

CB: Yes. The protocol designer will say "you're using CDE, you need to check at decoding" for security protocols.

LL (on chat): The one design case where checking will be necessary is when it is known that the encoder can't guarantee there will be no duplicates

CB: In security it is necessary.

LL: Don't think that's true. I'll think more about it.

CA: It depends on the security protocol.
Some may be designed not to need it.

CA: So, COSE shouldn't have been using bstr for protected?

CB: Different situation; that's about signing inputs. Here we are talking about protocols that use the property of being deterministically encoded. Like Andes' embedded signatures.

CB: Need more text on this than is currently there.

CB (p14): On to text string validity. Do we want to require the expensive check?

CB: Do we want to limit CDE to applications where that check can be made?

JH: Same argument as for other constraints: this is a wedge point for attackers. Maybe this is where something can be implemented so simply, but last time I implemented it, I missed 3 constraints of UTF-8. Also, there are now UTF-8 checkers that use wide instructions. It doesn't have to be as slow as it used to be.

LL: So, determinism is orthogonal to security in terms of malformed input. The purpose of determinism is determinism. We need to be clear on that orthogonality. Defending against bad input is security, independently of determinism. Let's keep determinism clean about determinism, and not extend it to defense against malicious input.

CB: I wasn't so sure, e.g., thinking of when implementing UTF-8. In practice, it's so hardened that it's a non-issue. LL has a good point here.

CB: Not sure that we are already clear on which validity checks we should do. I will propose text covering both and we take it from there.

CB (p15): In fact, we can have tag validity as part of the same "box" holding string validity.

CB (p16): Next step is to complete PR #42.

CB: We need a plan to have this finished, keeping in mind the IETF 124 cut-off. We can rely on one more interim meeting. We need feedback on the PR, on which I can make some editorial updates this week, but save a full reading of the document for later.

CB: But after that, anything contentious left? Maybe NaN that JH raised? Anything else before finishing the document?
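
A minimal sketch of the text string validity check discussed under p14, assuming the payload bytes of a CBOR text string have already been extracted. It relies on Python's built-in strict UTF-8 decoder rather than a hand-written checker; the function name `is_valid_utf8` is hypothetical. The rejected inputs listed below are examples of cases a hand-written checker can miss, not necessarily the three constraints JH referred to.

```python
def is_valid_utf8(payload: bytes) -> bool:
    """Return True if payload is well-formed UTF-8.

    Python's strict UTF-8 codec rejects truncated sequences, overlong
    forms, encoded surrogates (U+D800..U+DFFF) and code points above
    U+10FFFF.
    """
    try:
        payload.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Inputs that a simplistic checker might accept by mistake:
overlong_slash = b"\xc0\xaf"         # overlong encoding of "/"
lone_surrogate = b"\xed\xa0\x80"     # encoded surrogate U+D800
out_of_range = b"\xf4\x90\x80\x80"   # would decode to a value above U+10FFFF
truncated = b"\xe2\x82"              # incomplete three-byte sequence
```

Each of the four sample inputs is rejected by `is_valid_utf8`, while ordinary text such as `"é".encode("utf-8")` is accepted.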

## Non-trivial NaNs {#non-trivial-nans}

LL: There is an encoding with no name, without constraints.

CB: Well-formed.

LL: So, well-formed supports non-trivial NaNs just fine. And then "definite-length encoding with non-trivial NaNs" will be custom serialization (?).

JH (on chat): Agree that non-trivial NaNs are fine in well-formed, as long as the NaNs aren't touched as in preferred.

CA: I thought that one can always use an encoder with a tighter constraint set. Would this break, because a CDE encoder can't produce them any more?

JH: If we say that CDE does not support that, then you'd have an issue. You can still support the data model with a different encoding.

CA: I'd prefer a solution that uses a tag. I wouldn't produce non-trivial NaNs.

JH: That's why I pushed for a tag. It makes it an encoding decision, irrespective of the data model.

CA (chat): Thanks, that sounds good to me.

CB: That's one way to do this. The disadvantage is that we can't say preferred encoding is a good default any more (and that'd be a big diversion).

CB: Cases where preferred serialization loses information are rather artificial; I don't think implementations will have that problem. People working with NaNs will work with a specific encoding size, and the fact that it saves a few trailing zeros at the transport doesn't change how the application handles it. But it does require some care in an aside.

CA (chat): :thumbsup:

LL: I'd expect encoders to support only constrained encoding and deterministic encoding. Non-trivial NaNs are really an edge case.

JH: Agree, it's really a small edge case.

JH: Right now I only encode trivial NaNs. So there exist situations where sending non-trivial NaNs will be an interoperability issue (?).

CB (on chat): (You can always handle the issue in ALDR)

JH: Section 5.3.1 par 3

JH: It is untrue due to the sign bit. Not sure if that's recoverable. Error in 8949. We should consider it.

CB: Correct, they should have the same bit. We can use an Erratum.

JH (on chat): I will file an erratum.

BL: AOB?

IMD: Not a change to CDE, but a question: there are checks that need to be done when CBOR is used in a security-sensitive context, but that are not part of CDE or 8949 validity. (Where) can we document them? A running document? One that says "Security sensitive applications SHOULD/MUST(?) do those additional checks".

LL: Every use case is a security use case in that sense. Different document, not serialization related.

IMD: Agree; the concern is that we should be talking about it in *some* document.

CB: Start with a wiki page, collect cases. Currently, we have no cases. A general document about CBOR best practices? Hope this (?) is comprehensive enough, but if there's more, collect it.

→ Will create wiki page.

## AOB {#aob}

Note taking: Marco Tiloca, Christian Amsüss

[1]: https://github.com/cbor-wg/draft-ietf-cbor-cde/pull/42

*[CB]: Carsten Bormann
*[BL]: Barry Leiba
*[FP]: Francesca Palombini
*[IMD]: Ira McDonald
*[MCR]: Michael Richardson
*[CA]: Christian Amsüss
*[MT]: Marco Tiloca
*[PP]: Philip Prindeville
*[ST]: Sean Turner
*[BM]: Brendan Moran
*[KC]: Kal Conley
*[RHo]: Russ Housley
*[ML]: Martine Lenders
*[MM]: Maria Matějka
*[OS]: Orie Steele
*[RM]: Rohan Mahy
*[JH]: Joe Hildebrand
*[LL]: Lawrence Lundblade
*[VG]: Vadim Goncharov