Minutes IETF117: cbor: Tue 00:30
minutes-117-cbor-202307250030-00

Meeting Minutes: Concise Binary Object Representation Maintenance and Extensions (cbor) WG
Date and time: 2023-07-25 00:30
Title: Minutes IETF117: cbor: Tue 00:30
State: Active
Last updated: 2023-08-02

CBOR working group session at IETF 117

Monday, 24 July, 2023 — 17:30-18:30 PDT (UTC-7)

Agenda

Minutes

Minute taking: Marco Tiloca, Christian Amsüss

Introduction and administration

BL doing introductions: note-well; blue-sheets, queue and slide control
through meetecho.

BL going through agenda; no changes.

Documents not under discussion today (5 min, including intro)

time-tag https://datatracker.ietf.org/doc/draft-ietf-cbor-time-tag/

BL: changed to Standards Track from Informational. I'm working on the
Shepherd Write-Up, then we'll request publication.

Presented slides:
https://datatracker.ietf.org/meeting/117/materials/slides-117-cbor-carstens-slides-00.pdf

CB (p2): SEDATE is still not done, but we're far enough decoupled to
progress. One editorial can be addressed during IETF LC.

packed https://datatracker.ietf.org/doc/draft-ietf-cbor-packed/

Presented slides:
https://datatracker.ietf.org/meeting/117/materials/slides-117-cbor-carstens-slides-00.pdf

CB (p3): -09 has two ways to set things up (common and split setup), and
more text.
CB: DNS over CoAP is a use case; we'd like to hear more implementation
experience, then we're done.
CB: Christian complemented the setup picture by defining how to get tables
from elsewhere; suggesting that for the next interim.

Active WG documents

(ad-hoc) "CBOR-associated languages"

CB's introduction to the next blocks.

CB (p4):

  • cbor-pretty is in the IETF list of formats used in RFCs (but gets
    long).
  • Diagnostic notation is now EDN since it has been extended. Text form
    for single instance. Easy to convert from and into CBOR. Feels like
    JSON.
  • Then CDDL for describing the data model, a grammar, inspired by
    ABNF.
  • Everyone confuses details between the latter two.

edn-literals

Presented slides:
https://datatracker.ietf.org/meeting/117/materials/slides-117-cbor-carstens-slides-00.pdf

CB (p5): Going through the recent history of the draft. Recently added
ABNF and an appendix on EDN vs. CDDL (similar to the previously presented
slide). We need implementations of the ABNF and reviews.

CDDL (20 min)

Presented slides:
https://datatracker.ietf.org/meeting/117/materials/slides-117-cbor-carstens-slides-00.pdf

CB (p6):

  • grammar essentially done -- but just became WG document, so review
    would be good.
  • control -- please look through your specs for things that are
    hard to write down, e.g. "we have no way to define JSON inside a
    string".
  • module: Really 2.0; play around with it, using import/include!

CB: We need these to be used in specifications, and reviews and
implementations. All implemented in the cddlc tool.
CB: So, all waiting for reviews.

HB: We have like 50 CDDL fragments for CORIM. We'll use import/include
there, as soon as my coauthors reappear.
CB: Wonderful!

CA: Giving good examples of how documents from ACE and about CWT use
these tools would help.
CB: Good point; next slide is saying the same on a different matter.
We'll need to write a few more examples, not only on specs but also
larger examples on how things fit together. ACE/CWT are good examples.

(ad-hoc) Deterministic encoding, and other active work

CB (p7): Switching to deterministic encoding. There is a well-defined
deterministic encoding in 8949.
CB: New bormann-cbor-det explains how application comes in. This can be
developed as we get more feedback from people using deterministic
encoding.
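
As an illustration of the core deterministic encoding requirements in RFC 8949 (shortest-form heads, definite lengths, map keys sorted bytewise by their encoded form), here is a minimal Python sketch. It is not taken from any of the drafts discussed; the function names `encode_head` and `encode` are hypothetical, and floating point values and tags are deliberately left out.

```python
import struct


def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head, always choosing the shortest argument form."""
    if arg < 0:
        raise ValueError("argument must be non-negative")
    if arg < 24:
        return bytes([(major << 5) | arg])
    if arg < 1 << 8:
        return bytes([(major << 5) | 24, arg])
    if arg < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", arg)
    if arg < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", arg)
    if arg < 1 << 64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", arg)
    raise ValueError("argument does not fit in 64 bits")


def encode(item) -> bytes:
    """Deterministically encode ints, bytes, text, lists, dicts and bools."""
    if isinstance(item, bool):  # must be checked before int
        return b"\xf5" if item else b"\xf4"
    if isinstance(item, int):
        if item >= 0:
            return encode_head(0, item)
        return encode_head(1, -1 - item)
    if isinstance(item, bytes):
        return encode_head(2, len(item)) + item
    if isinstance(item, str):
        data = item.encode("utf-8")
        return encode_head(3, len(data)) + data
    if isinstance(item, list):
        return encode_head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, dict):
        # Sort entries by the bytewise lexicographic order of the encoded keys.
        pairs = sorted(
            ((encode(k), encode(v)) for k, v in item.items()),
            key=lambda kv: kv[0],
        )
        return encode_head(5, len(pairs)) + b"".join(k + v for k, v in pairs)
    raise TypeError(f"unsupported type in this sketch: {type(item).__name__}")
```

Two maps with the same entries in a different insertion order produce the same bytes, which is the property applications relying on deterministic encoding need.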

CB: To benchmark that backgrounder, took dCBOR and extracted differences
from 8949. Essentially it's just the numeric type reduction. With the
backgrounder it's not hard to do.
CB: Still not decided if we need these documents, but I found it useful
to write the backgrounder. We would need some initial reviews.
CB: Then more active, related work: -dns-cbor, -cddl-csv,
-rfc-cddl-models, -draft-numbers.
CB: rfc-cddl-models extracts models and helps with applying errata.

MP: The dCBOR draft was very helpful because it condensed things.
Negative 0 etc. was interesting; was there anything else worth noting?
Something that needs particular attention from implementers, other than
the zero adjustment?
CB: Another interesting part is handling subnormal floating point
numbers. Thinking about how to process them destroys the beautiful
unified number idea.
MP: Also relevant to machine learning.
CB: FP8 and bfloat16 are becoming relevant there; that document might
help.

WMN: Original proposal for dCBOR had subnormals called out for future
work. But it's part of IEEE and thus CBOR. Wanted to make sure it's done
at some point (even though not our focus). Our initial spec dealt with
machine size ints and floats. Wanted to lock those down. Spec did good
job as we had it. When it comes to your draft, 8949 section on
deterministic encoding talks about whether to reject
non-deterministically encoded dictionaries. We want to make sure integer
decoders reject invalid zero-ish floating point numbers. Want to make
sure that dCBOR is strict, both in writing and in reading, rejecting
wrong encodings.
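
The "strict in reading" point can be sketched for the simplest case, the CBOR head of major types 0 to 6: a strict decoder rejects any argument that could have been encoded in fewer bytes, as well as indefinite lengths. This is a minimal Python sketch and not code from the dCBOR implementations; `decode_head_strict` and `NonDeterministicEncoding` are hypothetical names, and floats (major type 7) are out of its scope.

```python
class NonDeterministicEncoding(ValueError):
    """Raised when well-formed CBOR is not in its shortest form."""


def decode_head_strict(data: bytes):
    """Return (major, argument, bytes_consumed) for major types 0 to 6.

    Raises NonDeterministicEncoding when the argument is not encoded in its
    shortest form or when an indefinite length is used.
    """
    if not data:
        raise ValueError("empty input")
    major, info = data[0] >> 5, data[0] & 0x1F
    if major == 7:
        raise ValueError("simple values and floats are not handled here")
    if info < 24:
        return major, info, 1
    if info > 27:
        raise NonDeterministicEncoding("indefinite length or reserved value")
    size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes
    if len(data) < 1 + size:
        raise ValueError("truncated input")
    arg = int.from_bytes(data[1:1 + size], "big")
    # Smallest argument that genuinely needs this many bytes.
    minimum = 24 if size == 1 else 1 << (8 * (size // 2))
    if arg < minimum:
        raise NonDeterministicEncoding("argument not in shortest form")
    return major, arg, 1 + size
```

For example, `0x1818` (24 in one extra byte) is accepted, while `0x1817` (23 in one extra byte) and `0x190001` (1 in two extra bytes) are rejected even though a generic CBOR decoder would accept them.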

Individual documents under discussion

dns-cbor (10 min)

ML presenting

Presented slides:
https://datatracker.ietf.org/meeting/117/materials/slides-117-cbor-a-concise-binary-object-representation-cbor-of-dns-messages-01.pdf

ML(p3): Motivation -- sizes. Fragments for most names. Compression by
more compact messages with content as application/dns+cbor.
ML(p5): Objectives and methods, including packed
ML(p6): Focus since 116 was on DoC, more work on dns-cbor coming up.
Changes listed.
ML(p7): New implementation during hackathon, almost done. Found
troubles, e.g. that the question section is not needed but only other
sections; and pseudo-RRs take up lots of space in byte encoding.

ML(p8): Next steps are comparison of compression algorithms, comparison
of different formats, implementation with evaluation, possible global
compression contexts or implied table entries.
ML: Discussion this morning on global contexts, and negotiation of the
compression context.

FN: You have an encoder but no decoder?
ML: The decoder can't do packed yet.
FN: Do you have a wireshark dissector?
ML: We don't have a dissector yet, I just used pcap.
FN: Do you need help for that?
ML: Happily, but maybe wait for next version's fixes.

CB: When will we discuss numbers? How much do we save, on which interim
can we do this?
ML: Hopefully after this IETF meeting I can put more time on this and be
able to provide some numbers for the next interim meeting.
CB: Always hard to look at proposals w/o seeing what they do to actual
traffic.

dCBOR (15 min)

BL: No slides; WMN, want to start discussion?

WMN: Originally motivated by both blockchain and non-blockchain
companies that need convergent bit sequences for data. CBOR suitable due
to sorting through keys. We often mix numeric types, wrote code and
draft addressing that. As CB pointed out, determinism can't be only at
one level (?). CB's is on the protocol level. Most controversial thing
is numeric reduction -- a numeric value should have one representation if
it can. JSON has canonicalization. Here on CBOR, want to do as much as
possible on the protocol level. A side effect is that in weakly typed
languages, you don't have to declare what numbers are (or use
builders). On languages that do care, they can still handle it.
Unaddressed issues are known but not relevant to our use case, but hope
to see discussion on it. Would like to see this as an opt-in part of the
CBOR environment. Determinism places limitations, but those are
important for the applications.
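
The numeric reduction idea ("a numeric value should have one representation if it can") can be sketched as follows. This is an illustration under stated assumptions, not the dCBOR reference code: the name `reduce_number` is hypothetical, and the integer range used (from -2^63 to 2^64-1) is assumed here to model "machine size" integers.

```python
import math

INT64_MIN = -(1 << 63)       # assumed lower bound for reduced integers
UINT64_MAX = (1 << 64) - 1   # assumed upper bound for reduced integers


def reduce_number(value):
    """Return an int when a float has an exact integer value in range.

    Negative zero reduces to the integer 0. Non-integral, infinite and NaN
    values are returned unchanged as floats.
    """
    if isinstance(value, bool):
        raise TypeError("booleans are not numbers in this sketch")
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        if math.isfinite(value) and value.is_integer():
            as_int = int(value)
            if INT64_MIN <= as_int <= UINT64_MAX:
                return as_int
        return value
    raise TypeError(f"not a number: {type(value).__name__}")
```

With this rule an encoder would write `2.0` and `2` as the same integer item, so a sender in a weakly typed language does not need to declare which numeric type it meant.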

CAl: Our code is in "community review" phase: Rust, Swift, TypeScript
(to show it can handle weakly typed languages). We'd like to see a
Python version also available. Would love to have more eyes on them.

MP: If WG were to adopt CB's draft, could you contribute to that as part
of WG? Seems like a good starting point.
WMN: Based on hearing his concerns, I'm ready to edit mine way back and
remove things CB said are out of scope. We had things in there for
people who design APIs. Whether group goes with CB's that we amend or we
do a 03 or 04, both good. Not a matter of ego, taking whatever works
best.

CAl: Another question is "Is it worthwhile? IETF doesn't do APIs".
Should we separate that out as another work? There are interesting
things about being safe with representations, where does that go? Should
we keep updating our drafts? Shift the focus on CB's draft?

LL: CB's draft is a good starting point to me, I can implement it in my
CBOR encoder/decoder.
LL: Main question is: How do you avoid subnormals? Can't control FP
hardware. It'll put subnormals in there. Seems like sticky problem.
WMN: Not a numerics expert. My understanding is that different
implementations should end up in the same encoding. We scale down, scale
up, and if it doesn't round-trip, it's not reproducible. We're following
IEEE. It's our current assumption it works that way, happy to be told
otherwise. Does that answer?
LL: Maybe.
CB: I did this in 2013 when standardizing CBOR: scale-down-and-recover
is what I did back then. But if you can't rely on your FPU to be full
754 unit, you have to do as LL did and do deterministic encoding using
shifts and masks. FPU won't meet the specs of deterministic encoding.
It's not a lot of work, like 7 lines of code that are in the standard.
But that's the least interesting part.
CB: The more interesting part is that these numbers come from
application that did computations. Then you run into problems with
deterministic encoding. Another application may have a different FP unit
and you get to different answers.
CB: What I'm most interested in around the backgrounder for
deterministic encoding is how to deal with strongly number-typed
languages: what the number from the wire is in terms of application
types. I'd be interested in feedback to include in the document.
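
The scale-down-and-recover approach mentioned in this exchange can be sketched in Python: try the 16-bit and 32-bit formats, keep the first one that converts back to the same value, and fall back to 64 bits. This is a minimal sketch and not the shift-and-mask code from the standard; `encode_float_shortest` is a hypothetical name, NaN is left out, and the conversions rely on Python's `struct` module behaving as a conforming IEEE 754 implementation on the platform, which is exactly the assumption questioned above.

```python
import math
import struct


def encode_float_shortest(value: float) -> bytes:
    """Encode a non-NaN float in the shortest CBOR float form that is exact."""
    if math.isnan(value):
        raise ValueError("NaN is not handled in this sketch")
    # Try half precision (0xf9), then single precision (0xfa).
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # finite value too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # Nothing shorter is exact: use double precision (0xfb).
    return b"\xfb" + struct.pack(">d", value)
```

Subnormal values go through the same round-trip test; for instance the smallest half-precision subnormal, 2^-24, comes out as `0xf90001`.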

WMN: Our current draft should say we rely on stable IEEE implementation,
supporter needs to have that in hard- or software. On the
type-when-decoding question: Implemented it in Swift and Rust (both
strongly numerically typed). My approach was to allow looking at it on
CBOR level, but API says "extract it as that type". So if a byte stream
comes along, a uint8 can be taken. If a float arrives on the wire, it
can't be extracted as a uint8. Going through the same array and decoding
it as doubles, that same array would work. Users of a dCBOR codec will
state the maximum precision they support.
CB: The pull-API pointer helped.
LL: Have FP impls been deterministic?
WMN: We've been using them for dates, far from zero. So no edge cases.
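
The "extract it as that type" pull API described above might look roughly like this. The sketch is in Python for brevity, although the implementations mentioned are in Swift and Rust; the function names are hypothetical, and decoded items are assumed to arrive as Python `int` or `float` values after numeric reduction.

```python
def extract_uint8(item) -> int:
    """Extract a decoded item as an unsigned 8-bit integer or fail."""
    if isinstance(item, bool) or not isinstance(item, int):
        raise TypeError("item on the wire is not an integer")
    if not 0 <= item <= 0xFF:
        raise ValueError("integer does not fit in uint8")
    return item


def extract_float64(item) -> float:
    """Extract a decoded item as a double; exact integers are accepted."""
    if isinstance(item, bool):
        raise TypeError("item on the wire is not a number")
    if isinstance(item, float):
        return item
    if isinstance(item, int):
        as_float = float(item)
        if int(as_float) != item:
            raise ValueError("integer is not exactly representable as double")
        return as_float
    raise TypeError("item on the wire is not a number")
```

Given a decoded array `[1, 2, 3.5]`, extracting every element as a double succeeds, while extracting the last element as a uint8 fails, matching the behaviour described in the discussion.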

LL: One more thing: NaNs, and NaN payloads.
WMN: Researched before spec'ing. quiet and signaling NaNs are processor
artifacts. That has nothing to do with how they're encoded. 8949 has
only single value. As we're trying for subset of CBOR with unambiguous
encoding, should only have one NaN. Happy to look at use cases, but
right now 2-byte unambiguous encoding sounds good.
CB: NaNs are interesting, you'll find such payload in internal
implementations of languages but not leaking to programming model of
them. So single-NaN is safe.
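
A single-NaN rule can be sketched as a strict float decoder that accepts only the half-precision encoding `0xf97e00` (the NaN encoding shown in the RFC 8949 examples) and rejects every other NaN bit pattern. This is an illustrative Python sketch, not code from either draft; `decode_float_strict` is a hypothetical name, and the shortest-form check for non-NaN values is omitted.

```python
import math
import struct

CANONICAL_NAN = bytes.fromhex("f97e00")
FLOAT_FORMATS = {0xF9: ">e", 0xFA: ">f", 0xFB: ">d"}


def decode_float_strict(encoded: bytes) -> float:
    """Decode one CBOR float item, allowing exactly one encoding of NaN."""
    fmt = FLOAT_FORMATS.get(encoded[0]) if encoded else None
    if fmt is None or len(encoded) != 1 + struct.calcsize(fmt):
        raise ValueError("not a single CBOR floating point item")
    value = struct.unpack(fmt, encoded[1:])[0]
    if math.isnan(value) and encoded != CANONICAL_NAN:
        raise ValueError("NaN must be encoded as f97e00")
    return value
```

Any payload or sign bit carried in a wider or different NaN pattern is thereby refused on input rather than silently normalized.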

WMN: Made a choice in editor's copy to restrict negative 65-bit integers
as they're out of machine size. Probably should be represented in
bignums. Made them invalid, current spec says they're invalid. If
argument is to do otherwise.
CB: You need them once you do bignums.
CB: Please send me a note if the backgrounder document is not describing
those, so I can add that.
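
The "65-bit negative integer" issue comes from CBOR major type 1, which encodes the value -1 - n for a 64-bit argument n, so it reaches down to -2^64 while a signed 64-bit machine integer stops at -2^63. A minimal Python sketch of the restriction follows; `decode_negative` and its `allow_65_bit` flag are hypothetical names, not part of any draft.

```python
INT64_MIN = -(1 << 63)


def decode_negative(arg: int, allow_65_bit: bool = False) -> int:
    """Map a major type 1 argument to its value, optionally rejecting
    values below the signed 64-bit range."""
    if not 0 <= arg < 1 << 64:
        raise ValueError("argument must fit in 64 bits")
    value = -1 - arg
    if value < INT64_MIN and not allow_65_bit:
        raise ValueError("negative integer needs more than 64 bits")
    return value
```

A generic CBOR decoder would behave like `allow_65_bit=True`, which is also the range that has to be covered before bignums take over for larger magnitudes.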

CA: At least for the dCBOR use case you described, there might be
storing of FP numbers (actors store the decoded value and reencode it
without storing the encoded CBOR pattern). It's ok as long as you don't
do calculation with them.
WMN: When doing a protocol where actors do calculations, it's out of
scope for dCBOR to tell how they arrive at the same result.

AOB

Interim calls through IETF 118

BL: Coordinated with the CoRE WG.

BL: Continue on even weeks, Wednesday at 1400 UTC

  • 8/23, 9/6, 9/20, 10/4, 10/18,

BL: Objections to those dates?
(none heard)

BL: Will go to list, then secretariat.