
Minutes interim-2024-cbor-18: Wed 15:00
minutes-interim-2024-cbor-18-202412111500-00

Meeting Minutes Concise Binary Object Representation Maintenance and Extensions (cbor) WG
Date and time 2024-12-11 15:00
Title Minutes interim-2024-cbor-18: Wed 15:00
State Active
Last updated 2024-12-11

CBOR working group call, 2024-12-11

Meetecho:
https://meetings.conf.meetecho.com/interim/?group=4d336d75-4dcd-4050-a98f-3ed73ab2b50e

Minute takers: Marco Tiloca

AGENDA

WG documents status and issues

CA going through the document status.

  • more-control: In IESG evaluation; IESG telechat date 2025-01-09
    • genart and secdir reviews look good; waiting for artart review

CB: On the IESG telechat of January 9th. Ongoing directorate reviews.

  • edn-literals: Awaiting outcome of Carsten and Pete's discussions.

CB (p2-3): context
CB (p4): Main topic today is application-specific literals
CB (p4): In hex we have comments, which is surprising to people.
CB (p5): Syntax for prefixed strings; application-oriented literals
dt/ip/cri.

CB (p7): idea: do all in the same way; single piece of ABNF. This ABNF
has no details on any particular literal.
CB (p8): Additional pieces of ABNF for plugging in additional syntax

CB (p9): Some implementers prefer single-level ABNF. Including
single-level as well creates risk of inconsistencies; seems best to me
to have a single future-proof form. Everyone can focus on their snippet,
not on how to integrate in the overall ABNF.
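
Illustration (not part of the discussion): a minimal Python sketch of the snippet idea, assuming a deliberately simplified literal syntax. The generic matcher knows only the `prefix'content'` shape; each prefix plugs in its own handler. The names `register`, `HANDLERS` and `parse_literal` are hypothetical, and the regular expression does not reproduce the draft's ABNF (no escapes, no uppercase prefix variants). The return types chosen for `dt` and `ip` are assumptions for the example.

```python
import ipaddress
import re
from datetime import datetime

# Registry of per-prefix handlers; the generic matcher knows nothing
# about any particular literal (hypothetical structure).
HANDLERS = {}

def register(prefix):
    def decorator(func):
        HANDLERS[prefix] = func
        return func
    return decorator

@register("dt")
def _dt(text):
    # RFC 3339 date-time to seconds since the epoch (assumed result type).
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text).timestamp()

@register("ip")
def _ip(text):
    # IP address to its packed byte string (assumed result type).
    return ipaddress.ip_address(text).packed

# Simplified shape: lowercase prefix, then single-quoted content.
LITERAL = re.compile(r"([a-z][a-z0-9]*)'([^']*)'")

def parse_literal(src):
    match = LITERAL.fullmatch(src)
    if match is None:
        raise ValueError("not a prefixed literal")
    prefix, body = match.groups()
    if prefix not in HANDLERS:
        raise ValueError("unknown prefix: " + prefix)
    return HANDLERS[prefix](body)
```

Adding a new literal then means registering one more handler, without touching the generic matcher.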

CB (p10): -14 (just submitted) proposal: Keep using snippets, point to
abnfrob tool that creates integrated grammar. From my PoV,
best-of-both-worlds (isolation and single-pass approach).
CB: abnfrob not included, so can be fixed after release.

OS: Saw tool; anyone consumed output of current version? Just to
cross-check, e.g., in case of bugs.
OS: Objections to proceeding with this approach? Want to see clear path
forward.
JH: looked at output, but did not implement. Looks like reasonable
compromise. Output more verbose than expected, will look at tooling to
compress that in tools beyond ABNF. More or less on right track, on list
for this week.
CA: Give feedback as XMAS gift.
RM: Window for reviewing next week. Committing to look by Thu next week.

CB: Promise to put in anything from JH/RM that makes output more concise.

Next steps:

CA: Will start WGLC around new year if feedback doesn't indicate
otherwise.
OS: Chairs, if feel confident in published draft version, run WGLC on
list as you would (I will moderate and help out).
CA: So it keeps the current status in the Datatracker but will check on
the list before moving forward.
OS: Yes.

CA: We'll need to evaluate the feedback and decide how to proceed.

CB (p10): almost around since 2013 but was not put into original spec to
avoid too many innovations in one step.
CB: Idea is to deflate in model, possibly also using different functions
to generate. No need to copy.
CB (p11): Version -13 is stable now, in two parts (table use and table
setup). Table use should be the one stable point; table setup and
functions are extension points.

CB: We can consider early allocation of the CBOR simple values and tag
numbers (where applicable per RFC 7120).
CB: Raised issues on the large amount of tag numbers to allocate (40 out
of 159 from the 1+1 tag space). I've documented on the list that this
construct is efficient in terms of bytes saved.
CB: Can modulate registrations in 1+1. 1+0 needs one (but only one).
CB: Why so many? tags are simple, 1 number and data item. Back then, did
not insist that we can have two-item tags.
CB (p13-14): Pattern not invented here, have 6 tag ranges already
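
Illustration (not part of the discussion): the "1+N" shorthand refers to the size of a tag's encoded head, one initial byte plus N bytes of argument. A small Python sketch of the space boundaries, assuming preferred (shortest-form) encoding as in RFC 8949; the names `SPACES`, `tag_space` and `capacity` are hypothetical.

```python
# (name, lowest tag number, highest tag number) per head size,
# assuming the shortest encoding is always used.
SPACES = [
    ("1+0", 0, 23),
    ("1+1", 24, 0xFF),
    ("1+2", 0x100, 0xFFFF),
    ("1+4", 0x10000, 0xFFFFFFFF),
    ("1+8", 0x100000000, 0xFFFFFFFFFFFFFFFF),
]

def tag_space(tag_number):
    """Return the space a tag number falls into."""
    for name, low, high in SPACES:
        if low <= tag_number <= high:
            return name
    raise ValueError("tag number out of range")

def capacity(space_name):
    """Return how many tag numbers exist in a space (assigned or not)."""
    for name, low, high in SPACES:
        if name == space_name:
            return high - low + 1
    raise ValueError("unknown space")
```

The 1+1 space holds 232 tag numbers in total, which is why a range of 40 in that space is a noticeable share.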

RM (on chat): I will reiterate my opinion that if you need more than a
half-dozen tags in the 1+1 and 1+2 space, I think you are holding it
wrong and that the cost/benefit tradeoff is poor.

CA: Relaying from the chat from RM.
RM (on chat): why not use tag with a value of a 2-item array?
that seems perfectly reasonable. CBOR was never supposed to be about
absolute minimum size.
CB: There is this idiom, and proposal to use this existing idiom.
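
Illustration (not part of the discussion): a Python sketch comparing the two shapes being debated, assuming shortest-form heads per RFC 8949. `range_tag` spends one tag number per index (the index is implied by the tag number); `array_tag` uses a single tag around a 2-item array `[index, item]`. The function names and the tag number 200 are picked only for illustration.

```python
def head(major, arg):
    """Encode a CBOR head (major type plus argument) in shortest form."""
    initial = major << 5
    if arg < 24:
        return bytes([initial | arg])
    if arg < 0x100:
        return bytes([initial | 24]) + arg.to_bytes(1, "big")
    if arg < 0x10000:
        return bytes([initial | 25]) + arg.to_bytes(2, "big")
    if arg < 0x100000000:
        return bytes([initial | 26]) + arg.to_bytes(4, "big")
    return bytes([initial | 27]) + arg.to_bytes(8, "big")

def range_tag(tag_number, item):
    # Tag-range idiom: the tag number itself carries the index.
    return head(6, tag_number) + item

def array_tag(tag_number, index, item):
    # Alternative: one tag, content is the array [index, item].
    return head(6, tag_number) + head(4, 2) + head(0, index) + item

item = head(0, 0)                  # the unsigned integer 0, one byte
short = range_tag(200, item)       # d8 c8 00
longer = array_tag(200, 5, item)   # d8 c8 82 05 00
```

For a small index, the array form costs two bytes more than a tag from a range in the same space; whether that difference justifies the allocation is the point under discussion.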

JH: Think this is a bad idiom; having done it 6x doesn't make it right;
bad precedent.
JH: IDC how many are allocated in 1+4. 30k 1+2 would be an issue. 30 1+1
is an issue, >1 1+1 is an issue. Don't think any use case warrants
limited resource. Not convinced this is only possible packing approach.
What if we want to do it the next way?
CB: Not sure why the approach is bad. Interesting about tag ranges:
requires add'l effort from implementer (dict of tag numbers doesn't
scale). That's extra work, would be happy to not have to do that. But
packed is foundational part, just lay dormant. Things would have
benefited from this, eg. CoVID health certificate.
CB: Outlined what a representation with fewer tags may look like (see mail
https://mailarchive.ietf.org/arch/msg/cbor/PfYGm4BQqkTOumSS3PXmGXiWP9s/
). Makes good use of 1+0 allocation; ran through simulation. I think
that the sketched alternative is a significant regression compared to
the current design.

JH: Compression mechanism is important; currently not implementing it
(can't comment on whether best one), will not implement.

CA (not as chair): "next time": Can spin it as "semantics based on
whatever set up the table"; so a packed-bis (maybe with no tables) would
set up its context however differently and could reuse the tags?
JH: I'm hesitant on the processing of the same tag in multiple different
ways. It might work, but I'd need to see an actual design, yet I'm
skeptical.

CB: What you are saying is what the doc already says implicitly. Concept
of "table" is rather abstract. Setup tells you what kind of compression
there is. How argument based tags reach into the table is very general.
Text can be adapted towards that direction, but it wouldn't really be a
functional change.

(some chat on 2-sized array length comparisons, and an AI28 supertag)

OS (as individual): The registry has a range for private-use tags. To
what extent is there a need to reserve tag numbers for packed CBOR?
Can't we rely more on the private range?
OS: Like, 2-layer construct. Context: alternative approach in JSON-LD
that starts with "here's a URL that gives semantics to attributes in
this data model".
CB: In JSON-LD you know where you are in the structure for doing the
right processing. In CBOR, we have to say explicitly (the references
don't have enough informative semantics).

CB: Agree with JH that tags better have specific semantics. There are
partial exceptions like alternatives (which works well with Haskell
semantics). Leads to conflicts, no longer composable, pieces may be
interfering.
CB: There are proposals for fixing original sharing tags (from Perl
context) to make this less vulnerable to composability.

CA: How is packed CBOR not endangering composability in the same way?
CB: b/c a reference is not supposed to go outside of the table setup tag
around it.
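
Illustration (not part of the discussion): a toy Python model of the scoping property CB describes, in which a reference resolves only against tables established by an enclosing setup. This is not the draft's wire format; `Ref`, `Setup` and `unpack` are hypothetical names, and the nesting rule (inner table entries placed before outer ones) is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ref:
    index: int          # position in the table currently in scope

@dataclass
class Setup:
    table: list         # items made available to references
    rump: Any           # the data item the table applies to

def unpack(item, table=()):
    """Expand references using only tables set up around the item."""
    if isinstance(item, Setup):
        # Assumed nesting rule: inner entries come first.
        return unpack(item.rump, tuple(item.table) + tuple(table))
    if isinstance(item, Ref):
        if not 0 <= item.index < len(table):
            raise LookupError("reference outside any enclosing table")
        # Table entries may contain references themselves; cycles are
        # not detected in this sketch.
        return unpack(table[item.index], table)
    if isinstance(item, list):
        return [unpack(element, table) for element in item]
    if isinstance(item, dict):
        return {key: unpack(value, table) for key, value in item.items()}
    return item

packed = Setup(["example.com"], [Ref(0), {"host": Ref(0)}])
unpacked = unpack(packed)
```

A `Ref` that appears with no `Setup` around it fails rather than reaching into some outer context, which is the composability property in question.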
CA: Could packed have a layer that ensures just that property, and then
packed is what happens when tables are set up?
CB: Seems like you're re-inventing packed. The means for setting up a
table are an extension point.

CA: Not sure how to make good progress on this.

OS (as individual): Fascinating exciting work. There is discussion on
the attitude to take with large ranges. Would like to see opinions on
the list about this topic as such. What are criteria for range allocs?
OS: Other thought: Size bounds complexity of data model instances.
Relation between cbor-packed and maximum-depth-exceeded limit of JSON
parsers. Size of allocation creates a bound, is that bound correct? More
or less objections with a different bound?

CB: Design objective: We don't want to create artificial limitations
(other than the fact that ranges are finite). Couple hundred million is
practical.
CA: IIRC, JH said he's OK with any reasonable number within 1+8 range.
JH: Right, I still care but certain amounts of new tag numbers from a
single document on a large range is fine.
OS (chat): Joe it sounds like you are suggesting guidance to DEs. Which
would need to go into an RFC, to ensure that it is persisted by our
process.

CB: Understanding what we are discussing is a pre-requisite for being DE for
the CBOR Tags registry. Not sure we need to have an RFC describing these
guidelines.

CA: To get progress, would it help JH to have a ~10-line example of what
the most exotic non-packed-like-looking thing can be that would be
allowed to "reuse the allocations"?
JH: I don't know.
CB: Careful about proof by lack of imagination.
CA: Can we show that it's powerful enough to be effectively extensible?

JH: I'm worried about someone one day saying "You missed something here,
there's a better and more efficient way".
CA: Anything coming up will have to be in the CBOR area and its
boundaries. Any -bis of packed CBOR would still allocate tags, and other
things in CBOR in general are supposed to be small.
CA: Would hope to have shown that what can be done reusing packed's tags
and what can be done with tags in general does not meaningfully differ.

CB: Could demonstrate, not prove. Any output would be ~{rather absurd}.

JH: Can we discuss the AI 28 "supertag" approach just to discard it?
JH: proposal: 28 always takes a 3-tuple (integer number, integer extra
info, anything).
CB: That's CBOR 2.0. Idea of packed is to do it with generic decoder.
The compatibility issue stays.

JH: Agreed. If problem comes up over and over, where extra byte matters
(most of the time it won't) …. See point of saving byte here. But if
pattern keeps coming up, ought to have some kind of reusable approach.

CA: Given time, that's homework for everyone.

OS: We should look at some concrete data model needing this (how many
bytes saved etc), to better understand which ranges are more appropriate
to consider for registration.
OS: Also make sure we get feedback from CBOR-using groups. To understand
if this is a good investment.

  • building a forward agenda for 2025 meeting (MCR)

(skipped, MCR not here)

CDE

WGLC now?

(skipped out of time)