Minutes interim-2025-cbor-01: Wed 15:00
minutes-interim-2025-cbor-01-202501081500-00
Meeting Minutes | Concise Binary Object Representation Maintenance and Extensions (cbor) WG | |
---|---|---|
Date and time | 2025-01-08 15:00 | |
Title | Minutes interim-2025-cbor-01: Wed 15:00 | |
State | Active | |
Last updated | 2025-02-17 |
CBOR working group call, 2025-01-08
Meetecho:
https://meetings.conf.meetecho.com/interim/?group=937ce6ec-8189-4915-ae78-95456198724e
AGENDA
(CB's slides p2 doubling as agenda slide)
WG documents status and issues
CA doing introductions
status more-control comments (very brief)
IESG telechat: 2025-01-09
- PRs from IESG and Directorate feedback (#12 (merged; WG: OK?), #13
(merged))
- Artart (Darrel Miller) .decimal discussion (.base10 ?)
CB: Uneventful so far.
CB: We have some PR on Github, some of which merged. I plan to send out
-08 with all the changes in.
CB: Those who care, please look at repository; don't think anything is
controversial
MID-WGLC discussions about EDN (up to 45')
WGLC until 2025-01-15
CB: We also had -15 with most updates. Then -16 followed, which also
includes a PR that I forgot to merge.
Processed into -16:
- Comments from Rohan Mahy and Michael Richardson, in several PRs (e.g.
#70) merged
- mandatory space; no falsetrue1trueh'00'
- structure of the document
- smaller editorial changes
RM: Would be nice if PRs sat a bit longer before getting merged (24h?).
CA: Even for more editorial things?
RM: No, only for substantial changes.
CA: Agree, it would be good.
CB: Technical change in MSC was significant, needs more look -- but also
useful to have in document. Don't consider them cast in stone.
RM: Yet it's good for the PR to stay up so that people can add comments
to it.
JH: Also, I need 24h before interim for doc. Wake up to 2 diffs and
slides. Deadline needs to be "yesterday".
→ CA: nudge for slides earlier
Open topics remain around:
- overescaping troubles single-layer ABNF processing (h'0\u0030')
  - Are there use cases for escaping beyond quotes? (sub-topics
    CRIs, ASCII-only)
  - Joe's Proposal thread: require app strings to have "\" char
    escaping processed by app
  - Christian's proposal mail: forbid over-escaping for non-control
    ASCII in any sqstr
- comments in app strings, in particular h and b64 (h'00 / 🎵 / 01')
- encoding indicators:
  - extensibility of AIs (1234_fjord)
  - consistency of position
- 2-pass vs 1-pass being the primary
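To make the over-escaping topic concrete, here is a minimal sketch of the "cooking" step applied to the content of h'0\u0030'. It is an illustration only: the function names, the escape table, and the restriction to four-digit \u escapes are assumptions, not text from the draft.

```python
import re

# One escape sequence: backslash + 'u' + 4 hex digits, or backslash + any one character.
_ESCAPE = re.compile(r"\\(?:u([0-9a-fA-F]{4})|(.))", re.DOTALL)

# Single-character escapes: JSON's set plus the single quote (assumed for this sketch).
_SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
           "\\": "\\", "'": "'", '"': '"', "/": "/"}


def cook_escapes(content: str) -> str:
    """Replace each escape sequence with the character it denotes."""
    def replace(match):
        if match.group(1) is not None:
            return chr(int(match.group(1), 16))
        try:
            return _SIMPLE[match.group(2)]
        except KeyError:
            raise ValueError(f"unknown escape: {match.group(0)!r}") from None
    return _ESCAPE.sub(replace, content)


def decode_hex_content(content: str) -> bytes:
    """Cook the content of an h'...' literal, then decode it as hex (comments not handled)."""
    return bytes.fromhex(cook_escapes(content))


print(cook_escapes(r"0\u0030"))        # 00
print(decode_hex_content(r"0\u0030"))  # b'\x00'
```

The point of the example is that the hex grammar never sees two plain digits in the source text; it only sees them after cooking, which is why a single-layer ABNF has to account for every escaped spelling of every character.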
Presentations:
JH (p2): Shocked to find 1st line to be valid. Makes abnfrob output
pretty big. Envisioned new instance of parsing process, which is
undesirable.
JH (p2): Can't use existing parsers for h and b64 due to comments. ip
and dt copied in ABNF from RFCs; existing IP parser may or may not use
RFC form, will need to double-check for each tool.
JH (p2): 1-pass needs cooking first for Unicode. But error offsets are
in wrong place then without source mapping (which is overkill).
JH (p3): What abnfrob does.
JH (p4): For app strings in here, no escapes needed; needed in comments
of h and b64. Considering them undesirable.
JH: For CRI, if I use CRI in document that can't do Unicode, there are
not too many such formats any more, and not well readable. Don't
understand use case.
JH: Any all ASCII format to support?
CA: Internet Drafts may not include all-Unicode characters.
JH: Authors know what format is used and can select a good set of
characters.
CA: Unless they want to show limitations. It's also convenient for
displaying purpose, and that's when a diagnostic notation is also
intended to help.
JH: That's a reasonable argument.
CB: Not different format, just different selection of character
repertoire.
JH (p4): Some JSON processors do conversion. Should we specify it?
CB: Concretely on \t, JSON must emit it that way. More generally,
coming out of JSON, we ~implicitly inherit its patterns, eg. ability to
express any JSON as ASCII, would like to keep that.
JH: For comments and h and b64, ability to have escapes is fine, not
processed semantically. But can't be processed in 1st pass, has to come
through.
RM on JSON lineage: JSON only has double quoted strings. Need to process
any double quoted string that's in JSON, but don't need to do app
strings that also have the same restrictions.
RM proposing: Non-comment data in single quoted string, whatever is
data, does not get escaped.
CA: That proposal has only been out there since shortly before the
meeting, also in CB's slides.
CB: ? (invariants).
RM: You can select the grammar to ensure that it does not happen for
your app-string.
CA: That assumes that grammars have their explicit definitions for
app-strings.
RM: It's about saying how app-strings work.
JH: If CRIs don't have their escape mechanism, you need one.
CA: That mechanism is in place.
CB: We don't fully understand discussed things, some don't even have a
name.
OS (on chat): +1 to creating some spanning examples, and naming them...
I think that will help ground the conversation.
JH (p5): Building blocks for app strings, raw text passed on.
JH (p7): Proposed related changes to ABNF.
CB: I'd call that 'raw mode'; clarifying: apply raw mode to all prefixed
and no unprefixed strings.
JH: That's correct.
JH (p8-9): more building block, and examples.
CB (p7): There are objectives for EDN, not fully spelled out. Some are
just obvious for people familiar with this space. EDN is meant to be
used beyond machine-to-machine interchange.
CB (p8): On invariant, a goal is to rely on a minimal source character
repertoire, that's why the escaping capability. Like JSON, forbid
control characters in strings (exception for new lines). Aim for less
protection for %x80-10FFFF in Unicode.
CB (p9): Joe has a proposal for pluggable application-extensions (see
his slides). Better to be prudent and not totally open on this point.
RM (on p8): (?)
JH: Those are reasonable requirements.
RM: Not entirely sure what "minimal source character repertoire"
means. Which escapes get used?
CB: That's for the tool to decide. The tool's decision is not an
invariant. When we design this, it is an invariant to follow.
CA: It's a design constraint.
RM: "Invariant"?
CA: "Design constraint" might be better term.
RM: If "you should be able to express any CBOR with limited character
set", that's a fine constraint. Then what is the rest about?
CB: The rest is about what the processing environment is going to do to
your document; machine translatability between the character
repertoires.
RM: So 1st bullet is design constraint; 2-3 are not constraints or
invariants.
CB: 3rd is a relaxation.
RM: Right, that's the source of confusion.
CB (p10): comments on Joe's proposed changes
JH: One more restriction on the grammar for app-string is that they
cannot accept an unescaped single quote.
RM: …or unescaped backslash.
JH: Correct.
CB: Think of a process that extracts the string in raw mode.
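A minimal sketch of such an extraction process, assuming the two grammar restrictions just stated (no unescaped single quote and no unescaped backslash inside the content). The function name and return shape are hypothetical; the content is handed on verbatim, with its escapes left unprocessed for the application.

```python
from typing import Tuple


def extract_raw(source: str, open_quote: int) -> Tuple[str, int]:
    """Return (raw content, index just past the closing quote) of a single-quoted string.

    A backslash always protects the character after it, so an escaped quote
    does not end the string. Nothing is unescaped here.
    """
    if source[open_quote] != "'":
        raise ValueError("expected a single quote at open_quote")
    i = open_quote + 1
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            if i + 1 >= len(source):
                raise ValueError("dangling backslash")
            i += 2  # skip the backslash and the character it protects
            continue
        if ch == "'":
            return source[open_quote + 1:i], i + 1
        i += 1
    raise ValueError("unterminated single-quoted string")


content, end = extract_raw(r"h'0\u0030' 1", 1)
print(content)  # 0\u0030
print(end)      # 10
```

Under this reading the extractor needs no knowledge of the prefix; whether h or b64 then accepts the backslash sequence is left to the per-prefix grammar.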
CB (p11): Some alternatives: raw mode is for all prefixed, and cooked
mode is for unprefixed; raw mode only for h'' and b64'' (not much
support) and others cooked; give a choice, but how? idea from CA (see
next)
JH: Then I'd still need to abnfrob date and IP.
CB (p11): Choice would be an option, but unclear how that would work.
CB (p12): CA's proposal. Always use cooked, but change single quoted
syntax to allow fewer Unicode escapes: disallow \u escapes for
printable ASCII except \ and \'. We'd keep abnfrob, but output is more
manageable.
CB: and mine / 3rd proposal is the draft.
JH: Is this just for prefixed or also unprefixed single quoted?
CA: I think it makes sense for all. I don't care too much.
JH: I'd have to think a bit about it, but it feels reasonable.
JH: Could you send to the list some updated output?
CB: Sure.
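As one possible reading of CA's proposal, the restriction could be checked as sketched below. The range %x20-7E for "printable ASCII", the function name, and the limitation to four-digit \u escapes are assumptions made for illustration; the proposal itself expresses the rule in the ABNF rather than in a separate check.

```python
import re
from typing import List

# One escape sequence at a time, so that an escaped backslash followed by
# "u0030" is not mistaken for a \u escape.
_ESCAPE = re.compile(r"\\(?:u([0-9a-fA-F]{4})|.)", re.DOTALL)

# Code points that may still be written as \u escapes: single quote and backslash.
_ALLOWED = {0x27, 0x5C}


def find_overescapes(content: str) -> List[str]:
    """List the \\u escapes in single-quoted content that encode printable ASCII."""
    found = []
    for match in _ESCAPE.finditer(content):
        if match.group(1) is None:
            continue  # a simple escape such as \' or \\
        code_point = int(match.group(1), 16)
        if 0x20 <= code_point <= 0x7E and code_point not in _ALLOWED:
            found.append(match.group(0))
    return found


print(find_overescapes(r"0\u0030"))    # ['\\u0030']
print(find_overescapes(r"it\u0027s"))  # []
```

With such a rule, h'0\u0030' is rejected outright, so the hex and base64 grammars would not need to enumerate escaped spellings of their digits.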
OS (as an individual): It's really hard to understand these proposals
without concrete examples that show what we gain or lose. Examples
should be related to common cases that we encounter (Unicode, emojis,
...).
CA: Starting pad in
https://notes.ietf.org/notes-ietf-interim-2025-cbor-01-cbor--examples
- optional commas (RH)
CB: You can get an EDN implementation of -14 that parses what is in
slide 6. Good to look at the JSON syntax again, requiring space & comma
wherever there is a comma.
JH: Love the goal of requiring space; don't like 1st example because
comma is at beginning of list.
CA: The encoding indicator is on the '0'.
JH: Right, I misread it. Don't like [,something] or [,,]
CB: but that's not in it.
RM: See comments at
https://github.com/cbor-wg/edn-literal/pull/74#issuecomment-2577940734
RM: Two questions: a) (?)
RM: b) Is only-comment a separator?
RM: Proposed alternate cleaner version of ABNF, it can work with either
choice of comment-as-separator.
CB: I'm used to languages with a lexer. We expect that comments break a
lexical item and start a new one. Not my favorite thing, but it should
be allowed as possible.
CB: I'd prefer [2/foo/3] to be possible.
RM: Disagree, but won't die on that hill
JH: Same as Rohan, fine to move on.
- encoding indicators
JH: If normal mechanism is to error on them, they ossify, then it's not
worth having an extension point.
CA: Impression is that they are ignore-when-not-recognized
JH: Then there should be tests/examples.
→CA: send out pad link
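The two behaviours under discussion (error on an unknown encoding indicator versus ignore it) can be contrasted with a small sketch. It handles only unsigned decimal integers and treats only the _0 to _3 indicators from RFC 8949 as known; the function name and the strict flag are hypothetical.

```python
from typing import Optional, Tuple

# Indicators assumed to be recognized in this sketch (RFC 8949, Section 8.1).
KNOWN_INDICATORS = {"0", "1", "2", "3"}


def split_encoding_indicator(token: str, strict: bool = False) -> Tuple[int, Optional[str]]:
    """Split a token such as '1234_fjord' into its value and its encoding indicator.

    With strict=False an unrecognized indicator is dropped (ignore-when-not-recognized);
    with strict=True it is an error, which is the behaviour that would ossify the
    extension point.
    """
    number, separator, indicator = token.partition("_")
    value = int(number)
    if not separator:
        return value, None
    if indicator in KNOWN_INDICATORS:
        return value, indicator
    if strict:
        raise ValueError(f"unknown encoding indicator: _{indicator}")
    return value, None


print(split_encoding_indicator("1_1"))         # (1, '1')
print(split_encoding_indicator("1234_fjord"))  # (1234, None)
```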
CDE: check status
postponed
Can we WGLC now?
(Any further action needed on PR #10 and Issue #22/PR #23?)
Packed CBOR: Status
postponed
1st WGLC 2022, refinements since
5 normative references from other drafts
Now: Early allocation request triggered fundamental discussion
AOB
Note taking: (volunteer here)