Minutes IETF120: cbor: Fri 22:30
minutes-120-cbor-202407262230-00
Meeting Minutes | Concise Binary Object Representation Maintenance and Extensions (cbor) WG Snapshot | |
---|---|---|
Date and time | 2024-07-26 22:30 | |
Title | Minutes IETF120: cbor: Fri 22:30 | |
State | Active | |
Last updated | 2024-08-05 |
CBOR session at IETF 120 (Friday at 15:30)
Agenda
- Intro and agenda review
- Document status in general
- Documents for discussion
  - edn-literal
  - cddl-more-control
  - cbor-packed
  - cbor-cde
  - cbor-det
  - cbor-numbers
- cddl-modules
  See also: Discussion on ABNF tooling and imports --
  https://mailarchive.ietf.org/arch/msg/cbor/BzPmdKJyM7gOlrASb2zkDmmg84g
- Dates for interim calls through IETF 121
- AOB
Notes
Minute takers: MT, CA
Intro and agenda review
Slides:
https://datatracker.ietf.org/meeting/120/materials/slides-120-cbor-chairs-slides-02
BL doing introductions.
BL: We'll first go through some document status, then more documents for
discussion. We'll also announce the plan for the next series of interim
meetings.
Document status in general
CB presenting (p5)
- time-tag in AUTH48
CB: -time-tag and -update-8610-grammar are with the RFC Editor. For the
former, we need to hear from one co-author; it might take a couple more
weeks.
- update-8610-grammar in EDIT
CB: EDITing for errata and more details
- edn-literal Waiting for AD Go-Ahead; needs discussion
CB: Left the WG, now in AD review. We'll discuss.
- cddl-more-control completed WGLC; discussion below
CB: It has the Shepherd report done and can be shipped.
- cbor-packed got more feedback based on experience; discussion below
CB: No slides, no text, but we can discuss.
- cbor-cde got mailing list discussion; discussion of -04 below
- cbor-det need to decide to adopt or send to Independent Stream
CB: We should focus on completing -cde, then we can look at the other
two documents.
CB: We decided dCBOR is not on the WG's plate, going to ISE.
- cbor-numbers; discussion below
- deterministic-cbor can go to the independent stream
- cddl-modules has a corner case related to importing sockets;
discussion below
CB: We still have to solve one item. I can report on it and we can
discuss.
- edn-e-ref, some documents using this
CB: I'll touch on it when discussing -edn-literal.
- draft-numbers, adoption?
Carsten's overview
Slides (for the rest of the session):
https://datatracker.ietf.org/meeting/120/materials/slides-120-cbor-carstens-slides-00
CB(p1): Referencing RATS meeting. Escaping hell of nested JSON: "In
CBOR, there is no escaping". May go on T-shirts :-)
CB(p2): Context for languages. CBOR as binary representation, EDN for
diagnostics of instances, CDDL as a grammar for specification of data
models (close to ABNF).
edn-literal
CB(p3-5): Meant to be used in tools, whiteboards, specs etc. It's not an
interchange format (therefore didn't even have ABNF originally). Maybe
time to cover ABNF now. Interoperable JSON is subset of EDN.
RM: Clarification, on map keys: There was no documentation prior on what
was allowed. Positive, negative integers and text strings are used. Do
we have in the wild map keys that are not integers?
CB: Certainly, e.g., the SUIT manifest specification uses arrays as map
keys. Maybe using a map as map key is weird, the equivalence is not
great and we want to avoid duplicated map keys.
CL (on chat): I am using arrays as map keys in a specification I am
writing.
CB(p5): Overview of old and new extensions.
CB(p6): Recently added/improved comments for readability. The final
comma is optionally possible to use. Also added application-oriented
literals.
CB(p7): background on application-oriented literals -- innovation of
"tags" in CBOR, worked well. In EDN you still had to write tag with its
representation (error prone), now allowing shown syntax in last two
lines. We started with date/time and IP addresses, but there are more.
We went for the "byte string" syntax.
{Side chat remarks:
RM: I always found calling new app-strings "tags" is likely to cause
confusion with CBOR tags among folks outside the WG.
OS: Do you have an alternative suggestion?
RM: app-string prefix or just prefix, or just app-string
CA: app-string prefix / prefix sounds good to me. (app-string is
probably the whole thing, prefix plus content)
AJS: +1
}
CB(p7): Extensible by registering the prefix before the single-quote
strings; not necessarily tags. This started the work on EDN, but then we
did more work on it also thinking of ABNF.
CB(p8): One example that is offered is e''. Simple mechanism, picked up
in specs. Particular example: code point allocation.
CB(p9): Another example is cri'' used for CRIs defined in the CoRE WG.
The value is the literal representation of a URI corresponding to the
intended CRI, which is otherwise intended to be encoded as a CBOR array.
CB(p10): We have to put this in relation with ABNF. Describes what it
looks like in general. What we have is generic but has to be ready for
future extensions.
CB(p11): The idea is to define any new application-oriented literal
together with an additional little piece of ABNF defining the syntax of
what is within ''. EDN parsers might not be upgraded with the most
recent prefixes, then you get a tag 999 (possibly for later processing).
For CRIs, the definition is simple, just 1 ABNF line built on importing
external ABNF definitions.
CB(p12): Grammar for base syntax, and then per prefix a syntax. The
alternative is to add more to the base ABNF for each new application
extension. That's less pluggable and harder to get it right (especially
for processing tag 999).
CB(p13): Always presented this as two-layer approach. Other people have
worked with ABNF also in the context of web linking. Definitions that
describe both how something is embedded in something and on the top
level(?) caused troubles in web linking; we had trouble with that in
6690.
CB(p13): PR #49 proposes replacement for 4 prefixes defined in this
document. Adding more will have to be done in a compatible way.
CB: Discussion was on the list; will hear them in a moment.
CB(p14): My view of pros and cons. The one-layer approach is more
familiar to people with long ABNF experience (early users and JSON
implementors) and it is easier to share. The two-layer approach enables
true pluggability and isolates the pluggable part from the base, so we can use
existing ABNF directly. In general, we should be wanting to separate
concerns by using layers.
RM: On (p12), tag 999: Seems that if somebody has support for a new
prefix, and someone else does not, they decode to different CBOR, and
are not compatible. What's the advantage of adding the tag compared to a
hard error?
CB: It's good thinking of software that evolves. Then it has advantages.
RM: Example: Test rig generates EDN documents. If converter to CBOR
doesn't fail, they won't notice it's not the intended encoding.
CB: The base configuration of each parser should be to not do that. But
an application might understand tag 999, and if that is the result, the
application might take care of what the parser couldn't do.
Application can ask parser to handle the prefixes the parser doesn't
know on its own.
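The behaviour CB describes (hard error by default, tag 999 only when the application opts in) could be sketched roughly as follows. This is an illustration only: the `Tag` class, `convert_app_string`, the `fallback_to_tag_999` flag, the handler table and the `[prefix, content]` layout inside tag 999 are assumptions made for this example, not text taken from the draft.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a CBOR tag (number plus enclosed content)."""
    number: int
    content: object


class UnknownPrefix(ValueError):
    """Raised when no handler is registered for an app-string prefix."""


def parse_dt(text: str) -> int:
    # Hypothetical handler for dt'...': RFC 3339 text to epoch seconds.
    # "Z" is rewritten so that older Python versions accept the input.
    return int(datetime.fromisoformat(text.replace("Z", "+00:00")).timestamp())


# Second layer: one small handler per registered prefix (assumed registry).
HANDLERS = {"dt": parse_dt}


def convert_app_string(prefix, content, *, fallback_to_tag_999=False):
    """Convert prefix'content' into a data item.

    By default an unknown prefix is a hard error; the tag 999 fallback is
    used only when the application explicitly asks for it.
    """
    handler = HANDLERS.get(prefix)
    if handler is not None:
        return handler(content)
    if fallback_to_tag_999:
        # Assumed layout: tag 999 wrapping [prefix, content] for later processing.
        return Tag(999, [prefix, content])
    raise UnknownPrefix(f"no handler registered for prefix {prefix!r}")
```

With this shape, the test rig in RM's example keeps failing loudly unless it passes `fallback_to_tag_999=True`, while an application that understands tag 999 can defer the unknown literal to its own processing.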
PR: Preamble: EDN is just fine. Rohan asked me to look at proposed ABNF.
Look at (p14), pros and cons.
PR: All of this confuses implementation with documentation. Nothing
about proposed ABNF can not be implemented in 2-layers. Documentation
will have sub-sections for sub-types of literals; introducing a new one
introduces a new literal. Only adds app-string =/ new.
PR: The downside of the 2-layer approach is that things that expect ABNF
to be single and complete will fail. This is not valid ABNF because
those EDN literal subtypes don't attach to anything, they're dangling
productions.
PR: CB pointed out something about 1997 ABNF. But we have 2234; old ABNF
was parser description, but we now have a new ABNF grammar that's lousy
for going instantly from grammar to implementation.
PR: Don't play games with ABNF to create something that would play nice
with your implementation. Use ABNF as is, and you can get just what you
want.
PR: As far as other RFCs referenced, there is buggy ABNF in there (for
convenience).
PR: Over-all, using the shown ABNF, you can do the same but have a
complete ABNF of the grammar, which is not the case now in your
document.
CB: Interesting philosophical PoV; reality: tools don't work well when
modifying your ABNF in-flight. The tools expect the complete ABNF to be
parsed and then converted to a program using it.
BL: When you say modify-in-flight, do you mean there is no good support
for or-equal?
CB: I mean you have a tool that used compiled ABNF, and then add a
plug-in that adds a specific application-oriented literal. ABNF
implementations I know require recompiling. That reality makes me averse
to the philosophical model.
CB: "Missing link" pointed out is a very simple relationship. Prefix
goes into name of the top production of the application-oriented
grammar. There the link is done. Have to do that with any ABNF, you
always need to specify a starting point. So those missing links are
normal.
CB: As for 8288 criticism, for me that's maybe a bit too
non-traditional, but they found 1-layer really really doesn't work for
them, and now using crutches for 2-layer approach. Would be nice if ABNF
had syntax for that.
PR: So: why don't we fix the tool instead of fixing ABNF into something
it is not desired for?
CB: Still have to recompile. What you describe as complex is trivial in
real-world ABNF impls. Recompiling ABNF each time a new literal is
introduced is more complicated. I'd not support it. I wouldn't write a
new parser generator to support a philosophical model.
RM: 1. Surprised to hear about re-compiling vs. not doing it. Why is it
a problem to compile at all?
RM: 2. I read ABNF, then that character didn't make sense, I sent a
mail. I assumed it was a single-layer ABNF, that's the case in other
specifications. Won't be last person to make that mistake.
RM: 3. PR #49 gives combined ABNF, but also gave ABNF compatible for the
other app-string prefixes (except CRI). Can write up the rest for CRI as
well quickly if desired.
CB: So we have not described this well enough, it's an editorial problem
and we can add more text to fix it. If you add a plugin you have to
compile the plugin (a few lines of code only), but not the base, that's
a big difference.
CB: You didn't do the CRIs, clearly extensions start as unknown.
CB: 2-layer can use single-production IRI-reference from 3986.
RM: But says "copy and fix"? Old ABNFs are broken. You can't blindly
paste them. You have to understand what you are importing and re-write
it.
CB: In this case we had to fix missing imported productions, but that
happens all the time with ABNF.
RM: So it's a single-layer thing(?). So answer should be "you can't do
it". This has a relevance and impact that go beyond the WG and should
engage the IETF at large.
CA: Practical question. Fitting in, say, date-time ABNF. With 2-layer I
can use existing ABNF on 2nd layer. With 1-layer, wouldn't that ABNF
need to be different from original ABNF b/c it takes care of escapes?
SEDATE's date-time ABNF should be good to copy-paste in. Needs a lot of
extra work to escape every single apostrophe into the escaped version.
PR: The documentation should not be driven by an implementation model.
RM is right, and he was planning to do a single-pass implementation.
Someone has to do extra work. If you take my approach, CB needs some
additional code. If you take CB approach, others need some additional
code. My point is not changing ABNF in itself and keep it
implementation-independent. I don't buy that ABNF as-is prevents
convenient implementations, it just requires some more work, and of
course one can't just copy and paste.
CB: Annoyed by assertion that this is changing ABNF. ABNF is used as
designed. Just used by combining multiple parsers, but fits the problem.
CB: Most important, design is shaped by deployment and by how it will be
used to solve people's problems. This requires thinking about
implementations. Whole point of this mechanism is to ease adding stuff.
Not just implementation (one more parser), but also for people who write
spec. No sane person will take IRI reference and try to parse it through
single-pass. People will just feed it to URI parser and it will subtly
break but mostly work. Two-pass increases likelihood that extensions
will work.
RM: On having to do work, I have an example.
RM: On escaping. You may need to add that escaping, but that's a
feature. Chance of people doing single-pass where they miss that
something is high. Someone writes app-string with unescaped backslash,
converting that to single-pass ABNF shows that.
CA: That would mean that someone might build a JSON parser, and then
start plugging a URI parser into that. Do you think that will happen?
RM: JSON is inextensible, right?
CA: People put extensions in strings.
RM: If you want to add an upstream prefix with a backslash, you need to
have a correct grammar to make it right.
RM: That's how ABNF has been used at IETF, and has been done even where
number of things added and variety of things added has been substantial.
This has not been a problem for the Internet community, so not sure why
for this WG.
CB: Following CA's example, you'd argue that any application that uses
JSON to keep URIs should plug a URI parser into its JSON parser?
RM: JSON is not extensible. Have to treat it like a fixed black box. If
you have syntax coloring in your editor for JSON, syntax colorer will do
the right thing w/rt the JSON. What you describe for ABNF will not be
correct for a single-pass ABNF.
BL: Have to move on; hoped to resolve this, not yet resolved. Will post
something on the list.
(CB note to self b/c editor open: ABNF would be broken if something
inside app-literal referenced the floating items, but it doesn't. if it
did, then it'd be broken)
cddl-more-control
(Shepherd: AJS)
CB(p15-16): This is about extending with control operators for CDDL
processors. It started with b64,b32,... encoded byte strings, especially
if used for describing JSON. Distinguishes many variants for base..
encodings. The printf part covers what would otherwise be many control
operators. Got quite some feedback, also from the work on deterministic
encoding.
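To make the base-encoding controls concrete, the check behind a base64url-style operator can be sketched as below: a text string matches if it is exactly the encoding of the given byte string. The function names are hypothetical, and the sketch assumes the strict, unpadded base64url form; it does not reproduce the draft's operator definitions.

```python
import base64


def b64u_encode(data: bytes) -> str:
    """Encode bytes as base64url text without '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def matches_b64u(text: str, data: bytes) -> bool:
    """Strict check: text must be exactly the unpadded base64url form of data."""
    return text == b64u_encode(data)


def b32_encode(data: bytes) -> str:
    """Encode bytes as base32 text without '=' padding."""
    return base64.b32encode(data).rstrip(b"=").decode("ascii")
```

For example, `matches_b64u("-_8", b"\xfb\xff")` holds, while the padded form `"-_8="` and the classic-alphabet form `"+/8"` are rejected by this strict variant.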
CB(p17): Had WGLC in March, and made fixes on strictness requirement.
Now we have the Shepherd write-up done. I think it's ready to be
shipped.
RM: does .json escape/unescape double quotes?
CB: Yes. (??? see recording for details)
BL: If no other comments come in, we're done here.
CB: Timed implementing it, 2h45min, mainly b/c base32 library was
broken.
LL: I used this to verify CDDL in EAT through base64 and into JSON. It
worked well. It took me more than 2 hours to do that, but it's good.
BL: Moves ahead.
cbor-packed
CB(p18): You know where to find data compressors in general.
CB(p18): In constrained, hard to do b/c copying, thus interesting to
have sth that is like compression but operates on data in packet buffer.
Thus easier to use compressed data.
CB(p19): The draft defines reference sets and a number of tags. It also
defines the shared semantics, and function tags to extend the reference
semantics.
CB(p20): You might have an unpacking error to be handled at the
application. You may have an API for that, or rely on a standard
representation. We have to look into that.
CB(p21): To be progressed independently, a draft by CA proposed an
alternative way to set up tables by reference.
CB: This draft needs some polishing but it should be largely done.
CA: I got no feedback on which parts still need work.
CB: The timeline would be this summer.
CA: Please say if you see any issue in starting a WG Last Call; if none,
we'll start it.
{ minutes cleanup: merge above
CA: During interims, had "not ready for WGLC hands", asked around, but
got no comments on non-ready parts.
CA: Only open question I'm aware of was whether by-reference should be
merged in (as author: don't care, CB had weak negative preference) --
that all? Please tell if you want merge.
CA: Not aware of any other blockers -- anything else keeping it from
WGLC?
}
CB: How many have read the draft?
AJS: I've read it recently, got excited, no objections to WG Last Call.
CB: We can fix anything that we find during WG Last Call, and there is
the open issue.
CA: Thus the "once unpack-error is fixed" in the question; we'll start a
WG Last Call then.
CL (on chat): I concur with AJS, including not having read the latest
draft. (But have read previous drafts.)
cbor-cde
CB(p22): When CBOR started, we were not interested in supporting
deterministic encoding, but we had it as a desired feature. But anything
we defined was at the serialization layer, leaving a lot open at the
application layer. Right now, generic deterministic encoding is not
standardized.
CB(p23): Triggered by work on deterministic encoding, we tried to cover
what was left. An application has to decide on the exact meaning of the
information to encode in CBOR. Defensive mechanism: Even though people
can build specific application profiles when unhappy with CDE, it's
better to give a single architecture as robust ground to build on.
CB: I understand there is the desire to not give too much space to
application profiles. Point is that if this emergency valve is not
there, CDE universality is damaged.
LL: Background: There's CBOR (no serialization details specified, just
core rules like 'no duplicate map keys'), on top of that 'preferred
serialization' (or 'base serialization'?). For this discussion,
preferred is important. For SenML, TEEP, CWT, you want everyone to use
preferred to interoperate as a minimum. Good for 99%. That 1% needs
determinism, eg. dCBOR and internals of COSE where structures are
derived (not payloads). dCBOR is incredibly specific use case (so much
that it's not in the WG). If we introduce profiles, why do we do it for
1% and not all use cases? Why is not CDE another profile?
CA (on chat): Another item in the 1% is EDHOC.
CB: It's not a profile b/c you don't want other profiles to exist at
this level.
LL: Not too worried about that b/c only real example has been dCBOR
(which is reasonable to do for a specific use case). But in 10 years, we
have not had a bunch of dumb proposals. (...): I'm just not too enthusiastic
on adding yet another concept, especially since it seems so rare to come
up with something like these profiles.
CB(p24): Also on the topic of map keys ordering. Preferred serialization
is a bit too weak b/c it does not exclude indefinite length. Now have a
hierarchy: preferred, basic, CDE, and on top of that application
profiles.
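As a rough illustration of the serialization rules being layered here (shortest-form heads, definite lengths only, and map keys sorted bytewise by their encoded form, as in RFC 8949 Section 4.2.1), a minimal Python sketch follows. It covers only integers, text strings and maps, the function names are hypothetical, and it is not an implementation of the CDE draft.

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head using the shortest form that fits the argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | additional_info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in a CBOR head")


def encode(item) -> bytes:
    """Deterministically encode ints, text strings and maps (dicts)."""
    if isinstance(item, bool):
        raise TypeError("booleans are outside the scope of this sketch")
    if isinstance(item, int):
        if item >= 0:
            return encode_head(0, item)      # major type 0: unsigned integer
        return encode_head(1, -1 - item)     # major type 1: negative integer
    if isinstance(item, str):
        data = item.encode("utf-8")
        return encode_head(3, len(data)) + data  # definite-length text string
    if isinstance(item, dict):
        pairs = [(encode(k), encode(v)) for k, v in item.items()]
        # Sort entries by the bytewise lexicographic order of the encoded keys.
        pairs.sort(key=lambda kv: kv[0])
        return encode_head(5, len(pairs)) + b"".join(k + v for k, v in pairs)
    raise TypeError(f"unsupported type: {type(item).__name__}")
```

Two maps with the same entries in different insertion order encode to the same bytes here, which is the property that would be given up in a relaxation like the one CL describes for map ordering.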
CL: I am currently using CDE for variability reduction, exactly as this
says. But, we relax this to not require map ordering because it can be
used to communicate useful (but not required) information. So this is
valuable. Not sure if it warrants an application profile. Let's find out
more on the list.
cbor-det
(out of time at this point)
cbor-numbers
cddl-modules
See also: Discussion on ABNF tooling and imports --
https://mailarchive.ietf.org/arch/msg/cbor/BzPmdKJyM7gOlrASb2zkDmmg84g
Dates for interim calls through IETF 121
Proposed dates, coordinated with CORE:
- 21 Aug, 4 & 18 Sep, 2 & 16 Oct
- Same cadence, continuing as we were.