Minutes IETF105: cbor
minutes-105-cbor-01

Meeting Minutes Concise Binary Object Representation Maintenance and Extensions (cbor) WG
Date and time 2019-07-23 14:00
Title Minutes IETF105: cbor
State Active
Last updated 2019-08-13

CBOR WG Meeting IETF 105 - Montreal Tuesday, July 23, 2019, 10:00 -
11:30 Chairs: Francesca Palombini, Jim Schaad

Recordings: https://youtu.be/UxwaM20zNa4
Slides: https://datatracker.ietf.org/meeting/105/session/cbor

Note takers: Christian Amsüss

* Introduction [10'] : Chairs Agenda bashing and WG status update

  Recording: https://youtu.be/UxwaM20zNa4?t=110
  Slides: https://datatracker.ietf.org/meeting/105/materials/slides-105-cbor-chairs-03
  
  Francesca pointing out Note Well and going through the agenda. Carsten
  Bormann: may need >10' for CDDL2, more like 20.
  
  Status update: interims were had and recorded. CDDL is RFC8610 now.
  Charter updated. CBOR-bis has been progressed and will be discussed
  today. CBOR array is in shepherd review.

Non WG documents:

* CDDL 2 - Collect ideas [20'] : Carsten

  Recording: https://youtu.be/UxwaM20zNa4?t=294 
  Slides: https://datatracker.ietf.org/meeting/105/materials/slides-105-cbor-cddl-cbor-tags-02
  (slides 1-16)

CDDL has been published. Done, but what next. post-1.0 was a topic in
IETF103, and items were collected in cbor-cddl-freezer. Can take things
out of it, but probably not want everything now at the same time.
Prioritize. Probably other things around as well, discuss on list. Henk:
Everything from freezer will be added onto the existing CDDL, not a
fundamental change to syntax. Anything written in past 5 years will
still keep functioning when freezer items are added. Carsten: mental
note: Pointing out that forward compatibility is important and work done
using CDDL is precious. There are extension points that can be exercised
w/o "new CDDL". WG could exercise that, eg. with other regex schemes
(eg. YANG ppl have struggle with Open... ppl where YANG prescribes W3C
syntax like CDDL, but Open... want to use POSIX (really weird!);
occasionally run into that but not lots of pressure here). The .bits is
little-endian, big could be defined as well (but requires knowing how
big it is, so a bit more work but could be done). Could also go to
bitfields. (Bitfield example is at slide 10 as per printed numbers).
Bitfield could look like this -- say in the field that there is a
particular number of bits for items in there. As ppl increasingly use
CDDL to combine new stuff with old stuff in binary form, that's a good
thing to have. But also see T2TRG work by Ivaylo on binary bitstream
stuff in YANG -- look at this before committing to anything. All those
things can be done w/o changing the language, just using extension
points. Another thing doable w/o changing the language is having
alternative representations. About JSON representation; description of a
possible serialization fits on a slide in CDDL notation. Could go ahead
and write a (informational) document to describe expressing the AST of
CDDL in JSON for interoperability. Example with three rules on slide 9
(as per printed numbers). There are many ways it could be done,
that's one.
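
A minimal sketch of the bit numbering behind the `.bits` remark above, assuming the RFC 8610 rule for byte strings that bit number n is set when `(str[n >> 3] & (1 << (n & 7))) != 0`. The helper names `bit_is_set` and `only_allowed_bits` are hypothetical and not taken from any CDDL tool.

```python
from typing import AbstractSet


def bit_is_set(data: bytes, n: int) -> bool:
    """Return True if bit number n is set, counting bits within each byte from the least significant."""
    byte_index = n >> 3
    if byte_index >= len(data):
        return False
    return (data[byte_index] & (1 << (n & 7))) != 0


def only_allowed_bits(data: bytes, allowed: AbstractSet[int]) -> bool:
    """Check that every bit set in data carries a number listed in allowed."""
    return all(
        n in allowed
        for n in range(len(data) * 8)
        if bit_is_set(data, n)
    )
```

A big-endian variant would also need the total width of the field, which is the extra work mentioned in the discussion.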

Can also put new things in the language. Cuts (for reducing set of
choices) currently only works for map keys, could extend to whole map
members or even further. Could have computed literals useful for specs
where there is structure in constants, or auto-advancement. Could have
better literal expressions for specific tags. Regexp literals have been
suggested but not urgent. Could embed ABNF so not stuck with regexp, but
use full power of ABNF. Larger projects: Could have co-occurrence
constraints, eg. if two integer items are somewhere, then one could be
required to be less than the other. Currently not doable as no
pointer/selector construct. YANG uses something like XPath for that.
Depends on how far along you want to go in the validation chain in CDDL.
Ppl came far w/o, but maybe want to do this. Another large project:
Modules. CDDL specs work together to create larger things. Could have
modules, namespaces, import/export with URIs, and versioning of modules.
Want to talk w/ YANG ppl on how they did it, trying to get it right.
"Variants": Small details where CBOR serialization is different from
JSON (eg. int/string). Would be nice to have single document to
describe. But there can still be variants from user, and then again
there are two variants of the CDDL. Can put things atop of language
(like C preprocessor, so a CDDL 1 comes out of the processor), or
inside. Both possible, will need to decide. Many projects have post
validation mechanism: Validation not only decides whether input is fine,
but also annotates. Could go beyond that by having real default values,
or adding units and other values. Could go right into the relationship
to semantics and RDF.

All of this needs to be prioritized. Proposal is to install a WG document
that serves as road map, maybe starting from a restructured freezer
document. No intent to make the document an RFC (roadmap RFCs are sth
different), this is only a running document but it's a WG document and
the WG agrees on it. My intention is to take today's output and update
freezer to get there. If there's anything you'd like to have, here's the
mike.

(Nobody). Oh, we're done -- new version is 1.0, nobody needs anything
(new ;-)

Henk breaking awkward silence: Supporting all those items b/c composed
them together. If you think any of them is vital to current work, say
"This is the very least I need". If have roadmap, we'll have sequence
and work them off. If you find sth useless, we may strike it off, or
make a sequence of it. If there's still awkward silence, we'd go to the
ppl who put in the requirements. Eg. constants w/ base and addition is
convenient thing for me and others ... and now queue is filling.

Laurence Lundblade: Ability to express CBOR in JSON is valuable, trying
to do already, already trying in EAT[?] in RATS to express claims.
Carsten: That's a different thing. You talk about one spec for CBOR and
JSON, that's different. What I said was about representing CDDL in JSON.
LL: Variant slide? CB: About single spec for JSON and CBOR. LL: Yes, that.
Jeffrey Yasskin: Ability to refer to CBOR definitions from other specs
is important. Sean Leonard before CDDL 1 discussed clearer-for-author
ways to do cuts, interested if can be accomplished in CDDL2. It's
interest, not own energy to do it. CB: Will need ppl with language
experience, possibly from outside. [...] Environment eggs? [...] CB: Ah,
import. Yasskin: High priority. CB: Also send important things to
mailing list.

Francesca Palombini not-as-chair: About keeping this CDDL freezer
updated in WG as item -- think we need to discuss that b/c there's
danger this will delay things. Helpful but may delay. Option: Have a
wiki instead to keep track of priorities etc. Just b/c writing and
updating documents takes time. CB: But wikis too. Advantage is that we
win 6 weeks per year to make changes. FP: Yes but one more active
document on table. CB: Yes, and singling it out as WG document gives it
status. FP: Sure but it just would take time to do these edits. Just
think wiki is more dynamic. LL: Just suggesting github issues with
labels. CB: Good idea and roadmap can point to them, but there needs to
be information on how things fit together. Issues stand side-by-side and
don't tell you relation. FP: Good to have it in one place, and can refer
to issues. But yes, discuss that as well.

Alexey: Discussing format of how to preserve this. Don't have to spend
time discussing this here. Happy with chairs making unilateral decisions
on this w/ consultation. Good thing about WG document is better control.
Ppl find this and are more likely to come to WG. That's positive about
having a WG doc. Other side was implying that CB has lots on his
plate... FP: Yes. Still relevant but doc expired. Wiki can still be
official. A: Fine with chairs making decision. If stays document, maybe
find another editor to help out.

Jeffrey: about features. Some more complex features make me nervous,
lots of speculative "maybe we can use this here" ... but could get it
wrong. Make sure there are solid use cases. Arrays came from somewhere,
but actually didn't pan out that way.

McDonald: prefer roadmap doc to github issues which are annoying @@@ FP
plz copy/paste

Henk: Prefer to look to documents. Can also cross-ref to github, chain.
But manual process. About time, it's a mixed argument. About
contributors and accessibility is important. Can track documents. Cf.
side meetings: hard time tracking them b/c they live on wikis.


WG documents:

* CBOR specification status [50'] : Carsten
https://tools.ietf.org/html/draft-ietf-cbor-7049bis

  Recording: https://youtu.be/UxwaM20zNa4?t=1941
  Slides: https://datatracker.ietf.org/meeting/105/materials/slides-105-cbor-cddl-cbor-tags-02
  (slides 17-44)

Items from last face-to-face that are not done yet; going through.

Error levels: 80% done but needs more work. More editorial changes to
come up.

"strict": is a confusing concept; if any concepts worth preserving,
better give them different and more specific names. There was sth about
decoders that check whether preferred encoding has been used, there was
text about security merits but they don't exist. There were others:
What part of CBOR validity checking do we factor here? Need better
terminology. "require valid" mode will always be hard to do for all
tags, as new tags can be registered. Expectation will always be that
generic decoder does some work, but some will be done by application,
and application validity can only be done by application.

On tag validity, discussed structural vs semantic. Last meeting decided
to move tags out to separate documents, but this sends a signal of
demoting tags / undermining stability of tag part of ecosystem. In
hindsight, probably don't want to do that. We don't have to. Wanted to
stick w/ structural validity but say it's ultimately an explicit
concept. Make explicit that generic decoder could present tags it
considers structurally invalid to the applications as such. App could
then implement semantic validity checking if so desired. Jeffrey: How's
that gonna show up in the spec for the higher-level application? CB:
"Tag validity for this tag works this way". J: So "even though it's
invalid, it's valid for *this* protocol"? CB: Yes

Jim Schaad from floor: Can you give examples of structurally invalid tag
that you can make work? CB: Not talking about not-wellformed. For
instance, some tags require array as contained element. In CBOR that's
type 4. Now we have array tags. An app w/ array tags could say that "you
can use this tag w/ array tags as structural component, even though
original definition only said CBOR arrays". Structurally, expect
meta-type 4 but what you get is tag-4-plus-byte-string. That's
structurally invalid but semantically can be OK. JS: Problem w/
applications that do this. Decoder will only do this for things it has
learned about. [...] CB: In specific interface, already there is a way
to present unknown tag. Could use same interface "unknown tag" to
present known tag with structural unexpectedness. Alert application to
"this is not your normal", and app that may have code for new tag can
also have code for old-but-unconventional tag. All that has to be
written up at some point...
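
A rough sketch of the "same interface" idea above, not an API of any existing decoder. `Tagged`, `present_tag`, `app_point` and the tag number 60000 are hypothetical; tag 64 is assumed to be the uint8 typed array from the array-tags work. The generic decoder converts what is structurally expected and hands everything else, unknown or unconventional, to the application in the same wrapper.

```python
from dataclasses import dataclass
from typing import Any

POINT_TAG = 60000      # hypothetical tag whose definition expects a CBOR array
UINT8_ARRAY_TAG = 64   # assumed: uint8 typed array, content is a byte string


@dataclass
class Tagged:
    """Wrapper used for unknown tags and for known tags with unexpected content."""
    tag: int
    content: Any


def present_tag(tag: int, content: Any) -> Any:
    """Generic decoder side: convert the structurally expected case, wrap the rest."""
    if tag == POINT_TAG and isinstance(content, list):
        return tuple(content)
    return Tagged(tag, content)


def app_point(value: Any) -> tuple:
    """Application side: additionally accept a uint8 typed array as the content."""
    if isinstance(value, tuple):
        return value
    if (isinstance(value, Tagged) and value.tag == POINT_TAG
            and isinstance(value.content, Tagged)
            and value.content.tag == UINT8_ARRAY_TAG
            and isinstance(value.content.content, bytes)):
        return tuple(value.content.content)
    raise ValueError("not a usable point")
```

Here the decoder never rejects the structurally unexpected item; whether it is acceptable stays a decision of the application protocol.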

Another thing about tag validity from Peter [?]: Some early tags don't
work properly b/c decoding is based on serialization order. Unless
generic decoder already knows it and keeps serialization order
available, there's no chance to decode it. Some impls always preserve
order in maps (often by accident), but if generic decoder does that, it
can be done in application, but if not (and it may), then you can't
process. That's a weird thing and we normally don't want to have it, but
it was expeditious there at that time. Should have text "it's not
entirely forbidden, but don't do that".

Had discussion about tag validity: embedded CBOR item doesn't require
anything from byte string for validity, while embedded mime requires
"valid MIME" which is complicated. But hard part is required for the
easy part. Missing guidance for defining tags. Should also look into
validity of tag 24. Good generic decoder validity check is
well-formedness of the embedded thing as validity criterion for thing
outside. Can easily be checked for being unambiguous. That's my
suggestion, discussed at interim already.
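
A sketch of what such a generic check for tag 24 could look like, assuming the usual CBOR head layout (major type in the top 3 bits, additional information in the low 5 bits). It only checks well-formedness, not validity (no UTF-8 or map-key checks). All names are hypothetical.

```python
class NotWellFormed(ValueError):
    """Raised when the bytes are not a well-formed CBOR item."""


def _skip(data: bytes, pos: int, breakable: bool = False):
    """Skip one item at pos; return (new_pos, kind).

    kind is the major type, -1 for a "break" stop code,
    or 99 for a completed indefinite-length item.
    """
    if pos >= len(data):
        raise NotWellFormed("truncated")
    mt, ai = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    val = ai
    if 24 <= ai <= 27:
        n = 1 << (ai - 24)                     # 1, 2, 4 or 8 argument bytes
        if pos + n > len(data):
            raise NotWellFormed("truncated argument")
        val = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    elif 28 <= ai <= 30:
        raise NotWellFormed("reserved additional information")
    elif ai == 31:
        return _skip_indefinite(data, pos, mt, breakable)
    if mt in (2, 3):                           # byte or text string payload
        if pos + val > len(data):
            raise NotWellFormed("truncated string")
        pos += val
    elif mt in (4, 5):                         # array items, or map keys and values
        for _ in range(val * (mt - 3)):
            pos, _kind = _skip(data, pos)
    elif mt == 6:                              # tag: exactly one enclosed item
        pos, _kind = _skip(data, pos)
    elif mt == 7 and ai == 24 and val < 32:
        raise NotWellFormed("two-byte simple value below 32")
    return pos, mt


def _skip_indefinite(data: bytes, pos: int, mt: int, breakable: bool):
    if mt == 7:                                # the "break" stop code itself
        if not breakable:
            raise NotWellFormed("unexpected break")
        return pos, -1
    if mt not in (2, 3, 4, 5):
        raise NotWellFormed("indefinite length not allowed for this type")
    while True:
        pos, kind = _skip(data, pos, breakable=True)
        if kind == -1:
            return pos, 99
        if mt in (2, 3) and kind != mt:        # chunks: definite, same major type
            raise NotWellFormed("bad chunk in indefinite-length string")
        if mt == 5:                            # a map value must follow each key
            pos, _kind = _skip(data, pos)


def is_well_formed(data: bytes) -> bool:
    """True if data is exactly one well-formed CBOR item with nothing left over."""
    try:
        end, _kind = _skip(data, 0)
    except NotWellFormed:
        return False
    return end == len(data)


def tag24_content_ok(content) -> bool:
    """Check for tag 24: a byte string holding one well-formed item."""
    return isinstance(content, bytes) and is_well_formed(content)
```

The check is unambiguous and cheap compared to something like MIME validation, which is the contrast drawn above.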

Other validity checking: good idea to check, but not all will be able
to. Mandatory checking might be a problem. Other meaning of "strictness"
could be applied, a decoder could say it's "map-validity-checking" or
not, and app developer would know that of the decoder.
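
To illustrate why a decoder might advertise whether it is "map-validity-checking", here is a minimal sketch of the step that turns decoded key/value pairs into a native map. The function name and the `check_duplicates` flag are hypothetical, not taken from any particular CBOR library.

```python
class DuplicateKeyError(ValueError):
    """Raised when a map contains the same key more than once."""


def build_map(pairs, check_duplicates: bool = True) -> dict:
    """Build a dict from decoded (key, value) pairs.

    With check_duplicates=False, a later value silently replaces an earlier
    one, so the application never sees that information was lost.
    """
    result = {}
    for key, value in pairs:
        if check_duplicates and key in result:
            raise DuplicateKeyError(f"duplicate map key: {key!r}")
        result[key] = value
    return result
```

With the check off, `build_map([("alg", "none"), ("alg", "ES256")], check_duplicates=False)` keeps only the last value; with it on, the decoder errors out instead.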

New issue (previous from -04): JSON-to-CBOR conversion not normative,
but normatively referenced by other specs. Fish stick / aquarium
situation. (JSON lacks CBOR-level information). Main issue is number
system, JSON doesn't distinguish int/float. CBOR separates them.
JSON-to-CBOR needs to make decision on how to represent integers
expressed in JSON b/c floats can exceed CBOR 64/65 bit integer range.
But as ppl usually do I-JSON, where float is stuffed in binary64. In
binary64, everything beyond 53 bits is inexact, so can't know. Recommendation:
two pieces of guidance. Users of pure JSON can detect integers and store
in number (possibly bignum). Users of I-JSON put everything into a
binary64 and see if absolute value is <2**53 and make it an int,
otherwise stay with float (Old text had several numbers).
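
A minimal sketch of the two pieces of guidance above, as they might look for a single JSON number token. The function names are hypothetical; the 2**53 threshold is the one stated in the discussion.

```python
import json
import math


def ijson_number(text: str):
    """I-JSON style: read into binary64, then turn small exact integers into int."""
    x = float(text)
    if math.isfinite(x) and x.is_integer() and abs(x) < 2 ** 53:
        return int(x)      # would be encoded as a CBOR integer
    return x               # stays a float


def pure_json_number(text: str):
    """Pure JSON style: integer literals keep full precision.

    Python's json module parses them as unbounded int, which a CBOR
    encoder could emit as an integer or, beyond 64 bits, as a bignum.
    """
    return json.loads(text)
```

For example, `ijson_number("1e2")` gives the integer 100, while `ijson_number("9007199254740992")` (2**53) stays a float because exactness can no longer be assumed.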

JS: Alexey, has IESG made statement on I-JSON vs JSON? Alexey: Think
not. Proposal seems to make sense.

CB: Most people are in I-JSON space, but there are others, and they can
benefit from this.

Major editorial ToDos that need fixing. "follows" terminology to be
removed (in favor of "encloses"? unsure.) Current text says maps with an
uneven number of items are invalid but it's deeply hidden, let's make it
explicit/redundant. Security considerations need finishing. Separate
terms for abstract data item from encoded data item. 

Minor editorial issues [did not go into detail, see slide].

[skipping backup slides]

FP: As it's hard to see from slides, but are all github issues covered?
CB: That was the intention.

FP: Now time for conversation. Anyone not happy w/ proposals?

JS: Back to JSON numbers. Said "need to decide" -- is that "app needs to
decide" or "wg needs to decide"? CB: App needs to. Or generic decoder
implementer needs to.

CB: Plan should be to use time until Singapore to complete this, and to
go through WGLC so completion can happen in Singapore.

FP: Next interim 31st of July is cancelled, next is 14th of August (?).
Would be good to have timeline for update for covering the remaining
issues. CB: Almost all of them can be covered so ppl can read before the
interim. From that there can be questions in the interim.


CB: cbor-sequence. Would like to go ahead with this. CBOR tag
definitions. array is done, and nothing came up during write-up, to go
to IESG. For OID we are chartered. Time, template etc for when we are
rechartered. When is that? Alexey: Latest charter update based on chair
text was done yesterday. Last person blocking rechartering cleared.
Couple of tweaks, but next week or so. JS: May say it's adopted [?]?
Alexey: Have my permission.

* Flextime [5'] + Wrap up

    Recording: https://youtu.be/UxwaM20zNa4?t=3404

With 30 more minutes ... anyone to the mike?

CB: Relaying one fun bit of data from last weeks: Driving licenses. Will
be on your phone in the future. Might have CBOR in it.

LL: Not sure about procedures ... discussing strict mode. FP: Go ahead.
LL: related to CBOR-bis issues slide. Made comments on github issue,
reiterating: Trying to understand strict mode, that was confusing to get
head around and figure out what as an implementer of decoder I should
do. Finally feel like having wrapped head around. Conclusion: Strict
mode portrayed as counterpart to canonical encoding. If no canonical is
there, you can do strict. Strict mode addresses ambiguity in decoding.
My thinking now is that there is variability in encoding in terms of
serialization, and particularly with map ordering and duplicates in the
map. Variation in serialization is not ambiguous b/c clear rules are
there for what integer value is (even though encoded differently). So
not much difference there that doesn't need to be covered by strict mode
b/c no ambiguity. [...] With map ordering and map duplicates, that's a
characteristic of valid or invalid. Don't see reason for strict mode to
exist. If we can have good text for valid/invalid, no need to have
strict mode any more. CB: (slide on that? page 21). 723 is confused
about this. There's a number of issues under it, and agree term should
go away b/c has no clear meaning. Two things of interest: Entities
looking at encoding CBOR items actually decode them, sometimes just
compare items in total, eg in hashing. For that it's useful to have
deterministic encoding / "canonical". Decoder will normally not have
code to check whether deterministic was used. Could have mode that
demands that deterministic encoding. That's useful of a generic decoder
-- of course can also fake it by re-encoding and comparing, so it
doesn't need to be a necessary feature of an encoder but an efficiency
thing. Other thing: Performing some validity checks is expensive, so may
be parameters to control them. Expect code implementing mime check to
have a switch for actually validating that. There may be flag to
validate UTF-8 wellformedness. May be flag for map validity. Map
validity is interesting and different from others b/c naïve
implementation might lose input values (and undefined which ones) on
duplicate map keys, and that can be used in certain kinds of attacks.
That's where application can't do validation b/c it already lost
information on the way. So that's one place where it's important to tell
the decoder to "not lose information for me" ie. "err out rather than
losing items". That's also about strict mode, but different [?]. Turns
into an enumeration of flags an app might set in a decoder, which would
include canonical checking and map validity and maybe others. May be
more to ease app's life. Eg. my decoder has flag that uses different
data type for strings in key maps and other strings, b/c that's useful
on that platform but has no place in standard. That a good way forward?
LL: Yes. One thing about nonpreferred serialization. If you receive, you
can unambiguously create preferred. JS shaking head: If you deserialize
all the way to the data model, may not be able to encode back
deterministically, b/c data-model-to-CBOR-types pops up. CB unsure:
Well, first if language like Lua that doesn't distinguish btwn array and
map, yes. Many environments do preserve enough. JS: If encode date and
go back, may not end up with the same string. LL: My comment was only on
serialization, which is different, right? If decoder could tell whether
serialization is preferred is interesting. Probably some cost to it.
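
The two points about serialization can be sketched for unsigned integers only (major type 0): any well-formed encoding decodes to one unambiguous value, and a check for preferred serialization can be faked by re-encoding and comparing. This is a sketch under those assumptions; the function names are hypothetical.

```python
def encode_uint(n: int) -> bytes:
    """Preferred (shortest) encoding of an unsigned integer, major type 0."""
    if n < 0 or n >= 1 << 64:
        raise ValueError("out of range for major type 0")
    if n < 24:
        return bytes([n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([ai]) + n.to_bytes(size, "big")
    raise AssertionError("unreachable")


def decode_uint(data: bytes) -> int:
    """Decode any well-formed major type 0 encoding, preferred or not."""
    if not data or data[0] >> 5 != 0:
        raise ValueError("not an unsigned integer")
    ai = data[0] & 0x1F
    if ai < 24:
        size = 0
    elif ai <= 27:
        size = 1 << (ai - 24)
    else:
        raise ValueError("not well-formed")
    if len(data) != 1 + size:
        raise ValueError("wrong length")
    return ai if size == 0 else int.from_bytes(data[1:], "big")


def is_preferred(data: bytes) -> bool:
    """Fake the check by re-encoding the decoded value and comparing bytes."""
    return encode_uint(decode_uint(data)) == data
```

Both `0a` and `180a` decode to 10, but only `0a` passes `is_preferred`. A decoder doing this natively would avoid the re-encoding cost, which is the efficiency argument made above.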

CB gathering more feedback CB: Jeffrey, what hurts you most? Jeffrey:
Changes so far have been good. Get back to whole thing to check whether
rigorous enough for web standard process. Good you're fixing outstanding
issues. Hope to have time to go through whole thing and double-check in
next month or two.