
Minutes interim-2023-cbor-09: Wed 14:00
minutes-interim-2023-cbor-09-202305311400-00

Meeting Minutes Concise Binary Object Representation Maintenance and Extensions (cbor) WG
Date and time 2023-05-31 14:00
Title Minutes interim-2023-cbor-09: Wed 14:00
State Active
Last updated 2023-06-27

CBOR working group conference call, 2023-05-31
Meetecho:
https://meetings.conf.meetecho.com/interim/?short=872f9f0d-9c33-408e-ba89-65d4e83980e3

CBOR use in IETF and other SDOs

BL starting the meeting.

CAl: Updates on dCBOR / latest draft, then next steps.
WMN: Was advised that API recommendations are not too important on IETF
side, so moved them to a less important part of the document. Many
clarifications. Updated a section on numerical reduction. Also want to
check DISPATCH recommendation (next point). Does the group want to take
that? Then move on to numerical-reduction topic.
CAl: What are the next steps? Is there interest on this document from
the community towards an RFC?

CB: The document is really interesting, but it contains many different
things, we'll have to disentangle them. I'd like an API discussion, but
the result would be a different document than a normative one about a
so-called encoding profile. Tutorial elements of the document may be
useful, but maybe not in IETF process. The authors likely want to find
people to cooperate with about specific aspects/components to be
identified and worked out here.

CAl: Are you interested in helping us identify the standardizable part?
Can do whatever is necessary to meet group requirements, but also
welcome help with that.
CB: Sure, time is limited but I'm interested in helping doing the
extraction to converge towards a Standards Track document.

WMN: Which sections are you referring to as tutorial-style?
WMN: Motivation for API recommendations is that goal of determinism can
not be supported by protocol only, but needs to be supported by how
codec is used.
CAl: Would we just refer to another document for the choices there?
BL: It's frequently done, like referring to a wikipage or an
informational document.
WMN: Does CBOR WG feel like developing a deterministic profile is in
confluence with [...]?

CAl: What's the process for accepting a WG item?
BL: Discussion is a start. When chairs determine it's at a point for a
WG adoption (WGA) question, will post that. Quick poll -- so far, any
objections to taking this on?
CB: Don't know yet what "this" will be. Should have discussion on
whether the approach of mapping floats to ints is really worth having as
a separate role in the CBOR ecosystem, but that's easier to decide when
written up. (Currently section 3). Section 4 is maybe a different thing,
that's "how to design a CBOR protocol", some is applications, some is
API. I don't see a standards track document coming out of the latter.

BL: The next step is for CAl, WMN and CB to figure out how to
split/reorganize the document.
CAl: WMN and I will do a day in SF, will you be there?
CB: No, but if you can enable remote presence, will join.

WMN: Primary issue w/ numerical reduction is support for languages that
don't have hierarchy of types (JS, Ruby). There it's harder to work with
fixed size numerical types. Ability to present dCBOR value to API should
not be burden on engineer. Should be on the codec.
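
As an illustration of the numeric reduction WMN describes (not presented
at the meeting), here is a minimal Python sketch in which the codec, not
the caller, decides whether a number goes out as a CBOR integer or a
float. The function names are hypothetical, and the rules are an
assumption for illustration rather than the dCBOR draft's exact text: a
finite float with an integral value in the 64-bit CBOR integer range is
reduced to an integer, and every other float is written as a 64-bit
double (a real profile would also pick the shortest float form).

```python
import math
import struct


def encode_head(major: int, argument: int) -> bytes:
    """Encode a CBOR head (major type plus argument) in its shortest form."""
    initial = major << 5
    if argument < 24:
        return bytes([initial | argument])
    for additional, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([initial | additional]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_number(value) -> bytes:
    """Hypothetical helper: encode an int or float with numeric reduction."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("expected an int or a float")
    # Assumed reduction rule: an integral float in range becomes an integer.
    if (
        isinstance(value, float)
        and math.isfinite(value)
        and -(2**64) <= value < 2**64
        and value == int(value)
    ):
        value = int(value)
    if isinstance(value, int):
        if value >= 0:
            return encode_head(0, value)      # major type 0: unsigned integer
        return encode_head(1, -1 - value)     # major type 1: negative integer
    # Simplification: non-integral floats are always written as doubles.
    return b"\xfb" + struct.pack(">d", value)
```

Under these assumptions, `encode_number(2)` and `encode_number(2.0)`
produce the same single byte 0x02, so a JS- or Ruby-style caller with one
number type gets the same bytes as a caller with distinct int and float
types.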

CAl: we're planning to apply for a side meeting on the higher-level
protocol in SFO. I saw a requested slot for CBOR. Any time or agenda
yet?
BL: We'll work on the agenda in two weeks.
CAl: We'll make the changes and the split you recommend and wish to see.

CA: What kind of weak typing is planned to be supported? Why cut at int/float?
Comparing to Python2, tstr/bstr would be mixed. How would those be
helped, and how would that method be applied to int/float?

WMN: It would benefit all languages. You present whatever you have to
the encoder and it will take care of the rest, considering the specific
language at hand.
CA: I see what this does, but then why keep the distinction between
certain types? Why precisely here?
WMN: Many languages have a stronger typing w/rt strings; not so much for
numbers. Issue with numbers is distinct. When you can distinguish, you
have to. Numbers are just so common.
CAl: Where these are being used, there are a lot of JavaScript challenges
(JWT), and people start needing a schema and things get complicated.
Trying to stop bringing in semantic information or schema data. Not
against schemas, but causing lots of challenges. Users in Blockchain
Commons work with constrained devices with constrained processors (even
a constrained version of C).
CB: Some history. When designing CBOR, looked at msgpack and found byte
and text string distinction was murky there and wanted better way
(discussions w/ msgpack found them not interested in standardization).
Looking at which distinctions make sense for applications is close to
fundamental CBOR stuff. In numerical part, focus was on constrained
devices, so float would not be used a lot (more integer or fixed point).
At that time, state of mind was that floats extend the number space.
Concept of generic data model; 7049 didn't outright state identity but
was open to protocols mapping them. Between 7049 and 8949, feedback said
that decision was not good; unhappiness with int and float turning up in
same context, so should be separate on CBOR level. (Applications can
still use identity).
CB: It was surprising to move this discussion towards deterministic
encoding. When deterministic encoding is done at the CBOR level, it does
not yield deterministic encoding in general. The application has to
contribute. Things from this profile are a new idea.

CAl: One reason we're bringing this forward is that many software
developers have in common that they sign data. Because of the
requirement to have consistent signing across different environments,
they are driven to CBOR. Lots of code out there, e.g., MicroPython; all
of it has to interoperate.
CB: Trying to find out why signal from implementers about separating
int/float was so strong and how your community arrives at a different
signal. Maybe different community. Maybe environment changed. People
with lots of numeric data often use arrays, thus tagged arrays. Also
signable.

CAl: That detail was also another point.
CAl: Two pieces of feedback: 1) not understanding who "our customers" were
(resp: anybody storing data at rest) -- everyone storing that should
consider privacy docs below. Data format used often to store this is
graph store; created a triple store where signed store can elide
triples.

Is there interest in the CBOR group to create a "problems statement" for CBOR in regards to privacy & human rights?

(sliding over to the other topic)
CAl: We have collected several use cases from healthcare, education etc.
on how this can be used. Then we're also interested to try standardizing
the 7 tags that we have to support triple stores and data minimization
requirements. Is there interest on that in this community?

CB: There is an IETF WG doing something very similar, the SCITT WG
working on supply chain transparency. They're also looking at Merkle
Trees and transparency registries. You should try to involve those
people, starting with offline discussions with the main drivers. Your
Envelope contribution can be a useful input to SCITT. The tags are a way
to realize that. We need a common view on how this is done in the IETF.

CAl: The suggestion was to have an early draft providing a problem
statement. This mentions data minimization as a key area. I do monitor
the SCITT list, but it's a group for a very specific application, so I
rather thought of CBOR. This can also drag into security. Not sure of
the right answer, CBOR just felt like the right place.
WMN: Designing Gordian on top of CBOR; impetus for dCBOR arose from
implementing Envelope w/o it and realizing number issues get in the way.
In Swift and Rust it is implemented; in JS/TypeScript/Kotlin expecting
trouble at envelope level where it's easier handled at CBOR level.

CB: What'd happen if moved numeric reduction into Gordian envelope
protocol?
CAl: Not necessarily needed to happen. There's a value in itself. We may
be working on a future scenario in W3C to produce a profile describing
what is required to be deterministic in a consistent way. Then another
document can provide a small set of tags to be simply used to create
complex structures beyond tables and lists, e.g., graphs. And yet
another document can tell how to anonymize data. Not all of them are
necessarily coming from CBOR.
WMN: Having pushed those down into our dCBOR libs, would never pull it
back up into Gordian. Our view is that it's not only useful to the IETF
but to everybody.
CA: This can be useful for languages like JS. Not sure if it makes sense
for the encoder or for Gordian. Maybe this is not about encoding but
about a new variant of the information model. You start from the
original CBOR information model, on top of which you put your proposal.
Then you encode deterministically. Gordian envelope would just be an
application using the result.
CAl: Are you going to be in SFO?
CA: Only remotely. Planned to be in Prague.
CB: Me too.
CAl: We'll manage online. We think this is important work. In JSONLD
things are so tangled that it's way more difficult to move around the
different layers than the way it is for what we're proposing. Question
is if there's enough interested people in CBOR that can help. We're
providing multiple implementations.

CA: For many distinct parts, you'll find contributors here. For keeping
this easy to understand, something that worked well in the past (e.g.,
for signing) is considering CBOR data simply as byte strings. Was this
evaluated? For it may simplify a lot of things.
WMN: Once you encode something as envelope, it's a hash tree. Two
parties have to converge to an identical hash. Why not just agree in
advance on a single way to encode?
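
To illustrate WMN's point about converging on an identical hash (not
presented at the meeting), the following Python sketch hashes two
encodings of what an application might consider "the same" number. The
tree construction here is a generic Merkle-style digest chosen for
illustration, with hypothetical function names; it is not the Gordian
Envelope digest algorithm. The byte strings are standard CBOR: 0x02 is
the integer 2, 0xf94000 is the half-precision float 2.0, and 0x6161 is
the text string "a".

```python
import hashlib


def leaf_digest(encoded: bytes) -> bytes:
    """Digest of one encoded CBOR item (SHA-256 assumed for illustration)."""
    return hashlib.sha256(encoded).digest()


def node_digest(children: list) -> bytes:
    """Digest of an inner node: hash of the concatenated child digests."""
    return hashlib.sha256(b"".join(children)).digest()


LABEL = bytes.fromhex("6161")        # CBOR text string "a"
INT_TWO = bytes.fromhex("02")        # CBOR unsigned integer 2
FLOAT_TWO = bytes.fromhex("f94000")  # CBOR half-precision float 2.0

# Party A and party C encode the value as an integer; party B as a float.
root_a = node_digest([leaf_digest(LABEL), leaf_digest(INT_TWO)])
root_b = node_digest([leaf_digest(LABEL), leaf_digest(FLOAT_TWO)])
root_c = node_digest([leaf_digest(LABEL), leaf_digest(INT_TWO)])

print("A and C agree:", root_a == root_c)
print("A and B agree:", root_a == root_b)
```

A single differing leaf encoding changes the root digest, which is why
the parties need to agree in advance on one encoding for each value.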
CA: We're already passing around a signature, then the data to be
signed.
CAl: It's an architectural choice that JWT made and others followed. The
plaintext is the way it is, and that's hard to handle (see
multi-signature). People need data in their graph database and export
again.

CB: The consensus by now is to sign data in flight (which COSE does).
There are applications that care about signatures at rest. XML (?)....
From people who worked with COSE you'll always get that pushback
(because it's easier w/ data in flight). But there's also understanding
that there are applications that need to do that.

BL: Running out of time. Let's continue this discussion on the mailing
list. The next step is for CB to get together with dCBOR people for
identifying/extracting content from the current proposal.
BL: Let's mostly use the mailing list for now, then let's see. In two
weeks, we'll start putting together the agenda for CBOR at IETF 117.

  • This was one of the suggestions from Dispatch when Gordian Envelope
    was presented.
  • We have a VERY ROUGH draft at
    https://hackmd.io/W5KMEnb_RTW8ipm9_gTaOw
  • Blockchain Commons would like to see Envelope in ART (not SEC), and
    possibly under CBOR, as it allows for graph-structured CBOR data,
    while offering elision and data minimization requirements from RFC
    6973 & 8280.

IETF 117 agenda

Placeholder for when the time comes. Please enter items at any time.

AOB

Note taking: Marco Tiloca, Christian Amsüss