CBOR WG Meeting
IETF 106 - Singapore
Thursday, Nov 21, 2019, 17:40 - 18:40
Chairs: Francesca Palombini, Jim Schaad

Recordings: https://www.youtube.com/watch?v=3lJZ4dZuaPA
Session's material: https://datatracker.ietf.org/meeting/106/session/cbor

Minute takers:
    Michael Richardson

* Introduction [5'] : Chairs
  Agenda bashing and WG status update
  
  STATUS: 6 interims since 105, lots of progress made on documents.
  RECHARTERING has been finalized since 105.
  CBOR Array Tags and CBOR SEQ are in RFCEDITOR QUEUE.
  CBORBIS version -09 is in WGLC, up to Dec. 12 (extra long time)
  
Interim meetings: December 18, 2020-January 15 and 2020-January 29.
Wednesday at the same time slot (16:00 to 17:00 UTC)
Avoid conflict with CORE interim.

Carsten: the CORE interims will also not be every two weeks, so there will be no collision.


WG documents:

* CBOR specification status [25'] : Carsten
  https://tools.ietf.org/html/draft-ietf-cbor-7049bis
          - WGLC start on 14th to run for 4 weeks
          - Shepherd - Francesca 

This document is going to Internet Standard, so not adding anything fancy.
29 issues closed since IETF105.

Sean Leonard: asks about JSON situation.
Jeffrey Yasskin: ask security team about JSON vulnerabilities caused by duplicate keys.
Paul Hoffman: has seen situation where the duplicate replaces the original, vs ones where it ignores the other case.

Jim: one type of case where you can have vulnerabilities is in COSE: if you use the "critical" header parameter and you put 2 of them, different applications could choose either one and you could have problems.
CB: yes.

Laurence: wrote the code Jim talked about last IETF, because COSE says "no duplicate header parameters". That duplication I want to pick up in the COSE layer, not in the CBOR layer.
CB: But you wouldn't see it in COSE because some parsers would discard it at the CBOR layer.
Laurence: Seems there is a lot of variability.

MCR: with a pull parser and an indefinite-length map, you wouldn't know if there is another key. So any requirement to validate means you have to go to the end of the stream; you would have to process all of the data. Very bad to say something like this. We should make a recommendation on whether to keep the first or the last, not a MUST.
CB: There is no good recommendation.
MCR: Picking none is worse
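
A minimal sketch of the pull-parser point above, assuming a toy decoder that only understands indefinite-length maps whose keys and values are unsigned integers 0..23 (each encoded as a single byte). The names `iter_pairs`, `lookup_first` and `lookup_checked` are hypothetical and not taken from any CBOR library. It shows that a "take the first" lookup can stop early, while duplicate detection has to read up to the break byte.

```python
from typing import Dict, Iterator, Optional, Tuple

INDEF_MAP = 0xBF  # CBOR head byte: map, indefinite length
BREAK = 0xFF      # CBOR "break" stop code


def iter_pairs(data: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (key, value) pairs one at a time, like a pull parser would."""
    if not data or data[0] != INDEF_MAP:
        raise ValueError("expected an indefinite-length map")
    pos = 1
    while True:
        if pos >= len(data):
            raise ValueError("truncated: no break byte")
        if data[pos] == BREAK:
            return
        if pos + 1 >= len(data):
            raise ValueError("truncated: key without value")
        key, value = data[pos], data[pos + 1]
        if key > 23 or value > 23:
            raise ValueError("toy decoder: only integers 0..23 supported")
        yield key, value
        pos += 2


def lookup_first(data: bytes, wanted: int) -> Tuple[Optional[int], int]:
    """Take-the-first policy: stop at the first hit. Returns (value, pairs read)."""
    read = 0
    for key, value in iter_pairs(data):
        read += 1
        if key == wanted:
            return value, read
    return None, read


def lookup_checked(data: bytes, wanted: int) -> Tuple[Optional[int], int]:
    """Error-on-duplicate policy: every pair up to the break must be read."""
    seen: Dict[int, int] = {}
    read = 0
    for key, value in iter_pairs(data):
        read += 1
        if key in seen:
            raise ValueError(f"duplicate key {key}")
        seen[key] = value
    return seen.get(wanted), read


# {_ 1: 2, 3: 4, 1: 5} and {_ 1: 2, 3: 4} in CBOR diagnostic notation
DUP = bytes([0xBF, 1, 2, 3, 4, 1, 5, 0xFF])
UNIQUE = bytes([0xBF, 1, 2, 3, 4, 0xFF])

print(lookup_first(DUP, 1))       # (2, 1): stopped after one pair
print(lookup_checked(UNIQUE, 1))  # (2, 2): had to read the whole map
```

Calling `lookup_checked(DUP, 1)` raises `ValueError`, but only after the third pair has been read; with a long stream that cost grows with the size of the map.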

Sean Leonard: Picking none could be ok
It is much more serious when the attacker can control the production of the map keys.
If you ask for user supplied data, but not keys, then the "secure" can control production of keys.
Henk: I am in favour of no change.  I always assume that the last key is the correct key.
Carsten: JSON maps are essentially indefinite, because there is no up-front size.
Stuart: check RFC6763, what did it say? take the first, ignore the rest (there is precedent for that).
Jeff: take the first or error would be my preference.
Carsten: If you can do take the first, you can fully validate.
FP: hard time reading consensus.
Alexey: take first or error. I hope that "no change" does not mean "not discussed in the text".
Carsten: the text discusses the problem. It says the application MUST handle this problem.
Q: What are the situations?
Carsten: if you turn it into a platform dictionary, processing things naively results in the last one winning.
Jim: 3 choices, 1 take first, 2 error, 3 return all of them and let the application deal with it.
Jeffrey: You can always check for presence before you insert. Some map APIs require you to do that, but some don't.
MCR: says that most lookup/dictionary/hash code will have to do a lookup to find out where to put the item, so there does not have to be 2x the work.
LL: there are cases where you are not going to store the whole thing.
FP: asks LL what his preferred option is?
SL: if you just care if some key is present, once you get the hit you stop processing. Then the value does not matter.
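
A minimal sketch of the choices discussed above (last wins by default, take first, error, return all), assuming Python's standard `json` module as the platform decoder and its `object_pairs_hook` parameter as the place to apply a policy. The hook functions and the `role` key are hypothetical, for illustration only.

```python
import json


def first_wins(pairs):
    """Policy 1: keep the first occurrence of each key."""
    out = {}
    for key, value in pairs:
        out.setdefault(key, value)
    return out


def reject_duplicates(pairs):
    """Policy 2: treat any duplicate key as an error."""
    out = {}
    for key, value in pairs:
        if key in out:
            raise ValueError(f"duplicate key: {key!r}")
        out[key] = value
    return out


def keep_all(pairs):
    """Policy 3: return every pair and let the application deal with it."""
    return list(pairs)


DOC = '{"role": "guest", "role": "admin"}'

print(json.loads(DOC))                                # {'role': 'admin'}
print(json.loads(DOC, object_pairs_hook=first_wins))  # {'role': 'guest'}
print(json.loads(DOC, object_pairs_hook=keep_all))    # [('role', 'guest'), ('role', 'admin')]
```

With no hook, the decoder fills a platform dictionary and the last value wins, which matches Carsten's description of naive processing. Passing `reject_duplicates` as the hook makes `json.loads(DOC, ...)` raise `ValueError` instead.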

Pete Resnick: if we find a security problem in the field, what are we going to do?
Carsten: we have already found it?
Carsten: literature is full of examples of (check for duplicates?), just none have been documented for JSON.
PH: we argued about this for IJSON, and decided not to allow them to be valid (punted)
Alexey: if we were designing a new protocol, we would have picked one.
Jeffrey: two options are: "protocols have to decide pick first or error" vs "leave that decision to application"
   (clarifying discussion, HUM requested)
LL: whatever they do, protocols have to have the option to say "don't care".
FP: if we do a change, then the change would say, "either you do not care, or if you care, you pick the first or return error"
Paul suggests we do not hum here.

FP: third option: do not change the processing, and make sure there is sufficient language to explain why all the options are bad.
Alexey Melnikov: we have a protocol that has discovered an issue.  We could require implementations that are broken to be fixed.  Suggesting to add text ruling out certain choices.
Jeffrey: there is a ZIP example of arbitrary choice.
JS: what Michael said is true for COSE.  The only place this is an issue is with the protected attributes.
LL: whether or not there is a problem depends upon the protocol. 
LL: If you are designing a protocol that has a problem with duplicate keys then pick one method.  If there is no problem, then you don't have to pick.
SL: the security problem comes from whether a second piece of software takes it from a second source material.  If the first process takes the map, picks arbitrarily and passes on the processed things, then there is not a problem.  It's when two different decoders are used.
Paul Hoffman: gone down a rathole.  This whole discussion is about decoders that validate.  Decoders that do not validate are perfectly good decoders.  We can only make it safer for decoders that are validating.
Jeffrey: validating decoders will error here. Non-validating ones will not.  The other mistake is a specification of the format in which there are two possible parses of the data.
Paul Hoffman: are you saying that all parsers have to be validating?
Jeffrey: no, we need to pick which value a non-validating parser would use
LL: the duplicate detection can be done in two layers, ... but only if the decoder does not discard duplicates.
If your protocol can not tolerate duplicates, then you need to use the right decoder.
Jeffrey: would be happy with three results. (1) all non-validators pick the first, (2) protocols have to pick which one to pick, (3) or declare that there is no security impact.
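
A minimal sketch of the two points above: two decoders that collapse duplicates differently disagree on the same input, and a protocol-layer duplicate check only works if the decoder hands over the uncollapsed pairs. The list of pairs stands in for decoded map entries; the function names and the integer label 1 are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Pairs = List[Tuple[Any, Any]]


def decode_first(pairs: Pairs) -> Dict[Any, Any]:
    """Decoder A: collapses duplicates, keeping the first value."""
    out: Dict[Any, Any] = {}
    for key, value in pairs:
        out.setdefault(key, value)
    return out


def decode_last(pairs: Pairs) -> Dict[Any, Any]:
    """Decoder B: collapses duplicates, keeping the last value."""
    return dict(pairs)


def has_duplicates(pairs: Pairs) -> bool:
    """Protocol-layer check; needs the pairs as they appeared on the wire."""
    keys = [key for key, _ in pairs]
    return len(keys) != len(set(keys))


# Map entries as they appear on the wire: label 1 occurs twice.
WIRE: Pairs = [(1, "guest"), (1, "admin")]

checker_view = decode_first(WIRE)   # what a first component checks
consumer_view = decode_last(WIRE)   # what a second component acts on

print(checker_view[1], consumer_view[1])                  # guest admin
print(has_duplicates(WIRE))                               # True
print(has_duplicates(list(decode_last(WIRE).items())))    # False
```

The last line shows why the check cannot be layered on top of a decoder that already discarded duplicates: after collapsing, the protocol layer sees nothing to reject.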

FP: the room is against "no change" option.  (That is, for change) We need more discussion.
Alexey Melnikov: Jeffrey please send a message to the mailing list

FP: two new issues in the tracker, but no discussion needed for those. Also, everyone please review the document.

Other:

* CDDL cont. - Ways forward [15'] : Carsten / Chairs

1) 
Alexey adds +1 to Module work
Carsten says that this part is hard.

2) Computed Literals are easy says Carsten
Sean Leonard: item (3) will be hard to implement. (1) and (2) are easy to implement, but will clutter specifications.
Henk: claims blame for (1), because CDDL comes from the old world, and there are byte offsets that have to be computed.
Sean is in favour of syntactic sugar.  For (1), can we have a list where the increment is implied... or do we really need general math?  Turing machine math?
Henk: no... do not want too much, but it winds up going there.

FP: can this be done separately from CDDL 2.0?
Carsten: CDDL has defined extension points, and they can be used to do this.


3) CBOR and JSON in one spec.
4) co-occurrence constraints

FP asks for permission to share the answers, and asks for objections from those who participated in the survey.

Ways forward:
    + start 2.0 document with the features discussed today.
Paul will volunteer other people.


* Flextime [10']

* Wrap-up [5'] : Chairs