Minutes interim-2021-jsonpath-01: Tue 09:00
minutes-interim-2021-jsonpath-01-202105110900-00

Meeting Minutes JSON Path (jsonpath) WG
State Active
Last updated 2021-05-14

jsonpath IETF Working Group May 2021 Interim

Date: 2021-05-11 Time: 9:00 - 11:00 UTC

  • Chair: Tim Bray
  • Chair: James Gruessing
  • Area Director: Francesca Palombini
  • Scribe: James Gruessing, Carsten Bormann

Attendees

  1. Carsten Bormann, TZI

Agenda Bashing

None

Discussion

Terminology - #84

Tim: On email you said we are good on the terminology issue.
Stefan: We need to do some work and finish in quite a short time.
Tim: If I understand the state of the discussion, we should stick to RFC 8259 terminology where possible, and introduce terms as necessary. Everything is a JSON value, except member names, and it is reasonable to have something shorter; candidates such as "node". Calling for opinions on this?
Carsten: We don't have to start from January, we have a few things in the document and a few open issues for indexing filters and syntactic elements. We have had "indexing", "key" for the first thing. There are a few details we have to understand about selectors and grouping.
Carsten: Maybe we need a term for things with indexing, including all three kinds. Maybe it's not helpful to separate the kinds into different categories.
Greg: The selector is the operator, the encompassing thing. Index is the bracket selector, indexes are the things in there. The selector is broad, the path is a sequence of selectors.
Carsten: I think we have consensus on the overall structure of this.
Stefan: We need a specific term for the index-value pair.
Glyn: I have a concern about the use of "index", it carries array connotations.
Greg: I see where you're coming from, having the understanding that any data type can be an index makes it more a "general use term".
Tim: "key" has object connotation, I'm okay with either of these. I'm not hearing a strong objection to the alternatives.
Marko: This summary doesn't capture that an array slice selector is an index, which can be repeated. The slice expression is a valid way to select things.
Carsten: Good idea to make a union a comma-separated list of slices without weakening the term index.
Marko: The index cannot be expressed just as generic form as it's not a valid start value.
Carsten: There is something to solve at the union selector.
Marko: The union selector is index or key, filtered.
Greg: A filter selector would also work.
Tim: I'm not hearing strong issues of principle here, but an editorial issue, and this should be handed to the editors.
Carsten: We should give this to the editors to handle, but have a leeway, let's agree direction.
Greg: Supposing we call an index a number, the union is a comma-delimited sequence or filters, or slices.
Carsten: You're making a technical change.
Marko: Is a wildcard a special case?
Stefan: We are looking for a term for this index-value pair, I think.
Greg: The difference with this terminology is we're not talking about values, only the value or numerical index.
Stefan: If we take the normalised expression, brackets are always keys until we find the final value.
Stefan: If we call it "index" I'm fine with it, but if Glyn says index is more connected to array.
Glyn: I am happy to wait for a PR.
Carsten: I'm happy to do it.
Tim: I'm not detecting any disagreement on this.
Greg: I'll write the terminology PR.
Tim: Are we still sticking with the term "node"?
Greg: It is the pairing of the location and the value.

Consensus: Sticking with the term "node" implying the location and the value

Action: Greg to PR the terminology
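
Editorial illustration of the "node" consensus above, not part of the discussion: a minimal Python sketch, assuming a node is modelled as a (location, value) pair where the location is the sequence of indexes (member names or array positions) from the root. The names `Node`, `Index` and `children` are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Tuple, Union

Index = Union[str, int]  # hypothetical: a member name or an array position


@dataclass(frozen=True)
class Node:
    """Hypothetical pairing of a location and the value found there."""
    location: Tuple[Index, ...]  # indexes from the root, e.g. ("a", 0)
    value: Any


def children(node: Node) -> Iterator[Node]:
    """Yield the child nodes of an object or array; scalars have none."""
    if isinstance(node.value, dict):
        for name, child in node.value.items():
            yield Node(node.location + (name,), child)
    elif isinstance(node.value, list):
        for position, child in enumerate(node.value):
            yield Node(node.location + (position,), child)


root = Node((), {"a": [1, 1]})
items = list(children(next(children(root))))
# The two array elements carry equal values but distinct locations,
# so they are distinct nodes.
print([n.location for n in items])
```

Under this assumption, two nodes are the same only when both location and value agree, which is the distinction the later duplicates discussion relies on.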

Relative Path Support - #59

Tim: I think the use of relative paths is not controversial, but the issue is using '@'.
Glyn: I thought it was using @ outside the filter expression?
Marko: To supply a dynamic value to an index?
Glyn: The use case is an "escape hatch" to provide the current value to the processor.
Carsten: Do we have an example?
Glyn: David provides an example in the issue but I don't favour it.
Stefan: If you allow query expressions with an @ symbol, that part must be defined elsewhere.
Greg: That is David's point, and that some external tool that has navigated partially into an input, and wants to navigate relatively.
Stefan: If you define a relative path, will you always need the root path? If not, it would suffice to redefine the root.
Greg: Are you suggesting getting rid of the $ and just using @?
Stefan: Just using $ and don't talk about @.
Carsten: If there's a difference between notating $ and @, if there is no difference we can always use $.
Greg: There's a distinct difference wanting relative and root. On March 12th I give an example where both are used, in some cases back to root, in other to specify the local.
Marko: What I was trying to write is the difference; we could just re-evaluate with the current node. Where we wouldn't want to do that is when we need to go back above the current root in a filter expression or such.
Stefan: There should be consensus about the fact we cannot go back.
Greg: There are data models that don't support navigating to the parent of a value.
Marko: It's not about having navigation, it's about having two handles.
Carsten: I can imagine a use case where we need to address this.
Tim: There's an argument to be made this is outside our charter; of known implementations 25% support them.
Greg: Mine does not. We could say the spec says "start with the $", and optionally support starting with an @.
Tim: I hate optional behaviours in standards due to interoperability. This smells outside our charter.
Carsten: We need to define our extensibility model before we finish this issue. One way is to nail down the feature set once, everything not in that feature set is not valid. Another way is to introduce versions, with linear progression. Both are extremely unattractive, so we should group optional things into a small number of features.
Greg: Another option is to just allow and support $, and fail if it doesn't start with $.
Glyn: I think this is outside our agenda.
Greg: I'm saying we don't move forward with this, but if implementations use it and in future we decide to update the spec, we can include it.
Stefan: A very cheap thing might be to always start the query with the $, but the $ addresses not just the root.
Tim: It's not obvious that relative path and @ falls into the "common semantics" and other aspects of the existing implementations.
Carsten: This does not relieve us of the extensibility model.
Glyn: If we make @ at the beginning of a path a syntax error, one model is to allow implementations to override syntax errors.
Tim: I haven't seen a specific proposal, Carsten did I miss something?
Carsten: I haven't written this up yet, but we should apply existing IETF experience here. We should identify extension points so it can be done in an orderly way.
Tim: One extension point might be the first character of a query?
Carsten: Yes, and an IANA registry of first characters with a registry policy. We should have a defined model for implementors or this WG to define extensions.
Tim: I don't get the feeling that anyone on the call strongly feels we should add support for queries to begin with @.
Greg: I agree so long as we can update this spec at some later point.
Stefan: I think we should only support $ for now.

Consensus: Queries should not begin with @, but this is a natural extension point.

Action: Chairs to update the GitHub issue and allow for non-attendees to reply
Action: Write an extensibility model
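
Editorial illustration of the consensus above, not part of the discussion: a minimal Python sketch of treating the first character of a query as an extension point. The table `FIRST_CHARACTERS` and the function `check_query_start` are hypothetical; no registry existed at the time of the meeting, and only "$" reflects what was agreed.

```python
# Hypothetical table of permitted first characters. Only "$" was agreed;
# an extension (for example "@") would be a new entry here.
FIRST_CHARACTERS = {"$": "root of the query argument"}


def check_query_start(query: str) -> str:
    """Return the leading character, or raise ValueError if it is not permitted."""
    if not query:
        raise ValueError("empty query")
    first = query[0]
    if first not in FIRST_CHARACTERS:
        allowed = ", ".join(sorted(FIRST_CHARACTERS))
        raise ValueError(f"query must begin with one of: {allowed}; got {first!r}")
    return first


print(check_query_start("$.store.book"))
```

Keeping the permitted characters in one table means a query starting with "@" fails today, while a later extension only has to add an entry rather than change the parser.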

Regular Expressions in Filters - #70

Greg: There are two aspects here: do we support them, and what kind do we support?
Carsten: We should answer the second aspect first.
Carsten: Two kinds of regular expressions - matching vs. computing. And that we need a mechanism for matching.
Carsten: Need to refer to an existing standard and subset.
Tim: Is there anyone who wants regular expressions removed?
Marko: That would be a deal breaker, but it would be hard to make something interoperable.
Greg: As an implementor my preference would be to get a library that does regular expressions for me, matching exactly what we specify.
Carsten: For a library developer adding a feature is a win, for a standards developer it is a lose, lose, lose.
Greg: This is a big enough feature worth including, we just have to figure out how.
Tim: Unless we can have a specification of what we mean, we shouldn't have regular expressions. Those who want regular expressions need to propose exactly what that means.
Glyn: If we pick a subset of W3C, any implementation would have to massage that into implementations.

Tim: Are there any other IETF RFCs that use regular expressions?
Carsten: YANG for example defines using W3C, but implementations use PCRE.
Marko: When we say subsets, does the standard define invalid/malformed expression? Would the standard require the library to refuse it?
Carsten: That is answered by the extensibility concept.
Tim (summary): If we get a specific proposal, we can go ahead.
Stefan: Computed REs or literal REs only, proposal to stick with literal only now.
Carsten: It's likely an extension would come up.
Glyn: DoS issues? Could make our subset a common subset of W3C and RE2 to address that.

Consensus: Regular expressions can be included once we have a specific proposal
Consensus: Only permit literals in regular expressions

Action: Carsten to attempt a PR
Action: Cover DoS in the security considerations of the document

Duplicates in Selector output - #23

Greg: Two of the same value in the input, if it's extensive object it doesn't make sense remain portion of the path. Questions raised around identifying not only value but location. If somehow the path evaluated the same location in both, those would be removed ... but if the same value but different location, those would remain in the output.
Carsten: Can you clarify if by "same" you mean "equal" or "identical"?
Greg: Two nodes with the same value should not be collapsed, but if two selectors return the same node, those should be collapsed. What I wrote on the issue is not what we landed on, what I wrote was strict value equality not node equality. From a performance perspective I don't want to evaluate the same object twice.
Stefan: I agree with this and see overall agreement.
Tim: When I'm writing a JSONPath usually I know what the JSON I'm writing against looks like. I'm having trouble imagining a scenario where I would want duplicate removal to happen.
Greg: The argument of identifying a count of something is a really good argument against this.
Tim: The removal of duplicates is something that happens at another level of the program.
Greg: Maybe this is out of scope, and let the application handle this afterwards.
Glyn: I would be very happy to allow duplicates.
Carsten: We may have a simplified view of what expressions actually mean, and might assume duplicates being removed.
Stefan: I can agree to just allow duplicates for now.
Marko: Isn't the descendants selector kind of implicitly removing duplicates?
Greg: It wouldn't remove them.
Tim: The draft does not describe this well.

Consensus: Duplicate removal won't be part of the document

Action: Marko to write the descendants PR
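
Editorial illustration of the consensus above, not part of the discussion: a minimal Python sketch in which a union of indexes keeps every result, including repeats of the same node, and duplicate removal is left to the application. Nodes are modelled as (location, value) tuples; `select_union` and `dedupe_by_location` are hypothetical names, and negative indexes are ignored for brevity.

```python
from typing import Any, List, Tuple, Union

Index = Union[str, int]
Node = Tuple[Tuple[Index, ...], Any]  # hypothetical (location, value) pair


def select_union(node: Node, indexes: List[Index]) -> List[Node]:
    """Apply each index in turn; results are concatenated and duplicates kept."""
    location, value = node
    out: List[Node] = []
    for index in indexes:
        if isinstance(value, dict) and isinstance(index, str):
            if index in value:
                out.append((location + (index,), value[index]))
        elif isinstance(value, list) and isinstance(index, int):
            if 0 <= index < len(value):
                out.append((location + (index,), value[index]))
    return out


def dedupe_by_location(nodes: List[Node]) -> List[Node]:
    """Application-level step: drop repeats of the same node, keeping the first."""
    seen = set()
    out: List[Node] = []
    for location, value in nodes:
        if location not in seen:
            seen.add(location)
            out.append((location, value))
    return out


root: Node = ((), ["x", "x"])
result = select_union(root, [0, 0, 1])   # position 0 is selected twice
unique = dedupe_by_location(result)      # equal values at 0 and 1 both survive
print(len(result), len(unique))
```

Deduplicating by location rather than by value matches the "same node" versus "same value" distinction raised in the discussion: the two equal strings at different positions are both kept.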

Filter Expressions - #64

Carsten: Terminology: filter selectors contain filter expressions (vs. indexing selectors that contain indexing expressions, which may or may not be similar).
Stefan: I propose first we do the filter expressions, then return to indexing expressions. I'm not sure if it's possible to do valid parsing without the parenthesis.
Greg: Lots of questions in the air, we should coalesce the related into a single issue: #17, #57, #92. It might help for someone to condense these down into a list of open questions.
Carsten: These should be several issues in the end.
Stefan: I don't see the necessity in doing them in new issues.
Greg: Maybe we can look at closing the other ones, there's a lot to unfold.
Stefan: One central discussion point is the filter selector expression syntax, which must resolve to a boolean expression. I would start with just using comparison and boolean operators.
Greg: However the current node is equal to some other node that I reference from the root minus 1, I should be able to do that.
Tim: We should look at support in existing implementations.
Greg: If you're going to be passing boolean operations it's not a big stretch to support these arithmetic operations.
Glyn: Numeric operations don't pay for the cost, agree implementation wouldn't be too bad but increased verbiage in the specification.
Tim: If it turns out this is not widely supported, I'd be negative of us inventing it.
Stefan: Also comparing weak or strong type comparisons, I would prefer weak type.
Greg: I imagine loose typing to be divisive, and would prefer strict equality.
Glyn: Does anyone support loose equality?
Marko: What we have to define is what happens when input has incompatible types, and how to model the behaviour of ignoring those cases.
Greg: We already have verbiage in the specification to that effect.
Stefan: Is it our task to do provisions to avoid security issue point, or the task of the implementors?
Greg: Either the implementors or the client of the implementation. It's our responsibility to highlight known potential risks. I don't know if it is our responsibility to mitigate those risks.
Carsten: If avoiding the risk is feasible, we should avoid it.
Glyn: Another consideration is non-scalar operations.
Tim: I would like to rule it out, as it's a fruitful source of baffling behaviour.
Marko: What about lexicographical sorting or collation? (for > etc.)
Tim: Unicode code points, or madness.
Marko: We have to say that explicitly.
Stefan: Can we agree to use C-based syntax for operators?
Tim: Do check existing implementations.

Action: Stefan to write PR
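
Editorial illustration, not part of the discussion and not a decision of the meeting: a minimal Python sketch of one combination of the positions voiced above, namely strict (same-type) comparison, no comparison of objects or arrays, and string ordering by Unicode code point. The helper names `kind`, `strict_equal` and `strict_less` are hypothetical, and returning false for incompatible types is an assumption about how "ignoring those cases" might be modelled.

```python
from typing import Any


def kind(value: Any) -> str:
    """Classify a decoded JSON value; objects and arrays are 'non-scalar'."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "non-scalar"


def strict_equal(a: Any, b: Any) -> bool:
    """Equal only if both are scalars of the same kind and compare equal."""
    ka, kb = kind(a), kind(b)
    if ka != kb or ka == "non-scalar":
        return False  # assumption: incompatible types simply do not match
    return a == b


def strict_less(a: Any, b: Any) -> bool:
    """Ordering is defined only for two numbers or two strings."""
    ka, kb = kind(a), kind(b)
    if ka != kb or ka not in ("number", "string"):
        return False
    return a < b  # Python orders str values by Unicode code point


print(strict_equal(1, "1"), strict_less("Z", "a"))
```

Under code point ordering "Z" (U+005A) sorts before "a" (U+0061), which differs from locale-aware collation and is why the ordering rule would need to be stated explicitly.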

AOB

Next Interim meeting

Action: Chairs to arrange a Doodle for an interim meeting in mid-June