Ballot for draft-ietf-cellar-ebml
Yes
No Objection
Note: This ballot was opened for revision 13 and is now closed.
Section 1. Please add a reference to Matroska instead of an inline URL.

Section 1. Please add a reference for WebM.

Section 2 and Table 4 of Section 5. These sections define EBML Class. However, it isn’t used elsewhere in the document. What is it supposed to be used for?

Section 4.4. Recommend replacing the “This table” text with the name of the specific table in question (i.e., Table 1, Table 2).

Section 6.2. Per “Unknown-Sized Element MUST NOT be used or defined unnecessarily; however if the Element Data Size is not known before the Element Data is written, such as in some cases of data streaming, then Unknown-Sized Elements MAY be used.”, should this text be read as “the Unknown-Sized Elements MUST only be used if the Element Data Size is not known before the Element Data is written”? I’m having trouble understanding how to handle normative language for a qualitative statement of “unnecessarily”.

Section 11.1.5.1. Double checking on the grammar of the name attribute: is it permitted to start with a “-” or a “.”?

Section 11.1.5.3. Are there any uniqueness properties for an id attribute? Drawing a parallel from XML, I would have thought that each EBML element would have a unique ID per doctype (say, like an xml:id).

Section 11.1.7.2. documentation@purpose has a number of possible enumerated values; however, none are defined in the text (they are only listed).
I am trusting my ART Area Director colleagues for this document. I did a quick browse through it and it looks good to me. Still wondering why the three authors have no affiliation at all...
Thanks for addressing my discuss and comments!
I support Adam's DISCUSS point about the abstract. I'm not clear on why the IANA registry names are prefixed with "CELLAR." Are there more general EBML Element ID and DocType registries envisioned? If not, I would suggest dropping the "CELLAR." For future documents I would expect the authors to respond to the Gen-ART reviewer's comments via email after fixes have been applied to address the issues raised in the review.
For completeness, the datatracker should point at this document replacing draft-lhomme-cellar-ebml.
This was a very difficult read: I found a lot of the document to be convoluted and hard to follow, and I considered balloting Abstain, as I’m not sure that DISCUSS is appropriate for my complaints, but I can’t really say that I have “no objection”. In the end I decided to go with a “No Objection” ballot, to call out the worst of the issues, and to hope that you will consider the changes I suggest and that others will follow the text more easily than I could.

— Section 4.1 —

   Each Variable Size Integer begins with a VINT_WIDTH which consists of zero or many zero-value bits. The count of consecutive zero-values of the VINT_WIDTH plus one equals the length in octets of the Variable Size Integer. For example, a Variable Size Integer that starts with a VINT_WIDTH which contains zero consecutive zero-value bits is one octet in length and a Variable Size Integer that starts with one consecutive zero-value bit is two octets in length. The VINT_WIDTH MUST only contain zero-value bits or be empty.

I found this very hard to follow, and had to read it several times before I understood what you’re getting at. I found things such as “zero or many zero-value bits” to be confusing. May I suggest alternative text, which describes the concept and then gets to the details?:

NEW
   Each Variable Size Integer begins with a VINT_WIDTH followed by a VINT_MARKER. VINT_WIDTH is a sequence of zero or more bits of value 0, and is terminated by the VINT_MARKER, which is a single bit of value 1. The total number of bits (VINT_WIDTH and VINT_MARKER combined) is the number of octets of the Variable Size Integer. Thus, the single bit “1” describes a Variable Size Integer with a length of one octet. The sequence of bits “01” describes a Variable Size Integer with a length of two octets. “001” describes a Variable Size Integer with a length of three octets, and so on, with each additional 0-bit adding one octet to the length of the Variable Size Integer.
END

I, at least, find that easier to follow. Does it work for you?

For the next paragraph, which limits the length under various circumstances, I suggest putting it in terms of the number of octets in the integer, rather than the number of bits in the VINT_WIDTH, which might be better put into Section 4.3, rather than 4.1. Text such as, “A Variable Size Integer in an EBML Header can be at most 4 octets long, except [...] , where it can be up to 8 octets long,” is easier to understand than the text explaining limits on the number of bits in VINT_WIDTH.

— Section 4.4 —

Table 2 and the text that introduces it would be better if they talked about the integer that’s represented (2), rather than the binary value (0b10 in the text and 10 in the table), considering that it is a Variable Size INTEGER, yes?

— Section 6.2 —

   An EBML Element with an unknown Element Data Size is referred to as an Unknown-Sized Element. A Master Element MAY be an Unknown-Sized Element; however an EBML Element that is not a Master Element MUST NOT be an Unknown-Sized Element. Master Elements MUST NOT use an unknown size unless the unknownsizeallowed attribute of their EBML Schema is set to true (see Section 11.1.5.10).

This also seems confusing and perhaps contradictory because of how it uses the BCP 14 key words. May I suggest this, which neither uses nor needs the key words?:

NEW
   An EBML Element with an unknown Element Data Size is referred to as an Unknown-Sized Element. Only a Master Element is allowed to be of unknown size, and it can only be so if the unknownsizeallowed attribute of its EBML Schema is set to true (see Section 11.1.5.10).
END

— Section 7.7 —

   The Master Element MAY also use an unknown length.

The “MAY” isn’t really correct, is it? There are restrictions that make it not entirely optional, as noted in the next sentence. I suggest not using BCP 14 here, and just saying, “The Master Element may be of unknown length.”

   The Master Element contains zero, one, or many other elements.

Does this mean anything more than the simpler, “The Master Element contains zero or more other elements.”? As written, one tends to ask what “many” means here.

— Section 8.2 —

   The EBML Body MUST NOT contain any data that is not part of an EBML Element.

Why is this repetition needed? Doesn’t the similar sentence in Section 8 cover this?

— Section 10 —

   An EBML Document handles 2 different versions: the version of the EBML Header and the version of the EBML Body. Both versions are meant to be backward compatible.

I don’t see how that’s practical, as, taken strictly, it means you’ll never be able to make a significant change that is not backward compatible, so you’ll be stuck with errors or limitations forever. Are you sure you won’t need to allow for incompatible versions at some point?

— Section 17.1 —

   Values from 1 to 126 are to be allocated according to the "RFC Required" policy [RFC8126].

Why did you choose that policy? Are you aware that this allows registrations from non-IETF-stream RFCs? In particular, anyone can get an RFC published in the Independent stream with a very light level of review. Did you consider IETF Review, which requires an RFC in the IETF stream (including Informational and Experimental RFCs)? Or even Standards Action, which requires standards-track RFCs? The same comment applies to "matroska" and "webm" in Section 17.2.
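
As a rough illustration of the Section 4.1 wording discussed above (a run of 0 bits, then a single 1 bit, with the total bit count giving the length in octets), here is a minimal Python sketch. The function names are hypothetical, and two things are assumptions made for the example rather than anything the ballot text confirms: that the marker always falls within the first octet (lengths of 1 to 8 octets), and that the bits after the VINT_MARKER are read as a big-endian unsigned integer.

```python
from typing import Tuple


def vint_length(first_octet: int) -> int:
    """Return the total length in octets of a Variable Size Integer.

    Counts the leading 0 bits (VINT_WIDTH) before the first 1 bit
    (VINT_MARKER) and adds one.
    """
    if not 0 <= first_octet <= 0xFF:
        raise ValueError("first_octet must be a single octet (0..255)")
    if first_octet == 0:
        # Assumption: the marker must appear in the first octet;
        # longer widths are not handled by this sketch.
        raise ValueError("VINT_MARKER not found in first octet")
    width = 0      # number of leading 0 bits seen so far
    mask = 0x80    # start at the most significant bit
    while not first_octet & mask:
        width += 1
        mask >>= 1
    return width + 1


def decode_vint(data: bytes) -> Tuple[int, int]:
    """Decode one Variable Size Integer from the start of data.

    Returns (value, length_in_octets).
    """
    if not data:
        raise ValueError("empty input")
    length = vint_length(data[0])
    if len(data) < length:
        raise ValueError("truncated Variable Size Integer")
    raw = int.from_bytes(data[:length], "big")
    # The marker sits just above the 7 * length data bits; mask it off.
    marker = 1 << (7 * length)
    return raw & (marker - 1), length
```

Under these assumptions, the octet `0x82` and the two octets `0x40 0x02` both decode to the integer 2, which is the value the Section 4.4 comment suggests the table should talk about.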
The -16 addresses my DISCUSS point; thanks! I'm given to understand that email discussion of the comments (preserved below) is forthcoming, but do have one note on the new text in the -16: In Section 7.3 we imply that a float is 32-bit and 64-bit at the same time; I think s/and/or/ makes more sense (and, of course, the EBML element length indicates which one is present).

I support Adam's Discuss regarding the Abstract.

Section 2

   "Parent Element": A relative term to describe the "Master Element" which contains a specified element. For any specified "EBML Element" that is not at "Root Level", the "Parent Element" refers to the "Master Element" in which that "EBML Element" is contained.

It sounds like this is intended to be "directly" or "immediately" contained (in order to be unique), right? If not, then it should be ''refers to a "Master Element" in which [...]''

Section 4.1

   Each Variable Size Integer begins with a VINT_WIDTH which consists of zero or many zero-value bits. The count of consecutive zero-values of the VINT_WIDTH plus one equals the length in octets of the Variable Size Integer. [...]

Does the following attempted rewording change the meaning?

% Each Variable Size Integer begins with a VINT_WIDTH which consists of
% zero or more bits set to zero. The length in octets of the entire
% Variable Size Integer is determined as one plus the number of
% consecutive bits set to zero.

(I find the current formulation rather hard to parse.)

Section 6.2

   | "\root\level1\level2\<global>" | Global Element cannot be   |
   |                                | assumed to have this path, |
   |                                | while parsing "elt" it can |
   |                                | only be a child of "elt"   |

Cannot be assumed by who/what? My brain is trying to parse this as just "cannot assume this path".

Section 7.5

Should we say anything about termination of a UTF-8 string needing to still result in valid UTF-8 (i.e., not insert NULs in the middle of a codepoint)?

Section 7.7

   stored within Master Elements SHOULD only consist of EBML Elements and SHOULD NOT contain any data that is not part of an EBML Element.

When might this SHOULD (NOT) be violated?

Section 8.2

   part of an EBML Element. This document defines precisely which EBML Elements are to be used within the EBML Header, but does not name or

(for EBMLVersion 1 only, right?)

Section 11.1

   Element; for example matroska or webm (see Section 11.2.6). The DocType value for an EBML Document Type MUST be unique and persistent.

It might be appropriate to refer to Section 17.2 and/or the IANA registry for DocType values, here.

   EBMLVersion to only support a value of "1". If an EBML Schema adopts the EBML Header Element as-is, then it is not required to document that Element within the EBML Schema. If an EBML Schema constrains

Does "as-is" imply some level of future-compatibility/extensibility for when EBMLVersions other than "1" are defined?

Section 11.1.1

It's a little amusing that we bother to provide "default" attributes when the "range" attribute uniquely determines the allowed value.

Section 11.1.4

   Each "<element>" defines one EBML Element through the use of several attributes that are defined in Section 11.1.3. EBML Schemas MAY

I think this makes more sense as "Section 11.1.5".

Section 11.1.5.2

This ABNF seems to only allow "direct" recursion where element <x> appears directly inside element <x>, without any intermediate elements. I assume that's the intent, though it would be surprising in a general-purpose markup language.

   In some cases the EBMLLastParent part of the path is an EBMLGlobalParent. A path with a EBMLGlobalParent defines a Section 11.3. Any path that starts with the EBMLFixedParent of the

That second sentence doesn't parse.

   As an example, a "path" of "1*(\Segment\Info)" means the element Info is found inside the Segment elements at least once and with no maximum iteration. An element SeekHead with path "0*2(\Segment\SeekHead)" may not be found at all in its Segment parent, once or twice but no more than that.

The way this text is written makes me want to interpret the path occurrence counts more like the (regular) minOccurs/maxOccurs element attributes, as opposed to applying to the path components to get to the specific element in question.

Section 11.1.9.2

   <element name="Item" path="1*1(\Items)" id="0x4025" type="master"
     minOccurs="1" maxOccurs="1">
     <documentation lang="en" purpose="definition">
       A set of items.

Is this "name" supposed to be "Item" or "Items"?

Section 11.1.10-11.1.12

I'm not sure I have a full understanding of how <restriction>/<enum> are used; perhaps a reference to the corresponding XML behavior is in order?

Section 11.1.13-11.1.14

The <extention type="..."> usage seems underspecified.

Section 11.1.15

   <xs:attribute name="path" use="required">
     <!-- <xs:simpleType>
       <xs:restriction base="xs:integer">
         <xs:pattern value="[0-9]*\*[0-9]*()"/>
       </xs:restriction>
     </xs:simpleType> -->
   </xs:attribute>

Why do we include this commented-out snippet?

   <xs:attribute name="unknownsizeallowed" type="xs:boolean"/>
   <xs:attribute name="recurring" type="xs:boolean"/>

Don't we effectively set default values for these two in the prose description?

Section 11.1.16

   Identically Recurring Elements SHOULD include a CRC-32 Element as a Child Element; this is especially recommended when EBML is used for long-term storage or transmission. If a Parent Element contains more

I'm not sure if the "long-term" is intended to also bind as "long-term transmission" (though I'm not sure what it would mean in that case). It's also not entirely clear what kinds of transmission would benefit from this, as reliable media presumably don't need redundancy for reliability, but unreliable media can't really be used to carry EBML without some framing requirements to know when elements start.

Section 11.1.18

   If a Mandatory EBML Element has no default value declared by an EBML Schema and its Parent Element is present then the EBML Element MUST be present as well. If a Mandatory EBML Element has a default value declared by an EBML Schema and its Parent Element is present and the value of the EBML Element is NOT equal to the declared default value then the EBML Element MUST be present.

This seems almost tautological, in that how would an EBML Element have a value if it was not present? (The following paragraph that talks about when to write such elements, does make more sense.)

Section 11.3.1

   path: "*1((1*\)\CRC-32)"

Using backslash as both an escape character and a path separator makes my head hurt, and I did not have enough caffeine yet this morning to figure it out.

   8.1.1.6.2 of [ITU.V42.1994], with initial value of 0xFFFFFFFF. The CRC value MUST be computed on a little endian bitstream and MUST use little endian storage.

bitstream or bytestream?

Section 12

   If a Master Element contains a CRC-32 Element that doesn't validate, then the EBML Reader MAY ignore all contained data except for Descendant Elements that contain their own valid CRC-32 Element.

Ignoring only part of the known questionable content could have significant security considerations, if (e.g.) security-relevant restrictions are in the garbled part of the document but the sensitive content has a (valid) redundant CRC.

[review terminated early]
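
On the Section 11.3.1 question about little endian storage, a small Python sketch may make the byte-order part concrete. It assumes that the CRC meant by the draft is the widely used CRC-32 computed by Python's `zlib.crc32` (initial value 0xFFFFFFFF, with a final inversion); whether that matches the draft's intent in every detail is an assumption here, not something the quoted text settles. The helper name `crc32_le_bytes` is hypothetical.

```python
import struct
import zlib


def crc32_le_bytes(element_data: bytes) -> bytes:
    """Compute a CRC-32 over element_data and return it as 4 octets,
    least significant octet first (little endian storage).

    Assumption: zlib.crc32 is the intended CRC-32 variant.
    """
    crc = zlib.crc32(element_data) & 0xFFFFFFFF  # force unsigned 32-bit
    return struct.pack("<I", crc)                # "<I" = little-endian uint32
```

With this reading, the conventional check input `b"123456789"` gives the CRC value 0xCBF43926, stored as the octets `26 39 F4 CB`. This only illustrates the storage order; it does not answer the bitstream versus bytestream question raised above.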
I only did a very brief review but I don't think there are any transport issues :-)